The Birthday Problem


The Birthday Problem is one of those clear examples of how severely flawed our intuitions about probability can be. I find these examples fascinating (see my post on The Monty Hall Problem) and I think they’re valuable to learn about because they can help us improve our understanding of probability in our own lives. Most of us will never understand probability well enough to accurately estimate the probability of complex events but, at the very least, we can gain a better understanding of the limits of our intuitions and the kinds of cases where we are most likely to make mistakes if we aren’t careful.
 
The problem concerns the probability of a group of people sharing a birthday. It has become well-known because of how surprising most people find the probability to be. Let’s say you work with 22 colleagues (so there are 23 of you in total) and you want to know the probability of any 2 people sharing the same birthday. If you’re not familiar with the Birthday Problem then take a moment to guess. You will probably be surprised to learn that it’s over 50%; 50.7% to be precise. If your company keeps growing then by the time you’ve got 40 colleagues the probability of a shared birthday is 90%, and by the time there are 57 of you the probability is 99%. I think most people would be surprised at just how unlikely it would be for a group of 57 people to not have a shared birthday among them. It’s important to note that this doesn’t mean that you personally, or any other individual, would have a 99% chance of sharing a birthday with someone else; it’s the chance of any 2 people in the group sharing a birthday.
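To make that last distinction concrete, here’s a rough sketch comparing the two questions for a group of 57. The function names are my own, made up for illustration, and both assume 365 equally likely birthdays.

```javascript
// Hypothetical helper names; both assume 365 equally likely birthdays.

// Chance (in %) that at least one other person in the group shares YOUR birthday.
function chanceSomeoneSharesMine(groupSize) {
  // each of the other (groupSize - 1) people misses your birthday with probability 364/365
  return (1 - Math.pow(364 / 365, groupSize - 1)) * 100;
}

// Chance (in %) that ANY two people in the group share a birthday.
function chanceAnyPairShares(groupSize) {
  let allUnique = 1;
  for (let i = 0; i < groupSize; i++) {
    // person i + 1 must avoid the i birthdays already taken
    allUnique *= (365 - i) / 365;
  }
  return (1 - allUnique) * 100;
}

console.log(chanceSomeoneSharesMine(57).toFixed(1)); // roughly 14%
console.log(chanceAnyPairShares(57).toFixed(1));     // roughly 99%
```

So in a group of 57 a shared birthday somewhere is almost certain, while the chance that it involves you in particular stays fairly small.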
 
So why is the probability so high? Well, there are 365 days in a year (we’ll ignore leap years for the sake of simplicity) and, to state the obvious, for there to be no shared birthdays, each new person you add must have a birthday that is not the same as any of the previous birthdays. It’s easiest to invert the problem and find the probability of each birthday being unique. That probability is a fraction: the numerator is the number of days not yet taken by someone else and the denominator is the 365 days in the year. If you’ve got one person then the numerator is 365 and the denominator is 365, so the probability of the birthday being unique (there’s nobody else in the group to share a birthday with) is 100%. As you add more people to the group the numerator goes down by 1 each time because there is 1 fewer unused birthday available. By the time there are 366 people in the group the numerator is 0, which means the probability of no shared birthdays is (0 / 365) * 100 = 0%; it’s impossible for there to not be any shared birthdays because there are more people than there are days.
 
So the formula for working out the probability that any one birthday will be unique is: (365 - (n - 1)) / 365 where “n” is the number of people in the group, counting the person being added.
 
Some examples:
 

u = Probability of a birthday being unique

Group of 2 people:
n = 2
u = (365 - (n - 1)) / 365 = 0.997

Group of 3 people:
n = 3
u = (365 - (n - 1)) / 365 = 0.995
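The same formula can be sketched as a one-line JavaScript function. The name is my own, and it assumes the previous (n - 1) people all have different birthdays from each other.

```javascript
// Hypothetical helper: probability that the n-th person added to the group
// has a birthday not already taken by the previous (n - 1) people.
function probabilityOfUniqueBirthday(n) {
  return (365 - (n - 1)) / 365;
}

console.log(probabilityOfUniqueBirthday(2).toFixed(3)); // 0.997
console.log(probabilityOfUniqueBirthday(3).toFixed(3)); // 0.995
console.log(probabilityOfUniqueBirthday(366));          // 0
```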

 
So we’ve worked out the probability of any individual birthday being unique. In order to find the probability of all the birthdays being unique we need to combine the probabilities by multiplying them. We can then find the probability of the opposite outcome (that any birthdays will be shared) by subtracting the previous answer from 1.
 

Group of 3 people:
p = Probability of all birthdays being unique
s = Probability of a shared birthday

p = 0.997 * 0.995 = 0.992
s = 1 - 0.992 = 0.008 (0.008 * 100 = 0.8%)

 
So the probability of anyone sharing a birthday in a group of 3 people is 0.8%. That sounds perfectly reasonable. So why, by the time we get to 23 people, does the probability exceed 50%? It’s because the small chances compound at each step of the way. The probability of any one new person sharing a birthday with someone already in the group stays quite low: 0.3% for the 2nd person, 0.5% for the 3rd, all the way up to just 6% for the 23rd. But for there to be no shared birthdays at all, every one of those people has to avoid a match. Multiply together the probabilities of each birthday being unique (0.997 * 0.995 * ... * 0.940) and you get 0.493, so the probability of at least one shared birthday is 1 - 0.493 = 0.507, or 50.7%. In other words, in a group of 23 a shared birthday is slightly more likely than no shared birthday at all.
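To see how the small individual chances build up, here’s a rough sketch that records both numbers for each person added. The function and property names are my own, and it assumes 365 equally likely birthdays.

```javascript
// Hypothetical helper: returns one row per person added to the group.
function birthdayBreakdown(groupSize) {
  const rows = [];
  let allUnique = 1; // running probability that every birthday so far is unique

  for (let k = 1; k <= groupSize; k++) {
    // chance that person k matches one of the (k - 1) distinct birthdays before them
    const matchesEarlier = (k - 1) / 365;
    allUnique *= 1 - matchesEarlier;

    rows.push({
      person: k,
      individualPercent: matchesEarlier * 100,
      cumulativeSharedPercent: (1 - allUnique) * 100,
    });
  }

  return rows;
}

const rows = birthdayBreakdown(23);
const last = rows[rows.length - 1];
console.log(last.individualPercent.toFixed(1));       // 6.0
console.log(last.cumulativeSharedPercent.toFixed(1)); // 50.7
```

The individual column creeps up slowly, while the cumulative column climbs much faster because every earlier step is baked into it.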
 

Making an Algorithm

 
Working out the probability of shared birthdays, particularly when dealing with large numbers of people, would be very time-consuming if done by hand. This is because in order to find the probability of any shared birthday among a group of 30 people, for example, you would need to work out the probability for each step of the way. In other words, you’d need to perform the calculation 30 times, once for each person. This makes it an ideal candidate for a computer algorithm to do the work for us.
 
Here’s my JavaScript solution:
 

function calculateProbabilityOfSharedBirthdays(n) {
  // running probability that every birthday so far is unique
  let accumulator = 1;

  // multiply in the probability that the i-th person's birthday is unique
  for (let i = 1; i <= n; i++) {
    accumulator *= (365 - (i - 1)) / 365;
  }

  // invert to find probability of shared birthdays and convert to percentage
  return (1 - accumulator) * 100;
}
