Birthday Problem

DOWNLOAD Mathematica Notebook

Consider the probability Q_1(n,d) that no two people out of a group of n will have matching birthdays out of d equally possible birthdays. Start with an arbitrary person's birthday, then note that the probability that the second person's birthday is different is (d-1)/d, that the third person's birthday is different from the first two is [(d-1)/d][(d-2)/d], and so on, up through the nth person. Explicitly,

Q_1(n,d)=(d-1)/d(d-2)/d...(d-(n-1))/d
(1)
=((d-1)(d-2)...[d-(n-1)])/(d^(n-1)).
(2)

But this can be written in terms of factorials as

 Q_1(n,d)=(d!)/((d-n)!d^n),
(3)

so the probability P_2(n,d) that two or more people out of a group of n do have the same birthday is therefore

P_2(n,d)=1-Q_1(n,d)
(4)
=1-(d!)/((d-n)!d^n).
(5)

In general, let Q_i(n,d) denote the probability that a birthday is shared by exactly i (and no more) people out of a group of n people. Then the probability that a birthday is shared by k or more people is given by

 P_k(n,d)=1-sum_(i=1)^(k-1)Q_i(n,d).
(6)

In general, Q_k(n,d) can be computed using the recurrence relation

 Q_k(n,d)=sum_(i=1)^(|_n/k_|)[(n!d!)/(d^(ik)i!(k!)^i(n-ik)!(d-i)!)sum_(j=1)^(k-1)Q_j(n-ik,d-i)((d-i)^(n-ik))/(d^(n-ik))]
(7)

(Finch 1997). However, the time to compute this recursive function grows exponentially with k and so rapidly becomes unwieldy.

If 365-day years have been assumed, i.e., the existence of leap days is ignored, and the distribution of birthdays is assumed to be uniform throughout the year (in actuality, there is a more than 6% increase from the average in September in the United States; Peterson 1998), then the number of people needed for there to be at least a 50% chance that at least two share birthdays is the smallest n such that P_2(n,365)>=1/2. This is given by n=23, since

P_2(23,365)=(38093904702297390785243708291056390518886454060947061)/(75091883268515350125426207425223147563269805908203125)
(8)
 approx 0.507297.
(9)

The number n of people needed to obtain P_2(n,d)>=1/2 for d=1, 2, ..., are 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, ... (OEIS A033810). The minimal number of people to give a 50% probability of having at least n coincident birthdays is 1, 23, 88, 187, 313, 460, 623, 798, 985, 1181, 1385, 1596, 1813, ... (OEIS A014088; Diaconis and Mosteller 1989).

The probability P_2(n,d) can be estimated as

P_2(n,d) approx 1-e^(-n(n-1)/2d)
(10)
 approx 1-(1-n/(2d))^(n-1),
(11)

where the latter has error

 epsilon<(n^3)/(6(d-n+1)^2)
(12)

(Sayrafiezadeh 1994).

Q_2 can be computed explicitly as

Q_2(n,d)=(n!)/(d^n)sum_(i=1)^(|_n/2_|)1/(2^i)(d; i)(d-i; n-2i)
(13)
=(d!)/(d^n(d-n)!)[_2F_1(1/2n,1/2(1-n);d-n+1;2)-1],
(14)

where (n; m) is a binomial coefficient and _2F_1(a,b,;c;z) is a hypergeometric function. This gives the explicit formula for P_3(n,d) as

P_3(n,d)=1-Q_1(n,d)-Q_2(n,d)
(15)
=1-d^(-n)d!_2F^~_1(1/2n,1/2(1-n);1+d-n;2),
(16)

where _2F^~_1(a,b;c;z) is a regularized hypergeometric function.

A good approximation to the number of people n such that p=P_k(n,d) is some given value can be given by solving the equation

 ne^(-n/(dk))=[d^(k-1)k!ln(1/(1-p))(1-n/(d(k+1)))]^(1/k)
(17)

for n and taking [n], where [n] is the ceiling function (Diaconis and Mosteller 1989). For p=0.5 and k=1, 2, 3, ..., this formula gives n=1, 23, 88, 187, 313, 459, 622, 797, 983, 1179, 1382, 1592, 1809, ... (OEIS A050255), which differ from the true values by from 0 to 4. A much simpler but also poorer approximation for n such that p=0.5 for k<20 is given by

 n=47(k-1.5)^(3/2)
(18)

(Diaconis and Mosteller 1989), which gives 86, 185, 307, 448, 606, 778, 965, 1164, 1376, 1599, 1832, ... for k=3, 4, ... (OEIS A050256).

The "almost" birthday problem, which asks the number of people needed such that two have a birthday within a day of each other, was considered by Abramson and Moser (1970), who showed that 14 people suffice. An approximation for the minimum number of people needed to get a 50-50 chance that two have a match within k days out of d possible is given by

 n(k,d)=1.2sqrt(d/(2k+1))
(19)

(Sevast'yanov 1972, Diaconis and Mosteller 1989).

Wolfram Web Resources

Mathematica »

The #1 tool for creating Demonstrations and anything technical.

Wolfram|Alpha »

Explore anything with the first computational knowledge engine.

Wolfram Demonstrations Project »

Explore thousands of free applications across science, mathematics, engineering, technology, business, art, finance, social sciences, and more.

Computerbasedmath.org »

Join the initiative for modernizing math education.

Online Integral Calculator »

Solve integrals with Wolfram|Alpha.

Step-by-step Solutions »

Walk through homework problems step-by-step from beginning to end. Hints help you try the next step on your own.

Wolfram Problem Generator »

Unlimited random practice problems and answers with built-in Step-by-step solutions. Practice online or make a printable study sheet.

Wolfram Education Portal »

Collection of teaching and learning tools built by Wolfram education experts: dynamic textbook, lesson plans, widgets, interactive Demonstrations, and more.

Wolfram Language »

Knowledge-based programming for everyone.