Distribution of matches between two shuffled decks

Take two desks of cards and shuffle them. They can be standard 52-card decks, though the number of cards in the decks doesn’t matter as long as they’re the same and the decks are fairly large.

Now count the number of times the two desks match, i.e. how many times the same card is in the same position in both desks. The number of matches is random, and its distribution is approximately Poisson with mean 1. Let’s do a simulation and see how close the results come to the predicted outcome.

Here’s the Python code:

import numpy as np
from scipy.stats import poisson
import matplotlib.pyplot as plt

def count_zeros(x):
    return len(x[x==0])

num_reps = 10000
deck_size = 52

matches = np.zeros(deck_size+1, dtype=int)

# Simulation
for _ in range(num_reps):

    # Shuffle two decks
    a = np.random.permutation(deck_size)
    b = np.random.permutation(deck_size)

    # Count how often they match
    num_matches = count_zeros(a-b)
    matches[ num_matches ] += 1

# Cut off outputs too small to see
cutoff = 8

# Matches predicted by a Poisson distribution with mean 1.
predicted = [num_reps*poisson(1).pmf(i) for i in range(cutoff)]

# Plot results
x = np.arange(cutoff)
w = 0.3 # bar width
plt.bar(x, matches[0:cutoff], w)
plt.bar(x+w, predicted, w)
plt.legend(["actual", "predicted"])

And here’s the output based on 10,000 simulations:

Plot showing good fit between predicted and actual

About 1/3 of the time, you get no matches, another 1/3 of the time you get one match, and the rest of the time you get more. More precisely, according to the Poisson model zero matches and one match are both have probability 1/e.

8 thoughts on “Distribution of matches between two shuffled decks

  1. I think there’s a mistake in the count_zeros() function. First I thought, what a nifty syntax for filtering a list! How come I never saw that anywhere! But it doesn’t seem to work, x==0 just evaluates to false when x is a list and then you just get x[0] out of it. When I run your code, the “actual” column show as zero.

  2. No, sorry, my mistake. It does work correctly with numpy arrays. It’s for some other reason that the plot shows incorrectly for me..

  3. Ah, I managed to break something in copy-pasting. Everything is correct, please ignore my comments.

  4. I wonder how this would look if you used a more real-life shuffle simulation (eg. move a random number of cards from the top of the deck to the bottom of the deck (in the same order) a certain number of times)?

  5. Nice Python code, thanks. I always advocate Python for this purpose with all its libraries.

    By the way, you do not need to shuffle two decks. Quite obviously, you just need to count the number of cards that keep place in a shuffled deck.

    Here is the code in Euler Math Toolbox:

    >loop 1 to nreps; …
    > k=sum(1:52==shuffle(1:52))+1; matches[k]=matches[k]+1; …
    >i=0:7; columnsplot(matches[i+1],i); …

  6. @Rene: Thanks.

    You’re right that you could just shuffle one deck. I thought about doing that in the simulation, but I thought some readers might be distracted by that.

Comments are closed.