Four views of multisets

This post will define multisets and basic operations on multisets.

We’ll view union, intersection, inclusion and sum each from four perspectives:

    1. Examples with words
    2. Example with prime factorization
    3. Using Python’s  multiset module
    4. Multisets as functions

Definition and examples

A multiset is like a set, except each element may appear more than once. We say each element has a multiplicity.

Example with words

For example, the set of letters in the word mississippi is {i, m, p, s} but the multiset of letters would be

{i, i, i, i, m, p, p, s, s, s, s}

As with sets, order doesn’t matter. We could write the multiset of letters in mississippi more compactly as

{i4, m1, p2, s4}

This looks a lot like a prime factorization, and indeed prime factorization will be our running example in this post.

Example with prime factors

The factors of 2022 form an ordinary set because

2022 = 2 × 3 × 337

but the factors of 20221026 form a multiset because

20221026 = 2 × 3 × 7² × 109 × 631,

i.e. the factor 7 has multiplicity 2 because 20221026 is divisible by 49.

Python example

The factorint function in sympy returns a multiset, represented as a dictionary. The keys are the prime factors and the values are the multiplicities.

    >>> from sympy import factorint
    >>> factorint(2022)
    {2: 1, 3: 1, 337: 1}
    >>> factorint(20221026)
    {2: 1, 3: 1, 7: 2, 109: 1, 631: 1}

Suppose the members of our multisets come from a universal set A. A multiset can be represented as a function f from A to the set of non-negative integers, mapping each element to its multiplicit. In the case of prime factorizations, A is the set of prime numbers and f(p) is the exponent of p in the prime factorization of some number.

We will use the Python library multiset to represent multisets. We could initialize a Multiset object with the return value of a call to factorint.

    >>> from multiset import Multiset
    >>> M = Multiset(factorint(20221026))
    >>> M
    Multiset({2: 1, 3: 1, 7: 2, 109: 1, 631: 1})

Union

The elements in the union of two multisets are the union of the elements of in each of the multisets, with multiplicity equal to the maximum from each multiset. An example will make this clearer.

Example with words

The union of

{m, i, s, s, i, s, s, i, p, p, i}

and

{m, a, s, s, a, c, h, u, s, e, t, t, s}

would be

{a, a, c, e, h, i, i, i, i, m, p, p, s, s, s, s, t, t, u}

Multiset unions don’t have to be ordered, but ordering them makes it easier to compute the union.

Example with Python

Here’s Python code to repeat the example above.

    >>> MS = Multiset("mississippi")
    >>> MA = Multiset("massachusetts")
    >>> MS.union(MA)
    Multiset({'m': 1, 'i': 4, 's': 4, 'p': 2, 'a': 2, 'c': 1, 'h': 1, 'u': 1, 'e': 1, 't': 2})

Prime factor example

Let a and b be two non-negative integers. The union of a‘s prime factorization with b‘s prime factorization is the prime factorization of the least common multiple of a and b.

For example, let a = 12 and b = 10.

lcm(2² × 3, 2 × 5) = 2² × 3 × 5.

Function perspective

If the function f represents the multiset A and g represents the multiset B then the function max(f, g) represents the union of A and B.

Intersection

To form the intersection of two multisets, intersect the underlying sets, and set the multiplicities to the minimum of the multiplicities in each of the original multisets.

Example with words

The intersection of

{m, i, s, s, i, s, s, i, p, p, i}

and

{m, a, s, s, a, c, h, u, s, e, t, t, s}

would be

{m, s, s, s, s}

Example with Python

Reusing MA and MS above, we have

    >>> MS.intersection(MA)
    Multiset({'m': 1, 's': 4})

Prime factorization

The intersection of the prime factorization of a and the prime factorization of b is the prime factorization of the greatest common factor of a and b.

For example,

gcd(2³ × 3 × 5³, 2 × 5² × 17) = 2 × 5²

Function perspective

If the function f represents the multiset A and g represents the multiset B then the function min(f, g) represents the intersection of A and B.

Inclusion

A multiset A is included in a multiset B if every element of A is an element of B with the same or greater multiplicity.

Example with words

The multiset of letters in the word kale is included in the multiset of letters in kalman filter.

The multiset of letters in apple is not included in the multiset of letters in pale rider. The set

{a, p, l, e}

is contained in the set

{p, a, l, e, r, i, d}

but the multiset

{a, p, p, l, e}

is not contained in the multiset

{p, a, l, e, e, r, r, i, d}

because ‘p’ has multiplicity 2 in the former but multiplicity 1 in the latter.

Python example

The multiset module overloads <= for inclusion.

    >>> Multiset("kale") <= Multiset("kalman filter")
    True
    >>> Multiset("apple") <= Multiset("pale rider")
    False

Prime factorization

The multiset of prime factors of a is included in the multiset of prime factors of b if and only if a divides b.

Function perspective

If the function f represents the multiset A and g represents the multiset B then A is included in B if and only if fg.

Sum

The union of two multisets may not be exactly what you expect. For example, maybe you expected the union of

{a, b, b}

with

{b, c}

to be

{a, b, b, b, c}

rather than

{a, b, b, c}.

There is an operation on multisets like this, but it’s called sum rather than union. In the sum of two multisets, multiplicities add.

Example with words

The sum of the multiset of letters in harry and potter is the multiset of letters in harrypotter.

Example with Python

    >>> Multiset("harry") + Multiset("potter")
    Multiset({'h': 1, 'a': 1, 'r': 3, 'y': 1, 'p': 1, 'o': 1, 't': 2, 'e': 1})

Note that ‘r’ has multiplicity 3 in the sum because it has multiplicity 2 in the first multiset and 1 in the second.

Prime factorization

The sum of the multiset of prime factors of a and the multiset of prime factors of b is the multiset of prime factors of ab.

Function perspective

If the function f represents the multiset A and g represents the multiset B then f + g represents the sum of A and B.

2 thoughts on “Four views of multisets

  1. Sticking with the letters rather than the prime factors: the multiset corresponding to a word is a bag of scrabble tiles sufficient to spell that word. The union of two words’ multisets is a bag of scrabble tiles sufficient to spell *either* word, but probably not both. (The union of two general multisets is a bag that can spell any word in either multiset.)

  2. Good analogy. With that analogy, the sum of two multisets is pouring the tiles from each bag into a single bag.

Comments are closed.