Roman numeral puzzle

Posted on 14 January 2012 by John

I noticed an ad for Super Bowl XLVI on a pizza box this morning. The Roman numeral XLVI does not repeat any character. This brought up a couple questions.

How many Roman numerals are possible if you’re not allowed to repeat a character?
Could you write a (reasonably short) regular expression to find all such numbers?

You can post your solutions to either question in the comments.

There has never been universal agreement on the rules for constructing Roman numerals, so your solution would depend on your choice of rules. For our purposes here, assume the valid characters are I, V, X, L, C, D, and M. Also, assume any character can be subtracted from a larger character. For example, you can assume IL is a valid representation of 49.

For a more challenging problem, you can use the more restrictive subtraction rules.

I can be subtracted from V and X only.
X can be subtracted from L and C only.
C can be subtracted from D and M only.
V, L, and D can never be subtracted.

23 thoughts on “Roman numeral puzzle”

Max Lybbert

14 January 2012 at 10:54

I’ll need to play with the problem a little before answering. Right now I can only say that Perl’s extended version of regular expressions can include enough logic to match such numbers (in Perl it’s possible to modify the pattern on the fly based on what has already matched, which is why Perl’s regular expressions can determine if parentheses are properly balanced but classic regular expressions can’t).
Laura McLay

14 January 2012 at 12:08

This sounds like a fun little problem to work on while watching the Super Bowl (-:
John Bartholomew

14 January 2012 at 14:42

Not totally sure I’ve got this right, but here goes…

It uses the more restrictive rules, with the relaxation that it also generates the empty string.

romanNumerals = "(M? D? C? | M? CD | CM D?) ((L? X? | XL) (V? I? | IV) | L? IX V?) | M? D? XC L? (V? I? | IV)"
enumerateRegex = nub . sort . fst . go1 . filter (not . isSpace) where go1 = go [""] [""] go :: [String] -> [String] -> String -> ([String], String) go prefix atom ('(':xs) = let (a,b) = go1 xs in go (allPairs prefix atom) a b go prefix atom (')':xs) = (allPairs prefix atom, xs) go prefix atom ('|':xs) = let (a,b) = go1 xs in (allPairs prefix atom ++ a, b) go prefix atom ('?':xs) = go (allPairs prefix (atom ++ [""])) [""] xs go prefix atom ('?':xs) = go prefix atom xs go prefix atom ( c :xs) = go (allPairs prefix atom) [[c]] xs go prefix atom [] = (allPairs prefix atom, "") allPairs :: [String] -> [String] -> [String] allPairs a b = concatMap (x -> map (x ++) b) a

(I guess most of your readers will recognise the language, but for those that don’t, it’s Haskell)
John Bartholomew

14 January 2012 at 14:47

Sorry, bad formatting. You’ll have to copy to a text editor to see the layout properly and get rid of the emoticon. Also, piping through ‘sort’ was unnecessary, but I always forget how nub works and assume that it only removes equal and *adjacent* items.
Mr Being

14 January 2012 at 15:43

Could you write a (reasonably short) regular expression to find all such numbers?

No – regular expressions can’t count and therefore would not be able to determine if you repeated a character.

Josiah C.

14 January 2012 at 16:11

In Python:

def roman_numerals():
    import itertools
    values = list('IVXLCDM')
    index = dict((j,i) for i,j in enumerate(values))
    prefix = {0: (1,2), 2: (3,4), 4: (5,6)}
    for i in xrange(2, 7):
        for perm in itertools.permutations(values[:7], i):
            for j, v in enumerate(map(index.get, perm[:-1])):
                vn = index[perm[j+1]]
                if vn > v and vn not in prefix.get(v, ()):
                    break
            else:
                values.append(''.join(perm))
    return values

Josiah C.

14 January 2012 at 16:12

Bah! Better indentation here: http://pastebin.com/ZtB9eKxx
John

14 January 2012 at 16:25

Anyone want to venture a combinatorial argument, i.e. using math instead of writing a program to count the possibilities?

Josiah: I fixed your indentation problem by changing your code tag to a pre tag.

Mr Being: I know what you’re referring to, but regexes can count to 1. If it weren’t for the subtraction rule, you could use something like M?D?C?L?X?V?I?. And if you use more general regular expressions like the first comment alluded to, there are new possibilities. (Though these are not strictly regular expressions in the CS sense.)
mp

14 January 2012 at 19:12

at least

^([IVX])(?!.*1)([IVX])?(?!.*2)([IVX]?)$

checks for all combinations of I, V, X with no repeats.

|=
John Bartholomew

14 January 2012 at 19:31

Combinatorial:

Rules: restrictive subtraction rules. I’ll also add a rule that a
smaller value cannot appear before a larger value except as a subtraction.

The alphabet contains 7 characters, which are (in order from lowest-value
to highest-value): I, V, X, L, C, D, and M.

V, L, D and M can never be subtracted, but may be included or not.

I, X and C each have four possibilities: they can be skipped; they can
appear in their positive position, or they can appear in one of two negative
positions.

Those choices lead to 2**4 * 4**3 = 2**10 possibilities. Howeve, that assumes
all the choices are independent, which they are not. If ‘V’ does not appear,
then ‘I’ cannot be subtracted from it; if ‘X’ is subtracted from ‘L’ or ‘C’,
then ‘I’ cannot be subtracted from ‘X’, and so on.

So let’s start again. We can split the possibilities into four cases,
based on the number of subtractions in each one. Given N subtractions,
we have 2^(7-2N) possibilities. This comes from having 7 – 2N positive
value characters that can (each, independently) be chosen to appear or
not appear in the string. For the case of N = 0, we subtract 1 to remove
the possibility of the empty string.

0 subtractions: 2^7 – 1
1 subtraction: 6 * 2^5
2 subtractions: 8 * 2^3
(IV,XL ; IV,XC; IV,CD; IV,CM; IX,CD ; IX,CM; XL,CD ; XL,CM)
3 subtractions: 2 * 2
(IV,XL,CD ; IV,XL,CM)

Total possibilities: 2^7 + 6*2^5 + 8*2^3 + 4 – 1 = 287

This matches the number of results emitted by the code in my previous
comment, which gives me a lot of confidence that the result is correct.

I’ve put the code from my first comment (plus this combinatorial explanation) into a github gist:
https://gist.github.com/1613780
John Bartholomew

14 January 2012 at 19:35

My previous comment was ambiguous. Where I said “which gives me a lot of confidence that the result is correct,” I meant that working out that combinatorial analysis gave me confidence that my code was correct, not that my code gave me confidence that the maths was correct (though I suppose both are true to some extent).

It took several attempts to find an approach that was clear enough for me to be confident in it, so I’m glad to have done it: a fun exercise all round!
Pseudonym

14 January 2012 at 19:52

An observation: The problem is a lot easier if your regular expression language includes set intersection and difference. Regular languages are closed under these operators, so there’s really no excuse.
John Bartholomew

15 January 2012 at 03:45

So, a friend pointed out that the value I gave in comment #10 was incorrect:
2^7 + 6*2^5 + 8*2^3 + 4 – 1 = 387
Not 287 (just a typo when I wrote the result into the comment). The argument stands, just not the final number.
David Damerell

15 January 2012 at 04:57

For listing them all I rolled up http://pastebin.com/9zr5gcug which is a bit rough and ready (in particular it really ought to work from the right-hand end of $string). As a regexp I rolled http://pastebin.com/ziZ3qtsh because I happen just to have read about named backreferences.

Jacob

15 January 2012 at 05:16

I get 316, both mathematically and with an emacs lisp program to implement the math:

(defun rn-MDCLXVI () (rn-union (rn-cross '("MD" "M" "D" "") (rn-CLXVI)) (rn-cross '("MCD" "CM" "CD") (rn-LXVI))))
(defun rn-CLXVI () (rn-union (rn-cross '("CL" "C" "L" "") (rn-XVI)) (rn-cross '("CXL" "XC" "XL") (rn-VI))))
(defun rn-LXVI () (rn-union (rn-cross '("L" "") (rn-XVI)) (rn-cross '("XL") (rn-VI))))
(defun rn-XVI () (rn-union (rn-cross '("XV" "X" "V" "") (rn-I)) (rn-cross '("XIV" "IX" "IV") (rn-))))
(defun rn-VI () (rn-union (rn-cross '("V" "") (rn-I)) '("IV")))
(defun rn-I () '("I" ""))
(defun rn- () '(""))
(defun rn-union (left right)
  (let ((result right))
    (dolist (left-elt left)
      (if (member left-elt result)
          (princ (format "duplicate: %sn" left-elt))
          (push left-elt result)))
    result))
(defun rn-cross (left right)
  (let ((result '()))
    (dolist (left-elt left)
      (dolist (right-elt right)
        (push (concat left-elt right-elt) result)))
    result))
(1- (length (rn-MDCLXVI)))

Anybody see a mistake?

Jacob

15 January 2012 at 05:29

Ah. I think John Bartholomew is counting invalid constructs like “IXV”. At least I think they are invalid. One could argue that the restriction on subtraction was ambiguous.
John

15 January 2012 at 07:57

I agree that IXV is invalid. I haven’t checked whether John Bartholomew is including it.

By the way, the reason I specified a reasonably short regular expression is this: there are a finite number of possibilities, so one could create a regular expression by brute force simply by listing them all: I|IV|V|IX| … A short regular expression that matches a little too much or not enough is more interesting than a correct expression created by brute force.
Mark M

15 January 2012 at 08:05

John – The line:
I noticed an ad for Super Bowl XLVI on a pizza box this morning.
conjures a picture of you awakening in the middle of the opening scene from ‘The Bachelor Party’. :-)
John Bartholomew

15 January 2012 at 14:03

I am including IXV, since that restriction was not mentioned in the post at all: ‘I’ can only be subtracted from ‘X’ or ‘V’ does not in itself imply that ‘IX’ cannot appear with ‘V’ — that restriction comes purely external knowledge of common Roman numeral writing rules.

I agree that excluding such constructs is a sensible rule though, and produces a “nicer” set of valid numerals.
John Bartholomew

15 January 2012 at 14:19

@Jacob I agree with your result of 316.

I have updated my github gist with the adjusted regular expression and combinatorial solution for excluding CMD, XCL and IXV.
Huseyin YIlmaz

17 January 2012 at 06:56

My solution in python is here https://gist.github.com/1626535 . It is not the best solution but still good enough to share.
John Venier

17 January 2012 at 12:43

I always thought it was interesting that the Number of the Beast, 666, in Roman numerals is DCLXVI.
Vince Ypma

4 June 2015 at 02:41

function ConvertTo-RomanNumeral
{

[CmdletBinding()]
[OutputType([string])]
Param
(
[Parameter(Mandatory=$true,
HelpMessage=”Enter an integer in the range 1 to 3,999″,
ValueFromPipeline=$true,
Position=0)]
[ValidateRange(1,3999)]
[int]
$Number
)

Begin
{
$DecimalToRoman = @{
Thousands = “”,”M”,”MM”,”MMM”
Hundreds = “”,”C”,”CC”,”CCC”,”CD”,”D”,”DC”,”DCC”,”DCCC”,”CM”
Tens = “”,”X”,”XX”,”XXX”,”XL”,”L”,”LX”,”LXX”,”LXXX”,”XC”
Ones = “”,”I”,”II”,”III”,”IV”,”V”,”VI”,”VII”,”VIII”,”IX”
}

$column = @{
Thousands = 0
Hundreds = 1
Tens = 2
Ones = 3
}
}
Process
{
[int[]]$digits = $Number.ToString().PadLeft(4,”0″).ToCharArray() |
ForEach-Object { [Char]::GetNumericValue($_) }

$RomanNumeral = “”
$RomanNumeral += $DecimalToRoman.Thousands[$digits[$column.Thousands]]
$RomanNumeral += $DecimalToRoman.Hundreds[$digits[$column.Hundreds]]
$RomanNumeral += $DecimalToRoman.Tens[$digits[$column.Tens]]
$RomanNumeral += $DecimalToRoman.Ones[$digits[$column.Ones]]

$RomanNumeral
}
End
{
}
}

1..3999 |
ForEach-Object { ConvertTo-RomanNumeral $_} |
ForEach-Object { if ($_ -notmatch ‘^.*(.).*\1.*$’) { $_ } }

Comments are closed.

Other puzzle posts

23 thoughts on “Roman numeral puzzle”