Divisibility rules in hex

We all learn in elementary school that a number is divisible by 2 if the last digit is even. A number is divisible by 3 if the sum of the digits is divisible by 3. A number is divisible by 5 if its last digit is 0 or 5. Etc.

But imagine we were born with 8 fingers on each hand and did arithmetic in base 16, also called hexadecimal. Or suppose schools decided to teach children hexadecimal instead of decimal to give them a head start in computer science. We would count 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, …

What would divisibility rules look like? Here are the simplest rules.

  • n is divisible by 2 if its last digit is even.
  • n is divisible by 3 if its digit sum is divisible by 3.
  • n is divisible by 4 if its last digit is 0, 4, 8, or C.
  • n is divisible by 5 if its digit sum is divisible by 5.
  • n is divisible by 6 if it is divisible by 2 and 3.
  • n is divisible by 8 if it ends in 0 or 8.
  • n is divisible by A if it is divisible by 2 and 5.
  • n is divisible by F if its digit sum is divisible by F.
  • n is divisible by 10 (i.e. sixteen) if its last digit is 0.

In this list “digit” means hexadecimal digit.
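
These rules are easy to check by brute force. Here’s a quick Python sketch (hex_digit_sum is my own helper name) verifying the digit-sum rules for 3, 5, and F, which work because 16 leaves a remainder of 1 when divided by each of those numbers:

    def hex_digit_sum(n):
        # Sum of the hexadecimal digits of n.
        return sum(int(digit, 16) for digit in format(n, "x"))

    # 16 = 15 + 1, so 16 leaves remainder 1 mod 3, 5, and 15, and hence
    # a number and its hex digit sum leave the same remainder mod each.
    for divisor in [3, 5, 15]:
        assert all((n % divisor == 0) == (hex_digit_sum(n) % divisor == 0)
                   for n in range(1, 10**5))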

There are also rules for checking divisibility by 7 and 11 (i.e. seventeen) in hexadecimal analogous to the rules for divisibility by 7 and 11 in decimal.

To test divisibility by 7, split off the last digit from the rest of the number. Triple the last digit and subtract it from what’s left of the original number. Repeat this process until you get something you recognize as being divisible by 7 or not.

For example, suppose you start with F61. Split this into F6 and 1. Subtract 3 from F6 to get F3. Now split F3 into F and 3. Subtract 9 from F to get 6. F61 is not divisible by 7 because 6 is not divisible by 7. The explanation for how this works is completely analogous to the corresponding rule in base 10.
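
Here’s the same process as a Python sketch (step is my own name for one round of the rule):

    # One step: write n = 16a + b and replace n with a - 3b.
    # This preserves divisibility by 7 because 16(a - 3b) = n - 49b.
    def step(n):
        a, b = divmod(n, 16)
        return a - 3*b

    n = 0xF61
    while abs(n) >= 16:
        n = step(n)       # 0xF61 -> 0xF3 -> 0x6
    print(n % 7 == 0)     # False: F61 is not divisible by 7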

You can test for divisibility by 11 in base 16 (i.e. by seventeen) just as you’d test divisibility by 11 in base 10 (or any other base). Add up the digits in odd positions and subtract the sum of the digits in even positions. For example, suppose you start with 135B8. The digits in odd positions sum to 1 + 5 + 8. The digits in even positions sum to 3 + B. Both sums are equal, so their difference is 0, which is divisible by 11, so 135B8 is divisible by 11.
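
In code, the check might look like this (alternating_hex_digit_sum is my own name; the rule works because 16 is one less than 17):

    def alternating_hex_digit_sum(n):
        # Alternating sum of hex digits, starting from the least significant.
        s, sign = 0, 1
        while n > 0:
            n, digit = divmod(n, 16)
            s += sign*digit
            sign = -sign
        return s

    print(alternating_hex_digit_sum(0x135B8) % 17 == 0)  # True: divisible by 11 (hex)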

There’s nothing special about base 16 or base 10. You could come up with analogous divisibility rules in any base.

Divisibility by 7

How can you tell whether a number is divisible by 7? Almost everyone knows how to easily tell whether a number is divisible by 2, 3, 5, or 9. Fewer people know tricks for testing divisibility by 4, 6, 8, or 11. But not many people have ever seen a trick for testing divisibility by 7.

Here’s the trick. Remove the last digit from the number, double it, and subtract it from the first part of the number. Do this repeatedly until you get something you recognize as being divisible by 7 or not.

For example, start with 432. Split into 43 and 2. Subtract 4 from 43 to get 39. Since 39 isn’t divisible by 7, neither is 432.

For another example, start with 8631. Split into 863 and 1. Subtract 2 from 863 to get 861.

Now split 861 into 86 and 1. Subtract 2 from 86 to get 84. Maybe you recognize 84 as a multiple of 7. If not, double 4 and subtract from 8 to get 0, which is divisible by 7. Either way, we conclude that 8631 is divisible by 7.
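
Here’s the trick as a Python sketch (step is my own name for one round of the process):

    # One step: write n = 10a + b and replace n with a - 2b.
    def step(n):
        a, b = divmod(n, 10)
        return a - 2*b

    n = 8631
    while abs(n) >= 10:
        n = step(n)       # 8631 -> 861 -> 84 -> 0
    print(n % 7 == 0)     # True: 8631 is divisible by 7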

Why does this work? Let b be the last digit of a number n and let a be the number we get when we split off b. That says n = 10a + b. Now n is divisible by 7 if and only if n – 21b is divisible by 7, since 21b is a multiple of 7. But n – 21b = 10(a – 2b), and since 10 and 7 have no factor in common, this is divisible by 7 if and only if a – 2b is divisible by 7.

What about the remainder when you divide a number by 7? Here’s where the rule for 7 differs from the more familiar divisibility rules. For example, a number is divisible by 3 if its digit sum is divisible by 3, and furthermore the remainder when a number is divided by 3 is the remainder when its digit sum is divided by 3. But the divisibility rule for 7 does not give the remainder when a number is divided by 7. For a simple example, the rule applied to 31 terminates in 1, but the remainder when 31 is divided by 7 is 3.

Why doesn’t the divisibility rule for 7 give the remainder? It is true that 10a + b and (10a + b) – 21b have the same remainder when divided by 7. But we then factored this into 10(a – 2b). It’s true that 10(a – 2b) is divisible by 7 if and only if a – 2b is divisible by 7, but if neither is divisible by 7 then they will leave different remainders.

Why a computer buffer is called a buffer

Why is a chunk of working memory called a “buffer”?

The word ‘buffer’, by the way, comes from the meaning of the word as a cushion that deadens the force of a collision. In early computers, a buffer cushioned the interaction between files and the computer’s central processing unit. The drums or tapes that held a file and the central processing unit were pieces of equipment that were very different from each other, working at their own speeds, in spurts. The buffer made it possible for them to work together effectively. Eventually, the buffer grew from being an intermediary, a temporary holding place, to being the place where work is done. This transformation is rather like that of a small seaport that grew into a great city: once it was merely the place where cargo was warehoused temporarily before being loaded onto ships; then it became a business and cultural center in its own right.

From An Introduction to Programming in Emacs Lisp. The same book explains two meanings of “argument”.

Approximations with Pythagorean triangles

A few days ago someone asked the following on math.stackexchange:

Is it possible to get arbitrarily near any acute angle with Pythagorean triangles?

A Pythagorean triangle is a right triangle with all integer sides. The sides of such a triangle form a Pythagorean triple. The most famous example is a 3-4-5 triangle. A little less well known example is 5-12-13. Since not many Pythagorean triples come to mind, you might think there aren’t that many of them. Or maybe they’re unevenly distributed so that you can approximate some angles as well as you’d like but there’s a limit to how well you could approximate others.

It turns out you can indeed approximate any acute angle with that of a Pythagorean triangle. The answer on math.stackexchange explains why this is possible and here I give an algorithm. I wrote a little online calculator for finding Pythagorean triangle approximations based on the algorithm below.

A result going all the way back to Euclid says that (a, b, c) = (m² – n², 2mn, m² + n²) is a Pythagorean triple whenever m and n are positive integers with m > n, and that every primitive Pythagorean triple has this form. The tangent of the corresponding angle is 2mn/(m² – n²) = 2t/(1 – t²) where t = n/m. Given an acute angle θ, we need to find t such that tan(θ) = 2t/(1 – t²) and then find a rational approximation to t.

We can solve for t using the quadratic formula: rearranging tan(θ) = 2t/(1 – t²) gives tan(θ) t² + 2t – tan(θ) = 0, and taking the positive root gives t = (1 – cos(θ))/sin(θ). We can then find a rational approximation to t using Farey’s algorithm.

We’d like to not use larger numbers than necessary in our Pythagorean triangles. Suppose we don’t want the hypotenuse to be much more than H. If t ≈ n/m and m² + n² = H, then (1 + t²)m² ≈ H. Therefore we’d like to look for rational approximations to t with denominator m no more than √(H/(1 + t²)). This is approximate, however, since t is only approximately equal to n/m.

If we use the Python function farey from a previous post, the code for finding a Pythagorean approximation to an acute angle θ is simply the following.

    from math import sin, cos, sqrt

    def pythagorean_triple(acute_angle, max_hypotenuse):
        # farey() is defined in the "Best rational approximation" post below.
        t = (1 - cos(acute_angle))/sin(acute_angle)
        max_denominator = sqrt(max_hypotenuse/(1 + t*t))
        n, m = farey(t, max_denominator)
        return m*m - n*n, 2*m*n, m*m + n*n

If the angle is small and the maximum hypotenuse is too small, the best rational approximation to t might be zero. In that case the algorithm returns a degenerate triangle, i.e. one of the sides has zero length. You could increase the maximum hypotenuse and try again until you get a valid triangle.

Update: The procedure above will return a Pythagorean triple, but not necessarily a primitive Pythagorean triple, i.e. the three numbers may have a common factor as Douglas pointed out in the comments. The triple is only primitive if m and n have opposite parity. You could guarantee a primitive triple by dividing by the greatest common divisor of the first two terms. (Anything that divides a and b must divide c.)

Here’s revised code that returns only primitive triples.

    def gcd(a, b):
        # Euclidean algorithm.
        if b == 0:
            return a
        else:
            return gcd(b, a % b)

    def pythagorean_triple(acute_angle, max_hypotenuse):
        t = (1 - cos(acute_angle))/sin(acute_angle)
        max_denominator = sqrt(max_hypotenuse/(1 + t*t))
        n, m = farey(t, max_denominator)
        a = m*m - n*n
        b = 2*m*n
        c = m*m + n*n
        g = gcd(a, b)
        # Integer division keeps the sides integers.
        return a//g, b//g, c//g
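
As a quick sanity check, here’s what the function returns for a 30° angle with hypotenuse at most 100, assuming the farey function from the “Best rational approximation” post below is in scope:

    from math import atan2, degrees, pi

    print(pythagorean_triple(pi/6, 100))   # (45, 28, 53)
    print(degrees(atan2(28, 45)))          # about 31.9 degrees, vs. the 30 requested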

Related post: What’s so hard about finding a hypotenuse?

Paleolithic nonsense

The “paleolithic diet” has gotten more press lately. Paleolithic diet advocates say our ancestors lived in a state of gastronomic innocence, eating mostly meat, before they were seduced by agriculture and fell into eating grain. Some go further and say that not only should we eat like cavemen, we should live like cavemen as well. For example, we should have random bursts of exercise, as if fleeing a saber-toothed tiger, followed by long periods of leisure.

I am amused by how much some people believe they know about paleolithic life. Most of us don’t know that much about how our great grandparents lived, and yet others make confident detailed claims about the lifestyles of pre-historic ancestors. This is convenient since their claims are unlikely to be proven wrong, given how little we know or are ever likely to know about ancient lifestyles. Of course those making the boldest claims are not scientists but popularizers who take a hint from the scientists and run with it.

I have no opinion on the actual recommendations of the fans of paleolithic culture. Maybe we would be better off eating more meat or having random bursts of intense exercise; I have no idea. However, I object to the pseudo-scientific rhetoric used to support the recommendations. I also object to the implicit assumption that it would necessarily be good to emulate the lives of paleolithic humans even if we did know how they lived.

Even the little we think we know about ancient cuisine should be called into question. A paper entitled Thirty thousand-year-old evidence of plant food processing was posted online this week which suggests people were making flour 10,000 years earlier than previously thought. Of course this doesn’t mean that people all over the world were living on pasta, but it does underscore how little we know about the real paleolithic diet.

Best rational approximation

Suppose you have a number x between 0 and 1. You want to find a rational approximation for x, but you only want to consider fractions with denominators below a given limit.

For example, suppose x = 1/e = 0.367879…  Rational approximations with powers of 10 in the denominator are trivial to find: 3/10, 36/100, 367/1000, etc. But say you’re willing to have a denominator as large as 10. Could you do better than 3/10? Yes, 3/8 = 0.375 is a better approximation. What about denominators no larger than 100? Then 32/87 = 0.36781… is the best choice, much better than 36/100.

How do you find the best approximations? You could do a brute force search. For example, if the maximum denominator size is N, you could try all fractions with denominators less than or equal to N. But there’s a much more efficient algorithm. The algorithm is related to the Farey sequence named after John Farey, though I don’t know whether he invented the algorithm.

The idea is to start with two fractions, a/b = 0/1 and c/d = 1/1. We update either a/b or c/d at each step so that a/b will be the best lower bound of x with denominator no bigger than b, and c/d will be the best upper bound with denominator no bigger than d. At each step we do a sort of binary search by introducing the mediant of the upper and lower bounds. The mediant of a/b and c/d is the fraction (a+c)/(b+d) which always lies between a/b and c/d.

Here is an implementation of the algorithm in Python. The code takes a number x between 0 and 1 and a maximum denominator size N. It returns the numerator and denominator of the best rational approximation to x using denominators no larger than N.

def farey(x, N):
    # Maintain a lower bound a/b and an upper bound c/d on x, refining
    # with the mediant (a+c)/(b+d) as long as its denominator stays within N.
    a, b = 0, 1
    c, d = 1, 1
    while b + d <= N:
        mediant = float(a+c)/(b+d)
        if x == mediant:
            return a+c, b+d
        elif x > mediant:
            a, b = a+c, b+d
        else:
            c, d = a+c, b+d

    # Both bounds now have denominators no larger than N; return the closer one.
    if x - float(a)/b < float(c)/d - x:
        return a, b
    else:
        return c, d

In Python 3, the calls to float could be removed since the division operator does floating point division of integers.
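
For example, with x = 1/e this reproduces the approximations mentioned above:

print(farey(0.367879441, 10))    # (3, 8)
print(farey(0.367879441, 100))   # (32, 87)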

Read more about rational approximation in Breastfeeding, the golden ratio, and rational approximation.

Here’s an example of a situation in which you might need rational approximations. Suppose you’re designing an experiment which will randomize subjects between two treatments A and B. You want to randomize in blocks of size no larger than N and you want the probability of assigning treatment A to be p. You could find the best rational approximation a/b to p with denominator b no larger than N and use the denominator as the block size. Each block would be a permutation of a A’s and b – a B’s.
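
Here’s a sketch of that scheme (randomized_block is a hypothetical helper built on the farey function above):

import random

def randomized_block(p, N):
    # Find the best approximation a/b to p with denominator at most N,
    # then return one block: a random permutation of a A's and b - a B's.
    a, b = farey(p, N)
    block = ["A"]*a + ["B"]*(b - a)
    random.shuffle(block)
    return block

print(randomized_block(0.3, 8))   # a permutation of 2 A's and 5 B's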

Update: Eugene Wallingford wrote a blog post about implementing the algorithm in Klein, a very restricted language used for teaching compiler writing.


Good old regular expressions

Here are two examples that persuaded me long ago that regular expressions could be powerful. Both come from The Unix Programming Environment by Kernighan and Pike (1984).

The first problem is to produce a list of all English words that contain all five vowels exactly once and in alphabetical order.

The book creates a regular expression named alphavowels

^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$

then uses it to filter a dictionary file

egrep -f alphavowels /usr/dict/web2

This produced 16 words ranging from abstemious to majestious.

The second problem is to produce a list of all English words of at least six letters with letters appearing in increasing alphabetical order.

The book creates a regular expression named monotonic

^a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?$

then uses it to filter a dictionary file as before, except there is an additional filter stage.

egrep -f monotonic /usr/dict/web2 | grep '......'

This produced 17 words including words such as almost and ghosty. Some of the more interesting results were bijoux, chintz, and egilops. Kernighan and Pike explain that egilops is a disease that attacks wheat.

Explanation

The regular expressions above are fairly long, but shorter and more transparent than a procedural program to solve the same problem. The solutions may look mysterious at first sight, but they are entirely straightforward once you know the most basic features of regular expressions.

In the first problem, the pattern [^aeiou] says to look for anything that isn’t a vowel, i.e. is a consonant (assuming entries in the dictionary file contain only letters). So the regular expression says to start at the beginning of each line and look for zero or more consonants, followed by an ‘a’, followed by zero or more consonants, followed by an ‘e’, and so on down to a ‘u’ optionally followed by consonants at the end of the line.

In the second problem, the question mark matches zero or one instances of the preceding character, i.e. the character is optional. The regular expression says to start at the beginning of each line, look for an optional ‘a’, followed by an optional ‘b’, and so forth to the end of the line. Then the output is filtered by another regular expression, ‘......’. Since a period matches any character, a sequence of six periods says to select only words that contain at least six characters.
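
If you’d rather experiment in Python, the same filters are easy to reproduce. (The word list location varies by system; /usr/share/dict/words is a common one.)

import re

alphavowels = re.compile(r"^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$")
monotonic = re.compile(r"^a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?$")

with open("/usr/share/dict/words") as f:
    words = f.read().split()

print([w for w in words if alphavowels.match(w)])
print([w for w in words if monotonic.match(w) and len(w) >= 6])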

Related: Tips for learning regular expressions

Buggy code is biased code

As bad as corporate software may be, academic software is usually worse. I’ve worked in industry and in academia and have seen first-hand how much lower the quality bar is in academia. And I’m not the only one who has noticed this.

Why does this matter? Because buggy code is biased code. Bugs that cause the software to give unwanted results are more likely to be noticed and fixed. Bugs that cause software to produce expected results are more likely to remain in place.

If your software simulates some complex phenomena, you don’t know what it’s supposed to do; that’s why you’re simulating. Errors are easier to spot in consumer software. A climate model needs a higher level of quality assurance than a word processor because bugs in the latter are more obvious. Genomic analysis may contain egregious errors and no one will ever know, but a bug in an MP3 player is sure to annoy users.

You have to test simulation software carefully. You have to test special cases and individual components to have any confidence in the final output. You can’t just look at the results and say “Yeah, that’s about what I expected.”
