How do you efficiently compute *a ^{n}* for integer

*n*? You could multiply

*a**

*a**

*a**…*

*a*,

*n*-1 multiplications, but there are much more efficient ways. For example,

*a*

^{8}could be computed by ((

*a*

^{2})

^{2})

^{2}in three multiplications. This example suggests it may be worthwhile to think about the binary representation of the exponent.

Here’s an algorithm. Write the exponent n in binary. Read the binary representation from left to right, starting with the *second* bit from the left. Start with the number *a*, and every time you read a 0 bit, square what you’ve got. Every time you read a 1 bit, square what you’ve got and multiply by *a*. It follows that *a ^{n}* can be computed using no more than 2 log

_{2}(

*n*) multiplications.

As an example, suppose we want to calculate 7^{13}. In binary, 13 is written as 1101. We skip over the leading 1 then read 1, 0, and 1. So we calculate ((((7^{2})7)^{2})^{2})7. In C notation, we would compute as follows.

x = 7; x *= x; x *= 7; x *= x; x *= x; x *= 7;

In number theory calculations, such as arise in cryptography, it’s often necessary to compute *a ^{n}* (mod

*m*) for very large integers

*a*,

*n*, and

*m*. These numbers may be hundreds or thousands of digits long and so of course the calculations cannot be carried out using ordinary machine integers. You could compute

*a*first then reduce the result (mod

^{n}*m*), but it would be more efficient to reduce (mod

*m*) after every multiplication. This puts an upper bound on the size of numbers (and hence the amount of memory) needed for the calculation.

The same algorithm works when *a* is not an integer but a matrix. For example, consider a large matrix *M* that gives the connections between nodes in a graph. *M* contains a 1 in position (*i*, *j*) if the *i*^{th} and *j*^{th} nodes are connected by an edge and contains a 0 otherwise. The *n*^{th} power of *M* tells which nodes are connected by paths of length less than or equal to *n*. If *M* is large, we don’t want to do unnecessary multiplications in computing *M ^{n}*.

For some exponents, the method given above can be improved. The smallest exponent for which you can improve on the binary algorithm is 15. Using the binary algorithm, computing *a*^{15} requires six multiplications. But you could compute the same quantity in five multiplications as follows.

b = a*a*a; c = b*b; c *= c; c *= b;

For more examples, see Seminumerical Algorithms section 4.6.3.

I ran across this package (Pari/GP) which has very large integer capabilities. You’ve probably heard of it or used it, but I found it fascinating. It seems to be well maintained. I used it for testing some encryption algorithms some time ago. I know it’s a little off topic, but some might find it interesting.

http://pari.math.u-bordeaux.fr/

I also want to point out that the newest “math coprocessor” architectures built into dual core Intel CPU’s use some fascinating optimizations for large integer arithmetic, similar to what you describe in your post.

Hmmmm… I had always thought that the standard procedure to calculate arbitrary power of a matrix was>a href=”http://mathworld.wolfram.com/MatrixDiagonalization.html” title=”diagonalization”>. That is, if you can write [A] as [B^-1][D][B], then [A^n] reduces to [B^-1][D^n][B], since all other [B^-1]s and [B]s cancel out. But of course diagonalizing a large matrix is equivalent to finding all roots of a very large polynomial… For very large exponents diagonalization has to be better, but do you have any hint of at what point both approaches break even?

By the way, you’ve got a great blog, John…

@Jaime —

If it is an integer matrix, diagonalization might not be the best idea.

For example, the powers of the following matrix generate the Fibonacci numbers:

(1 1)

(1 0)

Using the “repeated squaring” technique, this matrix lets you compute the Nth Fibonacci number in logarithmic time using only integer arithmetic.

If you diagonalize this matrix, you will find the golden ratio and its reciprocal along the diagonal. Since those are irrational, they are not precisely representable on your computer. (Well, not with the usual floating-point representations, anyway…) That will lead to rounding errors when you exponentiate.

GNU MP does the job!

Cool, thanks for the concise explanation. I’ve been using php’s bcmath functions, but the bcpow function doesn’t seem to perform fast exponentiation with large exponents – so I’m having to write my own version using the technique you describe.

Nice post. 🙂 Oh, by the way, how about the complexity of this fast exponentiation algorithm? Thank you.

in your connections example, (since all the elements of M are 1 or 0), instead of doing a matrix multiply aproach to exponentiaton (i.e. repeated inner products M +.× M aka “plus dot times” ) couldn’t you do the presumably simpler to calculate logicals “xor” and “and” and compute repeated ( M “xor”.”and” M aka “xor dot and” )

oops, my prior post shouldn’tuse an exclusive xor-function-operator, but an inclusive “either or both” sort of or-function-operator

Back in the 70s, I had a summer job programming in RPG2, on an IBM System/34 which was about the speed of the later Apple II. Some of the calculations I was doing needed high-degree exponentiation (e.g. 360th power for a 30-year mortgage amortization), so the successive-squaring method made a huge difference in performance.