Arbitrary precision math in gawk

The idea of using awk for any math beyond basic arithmetic is kinda strange, and yet it has some nice features.

Awk was designed for file munging, a task it does well with compact syntax. GNU awk (gawk) supports the original minimalist version of awk and adds more features. It supports arbitrary precision arithmetic by wrapping the GMP and MPFR libraries. (When I refer to “awk” in this post I mean the features common to awk and gawk.)

If you just want extended precision, it would be easier to use bc. But if you’ve written an awk script and want to do some extended precision calculation in place rather than in a different tool, you could take advantage of gawk’s extension.

Here’s a one-liner to compute π to 100 decimal places using gawk.

    gawk -M -v PREC=333 'BEGIN {printf("%0.100f\n", 4*atan2(1,1))}'

The -M flag tells gawk to use arbitrary precision arithmetic.

NB: Your version of gawk may not support arbitrary precision. If so, running the code above will give the following warning:

    gawk: warning: -M ignored: MPFR/GMP support not compiled in

The -v flag tells gawk you’re about to set a variable, namely PREC variable for precision. You could set PREC in the body of your code instead.

Precision is measured in bits, not digits, and so for 100 digits we require 333 bits [1].

Awk is line-oriented. There’s an implicit loop around an awk program that loops over every line of input files. But you can specify code to run before the loop in a BEGIN block, and code to run after the loop in an END block. Here we just need the BEGIN block, no loop and no END.

We compute π as 4 times the arctangent of 1. Awk has an atan2 function rather than a atan function. The function atan(z) returns an angle between -π/2 and π/2 whose tangent is z. The function atan2(y, x) returns an angle between -π and π, in the same quadrant as x and y, whose tangent is y/x. The atan2 function is more general than atan since atan(z) equals atan2(z, 1). This is one advantage gawk has over bc, since bc has only an atan function (which it calls a).


[1] Where did 333 bits come from? It’s log2 10100, rounded up to the next integer. You could compute the logarithm in awk using

    awk 'BEGIN {print 100*log(10)/log(2)}'

Note that this uses awk; you don’t need gawk for this calculation. Awk doesn’t have a function to compute log base 2, only natural logs. So you use the fact that

log2(x) = log(x)/log(2).