Formatting numbers at the command line

The utility numfmt, part of Gnu Coreutils, formats numbers. The main uses are grouping digits and converting to and from unit suffixes like k for kilo and M for mega. This is somewhat useful for individual invocations, but like most command line utilities, the real value is using it as part of pipeline.

The --grouping option will separate digits according to the rules of your locale. So on my computer

    numfmt --grouping 123456789

returns 123,456,789. On a French computer, it would return 123.456.789 because the French use commas as decimal separators and use periods to group digits [1].

You can also use numfmt to convert between ordinary numbers and numbers with units. Unfortunately, there’s some ambiguity regarding what units like kilo and mega mean. A kilogram is 1,000 grams, but a kilobyte is 210 = 1,024 bytes. (Units like kibi and mebi were introduced to remove this ambiguity, but the previous usage is firmly established.)

If you want to convert 2M to an ordinary number, you have to specify whether you mean 2 × 106 or 2 × 220. For the former, use --from=si (for Système international d’unités) and for the latter use --from=iec (for International Electrotechnical Commission).

    $ numfmt --from=si 2M
    2000000
    $ numfmt --from=iec 2M
    2097152 

One possible gotcha is that the symbol for kilo is capital K rather than lower case k; all units from kilo to Yotta use a capital letter. Another is that there must be no space between the numerals and the suffix, e.g. 2G is legal but 2 G is not.

You can use Ki for kibi, Mi for mebi etc. if you use --from=iec-i.

    $ numfmt --from=iec-i 2Gi  
    2147483648   

To convert from ordinary numbers to numbers with units use the --to option.

    $ numfmt --to=iec 1000000 
    977K  

Related links

[1] I gave a presentation in France years ago. Much to my chagrin, the software I was demonstrating had a hard-coded assumption that the decimal separator was a period. I had tested the software on a French version of Windows, but had only entered integers and so I didn’t catch the problem.

To make matters worse, there was redundant input validation. Entering 3.14 rather than 3,14 would satisfy the code my team wrote, but the input was first validated by a Windows API which rejected 3.14 as invalid in the user’s locale.

4 thoughts on “Formatting numbers at the command line

Comments are closed.