Perl as a better …

Today I ran across Minimal Perl: For UNIX and Linux People. The book was published a few years ago but I hadn’t heard of it because I haven’t kept up with the Perl world. The following chapters from the table of contents jumped out at me because I’ve been doing a fair amount of awk and sed lately:

3. Perl as a (better) grep command
4. Perl as a (better) sed command
5. Perl as a (better) awk command
6. Perl as a (better) find command

These chapters can be read a couple of ways. The most obvious reading would be “Learn a few features of Perl and use it as a replacement for a handful of separate tools.”

But if you find these tools familiar and are not looking to replace them, you could read the book as saying “Here’s an introduction to Perl that teaches you the language by comparing it to things you already know well.”

The book suggests learning one tool instead of several, and in the bargain getting more powerful features, such as more expressive pattern matching. It also suggests not necessarily committing to learn the entire enormous Perl language, and not necessarily committing to use Perl for every programming task.

Regarding Perl’s pattern matching, I could relate to the following quip from the book.

What’s the only thing worse than not having a particular metacharacter … in a pattern-matching utility? Thinking you do, when you don’t! Unfortunately, that’s a common problem when using Unix utilities for pattern matching.

That was my experience just yesterday. I wrote a regular expression containing \d for a digit and couldn’t understand why it wasn’t matching.

Most of the examples rely on giving Perl command line options such as -e so that it acts more like a command line utility. The book gives numerous examples of carrying out common tasks both with grep and similar tools and with Perl one-liners. The latter tend to be a little more verbose. If a task falls in the sweet spot of a common tool, that tool’s syntax will be more succinct. But when a task falls outside that sweet spot, such as matching a pattern that cannot be easily expressed with traditional regular expressions, the Perl solution will be shorter.

More specifics

This is an update, written March 3, 2021.

If you’re going to use Perl as a replacement for command line tools, you’ll need to know about one-liners and quoting.

Here is a post that covers Perl as a better grep.

If your main use for sed is to run commands like s/foo/bar/g, you can do this in Perl with

    perl -ple 's/foo/bar/g'

I talk more about using Perl to replace sed here.

If you want to use Perl as a replacement for awk, the main thing you need to know about is the -a option, used together with -n or -p. This splits each input line into an array @F whose elements correspond to $1, $2, $3, etc. in awk. Note however that Perl arrays are indexed from 0, so $F[0] corresponds to $1 etc. A few more correspondences between the languages are given in the table below.

    | awk | perl  |
    | $0  | $_    |
    | $2  | $F[1] |
    | RS  | $/    |
    | ORS | $\    |
    | OFS | $,    |

Perl can have BEGIN and END blocks just like awk.

You can set the field separator in Perl with -F, such as -F: to make the field separator a colon. In Perl 5.20 and later you don’t have to specify -a if you specify -F; it figures that if you’re setting the field separator, you must want an array of fields to play with.

Awk one-liners

Peteris Krumins has written a fine little book Awk One-Liners Explained. It’s just 58 pages, and it’s an easy read.

As I commented here, I typically try to master the languages I use. But for some languages, like awk and sed, it makes sense to learn just a small, powerful subset. (The larger a language is, the harder it can be to just learn part of it because the features intertwine.) Krumins’ book would be good for someone looking to learn just a little awk rather than wanting to explore every dark corner of the language.

Awk One-Liners Explained is exactly what the title would lead you to expect. It has 70 awk one-liners along with a commentary on each. Some of the one-liners solve common specific problems, such as converting between Windows and Unix line endings. Most of the one-liners are solutions to general types of problems rather than code anyone is likely to run verbatim. For example, one of the one-liners is

Change “scarlet” or “ruby” or “puce” to “red.”

I doubt anybody has ever had to solve that exact problem, but it’s not hard to imagine wanting to do something similar.
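A one-liner for the color task might look like the sketch below. This is my reconstruction using awk's gsub function on made-up input, not necessarily the exact text from the book.

```shell
# Made-up sample input.
input='a scarlet coat
ruby slippers
blue sky'

# gsub replaces every match of the regex in the current line;
# the bare print then outputs the (possibly modified) line.
awk_out=$(printf '%s\n' "$input" |
  awk '{ gsub(/scarlet|ruby|puce/, "red"); print }')

printf '%s\n' "$awk_out"
```

Swap in a different alternation and replacement string and the same shape handles the "something similar" cases.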

Because the book is entirely about one-line programs, it doesn’t cover how to write complex programs in awk. That’s perfect for me. If something takes more than one line of awk, I probably don’t want to use awk. I use awk for quick file filtering. If a task requires writing several lines of code, I’d use Python.
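By quick file filtering I mean things like the following sketch, which keeps only the rows whose second column exceeds a threshold. The data and the threshold of 100 are made up for illustration.

```shell
# Made-up sample input: a name and a number per line.
log='alice 120
bob 45
carol 300'

# A pattern with no action prints every line where the pattern is true.
big=$(printf '%s\n' "$log" | awk '$2 > 100')

# Add an action to print just the first field of the matching lines.
names=$(printf '%s\n' "$log" | awk '$2 > 100 { print $1 }')

printf '%s\n' "$big" "$names"
```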

You can get an idea of the style of the book by reading the author’s blog post Famous Awk One-Liners Explained, Part I: File Spacing, Numbering and Calculations.

* * *

If you’d like to learn the basics of sed and awk by receiving one tip per day, you can follow @SedAwkTip on Twitter.

A little Awk

Greg Grothaus posted an article today entitled Why you should learn just a little Awk. The article recommends taking a few minutes to learn only the most basic parts of the Awk language. I find this interesting for several reasons.

First, it is impressive what you can accomplish with just a few keystrokes in Awk. The language was designed for file munging and it does this very well. Many people, myself included, think of Perl as a language for file munging. And so it is, but I remember reading something from Larry Wall, creator of Perl, saying that he uses Awk for some tasks.

Second, Grothaus isn’t encouraging people to master the language. He’s saying to just learn a handful of features, at least to start. That goes against my grain. When I learn a language, I want to learn it thoroughly. On the other hand, I don’t have the time or energy lately to learn a new language on top of everything else I have going on. But I think Grothaus has a good point: if you just take a few minutes to learn only how to do several very specific tasks, it could be worth it.

Finally, I found it interesting to read a blog post about a language I haven’t touched in well over a decade. I used Awk in grad school for a little while, and was quite impressed with it. But someone suggested that Perl was similar but even better and I dropped Awk for Perl. Looking back I’d say Perl is more general than Awk, but not necessarily better. Awk is quite good at the kinds of tasks it was designed for.

I’ve been trying to consolidate the list of programming languages I use after reaching programming language fatigue. Adding yet another language to the list of languages I haven’t mastered but use occasionally would not be progress. But Grothaus’ article tempts me to look at Awk again, not with the intention of mastering it but rather to learn how to do just a small number of things it does remarkably well.
