tcgrep: grep rewritten in Perl

In The Perl Cookbook, Tom Christiansen gives his rewrite of the Unix utility grep that he calls tcgrep. You don’t have to know Perl to use tcgrep, but you can send it Perl regular expressions.

Why not grep with PCRE?

You can get basically the same functionality as tcgrep by using grep with its PCRE option -P. Since tcgrep searches directories recursively, a more direct comparison would be

    grep -R -P

However, your version of grep might not support -P. And if it does, its Perl-compatible regular expressions might not be completely Perl-compatible. The man page for grep on my machine says

    -P, --perl-regexp
        Interpret the pattern as a Perl-compatible regular 
        expression (PCRE). This is experimental and grep -P 
        may warn of unimplemented features.

The one implementation of regular expressions guaranteed to be fully Perl-compatible is Perl.

If the version of grep on your system supports the -P option and is adequately Perl-compatible, it will run faster than tcgrep. But if you find yourself on a computer that has Perl but not a recent version of grep, you may find tcgrep handy.

Installation

tcgrep is included as part of the Unicode::Tussle Perl module; since tcgrep is a wrapper around Perl, it is as Unicode-compliant as Perl is. So you could install tcgrep (and several more utilities) with

    cpan Unicode::Tussle

This worked for me on Linux without any issues but the install failed on Windows.

I installed tcgrep on Windows by simply copying the source code. (I don’t recall now where I found the source code. I didn’t see it this morning when I searched for it, but I imagine I could have found it if I’d been more persistent.) I commented out the definition of %Compress to disable searching inside compressed files since this feature required Unix utilities not available on Windows.

Consistency

Another reason to use tcgrep is consistency. Perl is criticized for being inconsistent. The Camel book itself says

In general, Perl functions do exactly what you want—unless you want consistency.

But Perl’s inconsistencies are different, and in my opinion less annoying, than the inconsistencies of Unix tools.

Perl is inconsistent in the sense that functions behave differently in different contexts, such as a scalar context or a list context.

Unix utilities are inconsistent across platforms and across tools. For example, a tool like sed will have different features on different platforms, and it will not support the same regular expressions as another tool such as awk.

Perl was written to be a “portable distillation of Unix culture.” As inconsistent as Perl is, it’s more consistent that Unix.

Related posts

2 thoughts on “tcgrep: grep rewritten in Perl

  1. Another options for actual Perl regexes is ack (https://beyondgrep.com) which is pure Perl and allows you to use Perl’s capture variables, such as:

    ack ‘#include ‘ -h –output=’$1’ –cc

    to get a list of all the header files included in your C files.

    However, ack stumbles with Unicode, so if that’s important, then certainly tchrist’s work is the way to go.

Comments are closed.