In The Perl Cookbook, Tom Christiansen gives his rewrite of the Unix utility grep that he calls tcgrep. You don’t have to know Perl to use tcgrep, but you can send it Perl regular expressions.
Why not grep with PCRE?
You can get basically the same functionality as tcgrep by using grep with its PCRE option -P. Since tcgrep searches directories recursively, a more direct comparison would be
grep -R -P
However, your version of grep might not support -P. And if it does, its Perl-compatible regular expressions might not be completely Perl-compatible. The man page for grep on my machine says
-P, --perl-regexp Interpret the pattern as a Perl-compatible regular expression (PCRE). This is experimental and grep -P may warn of unimplemented features.
The one implementation of regular expressions guaranteed to be fully Perl-compatible is Perl.
If the version of grep on your system supports the -P option and is adequately Perl-compatible, it will run faster than tcgrep. But if you find yourself on a computer that has Perl but not a recent version of grep, you may find tcgrep handy.
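One way to decide between the two at run time is to probe the local grep before relying on -P. The following is a minimal sketch, not a definitive test: the function names grep_has_pcre and pick_searcher are made up for illustration, and it assumes tcgrep, if installed, is on your PATH. A grep that accepts this one pattern could still lack other Perl features.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the local grep accepts -P and
# handles two Perl-only constructs (\d and a lookahead).
grep_has_pcre() {
    printf 'abc123\n' | grep -P '\d+(?=$)' >/dev/null 2>&1
}

# Hypothetical helper: print the search command to use.
# Assumes tcgrep, if present, is reachable via PATH.
pick_searcher() {
    if grep_has_pcre; then
        echo "grep -R -P"
    elif command -v tcgrep >/dev/null 2>&1; then
        echo "tcgrep"
    else
        echo "none"
    fi
}

pick_searcher
```

The probe pipes a known string through grep rather than parsing the man page or version string, since what matters is whether the option actually works.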
Installation
tcgrep is included as part of the Unicode::Tussle Perl module; since tcgrep is a wrapper around Perl, it is as Unicode-compliant as Perl is. So you could install tcgrep (and several more utilities) with
cpan Unicode::Tussle
This worked for me on Linux without any issues but the install failed on Windows.
I installed tcgrep on Windows by simply copying the source code. (I don’t recall now where I found the source code. I didn’t see it this morning when I searched for it, but I imagine I could have found it if I’d been more persistent.) I commented out the definition of %Compress to disable searching inside compressed files since this feature required Unix utilities not available on Windows.
Consistency
Another reason to use tcgrep is consistency. Perl is criticized for being inconsistent. The Camel book itself says
In general, Perl functions do exactly what you want—unless you want consistency.
But Perl’s inconsistencies are different, and in my opinion less annoying, than the inconsistencies of Unix tools.
Perl is inconsistent in the sense that functions behave differently in different contexts, such as a scalar context or a list context.
Unix utilities are inconsistent across platforms and across tools. For example, a tool like sed will have different features on different platforms, and it will not support the same regular expressions as another tool such as awk.
Perl was written to be a “portable distillation of Unix culture.” As inconsistent as Perl is, it’s more consistent than Unix.
I liked Perl (it’s my first production language), but I found ripgrep (https://github.com/BurntSushi/ripgrep) way more powerful.
Another option for actual Perl regexes is ack (https://beyondgrep.com), which is pure Perl and allows you to use Perl’s capture variables, such as:
ack '#include ' -h --output='$1' --cc
to get a list of all the header files included in your C files.
However, ack stumbles with Unicode, so if that’s important, then certainly tchrist’s work is the way to go.