Learn one Perl command

A while back I wrote a post, Learn one sed command. In a nutshell, I said it’s worth learning sed just to do commands of the form sed s/foo/bar/ to replace “foo” with “bar.”

Dan Haskin and Will Fitzgerald suggested in their comments using perl -pe instead of sed, with the same command. The advantage is that you can use Perl’s more powerful regular expression syntax. Will said he uses Perl like this:

cat file | perl -pe "s/old/new/g" > newfile

I think they’re right. Except for the simplest regular expressions, sed’s regular expression syntax is too restrictive. For example, I recently needed to remove commas that immediately follow a digit and this did the trick:

cat file | perl -pe "s/(?<=\d),//g" > newfile

Since sed has neither the look-behind feature nor \d for digits, the corresponding sed code would be more complicated.

I quit writing Perl years ago. I don’t miss Perl as a whole, but I do miss Perl’s regular expression support.

Learning Perl is a big commitment, but just learning Perl regular expressions is not. Perl is the leader in regular expression support, and many programming languages implement a subset of Perl’s regex features. You could start with the subset of regex features you already know and still have the option of using more.

Related post: Perl 6 as the anti-JavaScript

11 comments on “Learn one Perl command”
  1. Ronan says:

    FWIW, that’s a useless use of cat. You can do:
    perl -pe "s/(?<=\d),//g" file > newfile

    (In particular, Windows users may have perl but no cat.)

    And if you just want to do it in-place, with a backup to “file.bak”:
    perl -pi.bak -e "s/(?<=\d),//g" file

  2. CJ says:

    I still find myself using sed for its in-place feature: sed -i '' 's/foo/bar/' file.txt

    I haven’t found any other way to do in-place editing without writing a whole script.

  3. Ben says:

    There is an even better way. Perl allows in-place editing with -i. This argument takes an optional suffix to append to the original files.

    This means that you can do your search/replace on a bunch of files and have backups in one line.

    perl -pi.bak -e "s///g"

    Of course, if you do not want backups, just leave off the suffix. Every file will be edited in place and no backups created.

    perl -pi -e "s///g"

    This is how to get a nice slice of Perl Pie (-pi -e).

  4. Magnum says:

    sed 's/\([0-9][0-9]*\),/\1/g' newfile

  5. Magnum says:

    edit:
    The comment system ate the redirect symbols, but you get the idea.

  6. Marmaduke says:

    I stopped writing new scripts in Perl a while ago, but I still use these sorts of Perl one liners all the time. I never switched to sed for more or less the same reasons: I already know Perl pretty well; and it’s much more powerful.

  7. g says:

    Magnum: surely that can be drastically simplified to sed 's/\([0-9]\),/\1 /g' ?

  8. Philip Ngai says:

    g: the * notation allows for 0 or more instances. So the first [0-9] is needed to be sure there is at least one digit and the second [0-9] swallows any additional digits.

  9. g says:

    Philip: I understand the meaning of Magnum’s RE, but the point is that “if you have 1 or more digits followed by a comma, replace them with the same digits and then a space” is equivalent to “if you have one digit followed by a comma, replace that with the same digit followed by a space”. It’s not matching the same set of characters, but it is making the same change. Unless I’m being stupid, of course, which is entirely possible.

  10. Philip Ngai says:

    g: you are right, I didn’t read your simplification closely enough.

  11. Beetle B. says:

    There are people in the world with nothing better to do than compile
    lists of dummy uses of the `cat’ command, as in that example, and pour
    scorn on them, but I’ll just have to brave it out.

    From the User’s Guide to zsh
