Learn one sed command

You may have seen sed programs even if you didn’t know that’s what they were. In online discussions it’s common to hear someone say

s/foo/bar/

as a shorthand to mean “replace foo with bar.” The line s/foo/bar/ is a complete sed program to do such a replacement.

sed comes with every Unix-like operating system and is also available for Windows. It has a range of features for editing files, but sed is worth using even if you only know how to do one thing with it:

sed "s/pattern1/pattern2/g" file.txt > newfile.txt

This will replace every instance of pattern1 with pattern2 in the file file.txt and will write the result to newfile.txt. The original file file.txt is unchanged.
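Here is a small sketch of that command in action. The sample text and the scratch directory are made up for illustration; only the sed line itself comes from the command above.

```shell
# Work in a scratch directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)

# Hypothetical sample input.
printf 'cat hat cat\nno match here\n' > "$workdir/file.txt"

# Replace every "cat" with "dog" and write the result to a new file.
sed "s/cat/dog/g" "$workdir/file.txt" > "$workdir/newfile.txt"

cat "$workdir/newfile.txt"   # the edited copy
cat "$workdir/file.txt"      # the original, left as it was
```

Because the output is redirected to a different file, the input is only ever read, never modified.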

I used to think there was no reason to use sed when other languages like Python will do everything sed does and much more. Suppose you agree with that. Now suppose you find you often have to make global search-and-replace operations and so you write a script to do this, say a Python script. You’ve got to call your script something, remember what you called it, and put it in your path. How about calling it sed? Or better, don’t write your script, but pretend that you did. If you’re on Linux, it’s already in your path. One advantage of the real sed over your script named sed is that the former can do a lot more, should you ever need it to.

Now for a few details regarding the sed command above. The “s” on the front stands for “substitute” and the “g” on the end stands for “global.” Without the “g” on the end, sed would only replace the first instance of the pattern on each line. If that’s what you want, then remove the “g.”
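To see the difference the “g” makes, here is a quick sketch using made-up two-line input fed to sed through a pipe instead of a file.

```shell
# Without "g": only the first "-" on each line is replaced.
first_only=$(printf 'a-b-c\nd-e-f\n' | sed 's/-/+/')

# With "g": every "-" on each line is replaced.
every_one=$(printf 'a-b-c\nd-e-f\n' | sed 's/-/+/g')

echo "$first_only"
echo "$every_one"
```

Note that “first” means first on each line, not first in the file, since sed processes its input one line at a time.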

The patterns inside a sed command are regular expressions, so it’s best to get in the habit of always quoting sed commands. This isn’t necessary for simple string substitutions, but regular expressions often contain characters that you’ll need to prevent the shell from interpreting.
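As a rough illustration of why quoting matters, the pattern below contains a space, a * and a $, all of which mean something to the shell. The input string is invented for the example. Single quotes are used here because they stop the shell from touching anything inside them; double quotes, as in the command above, also work for most patterns but still let the shell expand variables.

```shell
# Strip trailing spaces from a line.
# Unquoted, the shell would split the command at the space
# and sed would receive a broken program.
trimmed=$(printf 'hello   \n' | sed 's/ *$//')

# Brackets make it visible that the trailing spaces are gone.
printf '[%s]\n' "$trimmed"
```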

You may find the default regular expression support in sed odd or restrictive. If you’re used to regular expressions in Perl, Python, JavaScript, etc., and you’re using a GNU implementation of sed, you can add the -r option for more familiar regular expression syntax.
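The sketch below compares the two syntaxes on a made-up input line. It assumes GNU sed: both the -r option and the \+ used in the first command are GNU features and may not behave the same way in other implementations.

```shell
# Default syntax: grouping parentheses and the + quantifier need backslashes.
basic=$(echo 'id: 12345' | sed 's/\([0-9]\+\)/<\1>/')

# With -r: the same pattern written the way Perl or Python users would expect.
extended=$(echo 'id: 12345' | sed -r 's/([0-9]+)/<\1>/')

echo "$basic"
echo "$extended"
```

Both commands wrap the first run of digits in angle brackets; only the amount of backslash noise differs.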

I got the idea for this post from Greg Grouthaus’ post Why you should learn just a little Awk. He makes a good case that you can benefit from learning just a few commands of a language like Awk with no intention to learn more of the language.


10 thoughts on “Learn one sed command”

  1. I have used sed for quite a while now. I have found its default regular expression syntax quite restrictive. Even with extended regular expressions, matching is still implemented as a DFA per the POSIX standard (this can be annoying when you’re using an expression like foo.*bar, since foobar may simply just match foo.*), and sed does not have the familiar character classes, again for POSIX compliance.

    The best way to accomplish everything that sed and awk do is to use the following command:

    perl -p -e 'some perl command' myfile > newfile

    This tells the perl interpreter to execute the perl command on each line of myfile, assigning the line in question to $_ (as far as the perl input command, or set of commands, is concerned). This construct has more than fulfilled all the needs for which I would previously have used awk or sed, and perl ships with every Unix distro anyway. See this article: http://www.techrepublic.com/article/use-command-line-perl-to-make-unix-administration-easier/1044668 . Cheers!

  2. Dan, thanks for the tip. Along with Greg Grouthaus’ line of thinking, someone could use Perl as you suggested without learning any more Perl. The “some perl command” could be sed commands of the form in this post.

  3. I third ‘perl -e’. I use it all the time, much more than invoking Perl scripts, although for more complicated jobs the script is the way to go.

    I don’t use ‘perl -e’ so much for file editing as for things like:

    perl -e "foreach $i (0..999) {mkdir \"foo$i\"}"

    I haven’t found a simpler or more convenient way to do this on Windows. PowerShell could probably do it, and I’m sure there’s a way to install and use Unix-y tools to do it, too. But I find the Perl one-liners extremely convenient.

  4. Some distributions come with a command called ‘replace’.
    I often use it for simple recursive substitutions in multiple files at once:
    find . -type f -name "*.ext" | xargs replace "BEFORE" "AFTER" --

  5. @Chris: it’s a common pattern if you’re using more than one pipe. E.g.: cat foo | sed -e 's/foo/bar/' | grep '37' > foo.37s.fixed_foo. Doing it this way makes it easier to reorder the pipe components, and also makes it easier for people who look at your scripts to tell what file you’re working with. Also, it’s a bit easier to reason about if you’re doing a for x in a b c; do … done loop. In the end, though, I find it’s most common amongst people who do a lot of work on one command line (which, like many things, is naughty and something you should never do, even though everyone starts doing it sooner or later).

  6. With -i you can do all the changes within the original file without the need to create a second output file (I think it just moves the output to the input after it’s done).

  7. Some parts of perl were designed to appeal to then-current UNIX users. The Perl Power Tools were a proof of concept reimplementation of the (common) UNIX tools, and generally only took a few lines for each tool.

    As to “learn one xyz command,” the fact that these sorts of commands are re-implemented over and over indicates that there is a common problem that isn’t solved by the UNIX tools. That problem might just be the horrible documentation, but I’m not so sure.

  8. If you want to replace newline characters, that’s a problem with sed, as is text that spans multiple lines. After I learned Perl, I thought Perl superseded all Unix commands, but I actually find it faster and more efficient to mine Apache log files by piping together sed, awk, grep, cat, sort, etc. These days, all my text-processing scripts are elisp in Emacs.
