Editing by semantic units

The most basic text editor commands operate on lines and characters: move up or down a line, delete the next or previous character, etc. More advanced commands operate on context-specific semantic units. In the context of English prose, this means words, sentences, and paragraphs. In the context of source code, it means blocks and functions, however these are delimited in a particular language. In the context of an outline, it means navigating the tree.

One way to become more proficient at using your editor of choice is to become fluent at working with larger semantic units. Of course this saves keystrokes, though as I argued here, I don’t think typing speed matters so much. However, working in larger semantic units reduces errors. It also improves the chances that you can act on an idea before you forget what you were doing. These factors matter more than typing speed per se.

The general principles above apply to any editor. Here I’ll sketch how these ideas work in Emacs.

As a general rule, commands that operate on physical units begin with Control and analogous commands that work on semantic units being with Meta (Alt). For example, C-f moves forward a character and M-f moves forward a word. C-e moves to the end of a line and M-e moves to the end of a sentence.

Emacs extends the meaning of prose units by analogy in other contexts. For example, in calendar mode, a “line” corresponds to a week, and a “paragraph” corresponds to a month.

Emacs has numerous commands to work on “balanced expressions,” i.e. text surrounded by parentheses, brackets, braces, quotation marks, etc. Often the default keybindings are consistent with those used for other units. For example, “f” means forward and “b” means backward in any context. C-f and C-b move by character, M-f and M-b move by word, and C-M-f and C-M-b move by balanced expression. Emacs also has commands for working with function definitions: selecting a function definition, moving to the beginning or end of a function definition, etc. In Lisp, function definitions are balanced expressions. In other languages, like C, these ideas are distinct.

This morning, Irreal points out an Emacs package expand-region that selects larger and larger semantic regions. The first time it is invoked, it selects the immediate context of the cursor. Each subsequent invocation selects a larger context, moving up the parse tree of the content. I look forwarding to trying it.

3 thoughts on “Editing by semantic units

  1. Intellij IDEA has a semantic “expand region” command (Command-W) too. Very much recommended. The refactoring commands, e.g. “Extract method” also understand the code surrounding the cursor, asking which semantic region you want to extract.

  2. As an emacs user myself, I loathe IDEs. For example, over a million man-hours have gone into Eclipse, and they don’t even have soft word wrap!

    Unfortunately, alt-shift-upArrow in Eclipse (IntelliJ’s ctrl-w) demands that I use an IDE to edit Java.

    If expand-region works like alt-shift-upArrow / ctrl-w, I may be able to end my nightmare!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>