Readable path listings

Windows has never made it easy to read long environment variables. If I display the path on one machine I get something like this, both from cmd and from PowerShell.

C:\bin;C:\bin\Python25;C:\bin\TeX\miktex\bin;C:\bin\TeX\MiKTeX\miktex\bin;C:\bin\Perl\bin;C:\Program Files\Compaq\Compaq Management Agents\Dmi\Win32\Bin; ...

The System Properties window is worse since you can only see a tiny slice of your path at a time.

[screen shot of the path UI in System Properties]

Here’s a PowerShell one-liner to produce a readable path listing:

$env:path -replace ";", "`n"

This produces

C:\bin
C:\bin\Python25
C:\bin\TeX\miktex\bin
C:\bin\TeX\MiKTeX\miktex\bin
C:\bin\Perl\bin
C:\Program Files\Compaq\Compaq Management Agents\Dmi\Win32\Bin
...

(If you’re not familiar with PowerShell, note the backquote before the n: `n is the escape sequence for the newline character that replaces each semicolon. This is one of the most unconventional features of PowerShell, since backslash is the escape character in most other contexts. Because Windows accepts either forward or backward slashes as path separators, PowerShell could not use backslash as an escape character. Think of the backquote as a little backslash. Once you get over the initial shock, you get used to the backquote quickly.)
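For example, compare the backquote escape with a literal backslash; you can paste this at any PowerShell prompt:

"line one`nline two"
"line one\nline two"

This produces

line one
line two
line one\nline two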

Update: It occurred to me after the original post that there’s an even simpler way to display the path.

$env:path.split(';')

Integrating the clipboard and the command line

Two of my favorite cmdlets from the PowerShell Community Extensions are get-clipboard and out-clipboard. These cmdlets let you read from and write to the Windows clipboard from PowerShell. For example, the following code will grab the contents of the clipboard, replace every block of white-space with a comma, and paste the result back to the clipboard.

(get-clipboard) -replace '\s+(?!$)', ',' | out-clipboard

I saved this to a file comma.ps1 in my path and run it when I get a list of numbers from one program delimited by newlines or tabs and need to make it the input to another program expecting comma-delimited values. For example, turning a column of numbers into an array for R. I copy one format, run comma.ps1, and paste in the new format.

In case you’re curious about the mysterious characters in the script, \s+(?!$) is a regular expression describing where I want to substitute a comma. The \s matches white-space characters (tabs, spaces, newlines) and the + says this is repeated one or more times. So the pattern matches one or more consecutive white-space characters. That would be enough by itself, but it would replace trailing white-space with a comma too, so I might get an unwanted comma at the end. The sequence (?!$) fixes that. The $ matches the end of the input. The (?! before and the ) after form a negative look-ahead, meaning “except when the thing inside matches.” So taken all together, the regular expression matches chunks of white-space except at the end of the input.
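To see the expression in action, here is a quick demonstration at the prompt (the numbers are made up; note that the trailing newline is left alone):

"3.1 4.1`t5.9`n" -replace '\s+(?!$)', ','

This produces

3.1,4.1,5.9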

Update: See Manipulating the clipboard with PowerShell

One program to rule them all

Do you have a single program that you “live in” when you’re at a computer? Emacs users are known for “living” inside Emacs. This means more than just using the program for a large part of the day. It means using the program as the integration point for other programs, a sort of backplane for tying other things together.

Steve Yegge’s most recent blog post described his switch from Windows to Mac. He said the main reason for the switch was that he prefers the appearance of the fonts on a Mac. Changing operating systems was not a big deal for Yegge because he didn’t really live in Windows before, nor does he live in OS X now. He lives in Emacs. He concludes his essay by saying

So I’ll keep using my Macs. They’re all just plumbing for Emacs, anyway. And now my plumbing has nicer fonts.

Graphic artists may spend the majority of their work day using Photoshop, but they don’t send email from Photoshop, and they don’t keep their calendar in Photoshop. So I wouldn’t say they “live” in Photoshop. Microsoft developers spend a great deal of their time inside Visual Studio, though they don’t live inside Visual Studio to the same extent that Emacs users live inside Emacs. The Visual Studio experience is somewhere between Photoshop and Emacs on the “live in” scale. Unlike Emacs, Visual Studio has no ambition to become an operating system, probably because the company that makes Visual Studio already has an operating system.

I once knew someone who lived in Mathematica, doing his word processing etc. inside this mathematical package. Mathematica is a nice place to visit, but I wouldn’t want to live there.

A growing number of people now live inside their web browser, particularly if that browser is Firefox. There are Firefox plug-ins available to mow your lawn and take your children to the orthodontist. Maybe Firefox is becoming the Emacs of a new generation.

The choice of a program to live in is really a choice of how you want to tie applications together. To live in Emacs, you have to write Emacs Lisp, and that’s a deal-breaker for many. Interestingly, Microsoft has a project to create a highly configurable editor some have nicknamed Emacs.NET. You can bet that the extension language will not be Emacs Lisp.

Some people live in their command shell and use shell scripts to tie everything together. While many Unix folks live that way, that hasn’t been practical on Windows until recently when PowerShell came out.

Automated software builds

My first assignment as a professional programmer was to build another person’s program. I learned right away not to assume a project will build just because the author says it will. I’ve seen the same pattern repeated everywhere I’ve worked. Despite version control systems and procedures, there’s usually some detail in the developer’s head that doesn’t get codified and only the original developer can build the project easily.

The first step in making software builds reproducible is documentation. There’s got to be a document explaining how to extract the project from version control and build it. Requiring screen shots helps since developers have to rehearse their own instructions in order to produce the shots.

The second step is verification. Documentation needs to be tested, just like software. Someone who hasn’t worked on the project needs to extract the code onto a clean machine and build the project using only written instructions — no conversation with the developer allowed. Everyone thinks their code is easy to build; experience says most people are wrong.

The verifiers need to rotate. If one person serves as build master for very long, they develop the same implicit knowledge that the original programmers failed to codify.

The third step is automation. Automated instructions are explicit and testable. If automation also saves time, so much the better, but automation is worthwhile even if it does not save time. Clift Norris and I just wrote an article on CodeProject entitled Automated Extract and Build from Team System using PowerShell that helps with this third step if you’re using Visual Studio and VSTS.
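The article has the details; as a rough illustration of the idea (this is not the article’s code, and the tool names and paths below are stand-ins for your own environment), an extract-and-build script can be as simple as:

# Sketch of an automated extract-and-build script.
# Assumes tf.exe (the Team Foundation client) and msbuild.exe are on the path;
# $workspace and $solution are placeholders for your own project.
$workspace = 'C:\build\MyProject'
$solution  = 'MyProject.sln'

Set-Location $workspace
tf get . /recursive /overwrite                          # pull the latest sources from Team System
msbuild $solution /t:Rebuild /p:Configuration=Release   # rebuild from scratch
if ($LASTEXITCODE -ne 0) { throw "Build failed" }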

Text reviews for software

When users find spelling and grammar errors in your software, your credibility takes a hit. But apparently very few software projects review the text their software displays. I imagine the ones that do review their text use a combination of two leaky methods: asking execution testers to take note of prose errors, and requiring that all text displayed to users be stored in a string table.

There are a couple problems with asking execution testers to be copy editors. First, they’re not copy editors. They may not recognize a grammatical error when they see it. Second, they only see the text that their path through the software exposes. Messages displayed to the user under unusual circumstances slip through testing.

String tables are a good idea. They can be reviewed by a professional editor. (Or translator, if your application is internationalized.) But it’s difficult to make sure that every string the user might see is in the string table. When you need to add a few quick lines of error-handling code, it’s so easy to just include the text right there in the code rather than adding an entry to the string table. After all, you say to yourself, the code’s probably not going to run anyway.

My solution was to write a script that extracts all the quoted text from a source tree so it can be reviewed separately. The script tries to only pick out strings that a user could see, filtering out, for example, code quoted inside code. Doing this perfectly would be very hard, but by tolerating a small error rate, the problem can be solved quickly in a few lines of code. I’ve used this script for years. Nearly every time I run it I discover potentially embarrassing errors.
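The CodeProject article below has the script itself; the following is only a rough sketch of the approach, assuming C# sources and tolerating the small error rate mentioned above:

# Rough sketch: list the double-quoted string literals in all C# files under
# the current directory so they can be proofread in one place.
Get-ChildItem -Recurse -Include *.cs |
    Select-String -Pattern '"([^"\\]|\\.)*"' -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value } |
    Sort-Object -Unique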

In addition to helping with copy editing, an extract of all the string literals in a project gives an interesting perspective on the source code. For example, it could help uncover security risks such as SQL injection vulnerabilities.

I’ve posted an article on CodeProject along with the script I wrote.

PowerShell Script for Reviewing Text Shown to Users

The script on CodeProject is written for Microsoft’s PowerShell. If anyone would like a Perl version of the script, just let me know. I first wrote the script in Perl, but then moved it to PowerShell as my team was moving to PowerShell for all administrative scripting.

C# verbatim strings vs. PowerShell here-strings

C# verbatim strings and PowerShell here-strings have just enough in common to be confusing. The differences are summarized in the table below.

C# verbatim strings                      | PowerShell here-strings
-----------------------------------------|----------------------------------------------------------
May contain line breaks                  | Must contain line breaks
Only double quote variety                | Single and double quote varieties
Begins with @"                           | Begins with @" (or @') plus a line break
Ends with "                              | Ends with a line break followed by "@ (or '@)
Cannot contain un-escaped double quotes  | May contain quotes
Turns off C# escape sequences            | @' turns off PowerShell escape sequences but @" does not
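To make the comparison concrete, here are the two PowerShell here-string varieties (a small illustration; the variable names are made up):

$name = 'world'

$expanded = @"
Hello, $name.
Quotes like "these" need no escaping, and `n still expands to a newline.
"@

$literal = @'
$name and `n are kept exactly as typed; nothing is expanded.
'@

The single-quoted form is the natural choice for embedding regular expressions or code fragments verbatim.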