Unix philosophy says a program should do only one thing and do it well. Solve problems by sewing together a sequence of small, specialized programs. Doug McIlroy summarized the Unix philosophy as follows.
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
This design philosophy is closely related to “orthogonality.” Programs should have independent features just as perpendicular (orthogonal) lines run in independent directions.
In practice, programs gain overlapping features over time. A set of programs may start out orthogonal but lose their uniqueness as they evolve. I used to think that the departure from orthogonality was due to a loss of vision or a loss of discipline, but now I have a more charitable explanation.
The hard part isn’t writing little programs that do one thing well. The hard part is combining little programs to solve bigger problems. In McIlroy’s summary, the hard part is his second sentence: Write programs to work together.
Piping the output of a simple shell command to another shell command is easy. But as tasks become more complex, more and more work goes into preparing the output of one program to be the input of the next program. Users want to be able to do more in each program to avoid having to switch to another program to get their work done.
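The gap between the two halves of the philosophy shows up even in tiny examples. A single pipe needs no glue, but as soon as one program's output shape doesn't match the next program's input shape, adapter stages multiply. A minimal sketch (the data here is invented for illustration):

```shell
# The easy case: two small programs, one pipe, no glue.
printf 'b\na\nb\n' | sort | uniq -c

# The harder case: the first awk reorders fields so sort can key on the
# number, and the second awk restores a readable shape. Half of this
# pipeline is adapter code, not real work.
printf 'alice 3\nbob 10\n' \
  | awk '{print $2, $1}' \
  | sort -rn \
  | awk '{print $2 ": " $1}'
```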
An example of the opposite of the Unix philosophy would be the Microsoft Office suite. There’s a great deal of redundancy between the programs. At a high level, Word is for word processing, Excel is the spreadsheet, Access is the database etc. But Excel has database features, Word has spreadsheet features, etc. You could argue that this is a terrible mess and that the suite of programs should be more orthogonal. But someone who spends most of her day in Excel doesn’t want to switch over to Access to do a query or open Word to format text. Office users are grateful for the redundancy.
Software applications do things they’re not good at for the same reason companies do things they’re not good at: to avoid transaction costs. Companies often pay employees more than they would have to pay contractors for the same work. Why? Because the total cost includes more than the money paid to contractors. It also includes the cost of evaluating vendors, writing contracts, etc. Having employees reduces transaction costs.
When you have to switch software applications, that’s a transaction cost. It may be less effort to stay inside one application and use it for something it does poorly than to switch to another tool. It may be less effort to learn how to do something awkwardly in a familiar application than to learn how to do it well in an unfamiliar application.
Companies expand or contract until they reach an equilibrium between bureaucracy costs and transaction costs. Technology can cause the equilibrium point to change over time. Decades ago it was efficient for organizations to become very large. Now transaction costs have lowered and organizations outsource more work.
Software applications may follow the pattern of corporations. The desire to get more work done in a single application leads to bloated applications, just as the desire to avoid transaction costs leads to bloated bureaucracies. But bloated bureaucracies face competition from lean start-ups and eventually shed some of their bloat or die. Bloated software may face similar competition from leaner applications. There are some signs that consumers are starting to appreciate software and devices that do less and do it well.
A nice post. I would offer a corollary to “do one thing and do it well:”
Have a clear understanding of the purpose of the program, address an appropriate subset of the available problems (and stick to it).
Word and Excel are good examples of this: rather than writing a general-purpose "office application," functionality has been divided into functional areas like documents and spreadsheets.
I think I may be moving away from the original topic a bit but: Where many companies (within the limits of my experience) get into trouble is that someone who does not really understand the limitations of software development is the person who is in charge of a development effort. The inevitable result is feature creep (implemented by hacks that “will be fixed later”) and code that is increasingly difficult to maintain. This results in expenses that no one ever plans for.
Upon further reflection, perhaps the important part of the Unix Philosophy is the “Do it well.”
I think most good programmers try to keep the complexity and size of their programs at a level where they can "do it well." As a result, the normal pressures of the commercial environment make it easier to produce high-quality small programs than large programs. This is a substantial conflict in the commercial world: my boss isn't interested in me taking the time to make a perfect program; he wants me to make one that is good enough, then move on to other important things. In this situation, it is easier for me to produce a series of very small programs that each does its task well than to try to produce the ideal large program.
Nice post.
What’s the impact of plugin architectures on this analysis? I think this architecture has reduced the need for interoperability between different programs, though you could say that plugins are ‘little programs’ themselves.
“Companies often pay employees more than they would have to pay contractors for the same work.” — I’ve always seen the polar opposite! Contractors are almost universally hired tactically, short-term. Higher fees than a corresponding annualized salary are the premium paid for such intervention.
@Mark: I meant that companies pay employees more over a year than they would pay contractors for short-term bursts throughout the year. So I think you and I agree.
The philosophy would break down IF each of these small moving parts had a separate UI. But the better design has the user staying in a single UI and the flow behind the scenes is among these small moving parts.
The unix philosophy has limitations.
Technical limitations: as you point out, using an armada of little programs is not efficient, because launching programs has a cost. Furthermore, the plain text format is also inefficient: one program computes some date, converts it to text, and passes it to another program, which parses the date string and does something else with the result. That’s just insane. And the fact that the glue language that’s supposed to do all this, the Bash shell script, is not exactly the best language one can dream of doesn’t help.
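The serialize-then-reparse cost Astrobe describes can be made concrete. Each downstream consumer re-parses the very same text (a small illustration, not from the original comment):

```shell
# An upstream program emits a date as plain text...
echo '2011-04-14'

# ...and every downstream consumer must parse that string all over again:
echo '2011-04-14' | awk -F- '{print $1}'      # one consumer wants the year
echo '2011-04-14' | awk -F- '{print $2 + 0}'  # another re-parses for the month
```

Every stage pays the format/parse round trip that a richer in-memory representation would avoid.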
As you also point out, it is often not practical to achieve this orthogonality. Text editors like Vi or Emacs are a good example: one just cannot replace them with ed + a pretty-printer + awk or M4 macros + CVS for the undo function.
Omar suggested something better: plugin architectures. I would go a step further and turn this solution upside-down: an architecture based on a scripting language that glues together libraries or software components. This is what some people try to do with Tcl, Python, or Lua. If only they could use them more as platforms than as scripting languages.
@Astrobe: PowerShell offers an interesting variation on the Unix philosophy. It has a pipe analogous to the Unix pipe, but it passes objects rather than passing text. Every command outputs a .NET object. When you call a utility at the command line, you’ll see a text representation of the output, but if you pipe the output to another utility, the latter receives an object. So in your example, the output of the first command might be a date object or an object with a date attribute.
Somehow I knew PowerShell would come up in this discussion. It takes a while to get used to it, but it’s quite a flexible design. Also, passing objects around makes PowerShell something of a shell equivalent of a plugin architecture. You can easily add new functions and cmdlets (modules in v2 are very plugin-like, since they’re quite easy to write and add to a session), and even though there are objects of differing types, it all fits together very well. Probably akin to the realization that duck typing doesn’t equal horrible disaster at runtime.
Also, as an example of a quite bloated standard Unix tool, ls has quite a few options:

usage: ls [-ABCFGHILPRSTUWZabcdfghiklmnopqrstuwx1] [-D format] [file ...]
There are only 5 lower-case letters missing and 11 upper-case ones. There’s only so much you can express with single-letter options and that program is now probably well beyond the number of reasonable options.
Generally, the “do one thing and do it well” philosophy is probably hard to justify in modern GUI applications as well. You probably don’t want a separate application for each of managing, editing, exporting, and printing photos. I just use Lightroom for all of that and am happy it does many things (well, even).
Doing single, simple things well makes sense from a developer’s perspective, and you find it again in libraries: usually they don’t abstract everything away from you. Few libraries consist of a handful of high-level operations that can be assembled directly into a complete program that does exactly those tasks. Most provide an interface or an abstraction to an API, or do a single thing well. It’s up to the developer to chain library calls and their own logic to create something useful. Given that Unix started out as a playground for developers (and even today many people seem to assume that the only people who ever use Unix-likes are developers of the same system), it’s no surprise that the shell followed that premise. Still, libraries and plugins often have better interfaces between the parts, which makes things less painful at times.
In any case, while composing complex solutions from many smaller ones is a nice thing to have for a developer it’s less so for the end-user. Strictly separating functionality is bound to creep into the user interface as well and integration between those parts will often suffer. Unless you give the user the power to compose everything on their own as well – in which case they are developers again and the whole UX is only suitable for a fraction of the people.
We have other things where developer interfaces have crept into the general user experience for decades: file systems for example. There is hardly any user-friendly abstraction over files.
Ah well, I think this has become off-topic enough by now :-)
I can think of several apps that do a great job of this from the Mac platform. WriteRoom is a simple text editor that allows you to edit in fullscreen mode. 1Password saves all your login/password/financial data and integrates it with popular browsers. Things is a simple but powerful to do application. I realize your focus here is command line, but I think these are good GUI examples. I think you see more of this on Apple’s platforms than Microsoft’s. Palm was good at this back in the day too.
This is the extreme opposite, to the point where it becomes misleading. I stopped using MS Word and MS PowerPoint many years ago, but once in a while I have to edit a document here and there. When I use PowerPoint I realize that I cannot do many things that are possible in Word, for example insert equations or images inline, or tables that flow with the text. That is not much to ask of a 2 GB/$400 software suite. Bloated software tricks us into thinking that features are repeated among programs for good reason, but it is not even true!! Do many things, and everything poorly.
alfC: As for math typesetting in Office. In ye olde days before Office 2007 there was something along the lines of »Do one thing and do it well« which was either MathType or the Microsoft Equation Editor (a simplistic version of the same). Since you inserted the equation as an OLE object no application needed to know anything about it except who was responsible for it. That was the upside. The downside on the other hand is that no program can treat the object natively since it’s always another application simply rendering the object. This meant that Word had no power whatsoever over the formula object which became very noticeable if you inserted formulas with differing font sizes or different line heights or even different fonts – the results were all equally awful. That’s the downside of such an approach.
Typesetting and text layout have to account for formulas in various ways. For example, if you use a formula inline, the term (a/sqrt(b/c)) needs to be placed at a different height than (sqrt(a/b)/c), simply because the main fraction sits at another level. You can’t do this automatically with objects that are simply images placed in the text. You can do it if you have math typesetting natively.
You could probably solve this by giving OLE objects a baseline property, but that sullies a generic interface with word-processing-specific information that isn’t even relevant to most data moved around that way. Not a great way either.
With Office 2010, math typesetting made its way into the other Office applications too, by the way. And it was quite a long way to get there: the whole feature started out even before Office 2003. And while the Office applications surely share code to some extent, it’s probably not just a matter of an afternoon to add math support to PowerPoint or OneNote. Remember that specifications have to be written, things implemented, tested, etc. Those are many man-hours, and there is only so much time planned for new features; something always has to be cut. Unless you never want to ship a product, that is.
P.S.: Another thing I’d rather have natively than as OLE objects would be syntax-highlighted source code listings. Still planning on doing such a thing for Word, I just have to find the time.
In the middle of your article you switch from talking about programs to talking about applications. I understand the Unix philosophy to state that one builds applications by composing programs. It’s a subtle difference. You point to limits on how far programs may be nested.
I have to disagree with some of the things in both the entry and some of the comments.
I think the flaw some people here are making is that they’re assuming that the Unix philosophy implies linearity – that everything has to be done by piping repeatedly. Why the assumption? The UNIX philosophy does not prohibit a hierarchical structure.
In the Office example, the UNIX philosophy wouldn’t dictate that you have to close Word and then open Excel. The UNIX philosophy states that if everything were coded well with the proper modularity, that it would be easy to write a wrapper program (which would incorporate, say, your GUI). So your wrapper program would automatically take your “Word” data and pass it on to “Excel”, and present you with everything transparently.
And note, of course, that the model here is simplistic. In the UNIX philosophy, Word wouldn’t be one unit, but itself a wrapper program.
I also disagree with the Emacs vs UNIX comparison. In fact, I often tell friends that Emacs exhibits the UNIX philosophy better than apps in UNIX typically do! The reason Emacs has become an “operating system” is the ease with which one can interconnect all the various packages built for it. It’s incredibly modular.
Finally, yes, there is some overhead to the UNIX philosophy (with reference to the example of dates). But you can’t have everything. Adhering to the UNIX philosophy gives you a lot, but at a cost.
To a large degree I think you are on the right track. I would offer, though, that it is not a transaction cost, but rather a context switch cost. The separation of concern you describe in Unix is rather clean because, after all, the problems being worked are closer to the machine. As we get closer to the human being, complexity rises and the problem fundamentally changes from compute bound to human brain bound.
The OP reminds me strongly of ESR’s discussion of why Emacs is so perennially popular: http://schemacs.com/articles/taoup/ch13s03.html#id2967765
@John (comment 8): Check out object shell (http://geophile.com/osh). It’s the PowerShell approach, but done in Python for Linux. Commands run in one Python process and are connected in a pipe-like manner. Python objects are passed from one command to the next. Database access fits in neatly: tuples can be piped to and from the database. It also includes remote access. A command can be run on every node of a cluster. It’s just like running locally, except that each object returned includes the identity of the node that produced it. Sort of like a lightweight map/reduce. Commands can be run from the command line or through a Python API.
Yes, very interesting.
I think complaining that it’s hard to combine small simple things into large complex things is complaining that programming is hard. “How do I take a bunch of integers, some control-flow modifiers, and addition and build a word-processing program?” Well of course it’s hard when you break it down to that level, in the same way figuring out how to pipe things together in *NIX environments can be hard if all you have to work with is touch, ls, and grep… But that’s not the reality. There are abstractions. Pick the right one, and use it. And if that’s too hard, stop programming because you’re doing it wrong.
As Don pointed out above, you’re confusing programs with applications, although what was considered a program back then is now called a utility or tool. Not only that, but the Unix philosophy is really just that: a philosophy for Unix system administration. However, it has expanded into software development, where it is used all the time to manage large applications, just like those found in the Office suite; it’s called “modular programming.” Now if this post was simply saying that it’s hard to write robust applications using Unix tools and shell-script glue, then yes, that’s correct.
But someone who spends most of her day in Excel doesn’t want to switch over to Access to do a query or open Word to format text. Office users are grateful for the redundancy.
Maybe this says more about the surrounding OS, the quality of integration, the user experience, and the implementation of this suite than about how the UNIX philosophy fails? Would this still be the case if the surroundings made switching dead easy? If they were actually very well integrated?
To take this back to the UNIX-philosophy:
Instead of writing my invoices in Word, copy-pasting the calculations from my timesheet-CSV import in Excel, and copy-pasting the address from my Outlook address book (which, I can tell you, s*cks), UNIX preaches the following:
One single application uses:
A program or library (not a suite, or an application!) that handles CSV-timesheets-imports and allows simple pre-definable turn-this-sheet-into-a-billable-table.
A single low-level central address book with proper API-able and parsable entries (like a SQLite or CouchDB database).
A dead-simple text-writer that allows me to enter some additional texts to my invoice.
A library or program that turns marked-up-text into PDF.
Now, that would be a nice invoicing tool. It uses the same address book my mail client uses. It will find my timesheets and parse them for me. It will allow me to focus on the additional text (instead of fiddling with spacing, font sizes, and indentation for every single invoice). It is, therefore, a truly UNIX tool.
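The core of that pipeline can be sketched in a few lines of shell. The file names, the CSV layout, and the $100/hour rate are all invented here; pandoc stands in for the markup-to-PDF program mentioned above:

```shell
# Hypothetical timesheet rows: date,task,hours
printf '2011-04-01,consulting,2\n2011-04-02,review,1.5\n' > timesheet.csv

# Turn the CSV into a billable table at an assumed rate of $100/hour.
awk -F, '{printf "%s  %-10s %6.2f\n", $1, $2, $3 * 100}' timesheet.csv > billable.txt

# Compose the invoice from small parts and hand it to a markup-to-PDF
# tool such as pandoc (not run here):
#   cat address.txt notes.txt billable.txt | pandoc -o invoice.pdf
cat billable.txt
```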
A bit controversial, but I think the so-called “Unix philosophy” is just a made-up slogan that’s semi-marketing and semi-wishful-thinking. It’s a “philosophy” after the fact.
The Unix philosophy says “small is beautiful” and “do one thing well,” but the vast majority of Unix commands are not beautiful and don’t do one thing well. There are lots of examples: grep supports an “-r” option, while it really should be just “grep” + “find”. “grep -r” itself has lots of problems, e.g. when you want a recursive search over only “*.html” files, it can’t be done in a simple way. Same with “find”: lots of problems, hence the need for “xargs”. This applies to almost the whole Unix bag.
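For the record, the recursive-HTML case can be composed from find and grep, which is arguably the orthodox Unix-philosophy answer, even if it is less convenient than a built-in flag. A small sketch with throwaway files:

```shell
# Throwaway files to search.
mkdir -p demo/sub
printf 'TODO: fix\n' > demo/a.html
printf 'TODO: fix\n' > demo/sub/b.txt

# Recursive search restricted to *.html, composed from two small tools;
# no "grep -r" needed. find selects the files, grep searches them.
find demo -name '*.html' -exec grep -l 'TODO' {} +
```

(GNU grep’s `grep -r --include='*.html'` is the less orthogonal built-in alternative, which rather proves the commenter’s point about feature absorption.)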
Also, I think this Unix philosophy slogan arose in the 1970s or 1980s. At the time, its competition, AFAIK, was Multics or, later, Lisp machines. It’s before my time, but from what I’ve read those systems were far better.
i think this joke summarize unix well at the time: http://xahlee.org/UnixResource_dir/_fastfood_dir/fastfood.html
also wrote about this here http://xahlee.org/UnixResource_dir/writ/unix_phil.html
Beetle B.: I believe the expression TANSTAAFL has relevance here. You are correct, and as Heinlein intended with his quip, the world is nothing but trade offs. If we were all logical, process-oriented beings, we’d compose in plain text and then typeset. But few, if any of us, are truly logical, and Word, while eschewing everything Unix is not, finds itself an audience quite easily.
But the Unix philosophy is still a noble one to pursue, as other commenters have already pointed out. Again, I especially like the idea of hierarchical “one-use programs” with wrappers to make them useful as a bigger group. There’s no reason to embed a specific utility into a GUI skin when it can be abstracted out of its frame and made more useful to everyone.
Diversity is the reason life exists, and is also the reason to live. If we’d all solved the modular program debate we wouldn’t have anything to debate.
(P.S. As for parsing dates, sed/awk do a fabulously fast job of that, making the task only “seem” onerous to those trying to parse with Python or Ruby or what-not.)
Parsing dates onerous in Python? Seriously? Python and Perl are fantastic for such activities – will run neck and neck any day with sed/awk and all the downstream awk variants.
I thought the UNIX philosophy and elegance was in the pipes and filters model, along with the toolbox of individually useful bits and pieces. UNIX (and its nascent community) never espoused the philosophy of higher level, multi-activity user facing programs. The “philosophy” is one level down. Claiming it one level up is incorrect.
Filippo: I agree that Python and Perl are great at file parsing, certainly more convenient than using something like Java or C++. But I believe the advantage of sed and awk is in small, on-the-fly parsing. Some things that are easy in Perl or Python are even easier, or at least more succinct, in sed or awk. Even the creator of Perl says he uses awk for some tasks.
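Small field-extraction jobs, like pulling apart ISO dates, illustrate that succinctness; these one-liners use invented input, not real log data:

```shell
# Extract the year field from date-stamped lines: one short awk expression.
printf '2011-04-14 error\n2012-01-02 ok\n' | awk -F- '{print $1}'

# Or tally lines per year in a single pass.
printf '2011-04-14 error\n2011-05-01 error\n2012-01-02 ok\n' \
  | awk -F- '{n[$1]++} END {for (y in n) print y, n[y]}'
```

The equivalent Python is perfectly pleasant, but for throwaway work at the prompt the awk version wins on keystrokes.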
Of course, I agree, but you are addressing something wholly different. The OP didn’t say “small, on-the-fly …” He was specifically talking about parsing dates being onerous in Python.
DBus is an interesting solution in the UNIX world for reducing transaction costs between applications, similar to Microsoft’s OLE/COM… but probably has the same downsides as well.
There may not be a real benefit for command line applications, since they must still communicate with POSIX-compliant software. But modern desktop environments such as GNOME and KDE and related applications take real advantage today.
There’s Programming and there’s Solving a Problem. Programming is much easier. Making programs work together is like programming too. Your program’s building blocks are little programs, written by others. Often you have to switch to the mindset of the original authors, which might be totally different from yours. With many programs, this may mean many switches. Hence one thinks it is faster to start from scratch and build one’s own custom version of the same hack. Or a programmer (a team, or a project manager) may suffer from “not invented here” syndrome.
One of the tenets that Mike Gancarz (author of the “Unix Philosophy”) mentioned so many years ago was to avoid captive user interfaces.
So, in an environment run strictly by the unix philosophy, there would be no transaction costs between switching from a spreadsheet to a word doc because you wouldn’t be running a gui – you would be sitting in your shell writing quick and dirty scripts.
Before you think that’s impossible in today’s world: that is how I work. I work at a major bank where we have to keep supporting foreign exchange wires and troubleshooting issues. We deal with 1000 rows of data.
I don’t work directly in excel – it would take weeks to do what I do in minutes.
I pull data out of Oracle into a flat file, write quick/dirty unix/awk scripts, build csv files, ftp them to windows, and then load them into excel for sending them out.
Many times, I have to work backwards, saving excel sheets with 1000+ rows into a CSV file, ftp’ing it to unix, and using an awk script to generate sql commands for the database.
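That reverse direction can be sketched in one line of awk. The table and column names here are invented, not the commenter’s actual schema:

```shell
# CSV rows exported from the spreadsheet: id,amount,currency.
# \047 is the octal escape for a single quote, which keeps the shell
# quoting around the awk program simple.
printf '1001,250.00,USD\n1002,99.50,EUR\n' \
  | awk -F, '{printf "INSERT INTO wires VALUES (%s, %s, \047%s\047);\n", $1, $2, $3}'
```

Pipe the result into the database client and the spreadsheet-to-SQL round trip is done.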
I think that using Microsoft Windows as an example to explain the UNIX philosophy is not a good idea. Using existing tools to solve complex problems is the essence of UNIX, and that’s why it’s so powerful.
Nice post. The question is where the “transaction cost” of Unix comes from. Interestingly, it is from the third clause of the Unix philosophy. My post has the details of this observation:
http://yinwang0.wordpress.com/2011/04/14/unix
Brilliantly written post. The part about people using a familiar application for a task even when it means doing it awkwardly was spot on. But quite often I find this overly bloated software overwhelming, distracting users from the core function it is supposed to serve.
But as you said, the Unix philosophy is coming back, as more and more people are looking for lean software that does a particular thing well. So there is hope.
The Un*x philosophy works for the OS it was designed for, where text is the primary interface and the way data flows. The problem is, Windows does not have a single reliable rich data exchange format (the clipboard is just a blob, files are not naturally structured, etc.), so piping data around isn’t natively standardized. I humbly submit that if an OS had *visual* rich data pipes built into it, as a pattern for programmers to use when data flows between programs, so that users could snap programs together visually, then modern OSes would be more likely to follow in Un*x’s footsteps. The inconvenience of opening multiple programs, and the incompatibility between them, could be largely mitigated if, rather than launching programs by extension, users were prompted based on the unsatisfied data pipes. Just a thought.
@Jason Hughes:
I think there existed Khoros that could be used for visual pipelines of data. But I do not know much more about it than the abstract that I linked to.
Jason: Perhaps I muddied the waters by bringing up a Windows program as an example, though my intention was to focus on the Unix philosophy, not on Unix. Unix itself doesn’t follow the Unix philosophy, not strictly. Even Unix command line utilities have become less orthogonal over time, and I suspect the reason is that people want to avoid switching tools. For example, find duplicates the functionality of several other utilities. And that’s not necessarily a bad thing. It’s convenient, for example, to specify a regular expression in a call to find rather than piping find output to grep.
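The two styles sit side by side like this (throwaway files for illustration):

```shell
mkdir -p site/css
touch site/index.html site/css/style.css

# Orthogonal composition: find lists paths, grep filters them.
find site | grep '\.html$'

# Convenience: find has absorbed the filtering itself.
# (-regex matches against the whole path, not just the basename.)
find site -regex '.*\.html'
```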
Windows does have a single rich data exchange format, OLE (Object Linking and Embedding), and the clipboard uses it. OLE is what lets you copy from one program and paste into another in a context-sensitive way. OLE is an amazing technology. Microsoft doesn’t talk much about OLE/COM since the advent of .NET, but it’s still OLE that’s providing the plumbing for much Windows software, including Windows itself. I imagine the alternative .NET APIs are largely wrappers around COM and OLE.
On the other hand, OLE is complicated. The barrier to entry is much higher than the barrier to entry for parsing text. You don’t have to read a 1000+ page book before you can parse text. If you’re going to embed one kind of rich media inside another, you need something complicated like OLE. But this is overkill for a great deal of day-to-day work where text and pipes are adequate.
Programs with similar function to MS Office could be written in the “software tools” style of programming. We can create more complex tools as a simple composition of simpler tools.
I think that the unix philosphy is great for defining what unix is.
I am not convinced, though, that applications should be considered a part of unix — even if they run on unix.
I would also note that the standard Unix toolkit contains programs with severe and easily fixable problems because of how they have ignored important aspects of Unix (for example, xargs and its problem dealing with newline-delimited names when the names contain spaces).
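The xargs problem and its standard repair look like this (throwaway files for illustration; -print0/-0 are GNU and BSD extensions rather than strict POSIX):

```shell
mkdir -p d
touch 'd/a file.txt' d/b.txt

# Broken: xargs splits on whitespace, so "a file.txt" arrives as two
# arguments and ls fails on both halves:
#   find d -name '*.txt' | xargs ls
# The easy fix: NUL-delimit the names end to end.
find d -name '*.txt' -print0 | xargs -0 ls
```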
I loved the diagonal shortcut metaphor – very vivid.
The worst breakdown I am seeing is in services, especially those that process large volumes of data. To achieve efficiency, computation is brought to the data, very close to the data tier, and it becomes increasingly hard to keep the processing orthogonal while maintaining performance guarantees.
Business rules are hard-coded in SQL stored procedures to make processing fast, so you just can’t reconfigure them to do something else without changing them.
It’s not the result of the UNIX philosophy breaking down, but the refusal of programmers to adopt it, which creates greater complexity than sticking with the UNIX model. Emacs is a great use case for this; Emacs can actually be considered producing many programs, but they all adhere to the main premise of the function of editing text. LISP allows the communication between these sub-programs in Emacs to remain consistent to the UNIX model.
Sadly the author is right that monoliths are viewed as more cost-effective and so it’s going to be a long time before the industry accepts the Unix Philosophy, which I am a strong proponent of. I’m going to fight the battle in my little world though, simply because I’ve lost the ability and motivation to write code the bureaucratic way.
Good points and well laid out.
The Unix philosophy is relevant as long as we stay in the CLI. As we transition into a GUI world, with the complexity that brings, the old philosophy no longer holds true. Users are less savvy and expect their software to relieve them of the need to understand how their tools actually work. With this numbing of knowledge come these mega crossover applications that enable today’s users to remain productive.