<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Endeavour &#187; Software development</title>
	<atom:link href="http://www.johndcook.com/blog/category/software-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.johndcook.com/blog</link>
	<description>The blog of John D. Cook</description>
	<lastBuildDate>Fri, 10 Feb 2012 23:03:26 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Programming language popularity</title>
		<link>http://www.johndcook.com/blog/2012/02/09/programming-langauge-popularity/</link>
		<comments>http://www.johndcook.com/blog/2012/02/09/programming-langauge-popularity/#comments</comments>
		<pubDate>Thu, 09 Feb 2012 11:26:31 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10699</guid>
		<description><![CDATA[Here are two ways of measuring programming language popularity:

Rank by number of questions tagged with that language on Stack Overflow
Rank by number of projects on GitHub using that language

According to this article, these two measures are well correlated.
I&#8217;d be skeptical of either metric by itself. A large number of questions on a language could indicate [...]]]></description>
			<content:encoded><![CDATA[<p>Here are two ways of measuring programming language popularity:</p>
<ol>
<li>Rank by number of questions tagged with that language on Stack Overflow</li>
<li>Rank by number of projects on GitHub using that language</li>
</ol>
<p>According to <a href="http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/">this article</a>, these two measures are well correlated.</p>
<p>I&#8217;d be skeptical of either metric by itself. A large number of questions on a language could indicate that it&#8217;s poorly documented, for example, rather than popular. And GitHub projects may not be representative. But the two measures give similar pictures of the programming language landscape, so together they have more credibility. On the other hand, both measures are probably biased in favor of newer languages.</p>
<p><a href="http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/">The RedMonk Programming Language Rankings: February 2012</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/02/09/programming-langauge-popularity/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Perl One-Liners Explained</title>
		<link>http://www.johndcook.com/blog/2012/02/04/perl-one-liners-explained/</link>
		<comments>http://www.johndcook.com/blog/2012/02/04/perl-one-liners-explained/#comments</comments>
		<pubDate>Sat, 04 Feb 2012 18:27:38 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10638</guid>
		<description><![CDATA[Peteris Krumins has a new book, Perl One-Liners Explained. His new book is in the same style as his previous books on awk and sed, reviewed here and here.
All the books in this series are organized by task. For each task, there is a one-line solution followed by detailed commentary. The explanations frequently offer alternate [...]]]></description>
			<content:encoded><![CDATA[<p>Peteris Krumins has a new book, <a href="http://www.catonmat.net/blog/perl-book/">Perl One-Liners Explained</a>. His new book is in the same style as his previous books on <a href="http://www.catonmat.net/blog/awk-book/">awk</a> and <a href="http://www.johndcook.com/blog/2011/09/27/sed-one-liners/">sed</a>, reviewed <a href="http://www.johndcook.com/blog/2011/08/31/awk-one-liners/">here</a> and <a href="http://www.johndcook.com/blog/2011/09/27/sed-one-liners/">here</a>.</p>
<p>All the books in this series are organized by task. For each task, there is a one-line solution followed by detailed commentary. The explanations frequently offer alternate solutions with varying degrees of concision and clarity. Sections are seldom more than one page long, so the books are easy to read a little at a time.</p>
<p>Programmers who have written a lot of Perl may still learn a few things from Krumins. In particular, those who have primarily written Perl in script files may not be familiar with some of the tricks for writing succinct Perl on the command line.</p>
<p><strong>Other Perl posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/05/11/all-languages-equally-complex/">All languages equally complex?</a><br />
<a href="http://www.johndcook.com/blog/2008/02/21/periodic-table-of-perl-operators/">Periodic table of Perl operators</a><br />
<a href="http://www.johndcook.com/blog/2008/01/31/three-hour-a-week-language/">Three-hour-a-week language</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/02/04/perl-one-liners-explained/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preparing for change, expressing intent</title>
		<link>http://www.johndcook.com/blog/2012/01/17/expressing-intent/</link>
		<comments>http://www.johndcook.com/blog/2012/01/17/expressing-intent/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 13:00:21 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10442</guid>
		<description><![CDATA[Many good programming practices boil down to preparing for change or expressing intent. It seems to me that novices emphasize the former, experts the latter.
One of the first things you learn in programming is to use symbolic constants rather than magic numbers. For example, if you have a maximum of 12 items in a shopping [...]]]></description>
			<content:encoded><![CDATA[<p>Many good programming practices boil down to preparing for change or expressing intent. It seems to me that novices emphasize the former, experts the latter.</p>
<p>One of the first things you learn in programming is to use symbolic constants rather than magic numbers. For example, if you have a maximum of 12 items in a shopping cart, define a constant like <code>MAX_ITEMS</code> to be 12 and use that symbol rather than the number &#8220;12&#8221; throughout the code. That way if you have to increase the maximum to 25 some day, you can just make the change in one place. Symbolic constants prepare for change.</p>
<p>Sounds good, but then why define a constant for pi? It&#8217;s not going to change. But having a constant <code>PI</code> in source code conveys the intention of the number.</p>
<p>There are 3,628,800 seconds in six weeks. Coincidentally, this number also equals 10!. But constants like <code>SECONDS_PER_SIX_WEEKS</code> and <code>TEN_FACTORIAL</code> clearly convey where the numbers come from. That&#8217;s why it&#8217;s sometimes worthwhile to give one thing two names. The symbol <code>SECONDS_PER_SIX_WEEKS</code> looks like a conversion factor, while <code>TEN_FACTORIAL</code> makes you think somewhere there are 10 things being arranged. Using the symbols in the opposite context would be clever, but not in a good way.</p>
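<p>As a minimal sketch of giving one value two names, here is how the two constants might look in Python. The constant names come from the paragraph above; the helper function <code>six_week_blocks</code> is hypothetical and only shows the conversion-factor reading.</p>

```python
import math

# Same number, two names: each name records where the value comes from.
SECONDS_PER_SIX_WEEKS = 6 * 7 * 24 * 60 * 60  # a unit conversion factor
TEN_FACTORIAL = math.factorial(10)            # ways to arrange 10 things


def six_week_blocks(seconds):
    """Hypothetical helper: convert a duration in seconds to six-week blocks."""
    return seconds / SECONDS_PER_SIX_WEEKS


# The values coincide, but the names are not interchangeable in intent.
print(SECONDS_PER_SIX_WEEKS == TEN_FACTORIAL)
```

<p>Writing the constants as expressions rather than as the literal 3628800 also lets a reader check the arithmetic at a glance.</p>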
<p>Expressing intent is easier to justify than preparing for change. If you argue that some chunk of code should be pulled out into its own function in case it needs to change, someone may argue &#8220;But that&#8217;ll never change.&#8221; If you argue that the same chunk of code should be pulled out and given a name to express what it&#8217;s trying to do, you&#8217;re likely to get less resistance.</p>
<p><strong>If you focus on making your intentions clear, your code will be easier to maintain</strong>. If you focus on maintainability alone, it might backfire. You might get lots of <a href="http://www.johndcook.com/blog/2009/10/05/yangi/">unneeded code</a>, inserted with the intent of making future maintenance easier, that makes maintenance harder.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2011/10/21/software-maintenance/">Why does software have to be maintained?</a><br />
<a href="http://www.johndcook.com/blog/2012/01/09/holographic-source-code/">Holographic code</a><br />
<a href="http://www.johndcook.com/blog/2012/01/12/risks-of-buggy-code/">Bugs, features, and risk</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/01/17/expressing-intent/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Bugs, features, and risk</title>
		<link>http://www.johndcook.com/blog/2012/01/12/risks-of-buggy-code/</link>
		<comments>http://www.johndcook.com/blog/2012/01/12/risks-of-buggy-code/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 11:50:20 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Probability and Statistics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Quality]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10362</guid>
		<description><![CDATA[All software has bugs. Someone has estimated that production code has about one bug per 100 lines. Of course there&#8217;s some variation in this number. Some software is a lot worse, and some is a little better.
But bugs-per-line-of-code is not very useful for assessing risk. The risk of a bug is the probability of running [...]]]></description>
			<content:encoded><![CDATA[<p>All software has bugs. Someone has estimated that <strong>production code has about one bug per 100 lines</strong>. Of course there&#8217;s some variation in this number. Some software is a lot worse, and some is a little better.</p>
<p>But bugs-per-line-of-code is not very useful for assessing risk. The risk of a bug is the probability of running into it multiplied by its impact. Some lines of code are far more likely to execute than others, and some bugs are far more consequential than others.</p>
<p>Devoting equal effort to testing all lines of code would be wasteful. You&#8217;re not going to find all the bugs anyway, so you should concentrate on the parts of the code that are most likely to run and that would produce the greatest harm if they were wrong.</p>
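<p>To make the arithmetic concrete, here is a rough sketch that ranks parts of a program by risk, taken as probability times impact. The class <code>CodePath</code>, the path names, and every number below are made up for illustration; real probabilities and impacts would have to be estimated.</p>

```python
from dataclasses import dataclass


@dataclass
class CodePath:
    name: str
    probability: float  # assumed chance of hitting a bug on this path
    impact: float       # assumed cost if it goes wrong, in arbitrary units

    @property
    def risk(self):
        # Risk as defined in the post: probability multiplied by impact.
        return self.probability * self.impact


def by_risk(paths):
    """Return the paths ordered from highest to lowest risk."""
    return sorted(paths, key=lambda p: p.risk, reverse=True)


# Hypothetical figures, not measurements.
paths = [
    CodePath("report export", probability=0.2, impact=10),
    CodePath("checkout", probability=0.05, impact=1000),
    CodePath("admin reset", probability=0.001, impact=5000),
]

for path in by_risk(paths):
    print(path.name, path.risk)
```

<p>With these invented numbers, the path most likely to fail ranks last, because its failures are cheap. Testing effort would go to the top of the list first.</p>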
<p>However, here&#8217;s a complication. <strong>The probability of running into a bug can change</strong> over time as people use the software in new ways. For whatever reason, people want to use features that had not been exercised before. When they do so, they&#8217;re likely to uncover new bugs.</p>
<p>(This helps explain why everyone thinks his preferred software is more reliable than others. When you&#8217;re a typical user, you tread the well-tested paths. You also learn, often subconsciously, to avoid buggy paths. When you bring your expectations from an old piece of software to a new one, you&#8217;re more likely to uncover bugs.)</p>
<p>Even though usage patterns change, they don&#8217;t change arbitrarily. It&#8217;s still the case that some code is far more likely than other code to execute.</p>
<p><strong>Good software developers think ahead</strong>. They solve more than they&#8217;re asked to solve. They think &#8220;I&#8217;m going to go ahead and include this other case while I&#8217;m at it in case they need it later.&#8221; They&#8217;re heroes when it turns out their guesses about future needs were correct.</p>
<p>But there&#8217;s a <strong>downside to this initiative</strong>. <a href="http://www.johndcook.com/blog/2008/09/01/you-do-pay-for-what-you-dont-use/">You pay for what you don&#8217;t use</a>. Every speculative feature either has to be tested, incurring more expense up front, or delivered untested, incurring more risk. This suggests it&#8217;s better to disable unused features.</p>
<p>You cannot avoid speculation entirely. Writing maintainable software requires speculating well, anticipating and preparing for change. <strong>Good software developers place good bets</strong>, and these tend to be small bets, going to a little extra effort to make software much more flexible. As with bugs, you have to consider probabilities and consequences: how likely is this part of the software to change, and how much effort will it take to prepare for that change?</p>
<p>Developers learn from experience what aspects of software are likely to change and they prepare for that change. But then they get angry at a rookie who wastes a lot of time developing some unnecessary feature. They may not realize that the rookie is doing the same thing they are, but with a less informed idea of what&#8217;s likely to be needed in the future.</p>
<p>Disputes between developers often involve <strong>hidden assumptions about probabilities</strong>. Whether some aspect of the software is responsible preparation for maintenance or wasteful gold plating depends on your idea of what&#8217;s likely to happen in the future.</p>
<p><strong>Related post</strong>: <a href="http://www.johndcook.com/blog/2009/10/05/yangi/">Why programmers write unneeded code</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/01/12/risks-of-buggy-code/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Holographic code</title>
		<link>http://www.johndcook.com/blog/2012/01/09/holographic-source-code/</link>
		<comments>http://www.johndcook.com/blog/2012/01/09/holographic-source-code/#comments</comments>
		<pubDate>Mon, 09 Jan 2012 13:00:50 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10284</guid>
		<description><![CDATA[In a hologram, information about each small area of image is scattered throughout the hologram. You can&#8217;t say this little area of the hologram corresponds to this little area of the image. At least that&#8217;s what I&#8217;ve heard; I don&#8217;t really know how holograms work.
I thought about holograms the other day when someone was describing [...]]]></description>
			<content:encoded><![CDATA[<p>In a hologram, information about each small area of image is scattered throughout the hologram. You can&#8217;t say this little area of the hologram corresponds to this little area of the image. At least that&#8217;s what I&#8217;ve heard; I don&#8217;t really know how holograms work.</p>
<p>I thought about holograms the other day when someone was describing some source code with deeply nested templates. He told me &#8220;You can&#8217;t just read it. You can only step through the code with a debugger.&#8221; I&#8217;ve run into similar code. The execution sequence of the code at run time is almost unrelated to the sequence of lines in the source code. The run time behavior is scattered through the source code like image information in a hologram.</p>
<p>Holographic code is an advanced anti-pattern. It&#8217;s more likely to result from good practice taken to an extreme than from bad practice.</p>
<p>Somewhere along the way, programmers learn the &#8220;DRY&#8221; principle: Don&#8217;t Repeat Yourself. This is good advice, within reason. But if you wring every bit of redundancy out of your code, you end up with something like <a rel="nofollow" href="http://en.wikipedia.org/wiki/Huffman_coding">Huffman encoded</a> source. In fact, DRY is very much a compression algorithm. In moderation, it makes code easier to maintain. But carried too far, it makes reading your code like reading a zip file. Sometimes a little redundancy makes code much easier to read and maintain.</p>
<p>Code is like wine: a little dryness is good, but too much is bitter or sour.</p>
<p>Note that functional-style code can be holographic just like conventional code. A pure function is self-contained in the sense that everything the <em>function</em> needs to know comes in as arguments, i.e. there is no dependence on external state. But that doesn&#8217;t mean that everything the <em>programmer</em> needs to know is in one contiguous chunk of code. If you have to jump all over your code base to understand what&#8217;s going on anywhere, you have holographic code, regardless of what style it was written in. However, I imagine functional programs would usually be less holographic.</p>
<p><strong>Related post</strong>: <a href="http://www.johndcook.com/blog/2009/07/27/baklav-code/">Baklava code</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/01/09/holographic-source-code/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Just what do you mean by &#8216;scale&#8217;?</title>
		<link>http://www.johndcook.com/blog/2012/01/04/just-what-do-you-mean-by-scale/</link>
		<comments>http://www.johndcook.com/blog/2012/01/04/just-what-do-you-mean-by-scale/#comments</comments>
		<pubDate>Wed, 04 Jan 2012 13:00:00 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10370</guid>
		<description><![CDATA[&#8220;Fancy algorithms are slow when n is small, and n is usually small.&#8221; &#8212; Rob Pike
Someone might object that Rob Pike&#8217;s observation is irrelevant. Everything is fast when the problem size n is small, so design your code to be efficient for large n and don&#8217;t worry about small n. But it&#8217;s not that simple.
Suppose [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Fancy algorithms are slow when <em>n</em> is small, and <em>n</em> is usually small.&#8221; &#8212; Rob Pike</p>
<p>Someone might object that Rob Pike&#8217;s observation is irrelevant. Everything is fast when the problem size <em>n</em> is small, so design your code to be efficient for large <em>n</em> and don&#8217;t worry about small <em>n</em>. But it&#8217;s not that simple.</p>
<p>Suppose you have two sorting algorithms, Simple Sort and Fancy Sort. Simple Sort is more efficient for lists with fewer than 50 elements and Fancy Sort is more efficient for lists with more than 50 elements.</p>
<p>You could say that Fancy Sort scales better. What if <em>n</em> is a billion? Fancy Sort could be a lot faster.</p>
<p>But there&#8217;s another way a problem could scale. Instead of sorting <em>longer</em> lists, you could sort <em>more</em> lists. What if you have a billion lists of size 40 to sort?</p>
<p>People toss around the term &#8220;scaling,&#8221; assuming everyone has the same notion of scaling. But projects could scale along different dimensions. Whether Simple Sort or Fancy Sort scales better depends on how the problem scales.</p>
<p>The sorting example just has two dimensions: the length of each list and the number of lists. Software trade-offs are often much more complex. The more dimensions a problem has, the more opportunities there are for competing solutions to each claim that it scales better.</p>
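<p>Here is a toy cost model of the sorting example. The cost functions are assumptions picked only so that the two algorithms break even at 50 elements (quadratic cost for Simple Sort, <em>n</em> log <em>n</em> with a larger constant for Fancy Sort). They are not measurements of any real algorithm, and all the function names are hypothetical.</p>

```python
import math

CROSSOVER = 50  # list length at which the two toy cost curves meet

# Constant chosen so fancy_cost(CROSSOVER) matches simple_cost(CROSSOVER).
FANCY_CONSTANT = CROSSOVER / math.log2(CROSSOVER)


def simple_cost(n):
    """Assumed cost of Simple Sort on one list of length n."""
    return n * n


def fancy_cost(n):
    """Assumed cost of Fancy Sort on one list of length n."""
    return FANCY_CONSTANT * n * math.log2(n)


def total_cost(cost, num_lists, length):
    """Cost of sorting num_lists lists that each have the given length."""
    return num_lists * cost(length)


def cheaper(num_lists, length):
    """Name of the algorithm with the lower total cost for this workload."""
    costs = {
        "Simple Sort": total_cost(simple_cost, num_lists, length),
        "Fancy Sort": total_cost(fancy_cost, num_lists, length),
    }
    return min(costs, key=costs.get)


print(cheaper(num_lists=1, length=10**9))   # one very long list
print(cheaper(num_lists=10**9, length=40))  # a billion short lists
```

<p>Under these assumptions the answer flips between the two workloads, even though both involve a billion of something.</p>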
<p><strong>Related posts</strong>:</p>
<ul>
<li><a href="http://www.johndcook.com/blog/2011/03/23/appropriate-scale/">Appropriate scale</a></li>
<li><a href="http://www.johndcook.com/blog/2008/07/16/scaling-the-number-of-projects/">Scaling the number of projects</a></li>
<li><a href="http://www.johndcook.com/blog/2010/07/19/stupidity-scales/">Stupidity scales</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2012/01/04/just-what-do-you-mean-by-scale/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Convention versus compulsion</title>
		<link>http://www.johndcook.com/blog/2011/12/24/convention-versus-compulsion/</link>
		<comments>http://www.johndcook.com/blog/2011/12/24/convention-versus-compulsion/#comments</comments>
		<pubDate>Sat, 24 Dec 2011 15:18:41 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10291</guid>
		<description><![CDATA[An alternate title for this post could be &#8220;Software engineering wisdom from a lecture on economics given in 1945.&#8221;
F. A. Hayek gave a lecture on December 17, 1945 entitled &#8220;Individualism: True and False.&#8221; A transcript of the talk is published in his book Individualism and Economic Order. In this talk Hayek argues that societies must [...]]]></description>
			<content:encoded><![CDATA[<p>An alternate title for this post could be &#8220;Software engineering wisdom from a lecture on economics given in 1945.&#8221;</p>
<p>F. A. Hayek gave a lecture on December 17, 1945 entitled &#8220;Individualism: True and False.&#8221; A transcript of the talk is published in his book <a href="http://www.amazon.com/gp/product/0226320936/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0226320936">Individualism and Economic Order</a>. In this talk Hayek argues that societies must decide between convention and compulsion as means to coordinate activity. The former is preferable, in part because it is more flexible. Individualism depends on</p>
<blockquote><p>… traditions and conventions which evolve in a free society and which, without being enforceable, establish flexible but normally observed rules that make behavior of other people predictable in a high degree.</p></blockquote>
<p>Of course Hayek wasn&#8217;t thinking of software development, but his comments certainly are applicable to software development. Software engineers are fond of flexibility, but suspicious of rules that cannot be enforced by a machine. And yet there are some kinds of flexibility that require traditions and conventions rather than enforceable rules. Hayek looks beyond the letter of the law to the spirit: the purpose of rules in software engineering is to make the behavior of software (and software engineers) &#8220;predictable in a high degree.&#8221;</p>
<p>I&#8217;ve written a couple of blog posts on this theme. One was <a href="http://www.johndcook.com/blog/2011/05/26/software-architecture-and-trust/">Software architecture as a function of trust</a>:</p>
<blockquote><p>If you trust that your developers are highly competent and self-disciplined, you’ll organize your software differently than if you assume developers have mediocre skill and discipline. One way this shows up is the extent that you’re willing to rely on convention to maintain order. … In general, I see more reliance on convention in open source projects than in enterprise projects.</p></blockquote>
<p>Another was a post on <a href="http://www.johndcook.com/blog/2011/05/16/bumblebee-software/">the architecture of Emacs</a>:</p>
<blockquote><p>In short, Emacs expects developers to be self-disciplined and does not enforce a great deal of external discipline. However, because the software is so light on bureaucracy, it is easy to customize and to contribute to.</p></blockquote>
<p>The quotation from Hayek above continues:</p>
<blockquote><p>The willingness to submit to such rules, not merely so long as one understands the reason for them but so long as one has no definite reason to the contrary, is an essential condition for the gradual evolution and improvement of rules of social intercourse … an indispensable condition if it is to be possible to dispense with compulsion.</p></blockquote>
<p>Imagine a rookie programmer who joins a new team and only follows those conventions he fully understands. That&#8217;s not much better than the rookie doing whatever he pleases. The real benefit comes from his following the conventions he doesn&#8217;t yet understand (provided he &#8220;has no definite reason to the contrary&#8221;) because these distill the ideas of more experienced developers.</p>
<p>It takes time to pass on a set of traditions and conventions, especially to convey the rationale behind them. Machine-enforceable rules are a shortcut to establishing a culture.</p>
<p>Every project will be somewhere along a continuum between total reliance on convention and total reliance on rules a computer can check. Emacs is pretty far toward the conventional end of the spectrum, and enterprise Java projects are near the opposite end. If you want to move away from the compulsion end of the spectrum, you need more emphasis on convention.</p>
<p><strong>Related post</strong>: <a href="http://www.johndcook.com/blog/2011/01/05/style-and-understanding/">Style and understanding</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/24/convention-versus-compulsion/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The importance of being textual</title>
		<link>http://www.johndcook.com/blog/2011/12/24/the-importance-of-being-textual/</link>
		<comments>http://www.johndcook.com/blog/2011/12/24/the-importance-of-being-textual/#comments</comments>
		<pubDate>Sat, 24 Dec 2011 13:45:55 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10276</guid>
		<description><![CDATA[&#8220;When you feel the urge to design a complex binary file format, or a  complex binary application protocol, it is generally wise to lie down  until the feeling passes.&#8221; &#8212; Eric Raymond
Taken from the section of his book entitled The Importance of Being Textual.

]]></description>
			<content:encoded><![CDATA[<p>&#8220;When you feel the urge to design a complex binary file format, or a  complex binary application protocol, it is generally wise to lie down  until the feeling passes.&#8221; &#8212; Eric Raymond</p>
<p>Taken from the section of <a href="http://www.amazon.com/gp/product/0131429019/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0131429019">his book</a> entitled <a href="http://catb.org/~esr/writings/taoup/html/ch05s01.html">The Importance of Being Textual</a>.</p>
<p><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=theende-20&amp;l=as2&amp;o=1&amp;a=0131429019" border="0" alt="" width="1" height="1" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/24/the-importance-of-being-textual/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Most popular programming posts of 2011</title>
		<link>http://www.johndcook.com/blog/2011/12/20/most-popular-programming-posts-of-2011/</link>
		<comments>http://www.johndcook.com/blog/2011/12/20/most-popular-programming-posts-of-2011/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 13:01:37 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10236</guid>
		<description><![CDATA[These have been my most popular programming-related posts this year.

Why do C++ folks make things so complicated?
Plumber programmers
The myth of the Lisp genius
How to delete pages from a PDF
Programmers without computers

My favorite on the list is #5.
Post #4 was written in 2009, but it got a lot of traffic this year.
Thanks to everyone who shared [...]]]></description>
			<content:encoded><![CDATA[<p>These have been my most popular programming-related posts this year.</p>
<ol>
<li><a href="http://www.johndcook.com/blog/2011/06/14/why-do-c-folks-make-things-so-complicated/">Why do C++ folks make things so complicated?</a></li>
<li><a href="http://www.johndcook.com/blog/2011/11/15/plumber-programmers/">Plumber programmers</a></li>
<li><a href="http://www.johndcook.com/blog/2011/04/26/the-myth-of-the-lisp-genius/">The myth of the Lisp genius</a></li>
<li><a href="http://www.johndcook.com/blog/2009/11/06/how-to-delete-pages-from-a-pdf-without-adobe-acrobat/">How to delete pages from a PDF</a></li>
<li><a href="http://www.johndcook.com/blog/2011/02/28/programmers-without-computers/">Programmers without computers</a></li>
</ol>
<p>My favorite on the list is #5.</p>
<p>Post #4 was written in 2009, but it got a lot of traffic this year.</p>
<p>Thanks to everyone who shared these posts on Hacker News, Reddit, Twitter, etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/20/most-popular-programming-posts-of-2011/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Web programming</title>
		<link>http://www.johndcook.com/blog/2011/12/18/web-programming/</link>
		<comments>http://www.johndcook.com/blog/2011/12/18/web-programming/#comments</comments>
		<pubDate>Sun, 18 Dec 2011 20:40:32 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10252</guid>
		<description><![CDATA[From Greg Brockman on Twitter:

Web programming is the science of coming up with increasingly complicated ways of concatenating strings.
]]></description>
			<content:encoded><![CDATA[<p>From <span>Greg Brockman on <a href="https://twitter.com/#!/thegdb">Twitter</a>:<br />
</span></p>
<blockquote><p>Web programming is the science of coming up with increasingly complicated ways of concatenating strings.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/18/web-programming/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>New programmer&#8217;s survival manual</title>
		<link>http://www.johndcook.com/blog/2011/12/13/new-programmers-survival-manual/</link>
		<comments>http://www.johndcook.com/blog/2011/12/13/new-programmers-survival-manual/#comments</comments>
		<pubDate>Tue, 13 Dec 2011 13:01:36 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10191</guid>
		<description><![CDATA[A computer science degree doesn&#8217;t prepare you to be a programmer. Here&#8217;s an excerpt from a blog post I wrote comparing computer scientists and programmers:
I had a conversation yesterday with someone who said he needed to hire a computer scientist.  I replied that actually he needed to hire someone who could program, and that [...]]]></description>
			<content:encoded><![CDATA[<p>A computer science degree doesn&#8217;t prepare you to be a programmer. Here&#8217;s an excerpt from a <a href="http://www.johndcook.com/blog/2008/09/19/writes-large-correct-programs/">blog post</a> I wrote comparing computer scientists and programmers:</p>
<blockquote><p>I had a conversation yesterday with someone who said he needed to hire a computer scientist.  I replied that actually he needed to hire someone who could program, and that not all computer scientists could program. He disagreed, but I stood by my statement.  I’ve known too many people with computer science degrees, even advanced degrees, who were ineffective software developers.  Of course I’ve also known people with computer science degrees, especially advanced degrees, that were terrific software developers.  The most I’ll say is that programming ability is positively correlated with computer science achievement.</p></blockquote>
<p>How do you bridge the gap between obtaining a computer science degree and becoming a professional programmer? For years I&#8217;ve recommended that CS grads read <a href="http://www.amazon.com/gp/product/0735619670/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0735619670">Code Complete</a>. Now I&#8217;d also recommend <a href="http://www.amazon.com/gp/product/1934356816/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1934356816">New Programmer&#8217;s Survival Manual</a> by Josh Carter. This new book has some similarity to Code Complete. However, Code Complete is about good programming technique, not programming as a profession.</p>
<p>The Survival Manual has four parts:</p>
<ol>
<li>Professional Programming</li>
<li>People Skills</li>
<li>The Corporate World</li>
<li>Looking Forward</li>
</ol>
<p>The first part has the most similarity to Code Complete, though even there the two books are complementary. The second part, people skills, has some great advice, though I imagine most CS graduates will skim over this part because they don&#8217;t realize it is important.</p>
<p>CS students may do well to read the Survival Manual, especially parts one and three, to find out whether they want to be programmers. Some who find abstract computer science fascinating will find a typical programming job sorely disappointing. See Mike Taylor&#8217;s post <a href="http://reprog.wordpress.com/2010/03/03/whatever-happened-to-programming/">Whatever happened to programming</a>.</p>
<p>A few of these may be able to find refuge as computer science professors, but not many. If you want to become a professor and think you&#8217;ll be able to get an academic job, watch <a href="http://www.xtranormal.com/watch/7520547/so-you-want-to-get-a-phd-in-theoretical-computer-science">So you want to get a PhD in theoretical computer science</a> and read <a href="http://northwesthistory.blogspot.com/2011/11/open-letter-to-my-students-no-you.html">No, you cannot be a professor</a>.</p>
<p>The Survival Manual assumes the majority of programmers will be working in cube farms on <a href="http://www.johndcook.com/blog/2008/02/14/enterprising-software/">enterprise software</a>, which is true. But there is a small middle ground between enterprise development and academia, jobs that will give you a chance to use advanced computer science without having to write papers about it.</p>
<p>One reservation I have about this book is that it may be overwhelming. If you have a friend who is starting a new career as a programmer, maybe you could buy a copy of the Survival Manual and rip it into chapters. Then mail your friend one chapter a week.</p>
<p>Another reservation I have is that new CS graduates may not benefit much from the book because they won&#8217;t believe it. They may deny that the real world is as Josh Carter describes.</p>
<p>The people who may benefit the most from reading the Survival Manual are programmers with some experience who want to improve their skills. They may have learned through hard knocks about some of the challenges Carter writes about. Also, Carter describes life in a software shop with fairly high standards. Those who are used to producing lower quality software will do well to read about life in an organization with higher standards.</p>
<p><a href="http://www.amazon.com/gp/product/1934356816/ref=as_li_ss_il?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1934356816"><img src="http://ws.assoc-amazon.com/widgets/q?_encoding=UTF8&amp;Format=_SL160_&amp;ASIN=1934356816&amp;MarketPlace=US&amp;ID=AsinImage&amp;WS=1&amp;tag=theende-20&amp;ServiceVersion=20070822" border="0" alt="" /></a><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=theende-20&amp;l=as2&amp;o=1&amp;a=1934356816" border="0" alt="" width="1" height="1" /></p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/03/18/where-does-the-programming-effort-go/">Where does programming effort go?</a><br />
<a href="http://www.johndcook.com/blog/2011/01/25/coming-full-circle/">Coming full circle</a><br />
<a href="http://www.johndcook.com/blog/2011/05/17/writing-software-is-harder-than-writing-books/">Writing software is harder than writing books</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/13/new-programmers-survival-manual/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Three views of Windows and Unix</title>
		<link>http://www.johndcook.com/blog/2011/12/09/three-views-of-windows-and-unix/</link>
		<comments>http://www.johndcook.com/blog/2011/12/09/three-views-of-windows-and-unix/#comments</comments>
		<pubDate>Fri, 09 Dec 2011 14:16:09 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Unix]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10177</guid>
		<description><![CDATA[Rob Pike gave a presentation in 2001 entitled &#8220;The Good, the Bad, and the Ugly: The Unix Legacy.&#8221; His main point is that diversity has been bad for Unix. He opens his presentation with a couple of quotes to set this up.
‘‘The number of UNIX installations has grown to 10, with more expected.’’ — The [...]]]></description>
			<content:encoded><![CDATA[<p>Rob Pike gave a <a href="http://doc.cat-v.org/bell_labs/good_bad_ugly/slides.pdf">presentation</a> in 2001 entitled &#8220;The Good, the Bad, and the Ugly: The Unix Legacy.&#8221; His main point is that diversity has been bad for Unix. He opens his presentation with a couple of quotes to set this up.</p>
<blockquote><p>‘‘The number of UNIX installations has grown to 10, with more expected.’’ — The UNIX Programmer’s Manual, 2nd Edition, June, 1972.</p></blockquote>
<blockquote><p>The number of UNIX <em>variants</em> has grown to dozens, with more expected.</p></blockquote>
<p>He discusses much more than diversity, and I believe the more interesting parts of his talk are on other topics, but he begins and ends with diversity. One of his last slides says</p>
<blockquote><p>Microsoft succeeds not because it’s good, but because there’s only one of them. … Unixes of the World, Unite!</p></blockquote>
<p>Joel Spolsky has a different take on the differences between the operating systems in his article <a href="http://www.joelonsoftware.com/articles/Biculturalism.html">Biculturalism</a>. Spolsky says that Unix software is programmer-friendly but Windows software is user-friendly for the vast majority of users who are not programmers. But Spolsky does touch on the diversity issue that Pike raised.</p>
<blockquote><p>For example, Unix has a value of separating policy from mechanism which, historically, came from the designers of X. This directly led to a schism in user interfaces; nobody has ever quite been able to agree on all the details of how the desktop UI should work, <em>and they think this is OK</em>, because their culture values this diversity, but for Aunt Marge it is very much not OK to have to use a different UI to cut and paste in one program than she uses in another.</p></blockquote>
<p>Just to throw in my two cents&#8217; worth, I&#8217;ll mention my blog post <a href="http://www.johndcook.com/blog/2010/06/30/where-the-unix-philosophy-breaks-down/">Where the Unix philosophy breaks down</a>. The Unix philosophy is to write little programs that do one thing well, then sew these little programs together to do your work. The problem is that many people lack the desire or skill to do the sewing. They want to avoid the transaction costs of switching software applications. Pike alludes to this problem, dismissively saying that people want &#8220;hand-holding&#8221; rather than pipes.</p>
<p>I don&#8217;t think this desire for integrated applications is necessarily a problem for Unix, only for the Unix philosophy that Unix doesn&#8217;t follow too strictly. The emphasis on orthogonal programs is a laudable ideal. It just needs to be tempered a bit for the convenience of mortal users.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/09/three-views-of-windows-and-unix/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Global variables</title>
		<link>http://www.johndcook.com/blog/2011/12/01/global-variables/</link>
		<comments>http://www.johndcook.com/blog/2011/12/01/global-variables/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 13:00:19 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10068</guid>
		<description><![CDATA[Here&#8217;s an answer I gave on Stack Overflow to someone asking when it&#8217;s OK to use global variables.

Here&#8217;s a cheap way to get rid of all global  variables: put all your code in one big fat class and change the global  variables to member variables.  Nothing has changed as far as the [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s an answer I gave on <a href="http://stackoverflow.com/questions/357187/when-are-global-variables-acceptable/357361#357361">Stack Overflow</a> to someone asking when it&#8217;s OK to use global variables.</p>
<div>
<blockquote><p>Here&#8217;s a cheap way to get rid of all global  variables: put all your code in one big fat class and change the global  variables to member variables.  Nothing has changed as far as the  maintainability of your code, but technically it no longer has global  variables.</p>
<p>It&#8217;s better to talk about size of scope than whether or not something  is global. &#8220;Global&#8221; just means maximum scope.  Instead of saying  &#8220;global variables are bad,&#8221; I think it&#8217;s more helpful to say &#8220;minimize  variable scope.&#8221;</p>
<p>A global variable in a 100-line program has a scope of 100 lines.   But a member variable in a 1000-line class has a scope of 1000 lines.   The latter may be worse.</p></blockquote>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/12/01/global-variables/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Fundamental theorem of code readability</title>
		<link>http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/</link>
		<comments>http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 22:08:02 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10088</guid>
		<description><![CDATA[In The Art of Readable Code, the authors call the following the &#8220;Fundamental Theorem of Readability&#8221;:
Code should be written to minimize the time it would take for someone else to understand it.
They go on to explain
And when we say &#8220;understand,&#8221; we have a very high bar … they should be able to make changes to [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://www.amazon.com/gp/product/0596802293/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=0596802293">The Art of Readable Code</a>, the authors call the following the &#8220;Fundamental Theorem of Readability&#8221;:</p>
<blockquote><p>Code should be written to minimize the time it would take for someone else to understand it.</p></blockquote>
<p>They go on to explain</p>
<blockquote><p>And when we say &#8220;understand,&#8221; we have a very high bar … they should be able to make changes to it, spot bugs, and understand how it interacts with the rest of your code.</p></blockquote>
<p><a href="http://www.amazon.com/gp/product/0596802293/ref=as_li_ss_il?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=0596802293"><img src="http://ws.assoc-amazon.com/widgets/q?_encoding=UTF8&amp;Format=_SL160_&amp;ASIN=0596802293&amp;MarketPlace=US&amp;ID=AsinImage&amp;WS=1&amp;tag=theende-20&amp;ServiceVersion=20070822" border="0" alt="" /></a><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=theende-20&amp;l=as2&amp;o=1&amp;a=0596802293&amp;camp=217145&amp;creative=399373" border="0" alt="" width="1" height="1" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Readability</title>
		<link>http://www.johndcook.com/blog/2011/11/28/readability/</link>
		<comments>http://www.johndcook.com/blog/2011/11/28/readability/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 14:52:34 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Typography]]></category>
		<category><![CDATA[Literate programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10071</guid>
		<description><![CDATA[The Readability bookmarklet lets you reformat any web page to make it easier to read. It strips out flashing ads and other distractions. It uses black text on a white background, wide margins, a moderate-sized font, etc. I use Readability fairly often. (Instapaper is a similar service. I discuss it at the end of this post.)
Yesterday [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.readability.com/bookmarklets">Readability bookmarklet</a> lets you reformat any web page to make it easier to read. It strips out flashing ads and other distractions. It uses black text on a white background, wide margins, a moderate-sized font, etc. I use Readability fairly often. (Instapaper is a similar service. I discuss it at the end of this post.)</p>
<p>Yesterday I used it to reformat an <a href="http://axiom-developer.org/axiom-website/litprog.html">article</a> on literate programming. For some inexplicable reason, the author chose to use a lemon yellow background. It&#8217;s ironic that the article is about making source  code easier to read. The <em>content</em> of the article is easy to read, but the <em>format</em> is not.</p>
<p>Readability to the rescue! Here are before and after screen shots.</p>
<p>Before:</p>
<p style="text-align:center"><img src="http://www.johndcook.com/lemon.png" alt="" width="252" height="228" /></p>
<p>After:</p>
<p style="text-align:center"><img src="http://www.johndcook.com/readability" alt="" width="283" height="213" /></p>
<p>I recommend the article, <a href="http://axiom-developer.org/axiom-website/litprog.html">Example of Literate Programming in HTML</a>, and I also recommend reformatting the page unless you enjoy reading black text on a yellow background.</p>
<p>Readability did a good job until about halfway through the article. The article has C and HTML code examples, and perhaps these confused Readability. (Readability usually handles code samples well. It correctly formats the first few code samples in this article.) The last half of the article renders like source code, and the font gets smaller and smaller.</p>
<p style="text-align:center"><img src="http://www.johndcook.com/readability_fail.png" alt="" width="181" height="161" /></p>
<p>I ran the page through an <a href="http://htmlhelp.com/tools/validator/">HTML validator</a> to see whether some malformed HTML could be the source of the problem. The validator found numerous problems, so perhaps that was the issue.</p>
<p>I haven&#8217;t seen Readability fail like this before. I&#8217;ve been surprised how well it has handled some pages I thought might trip it up.</p>
<p>I ended up saving the article and editing its source, changing the <code>bgcolor</code> value to white. It&#8217;s a nice article on literate programming once you get past the formatting. The best part of the article is the first section, and that much Readability formats correctly.</p>
<p><strong>Instapaper</strong></p>
<p><a href="http://www.instapaper.com/">Instapaper</a> reformats web pages similarly. It produces a narrower column of text, but otherwise the output looks quite similar.</p>
<p>Instapaper did not discover the title of the literate programming article. (The title of the article was not in an <code>&lt;h1&gt;</code> tag as software might expect but was only in a <code>&lt;title&gt;</code> tag in the page header.) However, it did format the entire body of the article correctly.</p>
<p>I find it slightly more convenient to use the Readability bookmarklet than to submit a link to Instapaper. I imagine there are browser plug-ins that make Instapaper just as easy to use, though I haven&#8217;t looked into this because I&#8217;m usually satisfied with Readability.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2008/01/15/literate-programming-and-statistics/">Literate programming and statistics</a><br />
<a href="http://www.johndcook.com/blog/2008/04/07/tricky-code/">Tricky code</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/28/readability/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>The Tangled Web</title>
		<link>http://www.johndcook.com/blog/2011/11/27/the-tangled-web/</link>
		<comments>http://www.johndcook.com/blog/2011/11/27/the-tangled-web/#comments</comments>
		<pubDate>Sun, 27 Nov 2011 20:00:00 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=10059</guid>
		<description><![CDATA[The Tangled Web is a security book that you may find interesting even if you&#8217;re not interested in security. The first half of the book is an excellent explanation of how Web technologies work in theory and especially in practice. This material is included in order to discuss security implications, but it&#8217;s interesting on its [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.amazon.com/gp/product/1593273886/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1593273886">The Tangled Web</a> is a security book that you may find interesting even if you&#8217;re not interested in security. The first half of the book is an excellent explanation of how Web technologies work in theory and especially in practice. This material is included in order to discuss security implications, but it&#8217;s interesting on its own. The second half of the book is directly devoted to Web security.</p>
<p>The author, Michal Zalewski, has a colorful writing style. His book is serious and loaded with technical detail, but that doesn&#8217;t stop him from turning a nice phrase here and there.</p>
<p>Here&#8217;s an excerpt from The Tangled Web that I particularly liked, one that explains why security concerns on the Web differ from previous security concerns.</p>
<blockquote><p>In the traditional model followed by virtually all personal computers … there are very clear boundaries between high-level data objects (documents), user-level code (applications), and the operating system kernel … These boundaries are well studied and useful for building practical security schemes. A file opened in your text editor is unlikely to be able to steal your email …</p>
<p>In the browser world, this separation is practically nonexistent: Documents and code live as parts of the same intermingled blobs of HTML, isolation between completely unrelated applications is partial at best …</p>
<p>In the end, the seemingly unlikely scenario of a text file stealing your email is, in fact, a frustratingly common pattern on the Web.</p></blockquote>
<p><a href="http://www.amazon.com/gp/product/1593273886/ref=as_li_ss_il?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1593273886"><img src="http://ws.assoc-amazon.com/widgets/q?_encoding=UTF8&amp;Format=_SL160_&amp;ASIN=1593273886&amp;MarketPlace=US&amp;ID=AsinImage&amp;WS=1&amp;tag=theende-20&amp;ServiceVersion=20070822" border="0" alt="" /></a><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=theende-20&amp;l=as2&amp;o=1&amp;a=1593273886&amp;camp=217145&amp;creative=399373" border="0" alt="" width="1" height="1" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/27/the-tangled-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Norris&#8217; number</title>
		<link>http://www.johndcook.com/blog/2011/11/22/norris-number/</link>
		<comments>http://www.johndcook.com/blog/2011/11/22/norris-number/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 13:00:20 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9792</guid>
		<description><![CDATA[My friend Clift Norris has identified a fundamental constant that I call Norris&#8217; number, the average amount of code an untrained programmer can write before he or she hits a wall. Clift estimates this as 1,500 lines. Beyond that the code becomes so tangled that the author cannot debug or modify it without herculean effort.
Related [...]]]></description>
			<content:encoded><![CDATA[<p>My friend Clift Norris has identified a fundamental constant that I call Norris&#8217; number, the average amount of code an untrained programmer can write before he or she hits a wall. Clift estimates this as 1,500 lines. Beyond that the code becomes so tangled that the author cannot debug or modify it without herculean effort.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2008/09/19/writes-large-correct-programs/">Writes large correct programs</a><br />
<a href="http://www.johndcook.com/blog/2010/02/03/little-programs-versus-big-programs/">Little programs versus big programs</a><br />
<a href="http://www.johndcook.com/blog/2008/06/03/experienced-programmers-and-lines-of-code/">Experienced programmers and lines of code</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/22/norris-number/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>Career advice regarding tools</title>
		<link>http://www.johndcook.com/blog/2011/11/21/career-advice-regarding-tools/</link>
		<comments>http://www.johndcook.com/blog/2011/11/21/career-advice-regarding-tools/#comments</comments>
		<pubDate>Mon, 21 Nov 2011 15:10:27 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9801</guid>
		<description><![CDATA[
A few weeks ago, J. D. Long gave some interesting advice in a Google+ discussion. He starts out
Lunch today with an analyst 13 years my junior made me think about  things I wish I had known about the technical analytical profession when  I was 25. Here&#8217;s some things that popped into my head:
The [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.johndcook.com/jdlong.jpeg" alt="J. D. Long wearing a Panama and smoking a Dominican" width="254" height="208" /></p>
<p>A few weeks ago, <a href="https://plus.google.com/107121399840634452924/posts">J. D. Long</a> gave some interesting advice in a Google+ discussion. He starts out</p>
<blockquote><p>Lunch today with an analyst 13 years my junior made me think about  things I wish I had known about the technical analytical profession when  I was 25. Here&#8217;s some things that popped into my head:</p></blockquote>
<p>The entire list is worth reading, but I want to focus on two things he said about tools.</p>
<ul>
<li>Use tools you don&#8217;t have to ask permission to install (i.e. open source).</li>
<li>Dependence on tools that are closed license and un-scriptable will   limit the scope of problems you can solve. (i.e. Excel) Use them, but   build your core skills on more portable &amp; scalable technologies.</li>
</ul>
<p>I would have disagreed a few years ago, but now I think this is good advice.</p>
<p>In the late &#8217;90s I used mostly Microsoft tools. That was a good time to be a Microsoft developer. Windows was on the rise; Unix and Mac OS were on the ropes. Desktop applications were the norm and were easier to write on Windows. Open source software was hard to install and hard to use. People who used open source software often did so for ideological reasons, not because it made their work easier.</p>
<p>Of course times have changed. Mac recovered from its near death experience. Unix didn&#8217;t, but it has been resurrected as Linux. The web made it easier to write cross-platform software. And above all, open source software has matured. The open source community is more positive, focused on promoting good software rather than trying to give some corporation a stick in the eye.</p>
<p>Now the advantages of open source are clearer. There&#8217;s not the same hidden cost in frustration that there was a few years ago. Now I would say yes, it&#8217;s a great advantage to use tools you can install whenever and wherever you want, without having to go through a purchasing bureaucracy.</p>
<p>It&#8217;s interesting that JD equates open source with scriptability. Open source software often is scriptable, not because it&#8217;s open source, but because of the Unix aesthetic that pervades the open source community. Closed source software is often not scriptable, not because it&#8217;s closed source, but because it is often written for consumers who value <a href="http://www.johndcook.com/blog/2011/08/15/usability-versus-composability/">usability over composability</a>. Commercial server-side products may be scriptable. If I were to restate JD&#8217;s advice on this point, I&#8217;d say to keep composability in mind and don&#8217;t just think about usability.</p>
<p>I appreciate JD&#8217;s attitude toward applications such as Excel. He&#8217;s not saying you should never defile your conscience by opening Excel. Some tasks are incredibly easy in Excel. The danger comes from pushing the tool into territory where other tools are better. There are still some in the open source community who believe that opening Excel is a sin, but I&#8217;m much more in agreement with the people who say, for example, that Excel isn&#8217;t the best tool for statistical analysis.</p>
<p>Portability is funny. In the early days of computing, there were no dominant players, and portability was important (and difficult). Then for a while, portability didn&#8217;t matter if you were content with only running on the 95% of the world&#8217;s computers that ran Windows. Now portability is important again. Windows still has a huge market share on the desktop, but the desktop itself is losing market share.</p>
<p>And portability matters for more than consumer operating systems. JD mentions portability and scalability in one breath. You may want to move code between operating systems to scale up (e.g. to run on a cluster) or to scale down (e.g. to run on a mobile device).</p>
<p>There&#8217;s also the aspect of career portability. You want to master tools that you can take with you from job to job. I would be leery of building a career around a small company&#8217;s proprietary tools. If I were in that situation, I&#8217;d learn something else on the side that&#8217;s more portable.</p>
<p>In closing, I&#8217;ll give the rest of JD&#8217;s career advice without commentary. These points could make interesting fodder for future blog posts.</p>
<ul>
<li>Be a profit center, not a cost center.</li>
<li>Use tools you don&#8217;t have to ask permission to install (i.e. open source).</li>
<li>Dependence on tools that are closed license and un-scriptable will limit the scope of problems you can solve. (i.e. Excel) Use them, but build your core skills on more portable &amp; scalable technologies.</li>
<li>Learn basic database tools.</li>
<li>Learn a programming language.</li>
<li>Your internal job description may say, &#8220;Analyst&#8221; but get something else on your business cards. Analyst is so vague as to be meaningless. My external title is currently &#8220;Sr. Risk Economist.&#8221; I like the term &#8220;Data Scientist&#8221; for now. I expect that term will be meaningless in 5 years.</li>
<li>Large organizations do not properly appreciate agile and smart analytic types. Time at large firms should be seen as subsidized learning. Learn lots, but get out.</li>
<li>Ensure you can explain any of your projects to your wife or non-technical friends. It&#8217;s good practice for board meetings later in your career.</li>
<li>Be sure you know the handful of things that you can do better than most anyone else. Add something to that list every year. Make sure you can explain these things to non techies.</li>
<li>Be a profit center, not a cost center. At least be as close to the profit center as possible. The chief analyst for the sales SVP is closer to the profit center than an IT analyst supporting billing operations.</li>
<li>Get really good at asking questions so you understand problems before you start solving them.</li>
<li>Yes, that bit about being a profit center not a cost center is in there twice. It should probably be in there 5 times.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/21/career-advice-regarding-tools/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>The plumber programmer</title>
		<link>http://www.johndcook.com/blog/2011/11/15/plumber-programmers/</link>
		<comments>http://www.johndcook.com/blog/2011/11/15/plumber-programmers/#comments</comments>
		<pubDate>Tue, 15 Nov 2011 12:58:54 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9940</guid>
		<description><![CDATA[I called someone a plumber programmer the other day. The person I was speaking to didn&#8217;t realize that &#8220;plumber programmer&#8221; is a term of great respect. The plumber is often the most experienced programmer on a team.
As with literal plumbing, software plumbing connects things together. It deals with things other people don&#8217;t want to see [...]]]></description>
			<content:encoded><![CDATA[<p>I called someone a plumber programmer the other day. The person I was speaking to didn&#8217;t realize that &#8220;plumber programmer&#8221; is a term of great respect. The plumber is often the most experienced programmer on a team.</p>
<p>As with literal plumbing, software plumbing connects things together. It deals with things other people don&#8217;t want to see or think about. And it&#8217;s crucial.</p>
<p><a href="http://wordaligned.org/articles/distorted-software">Thomas Guest</a> made a couple of diagrams that illustrate this. Managers draw software diagrams with big boxes and little arrows. The boxes represent software components and the arrows represent the code that connects them together.</p>
<p style="text-align:center"><img src="http://www.johndcook.com/application.png" alt="" width="300" height="201" /></p>
<p>This gives the impression that the boxes are the hard part and the arrows are easy. The opposite is probably true. Thomas says if we drew the diagram so that the size of the components is proportional to the effort, it might look like this:</p>
<p style="text-align:center"><img src="http://www.johndcook.com/distorted-application.png" alt="" width="300" height="232" /></p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/03/18/where-does-the-programming-effort-go/">Where does programming effort go?</a><br />
<a href="http://www.johndcook.com/blog/2011/01/14/your-job-is-trivial-but-i-couldnt-do-it/">Your job is trivial (but I couldn&#8217;t do it)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/15/plumber-programmers/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
		<item>
		<title>Separating presentation from content</title>
		<link>http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/</link>
		<comments>http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 18:47:29 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[LaTeX]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9956</guid>
		<description><![CDATA[In the late &#8217;90s I went to a fair number of Microsoft presentations. One presentation would say &#8220;The problem with Technology X is that it mixes presentation and content. We&#8217;ve introduced Technology Y to make your code cleaner, separating presentation and content.&#8221; A few months later I&#8217;d be at another presentation that would announce &#8220;The [...]]]></description>
			<content:encoded><![CDATA[<p>In the late &#8217;90s I went to a fair number of Microsoft presentations. One presentation would say &#8220;The problem with Technology X is that it mixes presentation and content. We&#8217;ve introduced Technology Y to make your code cleaner, separating presentation and content.&#8221; A few months later I&#8217;d be at another presentation that would announce &#8220;The problem with Technology Y is that it mixes presentation and content. We&#8217;ve introduced Technology Z …&#8221; (Does this remind anyone else of <a href="http://www.amazon.com/gp/product/0394800028/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0394800028">The Cat in the Hat Comes Back</a>?)</p>
<p>When I first learned LaTeX, I was told that one of its strengths is that it separates presentation and content. Then a few years later I hear complaints that the problem with LaTeX is that it mingles presentation and content, unlike XHTML. A few years later, guess what? XHTML mixes presentation and content, so we need something else.</p>
<p>I shut down when I hear someone announce that everything before their product was bad because it mixed presentation and content, and now with their solution, presentation and content will be <em>completely</em> separate.</p>
<p>Sometimes one technology really does make a cleaner separation of presentation and content. But at best the separation is relative. LaTeX separates presentation and content more than Word, though not as much as well-written HTML and CSS, maybe. But presentation and content cannot be <em>entirely</em> separated. Nor is there unanimous agreement on what exactly the dividing line is between the two.</p>
<p>Many people don&#8217;t want to separate their presentation and content. They don&#8217;t understand why this would be desirable, and they&#8217;ll fight against anything designed to encourage separation. Maybe they need to learn the advantages, or maybe they&#8217;re just doing the best they can to get their job done and they can&#8217;t be bothered with long term advantages that may not materialize.</p>
<p>The principle of separating presentation and content is admirable. It really does have advantages, but it&#8217;s easier said than done.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Unix tool tips</title>
		<link>http://www.johndcook.com/blog/2011/11/10/unix-tool-tips/</link>
		<comments>http://www.johndcook.com/blog/2011/11/10/unix-tool-tips/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 03:06:07 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9908</guid>
		<description><![CDATA[I&#8217;ve renamed my SedAwkTip Twitter account to UnixToolTip to reflect its new scope. If you were following SedAwkTip, there&#8217;s no need to do anything. You&#8217;ll just see a different name.
I have about a week&#8217;s worth of sed and awk tips scheduled. Then I&#8217;ll start adding in tips on grep, find, uniq, etc. And I&#8217;ll come [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve renamed my SedAwkTip Twitter account to <a href="https://twitter.com/#!/unixtooltip">UnixToolTip</a> to reflect its new scope. If you were following SedAwkTip, there&#8217;s no need to do anything. You&#8217;ll just see a different name.</p>
<p>I have about a week&#8217;s worth of <code>sed</code> and <code>awk</code> tips scheduled. Then I&#8217;ll start adding in tips on <code>grep</code>, <code>find</code>, <code>uniq</code>, etc. And I&#8217;ll come back to <code>sed</code> and <code>awk</code> now and then.</p>
<p>These tools came from the Unix world, but they&#8217;re also available on <a href="http://gnuwin32.sourceforge.net/packages.html">Windows</a>.</p>
<p>For now I&#8217;m keeping the original icon. I&#8217;m open to suggestions if someone has an idea for a better icon.</p>
<p><a href="https://twitter.com/#!/unixtooltip"><img class="alignnone" src="http://www.johndcook.com/SedAwk_32.png" alt="s///" width="32" height="32" /></a></p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2011/09/06/thermonuclear-word-processor/">Thermonuclear word processor</a><br />
<a href="http://www.johndcook.com/blog/2011/08/06/perverse-hipster-desire-for-retro-computing/">Retro computing</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/10/unix-tool-tips/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Firsthand knowledge</title>
		<link>http://www.johndcook.com/blog/2011/11/06/firsthand-knowledge/</link>
		<comments>http://www.johndcook.com/blog/2011/11/06/firsthand-knowledge/#comments</comments>
		<pubDate>Sun, 06 Nov 2011 21:37:56 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9866</guid>
		<description><![CDATA[From C. S. Lewis:
It has always therefore been one of my main endeavors as a teacher to persuade the young that firsthand knowledge is not only more worth acquiring than secondhand knowledge, but is usually much easier and more delightful to acquire.
This quote comes from the essay On the Reading of Old Books, part of [...]]]></description>
			<content:encoded><![CDATA[<p>From C. S. Lewis:</p>
<blockquote><p>It has always therefore been one of my main endeavors as a teacher to persuade the young that firsthand knowledge is not only more worth acquiring than secondhand knowledge, but is usually much easier and more delightful to acquire.</p></blockquote>
<p>This quote comes from the essay <em>On the Reading of Old Books</em>, part of the collection <a href="http://www.amazon.com/gp/product/0802808689/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0802808689">God in the Dock: Essays on Theology and Ethics</a>. Lewis says here that it is easier to read Plato or St. Paul, for example, than to read books <em>about</em> Plato or St. Paul.  Lewis says that the fear of reading great authors</p>
<blockquote><p>… springs from humility. The student is half afraid to meet one of the great philosophers face to face. He feels himself inadequate and thinks he will not understand him. But if he only knew, the great man, just because of his greatness, is much more intelligible than his modern commentators.</p></blockquote>
<p>This does not only apply to literature. I see the same theme in math. Sometimes <a href="http://www.johndcook.com/blog/2009/02/14/old-math-books/">early math papers</a> are easier to read because they are more concrete. When I was a postdoc at Vanderbilt I asked <a href="http://en.wikipedia.org/wiki/Richard_Arenstorf">Richard Arenstorf</a> about a theorem attributed to him in a book I was reading.  He scoffed that he didn&#8217;t recognize it. He had done his work in a relatively concrete setting and did not approve of the fancy window dressing the author had placed around his theorem. I sat in on a few lectures by Arenstorf and found them amazingly clear.</p>
<p>The same theme appears in software development. Sometimes you can dive to the bottom of an abstraction hierarchy and find that things are simpler there than you would have supposed. The intervening layers obscure the substance of the program, making its core seem unduly mysterious. Like a mediocre mind commenting on the work of a great mind, developers who build layers of software around core functionality intend to make things easier but may do the opposite.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2010/07/05/endless-preparation/">Endless preparation</a><br />
<a href="http://www.johndcook.com/blog/2009/10/14/opening-black-boxes/">Opening black boxes</a><br />
<a href="http://www.johndcook.com/blog/2009/08/16/why-shakespeare-is-hard-to-read/">Why Shakespeare is hard to read</a><br />
<a href="http://www.johndcook.com/blog/2009/01/19/c-s-lewis-on-reading-old-books/">C. S. Lewis on reading old books</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/06/firsthand-knowledge/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Code bloat</title>
		<link>http://www.johndcook.com/blog/2011/11/01/code-bloat/</link>
		<comments>http://www.johndcook.com/blog/2011/11/01/code-bloat/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 00:00:27 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9833</guid>
		<description><![CDATA[&#8220;Back when I was starting out in computer science I thought by today we’d be writing a few lines of code to accomplish much. Instead, we write hundreds of thousands of lines of code to accomplish little.&#8221; &#8212; Lispian
]]></description>
			<content:encoded><![CDATA[<p>&#8220;Back when I was starting out in computer science I thought by today we’d be writing a few lines of code to accomplish much. Instead, we write hundreds of thousands of lines of code to accomplish little.&#8221; &#8212; <a href="http://lispian.net/2011/11/01/lasagna-code/">Lispian</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/01/code-bloat/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Floating point error is the least of my worries</title>
		<link>http://www.johndcook.com/blog/2011/11/01/floating-point-worries/</link>
		<comments>http://www.johndcook.com/blog/2011/11/01/floating-point-worries/#comments</comments>
		<pubDate>Tue, 01 Nov 2011 12:00:13 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9825</guid>
		<description><![CDATA[&#8220;Nothing brings fear to my heart more than a floating point number.&#8221; &#8212; Gerald Jay Sussman
The context of the above quote was Sussman&#8217;s presentation We really don&#8217;t know how to compute. It was a great presentation and I&#8217;m very impressed by Sussman. But I take exception to his quote.
I believe what he meant by his [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Nothing brings fear to my heart more than a floating point number.&#8221; &#8212; Gerald Jay Sussman</p>
<p>The context of the above quote was Sussman&#8217;s presentation <a href="http://www.infoq.com/presentations/We-Really-Dont-Know-How-To-Compute">We really don&#8217;t know how to compute</a>. It was a great presentation and I&#8217;m very impressed by Sussman. But I take exception to his quote.</p>
<p>I believe what he meant by his quote was that he finds floating point arithmetic unsettling because it is not as easy to rigorously understand as integer arithmetic. Fair enough. Floating point arithmetic can be tricky. Things can go spectacularly bad for reasons that catch you off guard if you&#8217;re unprepared. But I&#8217;ve been doing numerical programming long enough that I believe I know where the landmines are and how to stay away from them. And even if I&#8217;m wrong, I have bigger worries.</p>
<p>Nothing brings fear to my heart more than <strong>modeling error</strong>.</p>
<p>The weakest link in applied math is often the step of turning a physical problem into a mathematical problem.  We begin with a raft of assumptions that are educated guesses. We know these assumptions can&#8217;t be exactly correct, but we suspect (hope) that the deviations from reality are small enough that they won&#8217;t invalidate the conclusions. In any case, these assumptions are usually far more questionable than the assumption that floating point arithmetic is sufficiently accurate.</p>
<p>Modeling error is usually several orders of magnitude greater than floating point error. People who nonchalantly model the real world and then sneer at floating point as <a href="http://www.johndcook.com/blog/2011/09/30/just-an-approximation/">just an approximation</a> strain at gnats and swallow camels.</p>
<p>In between modeling error and floating point error on my scale of worries is <strong>approximation error</strong>. As <a href="http://people.maths.ox.ac.uk/trefethen/">Nick Trefethen</a> has said, if computers were suddenly able to do arithmetic with perfect accuracy, 90% of numerical analysis would remain important.</p>
<p>To illustrate the difference between modeling error, approximation error, and floating point error, suppose you decide that the probability of something can be represented by a normal distribution. This is actually two assumptions: that the process is random, and that as a random variable it has a normal distribution. Those assumptions won&#8217;t be exactly true, so this introduces some modeling error.</p>
<p>Next we have to compute something about a normal distribution, say the probability of a normal random variable being in some range. This probability is given by an integral, and some algorithm estimates this integral and introduces approximation error. The approximation error would exist even if the steps in the algorithm could be carried out in infinite precision. But the steps are not carried out with infinite precision, so there is some error introduced by implementing the algorithm with floating point numbers.</p>
<p>For a simple example like this, approximation error and floating point error will typically be about the same size, both extremely small. But in a more complex example, say something involving a high-dimensional integral, the approximation error could be much larger than floating point error, but still smaller than modeling error. I imagine approximation error is often roughly the geometric mean of modeling error and floating point error, i.e. somewhere around the middle of the two on a log scale.</p>
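<p>Here is a minimal sketch in Python of the simple example above. The function names are illustrative assumptions, and Simpson&#8217;s rule stands in for whatever integration algorithm one might actually use. It estimates the probability that a standard normal random variable falls between -1 and 1 and compares the estimate with a reference value computed from <code>math.erf</code>. For small <code>n</code> the gap is almost entirely approximation error. As <code>n</code> grows the gap should shrink until floating point rounding sets the floor.</p>
<pre><code>import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def simpson(f, a, b, n):
    """Composite Simpson's rule with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def normal_prob_reference(a, b):
    """Probability that a standard normal lies between a and b, via erf."""
    root2 = math.sqrt(2.0)
    return 0.5 * (math.erf(b / root2) - math.erf(a / root2))

reference = normal_prob_reference(-1.0, 1.0)
for n in (4, 16, 64, 256):
    estimate = simpson(normal_pdf, -1.0, 1.0, n)
    # The printed difference is dominated by approximation error.
    print(n, abs(estimate - reference))
</code></pre>
<p>Neither source of error in this sketch says anything about whether a normal distribution was the right model in the first place.</p>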
<p>In Sussman&#8217;s presentation he says that people worry too much about correctness. Often correctness is not that important. It&#8217;s often good enough to produce a correct answer with reasonably high probability, provided the consequences of an error are controlled. I agree, but in light of that it seems odd to be too worried about inaccuracy from floating point arithmetic. I suspect he&#8217;s not that worried about floating point and that the opening quote was just an entertaining way to say that floating point math can be tricky.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/04/06/numbers-are-a-leaky-abstraction/">Floating point numbers are a leaky abstraction</a><br />
<a href="http://www.codeproject.com/KB/recipes/avoiding_overflow.aspx">Avoiding overflow, underflow, and loss of precision</a><br />
<a href="http://www.johndcook.com/blog/2011/09/30/just-an-approximation/">Just an approximation</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/11/01/floating-point-worries/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Software engineering and alarm clocks</title>
		<link>http://www.johndcook.com/blog/2011/10/30/alarm-clocks-and-dst/</link>
		<comments>http://www.johndcook.com/blog/2011/10/30/alarm-clocks-and-dst/#comments</comments>
		<pubDate>Sun, 30 Oct 2011 17:35:22 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9807</guid>
		<description><![CDATA[This morning at church a woman said she was running late because of a software issue. Her alarm clock was manufactured before the US changed the end date of daylight saving time. Her clock &#8220;fell back&#8221; an hour because daylight saving time would have ended today had the law not changed.
Here are a few thoughts [...]]]></description>
			<content:encoded><![CDATA[<p>This morning at church a woman said she was running late because of a software issue. Her alarm clock was manufactured before the US changed the end date of daylight saving time. Her clock &#8220;fell back&#8221; an hour because daylight saving time would have ended today had the law not changed.</p>
<p>Here are a few thoughts about what went wrong and how it might have been prevented.</p>
<ul>
<li>Laws have unforeseen consequences. When the change was being debated, I doubt many asked about the impact on alarm clocks and other devices with embedded software.</li>
<li>The clock tried to be helpful by automating the time change. It would have been better had it done nothing. Moderately smart software is often worse than no software.</li>
<li>Should the clock have been designed to check for software updates? What would it have done to the cost to turn a simple clock into a computer with a network connection?</li>
<li>The clock could depend on a radio signal for time. Some do, and they&#8217;re very accurate. But they&#8217;re also more expensive.</li>
<li>Should we get rid of daylight saving time? It made more sense when nearly everyone had a 9:00 to 5:00 work schedule. But now that so many people work different shifts or have flexible schedules, it doesn&#8217;t seem to add as much value.</li>
</ul>
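<p>The rule change behind this bug can be sketched in a few lines of Python. Before 2007, US daylight saving time ended on the last Sunday of October. Since 2007 it has ended on the first Sunday of November. The function names below are hypothetical, and this only illustrates the two rules. It is not the firmware of any actual clock.</p>
<pre><code>import datetime

def last_sunday_of_october(year):
    """End of US daylight saving time under the pre-2007 rule."""
    day = datetime.date(year, 10, 31)
    # weekday() numbers Monday as 0 and Sunday as 6.
    return day - datetime.timedelta(days=(day.weekday() + 1) % 7)

def first_sunday_of_november(year):
    """End of US daylight saving time under the rule in effect since 2007."""
    day = datetime.date(year, 11, 1)
    return day + datetime.timedelta(days=(6 - day.weekday()) % 7)

print(last_sunday_of_october(2011))    # 2011-10-30
print(first_sunday_of_november(2011))  # 2011-11-06
</code></pre>
<p>A clock with only the first rule hard-coded falls back a week early in 2011.</p>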
<p><strong>Related post</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2010/01/28/universal-time/">Universal time</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/30/alarm-clocks-and-dst/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Python is a voluntary language</title>
		<link>http://www.johndcook.com/blog/2011/10/26/python-is-a-voluntary-language/</link>
		<comments>http://www.johndcook.com/blog/2011/10/26/python-is-a-voluntary-language/#comments</comments>
		<pubDate>Wed, 26 Oct 2011 22:01:41 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9658</guid>
		<description><![CDATA[People who write Python choose to write Python.
I don&#8217;t hear people say &#8220;I use Python at work because I have to, but I&#8217;d rather be writing Java.&#8221; But often I do hear people say they&#8217;d like to use Python if their job would allow it. There must be someone out there writing Python who would [...]]]></description>
			<content:encoded><![CDATA[<p>People who write Python choose to write Python.</p>
<p>I don&#8217;t hear people say &#8220;I use Python at work because I have to, but I&#8217;d rather be writing Java.&#8221; But often I do hear people say they&#8217;d like to use Python if their job would allow it. There must be someone out there writing Python who would rather not, but I think that&#8217;s more common with other languages.</p>
<p>My point isn&#8217;t that everyone loves Python, but rather that those who don&#8217;t care for Python simply don&#8217;t write it.</p>
<p>Since Python isn&#8217;t a common choice for <a href="http://www.johndcook.com/blog/2008/02/14/enterprising-software/">enterprise software projects</a>, it can resist the pressure to be all things to all people. Having a &#8220;Benevolent Dictator for Life&#8221; also helps Python maintain <a href="http://www.johndcook.com/blog/2008/03/18/conceptual-integrity/">conceptual integrity</a>. Python is popular enough to have a critical mass of users, but not so popular that it is under pressure to lose its uniqueness.</p>
<p>I don&#8217;t know much about the Ruby world, but I wonder whether the increasing popularity of Ruby for web development has created pressure for Ruby to compromise its original philosophy. And I wonder whether Ruby&#8217;s creator Yukihiro Matsumoto has &#8220;dictatorial&#8221; control over his language analogous to the control Guido van Rossum has over Python.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/05/08/plain-python/">Plain Python</a><br />
<a href="http://www.johndcook.com/blog/2010/11/28/ruby-python-and-science/">Ruby, Python, and science</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/26/python-is-a-voluntary-language/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>John McCarthy and the origin of Lisp</title>
		<link>http://www.johndcook.com/blog/2011/10/24/john-mccarthy-and-the-origin-of-lisp/</link>
		<comments>http://www.johndcook.com/blog/2011/10/24/john-mccarthy-and-the-origin-of-lisp/#comments</comments>
		<pubDate>Mon, 24 Oct 2011 22:32:13 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9762</guid>
		<description><![CDATA[As I write this, word has it that John McCarthy passed away yesterday. Tech Crunch is reporting this as fact, citing Hacker News, which in turn cites a single tweet as the ultimate source. So the only authority we have, for now, is one person on Twitter, and we don&#8217;t know what relation she has [...]]]></description>
			<content:encoded><![CDATA[<p>As I write this, word has it that John McCarthy passed away yesterday. Tech Crunch is reporting this as fact, citing Hacker News, which in turn cites a single tweet as the ultimate source. So the only authority we have, for now, is one person on Twitter, and we don&#8217;t know what relation she has to McCarthy.</p>
<p>[<strong>Update</strong>: More recent comments on Hacker News corroborate the story. Also, the twitterer cited above, Wendy Grossman, said McCarthy's daughter called her.]</p>
<p>I also have an unsubstantiated story about John McCarthy. I believe I read the following some time ago, but I cannot remember where. If you know of a reference, please let me know. [<strong>Update 2</strong>: Thanks to Leandro Penz for leaving a link to this <a href="http://www.paulgraham.com/icad.html">article</a> by Paul Graham in the comments below.]</p>
<p>As I recall, McCarthy invented Lisp to be a purely theoretical language, something akin to lambda calculus. When his graduate student Steve Russell spoke of implementing Lisp, McCarthy objected that he didn&#8217;t intend Lisp to actually run on a physical computer. Russell then implemented a Lisp interpreter and showed it to McCarthy.</p>
<p>Steve Russell is an unsung hero who deserves some of the credit for Lisp being an actual programming language and not merely a theoretical construct. This does not diminish McCarthy&#8217;s achievement, but it does mean that someone else also deserves recognition.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2010/11/23/lisp-and-the-anti-lisp/">Lisp and the anti-Lisp</a><br />
<a href="http://www.johndcook.com/blog/2011/05/16/bumblebee-software/">Bumblebee software</a><br />
<a href="http://www.johndcook.com/blog/2011/04/26/the-myth-of-the-lisp-genius/">The myth of the Lisp genius</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/24/john-mccarthy-and-the-origin-of-lisp/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Why does software have to be maintained?</title>
		<link>http://www.johndcook.com/blog/2011/10/21/software-maintenance/</link>
		<comments>http://www.johndcook.com/blog/2011/10/21/software-maintenance/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 11:20:04 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9731</guid>
		<description><![CDATA[The idea of software maintenance sounds absurd. Why do you have to maintain software? Do the bits try to sneak off the disk so that someone has to put them back?
Software doesn&#8217;t change, but the world changes out from under it.

People discover bugs. This does not change the software but rather our knowledge of the [...]]]></description>
			<content:encoded><![CDATA[<p>The idea of software maintenance sounds absurd. Why do you have to maintain software? Do the bits try to sneak off the disk so that someone has to put them back?</p>
<p>Software doesn&#8217;t change, but the world changes out from under it.</p>
<ul>
<li>People discover bugs. This does not change the software but rather our knowledge of the software.</li>
<li>As people use the software, they get new ideas regarding how they want to use it.</li>
<li>The human environment around the software changes. Organizational priorities change. Laws change. Project sponsors and users turn over.</li>
<li>The technological environment of the software changes. Operating systems, networks, and hardware all change.</li>
<li>New possibilities emerge and make us less content with old possibilities.</li>
</ul>
<p><strong>People often perceive these changes as changes to the software</strong>, like someone standing on a dock, eyes fixed on a ship, who feels the dock is moving. We speak of software as if it were some mechanical thing that physically wears out. Of course it isn&#8217;t, but the effect may be the same.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2010/03/31/maintenance-costs/">Maintenance costs</a><br />
<a href="http://www.johndcook.com/blog/2010/05/10/taking-your-code-for-a-walk/">Taking your code for a walk</a><br />
<a href="http://www.johndcook.com/blog/2010/01/12/software-sins-of-omission/">Software sins of omission</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/21/software-maintenance/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Software knowledge shelf life</title>
		<link>http://www.johndcook.com/blog/2011/10/18/software-shelf-life/</link>
		<comments>http://www.johndcook.com/blog/2011/10/18/software-shelf-life/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 11:54:48 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Software development]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Unix]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9392</guid>
		<description><![CDATA[In my experience, software knowledge has a longer useful shelf life in the Unix world than in the Microsoft world. (In this post Unix is a shorthand for Unix and Linux.)
A pro-Microsoft explanation would say that Microsoft is more progressive, always improving their APIs and tools, and that Unix is stagnant.
A pro-Unix explanation would say [...]]]></description>
			<content:encoded><![CDATA[<p>In my experience, software knowledge has a longer useful shelf life in the Unix world than in the Microsoft world. (In this post Unix is a shorthand for Unix and Linux.)</p>
<p>A pro-Microsoft explanation would say that Microsoft is more progressive, always improving their APIs and tools, and that Unix is stagnant.</p>
<p>A pro-Unix explanation would say that Unix got a lot of things right the first time, that it is more stable, and that Microsoft&#8217;s technology turn-over is more churn than progress.</p>
<p>Pick your explanation. But for better or worse, change comes slower on the Unix side. And when it comes, it&#8217;s less disruptive.</p>
<p>At least that&#8217;s how it seems to me. Although I&#8217;ve used Windows and Unix, I&#8217;ve done different kinds of work on the two platforms. Maybe the pace of change relates more to the task than the operating system. Also, I have more experience with Windows and so perhaps I&#8217;m more aware of the changes there. But most of the things I knew about Unix 20 years ago are still useful, and most of the things I knew about Windows 10 years ago are not.</p>
<p><strong>Related posts</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2011/02/28/programmers-without-computers/">Programmers without computers</a><br />
<a href="http://www.johndcook.com/blog/2010/06/30/where-the-unix-philosophy-breaks-down/">Where the Unix philosophy breaks down</a><br />
<a href="http://www.johndcook.com/blog/2011/03/28/software-development-and-the-myth-of-progress/">Software development and the myth of progress</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/18/software-shelf-life/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>The Art of R Programming</title>
		<link>http://www.johndcook.com/blog/2011/10/10/the-art-of-r-programming/</link>
		<comments>http://www.johndcook.com/blog/2011/10/10/the-art-of-r-programming/#comments</comments>
		<pubDate>Mon, 10 Oct 2011 15:00:40 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Software development]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[Rstats]]></category>

		<guid isPermaLink="false">http://www.johndcook.com/blog/?p=9598</guid>
		<description><![CDATA[Here are my first impressions of The Art of R Programming. I haven&#8217;t had time to read it thoroughly, and I doubt I will any time soon. Rather than sitting on it, I wanted to get something out quickly. I may say more about the book later.
The book&#8217;s author, Norman Matloff, began his career as [...]]]></description>
			<content:encoded><![CDATA[<p>Here are my first impressions of <a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1593273843">The Art of R Programming</a>. I haven&#8217;t had time to read it thoroughly, and I doubt I will any time soon. Rather than sitting on it, I wanted to get something out quickly. I may say more about the book later.</p>
<p>The book&#8217;s author, Norman Matloff, began his career as a statistics professor and later moved into computer science. That may explain why his book seems to be more programmer-friendly than other books I&#8217;ve seen on R.</p>
<p>My impression is that few people actually sit down and learn R the way they&#8217;d learn, say, Java. Most learn R in the context of learning statistics. Here&#8217;s a statistical chore, and here&#8217;s a snippet of R to carry it out. Books on R tend to follow that pattern, organized more by statistical task than by language feature. That serves statisticians well, but it&#8217;s daunting to outsiders.</p>
<p>Matloff&#8217;s book is organized more like a typical programming book and may be more accessible to a <a href="http://www.johndcook.com/R_language_for_programmers.html">programmer needing to learn R</a>. He explains some things that might require no explanation if you were learning R in the context of a statistics class.</p>
<p>The last four chapters would be interesting even for an experienced R programmer:</p>
<ul>
<li>Debugging</li>
<li>Performance enhancement: memory and speed</li>
<li>Interfacing R to other languages</li>
<li>Parallel R</li>
</ul>
<p>No one would be surprised to see the same chapters in a Java textbook if you replaced &#8220;R&#8221; with &#8220;Java&#8221; in the titles. But these topics are not typical in a book on R. They wouldn&#8217;t come up in a statistics class because they don&#8217;t provide any statistical functionality <em>per se</em>. As long as you don&#8217;t make mistakes, don&#8217;t care how long your code takes to run, and don&#8217;t need to interact with anything else, these chapters are unnecessary. But of course these chapters are quite necessary in practice.</p>
<p>As I mentioned up front, I haven&#8217;t read the book carefully. So I&#8217;m going out on a limb a little here, but I think this may be the book I&#8217;d recommend for someone wanting to learn R, especially for someone with more experience in programming than statistics.</p>
<p><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_il?ie=UTF8&amp;tag=theende-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1593273843"><img src="http://ws.assoc-amazon.com/widgets/q?_encoding=UTF8&amp;Format=_SL160_&amp;ASIN=1593273843&amp;MarketPlace=US&amp;ID=AsinImage&amp;WS=1&amp;tag=theende-20&amp;ServiceVersion=20070822" border="0" alt="" /></a><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=theende-20&amp;l=as2&amp;o=1&amp;a=1593273843&amp;camp=217145&amp;creative=399373" border="0" alt="" width="1" height="1" /></p>
<p><strong>Related post</strong>:</p>
<p><a href="http://www.johndcook.com/blog/2009/05/01/r-the-good-parts/">R: The Good Parts</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.johndcook.com/blog/2011/10/10/the-art-of-r-programming/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>


