Posts tagged as:

Programming

Interview with Dan Bricklin

by John on June 22, 2009


Dan Bricklin is best known for creating VisiCalc along with Bob Frankston in 1979. Since that time he has been active as a software developer and entrepreneur. His new book is Bricklin on Technology.

I quoted Dan Bricklin in a blog post a few weeks ago and he left a couple of comments in the discussion. This started an email correspondence that led to the following interview.

JC: Do you ever feel that the fame of VisiCalc has overshadowed some of your more recent accomplishments?

DB: It had better. VisiCalc was a pretty big thing to have done, and I’m very happy that I had the opportunity to make such a big contribution to the world. On the other hand, I frequently run into people who remember me because of some of my other products, especially Dan Bricklin’s Demo, or my writings that had a major impact on their work, so I know it’s not all that I’ve done of interest. Having done VisiCalc has opened many doors for me, and I surely appreciate that. I wouldn’t call it overshadowed, I’d call it added to and enhanced.

JC: What would your 30-second bio be without VisiCalc?

DB: I am a long-term toolmaker and commentator in the area of the personal use of computing power. I’ve stayed current in the technology area, and continually programmed and developed products in the latest genre, and shared my observations through blogging, podcasting, and other means, including a book.

JC: What are you doing these days as a programmer? As an entrepreneur?

DB: I have been working on an Open Source JavaScript-based spreadsheet called SocialCalc. It is being used throughout the world on the One Laptop Per Child’s XO computer, as well as by enterprise social-software company Socialtext, which paid for much of its development. I also serve on a few high-tech boards, and do a variety of types of consulting, including speaking engagements. I plan to continue developing software of various sorts and consulting.

JC: What trends do you see in software development?

DB: Software development is pervading more and more fields as a major component. We have moved from the computer being an adjunct to other means of expression or deployment to being the only or dominant means. The use of major system components, be they libraries or services, has continued to grow.

JC: Every time a new technology comes out, someone asks what the killer app will be. That is, what application will do for the new technology what VisiCalc did for personal computers. Could you comment on some other “killer apps” since VisiCalc?

DB: I viewed VisiCalc as an app that made buying it and the whole system needed to run it an extremely simple decision. I saw it as having a “two week payback” for buying the whole system. That came from being two orders of magnitude better than what was used before. In VisiCalc’s case, you could use paper and pencil, taking at least 100 times as long to do the same thing, or, in those days, a timesharing system at a few thousand dollars a month (at least).

Similarly compelling applications since VisiCalc (for businesses) were desktop publishing, email, and mobile computing (like the Blackberry, Treo, and now iPhone). For the home, initially CD-ROM encyclopedias were a pretty compelling reason for homes with children to buy a PC (less than the cost of a paper encyclopedia and a bookcase to hold it but you could also use it for word processing), then the combination of email and the web with an always-connected Internet connection.

JC: The personal computer had a killer app and became popular. Are we reading too much into history by expecting that every technology must have a killer app before it can take off?

DB: You only need something that justifies buying a whole system if the sum of other applications or other reasons don’t cause the purchase on their own. For the iPhone, for some people, just having a large catalog of things you might want (those long tail apps I discuss in Chapter 7 of my book) may be enough.

JC: What do you think of open source business models? Ad sponsored, freemium, selling support/consulting services, etc.

DB: As I point out in Chapter 2 of my book where I talk about artists getting paid, there are many ways to make money. A “business model” is just saying here is how the pieces of what I do fit together and end up making enough money to meet the needs I have. This includes the cost structure as well as the sources of revenue and desired results. All long term endeavors, be they mainly based on developing or using Open Source or proprietary source or a mixture, look to different mixtures. They have historically used selling support, relationships with other companies (which advertising is a variant of), and other techniques as part of their mix. Open Source just gives us other options, including on the cost side. Also, as Prof. Ariely explains in the interview I did with him (Chapter 5) once you move into the realm of “free”, and when you appropriately invoke “community”, both of which Open Source can do, you get added benefits in your relationship with other people that can leverage your marketing and other costs.

JC: What did you learn in the process of writing your book? In particular, could you say a little bit about typography?

DB: Most of what I went through is in my essay on the topic, Turning My Blog Into A Book. I think that typography is important, and we’ve seen that as web pages have moved from very basic to better layout to full use of CSS. Typography is a way of expressing ideas and information outside of the direct flow of what you are saying. It is very valuable. Just as a well-delivered speech can convey much more than just the raw words, appropriate use of typographical techniques can convey much more than simple text.

Related posts:

Would you rather have a chauffeur or a Ferrari?
Two kinds of software challenges

{ 0 comments }

The Unix Programming Environment

by John on June 15, 2009


Joel Spolsky recommends the following books to self-taught programmers who apply to his company and need to fill in some gaps in their training.

  1. Structure and Interpretation of Computer Programs
  2. The C Programming Language
  3. The Unix Programming Environment
  4. Introduction to Algorithms

The one that has me scratching my head is The Unix Programming Environment, first published in 1984. After listening to Joel’s podcast, I thumbed through my old copy of the book and thought “Man, I could never work like this.” Of course I could work like that, because I did, back around 1990. But the world has really changed since then.

I appreciate history and old books. I see the value in learning things you might not directly apply. But imagine telling twentysomething applicants to go read an operating system book that was written before they were born. Most would probably think you’re insane.

{ 6 comments }

Upcoming Y2K-like problems

by John on June 13, 2009


The world’s computer systems kept working on January 1, 2000 thanks to billions of dollars spent on fixing old software. Two wrong conclusions to draw from Y2K are

  1. The programmers responsible for Y2K bugs were losers.
  2. That’s all behind us now.

The programmers who wrote the Y2K bugs were highly successful: their software lasted longer than anyone imagined it would. The two-digit dates were only a problem because their software was still in use decades later. (OK, some programmers were still writing Y2K bugs as late as 1999, but I’m thinking about COBOL programmers from the 1970s.)

Y2K may be behind us, but we will be facing Y2K-like problems for years to come. Twitter just faced a Y2K-like problem last night, the so-called Twitpocalypse. Twitter messages were indexed with a signed 32-bit integer. That means the original software was implicitly designed with a limit of around two billion messages. Like the COBOL programmers mentioned above, Twitter was more successful than anticipated. Twitter fixed the problem without any disruption, except that some third-party Twitter clients need to be updated.
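Here is a minimal sketch of where the "around two billion" figure comes from. Python integers never overflow, so the helper `wrap_int32` is a hypothetical stand-in that simulates two's complement wraparound. It shows what a program storing message IDs in a signed 32-bit type might see, not how any particular Twitter client actually behaved.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def wrap_int32(n: int) -> int:
    """Simulate storing n in a signed 32-bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


print(INT32_MAX)                  # 2147483647
print(wrap_int32(INT32_MAX + 1))  # -2147483648
```

One more message past the limit and the simulated ID becomes negative, which is the kind of value downstream code rarely expects.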

We are running out of Internet addresses because these addresses also use 32-bit integers. To make matters worse, an Internet address has an internal structure that greatly reduces the number of possible 32-bit addresses. IPv6 will fix this by using 128-bit addresses.
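The sizes involved can be checked with Python's standard `ipaddress` module. This is only a sketch: the `10.0.0.0/8` private range is used here as one illustrative example of address space that is set aside and so unavailable for public use.

```python
import ipaddress

# Every possible IPv4 and IPv6 address, expressed as one network each.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)            # 4294967296
print(ipv6_total == 2**128)  # True

# One example of reserved space: a private range not routable on the public Internet.
private_block = ipaddress.ip_network("10.0.0.0/8")
print(private_block.num_addresses)  # 16777216
print(private_block.is_private)     # True
```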

The US will run out of 10-digit phone numbers at some point, especially since not all 10-digit combinations are possible phone numbers. For example, the first three digits are a geographical area code. One area code can run out of 7-digit numbers while another has numbers left over.

At some point the US will run out of 9-digit social security numbers.

The original Unix systems counted time as the number of seconds since January 1, 1970, stored in a signed 32-bit integer. On January 19, 2038, the number of seconds will exceed the capacity of such an integer and the counter will wrap around to a large negative number, i.e. the date will appear to be December 13, 1901. This is more insidious than the Y2K problem because there are many software date representations in common use, including the old Unix method. Some (parts of) software will have problems in 2038 while others will not, depending on the whim of the programmer when picking a way to represent dates.
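The arithmetic can be sketched in a few lines of Python. The helper names `wrap_int32` and `unix_time_to_utc` are made up for this illustration, and the wraparound is simulated because Python integers do not overflow. On a system where the counter wraps in two's complement, the value just after the overflow is negative, so the date lands in December 1901.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(n: int) -> int:
    """Simulate storing n in a signed 32-bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


def unix_time_to_utc(seconds: int) -> datetime:
    """Convert seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good = unix_time_to_utc(INT32_MAX)
after_overflow = unix_time_to_utc(wrap_int32(INT32_MAX + 1))

print(last_good.isoformat())       # 2038-01-19T03:14:07+00:00
print(after_overflow.isoformat())  # 1901-12-13T20:45:52+00:00
```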

There will always be Y2K-like problems. Computers are finite. Programmers have to guess at limitations for data. Sometimes these limitations are implicit, and so we can pretend they are not there, but they are. Sometimes programmers guess wrong because their software succeeds beyond their expectations.

{ 3 comments }


Software development is not just engineering, it’s also reverse engineering. It’s not just about deciding what code should do, it’s also about experimentally discovering what code does do. Many software developers, especially those at the bottom of the career ladder, spend more time reverse engineering than engineering. But even developers working on new “green field” projects spend a significant amount of time reverse engineering either their own code or third party code.

Anyone writing software these days spends a great deal of time researching software libraries. Anybody writing a little web page in PHP, for example, is leveraging a tremendous amount of code that others have written. Programmers work at a much higher level than they did a few years ago. We’re standing on the shoulders of giants.

But there are problems. At a minimum, you’ve got to look up the software pieces you want to use. (Engineers call this catalog engineering, spending all day looking up parts in catalogs rather than designing new parts.) Worse, the software pieces are often poorly documented and buggy. If documentation exists, you have to experiment with the software to determine whether the documentation is correct (or at least to test whether your understanding of the documentation is correct). If documentation doesn’t exist, you have to infer what the software does by searching the web and doing your own experiments. Some pieces of software are better documented and less buggy than others, but all documentation is incomplete and all software has bugs.

That’s what professional software development is like. If you enjoy problem solving and experimentation, you’ll enjoy software development. But if you can’t stand catalog engineering and reverse engineering, don’t go into software development.

{ 0 comments }

Two kinds of software challenges

by John on June 4, 2009


Here’s a quote from Bricklin on Technology regarding what colleges should teach in software engineering. (I added the bullets.)

For years we emphasized

  • execution speed,
  • memory constraints,
  • data organization,
  • flashy graphics,

and algorithms for accomplishing this all. Curricula need to also emphasize

  • robustness,
  • testing,
  • maintainability,
  • ease of replacement,
  • security, and
  • verifiability.

The criteria in the first list are primarily mathematical. The criteria in the second list have more to do with human nature. For example, code is maintainable if it’s organized so that a person can readily understand and modify it. That’s a matter of psychology.

More projects fail due to problems with the second list. Problems with the first list tend to be localized. Problems with the second list tend to permeate a project. A clever person may have a quick fix for problems with the first list. Quick fixes are rare for problems on the second list.

{ 4 comments }


Guillaume Marceau posted an excellent article yesterday that gives a graphical comparison of numerous programming languages. (The page failed to load the first time I tried to load it and it loaded slowly on my second attempt. Be patient and keep trying if it doesn’t work at first.)

It took me a while to realize that the graph axes are the reverse of my expectations. The axes are undesirable quantities — slowness and code size — and so the ideal is in the lower left. Usually comparisons use desirable quantities for the axes — in this case, efficiency and expressiveness — so that the ideal is up and to the right.

{ 1 comment }

Microsoft Ramp Up

by John on May 20, 2009


A recent .NET Rocks podcast featured Doug Turnure and Johanna White talking about Ramp Up, a new free online training program from Microsoft. It sounds like this program will organize and revise a lot of the scattered training material that Microsoft has produced.

I liked two things I heard about Ramp Up. First, the material for each course will be offered in multiple formats and from multiple perspectives such as conceptual overviews, code-centric drill downs, articles, videos, and audio podcasts. Second, they’re not just going to focus on the latest technology. In the past, it’s been easiest to find material on software that hasn’t even been released, followed by the current shipping version. After that, good luck finding material on anything a release or two behind the latest. Microsoft has said that Ramp Up will leave their material online as new versions come out.

{ 0 comments }

All languages equally complex

by John on May 11, 2009


This post compares complexity in spoken languages and programming languages.

There is a theory in linguistics that all human languages are equally complex. Languages may distribute their complexity in different ways, but the total complexity is roughly the same across all spoken languages. One language may be simpler in some aspect than another but more complicated in some other respect. For example, Chinese has simple grammar but a complex tonal system.

Even if all languages are equally complex, that doesn’t mean all languages are equally difficult to learn. An English speaker might find French easier to learn than Russian, not because French is simpler than Russian in some objective sense, but because French is more similar to English.

All spoken languages are supposed to be equally complex because languages reach an equilibrium between at least two forces. Skilled adult speakers tend to complicate languages by looking for ways to be more expressive. But children must be able to learn their language relatively quickly, and less skilled speakers need to be able to use the language as well.

I wonder what this says about programming languages. There are analogous dynamics. Programming languages can be relatively simpler in some way while being relatively complex in another way. And programming languages become more complex over time due to the demands of skilled users.

But there are several important differences. Programming languages are part of a complex system of language, standard libraries, idioms, tools, etc. It may make more sense to speak of a programming “system” to make better comparisons, taking into account the language and its environment.

I do not think that all programming systems are equally complex. Some are better designed than others. Some are more appropriate for a given task than others. Some programming systems achieve simplicity by sacrificing efficiency. Some abstractions leak less than others.

On the other hand, I imagine the levels of complexity are more similar when comparing programming systems rather than just comparing programming languages. Larry Wall said something to the effect that Perl is ugly so you can write beautiful programs in it. I think there’s some truth to that. A language can always be small and elegant by simply not providing much functionality, forcing the user to implement that functionality in application code.

See Larry Wall’s article Natural Language Principles in Perl for more comparisons of spoken languages and programming languages.

Related posts:

Rate of regularizing English verbs
The cost of breaking things down and putting them back together
Two perspectives on the design of C++

{ 6 comments }

Plain Python

by John on May 8, 2009


Perl is cool, much more so than Python. But I prefer writing Python.

Perl is fun to read about. It has an endless stream of features to discover. Python by comparison is kinda dull. But the aspects of a language that make it fun to read about do not necessarily make it pleasant to use.

I wrote Perl off and on for several years before trying Python. People would tell me I should try Python and every six months or so I’d skim through a Python book. My impression was that Python was prosaic. It didn’t seem to offer any advantage over Perl, so I stuck with Perl. (Not that I was ever very good at Perl.)

Then I read an article by Bruce Eckel saying that he liked Python because he could remember the syntax. He said that despite teaching Java and writing books on Java, he could never remember the syntax for opening a file in Java, for example. But he could remember the corresponding Python syntax. I would never have picked up on that by skimming books. You’ve got to actually use a language a while to know how memorable the syntax is. But I had used Perl enough to know that I could not remember its syntax without frequent use. Memorable syntax increases productivity. You don’t have to break your train of thought as often to reach for a reference book.
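For reference, here is roughly what opening a file looks like in Python. This is a small sketch rather than the exact code Eckel had in mind. The file name `notes.txt` is arbitrary, and the temporary directory exists only so the example can run anywhere.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")

    # Write a file.
    with open(path, "w") as f:
        f.write("hello\n")

    # Read it back.
    with open(path) as f:
        contents = f.read()

print(contents)
```

The part worth remembering is `open(path)` inside a `with` block, which also closes the file when the block ends.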

I stand by my initial impression that Python is plain, but I now think that’s a strength. It just gets out of my way and lets me get my work done. I’m sure Perl gurus can be extremely productive in Perl. I tried being a Perl guru, and I never made it. I wouldn’t say I’m a Python guru, but I also don’t feel the need to be a guru to use Python.

Python code is not cool in a line-by-line sense, not in the way that an awesomely powerful Perl one-liner is cool. Python is cool in more subtle ways.

Related posts:

Languages that are easy to pick back up
API symmetry
Periodic table of Perl operators
Three-hour-a-week language

{ 14 comments }

Highlights from Reproducible Ideas

by John on May 5, 2009


Here are some of my favorite posts from the Reproducible Ideas blog.

Three reasons to distrust microarray results
Provenance in art and science
Forensic bioinformatics (continued)
Preserving (the memory of) documents
Programming is understanding
Musical chairs and reproducibility drills
Taking your code out for a walk

The most popular and most controversial was the first in the list, reasons to distrust microarray results.

The emphasis shifts from science to software development as you go down the list, though science and software are intertwined throughout the posts.

{ 0 comments }


Douglas Crockford’s book JavaScript: The Good Parts is terrific. Crockford is both a critic of and advocate for JavaScript. He’s quite frank about the language’s faults. His book is the clearest exposition of the pitfalls of JavaScript that I’ve seen. But he also believes there’s a great language at the heart of JavaScript. He doesn’t just complain about the bad parts; he explains how to avoid them. He has identified his recommended subset of the language. He has written a programming style guide intended to increase the chances that JavaScript code does what the programmer intends. And he has written a tool, JSLint, to warn of potential problems. (Crockford reminds me of Luke Skywalker, convinced that there is good in Darth Vader and determined to rescue him from the dark side of the force.)

I wish someone would write a book for R analogous to the one Crockford wrote for JavaScript.

The R language has a lot in common with JavaScript. Both are Lisp-like languages at their core with C-like syntax. Both are dominant languages in their respective niches: R in academic statistics and JavaScript in web browsers. (R doesn’t have the monopoly in statistics that JavaScript has in the browser, but it’s still pervasive.) Both languages are powerful but maddening to debug. JavaScript has an undeserved reputation for being ugly because it is typically used to program the browser DOM; it’s the DOM that’s buggy and non-standard, not JavaScript. Similarly, R’s reputation may suffer from the numerous poorly written modules implemented in R.

Related posts:

Five kinds of subscripts in R
R programming coming from other languages
Programming language fatigue
Programming language subsets

{ 6 comments }

Do you really want to be indispensable?

by John on April 22, 2009


One strategy for increasing job security is to make yourself indispensable by never documenting anything. Deliberately following such a strategy is unethical. Passively falling into such a situation is more understandable, and more common, but it’s not very smart either.

If you’re indispensable, you can hold on to your job, maybe. But the flip side is that you can’t let go of your job either. You can never wash your hands of a project, never hand it over to someone else. You cannot be promoted. You’ll need to take your laptop with you on vacation, if you’re able to take vacation.

I’ve seen this play out in software projects that are never quite finished. The project minimally works, but only with the developer’s intervention. The developer isn’t trying to be indispensable. Quite the opposite: the developer desperately wants to get away from the project. But the software isn’t stable. Bugs are discovered every time a new part of the code is exercised. These may be fixed quickly, but only by the original developer. Or maybe the code is stable, but only the original developer can reproduce the build. Or some part of the code ought to be configurable, but instead the developer has to constantly tweak the source code. For whatever reason, the project isn’t wrapped up and the developer cannot extricate himself from it.

The solution is to plan to make yourself dispensable from the beginning. Ask yourself throughout the project, “How am I going to be able to hand this over to someone else?” Or more graphically, “What if I get hit by a bus?”

Make yourself valuable for what you’re expected to accomplish in the future, not for what you’ve accomplished in the past.

Related post:
Programming the last mile

{ 11 comments }

Civic duty on StackOverflow

by John on April 12, 2009


On StackOverflow, users gain reputation points when other users vote up their questions or answers. Voting is considered a civic duty. Voting doesn’t increase your own reputation. The only direct reward for voting is the “Civic Duty” badge for voting 300 times. But voting makes the site work well. Good questions and answers generally rise to the top.

How civic-minded are StackOverflow users? Where do the votes come from? Are people who receive votes more or less likely to give them? That is, do those who have received high reputation scores through other users’ votes also give away reputation points in the form of votes? Jeff Atwood mailed me some data the other day so I could answer these questions.

Sixty percent of StackOverflow users haven’t cast one vote, but that doesn’t tell the whole story. The site is growing rapidly and so there are always a large number of users who haven’t been on the site long enough to vote much or gain much reputation. Also, there are a large number of users who registered some time ago but hardly participate on the site.

When you compare reputation scores and votes, things get more interesting. For starters, users who are somewhat invested in the site, as indicated by reputation score > 100, have voted 91 times on average. That still doesn’t tell the full story because it averages over a huge range of reputation scores. Here’s the more interesting story: The number of votes users cast is proportional to their reputation.

The graph above shows average number of answer votes as a function of reputation. I divided reputation ranges into blocks of 100 (i.e. 0 - 99, 100 - 199, etc.) and averaged the number of times users in that range voted up an answer. There are two reasons I only considered answer votes: there are far more answer votes than question votes, and question votes follow a similar pattern to answer votes.
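The binning step can be sketched as follows. The `users` list is invented data for illustration, not the data Jeff sent, and `average_votes_by_block` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up (reputation, answer votes cast) pairs, purely for illustration.
users = [(15, 0), (80, 2), (120, 10), (180, 14), (250, 30), (299, 20)]


def average_votes_by_block(users, block_size=100):
    """Group users into reputation blocks and average their vote counts."""
    totals = defaultdict(lambda: [0, 0])  # block start -> [vote sum, user count]
    for reputation, votes in users:
        block_start = (reputation // block_size) * block_size
        totals[block_start][0] += votes
        totals[block_start][1] += 1
    return {start: s / c for start, (s, c) in sorted(totals.items())}


averages = average_votes_by_block(users)
print(averages)  # {0: 1.0, 100: 12.0, 200: 25.0}
```

Each key is the lower end of a reputation block, so `100` stands for the range 100 - 199.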

The graph starts to feather out on the right end because there are fewer users in each reputation range; there is more random variation because there are fewer people in the higher ranges to average over. The number of users at each reputation level drops off rapidly according to a power law. Although 99.4% of users have reputation less than 5000, the largest reputation score was 51,313 on the day Jeff collected the data. Here’s a graph from my earlier post, StackOverflow reputation statistics, that shows how quickly the number of people in each reputation range drops.

The graph above was based on data collected at the end of February this year but the data discussed in this post was collected in April. As you look at higher reputation scores, the curve continues to drop off quickly. Since reputations follow a power law, the decrease is linear on a log-log scale.
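To see why a power law plots as a straight line when both axes are logarithmic, note that if count = c · reputation^(−a), then log(count) = log(c) − a · log(reputation). The sketch below fits that line by least squares. The data is synthetic, and the exponent of 2 and the constant 1000 are arbitrary choices for illustration, not values estimated from the StackOverflow data.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    denominator = sum((a - mean_x) ** 2 for a in lx)
    return numerator / denominator


# Synthetic power-law data: count falls as reputation to the power -2.
reputations = [100, 200, 400, 800, 1600]
counts = [1000.0 * (r / 100) ** -2 for r in reputations]

slope = loglog_slope(reputations, counts)
print(round(slope, 6))  # approximately -2, the negative of the exponent
```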

Even though users with the highest reputation scores vote the most, most votes come from users with lower reputation scores. That’s just because the large majority of users have lower reputation scores. Users with reputation < 1400 account for a little over half the answer votes cast. They also account for over 96% of all users. If you turn this around, it says that nearly half the votes come from the top 4% of users in reputation. This explains in part why the best answers usually rise to the top: the most knowledgeable users are active voters, assuming reputation and knowledge are correlated.

(The situation is analogous to that of income taxes. The very wealthy pay the most taxes per person, but the bulk of tax revenue comes from those who are not so wealthy. Even so, the percentage of total tax revenue from the top earners is surprisingly high. According to this site, the top 1% of tax payers were responsible for about 40% of all income tax revenue in 2008. The analogy holds for good reasons. Wealth, like StackOverflow reputation, follows a power law distribution. And taxes increase roughly linearly with wealth the same way StackOverflow votes increase with reputation.)

In short, it looks like StackOverflow users are civic minded. Those who receive the most votes also give the most votes. And users in the lower end of the reputation range cast most of the votes in total even though they cast fewer votes per person.

Related post: StackOverflow reputation statistics

{ 5 comments }

Programming language fatigue

by John on April 10, 2009


Joe Brinkman wrote an insightful article the other day, Polyglot Programming: Death By A Thousand DSLs. Here’s an excerpt:

I don’t know about other programmers, but I am drowning in DSLs [domain specific languages].  It is hard enough keeping up with my primary development language and the associated platform APIs, but these DSLs are going to be the death of me.  The end result is that I have a pretty decent handle on maybe 3 or 4 of these DSLs but rarely do I have the requisite knowledge to make the right choices in anything beyond that.

It takes a dozen programming languages to do any web project these days. Whenever I bring this up in conversation, most developers say “Oh, well. That’s just the way it is. It isn’t so bad.” But I think it really is a problem. Obviously it’s an intimidating amount of material for new developers to learn. But the more subtle problem is that experienced developers who think they understand all the different languages they use are probably wrong.

Case in point: JavaScript. Nearly every web project involves some client-side JavaScript, and 99% of the people who write JavaScript do not know the language. I never claimed to be a JavaScript expert, but I thought I understood the language better than I really did until I saw some presentations by Douglas Crockford.

Crockford has written an excellent book: JavaScript: The Good Parts. His position is that there is an elegant, powerful language at the core of JavaScript but it is surrounded by landmines. His book focuses on the good parts, but along the way he tells you how to avoid or disarm the landmines.

Related post: Programming language subsets

{ 0 comments }

Copy and paste warning

by John on March 30, 2009


Tony Rasa has written a Clippy-like program that will nag you every time you copy and paste code in Visual Studio.

See his post AntiPaste, because Pasting Code Is Harmful.

It’s a joke, but many a truth is told in jest.

{ 1 comment }