From the category archives:

Software development

Interview with Dan Bricklin

by John on June 22, 2009

Dan Bricklin is best known for creating VisiCalc along with Bob Frankston in 1979. Since that time he has been active as a software developer and entrepreneur. His new book is Bricklin on Technology.

I quoted Dan Bricklin in a blog post a few weeks ago and he left a couple of comments in the discussion. This started an email correspondence that led to the following interview.

JC: Do you ever feel that the fame of VisiCalc has overshadowed some of your more recent accomplishments?

DB: It had better. VisiCalc was a pretty big thing to have done, and I’m very happy that I had the opportunity to make such a big contribution to the world. On the other hand, I frequently run into people who remember me because of some of my other products, especially Dan Bricklin’s Demo, or my writings that had a major impact on their work, so I know it’s not all that I’ve done of interest. Having done VisiCalc has opened many doors for me, and I surely appreciate that. I wouldn’t call it overshadowed, I’d call it added to and enhanced.

JC: What would your 30-second bio be without VisiCalc?

DB: I am a long-term toolmaker and commentator in the area of the personal use of computing power. I’ve stayed current in the technology area, and continually programmed and developed products in the latest genre, and shared my observations through blogging, podcasting, and other means, including a book.

JC: What are you doing these days as a programmer? As an entrepreneur?

DB: I have been working on an Open Source JavaScript-based spreadsheet called SocialCalc. It is being used throughout the world on the One Laptop Per Child’s XO computer, as well as by enterprise social-software company Socialtext, which paid for much of its development. I also serve on a few high-tech boards, and do a variety of types of consulting, including speaking engagements. I plan to continue developing software of various sorts and consulting.

JC: What trends do you see in software development?

DB: Software development is pervading more and more fields as a major component. We have moved from the computer being an adjunct to other means of expression or deployment to being the only or dominant means. The use of major system components, be they libraries or services, has continued to grow.

JC: Every time a new technology comes out, someone asks what the killer app will be. That is, what application will do for the new technology what VisiCalc did for personal computers. Could you comment on some other “killer apps” since VisiCalc?

DB: I viewed VisiCalc as an app that made buying it and the whole system needed to run it an extremely simple decision. I saw it as having a “two week payback” for buying the whole system. That came from being two orders of magnitude better than what was used before. In VisiCalc’s case, you could use paper and pencil, taking at least 100 times as long to do the same thing, or, in those days, a timesharing system at a few thousand dollars a month (at least).

Similarly compelling applications since VisiCalc (for businesses) were desktop publishing, email, and mobile computing (like the Blackberry, Treo, and now iPhone). For the home, initially CD-ROM encyclopedias were a pretty compelling reason for homes with children to buy a PC (less than the cost of a paper encyclopedia and a bookcase to hold it but you could also use it for word processing), then the combination of email and the web with an always-connected Internet connection.

JC: The personal computer had a killer app and became popular. Are we reading too much into history by expecting that every technology must have a killer app before it can take off?

DB: You only need something that justifies buying a whole system if the sum of other applications or other reasons don’t cause the purchase on their own. For the iPhone, for some people, just having a large catalog of things you might want (those long tail apps I discuss in Chapter 7 of my book) may be enough.

JC: What do you think of open source business models? Ad sponsored, freemium, selling support/consulting services, etc.

DB: As I point out in Chapter 2 of my book where I talk about artists getting paid, there are many ways to make money. A “business model” is just saying here is how the pieces of what I do fit together and end up making enough money to meet the needs I have. This includes the cost structure as well as the sources of revenue and desired results. All long term endeavors, be they mainly based on developing or using Open Source or proprietary source or a mixture, look to different mixtures. They have historically used selling support, relationships with other companies (which advertising is a variant of), and other techniques as part of their mix. Open Source just gives us other options, including on the cost side. Also, as Prof. Ariely explains in the interview I did with him (Chapter 5) once you move into the realm of “free”, and when you appropriately invoke “community”, both of which Open Source can do, you get added benefits in your relationship with other people that can leverage your marketing and other costs.

JC: What did you learn in the process of writing your book? In particular, could you say a little bit about typography?

DB: Most of what I went through is in my essay on the topic, Turning My Blog Into A Book. I think that typography is important, and we’ve seen that as web pages have moved from very basic to better layout to full use of CSS. Typography is a way of expressing ideas and information outside of the direct flow of what you are saying. It is very valuable. Just as a well-delivered speech can convey much more than just the raw words, appropriate use of typographical techniques can convey much more than simple text.

Related posts:

Would you rather have a chauffeur or a Ferrari?
Two kinds of software challenges

{ 0 comments }

Software development is not just engineering, it’s also reverse engineering. It’s not just about deciding what code should do, it’s also about experimentally discovering what code does do. Many software developers, especially those at the bottom of the career ladder, spend more time reverse engineering than engineering. But even developers working on new “green field” projects spend a significant amount of time reverse engineering either their own code or third party code.

Anyone writing software these days spends a great deal of time researching software libraries. Anybody writing a little web page in PHP, for example, is leveraging a tremendous amount of code that others have written. Programmers work at a much higher level than they did a few years ago. We’re standing on the shoulders of giants.

But there are problems. At a minimum, you’ve got to look up the software pieces you want to use. (Engineers call this catalog engineering, spending all day looking up parts in catalogs rather than designing new parts.) Worse, the software pieces are often poorly documented and buggy. If documentation exists, you have to experiment with the software to determine whether the documentation is correct (or at least to test whether your understanding of the documentation is correct). If documentation doesn’t exist, you have to infer what the software does by searching the web and doing your own experiments. Some pieces of software are better documented and less buggy than others, but all documentation is incomplete and all software has bugs.

That’s what professional software development is like. If you enjoy problem solving and experimentation, you’ll enjoy software development. But if you can’t stand catalog engineering and reverse engineering, don’t go into software development.

{ 0 comments }

Two kinds of software challenges

by John on June 4, 2009

Here’s a quote from Bricklin on Technology regarding what colleges should teach in software engineering. (I added the bullets.)

For years we emphasized

  • execution speed,
  • memory constraints,
  • data organization,
  • flashy graphics,

and algorithms for accomplishing this all. Curricula need to also emphasize

  • robustness,
  • testing,
  • maintainability,
  • ease of replacement,
  • security, and
  • verifiability.

The criteria in the first list are primarily mathematical. The criteria in the second list have more to do with human nature. For example, code is maintainable if it’s organized so that a person can readily understand and modify it. That’s a matter of psychology.

More projects fail due to problems with the second list. Problems with the first list tend to be localized. Problems with the second list tend to permeate a project. A clever person may have a quick fix for problems with the first list. Quick fixes are rare for problems on the second list.

{ 4 comments }

Beautiful Testing

by John on June 2, 2009

Beautiful Testing is available for pre-order at Amazon. Proceeds from the book will go to Nothing But Nets, a project to distribute anti-malaria bed nets. I contributed a chapter on how to test random number generators.

Beautiful Testing: Leading Programmers Reveal How They Test

{ 1 comment }

.NET Rocks episode 438 interviewed Patrick Hynds on why projects fail. One of the reasons is unclear expectations. He said in the interview that no matter what you say you’re going to do on a project, clients have additional implicit expectations. Hynds says that in order to have a successful project, you have to “destroy any hope” that you will deliver anything outside the specification. Here’s an excerpt from the podcast transcript, emphasis added.

… if I give you a spec I’m going to give you everything this document says and nothing more. In other words, if it’s not shown or described in detail in this document, it will not be done. … this is going to cost you X thousand dollars and if you expected something else to be in there, it won’t be. It sounds bad but you have to destroy any hope they have that you’re thinking the way that they’re thinking. …

I mean, in web projects we state explicitly what resolutions we will support and none others. What browser versions we will support and no others, what back-end database versions and libraries we will support and no others, that kind of thing. … I find for everything you say you’re going to do, you have to define one or two things you’re not going to do.

Hynds’ advice may sound adversarial, but everyone is happier in the long run when there are clear expectations up front.

Here’s another good quote from Hynds later in the interview.

There’s always someone out there willing to bid less to do a bad job.

Related posts:

Medieval software project management
Feasibility studies
Status report questions
Good, fast, or cheap: can you really pick two?

{ 4 comments }

Guillaume Marceau posted an excellent article yesterday that gives a graphical comparison of numerous programming languages. (The page failed to load the first time I tried to load it and it loaded slowly on my second attempt. Be patient and keep trying if it doesn’t work at first.)

It took me a while to realize that the graph axes are the reverse of my expectations. The axes are undesirable quantities — slowness and code size — and so the ideal is in the lower left. Usually comparisons use desirable quantities for the axes — in this case, efficiency and expressiveness — so that the ideal is up and to the right.

{ 1 comment }

Microsoft Ramp Up

by John on May 20, 2009

A recent .NET Rocks podcast featured Doug Turnure and Johanna White talking about Ramp Up, a new free online training program from Microsoft. It sounds like this program will organize and revise a lot of the scattered training material that Microsoft has produced.

I liked two things I heard about Ramp Up. First, the material for each course will be offered in multiple formats and from multiple perspectives, such as conceptual overviews, code-centric drill downs, articles, videos, and audio podcasts. Second, they’re not just going to focus on the latest technology. In the past, it’s been easiest to find material on software that hasn’t even been released, followed by the current shipping version. After that, good luck finding material on anything a release or two behind the latest. Microsoft has said that Ramp Up will leave their material online as new versions come out.

{ 0 comments }

Programs, not just projects

by John on May 12, 2009

My frustration with personal productivity systems like GTD is that they’re all about projects and tasks. They leave out a third category: programs. GTD thinks of a project as something that can be broken into a manageable number of tasks and scratched off a list. But programs go on indefinitely and cannot be divided into a small number of one-time tasks.

I’m using the word “program” as in an “exercise program” or a “research program.” (I could think of my exercise program as a project, but it’s one I hope not to complete for a few more decades.) Sometimes there is a neat hierarchy where programs spawn off projects that can be divided into tasks. But sometimes you just have programs and tasks.

One of my frustrations with managing software development in an academic environment was the large number of programs disguised as projects. (Sorry, I know it’s confusing to talk about “programs” in the context of software development and not mean computer instructions.) You can’t manage programs as if they were projects. For example, you can’t talk about “after” the project is done if it’s not really a project but a never-ending program. You have to either acknowledge that a program is really a program, or you have to have some way to make it into a finite project.

{ 6 comments }

All languages equally complex

by John on May 11, 2009

This post compares complexity in spoken languages and programming languages.

There is a theory in linguistics that all human languages are equally complex. Languages may distribute their complexity in different ways, but the total complexity is roughly the same across all spoken languages. One language may be simpler in some aspect than another but more complicated in some other respect. For example, Chinese has simple grammar but a complex tonal system.

Even if all languages are equally complex, that doesn’t mean all languages are equally difficult to learn. An English speaker might find French easier to learn than Russian, not because French is simpler than Russian in some objective sense, but because French is more similar to English.

All spoken languages are supposed to be equally complex because languages reach an equilibrium between at least two forces. Skilled adult speakers tend to complicate languages by looking for ways to be more expressive. But children must be able to learn their language relatively quickly, and less skilled speakers need to be able to use the language as well.

I wonder what this says about programming languages. There are analogous dynamics. Programming languages can be relatively simpler in some way while being relatively complex in another way. And programming languages become more complex over time due to the demands of skilled users.

But there are several important differences. Programming languages are part of a complex system of language, standard libraries, idioms, tools, etc. It may make more sense to speak of a programming “system” to make better comparisons, taking into account the language and its environment.

I do not think that all programming systems are equally complex. Some are better designed than others. Some are more appropriate for a given task than others. Some programming systems achieve simplicity by sacrificing efficiency. Some abstractions leak less than others.

On the other hand, I imagine the levels of complexity are more similar when comparing programming systems rather than just comparing programming languages. Larry Wall said something to the effect that Perl is ugly so you can write beautiful programs in it. I think there’s some truth to that. A language can always be small and elegant by simply not providing much functionality, forcing the user to implement that functionality in application code.

See Larry Wall’s article Natural Language Principles in Perl for more comparisons of spoken languages and programming languages.

Related posts:

Rate of regularizing English verbs
The cost of breaking things down and putting them back together
Two perspectives on the design of C++

{ 6 comments }

Plain Python

by John on May 8, 2009

Perl is cool, much more so than Python. But I prefer writing Python.

Perl is fun to read about. It has an endless stream of features to discover. Python by comparison is kinda dull. But the aspects of a language that make it fun to read about do not necessarily make it pleasant to use.

I wrote Perl off and on for several years before trying Python. People would tell me I should try Python and every six months or so I’d skim through a Python book. My impression was that Python was prosaic. It didn’t seem to offer any advantage over Perl, so I stuck with Perl. (Not that I was ever very good at Perl.)

Then I read an article by Bruce Eckel saying that he liked Python because he could remember the syntax. He said that despite teaching Java and writing books on Java, he could never remember the syntax for opening a file in Java, for example. But he could remember the corresponding Python syntax. I would never have picked up on that by skimming books. You’ve got to actually use a language a while to know how memorable the syntax is. But I had used Perl enough to know that I could not remember its syntax without frequent use. Memorable syntax increases productivity. You don’t have to break your train of thought as often to reach for a reference book.
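
For illustration only, here is roughly what opening a file looks like in Python. This is my own minimal sketch, not code from Eckel’s article, and the file name is made up for the example.

```python
import os
import tempfile

# Hypothetical file name, used only for this illustration.
path = os.path.join(tempfile.gettempdir(), "plain_python_example.txt")

# Write two lines to the file. The "with" block closes the file automatically.
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

# Read the lines back, stripping the trailing newline from each one.
with open(path) as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
```

The whole thing comes down to `open(path)` plus a loop over the file object, which is about as much as there is to remember.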

I stand by my initial impression that Python is plain, but I now think that’s a strength. It just gets out of my way and lets me get my work done. I’m sure Perl gurus can be extremely productive in Perl. I tried being a Perl guru, and I never made it. I wouldn’t say I’m a Python guru, but I also don’t feel the need to be a guru to use Python.

Python code is not cool in a line-by-line sense, not in the way that an awesomely powerful Perl one-liner is cool. Python is cool in more subtle ways.

Related posts:

Languages that are easy to pick back up
API symmetry
Periodic table of Perl operators
Three-hour-a-week language

{ 14 comments }

Highlights from Reproducible Ideas

by John on May 5, 2009

Here are some of my favorite posts from the Reproducible Ideas blog.

Three reasons to distrust microarray results
Provenance in art and science
Forensic bioinformatics (continued)
Preserving (the memory of) documents
Programming is understanding
Musical chairs and reproducibility drills
Taking your code out for a walk

The most popular and most controversial was the first in the list, reasons to distrust microarray results.

The emphasis shifts from science to software development as you go down the list, though science and software are intertwined throughout the posts.

{ 0 comments }

Douglas Crockford’s book JavaScript: The Good Parts is terrific. Crockford is both a critic of and advocate for JavaScript. He’s quite frank about the language’s faults. His book is the clearest exposition of the pitfalls of JavaScript that I’ve seen. But he also believes there’s a great language at the heart of JavaScript. He doesn’t just complain about the bad parts; he explains how to avoid them. He has identified his recommended subset of the language. He has written a programming style guide intended to increase the chances that JavaScript code does what the programmer intends. And he has written a tool, JSLint, to warn of potential problems. (Crockford reminds me of Luke Skywalker, convinced that there is good in Darth Vader and determined to rescue him from the dark side of the force.)

I wish someone would write a book for R analogous to the one Crockford wrote for JavaScript.

The R language has a lot in common with JavaScript. Both are Lisp-like languages at their core with C-like syntax. Both are dominant languages in their respective niches: R in academic statistics and JavaScript in web browsers. (R doesn’t have the monopoly in statistics that JavaScript has in the browser, but it’s still pervasive.) Both languages are powerful but maddening to debug. JavaScript has an undeserved reputation for being ugly because it is typically used to program the browser DOM; it’s the DOM that’s buggy and non-standard, not JavaScript. Similarly, R’s reputation may suffer from the numerous poorly written modules implemented in R.

Related posts:

Five kinds of subscripts in R
R programming coming from other languages
Programming language fatigue
Programming language subsets

{ 6 comments }

Do you really want to be indispensable?

by John on April 22, 2009

One strategy for increasing job security is to make yourself indispensable by never documenting anything. Deliberately following such a strategy is unethical. Passively falling into such a situation is more understandable, and more common, but it’s not very smart either.

If you’re indispensable, you can hold on to your job — maybe. But the flip side is that you can’t let go of your job either. You can never wash your hands of a project, never hand it over to someone else. You cannot be promoted. You’ll need to take your laptop with you on vacation, if you’re able to take vacation.

I’ve seen this play out in software projects that are never quite finished. The project minimally works, but only with the developer’s intervention. The developer isn’t trying to be indispensable. Quite the opposite: the developer desperately wants to get away from the project. But the software isn’t stable. Bugs are discovered every time a new part of the code is exercised. These may be fixed quickly, but only by the original developer. Or maybe the code is stable, but only the original developer can reproduce the build. Or some part of the code ought to be configurable, but instead the developer has to constantly tweak the source code. For whatever reason, the project isn’t wrapped up and the developer cannot extricate himself from it.

The solution is to plan to make yourself dispensable from the beginning. Ask yourself throughout the project, “How am I going to be able to hand this over to someone else?” Or more graphically, “What if I get hit by a bus?”

Make yourself valuable for what you’re expected to accomplish in the future, not for what you’ve accomplished in the past.

Related post:
Programming the last mile

{ 11 comments }

Status report questions

by John on April 20, 2009

The latest .NET Rocks podcast interviews Pat Hynds on why projects fail. Toward the end of his interview he mentions a simple template for status reports.

  1. What did you work on?
  2. What did you get done?
  3. What did you do that you didn’t anticipate having to do?
  4. What did you plan to do that you didn’t get done?
  5. What do you plan to do?
  6. What do you need from others?

When I started managing a group of programmers, I’d focus on #1 and #2. But in some ways #3 is the most important question. That question can alert you to a major time sink that’s not included in your project estimates. That question can let you know of problems beyond an individual developer’s ability to resolve. That question can tell you it’s time to buy something you were planning on building yourself.

{ 2 comments }

Civic duty on StackOverflow

by John on April 12, 2009

On StackOverflow, users gain reputation points when other users vote up their questions or answers. Voting is considered a civic duty. Voting doesn’t increase your own reputation. The only direct reward for voting is the “Civic Duty” badge for voting 300 times. But voting makes the site work well. Good questions and answers generally rise to the top.

How civic-minded are StackOverflow users? Where do the votes come from? Are people who receive votes more or less likely to give them? That is, do those who have received high reputation scores through other users’ votes also give away reputation points in the form of votes? Jeff Atwood mailed me some data the other day so I could answer these questions.

Sixty percent of StackOverflow users haven’t cast one vote, but that doesn’t tell the whole story. The site is growing rapidly and so there are always a large number of users who haven’t been on the site long enough to vote much or gain much reputation. Also, there are a large number of users who registered some time ago but hardly participate on the site.

When you compare reputation scores and votes, things get more interesting. For starters, users who are somewhat invested in the site, as indicated by reputation score > 100, have voted 91 times on average. That still doesn’t tell the full story because it averages over a huge range of reputation scores. Here’s the more interesting story: The number of votes users cast is proportional to their reputation.

The graph above shows average number of answer votes as a function of reputation. I divided reputation ranges into blocks of 100 (i.e. 0 - 99, 100 - 199, etc.) and averaged the number of times users in that range voted up an answer. There are two reasons I only considered answer votes: there are far more question votes than answer votes, and question votes follow a similar pattern to answer votes.
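
The binning described above can be sketched in a few lines of Python. This is not the script I used for the graph; the function name is hypothetical and the (reputation, votes) pairs are made up to show the mechanics.

```python
from collections import defaultdict


def average_votes_by_reputation(users, block_size=100):
    """Map the lower bound of each reputation block to the mean vote count.

    `users` is an iterable of (reputation, answer_votes_cast) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # block start -> [vote sum, user count]
    for reputation, votes in users:
        block = (reputation // block_size) * block_size  # 0-99 -> 0, 100-199 -> 100
        totals[block][0] += votes
        totals[block][1] += 1
    return {block: s / n for block, (s, n) in sorted(totals.items())}


# Made-up data for illustration, not the data Jeff sent.
sample = [(15, 0), (80, 2), (120, 10), (199, 20), (250, 31)]
averages = average_votes_by_reputation(sample)
print(averages)  # {0: 1.0, 100: 15.0, 200: 31.0}
```

With few users in a block, a single heavy voter can move the average a lot, which is why the high-reputation end of such a graph gets noisy.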

The graph starts to feather out on the right end because there are fewer users in each reputation range; there is more random variation because there are fewer people in the higher ranges to average over. The number of users at each reputation level drops off rapidly according to a power law. Although 99.4% of users have reputation less than 5000, the largest reputation score was 51,313 on the day Jeff collected the data. Here’s a graph from my earlier post, StackOverflow reputation statistics, that shows how quickly the number of people in each reputation range drops.

The graph above was based on data collected at the end of February this year but the data discussed in this post was collected in April. As you look at higher reputation scores, the curve continues to drop off quickly. Since reputations follow a power law, the decrease is linear on a log-log scale.
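
A power law plots as a straight line when both axes are logarithmic: if the number of users with reputation r is C·r^(−a), then log(count) = log C − a·log r, which is linear in log r with slope −a. Here is a small numerical check. The constants C and a are arbitrary values chosen for illustration, not estimates from the StackOverflow data.

```python
import math


def loglog_slope(x1, y1, x2, y2):
    """Slope of the line through two points after taking logs of both axes."""
    return (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))


# Hypothetical power law: count(r) = C * r**(-a). C and a are made-up values.
C, a = 1000.0, 2.0


def count(r):
    return C * r ** (-a)


# The log-log slope between r and 10*r is the same at every scale.
slopes = [loglog_slope(r, count(r), 10 * r, count(10 * r)) for r in (10, 100, 1000)]
print(slopes)
```

Each slope comes out as −a up to floating point error, regardless of where along the curve it is measured.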

Even though users with the highest reputation scores vote the most, most votes come from users with lower reputation scores. That’s just because the large majority of users have lower reputation scores. Users with reputation < 1400 account for a little over half the answer votes cast. They also account for over 96% of all users. If you turn this around, it says that nearly half the votes come from the top 4% of users in reputation. This explains in part why the best answers usually rise to the top: the most knowledgeable users are active voters, assuming reputation and knowledge are correlated.
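
The same kind of calculation can be written as a short function that returns, for a given reputation cutoff, the share of users below it and the share of votes they cast. The function name and the toy data are my own assumptions for illustration; the numbers are contrived to echo the 96% / half split, not taken from the real data.

```python
def vote_share_below(users, threshold):
    """Return (fraction of users, fraction of votes) with reputation < threshold.

    `users` is a list of (reputation, votes_cast) pairs.
    """
    below = [(r, v) for r, v in users if r < threshold]
    user_share = len(below) / len(users)
    vote_share = sum(v for _, v in below) / sum(v for _, v in users)
    return user_share, vote_share


# Contrived data: 96 low-reputation users casting 1 vote each,
# and 4 high-reputation users casting 24 votes each.
sample = [(100, 1)] * 96 + [(5000, 24)] * 4
user_share, vote_share = vote_share_below(sample, 1400)
print(user_share, vote_share)  # 0.96 0.5
```

In this toy data the top 4% of users vote 24 times as often per person, yet the remaining 96% still cast half the votes simply because there are so many of them.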

(The situation is analogous to that of income taxes. The very wealthy pay the most taxes per person, but the bulk of tax revenue comes from those who are not so wealthy. Even so, the percentage of total tax revenue from the top earners is surprisingly high. According to this site, the top 1% of taxpayers were responsible for about 40% of all income tax revenue in 2008. The analogy holds for good reasons. Wealth, like StackOverflow reputation, follows a power law distribution. And taxes increase roughly linearly with wealth the same way StackOverflow votes increase with reputation.)

In short, it looks like StackOverflow users are civic-minded. Those who receive the most votes also give the most votes. And users in the lower end of the reputation range cast most of the votes in total even though they cast fewer votes per person.

Related post: StackOverflow reputation statistics

{ 5 comments }