The success of OOP

Allen Wirfs-Brock gave the following defense of OOP a few days ago in a series of six posts on Twitter:

A young developer approached me after a conf talk and said, “You must feel really bad about the failure of object-oriented programming.” I was confused. I said, “What do you mean, that object-oriented programming was a failure? Why do you think that?”

He said, “OOP was supposed to fix all of our software engineering problems and it clearly hasn’t. Building software today is just as hard as it was before OOP came along.”

“Have you ever looked at the programs we were building in the early 1980s? At how limited their functionality and UIs were? OOP has been an incredible success. It enabled us to manage complexity as we grew from 100KB applications to today’s 100MB applications.”

Of course OOP hasn’t solved all software engineering problems. Neither has anything else. But OOP has been enormously successful in allowing ordinary programmers to write much larger applications. It has become so pervasive that few programmers consciously think about it; it’s simply how you write software.

I’ve written several posts poking fun at the excesses of OOP and expressing moderate enthusiasm for functional programming, but I appreciate OOP. I believe functional programming will influence object-oriented programming, but not replace it.


Launching missiles with Haskell

Haskell advocates are fond of saying that a Haskell function cannot launch missiles without you knowing it. Pure functions have no side effects, so they can only do what they purport to do. In a language that does not enforce functional purity, calling a function could have arbitrary side effects, including launching missiles. But this cannot happen in Haskell.

The difference between pure functional languages and traditional imperative languages is not quite that simple in practice.

Programming with pure functions is conceptually easy but can be awkward in practice. You could pass each function the state of the world before the call and have it return the state of the world after the call. But it’s unrealistic to pass a program’s entire state as an argument each time, so you’d like to pass just the state you need, and to have a convenient way of doing so. You’d also like the compiler to verify that you’re passing around only a limited slice of the world. That’s where monads come in.

Suppose you want a function to compute square roots and to log its calls. Your square root function would have to take two arguments: the number to find the root of, and the state of the log before the function call. It would also have to return two values: the square root and the updated log. This is a pain, and it makes function composition difficult.
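Here’s a rough sketch in Haskell of what that explicit threading looks like. The function names are illustrative, not from any library.

    -- A logged square root with the log threaded by hand: the function
    -- takes the log as an extra argument and returns the updated log
    -- alongside its result.
    sqrtWithLog :: Double -> [String] -> (Double, [String])
    sqrtWithLog x msgs = (sqrt x, msgs ++ ["sqrt called with " ++ show x])

    -- Composing two calls means wiring the log through manually.
    fourthRootWithLog :: Double -> [String] -> (Double, [String])
    fourthRootWithLog x msgs0 =
      let (y, msgs1) = sqrtWithLog x msgs0
          (z, msgs2) = sqrtWithLog y msgs1
      in (z, msgs2)

    main :: IO ()
    main = print (fourthRootWithLog 16 [])
    -- prints (2.0,["sqrt called with 16.0","sqrt called with 4.0"])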

Monads provide a sort of side-band for passing state around, such as our function call log. You’re still passing the log around, but you can do it implicitly using monads. This makes it easier to call and compose functions that do logging. It also lets the compiler check that you’re passing around a log but not arbitrary state. A function that updates a log, for example, can affect the state of the log, but it can’t do anything else. It can’t launch missiles.
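Here’s the same example rewritten with the standard Writer monad (from the mtl package); sqrtLogged and fourthRoot are again names of my own choosing.

    import Control.Monad.Writer

    -- The Writer monad carries the log implicitly. The function can
    -- append to the log (tell) but has no access to any other state:
    -- it can't launch missiles.
    sqrtLogged :: Double -> Writer [String] Double
    sqrtLogged x = do
      tell ["sqrt called with " ++ show x]
      return (sqrt x)

    -- Composition is ordinary monadic sequencing; the logs are
    -- concatenated behind the scenes.
    fourthRoot :: Double -> Writer [String] Double
    fourthRoot x = sqrtLogged x >>= sqrtLogged

    main :: IO ()
    main = print (runWriter (fourthRoot 16))
    -- prints (2.0,["sqrt called with 16.0","sqrt called with 4.0"])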

Once monads get large and complicated, it’s hard to know what side effects they hide. Maybe they can launch missiles after all. You can only be sure by studying the source code. Now how do you know that calling a C function, for example, doesn’t launch missiles? You study the source code. In that sense Haskell and C aren’t entirely different.

The Haskell compiler does give you assurances that a C compiler does not. But ultimately you have to study source code to know what a function does and does not do.

Related post: Monads are hard because …

Weak static type systems

Comment by Simon Peyton Jones in an interview:

People often dislike static type systems because they’ve only met weak ones. A weak or not very expressive type system gets in your way all the time. It prevents you from writing functions you want to write that you know are fine. … The solution is not to abandon the type system but to make the type system more expressive.

In particular, he mentions Haskell’s polymorphic types and type inference as ways to make strong static typing convenient to use.
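As a small illustration of my own, not from the interview: the Haskell compiler infers a fully polymorphic type for a definition with no annotations at all, so the type system stays out of your way while still checking every use.

    -- No type annotation needed: GHC infers the most general type,
    --   pairs :: [a] -> [b] -> [(a, b)]
    pairs xs ys = [(x, y) | x <- xs, y <- ys]

    main :: IO ()
    main = do
      print (pairs [1, 2, 3 :: Int] "ab") -- Ints and Chars
      print (pairs [True, False] [pi])    -- Bools and Doubles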

Monads are hard because …

Here’s a nice quip from Luke Gorrie on Twitter:

Monads are hard because there are so many bad monad tutorials getting in the way of finally finding Wadler’s nice paper.

Here’s the paper by Philip Wadler that I expect Luke Gorrie had in mind: Monads for functional programming.

Here’s the key line from Wadler’s paper:

Pure functional languages have this advantage: all flow of data is made explicit. And this disadvantage: sometimes it is painfully explicit.

That’s the problem monads solve: they let you leave implicit some of the repetitive code otherwise required by functional programming. That simple but critical point is left out of many monad tutorials.

Dan Piponi wrote a blog post You Could Have Invented Monads! (And Maybe You Already Have) very much in the spirit of Wadler’s paper. He starts with the example of adding logging to a set of functions. You could have every function return not just the value you’re interested in but also the updated state of the log. This quickly gets messy. Suppose, for example, that your basic math functions write to an error log when called with illegal arguments. Then your square root function has to take as input not just a real number but also the state of the error log before it was called. Monads give you a way of effectively doing the same thing while conceptually separating the code that takes square roots from the code that maintains the error log.
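Here’s a sketch in the spirit of Piponi’s tutorial. The names Logged, bind, and safeSqrt are mine; the bind function is essentially the plumbing that Haskell’s Writer monad provides for you.

    -- A value together with accumulated log messages.
    type Logged a = (a, [String])

    -- Composing log-producing functions by hand means unpacking and
    -- repacking the log every time. Writing that plumbing once gives
    -- you, in essence, the bind operation of the Writer monad.
    bind :: Logged a -> (a -> Logged b) -> Logged b
    bind (x, msgs) f = let (y, msgs') = f x in (y, msgs ++ msgs')

    -- A square root that logs a complaint on an illegal argument.
    safeSqrt :: Double -> Logged Double
    safeSqrt x
      | x < 0     = (0 / 0, ["sqrt: illegal argument " ++ show x])
      | otherwise = (sqrt x, [])

    main :: IO ()
    main = print (safeSqrt (-4) `bind` safeSqrt)
    -- prints (NaN,["sqrt: illegal argument -4.0"])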

As for “so many bad monad tutorials,” see Brent Yorgey on the monad tutorial fallacy.

By the way, this post is not Yet Another Monad Tutorial. It’s simply an advertisement for the tutorials by Philip Wadler and Dan Piponi.

How to get started with functional programming

Someone asked me this weekend how to get started with functional programming. My answer: Start by writing pure functions in the programming language you’re currently using.

The only input to a pure function is its argument list and the only output is its return value. If you haven’t seen this before, you might think all functions are pure. After all, every function takes in values and returns a value. But in conventional programming there are typically out-of-band ways for information to flow into or out of a function. For example, an impure function may depend on a global variable or on class member data. In that case its behavior is not entirely determined by its arguments. Similarly, an impure function might set a global variable or write to a database. In that case the function has a side effect in addition to its return value.
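Haskell makes the distinction visible in the types, which allows a compact illustration. In this sketch the IORef stands in for a global variable.

    import Data.IORef

    -- A pure function: the output depends only on the argument.
    area :: Double -> Double
    area r = pi * r * r

    -- An impure counterpart: its result also depends on hidden state,
    -- and the IO in its type advertises the out-of-band dependency.
    scaledArea :: IORef Double -> Double -> IO Double
    scaledArea scaleRef r = do
      scale <- readIORef scaleRef
      return (scale * area r)

    main :: IO ()
    main = do
      scaleRef <- newIORef 2.0
      a1 <- scaledArea scaleRef 1.0
      writeIORef scaleRef 10.0 -- out-of-band state change
      a2 <- scaledArea scaleRef 1.0
      print (a1, a2)           -- same argument, different results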

You can write pure functions in any language, though it’s easier in some languages than others. For example, no one would call Fortran a functional language, but there are people (M. J. D. Powell comes to mind) who discipline themselves to write pure functions in Fortran.

Why write pure functions? A pure function has referential transparency, meaning it will always return the same value when given the same inputs. The output does not depend on the system time, the state of a database, which functions were called previously, or anything else that is not explicitly passed as an argument to the function. This means pure functions are easier to understand (and hence easier to debug and test).

You can’t always write pure functions. If you need to stick a value in a database, this cannot be accomplished with a pure function. Or if you’re calling a random number generator, you don’t want it to have referential transparency, always returning the same output! But the goal is to use pure functions when practical. You want to eliminate out-of-band communication when it’s convenient to do so. Total purity is not practical; some argue that the sweet spot is about 85% purity.

So why don’t programmers use pure functions more often? One reason is that pure functions require longer argument lists. In an object-oriented language, methods can have shorter argument lists by implicitly depending on object state. The price of shorter method signatures is that you can’t understand a method by itself; you have to know the state of the object when the method is called. Is it worthwhile to give up referential transparency in order to have shorter method signatures? It depends on your context and your taste, though in my opinion it’s often worthwhile to accept longer function signatures in exchange for more pure functions.

Another reason people give for not writing pure functions is that it’s too expensive to copy large data structures to pass them into a function. But that’s what pointers are for. You can conceptually pass an object into a function without actually making a copy of the object’s bits.

You can also fake purity for the sake of efficiency. For example, Mike Swaim left a comment recently giving an example of how memoization sped up a program by several orders of magnitude. (Memoization is a technique of caching computations. When a function is asked to compute something, it first looks to see whether it has already done the calculation. If so, it returns the cached value. If not, it does the calculation and adds its output to the cache.) A function that uses memoization is not strictly pure — its calculations have a persistent impact on the state of its cache — but such a function can still have referential transparency, always returning the same output given the same input. You could say it’s cheating to call such functions pure, and it is, but if you’re really a stickler about it, all pure functions have side effects.
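In a lazy language like Haskell, memoization can even fall out of a definition that is pure on its face. Here is a classic sketch, with memoFib as an illustrative name; the lazily built list serves as the cache.

    -- Memoized Fibonacci: the lazily built list fibs is the cache.
    -- memoFib is referentially transparent (same input, same output)
    -- even though later calls reuse entries computed by earlier ones.
    memoFib :: Int -> Integer
    memoFib = (fibs !!)
      where
        fibs = map fib [0 ..]
        fib 0 = 0
        fib 1 = 1
        fib n = memoFib (n - 1) + memoFib (n - 2)

    main :: IO ()
    main = print (memoFib 100) -- fast; naive recursion would take ages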


You wanted a banana but you got a gorilla holding the banana

Joe Armstrong, creator of Erlang, on software reusability.

I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.

If you have referentially transparent code, if you have pure functions — all the data comes in its input arguments and everything goes out and leaves no state behind — it’s incredibly reusable.

Source: Coders at Work. Emphasis added.


I do most of my work in object-oriented languages and I don’t see that changing any time soon. I’m more interested in functional techniques than functional languages: using pure functions, using functions as arguments and return values, etc. As Joe Armstrong says, such code is easier to reuse. If you want to reuse (or test) a functional banana, you don’t have to set up a stateful gorilla to hold the banana first.


Pure functions have side effects

Functional programming emphasizes “pure” functions, functions that have no side effects. When you call a pure function, all you need to know is its return value. You can be confident that calling a function doesn’t leave behind any state changes that will affect future function calls.

But pure functions are only pure at a certain level of abstraction. Every function has some side effect: it uses memory, it takes CPU time, etc. Harald Armin Massa makes this point in his PyCon 2010 talk “The real harm of functional programming.” (His talk begins about eight minutes into the February 21, 2010 afternoon lightning talks; see the video.)

Even pure functions in programming have side effects. They use memory. They use CPU. They take runtime. And if you look at those evil languages, they are quite fast at doing Fibonacci or something, but in bigger applications you get reports: “Hmm, I have some runtime problems. I don’t know how to get it faster or what is going wrong.”

Massa argues that the concept of an action without side effects is dangerous because it disassociates us from the real world. I disagree. I appreciate his warning that the “no side effect” abstraction may leak like any other abstraction. But pure functions are a useful abstraction.

You can’t avoid state, but you can partition the stateful and stateless parts of your code. 100% functional purity is impossible, but 85% functional purity may be very productive.
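Here’s a minimal sketch of that partitioning in Haskell: the logic lives in a pure function that is easy to test, and I/O is confined to a thin outer shell.

    -- Pure core: all the logic, no I/O.
    summarize :: [Double] -> (Double, Double, Double)
    summarize xs = (minimum xs, maximum xs, sum xs / n)
      where n = fromIntegral (length xs)

    -- Thin stateful shell: read input, call the pure core, print.
    main :: IO ()
    main = do
      contents <- getContents
      let xs = map read (lines contents) :: [Double]
      print (summarize xs)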


85% functional language purity

James Hague offers this assessment of functional programming:

My real position is this: 100% pure functional programming doesn’t work. Even 98% pure functional programming doesn’t work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches — say to 85% — then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure.


F# may succeed where others have failed

Philip Wadler wrote an article a decade ago entitled Why no one uses functional languages. He begins the article by explaining that yes, there have been a number of large successful projects developed in functional programming languages. But compared to the number of programmers who work in procedural languages, the number working in functional languages is essentially zero. The reasons he listed fall into eight categories.

  1. Lack of compatibility with existing code
  2. Limited library support compared to popular languages
  3. Lack of portability across operating systems
  4. Small communities and correspondingly little community support
  5. Inability to package code well for reuse
  6. Lack of sophisticated tool support
  7. Lack of training for new developers in functional programming
  8. Lack of popularity

Most of these reasons do not apply to Microsoft’s new functional language F#, since it is built on top of the .NET Framework. For example, F# has access to the enormous .NET class library and interoperates smoothly with anything developed with .NET. As for tool support, Visual Studio will support F# starting with the 2010 release. Even portability is not a barrier: The Mono Project has been quite successful in porting .NET code to non-Microsoft platforms. (Listen to this Hanselminutes interview with Aaron Bockover for an idea of how mature Mono is.)

The only issues that may apply to F# are training and popularity. Programmers receive far more training in procedural programming, and the popularity of procedural programming is self-reinforcing. Despite these disadvantages, interest in functional programming in general is growing. And when programmers want to learn a functional programming language, I believe many will choose F#.

It will be interesting to see whether F# catches on. It resolves many of the accidental difficulties of functional programming, but the intrinsic difficulties remain. Functional programming requires a different mindset, one that programmers have been reluctant to adopt. But now programmers have a new incentive to give functional languages a try: multi-core processors.

Individual processor cores are not getting faster, but we’re getting more of them per box. We have to write multi-threaded code to take advantage of the extra cores, and multi-threaded programming in procedural languages is hard, beyond the ability of most programmers. Purely functional languages eliminate many of the difficulties of multi-threaded programming, so it seems likely that at least portions of systems will be written in functional languages.
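As a sketch of why purity helps, here’s an example using GHC’s parallel package. Because expensive is pure, its evaluations share no mutable state and can safely run on separate cores. (This assumes the parallel library; compile with -threaded and run with +RTS -N.)

    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- A pure, CPU-bound function; no locks or shared state needed.
    expensive :: Int -> Integer
    expensive n = sum [1 .. 10 ^ 7 + fromIntegral n]

    main :: IO ()
    main = print (sum (parMap rdeepseq expensive [1 .. 8]))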

Related post: Functional in the small, OO in the large

MIT replaces Scheme with Python

According to an article on The Snowtide Blog, MIT has decided to teach beginning CS classes in Python rather than Scheme. (Thanks to procoders.net for the link.) The article paraphrases remarks by Gerald Sussman at the announcement. The main rationale was that the world is more complicated now than when the course was designed and SICP was written. Now programmers spend more time researching libraries than writing everything from scratch.

Here’s something I expected Sussman to say, though apparently he did not: you can use Python as a functional language if you like; you just don’t have to. I don’t know why more people don’t emphasize this. Python is not a functional language, but it does make it easy to declare functions on the fly, pass functions as arguments, and so on. You’re free to write Scheme-like programs in Python if you want to, but you’re also free to use other styles when they are more appropriate.