Arguments in favor of functional programming are often unconvincing. For example, the most common argument is that functional programming makes it easier to “reason about your code.” That’s true to some extent. All other things being equal, it’s easier to understand a function if all its inputs and outputs are explicit. But all other things are not equal. In order to make one function easier to understand, you may have to make something else harder to understand.
Here’s an argument from Brian Beckman for using a functional style of programming in a particular circumstance that I find persuasive. The immediate context is Kalman filtering, but it applies to a broad range of mathematical computation.
By writing a Kalman filter as a functional fold, we can test code in friendly environments and then deploy identical code with confidence in unfriendly environments. In friendly environments, data are deterministic, static, and present in memory. In unfriendly, real-world environments, data are unpredictable, dynamic, and arrive asynchronously.
If you write the guts of your mathematical processing as a function to be folded over your data, you can isolate the implementation of your algorithm from the things that make code hardest to test, i.e. the data source. You can test your code in a well-controlled “friendly” test environment and deploy exactly the same code into production, i.e. an “unfriendly” environment.
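As a minimal sketch of the idea (the scalar model, the measurement variance, and all names here are my own illustration, not code from Beckman's paper), the filter step becomes a pure function from (state, observation) to state, and `functools.reduce` folds it over whatever data source you have:

```python
from functools import reduce

def kalman_step(state, z):
    """One update of a scalar Kalman filter estimating a constant.

    state = (x, p): current estimate and its variance.
    z: the new observation. R is the (assumed) measurement variance.
    """
    x, p = state
    R = 1.0                      # measurement noise variance (assumed known)
    k = p / (p + R)              # Kalman gain
    return (x + k * (z - x),     # updated estimate
            (1 - k) * p)         # updated variance

# "Friendly" environment: deterministic data, all in memory.
observations = [1.2, 0.9, 1.1, 1.0, 0.95]
x, p = reduce(kalman_step, observations, (0.0, 1000.0))

# "Unfriendly" environment: the same function, fed one observation
# at a time as data arrives.
state = (0.0, 1000.0)
for z in observations:           # in production this would be a live stream
    state = kalman_step(state, z)

assert (x, p) == state           # identical code, identical results
```

The point is that `kalman_step` never knows where its observations come from, so the code exercised in testing is the code that runs in production.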
The flexibility to deploy exactly the code that was tested is especially important for numerical code like filters. Detecting, diagnosing and correcting numerical issues without repeatable data sequences is impractical. Once code is hardened, it can be critical to deploy exactly the same code, to the binary level, in production, because of numerical brittleness. Functional form makes it easy to test and deploy exactly the same code because it minimizes the coupling between code and environment.
I ran into this early on when developing clinical trial methods first for simulation, then for production. Someone would ask whether we were using the same code in production as in simulation.
“Yes we are.”
“Exactly the same code?”
“Essentially” was not good enough. We got to where we would use the exact same binary code for simulation and production, but something analogous to Kalman folding would have gotten us there sooner, and would have made it easier to enforce this separation between numerical code and its environment across applications.
Why is it important to use the exact same binary code in test and production, not just a recompile of the same source code? Brian explains:
Numerical issues can substantially complicate code, and being able to move exactly the same code, without even recompiling, between testing and deployment can make the difference to a successful application. We have seen many cases where differences in compiler flags, let alone differences in architectures, even between different versions of the same CPU family, introduce enough differences in the generated code to cause qualitative differences in the output. A filter that behaved well in the lab can fail in practice.
Emphasis added, here and in the first quote above.
Note that this post gives an argument for a functional style of programming, not necessarily for the use of functional programming languages. Whether the numerical core or the application that uses it would best be written in a functional language is a separate discussion.
9 thoughts on “One practical application of functional programming”
I like this perspective.
Also, an “unfriendly” environment that is important for numerical work is parallel computation. Functional programming may help us take advantage of modern parallel computing hardware architectures.
IMHO, functional programming is most useful when it also reduces the developer’s mental load. That is, if a problem has a natural description or definition in the functional domain, then its best solution and implementation is likely to be found there as well.
No matter the other benefits functional programming provides, they are moot if the developer has to struggle to apply them. If it feels like you are trying to shove a square peg into a round hole, you probably are.
When it comes to “using the same binary for test and deployment”, I was initially surprised anyone would think there was any other way! But then my perspective may be skewed: Most of my work is building safety-critical systems, where tests don’t “count” unless they are performed on the final system.
All testing done prior to final systems testing (which can be fiendishly difficult and expensive) has the singular goal of making those final tests as rigorous as possible while also passing the first time through.
An apparent contradiction? Not if you consider your first job to be developing tests, rather than an application. A good application will be the natural product of aggressive and robust testing.
Fortunately, tests often have a natural functional description. Test development was my “gateway drug” to the benefits of functional programming.
I’m an advocate for functional programming, but to play Devil’s Advocate: if the goal is to ensure the bits I’m using in dev are the same bits I’m using in prod, couldn’t I just use Docker?
Docker is very useful for moving the bits around, but the difficulty I have in mind here is designing the software so that the same interface is used in testing and production.
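One way to read that reply (this sketch and its names are my own, not the author's): the numerical core should depend only on an iterable interface, so tests can pass an in-memory list while production passes a live source, through the exact same code path.

```python
def run_filter(step, init, source):
    """Fold `step` over any iterable `source`. The core never knows
    whether data came from memory, a file, or a network socket."""
    state = init
    for item in source:
        state = step(state, item)
    return state

def running_mean_step(state, z):
    # Toy numerical core: incremental sum and count for a running mean
    # (a hypothetical stand-in for a real filter).
    total, n = state
    return (total + z, n + 1)

# Test environment: a deterministic list.
total, n = run_filter(running_mean_step, (0.0, 0), [3.0, 4.0, 5.0])
assert total / n == 4.0

# Production environment: the same code path, a different source.
def sensor_readings():           # hypothetical live data source
    yield from (3.0, 4.0, 5.0)   # imagine these arriving asynchronously

assert run_filter(running_mean_step, (0.0, 0), sensor_readings()) == (12.0, 3)
```

Docker freezes the bits; this kind of design decouples the bits from their data source in the first place.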
Although I’m a fan of functional programming, I’m not too convinced by the argument. Having the same exact code in a friendly environment doesn’t mean that it will behave as expected in the unfriendly environment (when data might be polluted in an unforeseen way for example).
Additionally, I don’t think the concept of a fold is the essential point to make about functional programming; rather, it’s the idea of keeping state explicit, i.e. immutability.
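The commenter's distinction can be shown by contrast (a toy sketch of my own, not from the post): the imperative version below hides its state in a mutable object, while the functional version makes every input and output explicit.

```python
# Implicit state: the update mutates the object in place, so testing
# means constructing and inspecting a stateful Filter instance.
class Filter:
    def __init__(self):
        self.estimate = 0.0
    def update(self, z):
        self.estimate = 0.9 * self.estimate + 0.1 * z  # hidden coupling

# Explicit state: the same arithmetic as a pure function. Old states
# are never destroyed, so any step can be replayed or inspected.
def update(estimate, z):
    return 0.9 * estimate + 0.1 * z

f = Filter()
f.update(10.0)
assert f.estimate == update(0.0, 10.0)  # same arithmetic, explicit state
```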
Dependency injection (when properly done) seems to follow the rules of functional programming, and it is also one of the primary drivers toward better unit tests: the components, when wired together by the container, are more testable in isolation.
This really seems to tie in with what you are saying here. DI allows only those “live versus test” aspects to vary between the two systems, and thus the code in production is as close to the unit-tested versions as possible.
In functional programming, “dependency injection” doesn’t have a name. It’s just how you write code.
I use FP often, mostly in pieces of R and Python code, and not just in state-space modeling. On an entirely different note, Jane Street Capital and Jane Street Europe use OCaml exclusively to code up active trading scripts, which they first prove correct. Yaron Minsky has a YouTube video or two where he explains why.
State-space methods and filtering can be used for an amazing variety of problems, as long as the number of state components doesn’t get ridiculously large. The textbooks by Durbin and Koopman, and especially by Harvey, illustrate many of these, along with papers on topics like forecasting the Oxford–Cambridge boat race. These books, Harvey’s especially, dispel many superstitions, like the idea that points in a series need to be equally spaced.
Finally, for numerical soundness I tend to use the DLM package in R, because it uses an SVD-based formulation of filters and smoothers, which is about as stable as you can get. Often this is good enough; when it isn’t, I implement the equations first in DLM and then in the target environment, and check that the results are comparable.
Here is Erik Meijer’s take: https://queue.acm.org/detail.cfm?id=2611829
“There is a trend in the software industry to sell “mostly functional” programming as the silver bullet for solving problems developers face with concurrency, parallelism (manycore), and, of course, Big Data. Contemporary imperative languages could continue the ongoing trend, embrace closures, and try to limit mutation and other side effects. Unfortunately, just as “mostly secure” does not work, “mostly functional” does not work either. [..]”