Dimensional analysis and types

[Image: ruler and protractor on grid paper]

It’s spooky how well dimensional analysis catches errors. If you’re trying to calculate a number of horses, does your final result have units of horses? If it has units of cats or kilograms, something has gone wrong. This is such a simple idea that it’s remarkable it’s even worth checking.

Dimensions are analogous to types in programming languages. Type systems prevent you, for example, from adding a number and a word or from trying to take the square root of an image. You could think of dimensional analysis as a subset of type theory.
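
To make the analogy concrete, here is a minimal sketch in Python of dimensional analysis implemented as a kind of type check. The Quantity class is hypothetical, invented only for illustration: each value carries its dimensions as exponents, multiplication combines them, and adding values with mismatched dimensions fails in the same way a type system rejects adding a number and a word.

```
# A minimal sketch of dimensional analysis as a type check.
# The Quantity class is hypothetical, invented for illustration.

class Quantity:
    def __init__(self, value, **dims):
        self.value = value
        # dimensions stored as exponents, e.g. {"m": 1, "s": -1} for a speed
        self.dims = {d: e for d, e in dims.items() if e != 0}

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} to {other.dims}")
        return Quantity(self.value + other.value, **self.dims)

    def __mul__(self, other):
        dims = dict(self.dims)
        for d, e in other.dims.items():
            dims[d] = dims.get(d, 0) + e
        return Quantity(self.value * other.value, **dims)

    def __repr__(self):
        return f"Quantity({self.value}, {self.dims})"


distance = Quantity(3.0, m=1)
duration = Quantity(2.0, s=1)

print(distance * duration)   # fine: dimensions multiply
print(distance + distance)   # fine: same dimensions
print(distance + duration)   # TypeError: mismatched dimensions
```

A real units library does the same bookkeeping, plus unit conversions and a catalog of named units.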

Is more better?

If a minimal amount of type checking, i.e. checking units of measurement, is effective at catching errors, more type checking should catch more errors, right? Although I mostly agree with this argument, it leaves out cost. Stronger type checking catches more errors, but at what cost? Dimensional analysis is practically free. It can literally be a back-of-an-envelope calculation, and it has a great return on effort. What about type systems?

Most people would agree that some minimal amount of type checking is worth the effort. The controversy is over when it is no longer economical to add additional structure. Haskell has a much more expressive type system than less formal languages, and yet the Haskell community is looking for ways to make the type system even more expressive. Some would say we’ve yet to discover the point where additional typing isn’t worth the effort.

Context

As with most technical controversies, the resolution depends on context. You don’t want types to get in your way when you’re writing a quick-and-dirty script. But you might greatly appreciate them in large, mission-critical projects.

It takes skill to get the most benefit from a type system. A good type system won’t automatically do anything for your code. You could actively work against the type system by writing “stringly typed” code, for example, where every function takes in a string and returns a string. You could passively work with the type system, using built-in types as intended but not creating any new types. Or you could actively work with the type system, creating new types so that more logical errors will become compiler errors.

An example of passively using a type system would be distinguishing integers and floating point numbers, say using integers for counting things, floating point numbers for measurements like temperature and mass, and strings for character data. You could argue that there aren’t many natural types in such a program, and so a strong type system wouldn’t be that helpful. A more active approach would be, for example, to introduce different types for temperatures and masses. You could go much further than this, organizing data into more complex types.
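
As a rough sketch of what the more active approach buys you, here is one way it might look in Python with a type checker such as mypy. The Temperature and Mass names are hypothetical, chosen only for illustration; the passive version of the same function would take two bare floats and accept them in either order without complaint.

```
# A sketch of "actively" using the type system: wrap plain floats in
# distinct types so that mixing them up becomes a checker error.
from typing import NewType

Temperature = NewType("Temperature", float)   # temperature difference, kelvins
Mass = NewType("Mass", float)                 # kilograms

def heat_required(m: Mass, delta_t: Temperature) -> float:
    specific_heat = 4184.0            # J/(kg*K), water
    return specific_heat * m * delta_t

m = Mass(2.0)
dt = Temperature(30.0)

heat_required(m, dt)    # fine
heat_required(dt, m)    # runs, but mypy flags the swapped arguments
```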

All other things being equal, I like strong, static typing. But there are usually other factors that outweigh my typing preferences: availability of libraries, convenience, client requirements, etc.

14 thoughts on “Dimensional analysis and types”

  1. Dimensional analysis is domain specific. Type systems are generic. They’re completely different. The number of horses and the number of cats have the same type, but different dimensions. bananas=0 and “yes, we have no bananas” have the same dimension, but different types.

    Much as I like computational theory, I would very much like a language that is duck-typed, but with support for dimensional analysis. I don’t know how it’d work; probably you need to explicitly declare the relationship between types (implemented as structs/objects) for your application. But I bet that approach would catch a lot more errors than a strict generic type tower.

  2. Janne: Python is duck-typed and has unit libraries. I haven’t used Pint myself, but I’ve heard good things about it.
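
    For a flavor of what that looks like, here is a small sketch based on Pint’s documented UnitRegistry interface (not something I’ve run myself):

    ```
    import pint

    ureg = pint.UnitRegistry()

    distance = 3.0 * ureg.meter
    duration = 2.0 * ureg.second

    speed = distance / duration                   # 1.5 meter / second
    print(speed.to(ureg.kilometer / ureg.hour))   # 5.4 kilometer / hour

    distance + duration   # raises pint.DimensionalityError
    ```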

  3. One thing I like about Haskell’s typing is that the programmer only has to specify types in a few cases (usually more to help themselves than to make the compiler happy), and Haskell will automatically figure out the type of every other expression and error out when types don’t match.

    This is in contrast to Java and (pre-auto) C++, where the coder has to do the work of specifying types everywhere, and to type-hinted languages where, although you’re relieved of the burden of specifying types everywhere, you only get as much type checking back as you put in.

  4. Jonathan: I agree that type inference and strong typing go well together. If you can get stronger typing (in the CS sense) without more typing (in the sense of moving your fingers on a keyboard) that’s a win.

  5. TypeScript is another language where you can choose between JavaScript-style variant types and strong typing as you see fit. I quite like it, and think it would be great if I could also specify types with unit information.
    We seem to be entering a phase where languages allow the mixture of multiple philosophies. For example, Scala mixes functional and imperative programming.

  6. If you take the simple view that the type is everything the compiler knows about an expression, then the extreme position (opposite to the “string” extreme) is that the programmer creates an appropriate type for each (sub)expression. People might like to consider the Wombat type system. I believe it gets in the programmer’s way less because values can have multiple types, in this sense: If we declare Integer isA Rational, then values of type Integer acquire all the properties of rationals (such as .denominator), and are thus also Rationals without any conversion. You might also be interested in Wombat’s proposed torsor-based units system: http://wombatlang.blogspot.com.au/2015/10/units-in-wombat.html.

  7. I suspect that programming language designers could indeed learn something from studying how dimensional analysis is used. Some rough first observations:

    1) Dimensional analysis is a simple protocol with a well-defined application domain. Type systems in programming languages want to be universal and in most cases compulsory for all of a program. Gradual typing is more flexible, but still assumes that a single type system is good for everything. Could we have type systems as libraries and use them as it seems appropriate?

    2) Dimensions are unlimited like the integers, as units can be multiplied arbitrarily. Most type systems are much more rigid: they need every single type declared explicitly with a name. That’s why most type systems cannot express dimensional analysis.

    3) Type systems are mostly marketed as tools for ensuring correctness, but they are usually designed with optimization in mind as well. The basic types of typed languages just happen to be those that have simple and efficient representations at the bit level, although there is no a priori reason that these are good choices for proofs about correctness. Dimensional analysis predates computers and was developed with nothing but correctness in mind – perhaps type systems should explore that approach as well.

    4) Similarly, the static vs dynamic typing distinction exists for the benefit of compiler writers rather than programmers. I’d want to add as many verifiable statements to my code as I consider reasonable, and let the compiler figure out which ones it can prove at compile time and which ones require a run-time check. Making this decision at language-choice time is a clear case of premature optimization.

  8. Oh that was a terrible pun. You can be proud of yourself.

    The next language that adds some automated type inference really needs to announce it with “Now, more typing with less typing”. It rolls off the tongue almost as well as “Worse is Better”.

  9. Yes, at first glance dimensional analysis seems pretty much parallel to type systems in the type-theoretic sense.
    But a sufficiently general implementation is not so obvious:

    The general form of a dimension (a product of powers), which forbids adding values with incompatible (different) units, is derived under the assumption of a linear coordinate system with fixed origin and axes, where only the scales vary, producing homogeneity.
    In the general case, however, we’d have an object described by a curvilinear coordinate system (say, polar coordinates).
    And every coordinate in every such coordinate system should generally be considered as having its own unit (in particular, in the affine case, under a change of origin or scale, or a rotation of the axes).
    Even worse, the usual algebraic operations may not exist there at all; we might only have transport of a vector along a curve instead of simply adding it to another vector…
    And what might dimensional analysis look like in that case?

    But anyway, in simple cases it is a good idea to use different types for different units, even if they’re implemented as the same data type (duration_t, length_t, velocity_t, …, all of them really just float).

    I think the type hierarchy should just be structured appropriately, with respect to the structure of the underlying mathematical model; it should be more “algebraic”.
    For example, when building the constraints for a linear program, one might first define a separate vector data type for each unit involved, and define a direct-sum operation on these spaces, so that the resulting constraint matrix and vectors would lie in the corresponding sum spaces.

    Conclusion: no, dimensional analysis, if implemented, should be heavily based on type theory, not in parallel to it.

  10. Functional Scientist

    As a fan of both Haskell’s type safety and scientific dimensional analysis, I thought you’d find the F# language interesting.

    F# has compile-time unit-of-measure type checking, see https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/units-of-measure

    For example:
    ```
    [<Measure>] type m
    [<Measure>] type sec
    [<Measure>] type kg
    [<Measure>] type newton = kg*m/sec^2

    let mass = 5.0<kg>
    let acc = 10.0<m/sec^2>
    let force : float<newton> = mass * acc

    printf "Force = %g\n" (float force)   // strip the unit for printing

    // produces a compile-time error
    let error_mixed_units = force + mass
    ```

    generates the compile-time error:

    error FS0043: The unit of measure ‘kg’ does not match the unit of measure ‘newton’

    IMO, this is /so/ much better than the mountains of python and R code that manipulate complex dimensional quantities with dimensionless variables, and hope that the right answers pop out.

  11. The F# approach, as Andrew mentioned, is really nice.

    Another very interesting and different approach is https://pypi.org/project/numericalunits/ which embeds all units into floating points, places a somewhat higher burden on the programmer to know what the correct units should be and ask for that, and relies on a trick where if the right units are asked for, the previously encoded multiplicative factors will be cancelled out deterministically… but otherwise the answer will randomly vary. So you have to test it twice to know there is even a problem; on the other hand you get runtime unit checking “for free”. Whereas in F# it’s all compile-time and dynamic checking is not possible. Interestingly, neither approach will easily allow you to e.g. print out the units at runtime.
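
    The trick is easier to see in a stripped-down form. The sketch below doesn’t use numericalunits itself; it hand-rolls the same idea with made-up names: each base unit becomes a random positive factor, a dimensionally correct result divided by the right unit comes out the same for every choice of factors, and a dimensionally wrong one does not.

    ```
    # A hand-rolled illustration of the numericalunits idea (not the
    # library's actual API): units are random multiplicative factors,
    # so only dimensionally correct answers are reproducible.
    import random

    def make_units(seed):
        rng = random.Random(seed)
        m = rng.uniform(0.1, 10.0)   # 1 meter  = m  internal units
        s = rng.uniform(0.1, 10.0)   # 1 second = s  internal units
        return m, s

    for seed in (1, 2):
        m, s = make_units(seed)
        distance = 100.0 * m         # 100 meters
        duration = 9.6 * s           # 9.6 seconds

        speed = distance / duration
        print(speed / (m / s))       # correct: same value for both seeds

        wrong = distance + duration  # dimensionally wrong, but still just a float
        print(wrong / m)             # varies with the seed, exposing the bug
    ```

    Running it with two different seeds is exactly the “test it twice” step described above.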

  12. Konrad wrote: “Dimensions are unlimited like the integers, as units can be multiplied arbitrarily. Most type systems are much more rigid: they need every single type declared explicitly with a name. That’s why most type systems cannot express dimensional analysis.”

    Huh? Every type system I’ve used allows arbitrarily composed unnamed types. How else would a compiler be able to understand common types like “dictionary of (string) to (array of integer)”? Even older languages like C++ can use a type like this, without needing to give it a name.

    That’s not to say all languages are good fits for retrofitting with dimensional analysis, but there’s no inherent disconnect between DA and types due to composition. Types compose, too.
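
    In Python’s type-hint syntax, for instance, that composed type needs no name at all (a minimal sketch):

    ```
    # "Dictionary of string to array of integer", used without ever naming it.
    def totals(scores: dict[str, list[int]]) -> dict[str, int]:
        return {name: sum(values) for name, values in scores.items()}

    print(totals({"alice": [3, 4], "bob": [10]}))   # {'alice': 7, 'bob': 10}
    ```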
