Efficiency vs. Robustness

Something is efficient if it performs optimally under ideal circumstances.

Something is robust if it performs pretty well under less than ideal circumstances.

Life is full of trade-offs between efficiency and robustness. Over time, I’ve come to place more weight on the robustness side of the trade-off. I imagine that’s typical. With more experience, you become more willing to concede that you haven’t thought of everything that could happen. After a while, you notice that events with a “one in a million” chance of occurring happen much more often than predicted.

Robust things are often more efficient than efficient things are robust. That is, robust strategies often do fairly well under ideal circumstances. You may not give up much efficiency for robustness. But efficient strategies can fail spectacularly when the assumptions they’re based on don’t hold.

Related post: Six-sigma events

19 thoughts on “Efficiency vs. Robustness”

  1. Sounds like a good explanation of why the economy is a major roadblock on the path to developing robust strategies for avoiding disasters in the face of the fuzzy consequences of climate change. Can markets produce robustness at all?

  2. EnlightenedDuck

    I became a huge fan of non-parametric statistics when I learned that the Asymptotic Relative Efficiency of the sign test compared to the t-test on normal data is 2/pi, and that with the Mann-Whitney test the ARE is 3/pi. Giving up *so* little power for the robustness of the non-parametric tests convinced me.
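
    For illustration, here’s a rough simulation sketch of that small power gap, using two-sample tests on normal data (SciPy is assumed; the sample size, shift, alpha, and replication count are arbitrary choices):

    ```python
    # Rough power comparison: two-sample t-test vs. Mann-Whitney on normal data.
    # Sample size, shift, alpha, and replication count are illustration choices.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, shift, alpha, reps = 30, 0.6, 0.05, 2000
    rejections = {"t-test": 0, "Mann-Whitney": 0}

    for _ in range(reps):
        x = rng.normal(0.0, 1.0, n)
        y = rng.normal(shift, 1.0, n)  # second group shifted up by `shift`
        if stats.ttest_ind(x, y).pvalue < alpha:
            rejections["t-test"] += 1
        if stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha:
            rejections["Mann-Whitney"] += 1

    for name, k in rejections.items():
        print(f"{name}: empirical power ~ {k / reps:.2f}")
    ```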

  3. As for references, the idea of robustness goes under different names in different disciplines: stability, well-posedness, lack of sensitive dependence on initial conditions, etc.

    One of the differences between computer science and software engineering is that the latter is more concerned with robustness. People who develop software constantly have to ask “What if there’s a bug? What will the consequences be? What will it take to find and fix it?”

  4. The thing about robustness, especially with computers: that 1/1,000,000 event is called Tuesday. If it’s an automated service getting 100k hits a day, it doesn’t take long for that one-in-a-million event to occur.

    If it’s an odd set of inputs from a user, it can happen by mistake, by failure to read the manual, or through difficulty understanding instructions due to past experience and/or language or cultural factors, etc. In short, trust nothing.
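
    To put rough numbers on that (the request volumes below are just illustration values): the chance of seeing at least one event of probability p in n independent trials is 1 - (1 - p)^n, and it adds up fast.

    ```python
    # Chance of at least one "one in a million" event at various request volumes.
    # The volumes are arbitrary illustration values.
    p = 1e-6
    for n in (100_000, 1_000_000, 10_000_000):
        prob = 1 - (1 - p) ** n
        print(f"{n:>10,} requests: P(at least one event) ~ {prob:.3f}")
    # 100k requests already gives roughly a 10% chance; a million gives ~63%.
    ```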

  5. Mike: With sheer volume, one-in-a-million events can happen frequently. As you said, that’s common with computers.

    That’s important, but what I had in mind was events that are said to be one-in-a-million rarities that actually happen one time in a thousand. For example, we’ve seen several financial events recently that were only supposed to happen once in 10,000 years.

    There are two common causes of such gross underestimates of probability: assuming things are independent that are not, and assuming a normal (thin-tailed) distribution when it doesn’t fit.
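
    As a rough sketch of the thin-tail point (SciPy assumed; the Student-t degrees of freedom are an arbitrary choice): a six-sigma event is essentially impossible under a normal model but merely rare under a heavier-tailed one.

    ```python
    # Tail probability of a 6-sigma event under a normal model vs. a
    # heavier-tailed Student-t model (df=3 is an arbitrary illustration choice).
    from scipy import stats

    normal_tail = stats.norm.sf(6)   # P(X > 6) for a standard normal, ~1e-9
    t_tail = stats.t.sf(6, df=3)     # P(X > 6) for a Student-t with 3 df

    print(f"normal:   {normal_tail:.1e}")
    print(f"t (df=3): {t_tail:.1e}")  # several orders of magnitude larger
    ```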

  6. Mr. John, I think it would be awesome if you showed efficiency vs. robustness graphically :)

  7. The definitions are similar to, but certainly distinct from, those in Merriam-Webster, to wit:

    efficient: “productive of desired effects; especially : productive without waste” (definition 2 on http://www.merriam-webster.com/dictionary/efficient)

    robust: “capable of performing without failure under a wide range of conditions” (definition 1d on http://www.merriam-webster.com/dictionary/robust) (Sadly, definition 4 is not applicable here though it remains awesome :)

    The particular difference to note here is that efficiency is not limited to optimal circumstances (though robustness not being limited to suboptimal circumstances is perhaps interesting food for future thought).

    It’s certainly clear why continual examination of efficiency can lead to your restricted redefinition. In computer programming, for instance, checks for conditions that “should” never happen can be wasteful if the processor has bad branch prediction. This can, however, be addressed. And of course, if you take the completely paranoid bent that *everything* needs to be checked, nothing will ever complete (for instance, do you check to make sure that assignment really worked? How about making sure the check, which is itself software, really was correct? Who checks the checkers?!). So clearly there’s a point of diminishing returns.

    And while we’re on the subject of rare events being less rare than anticipated, consider flooding. There, the measures implemented to protect against somewhat rare conditions (protecting buildings in 100- or 500-year flood plains) themselves cause the events they’re protecting against to become increasingly common (because damming a river to prevent flooding into the flood plain raises its level).

    Life is complicated. :)

  8. Usually these two terms are used in contrast to each other.

    Robustness means that you ensure/guarantee that your system behaves well under your predefined circumstances. E.g., say you claim that your lift will work smoothly for weights between 0 and 1000 kg; push it beyond that and you’ll be calling the service and paying a lot in service costs. It doesn’t need to be optimal; it only has to be guaranteed to work. Additionally, as the system designer you are responsible for checking whether the weight is in the correct band, and if it isn’t, the lift simply doesn’t operate. It’s like a telescope that sees a narrow part of a scene: it isn’t claiming to show all the beauty of the city.

    Efficiency, ahhhhh, it is a pain in the ass. When you consider efficiency you need to clarify your proposal: efficient in what sense? Is it energy efficient or time efficient? Or some mixture, depending on your objective functional? Optimality is suitable in many cases. Efficiency also needs a metric, but to use a metric in software engineering there would have to be some god-code that is labeled as 1. Efficiency makes sense when you consider a transfer, e.g. an electric motor can be 0.85 efficient at converting electrical energy to mechanical energy ;)

  9. Franklin: I think he’s on to something. Antifragility is great if you can get it, but robust is better than fragile.

    Statistical methods can be antifragile relative to deterministic methods: sometimes noise in your input can be helpful.

  10. I agree with Joseph that efficiency is not necessarily just about optimal conditions. My personal definitions are something like:

    Efficient: works very well under expected conditions
    Robust: does not fail catastrophically under unexpected conditions

    It’s interesting that decision analysis has (since Arrow) been focused on maximizing expected utility, while Game Theory has focused on minimizing maximum loss. Efficiency versus robustness?
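
    A toy sketch of that contrast, with made-up payoffs and probabilities: maximizing expected value and minimizing maximum loss can pick different strategies.

    ```python
    # Toy decision problem: two strategies, two states of the world with
    # probabilities 0.9 and 0.1. All numbers are invented for illustration.
    payoffs = {
        "efficient": [10, -50],  # great in the likely state, terrible in the rare one
        "robust":    [4, 2],     # decent either way
    }
    probs = [0.9, 0.1]

    for name, vals in payoffs.items():
        expected = sum(p * v for p, v in zip(probs, vals))
        worst = min(vals)
        print(f"{name:9s} expected value = {expected:4.1f}, worst case = {worst}")

    # Expected value slightly favors "efficient" (4.0 vs. 3.8), while
    # minimax (best worst case) favors "robust" (2 vs. -50).
    ```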

  11. @Dave I think the difference here is in the nature of what causes the adverse events. In Game Theory it is assumed that your competitor is trying to achieve their own goals, which might be contrary to yours. You assume they are always going to make the best decision available to them, which in a zero-sum situation means they’re trying to maximize your loss.

    For Arrow I think the universe is a little more agnostic: stuff just happens, not actively trying to mess with you, and you can try to optimize the expected outcome. So it’s probably better for modelling things like shipping disasters, etc., versus competitors competing for the same contract.

  12. @Mike,

    I agree that it matters whether there’s a sapient adversary choosing strategies — but there’s also the question of iteration. Arrow’s model assumes that you make many choices during your life, and can expect the net end result to be near the sum of the expected values. But real life can be more like a Gambler’s Ruin problem, where one big loss takes you out of the game. Game theory distinguishes between the one-time game and the iterated game. Many of the paradoxes of decision analysis (Allais, etc.) disappear if you stipulate that the situation only happens once.
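
    A small simulation sketch of that point (the bankroll, edge, and horizon are invented numbers): a repeated bet with positive expected value can still end in ruin before the long run arrives.

    ```python
    # Gambler's-ruin flavor: a bet with a positive edge can still wipe out a
    # small bankroll. The bankroll, edge, and horizon are invented numbers.
    import random

    random.seed(0)

    def play(bankroll=10, win_prob=0.55, rounds=1000):
        """Bet 1 unit per round; return True if we go broke before the horizon."""
        for _ in range(rounds):
            bankroll += 1 if random.random() < win_prob else -1
            if bankroll <= 0:
                return True
        return False

    trials = 10_000
    ruined = sum(play() for _ in range(trials))
    # The classical gambler's-ruin formula puts the ruin probability near
    # (0.45/0.55)**10, roughly 13%, despite the positive expected value per bet.
    print(f"ruined in {ruined / trials:.1%} of runs")
    ```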

  13. Suresh Venkat (@geomblog)

    Of course even with this, there’s a “bias-variance” tradeoff. A robust method is one with “low variance”, but it can have “high bias” by being inefficient. An efficient method has “low bias” because it can do quite well, but it can have “high variance” because of fragility.

    Worst-case analysis in algorithm design is an attempt at being robust (to all possible input scenarios, for example).

  14. This reminds me of a lot of stuff Steve Maguire said in his book Writing Solid Code. He emphasizes the virtue of really dumb brute-force algorithms. Often, you know they’ll work to a much greater degree of certainty than clever ones, and they are easy to understand and maintain, if their performance drawbacks aren’t critical.
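
    In that spirit, one pattern is to keep the dumb version around as a cross-check on the clever one; here is a sketch with made-up functions:

    ```python
    # Sketch of the idea with made-up functions: keep the obviously correct
    # brute-force version around and use it to check the clever one.
    def max_subarray_brute(xs):
        """O(n^2), but easy to convince yourself it's right."""
        return max(sum(xs[i:j])
                   for i in range(len(xs))
                   for j in range(i + 1, len(xs) + 1))

    def max_subarray_clever(xs):
        """Kadane's algorithm: O(n), but easier to get subtly wrong."""
        best = cur = xs[0]
        for x in xs[1:]:
            cur = max(x, cur + x)
            best = max(best, cur)
        return best

    if __debug__:  # cheap insurance during development; stripped by `python -O`
        import random
        random.seed(1)
        for _ in range(200):
            xs = [random.randint(-5, 5) for _ in range(random.randint(1, 12))]
            assert max_subarray_clever(xs) == max_subarray_brute(xs)
    ```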

  15. This seems to be related to the tradeoffs between “engineered” and “organic” systems.

    It might be noted that a system that is “robust enough” in general, but has one significant use with very high “efficiency”, can be attractive in some cases. E.g., a design for a multi-function tool might sacrifice 5% efficiency in 99 uses for a 100% efficiency boost in one use. Some users would choose a balanced-efficiency multi-tool, but other users would choose the tool that is substantially better in one use, if 95% of the mediocre performance is good enough elsewhere. The alternative might be to have the somewhat more robust multi-tool plus a separate tool for the alternate design’s efficient case (whether a cheap one of comparable or slightly better efficiency or an expensive one with substantially greater efficiency).

  16. Let me give a couple of examples from our recent work on Stan.

    1. Robustness to scale. If you have a regression problem, the scale of the problem refers to the size of the predictors and hence the size of the coefficients. The efficient thing to do is assume everything comes in prescaled to unit size. The robust thing to do is to let people specify income in dollars, thousands of dollars or whatever, or specify weight in ounces or tons, and then adapt to the scale of the problem. Adaptation takes time, so it’s less efficient. On the other hand, it lets your regression estimates be much more robust.

    1.b. This then engendered another round of decision making about how robust to make the estimates of scale. Here, robustness comes in one way from having more warmup iterations, but more warmup iterations are slower overall. Robustness comes in another way from regularizing. If we estimate covariance, too, we have to be much more careful to regularize our scales, because there’s so much more we’re estimating and linear algebra operations are very touchy. Robustness also comes from forgetting the past: samples far from the typical set aren’t so useful for regularization, so you want to forget them to converge to a better estimate faster. But then you have less data. It’s all tradeoffs.

    2. Bounds checking, etc. If we want to provide error messages rather than crashing with a segfault, we need to check that indices are in bounds before using them. The problem is that this can be a significant bottleneck if the only thing the loop is doing is an addition or a multiply-add (it introduces new branch points and the need to predict them, and also makes compilers less likely to inline because the code’s bigger and more complex). The problem’s even bigger in testing things like positive definiteness of a precision or covariance matrix, which is expensive (O(k^3) for k dimensions). But if you provide a non-positive-definite precision matrix, the samples will be wrong and there may not be any other indication of the problem.
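
    For the positive-definiteness point, here is a generic sketch of the kind of check being weighed (plain NumPy, not Stan’s actual implementation): a Cholesky factorization costs O(k^3), but it reports a bad matrix up front instead of letting it silently corrupt the samples.

    ```python
    # Generic sketch (NumPy, not Stan's code): a Cholesky factorization is an
    # O(k^3) way to detect a matrix that is not positive definite before use.
    import numpy as np

    def check_positive_definite(m, name="covariance"):
        try:
            np.linalg.cholesky(m)  # raises LinAlgError if m is not positive definite
        except np.linalg.LinAlgError as err:
            raise ValueError(f"{name} matrix is not positive definite") from err

    good = np.array([[2.0, 0.5], [0.5, 1.0]])
    bad = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues 3 and -1

    check_positive_definite(good)  # passes silently
    try:
        check_positive_definite(bad)
    except ValueError as e:
        print(e)  # a clear error message instead of silently wrong samples
    ```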
