Cop with a mop

Yesterday I was at a wedding, and a vase broke in the aisle shortly before the bridal party was to enter. Guests quickly picked up the pieces, but the vase left a pool of water on the hard floor.

A security guard ran (literally) for a mop and cheerfully picked up the water. He could have easily stood in the corner and said that mopping floors is not his job. And if he were guarding a jewelry store, it would be inappropriate for him to leave his post to get a mop. But his presence at the wedding was a formality, presumably a venue requirement, and no one was endangered by his fetching a mop. There was more danger of someone slipping on a wet floor.

I enjoy seeing anyone do their job with enthusiasm, doing more than the minimum required. Over-zealous people can cause problems, but I’d much rather deal with such problems than deal with people passively putting in their time.

Why isn’t CPU time more valuable?

Here’s something I find puzzling: why isn’t CPU time more valuable?

I first thought about this when I was working for MD Anderson Cancer Center, maybe around 2002. Our research in adaptive clinical trial methods required bursts of CPU time. We might need hundreds of hours of CPU time for a simulation, then nothing while we figured out what to do next, then another hundred hours to run a modification.

We were always looking for CPU resources, and we installed Condor to take advantage of idle PCs, something like the SETI@home or GIMPS projects. Then we had CPU power to spare, sometimes. What could we do between simulations that was worthwhile but not urgent? We didn’t come up with anything.

Fast forward to 2019. You can rent CPU time from Amazon for about 2.5 cents per hour. To put it another way, it’s about 300 times cheaper per hour to rent a CPU than to hire a minimum wage employee in the US. Surely it should be possible to think of something for a computer to do that produces more than 2.5 cents per CPU hour of value. But is it?

Well, there’s cryptocurrency mining. How profitable is that? The answer depends on many factors: which currency you’re mining and its value at the moment, what equipment you’re using, what you’re paying for electricity, etc. I did a quick search, and one person said he sees a 30 to 50% return on investment. I suspect that’s high, but we’ll suppose for the sake of argument there’s a 50% ROI [1]. That means you can make a profit of 30 cents per CPU day.
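The arithmetic behind these figures is easy to check. The sketch below uses the 2.5 cents per hour rental price and the 50% ROI supposed above. The US federal minimum wage of $7.25 per hour is my assumption; the figure is not stated in the post.

```python
# Back-of-the-envelope check of the CPU rental figures quoted above.
CPU_COST_PER_HOUR = 0.025   # dollars, the rental price quoted in the post
MIN_WAGE_PER_HOUR = 7.25    # dollars, US federal minimum wage (assumption, not from the post)
ROI = 0.50                  # the optimistic mining return supposed in the post

wage_ratio = MIN_WAGE_PER_HOUR / CPU_COST_PER_HOUR
cost_per_cpu_day = CPU_COST_PER_HOUR * 24
profit_per_cpu_day = cost_per_cpu_day * ROI

print(f"minimum wage / CPU cost: {wage_ratio:.0f}x")
print(f"cost per CPU day: ${cost_per_cpu_day:.2f}")
print(f"profit per CPU day at 50% ROI: ${profit_per_cpu_day:.2f}")
```

The ratio comes out near 290, which is where "about 300 times cheaper" comes from, and a CPU day costs 60 cents, so a 50% return is 30 cents.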

Can we not think of anything for a CPU to do for a day that returns more than 30 cents profit?! That’s mind-boggling for someone who can remember when access to CPU power was a bottleneck.

Sometimes computer time is very valuable. But the value of surplus computer time is negligible. I suppose it all has to do with bottlenecks. As soon as CPU time isn’t the bottleneck, its value plummets.

Update: According to the latest episode of the Security Now podcast, it has become unprofitable for hackers to steal CPU cycles in your browser for crypto mining, primarily because of a change in Monero. Even free cycles aren’t worth using for mining! Mining is only profitable on custom hardware.

***

[1] I imagine this person isn’t renting time from Amazon. He probably has his own hardware that he can run less expensively. But that means his profit margins are so thin that it would not be profitable to rent CPUs at 2.5 cents an hour.

International internet privacy law

Scott Hanselman interviewed attorney Gary Nissenbaum in show #647 of Hanselminutes. The title was “How GDPR is effecting the American Legal System.”

Can Europe pass laws constraining American citizens? Didn’t we settle that question in 1776, or at least by 1783? And yet it is inevitable that European law affects Americans. And in fact Nissenbaum argues that every country has the potential to pass internet regulation affecting citizens of every other country in the world.

Hanselman: Doesn’t that imply that we can’t win? There’s two hundred and something plus countries in the world and if any European decides to swing by a web site in Djibouti now they’re going to be subject to laws of Europe?

Nissenbaum: I’ll double down on that. It implies that any country that has users of the internet can create a more stringent law than even the Europeans, and then on the basis of that being the preeminent regulatory body of the world, because it’s a race to who can be the most restrictive. Because the most restrictive is what everyone needs to comply with.

So if Tanzania decides that it is going to be the most restrictive country in terms of the laws … that relate to internet use of their citizens, theoretically, all web sites around the world have to be concerned about that because there are users that could be accessing their web site from Tanzania and they wouldn’t even know it.

Will the “world wide web” someday not be worldwide at all? There has been speculation, for example, that we’ll eventually have at least two webs, one Chinese and one non-Chinese. The web could tear into a lot more pieces than that.

As Nissenbaum says toward the end of the podcast:

If anyone assumes there’s a simple way of handling this, they’re probably wrong. It is complicated, and you just have to live with that, because that’s the world we’re in.

Pareto’s 80-20 rule

Pareto’s 80-20 rule says that 80% of your results often come from 20% of your effort. Maybe 80% of your profit comes from 20% of your customers, or maybe 80% of the bugs in your software are removed in the first 20% of the time you spend debugging.

The rule is named after Italian economist Vilfredo Pareto, who observed that 80% of his country’s land belonged to 20% of its population. The exact ratio of 80-20 isn’t important, though it is surprisingly common. The same principle applies whenever a large majority of effects come from a small number of causes.

The 80-20 rule, or Pareto principle, is startling the first time you hear it. It suggests you can be a lot more productive by focusing your effort where it does the most good. For example, there may be 100,000 to 1,000,000 words in English, depending on how you count them. But you could be pretty fluent in English by knowing the 1,000 most common words.

The thousand most frequently used words in any language are far more important than all the rest combined. Studying these words first makes much more sense than a uniformitarian approach, going through a dictionary in alphabetic order on the assumption that all words are equally important.
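To get a feel for how lopsided word usage could be, here is a minimal sketch. It assumes a hypothetical vocabulary of 100,000 words in which the word of rank r is used with frequency proportional to 1/r. That idealized Zipf-like model is my assumption, not a measurement of English, and the helper name top_share is made up for the illustration.

```python
def top_share(weights, k):
    """Fraction of the total weight carried by the k largest weights."""
    ranked = sorted(weights, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical vocabulary: frequency of the rank-r word proportional to 1/r
# (an idealized Zipf-like assumption, not real English data).
vocabulary_size = 100_000
zipf_weights = [1.0 / rank for rank in range(1, vocabulary_size + 1)]

# The "uniformitarian" alternative: every word equally important.
uniform_weights = [1.0] * vocabulary_size

print(top_share(zipf_weights, 1000))     # roughly 0.62 under this assumption
print(top_share(uniform_weights, 1000))  # 0.01
```

Under this toy model the top 1% of words accounts for well over half of all word occurrences, compared with 1% if all words were used equally often.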

I’ve thought about the Pareto principle off and on for many years. When I bring it up for discussion, people are often defensive, bringing up the same objections every time.

Objections

The most common objection is the recursive argument. If you could be more effective by focusing on the 20% that’s most important, then you should do that again: focus on the 20% of the 20% that’s most important. Apply this argument repeatedly and you can be infinitely productive with no effort.

The recursive argument takes the “80” and “20” of the 80-20 rule too literally. The point is not the exact ratios. The point is that return on effort invested is not uniformly distributed. In fact, it’s often far from uniformly distributed. I prefer the term Pareto principle to “80-20 rule” just because it does not reference particular numbers that could distract from the general principle.

Could you apply a Pareto principle recursively to English words, say by focusing on the 200 most common words? In fact you could. But that doesn’t mean that you could keep doing this repeatedly, learning only the most common word (“the”) and declaring yourself fluent in English. This doesn’t negate the fact that the importance of English words is very unevenly distributed.

Another objection is the completionist argument. It says that everything has to be done, so the fact that you get less return on some things than others doesn’t matter. For example, the letters E, T, and A appear about 100 times as often as J, Q, and Z. That doesn’t mean you could leave J, Q, and Z off your keyboard. On the other hand, it does mean that you might design a keyboard so that E, T, and A are easier to reach than J, Q, and Z. And Samuel Morse was smart to assign his shortest codes to the most frequently used letters. [1]
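As a quick illustration of the Morse example, the snippet below compares code lengths for the six letters just mentioned. It uses the modern International Morse code, which is an assumption on my part; Morse's original American code differs for some letters.

```python
# International Morse code for the six letters mentioned above
# (assumption: the modern international code, not Morse's original).
MORSE = {
    "E": ".", "T": "-", "A": ".-",
    "J": ".---", "Q": "--.-", "Z": "--..",
}

common, rare = "ETA", "JQZ"
for letter in common + rare:
    print(letter, MORSE[letter], len(MORSE[letter]))

longest_common = max(len(MORSE[c]) for c in common)
shortest_rare = min(len(MORSE[c]) for c in rare)
print(longest_common, shortest_rare)  # 2 4
```

Every one of the common letters takes at most two symbols, while each of the rare letters takes four.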

A final objection is the ignorance argument: we simply don’t know what the most effective 20% will be beforehand. This is a serious objection, and it should temper our optimism regarding the Pareto principle. If a salesman knew which 20% of his prospects were going to buy, he should just sell to them. But of course he doesn’t know ahead of time who those 20% will be. On the other hand, he has some idea who is likely to buy (and how much they may buy) and doesn’t approach prospects randomly.

These objections take the Pareto principle to extremes to justify disregarding it. Since you can’t repeatedly apply it indefinitely, there must be nothing to it. Or if you can’t completely eliminate the least productive work, you should treat everything equally. Or if you don’t have absolute certainty regarding what’s most important, you shouldn’t consider what’s likely to be most important.

Applications

Despite the objections above, it is true that returns on effort are often very unevenly distributed. There’s a common tendency to underestimate the variance [2]. We might have a rough idea how effective a list of possible actions would be, and maybe imagine that the most effective choice would be ten times better than the least effective choice, but in fact the ratio might be a hundred to one or even a thousand to one [3]. Somehow we mentally compress these ratios, maybe on something like a logarithmic scale.

So one key to taking advantage of the Pareto principle is simply to keep in mind that something like the Pareto principle might hold. You’re not likely to find Pareto rules if you don’t think they exist.

Another key is to be honest with ourselves regarding how effective we want to be. Maybe the most effective thing to do is something we simply don’t want to do. If so, we can either make a principled decision to not do what we know to be more effective, or get over our sloth.

I mentioned ignorance above. “Uncertainty” is a more helpful word than “ignorance” here because we’re not often completely ignorant. We usually have some idea which actions are more likely to be effective. Data can help. Start by using whatever information or intuition you have, and update it as you gather data.

This could be a formal Bayesian process if you have quantifiable data. Or it could be as simple as just trying something. If it works, try it again. If not, try something different. You may be able to bootstrap this “play the winner” strategy until you have enough data to be more formal about making decisions.
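Here is a minimal sketch of the informal “play the winner” rule just described: repeat an action after a success, move to the next one after a failure. The success probabilities are hypothetical inputs for a simulation; in practice you would not know them, which is the whole point.

```python
import random

def play_the_winner(success_probs, rounds, seed=0):
    """Simulate play-the-winner and return how often each action was tried.

    success_probs holds hypothetical, normally unknown, success rates.
    """
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    current = 0
    for _ in range(rounds):
        counts[current] += 1
        succeeded = rng.random() < success_probs[current]
        if not succeeded:
            # It didn't work, so try something different next time.
            current = (current + 1) % len(success_probs)
    return counts

# An action that never works is tried once, then abandoned for one that always works.
print(play_the_winner([0.0, 1.0], rounds=10))  # [1, 9]
print(play_the_winner([0.2, 0.9], rounds=1000))
```

The rule needs no bookkeeping beyond remembering the last outcome, yet it steers most of the effort toward the more effective action. The counts it produces are exactly the kind of data a more formal Bayesian update could start from.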

***

[1] How well does Morse code symbol length correspond to frequency? I looked into that here.

[2] I have a friend who has helped me with this. He will suggest I do X, and I agree, but say I’d rather do Y. Then he will reply with something like “Sure, you could do that. But X could be a thousand times more effective. It’s up to you.” I’ve done the same for others. It’s easier to see someone else’s decisions objectively than your own.

[3] This is not an exaggeration. I’ve seen this, for example, in software optimization. Some changes might make 1,000x more of a difference than others.

Objectives and constraints

Objectives and constraints are symmetrical in a mathematical sense but are asymmetrical in a psychological sense. By taking dual formulations, you can reverse the mathematical role of objectives and constraints, but in application objectives are more obvious than constraints.

In the question “What is the minimum value of x² over the interval [1, 5]?” the function f(x) = x² is the objective function and 1 ≤ x ≤ 5 is the constraint. If someone says the minimum is 0, they’ve minimized the objective function but ignored the constraint. This is clear in such a simple problem, but failure to consider constraints can be much more subtle.
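The toy problem can be written out in a few lines. This sketch relies on the fact that x² is convex, so the constrained minimizer is simply the feasible point closest to the unconstrained minimizer 0. The function name is made up for the illustration.

```python
def minimize_square(lower, upper):
    """Minimize f(x) = x**2 subject to lower <= x <= upper."""
    unconstrained_minimizer = 0.0
    # Because x**2 is convex, clamping the unconstrained minimizer
    # to the interval gives the constrained minimizer.
    x = min(max(unconstrained_minimizer, lower), upper)
    return x, x * x

print(minimize_square(1.0, 5.0))   # (1.0, 1.0): the constraint is active
print(minimize_square(-2.0, 5.0))  # (0.0, 0.0): the constraint is inactive
```

On [1, 5] the answer is 1, attained at x = 1. The answer 0 is only right when the interval happens to contain 0.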

Objectives tend to be easily quantifiable—maximize profit, minimize energy consumption, etc.— but constraints tend to be less quantifiable—the solution has to be testable and maintainable, has to be legal, has to be something people will buy or vote for, etc.

When children ask “Why don’t you just …” it’s because they see a way to improve some objective, but the “just” part shows that they are either completely unaware of a relevant constraint or are unaware of how difficult it would be to overcome the constraint. As you mature, you become aware of more constraints. You realize that things that seem grossly suboptimal are actually close to optimal when you consider the necessary constraints. There may be room for improvement, but not as much as you imagined and at a higher cost.

Big opportunities open up when constraints change. Maybe an idea was abandoned because it would require more calculation than anyone could carry out by hand, and now’s the time to revisit it. Or maybe an idea was never developed because it would require instantaneous communication between people at multiple points on the globe. No problem now.

In both the examples above, a constraint was relaxed: computation and communication have gotten far less expensive. Increased constraints create opportunities as well. When the price of something goes up, its alternatives become more economical by comparison. Whether an oil field is worth developing, for example, depends on the current price of oil.

If I ask “Why hasn’t someone done this before?” I’m skeptical if the answer is “Because I’m smarter than everyone else who has tried.” But if the answer is “Because constraints have changed” then I’m much more receptive.

Related post: Boundary conditions are the hard part

Dividing projects into math, statistics, and computing

If you’ve read this blog for long, you know that my work is a combination of math, statistics, and computing.

I was looking over my records and tried to see how my work divides into these three areas. In short, it doesn’t.

The boundaries between these areas are fuzzy or arbitrary to begin with, but a few projects fell cleanly into one of the three categories. However, 85% of my income has come from projects that involve a combination of two areas or all three areas.

If you calculate a confidence interval using R, you could say you’re doing math, statistics, and computing. But for the accounting above I’d simply call that statistics. When I say a project uses math and computation, for example, I mean it requires math outside what is typical in programming, and programming outside what is typical in math.

Example of the bike shed principle

Celebration, Florida town seal

One of the case studies in Michael Bierut’s book How to is the graphic design for the planned community Celebration, Florida. The logo for the town’s golf course is an illustration of the bike shed principle.

C. Northcote Parkinson observed that it is easier for a committee to approve a nuclear power plant than a bicycle shed. Nuclear power plants are complex, and no one on a committee presumes to understand every detail. Committee members must rely on the judgment of others. But everyone understands bicycle sheds. Also, questions such as what color to paint the bike shed don’t have objective answers. And so bike sheds provoke long discussions.

People argue about bike sheds because they understand bike sheds. Bierut said something similar about the Celebration Golf Club logo, which features a silhouette of a golfer.

Designing the graphics for Celebration’s public golf club was much harder than designing the town seal. It took me some time to realize why: none of our clients were Schwinn-riding, ponytailed girls [as in the town seal], but most of them were enthusiastic golfers. The silhouette on the golf club design was refined endlessly as various executives demonstrated their swings in client meetings.

Image credit: By Source, Fair use, https://en.wikipedia.org/w/index.php?curid=37643922

Natural growth

Interesting passage from Small is Beautiful: Economics as if People Mattered by E. F. Schumacher:

Nature always, so to speak, knows where and when to stop. There is a measure in all natural things—in their size, speed, or violence. As a result, the system of nature, of which man is a part, tends to be self-balancing, self-adjusting, self-cleansing. Not so with technology, or perhaps I should say: not so with man dominated by technology and specialization. Technology recognizes no self-limiting principle …

We speak of natural growth more often than natural limits to growth. Maybe we should consider the latter more often.

Schumacher’s book was written in 1973 and seems to embody some of the hippie romanticism of its day. That does not make its arguments right or wrong, but it shows what some of the author’s influences were.

The book’s back cover has an endorsement describing Schumacher as “eminently practical, sensible, … versant in the subtleties of large-scale business management …” I haven’t read the whole book, only parts here and there, but the romantic overtones stand out more to me, maybe because they contrast more with the contemporary atmosphere. When the book was published, maybe the pragmatic overtones stood out more.

Optimal team size

Kevlin Henney’s keynote at GOTO Copenhagen this year discussed how project time varies as a function of the number of people on the project. The most naive assumption is that the time is inversely proportional to the number of people. That is

t = W/n

where t is the calendar time to completion, W is a measure of how much work is to be done, and n is the number of people. This assumes everything on the project can be done in parallel. Nobody waits for anybody else.

The next refinement is to take into account the proportion of work that can be done in parallel. Call this p. Then we have

t = W[1 – p(n-1)/n].

If everything can be done in parallel, p = 1 and t = W/n as before. But if nothing can be done in parallel, p = 0, and so t = W. In other words, the total time is the same whether one person is on the project or more. This is essentially Amdahl’s law.

With the equation above, adding people never slows things down. And if p > 0, every additional person helps at least a little bit.
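One way to see this is to rearrange the formula. This is only algebra on the equation above, with no new assumptions:

```latex
t = W\left[1 - \frac{p(n-1)}{n}\right] = W\left[(1 - p) + \frac{p}{n}\right]
```

The serial part W(1 - p) does not depend on n, while the parallel part Wp/n shrinks as n grows. So t decreases toward W(1 - p) as people are added but never goes below it.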

Next we add a term to account for communication cost. Assume communication costs are proportional to the number of communication paths, n(n – 1)/2. Call the proportionality constant k. Now we have

t = W[1 – p(n-1)/n + kn(n-1)/2].

If k is small but positive, then at first adding more people causes a project to complete sooner. But beyond some optimal team size, adding more people causes the project to take longer.
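To see the optimum appear, here is a small sketch that evaluates the formula over a range of team sizes. The values p = 0.9 and k = 0.001 are illustrative assumptions of mine, not figures from the talk.

```python
def project_time(n, work=1.0, p=0.9, k=0.001):
    """Calendar time t = W[1 - p(n-1)/n + k n(n-1)/2] from the text.

    The defaults for work, p, and k are made-up values for illustration.
    """
    return work * (1 - p * (n - 1) / n + k * n * (n - 1) / 2)

team_sizes = range(1, 31)
best_n = min(team_sizes, key=project_time)
print(best_n, round(project_time(best_n), 3))  # 10 0.235

# With no communication cost (k = 0), the largest team is always fastest.
print(min(team_sizes, key=lambda n: project_time(n, k=0.0)))  # 30
```

With these numbers the project time falls from 1 with one person to about 0.235 with ten people, then rises again. Even a small k eventually dominates because the communication term grows like n² while the gain from parallelism levels off.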

Of course none of this is exact. Project time estimation doesn’t follow any simple formula. Think of these equations more as rough guides or metaphors. It’s certainly true that beyond a certain size, adding more people to a project can slow the project down. Kevlin gave examples of projects that were put back on track by reducing the number of people working on them.

My quibble with the equation above is that I don’t think the cost of more people is primarily communication. Communication paths in a real project are not the simple trees of org charts, but neither are they complete graphs. And if the problem were simply communication, then improved communication would mitigate the cost of adding people to a project, though I imagine it hardly does.

I think the cost of adding people to a project has more to do with Parkinson’s Law, which says that people make work for each other. (The aphorism form of Parkinson’s Law says that work expands to fill the time allowed. But the eponymous book explains why work expands, and it is in part because people make extra work for each other.)

Dust jacket of the book Parkinsons Law and Other Studies in Administration

I wrote about a similar theme in the blog post Maybe you only need it because you have it. Here’s the conclusion of that post:

Suppose a useless project adds staff. These staff need to be managed, so they hire a manager. Then they hire people for IT, accounting, marketing, etc. Eventually they have their own building. This building needs security, maintenance, and housekeeping. No one questions the need for the security guard, but the guard would not have been necessary without the original useless project.

When something seems absolutely necessary, maybe it’s only necessary because of something else that isn’t necessary.