More stability, less stress

It’s been eight years since I started my consulting business. Two of the things I love about having my own business are the stability and the reduced stress. This may sound like a joke, but I’m completely serious.

Having a business is ostensibly less stable and more stressful than having a salaried job, but at a deeper level it can be more stable and less stressful.

If you are an employee, you have one client. If you lose that client, you lose 100% of your income. If you have a business with a dozen clients, losing a client or two at the same time is disappointing, but it’s not devastating.

As for stress, I prefer the stress of owning a business to the stresses of employment. My net stress level dropped when I went out on my own. My sleep, for example, improved immediately.

At first I never knew where the next project was coming from. But I found this less stressful than office politics, questioning the value of my work, a lack of correlation between my efforts and my rewards, etc.

If you’re thinking of striking out on your own, I wish you well. Here is some advice I wrote a few years ago that you may find helpful.

Simultaneous projects

I said something to my wife this evening to the effect that it’s best for employees to have one or at most two projects at a time. Two is good because you can switch off when you’re tired of one project or if you’re waiting on input. But with three or more projects you spend a lot of time task switching.

She said “But …” and I immediately knew what she was thinking. I have a lot more than two projects going on. In fact, I would have to look at my project tracker to know exactly how many projects I have going on right now. How does this reconcile with my statement that two projects is optimal?

Unless you’re doing staff augmentation contracting, consulting work is substantially different from salaried work. For one thing, projects tend to be smaller and better defined.

Also consultants, at least in my experience, spend a lot of time waiting on clients, especially when the clients are lawyers. So you take on more work than you could handle if everyone wanted your attention at once. At least you work up to that if you can. You balance the risk of being overwhelmed against the risk of not having enough work to do.

Working for several clients in a single day is exhausting, but that’s usually not necessary. My ideal is to do work for one or two clients each day, even if I have a lot of clients who are somewhere between initial proposal and final invoice.

Opposite of the Peter Principle


The Peter Principle is an idea developed by Laurence J. Peter and expanded into a book coauthored with Raymond Hull in 1969. It says that people rise to their level of incompetence. According to the Peter Principle, competent people are repeatedly promoted until they reach a level where they’re not bad enough to fire but not good enough to promote.

I haven’t thought about the Peter Principle in a while, but I was reminded of it when I was reading One Giant Leap and was struck by this line:

He was the opposite of the Peter Principle.

What a great thing to have someone say about you. So what was the context of that line?

Jane Tindall said it about her husband Bill. The title of that chapter in One Giant Leap is “The Man Who Saved Apollo.” The author, Charles Fishman, is saying indirectly that Bill Tindall was the man who saved Apollo by getting the program’s software development effort on track. The previous chapter, “The Fourth Crew Member,” explained how Apollo’s guidance computer, primitive as its hardware was by today’s standards, was absolutely critical to the missions.

Here’s the paragraph containing the line above.

By 1966 Tindall had had years of management experience; one engineer who worked for him said that Tindall liked remaining the deputy in the divisions where he worked because it gave him more actual ability to get things done, more maneuvering room, and considerably less bureaucratic hassle. Said his wife, Jane, “He was the opposite of the Peter Principle.” [1] Tindall had the ability and experience to absorb, understand, and sort out serious technical problems, and that ability earned him the respect of his colleagues, even when they didn’t get the decision they wanted.


[1] No one used the term “Peter Principle” during the Apollo program because Dr. Peter had not yet coined the term. The quote from Jane Tindall came from Fishman interviewing her in 2016.

Scaling up and down

There’s a worn-out analogy in software development that you cannot build a skyscraper the same way you build a dog house. The idea is that techniques that will work on a small scale will not work on a larger scale. You need more formality to build large software systems.

The analogy is always applied in one direction: up. It’s always an exhortation to use techniques appropriate for larger projects.

But the analogy works in the other direction as well: it’s inappropriate to build a dog house the same way you’d build a skyscraper. It would be possible to build a dog house the way you’d build a skyscraper, but it would be very expensive. Amateur carpentry methods don’t scale up, but professional construction methods don’t scale down economically.

Bias for over-engineering

There’s a bias toward over-engineering because it works, albeit inefficiently, whereas under-engineering does not. You can use a sledgehammer to do a hammer’s job. It’ll be clumsy, and you might hurt yourself, but it can work. And there are tasks where a hammer just won’t get the job done.

Another reason for the bias toward over-engineering is asymmetric risk. If an over-engineered approach fails, you’ll face less criticism than if a simpler approach fails. As the old saying goes, nobody got fired for choosing IBM.

Context required

Simple solutions require context to appreciate. If you do something simple, you’re open to the criticism “But that won’t scale!” You have to defend your solution by explaining that it will scale far enough, and that it avoids costs associated with scaling further than necessary.

Suppose a group is debating whether to walk or drive to lunch. Someone advocating driving requires less context to make his point. He can simply say “Driving is faster than walking,” which is generally true. The burden is on the person advocating walking to explain why walking would actually be faster under the circumstances.

Writing prompt

I was using some database-like features in Emacs org-mode this morning, and that’s what prompted me to write this post. I can just hear someone say “That won’t scale!” I often get this reaction when I write about a simple, low-tech way to do something on a small scale.

Using a text file as a database doesn’t scale. But I have 88 rows, so I think I’ll be OK. A relational database would be better for storing millions of records, but that’s not what I’m working on at the moment.
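
To make “doesn’t scale, but doesn’t need to” concrete, here’s a hypothetical sketch of the kind of thing I mean. The file name and columns are made up for illustration; the point is that at this scale a text file and a short loop are all the database you need.

    # Hypothetical sketch: a small tab-separated text file used as a "database".
    # The file name and columns are made up for illustration.
    from pathlib import Path

    rows = []
    for line in Path("projects.txt").read_text().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        client, status, hours = line.split("\t")
        rows.append({"client": client, "status": status, "hours": float(hours)})

    # A "query": total hours on active projects.
    print(sum(r["hours"] for r in rows if r["status"] == "active"))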


What does CCPA say about de-identified data?

The California Consumer Privacy Act, or CCPA, takes effect January 1, 2020, less than six months from now. What does the act say about using deidentified data?

First of all, I am not a lawyer; I work for lawyers, advising them on matters where law touches statistics. This post is not legal advice, but my attempt to parse the CCPA, ignoring details best left up to others.

In my opinion, the CCPA is more vague than HIPAA, but not as vague as GDPR. It contains some clear language about using deidentified data, but that language is scattered throughout the act.

Deidentified data

Where to start? Section 1798.145 says

The obligations imposed on businesses by this title shall not restrict a business’s ability to … collect, use, retain, sell, or disclose consumer information that is deidentified or in the aggregate consumer information.

The act discusses identifiers and more importantly probabilistic identifiers, a topic I wrote about earlier. This term is potentially very broad. See my earlier post for a discussion.

Aggregate consumer information

So what is “aggregate consumer information”? Section 1798.140 says that

For purposes of this title: (a) “Aggregate consumer information” means information that relates to a group or category of consumers, from which individual consumer identities have been removed, that is not linked or reasonably linkable to any consumer or household, including via a device. “Aggregate consumer information” does not mean one or more individual consumer records that have been deidentified.

So aggregate consumer information is different from deidentified information.

Pseudonymization

Later on (subsection (s) of the same section) the act says

Research with personal information … shall be … (2) Subsequently pseudonymized and deidentified, or deidentified and in the aggregate, such that the information cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.

What? We’ve seen “in the aggregate” before, and presumably that has something to do with the use of “in the aggregate” above. But what does pseudonymized mean? Backing up to subsection (r) we have

“Pseudonymize” or “Pseudonymization” means the processing of personal information in a manner that renders the personal information no longer attributable to a specific consumer without the use of additional information, provided that the additional information is kept separately and is subject to technical and organizational measures to ensure that the personal information is not attributed to an identified or identifiable consumer.

That sounds a lot like deidentification to me, but slightly weaker. It implies that a company can retain the ability to re-identify an individual, as long as the means of doing so is “kept separately and is subject to technical and organizational measures.” I’m speculating here, but it seems like that might mean, for example, that a restricted part of a company might apply a secure hash function to data, while another part of the company sees the results and analyzes the data. Then again, the law says “pseudonymized and deidentified,” so who knows what that means. More on the confusion around pseudonymization here.
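
To make my speculation above concrete, here is a minimal sketch of what keyed hashing might look like. This is only my illustration, not anything the CCPA prescribes. The key plays the role of the “additional information” kept separately; without it, the tokens can’t easily be mapped back to individuals.

    # Hypothetical sketch of pseudonymization via a keyed hash (HMAC).
    # Nothing here is prescribed by the CCPA; the key stands in for the
    # "additional information" that must be kept separately.
    import hmac
    import hashlib

    secret_key = b"held-by-a-restricted-part-of-the-company"  # stored separately

    def pseudonymize(identifier: str) -> str:
        return hmac.new(secret_key, identifier.encode(), hashlib.sha256).hexdigest()

    record = {"name": "Alice Smith", "zip": "77030", "purchase": 42.17}
    record["name"] = pseudonymize(record["name"])  # analysts see only this token
    print(record)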

The CCPA was written and passed in a hurry with the expectation of being amended later, and it shows.

Update: The CCPA was amended on September 25, 2020 to say that if data is deidentified under HIPAA, it will be considered deidentified under CCPA. Consult an attorney for details.

Compliance

How can you know whether you comply with CCPA’s requirements for pseudonymizing, deidentifying, and aggregating data? A lawyer would have to tell you how the law applies to your situation. As I said above, I’m not a lawyer. But I can recommend lawyers working in this space. And I can work with your lawyer on the technical aspects: what methods are commonly used, how privacy risk is quantified, etc.


The cost of no costs

The reason businesses have employees rather than contracting out everything is to reduce transaction costs. If a company needs enough graphics work, they hire a graphic artist rather than outsourcing every little project, eliminating the need to evaluate bids, write contracts, etc. Some things are easier when no money has to change hands.

But some things are made more complicated because money does not change hands. In-house transactions don’t require monetary negotiation, but they require emotional and political negotiation. Discussions can become unnecessarily heated because things don’t have prices. Discussions drift toward arguments over what is and is not possible rather than discussions of trade-offs based on cost.

Cop with a mop

Yesterday I was at a wedding, and a vase broke in the aisle shortly before the bridal party was to enter. Guests quickly picked up the pieces, but the vase left a pool of water on the hard floor.

A security guard ran (literally) for a mop and cheerfully mopped up the water. He could have easily stood in the corner and said that mopping floors is not his job. And if he were guarding a jewelry store, it would be inappropriate for him to leave his post to get a mop. But his presence at the wedding was a formality, presumably a venue requirement, and no one was endangered by his fetching a mop. There was more danger of someone slipping on a wet floor.

I enjoy seeing anyone do their job with enthusiasm, doing more than the minimum required. Over-zealous people can cause problems, but I’d much rather deal with such problems than deal with people passively putting in their time.

Why isn’t CPU time more valuable?

Here’s something I find puzzling: why isn’t CPU time more valuable?

I first thought about this when I was working for MD Anderson Cancer Center, maybe around 2002. Our research in adaptive clinical trial methods required bursts of CPU time. We might need hundreds of hours of CPU time for a simulation, then nothing while we figured out what to do next, then another few hundred hours to run a modification.

We were always looking for CPU resources, and we installed Condor to take advantage of idle PCs, something like the SETI@home or GIMPS projects. Then we sometimes had CPU power to spare. What could we do between simulations that was worthwhile but not urgent? We didn’t come up with anything.

Fast forward to 2019. You can rent CPU time from Amazon for about 2.5 cents per hour. To put it another way, it’s about 300 times cheaper per hour to rent a CPU than to hire a minimum wage employee in the US. Surely it should be possible to think of something for a computer to do that produces more than 2.5 cents per CPU hour of value. But is it?

Well, there’s cryptocurrency mining. How profitable is that? The answer depends on many factors: which currency you’re mining and its value at the moment, what equipment you’re using, what you’re paying for electricity, etc. I did a quick search, and one person said he sees a 30 to 50% return on investment. I suspect that’s high, but we’ll suppose for the sake of argument there’s a 50% ROI [1]. That means you can make a profit of 30 cents per CPU day.
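
For what it’s worth, here’s the back-of-the-envelope arithmetic behind those numbers, assuming the 2019 AWS price above and the US federal minimum wage of $7.25 per hour.

    # Back-of-the-envelope arithmetic behind the figures above.
    cpu_cost_per_hour = 0.025                         # dollars, AWS circa 2019
    minimum_wage_per_hour = 7.25                      # US federal minimum wage

    print(minimum_wage_per_hour / cpu_cost_per_hour)  # ~290, i.e. "about 300 times cheaper"

    daily_cost = 24 * cpu_cost_per_hour               # 60 cents per CPU day
    roi = 0.50                                        # the generous 50% assumed above
    print(daily_cost * roi)                           # 0.30, i.e. 30 cents of profit per day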

Can we not think of anything for a CPU to do for a day that returns more than 30 cents of profit?! That’s mind-boggling for someone who can remember when access to CPU power was a bottleneck.

Sometimes computer time is very valuable. But the value of surplus computer time is negligible. I suppose it all has to do with bottlenecks. As soon as CPU time isn’t the bottleneck, its value plummets.

Update: According to the latest episode of the Security Now podcast, it has become unprofitable for hackers to steal CPU cycles in your browser for crypto mining, primarily because of a change in Monero. Even free cycles aren’t worth using for mining! Mining is only profitable on custom hardware.

***

[1] I imagine this person isn’t renting time from Amazon. He probably has his own hardware that he can run less expensively. But that means his profit margins are so thin that it would not be profitable to rent CPUs at 2.5 cents an hour.

International internet privacy law


Scott Hanselman interviewed attorney Gary Nissenbaum in show #647 of Hanselminutes. The title was “How GDPR is affecting the American Legal System.”

Can Europe pass laws constraining American citizens? Didn’t we settle that question in 1776, or at least by 1783? And yet it is inevitable that European law affects Americans. In fact Nissenbaum argues that every country has the potential to pass internet regulation affecting citizens of every other country in the world.

Hanselman: Doesn’t that imply that we can’t win? There’s two hundred and something plus countries in the world and if any European decides to swing by a website in Djibouti now they’re going to be subject to laws of Europe?

Nissenbaum: I’ll double down on that. It implies that any country that has users of the internet can create a more stringent law than even the Europeans, and then on the basis of that being the preeminent regulatory body of the world, because it’s a race to who can be the most restrictive. Because the most restrictive is what everyone needs to comply with.

So if Tanzania decides that it is going to be the most restrictive country in terms of the laws … that relate to internet use of their citizens, theoretically, all websites around the world have to be concerned about that because there are users that could be accessing their website from Tanzania and they wouldn’t even know it.

Will the “world wide web” someday not be worldwide at all? There has been speculation, for example, that we’ll eventually have at least two webs, one Chinese and one non-Chinese. The web could tear into a lot more pieces than that.

As Nissenbaum says toward the end of the podcast

If anyone assumes there’s a simple way of handling this, they’re probably wrong. It is complicated, and you just have to live with that, because that’s the world we’re in.


Pareto’s 80-20 rule


Pareto’s 80-20 rule says that 80% of your results often come from 20% of your effort. Maybe 80% of your profit comes from 20% of your customers, or maybe 80% of the bugs in your software are removed in the first 20% of the time you spend debugging.

The rule is named after Italian economist Vilfredo Pareto, who observed that 80% of his country’s land belonged to 20% of its population. The exact ratio of 80-20 isn’t important, though it is surprisingly common. The same principle applies whenever a large majority of effects come from a small number of causes.

The 80-20 rule, or Pareto principle, is startling the first time you hear it. It suggests you can be a lot more productive by focusing your effort where it does the most good. For example, there may be 100,000 to 1,000,000 words in English, depending on how you count them. But you could be pretty fluent in English by knowing the 1,000 most common words.

The thousand most frequently used words in any language are far more important than all the rest combined. Studying these words first makes much more sense than a uniformitarian approach, going through a dictionary in alphabetic order on the assumption that all words are equally important.
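
If you’d like to see this for yourself, a few lines of code will do it. The sketch below counts word frequencies in whatever plain text file you have handy (the file name is a placeholder) and reports how much of the text the 1,000 most common words cover; for ordinary English prose the answer is typically well over 80%.

    # Sketch: how much of a text do its most common words cover?
    # "corpus.txt" is a placeholder for any plain text file you have on hand.
    import re
    from collections import Counter

    words = re.findall(r"[a-z']+", open("corpus.txt").read().lower())
    counts = Counter(words)

    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(1000))
    print(f"The 1,000 most common words cover {100 * top / total:.1f}% of the text")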

I’ve thought about the Pareto principle off and on for many years. When I bring it up for discussion, people are often defensive, bringing up the same objections every time.

Objections

The most common objection is the recursive argument. If you could be more effective by focusing on the 20% that’s most important, then you should do that again: focus on the 20% of the 20% that’s most important. Apply this argument repeatedly and you can be infinitely productive with no effort.

The recursive argument takes the “80” and “20” of the 80-20 rule too literally. The point is not the exact ratios. The point is that return on effort invested is not uniformly distributed. In fact, it’s often far from uniformly distributed. I prefer the term Pareto principle to “80-20 rule” just because it does not reference particular numbers that could distract from the general principle.

Could you apply a Pareto principle recursively to English words, say by focusing on the 200 most common words? In fact you could. But that doesn’t mean that you could keep doing this repeatedly, learning only the most common word (“the”) and declaring yourself fluent in English. This doesn’t negate the fact that the importance of English words is very unevenly distributed.

Another objection is the completionist argument. It says that everything has to be done, so the fact that you get less return on some things than others doesn’t matter. For example, the letters E, T, and A appear about 100 times as often as J, Q, and Z. That doesn’t mean you could leave J, Q, and Z off your keyboard. On the other hand, it does mean that you might design a keyboard so that E, T, and A are easier to reach than J, Q, and Z. And Samuel Morse was smart to assign his shortest codes to the most frequently used letters. [1]
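
The letter frequencies above check out against commonly published figures for English text. The numbers below are approximate and vary by corpus, but the ratio is in the neighborhood of 100.

    # Rough check of the claim above, using approximate English letter
    # frequencies (percentages; exact figures vary by corpus).
    freq = {"E": 12.7, "T": 9.1, "A": 8.2, "J": 0.15, "Q": 0.10, "Z": 0.07}

    common = freq["E"] + freq["T"] + freq["A"]
    rare = freq["J"] + freq["Q"] + freq["Z"]
    print(common / rare)  # roughly 94, i.e. "about 100 times as often"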

A final objection is the ignorance argument: we simply don’t know what the most effective 20% will be beforehand. This is a serious objection, and it should temper our optimism regarding the Pareto principle. If a salesman knew which 20% of his prospects were going to buy, he should just sell to them. But of course he doesn’t know ahead of time who those 20% will be. On the other hand, he has some idea who is likely to buy (and how much they may buy) and doesn’t approach prospects randomly.

These objections take the Pareto principle to extremes to justify disregarding it. Since you can’t repeatedly apply it indefinitely, there must be nothing to it. Or if you can’t completely eliminate the least productive work, you should treat everything equally. Or if you don’t have absolute certainty regarding what’s most important, you shouldn’t consider what’s likely to be most important.

Applications

Despite the objections above, it is true that returns on effort are often very unevenly distributed. There’s a common tendency to underestimate the variance [2]. We might have a rough idea how effective a list of possible actions would be, and maybe imagine that the most effective choice would be ten times better than the least effective choice, but in fact the ratio might be a hundred to one or even a thousand to one [3]. Somehow we mentally compress these ratios, maybe on something like a logarithmic scale.

So one key to taking advantage of the Pareto principle is simply to keep in mind that something like it might hold. You’re not likely to find a Pareto rule if you don’t believe such rules exist.

Another key is to be honest with ourselves regarding how effective we want to be. Maybe the most effective thing to do is something we simply don’t want to do. If so, we can either make a principled decision not to do what we know to be more effective, or get over our sloth.

I mentioned ignorance above. “Uncertainty” is a more helpful word than “ignorance” here because we’re not often completely ignorant. We usually have some idea which actions are more likely to be effective. Data can help. Start by using whatever information or intuition you have, and update it as you gather data.

This could be a formal Bayesian process if you have quantifiable data. Or it could be as simple as just trying something. If it works, try it again. If not, try something different. You may be able to bootstrap this “play the winner” strategy until you have enough data to be more formal about making decisions.
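
As a deliberately over-simplified sketch of that “play the winner” idea, here’s what the loop might look like. The options and their payoff probabilities are made up; the point is just the strategy: repeat whatever worked last time, switch when it stops working.

    # Over-simplified "play the winner" sketch. The options and their
    # success probabilities are made up for illustration.
    import random
    from collections import Counter

    options = {"cold email": 0.05, "referral": 0.30, "conference talk": 0.15}

    def attempt(option):
        """Stand-in for trying an action in real life."""
        return random.random() < options[option]

    current = random.choice(list(options))
    wins = Counter()
    for _ in range(200):
        if attempt(current):
            wins[current] += 1                      # it worked: play the winner again
        else:
            current = random.choice(list(options))  # it didn't: try something else

    print(wins.most_common())  # most of the wins typically come from "referral"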

***

[1] How well does Morse code symbol length correspond to frequency? I looked into that here.

[2] I have a friend who has helped me with this. He will suggest I do X, and I agree, but say I’d rather do Y. Then he will reply with something like “Sure, you could do that. But X could be a thousand times more effective. It’s up to you.” I’ve done the same for others. It’s easier to see someone else’s decisions objectively than your own.

[3] This is not an exaggeration. I’ve seen this, for example, in software optimization. Some changes might make 1,000x more of a difference than others.