The HIPAA Privacy Rule provides two ways to determine that data has been deidentified: expert determination and Safe Harbor. In our work providing expert determination we often find that companies come to us with a lot of confusion about how Safe Harbor works. This post clarifies the 18 classes of information that must be excluded under Safe Harbor. Reach out for a free conversation to help orient (and minimize risk) around how the details apply to your company and your data.
The HIPAA Privacy Rule, in Section 164.514(b)(2), lists 18 categories of data that must be removed for data to be considered deidentified under the Safe Harbor provision.
The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:
(A) Names;
(B) All geographic subdivisions smaller than a State …
(C) All elements of dates (except year) for dates directly related to an individual, …
(D) Telephone numbers;
(E) Fax numbers;
(F) Electronic mail addresses;
(G) Social security numbers;
(H) Medical record numbers;
(I) Health plan beneficiary numbers;
(J) Account numbers;
(K) Certificate/license numbers;
(L) Vehicle identifiers and serial numbers, including license plate numbers;
(M) Device identifiers and serial numbers;
(N) Web Universal Resource Locators (URLs);
(O) Internet Protocol (IP) address numbers;
(P) Biometric identifiers, including finger and voice prints;
(Q) Full face photographic images and any comparable images; and
(R) Any other unique identifying number, characteristic, or code;
Obviously names should be removed, as well as information that would easily let you look up someone’s name, such as a telephone number. This accounts for the majority of the list. But several remaining items require more explanation.
Why no URLs?
For example, notice that no URLs are allowed. The rule is not limited to the URL of someone’s personal web site, though that is certainly disallowed. Safe Harbor prohibits any URLs. You cannot, for example, list browsing history, because studies have shown that it can be fairly easy to identify people from their browsing history. This may be surprising when you first run into it, but people search on things they find relevant: local events and businesses, their medical conditions, social media accounts of people they know, etc.
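To make the "no URLs at all" point concrete, here is a minimal Python sketch that strips URL-like strings from a free-text field. The function name, the placeholder text, and the pattern are all assumptions made for illustration. The pattern only catches strings starting with http://, https://, or www., so it would miss other forms and should not be taken as a complete Safe Harbor scrubber.

```python
import re

# Assumed pattern: anything starting with http://, https://, or www.
# up to the next whitespace. Real data will contain URL forms this misses.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def redact_urls(text: str, placeholder: str = "[URL REMOVED]") -> str:
    """Replace every URL-like substring with a fixed placeholder."""
    return URL_PATTERN.sub(placeholder, text)


note = "Patient visited https://example.com/asthma yesterday"
print(redact_urls(note))
```

Note that the sketch removes every match, not just URLs that look personal, which mirrors the rule: Safe Harbor does not ask you to judge which URLs are identifying.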
Two of the 18 identifiers were abbreviated above: geographic subdivisions and dates directly related to an individual. These are not simply identifiers; these sections are rules.
There’s a big problem in part (B) above. The provision applies to geographic subdivisions smaller than a state, so including a person’s state is OK. Then it says “The initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000.” But as it turns out, there is at most one sparse three-digit zip code in each state. So if you report the state along with 000, anyone who knows which three-digit zip code in that state is sparse can recover it. You might as well report the three-digit zip code.
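The zip code rule quoted above can be sketched in a few lines of Python. This is only an illustration of the rule as written: the function name, the lookup table, and its population figures are hypothetical, and a real table would have to be built from current Census data. Treating an unknown prefix as sparse is also an assumption, chosen to err on the side of caution.

```python
# Hypothetical population counts keyed by three-digit zip prefix.
# The prefixes and numbers are made up for illustration only.
ZIP3_POPULATION = {
    "123": 250_000,
    "456": 15_000,
}

POPULATION_THRESHOLD = 20_000  # from the rule: "20,000 or fewer people"


def generalize_zip(zip_code: str, zip3_population: dict[str, int]) -> str:
    """Return the three-digit prefix, or "000" if the prefix is sparse."""
    zip3 = zip_code.strip()[:3]
    # Assumption: a prefix missing from the table is treated as sparse.
    population = zip3_population.get(zip3, 0)
    if population <= POPULATION_THRESHOLD:
        return "000"
    return zip3


print(generalize_zip("12345", ZIP3_POPULATION))       # populous prefix kept
print(generalize_zip("45678-1234", ZIP3_POPULATION))  # sparse prefix masked
```

The sketch does nothing about the problem described above: if the output is stored next to the person's state, a masked 000 may still point to a single three-digit zip code.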
Dates of service
Like part (B) on geographic subdivisions, part (C) is a list of rules. Dates of service are only allowed to the year. Why?
As with URLs, dates of service may pose more of an identification risk than is immediately apparent. If dates of service can be cross-referenced with public data, this could lead to identifying individuals. This is not hypothetical: it has been demonstrated in practice. More on dates of service here.
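In code, reducing dates to the year might look like the following Python sketch. The record layout and field names are hypothetical, and the sketch covers only the "except year" part of the rule; the full text of part (C), abbreviated above, contains further conditions that it does not attempt to handle.

```python
from datetime import date


def generalize_dates(record: dict) -> dict:
    """Return a copy of the record with every date value reduced to its year."""
    return {
        key: (value.year if isinstance(value, date) else value)
        for key, value in record.items()
    }


# Hypothetical record; the field names are illustrative only.
visit = {
    "admission_date": date(2021, 3, 14),
    "discharge_date": date(2021, 3, 20),
    "diagnosis_code": "J45",
}
print(generalize_dates(visit))
```

Because `datetime` is a subclass of `date`, timestamp fields are reduced to the year as well. The original record is left unmodified, which makes it easier to keep the identified and deidentified copies separate.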
What if you really need to know information at a finer granularity than a year? Maybe you want to look for monthly trends in some data, for example. You cannot do that with data that falls under Safe Harbor. You could retain detailed date information using differential privacy as described here, but that would not fall under Safe Harbor; you would need expert determination.
Although the list above is mostly straightforward, there are difficulties. For example, the 18th rule says to remove “any other unique identifying number, characteristic, or code.” How do you know whether a characteristic is identifying? Not only that, there’s also an implicit 19th rule, the so-called “actual knowledge” rule:
The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.
How do you know whether information could be used alone or in combination with other information to identify an individual? The US Department of Health and Human Services gives some examples that it would consider unacceptable.
The alternative: Expert determination
If the restrictions of HIPAA Safe Harbor are incompatible with your business goals, or you wonder whether Safe Harbor would actually protect privacy in your context, there may be a way to retain the business value of your data while protecting individual privacy. The HIPAA Privacy Rule allows for this, provided an experienced expert determines that individual privacy is indeed protected. If you would like to speak to someone about HIPAA expert determination, let’s talk.