The regulatory distinction between identified and deidentified information is long-standing. The Federal Policy for the Protection of Research Subjects (Common Rule) expressly exempts research involving anonymous or deidentified information. Exemption 4 of the Common Rule applies to the following:
(4) Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects. (45 C.F.R. § 46.101(b)(4))
Similarly, the Health Insurance Portability and Accountability Act (HIPAA) Standards for Privacy of Individually Identifiable Health Information (Privacy Rule) (45 C.F.R. Parts 160, 164) applies only to “protected health information.” The Privacy Rule provides that “protected health information means individually identifiable health information” (45 C.F.R. § 164.501). Furthermore, “[h]ealth information that does not identify an individual and with respect to which there is no reasonable basis to believe that the information can be used to identify an individual is not individually identifiable health information” (45 C.F.R. § 164.514(a)).
Unlike the Common Rule, which describes the conditions for exemption based on lack of identifiability in general terms, the Privacy Rule goes into great detail about the requirements for deidentification. According to the Privacy Rule, there are two ways in which a covered entity may determine that information is deidentified. First, an expert in statistical and scientific methodologies may determine “that the risk is very small that the information could be used … to identify an individual who is a subject of the information” (45 C.F.R. § 164.514(b)(1)). Second, because of the difficulty and expense of obtaining expert consultation, a more prescriptive method of achieving deidentification also is provided in the Privacy Rule.
The Privacy Rule lists 17 specific provisions and one general provision regarding the types of identifiers that must be removed from health information before the information will be deemed deidentified. The following identifiers must be removed: (1) names; (2) geographical subdivisions smaller than a state except for the first three digits of a ZIP code; (3) all elements of dates (except year) that relate to birth date, admission date, and discharge date; (4) telephone numbers; (5) fax numbers; (6) e-mail addresses; (7) Social Security numbers; (8) medical record numbers; (9) health plan beneficiary numbers; (10) account numbers; (11) certificate or license numbers; (12) vehicle identifiers, including license-plate numbers; (13) device identifiers and serial numbers; (14) URLs (web locators); (15) Internet protocol (IP) address numbers; (16) biometric identifiers; (17) photographic and comparable images; and (18) any other unique identifying number, characteristic, or code (45 C.F.R. § 164.514(b)(2)(i)). Compliance with these deidentification specifications eliminates a variety of obligations of covered entities under the Privacy Rule, including providing a notice of privacy practices, requiring an authorization for uses other than treatment, payment, and health care operations (subject to exceptions, such as public health disclosures), and restricting use of the information beyond health care. The Privacy Rule also permits covered entities to use a limited data set for purposes of research, public health, or health care operations if the recipient of the data set enters into a data use agreement specifying that the recipient will only use the information for limited purposes (45 C.F.R. § 164.514(e)(3) and (4)). The limited data set may not include “direct identifiers of the individual or of relatives, employers, or household members of the individual” (45 C.F.R. § 164.514(e)(2)).
The impermissible “direct identifiers” include 16 of the 18 identifiers listed in the deidentification specifications mentioned earlier. The two categories of identifiers that may be included in a limited data set are dates, including date of birth and dates of service, and “any other unique identifying number, characteristic, or code.”
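The prescriptive removal method described above can be illustrated with a minimal sketch. The field names below (`zip_code`, `birth_date`, `other_unique_code`, and so on) are hypothetical stand-ins for the listed identifier categories, not terms from the regulation, and the sketch ignores any further conditions the regulation attaches to individual categories. It shows the mechanics only and is not a compliance tool.

```python
from datetime import date

# Hypothetical field names standing in for the listed identifier categories.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_beneficiary_number",
    "account_number", "license_number", "vehicle_identifier",
    "device_identifier", "url", "ip_address", "biometric_identifier",
    "photo", "other_unique_code",
}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date"}


def deidentify(record: dict) -> dict:
    """Return a copy of record with the listed identifier categories removed."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # categories that are removed outright
        if field in DATE_FIELDS:
            # Keep only the year, e.g. birth_date -> birth_year.
            cleaned[field.replace("_date", "_year")] = value.year
        elif field == "zip_code":
            # Keep only the first three digits of the ZIP code.
            cleaned["zip3"] = str(value)[:3]
        else:
            cleaned[field] = value
    return cleaned


# Illustrative record with made-up values.
record = {
    "name": "Jane Doe",
    "zip_code": "40202",
    "state": "KY",
    "birth_date": date(1961, 4, 12),
    "ssn": "000-00-0000",
    "diagnosis": "type 2 diabetes",
}
print(deidentify(record))
```

In this sketch the name and Social Security number are dropped, the ZIP code is truncated to three digits, and the birth date is reduced to its year, while the state and the clinical field pass through unchanged.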
Under the Privacy Rule, a covered entity may disclose protected information in a limited data set only if the recipient signs a data use agreement indicating that the information will be used only for limited purposes. In particular, the data use agreement must include the permitted uses and disclosures; indicate who is permitted to use and disclose the information; indicate that the recipient will not redisclose the information; provide that the recipient will use appropriate safeguards to prevent unapproved uses; provide that the recipient will report to the covered entity any use or disclosure not authorized by the data use agreement; provide that the recipient will ensure compliance with the agreement by any agents or subcontractors it uses; and provide that the recipient will not identify the information or contact the individuals (45 C.F.R. § 164.514(e)(4)).
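The required contents of a data use agreement listed above lend themselves to a simple checklist. The following sketch assumes a hypothetical representation in which each required provision is a short label; the labels and function names are illustrative and do not come from the regulation.

```python
# Hypothetical labels for the provisions a data use agreement must contain,
# following the order in which the text lists them.
REQUIRED_TERMS = (
    "permitted_uses_and_disclosures",
    "authorized_users",
    "no_redisclosure",
    "appropriate_safeguards",
    "report_unauthorized_use",
    "bind_agents_and_subcontractors",
    "no_identification_or_contact",
)


def missing_terms(agreement_terms):
    """List the required provisions that the agreement does not contain."""
    present = set(agreement_terms)
    return [term for term in REQUIRED_TERMS if term not in present]


def may_disclose_limited_data_set(agreement_terms, signed):
    """Disclosure is allowed only with a signed, complete agreement."""
    return signed and not missing_terms(agreement_terms)


draft = ["permitted_uses_and_disclosures", "authorized_users", "no_redisclosure"]
print(missing_terms(draft))
print(may_disclose_limited_data_set(draft, signed=True))
```

A draft agreement that omits any of the listed provisions fails the check even if the recipient has signed it.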
The deidentification and limited data-set provisions of the Privacy Rule differ sharply from the Common Rule in both degree of detail and substance. According to a guidance document issued by the Office for Human Research Protections (OHRP), private information or specimens are “[not] individually identifiable when they cannot be linked to specific individuals by the investigator(s) either directly or through coding systems” (OHRP 2008). Furthermore, research involving only coded private information does not involve human subjects if the investigator cannot “readily ascertain” the identity of the individual because the key has been destroyed before the research begins, the keyholder has agreed not to release the key to investigators under any circumstances, there are institutional review board (IRB)-approved written policies prohibiting release of the key until individuals are deceased, or there are other legal requirements prohibiting the release of the key to the investigators until the individuals are deceased.
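The four alternative conditions in the OHRP guidance described above amount to a disjunction: any one of them suffices. A minimal sketch, using hypothetical attribute names that paraphrase the conditions as stated in this section, might look like this:

```python
from dataclasses import dataclass


@dataclass
class CodedDataArrangement:
    """Hypothetical flags, one per condition described in the text."""
    key_destroyed_before_research: bool = False
    keyholder_agreed_never_to_release: bool = False
    irb_policy_bars_release_until_death: bool = False
    law_bars_release_until_death: bool = False


def identity_readily_ascertainable(arrangement: CodedDataArrangement) -> bool:
    """The investigator can readily ascertain identity unless a condition holds."""
    return not any((
        arrangement.key_destroyed_before_research,
        arrangement.keyholder_agreed_never_to_release,
        arrangement.irb_policy_bars_release_until_death,
        arrangement.law_bars_release_until_death,
    ))


arrangement = CodedDataArrangement(keyholder_agreed_never_to_release=True)
print(identity_readily_ascertainable(arrangement))
```

The sketch captures only the logical structure of the guidance as summarized here; whether a particular arrangement satisfies a condition remains a matter of regulatory interpretation.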
In its guidance, the OHRP recognized that it created a lower standard for deidentification under the Common Rule than exists under the Privacy Rule. “Therefore, some coded information, in which the code has been derived from identifying information linked to or related to the individual, would be individually identifiable under the Privacy Rule, but might not be individually identifiable under the [Common Rule]” (OHRP 2008). In the OHRP guidance, the Department of Health and Human Services (HHS) has explicitly acknowledged it has two different sets of rules regulating deidentification of health information for research. Notwithstanding the issue of whether deidentification is an adequate privacy strategy, the deidentification regulations of the Common Rule and the Privacy Rule are inexplicably and unjustifiably inconsistent. Although the Common Rule applies to other types of information besides health information, addressing the “welfare” of research subjects and not just privacy, HHS has not attempted to harmonize these important regulations (Rothstein 2005