
The difference between limited data sets and deidentified information

While related, deidentification and limited data sets are distinct concepts under HIPAA, and each serves a specific purpose in the healthcare sector. One major difference is that deidentification is a process that, once completed, means the data is no longer protected health information (PHI), whereas a limited data set remains PHI.

 

What HIPAA says about limited data sets 

Section 164.514(e)(2) defines limited data sets as “...protected health information that excludes...direct identifiers of the individual or relatives, employers, or household members of the individual…”

The direct identifiers removed from limited data sets provide a level of anonymity while still allowing the information to be useful. These data sets are commonly used by healthcare workers and researchers to analyze trends or populations without needing full patient identifiers. 

A Johns Hopkins IRB article states, “A ‘limited data set’ of information may be disclosed to an outside party without a patient’s authorization if certain conditions are met.” In other words, when the direct identifiers are removed and conditions such as a Data Use Agreement (DUA) are in place, patient authorization is not needed. Limited data sets support data sharing by protecting patient privacy while still allowing valuable insights to be drawn from the data.

 

What HIPAA says about deidentification 

Section 164.514 (a) of the Privacy Rule defines deidentification of PHI as “Health information that does not identify an individual and concerning which there is no reasonable basis to believe that the information can be used to identify an individual is not individually identifiable health information.” 

Section 164.514(b) sets out two methods for deidentification. The Safe Harbor method requires removing 18 listed identifiers, both direct and indirect. The Expert Determination method relies on a qualified expert concluding that the risk of reidentification is very small. Deidentified information can be used for purposes ranging from research to public health.

Related: How to choose the right method for deidentification

 

Main differences

  • Safe Harbor deidentification removes 18 identifiers, while a limited data set excludes only 16 direct identifiers and can retain dates and certain location details, such as city, state, and ZIP code.
  • Once data is fully deidentified, it is no longer considered PHI. A limited data set is still PHI and requires protection when shared. We recommend using HIPAA-compliant email platforms like Paubox.
  • Limited data sets may be used for research, public health, and healthcare operations with a Data Use Agreement (DUA) in place. Deidentified data can be used and shared for public health studies, quality improvement, and large-scale data analysis.
  • Deidentified data does not require any agreement because it is no longer PHI. Limited data sets require a DUA, which prohibits the recipient from reidentifying individuals.
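
For readers who work with data directly, the first difference above can be sketched in a few lines of Python. This is a simplified illustration, not a compliance tool. The field names and the sample record are hypothetical, the identifier sets cover only a few of the identifiers HIPAA lists, and the Safe Harbor function ignores details such as the three-digit ZIP code exception and the rules for ages over 89.

```python
# Illustrative only: hypothetical field names, not the full HIPAA identifier lists.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "email", "ssn"}
SUB_STATE_GEOGRAPHY = {"city", "zip_code"}
DATE_FIELDS = {"admission_date"}


def to_limited_data_set(record):
    """Drop direct identifiers; dates, city, state, and ZIP code are kept."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def to_safe_harbor_sketch(record):
    """Also drop sub-state geography and reduce dates to the year."""
    result = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS or key in SUB_STATE_GEOGRAPHY:
            continue
        if key in DATE_FIELDS:
            # Assumes ISO dates (YYYY-MM-DD); only the year is kept.
            result[key.replace("_date", "_year")] = value[:4]
        else:
            result[key] = value
    return result


# Made-up sample record for demonstration.
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "email": "jane@example.com",
    "city": "Baltimore",
    "state": "MD",
    "zip_code": "21201",
    "admission_date": "2023-04-17",
    "diagnosis_code": "E11.9",
}

limited = to_limited_data_set(record)
deidentified = to_safe_harbor_sketch(record)
print(sorted(limited))
print(sorted(deidentified))
```

In this sketch the limited data set keeps the city, ZIP code, and full admission date, which is why it remains PHI and needs a DUA, while the Safe Harbor output keeps only the state, the admission year, and the diagnosis code.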

 

FAQs

What is HIPAA?

The Health Insurance Portability and Accountability Act (HIPAA) is a U.S. federal law that protects the privacy of patients' medical information.

 

What is a Data Use Agreement?

It is a legal contract that outlines the terms for using and sharing a limited data set under HIPAA so that those accessing the data only do so for specific purposes. 

 

When can PHI not be shared? 

Generally, protected health information can’t be shared without the patient's authorization unless the disclosure is for treatment, payment, or healthcare operations, or is otherwise permitted by the Privacy Rule.