A report by the Committee of Experts on Non-Personal Data Governance Framework, submitted to the MeitY, has raised concerns about the anonymization of personal data. In this article, we will analyze the concept of anonymization and look at what the report proposes on the subject. Why is there a fuss about anonymization? Is anonymization true to its meaning, or are we being served a myth of anonymization of personal data?
In case you’ve not been following us, we have been running a series of articles on the proposed report. You can read more here.
What is anonymization?
The Personal Data Protection Bill, 2019 (PDP Bill) defines anonymization as “such irreversible process of transforming or converting personal data to a form in which a data principal cannot be identified, which meets the standards of irreversibility specified by the Authority”. In simple words, the process of anonymization involves the removal of personal identifiers. These identifiers can be both direct and indirect (name, address, postcode, contact number, email, photograph or image, workplace, job title and other personal identifiers).
Transformation of personal data to non-personal data
Personal data that has gone through the process of anonymization is treated as non-personal data. A similar definition and similar rules for anonymization exist under the General Data Protection Regulation (GDPR) of the European Union, and other personal data protection regimes contain comparable provisions. These provisions let data businesses derive information from anonymized personal data without having to comply with the requirements that various privacy laws attach to the use of personal data, e.g. finding a lawful basis for processing, informing users how the data collected from them is used, processed and shared, and responding to a breach. Organizations achieve anonymization of data through the methods of randomization, generalization, encryption, hashing and tokenization. You can read more about the usage of non-personal data here.
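To make two of the methods named above more concrete, here is a minimal Python sketch of generalization and hashing applied to a single record. The record, the field names, the salt and the helper functions are all hypothetical and chosen only for illustration; this is not a method prescribed by the PDP Bill, the GDPR or the report.

```python
import hashlib

# Hypothetical record; every field name and value is made up for illustration.
record = {
    "name": "Asha Rao",
    "phone": "9876543210",
    "age": 34,
    "postcode": "560034",
    "diagnosis": "asthma",
}

# Assumption: only the name is dropped outright as a direct identifier.
DIRECT_IDENTIFIERS = {"name"}


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_postcode(postcode: str, keep: int = 3) -> str:
    """Generalization: keep the leading digits of a postcode, mask the rest."""
    return postcode[:keep] + "*" * (len(postcode) - keep)


def hash_value(value: str, salt: str) -> str:
    """Hashing: salted SHA-256 digest, stable for the same value and salt."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def transform(rec: dict, salt: str) -> dict:
    """Drop direct identifiers, hash the phone number, generalize the rest."""
    out = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
    out["phone"] = hash_value(rec["phone"], salt)
    out["age"] = generalize_age(rec["age"])
    out["postcode"] = generalize_postcode(rec["postcode"])
    return out


result = transform(record, salt="example-salt")
```

Note that the hashed phone number is still a stable token: the same number and salt always produce the same digest, so records about one person can still be linked to each other. Whether such an output meets a legal standard of "irreversibility" is exactly the kind of question this article is concerned with.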
How is data anonymized?
The technique of anonymization has remained a debated issue ever since the adoption of the GDPR in 2016. The reason is that the technical aspects of anonymization are very different from its legal definition. Research by data scientists has shown that the process of anonymization may not be foolproof. Re-identification, or de-anonymization, of data is quite possible by cross-referencing multiple sources or by reverse engineering. It will also only get easier to de-anonymize data in the future as technology advances and processing power improves.
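The cross-referencing risk described above can be sketched in a few lines of Python. The two datasets, the names and the choice of postcode, birth year and gender as indirect identifiers are all invented for this illustration; real attacks work on much larger and messier data, so treat this as a toy example rather than a description of any specific study.

```python
# "Anonymized" health dataset: names removed, indirect identifiers kept.
# All rows are invented for illustration.
anonymized = [
    {"postcode": "560034", "birth_year": 1986, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "560034", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
    {"postcode": "110001", "birth_year": 1986, "gender": "F", "diagnosis": "hypertension"},
]

# A second, hypothetical source that is public and carries names.
public = [
    {"name": "Asha Rao", "postcode": "560034", "birth_year": 1986, "gender": "F"},
    {"name": "Vikram Shah", "postcode": "560034", "birth_year": 1990, "gender": "M"},
    {"name": "Meera Nair", "postcode": "400001", "birth_year": 1975, "gender": "F"},
]

# Assumption: these three fields appear in both sources.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def key(row: dict) -> tuple:
    """Combine the indirect identifiers of a row into one lookup key."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def link(anon_rows: list, public_rows: list) -> list:
    """Return (name, diagnosis) pairs where the key matches exactly one person."""
    names_by_key = {}
    for row in public_rows:
        names_by_key.setdefault(key(row), []).append(row["name"])

    matches = []
    for row in anon_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], row["diagnosis"]))
    return matches


reidentified = link(anonymized, public)
for name, diagnosis in reidentified:
    print(f"{name}: {diagnosis}")
```

In this toy data, two of the three "anonymized" records are tied back to a named person even though no name was ever stored in the health dataset. The removal of direct identifiers alone did not prevent that.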
What about the Expert Committee on Private Non-Personal Data? What does it say regarding this?
The report submitted to the MeitY categorizes non-personal data derived from anonymized data as private non-personal data (to read more about the types, read this). It further divides private non-personal data into two categories on the basis of the nature of the data: general and sensitive.
General private non-personal data includes datasets from which general unique personal identifiers, such as name, age, contact number and address, have been removed. Sensitive private non-personal data, by contrast, is anonymized sensitive personal data, e.g. financial data, health data, sexual orientation, etc.
The report further proposes that this category of data should be regulated, as it poses a threat of exploitation of personal data if non-personal data is re-identified and rendered personal data again. There should also be a consent mechanism for anonymized data similar to the one proposed for personal data under the PDP Bill. According to the report, this data should be regulated by a Non-Personal Data Authority. The committee calls for separate legislation for non-personal data under which the data trust model for data sharing can be implemented.
This gives rise to some basic questions:
1. When we have identified the issues with anonymization, why leave this loophole in the Personal Data Protection Bill, 2019?
2. Why make two separate laws to address the anonymization of data, where the first allows anonymized data to be processed like any other category of non-personal data, and the second imposes compliance requirements on private non-personal data similar to those for personal data?
3. Is there a need for a separate authority to regulate non-personal data? Can’t a single authority regulate both personal and non-personal data?
I believe these issues should be accommodated in a single comprehensive legislation, that is, a ‘data protection law’. Parliament is yet to pass the Personal Data Protection Bill, 2019. Presently, it is pending before the Joint Parliamentary Committee (JPC), which will submit its report analyzing the provisions of the bill. Hopefully, the JPC will take note of these issues and suggest changes accordingly. If it doesn’t, the two laws and their contradictory or overlapping provisions will create great difficulty and confusion for data-driven businesses (which means every business these days).
Bhagyashree is a qualified advocate practicing in the area of IT law and data protection. She has a strong academic record, with an LLM degree in Computer and Communication Laws from Queen Mary University of London. She also holds technical expertise in the area of digital forensics and investigation.