With increasing concern for data privacy, several measures have been adopted to make AI applications privacy friendly. Of these, the most commonly practiced is Federated Learning. While an entire blog post will be dedicated to how it works and where it is currently applied, this post is about another, far less discussed and, as of now, largely theoretical approach: Machine Unlearning.
There has been limited yet substantial research in this domain, with researchers taking diverse approaches to the same objective. As the name suggests, an AI model is made to forget, or unlearn, certain data points it was exposed to during training. Most of the cited use cases concern data privacy: erasing sensitive training data alone is believed to be insufficient, because a target model can still be exploited to divulge information about that data even after it has been deleted. Hence, it is best that the model forgets the data in addition to the data being removed from the training database. This, however, raises two major questions. The first is the efficacy of the model after unlearning: how accurate and usable will it remain once such information is removed, especially when the forgotten data is interlinked with the rest and needed to solve complex problems? The second is the computational effort required to realize unlearning: what is the most efficient way to make the model forget the relevant data without consuming substantial additional resources or extensively changing the model's architecture?
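To make that trade-off concrete, here is a minimal sketch of the gold-standard baseline that approximate unlearning methods are usually measured against: exact unlearning by retraining from scratch on everything except the data to be forgotten. The toy dataset, the logistic-regression model, and names such as `forget_indices` are illustrative assumptions of mine, not part of any particular method from the literature.

```python
# Exact unlearning baseline: retrain from scratch on everything except the
# points to be forgotten. Illustrative sketch only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data standing in for the original training set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Indices of the samples a user has asked the model to forget (hypothetical).
forget_indices = np.array([3, 57, 412])
retain_mask = np.ones(len(X), dtype=bool)
retain_mask[forget_indices] = False

# Original model: trained on all data, including the sensitive points.
original_model = LogisticRegression(max_iter=1000).fit(X, y)

# Exact unlearning by retraining: a fresh model fit on the retained data only.
# This is exact, but its cost scales with the full training procedure every
# time a deletion request arrives -- which is exactly the second question above.
retrained_model = LogisticRegression(max_iter=1000).fit(X[retain_mask], y[retain_mask])

# The first question: how much utility does the model keep after forgetting?
print("accuracy after forgetting:", retrained_model.score(X[retain_mask], y[retain_mask]))
```

Efficient unlearning methods try to approximate `retrained_model` without paying the full retraining cost each time.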
In Zero-Shot Machine Unlearning, the authors propose a novel approach for the setting where zero original data samples are available for use. The approach offers good protection against model inversion attacks and membership inference attacks, and the work also provides a metric, the Anamnesis Index (AIN), to effectively measure the quality of an unlearning method. Similarly, in Unrolling SGD: Understanding Factors Influencing Machine Unlearning, a verification-error metric is proposed to assess how far the weights of an unlearned model deviate from those of a naively retrained Stochastic Gradient Descent (SGD) based model. In Efficient Attribute Unlearning: Towards Selective Removal of Input Attributes from Feature Representations, the researchers propose attribute unlearning, wherein certain attributes are captured and detached from the learned feature representations during training, according to their mutual information.
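As a rough illustration of the intuition behind such weight-comparison metrics (not the exact formulation from either paper), the sketch below measures the L2 distance between the parameters of an approximately unlearned model and one naively retrained without the forgotten data. The helper names and the choice of logistic regression are assumptions made here for brevity.

```python
# Illustrative weight-distance comparison between an approximately unlearned
# model and a naively retrained one. Not the papers' exact metrics.
import numpy as np
from sklearn.linear_model import LogisticRegression


def flat_weights(model: LogisticRegression) -> np.ndarray:
    """Concatenate a fitted model's coefficients and intercept into one vector."""
    return np.concatenate([model.coef_.ravel(), model.intercept_.ravel()])


def weight_distance(unlearned: LogisticRegression,
                    retrained: LogisticRegression) -> float:
    """L2 distance between the parameters of the two fitted models.

    A small distance suggests the approximate unlearning procedure left the
    model close to one that never saw the forgotten data; a large distance
    suggests the forgotten data still shapes the weights.
    """
    return float(np.linalg.norm(flat_weights(unlearned) - flat_weights(retrained)))


# Example usage (with hypothetical fitted models):
# gap = weight_distance(approx_unlearned_model, retrained_model)
```

The appeal of this kind of measure is that it turns "has the model really forgotten?" into a quantity one can track while designing an unlearning procedure.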
There is much ongoing research aimed at identifying alternative measures to safeguard data privacy. It has been realized that unlearning can address not only data privacy but can also serve as a self-healing mechanism for AI models against data poisoning. It can act as a semi-preventive, semi-curative mechanism built into the model to reject poisoned or malicious information in the data it is trained on. Imagine a model equipped with a hidden component that detects and unlearns carefully camouflaged malicious data mixed in with genuine data, keeping the model on the right learning path. Such unlearning could save retraining the model from scratch once it is realized that the model has been corrupted with poisoned data. A self-repairing model is all one needs to stay resilient once the model is out in the field. I wonder how Tay could have benefited from such unlearning.