PATE framework for differentially private machine learning

Machine learning models can memorize fragments of their training data and return these fragments verbatim. I’ve seen instances, for example, where I believe an LLM returned phrases verbatim from this site. It’s easy to imagine how medical data might leak this way.

How might you prevent this? And how might you do it in a way that is easy to defend?

One approach is the PATE framework. PATE stands for Private Aggregation of Teacher Ensembles. It was introduced in [1] and refined in [2].

In the PATE framework, you divide your sensitive data into n disjoint subsets and train a “teacher” model on each subset. These subsets are formed so that only one teacher has access to data from a particular individual.
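
Here is a minimal sketch of that partitioning step, assuming each record is a (person_id, payload) pair. The function name, the record layout, and the round-robin assignment are my own illustrative choices, not anything prescribed in [1] or [2]. The point is only that records are grouped by individual before being dealt out to teachers.

```python
from collections import defaultdict

def partition_by_individual(records, n_teachers):
    """Split records into n_teachers disjoint shards, keeping
    all of one person's records in the same shard.
    (Hypothetical helper; record layout is an assumption.)"""
    by_person = defaultdict(list)
    for person_id, payload in records:
        by_person[person_id].append(payload)

    shards = [[] for _ in range(n_teachers)]
    # Deal out individuals, not records, so no person spans two shards.
    for i, person_id in enumerate(sorted(by_person)):
        shards[i % n_teachers].extend(
            (person_id, p) for p in by_person[person_id]
        )
    return shards

records = [
    ("alice", "visit 1"),
    ("bob", "visit 1"),
    ("alice", "visit 2"),
    ("carol", "visit 1"),
    ("dave", "visit 1"),
]
shards = partition_by_individual(records, n_teachers=2)
```

Each shard would then be used to train one teacher. Splitting by record rather than by person would let one individual influence several teachers, which is what the construction is meant to rule out.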

Only these teacher models have direct access to sensitive data, and these models will not be released into production. Instead, the teacher models are used to train a “student” model.

The student model asks questions of the teacher models, and so the student model is only indirectly trained on sensitive data. Furthermore, differential privacy is inserted between the student and teacher models as a further layer of privacy protection. So the student model is not trained on the individual teachers’ answers but on an aggregate of those answers, with an (ideally small) amount of randomness thrown in to further protect privacy. Publicly available data is also added to the training set for the student model.
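
To make the aggregation step concrete, here is a rough sketch of noisy voting: count the teachers’ votes for each class, add random noise to each count, and report the class with the largest noisy count. I’ve used Laplace noise as one common choice; the function names and the noise scale are assumptions for illustration, not values taken from the papers.

```python
import random
from collections import Counter

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_aggregate(teacher_votes, n_classes, scale, rng):
    """Return the class with the largest noisy vote count.
    (Illustrative sketch; `scale` is an arbitrary assumption.)"""
    counts = Counter(teacher_votes)
    noisy = [counts.get(c, 0) + laplace(scale, rng) for c in range(n_classes)]
    return max(range(n_classes), key=noisy.__getitem__)

rng = random.Random(0)
# 100 hypothetical teachers: 90 vote for class 1, a few dissent.
votes = [1] * 90 + [0] * 6 + [2] * 4
label = noisy_aggregate(votes, n_classes=3, scale=2.0, rng=rng)
```

The student only ever sees `label`, never the individual votes. When the teachers agree this strongly, noise of this size essentially never changes the answer; when the vote is close, the noise can tip the result, which is what hides any one teacher’s contribution.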

There are a couple of clever refinements in [2] that stretch the framework’s privacy budget. The authors write:

“To be more selective, our new mechanisms leverage some pleasant synergies between privacy and utility in PATE aggregation. For example, when teachers disagree, and there is no real consensus, the privacy cost is much higher; however, since such disagreement also suggest that the teachers may not give a correct answer, the answer may simply be omitted.”

Differential privacy is a way of quantifying the notion that an individual’s participation or lack of participation in a database makes little difference. If the teacher models disagree, it may be because an individual who is an outlier has had a large influence on the teacher model that was trained on his data. If you protect the privacy of the “teachers” then you protect the privacy of the individuals, since no more than one teacher was trained on any given individual’s data. When there is consensus among the teachers, there’s no need to spend much of the privacy budget.
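
The idea of declining to answer when the teachers disagree can be sketched as follows. This is only in the spirit of the refinement in [2], not a reproduction of it: the function name, the threshold, and the two noise levels are all assumptions I made up for the example. The consensus check itself is noisy, so that whether a question gets answered does not reveal the exact vote count.

```python
import random
from collections import Counter

def confident_aggregate(teacher_votes, n_classes, threshold,
                        sigma_check, sigma_answer, rng):
    """Answer only when the teachers show (noisy) consensus.
    Returns a class label, or None if the query is declined.
    (Illustrative sketch; all parameters are assumptions.)"""
    counts = Counter(teacher_votes)
    top = max(counts.values())
    # Noisy check: is the largest vote count above the threshold?
    if top + rng.gauss(0, sigma_check) < threshold:
        return None  # no real consensus, so omit the answer
    # Consensus: answer with the largest noisy vote count.
    noisy = [counts.get(c, 0) + rng.gauss(0, sigma_answer)
             for c in range(n_classes)]
    return max(range(n_classes), key=noisy.__getitem__)

rng = random.Random(0)
agree = [1] * 95 + [0] * 5               # strong consensus
split = [0] * 34 + [1] * 33 + [2] * 33   # no consensus

answered = confident_aggregate(agree, 3, threshold=70,
                               sigma_check=4, sigma_answer=2, rng=rng)
declined = confident_aggregate(split, 3, threshold=70,
                               sigma_check=4, sigma_answer=2, rng=rng)
```

With these made-up numbers the first query is answered and the second is declined. The student simply goes without a label for the second input rather than paying a high privacy cost for an answer that was likely to be unreliable anyway.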

The authors go on to say:

“Similarly, teachers may avoid giving an answer where the student already is confidently predicting the right answer.”

Since each query uses some amount of the privacy budget, avoiding even asking questions saves budget.

[1] Nicolas Papernot et al. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data. arXiv:1610.05755

[2] Nicolas Papernot et al. Scalable Private Learning with PATE. arXiv:1802.08908v1