Fast and Memory Efficient Differentially Private SGD via JL Projections
A Google TechTalk, presented by Sivakanth Gopi, 2021/05/21, part of the Differential Privacy for ML series.

ABSTRACT: Differentially Private SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large-scale neural networks. This algorithm requires computing per-sample gradient norms, which is extremely slow and memory intensive in practice. In this paper, we present a new framework for designing differentially private optimizers, called DP-SGD-JL and DP-Adam-JL. Our approach uses Johnson-Lindenstrauss (JL) projections to quickly approximate the per-sample gradient norms without computing them exactly, bringing the training time and memory requirements of our optimizers closer to those of their non-DP counterparts. Unlike previous attempts to speed up DP-SGD, which work only on a subset of network architectures, we propose an algorithmic solution that works for any network in a black-box manner; this is the main contribution of the paper.
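To make the idea concrete, here is a minimal NumPy sketch of how a JL projection can approximate per-sample gradient norms for clipping in a DP-SGD-style step. This is an illustration, not the authors' algorithm: the paper computes the projections without ever materializing per-sample gradients (e.g., via forward-mode differentiation), whereas this sketch assumes a dense matrix of per-sample gradients. The function name `dp_step_with_jl` and all parameters are hypothetical.

```python
import numpy as np

def dp_step_with_jl(grads, clip_norm, noise_mult, k=32, rng=None):
    """grads: (n, d) per-sample gradients; returns a privatized mean gradient.

    Illustrative sketch only: the JL trick estimates each ||g_i|| from k random
    projections instead of computing the exact norm.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = grads.shape

    # JL projection: k random Gaussian directions, scaled so that
    # E[||g_i @ P||^2] = ||g_i||^2 for each per-sample gradient g_i.
    P = rng.standard_normal((d, k)) / np.sqrt(k)
    approx_norms = np.linalg.norm(grads @ P, axis=1)  # shape (n,)

    # Clip each sample's contribution using its *approximate* norm.
    scale = np.minimum(1.0, clip_norm / (approx_norms + 1e-12))
    clipped_sum = (grads * scale[:, None]).sum(axis=0)

    # Gaussian noise calibrated to the clipping threshold, as in standard DP-SGD.
    noise = noise_mult * clip_norm * rng.standard_normal(d)
    return (clipped_sum + noise) / n
```

Because the norms are only approximate, a practical variant would need to account for the estimation error (e.g., by adjusting the clipping threshold or noise) to keep the privacy guarantee; the talk discusses how the paper handles this.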