CN115496198A - Gradient compression framework for adaptive privacy budget allocation based on federated learning - Google Patents

Gradient compression framework for adaptive privacy budget allocation based on federated learning

Info

Publication number
CN115496198A
CN115496198A
Authority
CN
China
Prior art keywords
gradient
privacy
parameters
module
adaptive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210938530.5A
Other languages
Chinese (zh)
Inventor
陈淑红
杨家维
王国军
揭智勇
彭滔
冯光辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University
Priority to CN202210938530.5A
Publication of CN115496198A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0407 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a gradient compression framework with adaptive privacy budget allocation based on federated learning, comprising: a Top-k-based gradient dimensionality-reduction compression module, a privacy protection module based on local differential privacy, a per-communication-round parameter aggregation module, and an adaptive privacy budget allocation module, where the Top-k-based gradient dimensionality-reduction compression module is used to reduce the number of communication rounds. Before uploading the gradient parameters obtained from local training to the server, each client compresses its current gradient parameters with the Top-k-based gradient dimensionality-reduction compression module, perturbs them with the privacy protection module based on local differential privacy, and then uploads the compressed and perturbed gradient parameters to the server, which aggregates the gradient parameters uploaded by the clients. Finally, the adaptive privacy budget allocation module allocates the privacy budget according to the amount of noise required in different training rounds. The invention reduces the communication cost, the noise, and the loss of the total privacy budget of the model, while maintaining good model accuracy.

Description

Gradient compression framework for adaptive privacy budget allocation based on federated learning
Technical Field
The invention relates to the field of deep learning, and in particular to a gradient compression framework with adaptive privacy budget allocation based on federated learning.
Background
In a traditional centralized deep learning framework, users send their data, which contains sensitive information, to a machine learning company (an untrusted third party). Once the data is sent to the third party, users can no longer delete or control their own data, and these untrusted third parties may use the data for illicit activities, so the data is at risk of privacy disclosure. In 2015, Shokri et al. proposed a multi-party privacy-preserving collaborative deep learning model, in which each participant independently trains its model locally and then selectively shares some of its local model parameters with a central server. In this way, the sensitive data of the participants is protected from leakage, while the shared parameters can be used to improve the accuracy of the models trained by the participants. Building on Shokri's work, Google first proposed the concept of federated learning, aiming to establish a high-quality distributed learning framework. In federated learning, data owners neither need to share raw data with each other nor rely on a single trusted entity (a central server) for distributed training of machine learning models. Konecny et al. proposed a communication-efficient federated learning model to address the communication cost problem of federated learning. McMahan et al. [8] proposed the federated averaging algorithm, considering that mobile device data in reality is distributed. Liu proposed a two-stage framework, FedSel, which privately selects a Top-k dimension for uploading and noise addition according to the contribution of the gradient parameters in each iteration, thereby alleviating the privacy and communication problems in federated learning based on local differential privacy. Zhao considered unreliable participants in federated learning and proposed a new scheme named SecProbe, which allows participants to share model parameters and handles unreliable participants by using an exponential mechanism.
For privacy protection in federated learning, there are many security models and privacy protection techniques that can provide reliable privacy guarantees, such as secure multi-party computation (SMC), homomorphic encryption, and differential privacy. As a security protocol, secure multi-party computation is mainly used for secure aggregation and can prevent attacks by a malicious server. For example, Danner et al. proposed a secure sum protocol using a tree topology. Another study based on secure multi-party computation is SecureML, in which participants distribute their private data among two non-colluding servers, which then use secure multi-party computation techniques to train a global model on the participants' encrypted joint data. Bonawitz et al. proposed a secure multi-party aggregation method for FL in which participants encrypt their local updates and the server aggregates the encrypted parameters. Another privacy protection technique is homomorphic encryption, which is mainly used to encrypt the uploaded gradient parameters. However, this technique is not suitable for all clients, since the server must rely on a non-colluding external participant to perform encryption or decryption. Both secure aggregation and homomorphic encryption involve significant computational overhead, and the total cost is prohibitively expensive for a federated learning framework. Furthermore, Zhu et al. showed that gradient compression and sparsification can help defend against privacy leakage from local updates. However, these methods require very high compression rates to achieve the desired defense performance, which compromises model accuracy.
Given the wide applicability of differential privacy in deep learning models, differential privacy is also well suited for privacy protection in federated learning. Differential privacy, an important data privacy protection technology of recent years, prevents information leakage by adding artificial noise: it can resist background-knowledge attacks, the degree of protection can be adjusted according to the privacy requirement, and it can guarantee the privacy protection of a federated learning model. Abadi et al. proposed the DP-SGD algorithm, which adds noise to the gradients uploaded by clients to prevent an external attacker from stealing model parameters and then recovering the clients' original sensitive data. From the user's perspective, Geyer et al. proposed a user-level differentially private federated learning framework that provides different privacy protection for different users and trades off privacy loss against model performance. Wei et al. proposed the NbAFL scheme, which properly adjusts the variance of the added Gaussian noise to meet the requirement of global DP.
At present, research in this field mainly aims to strike a balance among privacy, utility, and communication efficiency in federated learning, and achieving a better balance among the three is the key problem in the field. Our scheme therefore also follows this research direction, informed by a survey of the relevant literature. Liu et al. proposed the FedSel scheme: considering that the amount of noise is proportional to the number of uploaded parameters, it performs Top-k screening on the parameters uploaded by clients and uses a gradient accumulation technique to stabilize the influence of noise during learning. In addition, when selecting the Top-k dimensions for uploading, the authors use the exponential mechanism of differential privacy to select the dimension k privately, ensuring privacy during dimension selection. Before uploading parameters to the server, the k dimensions with the largest gradient values are first selected privately instead of uploading all parameters; differential privacy noise is then added to the selected k-dimensional parameters, the compressed noisy gradient vector is uploaded to the server, the server aggregates all parameters uploaded by the clients participating in training, and the next iteration begins. However, because different privacy protection mechanisms are used when selecting the k-dimensional parameters for upload, the computational cost of this scheme is high, and model accuracy is also damaged to some extent. Sun et al. proposed a novel local differential privacy mechanism for federated learning that considers the different parameter ranges of different deep learning model layers and makes local parameter updates differentially private by adapting to the ranges of the different layers of a deep neural network. In addition, the mechanism amplifies privacy through a parameter-shuffling aggregation mechanism, so model accuracy remains high even with a small privacy budget and a high privacy protection level. Although this scheme achieves a good balance between privacy and model utility, it ignores communication efficiency. On the one hand, under the influence of the differential privacy mechanism, the neural network converges relatively slowly, which increases the number of iterations and thus the communication cost. On the other hand, since each client uploads all of its local parameters to the server, communication efficiency is inevitably reduced.
The prior art in federated learning struggles to achieve a good balance among privacy, model utility, and communication efficiency. On the one hand, when the client's privacy is protected with a privacy protection technique, noise is added to the gradient parameters of the neural network, which inevitably harms model training. Moreover, model parameters are high-dimensional: the dimensionality of a neural network often reaches tens of thousands or even millions, and the total amount of noise in the model is proportional to the dimensionality of the model parameters. If noise is added to every dimension of the model parameters, the total noise of the model grows enormously, eventually resulting in low model accuracy, i.e., poor model utility. Adding noise to every dimension therefore causes a series of problems, such as excessive overall model noise and low model accuracy. As for the communication cost of the model, since the uplink in a network is much slower than the downlink, uploading the model parameters of every dimension reduces the communication efficiency of the model. In addition, since federated learning is distributed training with many users and many parameters, the number of parameters the server receives from a large number of clients is huge; if all users upload their huge sets of model parameters to the server, a communication bottleneck arises. This communication problem is the most challenging problem in current research.
In addition, adding noise to every parameter causes excessive consumption of the privacy budget. Therefore, how to minimize the consumption of the privacy budget while maintaining good model accuracy is the biggest problem faced by current differential-privacy-based federated learning frameworks. Most current methods rely on uniform, fixed privacy parameter settings, and the models often perform poorly because a large privacy loss accumulates over the iterations. This analysis shows that one challenge of differential-privacy-based federated learning is how to properly balance the privacy, accuracy, and communication efficiency of the model, ensuring that the model retains good communication efficiency and accuracy while protecting user privacy as much as possible.
Disclosure of Invention
In view of the above problems, the present invention aims to provide a gradient compression framework with adaptive privacy budget allocation based on federated learning to solve them.
The invention provides the following technical scheme:
a gradient compression framework for adaptive privacy budget allocation based on federal learning, comprising: the system comprises a Top-k-based gradient dimensionality reduction compression module, a local differential privacy protection module, a communication turn parameter aggregation module and a self-adaptive privacy budget allocation module, wherein the Top-k-based gradient dimensionality reduction compression module is used for reducing the number of communication turns; before uploading the gradient parameters obtained by training the client to a server, the client compresses the current gradient parameters through a gradient dimensionality reduction compression module based on Top-k, then disturbs the gradient parameters through a privacy protection module based on local differential privacy, then uploads the compressed and disturbed gradient parameters to the server, and the server aggregates the gradient parameters uploaded by the client; in addition, the invention distributes the privacy budget according to the required noise amount in different rounds of training through the self-adaptive privacy budget distribution module.
After the client completes local iterative training, the Top-k-based gradient dimensionality-reduction compression module computes the gradient g_t of the local model w_t: the d-dimensional model parameters w_t = (w_t^(1), ..., w_t^(d)) have the corresponding gradient g_t = (g_t^(1), ..., g_t^(d)), where t is the communication round.
The Top-K-based gradient dimensionality-reduction compression module selects, from the d dimensions of the model parameters, the first K dimensions with the largest gradient absolute values for uploading, where K < d. The local model gradient is sorted by the absolute value of each dimension:

g̃_t = sort(|g_t|)

where the sorting algorithm sort is in descending order and g̃_t denotes the sorted gradient, whose magnitude decreases dimension by dimension. After sorting, the first K dimensions of the sorted d-dimensional gradient parameters g̃_t are selected as the compressed model:

ĝ_t = TopK(g̃_t, K)

where TopK denotes the gradient compression scheme and ĝ_t denotes the compressed gradient.
Preferably, the privacy protection module based on local differential privacy adds differential privacy noise to the gradient parameters uploaded by the client to achieve a strict privacy guarantee. Specifically, for the gradient parameters G of the model, the perturbation algorithm randomizes each dimension of G and returns a perturbed gradient parameter G*. The perturbation mechanism M constrains the gradient parameter g of each dimension in G as follows: g ∈ [c − r, c + r], where c is the center of the range of g and r is its radius. g is perturbed by the LDP mechanism:

G* = M(G, ε)

where G* is the d-dimensional noisy weight perturbed by the LDP mechanism, M is the differentially private perturbation mechanism, and ε is the privacy budget allocated to a particular dimension of the gradient parameters.
Preferably, the privacy protection module based on local differential privacy perturbs the compressed gradient parameters ĝ_t using the LDP mechanism:

ĝ*_t = M(ĝ_t, ε)

where ĝ*_t is the compressed and perturbed gradient.
Preferably, in the privacy protection module based on local differential privacy, the range parameters c and r for limiting the gradient parameter g are set according to the method for clipping the gradient parameter g.
After all local clients have added noise to their compressed gradient parameters, the per-communication-round parameter aggregation module uploads the local gradient parameters ĝ*_t to the server for aggregation; the server allocates the privacy budget ε_{t+1} of communication round t+1 to each client and then sends the new global model to the clients participating in training. These operations are repeated until a convergence condition is reached.
Preferably, after receiving the gradient parameters uploaded by the users, the per-communication-round parameter aggregation module aggregates them with the following formula:

w_{t+1} = w_t − α · ḡ_t

where w_t is the global model to be updated in the current round t, w_{t+1} is the global model parameter after the update of the next round t+1, ḡ_t is the mean of all client gradient parameters, and α is the learning rate of the update algorithm.
Preferably, the adaptive privacy budget allocation module allocates different privacy budgets to different communication rounds through a privacy budget allocation scheme, where ε is the total privacy budget of training, ε_t is the privacy budget allocated to the t-th round, T is the total number of communication rounds, and Σ_{t=1}^{T} ε_t = ε.
The invention has the following beneficial technical effects:
The gradient compression framework provided by the invention is based on a communication-round-adaptive privacy budget allocation scheme that reduces both the loss of privacy budget and the amount of model noise. First, privacy and model performance are balanced to the greatest extent by assigning different privacy budgets to different iteration rounds. Second, to reduce the overall amount of noise in the model, a Top-K-based gradient compression method is used, which not only reduces the communication cost, the amount of noise, and the loss of the total privacy budget of the model, but also provides better model accuracy under privacy protection.
Drawings
FIG. 1 is a schematic diagram of the gradient compression framework for adaptive privacy budget allocation based on federated learning provided by the present invention;
FIG. 2 is a flowchart of a preferred embodiment of the gradient compression framework for adaptive privacy budget allocation based on federated learning provided by the present invention.
Detailed Description
The following examples illustrate the present invention in detail, including its specific embodiments and procedures, but the protection scope of the present invention is not limited to these examples. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Examples
As shown in FIG. 1, a gradient compression framework with adaptive privacy budget allocation based on federated learning provided by an embodiment of the present invention includes: a Top-k-based gradient dimensionality-reduction compression module, a privacy protection module based on local differential privacy, a per-communication-round parameter aggregation module, and an adaptive privacy budget allocation module, where the Top-k-based gradient dimensionality-reduction compression module is used to reduce the number of communication rounds. Before uploading the gradient parameters obtained from training to the server, the client compresses the current gradient parameters with the Top-k-based gradient dimensionality-reduction compression module, perturbs them with the privacy protection module based on local differential privacy, and uploads the compressed and perturbed gradient parameters to the server; the server aggregates the gradient parameters uploaded by the clients. Finally, the adaptive privacy budget allocation module allocates the privacy budget according to the amount of noise required in different training rounds.
As shown in FIG. 2, after the client completes local iterative training, the Top-k-based gradient dimensionality-reduction compression module computes the gradient g_t of the local model w_t: the d-dimensional model parameters w_t = (w_t^(1), ..., w_t^(d)) have the corresponding gradient g_t = (g_t^(1), ..., g_t^(d)), where t is the communication round.
The Top-K-based gradient dimensionality-reduction compression module selects, from the d dimensions of the model parameters, the first K dimensions with the largest gradient absolute values for uploading, where K < d. The local model gradient is sorted by the absolute value of each dimension:

g̃_t = sort(|g_t|)

where the sorting algorithm sort is in descending order and g̃_t denotes the sorted gradient, whose magnitude decreases dimension by dimension. After sorting, the first K dimensions of the sorted d-dimensional gradient parameters g̃_t are selected as the compressed model:

ĝ_t = TopK(g̃_t, K)

where TopK denotes the gradient compression scheme and ĝ_t denotes the compressed gradient.
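As a concrete illustration, the Top-k selection described above can be sketched in a few lines of Python. This is a minimal sketch of our own (the function and variable names are illustrative, not from the patent); it keeps the K largest-magnitude dimensions at their original positions and zeroes the rest, which is how Top-k sparsification is normally realized.

import numpy as np

def topk_compress(grad, k):
    """Keep the k dimensions of grad with the largest absolute value.

    Minimal sketch of the Top-k gradient dimensionality-reduction module;
    all names here are illustrative, not from the patent.
    """
    idx = np.argsort(-np.abs(grad))[:k]   # indices sorted by descending |g|, first k kept (K < d)
    compressed = np.zeros_like(grad)
    compressed[idx] = grad[idx]           # dropped dimensions stay zero
    return compressed

In practice only the k index-value pairs would be transmitted; the dense zero-filled vector above is kept for clarity.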
The privacy protection module based on local differential privacy adds differential privacy noise to the gradient parameters uploaded by the client to achieve a strict privacy guarantee. Specifically, for the gradient parameters G of the model, the perturbation algorithm randomizes each dimension of G and returns a perturbed gradient parameter G*. The perturbation mechanism M constrains the gradient parameter g of each dimension in G as follows: g ∈ [c − r, c + r], where c is the center of the range of g and r is its radius. g is perturbed by the LDP mechanism:

G* = M(G, ε)

where G* is the d-dimensional noisy weight perturbed by the LDP mechanism, M is the differentially private perturbation mechanism, and ε is the privacy budget allocated to a particular dimension of the gradient parameters.
The privacy protection module based on local differential privacy perturbs the compressed gradient parameters ĝ_t using the LDP mechanism:

ĝ*_t = M(ĝ_t, ε)

where ĝ*_t is the compressed and perturbed gradient.
In the privacy protection module based on local differential privacy, the range parameters c and r that bound the gradient parameter g are set according to the method used to clip the gradient parameter g.
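The concrete randomizer M survives only as a formula image in the source, so the sketch below substitutes a standard choice purely as an assumption for illustration: the Laplace mechanism applied per dimension after clipping to [c − r, c + r], whose sensitivity over a width-2r range is 2r.

import numpy as np

def ldp_perturb(grad, c, r, eps, rng=None):
    """Perturb each gradient dimension under epsilon-local-DP.

    Assumption: the patent does not spell out its mechanism M in the text,
    so this sketch clips to [c - r, c + r] and adds Laplace noise with
    scale 2r/eps (the sensitivity of a value bounded in a width-2r range).
    """
    rng = rng or np.random.default_rng()
    clipped = np.clip(grad, c - r, c + r)  # enforce g in [c - r, c + r]
    return clipped + rng.laplace(0.0, 2.0 * r / eps, size=np.shape(clipped))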
After all local clients have added noise to their compressed gradient parameters, the per-communication-round parameter aggregation module uploads the local gradient parameters ĝ*_t to the server for aggregation; the server allocates the privacy budget ε_{t+1} of communication round t+1 to each client and then sends the new global model to the clients participating in training. These operations are repeated until a convergence condition is reached.
After receiving the gradient parameters uploaded by the users, the per-communication-round parameter aggregation module aggregates them with the following formula:

w_{t+1} = w_t − α · ḡ_t

where w_t is the global model to be updated in the current round t, w_{t+1} is the global model parameter after the update of the next round t+1, ḡ_t is the mean of all client gradient parameters, and α is the learning rate of the update algorithm.
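A minimal server-side sketch of this update follows, assuming, as the surrounding text suggests, a FedSGD-style step on the mean of the uploaded gradients; the exact formula appears only as an image in the source, so this is a reconstruction rather than a verbatim copy.

import numpy as np

def server_aggregate(w_t, client_grads, alpha):
    """One aggregation step: w_{t+1} = w_t - alpha * g_bar, where g_bar is
    the mean of the clients' compressed, perturbed gradients."""
    g_bar = np.mean(client_grads, axis=0)  # mean over all participating clients
    return w_t - alpha * g_bar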
The adaptive privacy budget allocation module allocates different privacy budgets to different communication rounds through a privacy budget allocation scheme, where ε is the total privacy budget of training, ε_t is the privacy budget allocated to the t-th round, T is the total number of communication rounds, and Σ_{t=1}^{T} ε_t = ε.
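The concrete allocation formula also survives only as an image, so the schedule below is a placeholder assumption: a linearly increasing per-round budget that merely respects the stated constraint that the per-round budgets ε_t sum to the total budget ε over T rounds.

import numpy as np

def allocate_budgets(eps_total, T):
    """Hypothetical per-round schedule. Only the constraint that the
    per-round budgets sum to the total budget is taken from the patent;
    the linearly increasing weighting below is an illustrative assumption."""
    weights = np.arange(1, T + 1, dtype=float)  # later rounds weighted more
    return eps_total * weights / weights.sum()  # eps_t for t = 1..T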
The gradient compression framework provided by the above embodiments of the present invention is based on a communication-round-adaptive privacy budget allocation scheme that reduces both the loss of privacy budget and the amount of model noise. First, privacy and model performance are balanced to the greatest extent by assigning different privacy budgets to different iteration rounds. Second, to reduce the overall amount of noise in the model, a Top-K-based gradient compression method is used, which not only reduces the communication cost, the amount of noise, and the loss of the total privacy budget of the model, but also provides better model accuracy under privacy protection.
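Putting the four module sketches above together, one communication round of the framework might look like the loop below. This is again illustrative only; local_grad_fn is a hypothetical stand-in for the client's local iterative training step, which the patent does not detail.

def train(w0, clients_data, eps_total, T, k, c, r, alpha, local_grad_fn):
    """End-to-end loop composing the module sketches above (illustrative)."""
    w = w0
    budgets = allocate_budgets(eps_total, T)    # adaptive per-round budgets
    for t in range(T):
        uploads = []
        for data in clients_data:
            g = local_grad_fn(w, data)          # local iterative training
            g_hat = topk_compress(g, k)         # Top-k dimensionality reduction
            uploads.append(ldp_perturb(g_hat, c, r, budgets[t]))  # LDP noise
        w = server_aggregate(w, uploads, alpha) # per-round aggregation
    return w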
The foregoing is a detailed description of preferred embodiments of the invention. It should be understood that numerous modifications and variations can be devised by those skilled in the art in light of the present teachings without departing from the inventive concept. Therefore, technical solutions that those skilled in the art can obtain through logical analysis, reasoning, or limited experimentation based on the prior art and the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims (9)

1. A gradient compression framework for adaptive privacy budget allocation based on federated learning, comprising: a Top-k-based gradient dimensionality-reduction compression module, a privacy protection module based on local differential privacy, a per-communication-round parameter aggregation module, and an adaptive privacy budget allocation module, wherein the Top-k-based gradient dimensionality-reduction compression module is used to reduce the number of communication rounds; before uploading the gradient parameters obtained from training to a server, the client compresses the current gradient parameters through the Top-k-based gradient dimensionality-reduction compression module, perturbs them through the privacy protection module based on local differential privacy, and then uploads the compressed and perturbed gradient parameters to the server, and the server aggregates the gradient parameters uploaded by the clients; finally, the privacy budget is allocated according to the amount of noise required in different training rounds through the adaptive privacy budget allocation module.
2. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 1, wherein after the client completes local iterative training, the Top-k-based gradient dimensionality-reduction compression module computes the gradient g_t of the local model w_t: the d-dimensional model parameters w_t = (w_t^(1), ..., w_t^(d)) have the corresponding gradient g_t = (g_t^(1), ..., g_t^(d)), where t is the communication round.
3. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 2, wherein the Top-K-based gradient dimensionality-reduction compression module selects, from the d dimensions of the model parameters, the first K dimensions with the largest gradient absolute values for uploading, where K < d; the local model gradient is sorted by the absolute value of each dimension:

g̃_t = sort(|g_t|)

where the sorting algorithm sort is in descending order and g̃_t denotes the sorted gradient, whose magnitude decreases dimension by dimension; after sorting, the first K dimensions of the sorted d-dimensional gradient parameters g̃_t are selected as the compressed model:

ĝ_t = TopK(g̃_t, K)

where TopK denotes the gradient compression scheme and ĝ_t denotes the compressed gradient.
4. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 1, wherein the privacy protection module based on local differential privacy achieves a strict privacy guarantee by adding differential privacy noise to the gradient parameters uploaded by clients; specifically, for the gradient parameters G of the model, the perturbation algorithm randomizes each dimension of G and returns a perturbed gradient parameter G*; the perturbation mechanism M constrains the gradient parameter g of each dimension in G as follows: g ∈ [c − r, c + r], where c is the center of the range of g and r is its radius; g is perturbed through the LDP mechanism:

G* = M(G, ε)

where G* is the d-dimensional noisy weight perturbed by the LDP mechanism, M is the differentially private perturbation mechanism, and ε is the privacy budget allocated to a particular dimension of the gradient parameters.
5. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 4, wherein the privacy protection module based on local differential privacy perturbs the compressed gradient parameters ĝ_t using the LDP mechanism:

ĝ*_t = M(ĝ_t, ε)

where ĝ*_t is the compressed and perturbed gradient.
6. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 5, wherein in the privacy protection module based on local differential privacy, the range parameters c and r that bound the gradient parameter g are set according to the method used to clip the gradient parameter g.
7. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 1, wherein after all local clients have added noise to their compressed gradient parameters, the per-communication-round parameter aggregation module uploads the local gradient parameters ĝ*_t to the server for aggregation; the server allocates the privacy budget ε_{t+1} of communication round t+1 to each client and then sends a new global model to the clients participating in training; these operations are repeated until a convergence condition is reached.
8. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 7, wherein after receiving the gradient parameters uploaded by the users, the per-communication-round parameter aggregation module aggregates them with the following formula:

w_{t+1} = w_t − α · ḡ_t

where w_t is the global model to be updated in the current round t, w_{t+1} is the global model parameter after the update of the next round t+1, ḡ_t is the mean of all client gradient parameters, and α is the learning rate of the update algorithm.
9. The gradient compression framework for adaptive privacy budget allocation based on federated learning according to claim 1, wherein the adaptive privacy budget allocation module allocates different privacy budgets to different communication rounds through a privacy budget allocation scheme, where ε is the total privacy budget of training, ε_t is the privacy budget allocated to the t-th round, T is the total number of communication rounds, and Σ_{t=1}^{T} ε_t = ε.
CN202210938530.5A 2022-08-05 2022-08-05 Gradient compression framework for adaptive privacy budget allocation based on federated learning Pending CN115496198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210938530.5A CN115496198A (en) Gradient compression framework for adaptive privacy budget allocation based on federated learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210938530.5A CN115496198A (en) Gradient compression framework for adaptive privacy budget allocation based on federated learning

Publications (1)

Publication Number Publication Date
CN115496198A true CN115496198A (en) 2022-12-20

Family

ID=84466380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210938530.5A Pending CN115496198A (en) Gradient compression framework for adaptive privacy budget allocation based on federated learning

Country Status (1)

Country Link
CN (1) CN115496198A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116739079A (en) * 2023-05-10 2023-09-12 浙江大学 Self-adaptive privacy protection federal learning method
CN116739079B (en) * 2023-05-10 2024-02-09 浙江大学 Self-adaptive privacy protection federal learning method
CN116611115A (en) * 2023-07-20 2023-08-18 数据空间研究院 Medical data diagnosis model, method, system and memory based on federal learning
CN117521781A (en) * 2023-11-23 2024-02-06 河海大学 Differential privacy federal dynamic aggregation method and system based on important gradient protection
CN117556470A (en) * 2023-12-18 2024-02-13 河北大学 LDP method for carrying out joint disturbance on FL high-dimensional sparse Top-k gradient vector
CN117556470B (en) * 2023-12-18 2024-05-03 河北大学 LDP method for carrying out joint disturbance on FL high-dimensional sparse Top-k gradient vector

Similar Documents

Publication Publication Date Title
CN115496198A (en) Gradient compression framework for adaptive privacy budget allocation based on federated learning
Yin et al. A privacy-preserving federated learning for multiparty data sharing in social IoTs
CN110719158B (en) Edge calculation privacy protection system and method based on joint learning
CN114841364B (en) Federal learning method for meeting personalized local differential privacy requirements
CN111563265A (en) Distributed deep learning method based on privacy protection
Anajemba et al. A counter-eavesdropping technique for optimized privacy of wireless industrial iot communications
CN115310121B (en) Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles
CN109728865B (en) Interception coding method based on artificial noise in large-scale antenna array
CN114363043B (en) Asynchronous federal learning method based on verifiable aggregation and differential privacy in peer-to-peer network
Le et al. Privacy-preserving federated learning with malicious clients and honest-but-curious servers
Li et al. An Adaptive Communication‐Efficient Federated Learning to Resist Gradient‐Based Reconstruction Attacks
CN109788479B (en) Distributed cooperative interference power distribution method for minimizing privacy interruption probability
Chen et al. Apfed: Anti-poisoning attacks in privacy-preserving heterogeneous federated learning
Lyu et al. Secure and efficient federated learning with provable performance guarantees via stochastic quantization
Wang et al. Protecting data privacy in federated learning combining differential privacy and weak encryption
CN117294469A (en) Privacy protection method for federal learning
CN115510472B (en) Multi-difference privacy protection method and system for cloud edge aggregation system
Fan et al. Best effort voting power control for byzantine-resilient federated learning over the air
CN116865938A (en) Multi-server federation learning method based on secret sharing and homomorphic encryption
Ovi et al. A comprehensive study of gradient inversion attacks in federated learning and baseline defense strategies
Zhao et al. Local differentially private federated learning with homomorphic encryption
Liu et al. Eavesdropping against artificial noise: hyperplane clustering
CN117556470B (en) LDP method for carrying out joint disturbance on FL high-dimensional sparse Top-k gradient vector
Wu et al. Efficient privacy-preserving federated learning for resource-constrained edge devices
Zhang et al. iDP-FL: A Fine-Grained and Privacy-Aware Federated Learning Framework for Deep Neural Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination