CN114595830A - Privacy protection federal learning method under edge computing-oriented scene - Google Patents


Info

Publication number
CN114595830A
CN114595830A (application CN202210157685.5A)
Authority
CN
China
Prior art keywords
server
aggregation
parameter
byzantine
parameters
Prior art date
Legal status
Granted
Application number
CN202210157685.5A
Other languages
Chinese (zh)
Other versions
CN114595830B
Inventor
吴黎兵
张壮壮
曹书琴
张瑞
王敏
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202210157685.5A priority Critical patent/CN114595830B/en
Publication of CN114595830A publication Critical patent/CN114595830A/en
Application granted granted Critical
Publication of CN114595830B publication Critical patent/CN114595830B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 - Protecting data
    • G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00 - Reducing energy consumption in communication networks
    • Y02D 30/50 - Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a privacy-preserving federated learning method for edge computing scenarios, which uses a dual-server architecture to perform model aggregation and achieve Byzantine robustness. First, the server issues initial model parameters to the clients; second, each client performs multiple rounds of iterative training using its local data set and the initial parameters to obtain the current round's training result; next, the client secret-shares the training result and uploads the shares to the two different servers; finally, the two servers cooperatively detect Byzantine nodes to obtain quasi-aggregation parameters and, on that basis, cooperatively aggregate the model to obtain the current round's global model. This process iterates until an optimal solution is trained. Through the dual-server architecture, the method defends against Byzantine nodes while protecting data privacy, has low computation and communication overhead, and can solve the collaborative training problem in edge computing scenarios.

Description

Privacy protection federal learning method under edge computing-oriented scene
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a privacy-preserving federated learning method for edge computing scenarios.
Background
In recent years, with the development of edge computing, more and more edge nodes have data computing and collection capabilities. How to utilize these massive edge nodes has become a research hotspot in academia and industry. Many service providers wish to use their own edge nodes for machine learning training. For example, autonomous driving service providers have trained models and want to continue optimizing them using the computing power and data of vehicles they have sold. Consumers, however, may not want to reveal their private data to the service provider. To address this, a service provider can use federated learning to co-train the model with consumers without sharing the consumers' local raw training data.
Federated learning, as a privacy-preserving distributed learning paradigm, has been widely used in edge computing scenarios such as the Internet of Vehicles (IoV), smart homes, smartphones, and the Internet of Medical Things (IoMT). In particular, to effectively utilize the computing resources of smartphones, Google first proposed the FedAvg federated learning scheme, which uses a server to aggregate the model parameters of training participants, enabling efficient collaborative training. Building on this, researchers have proposed a federated learning method for mobile-phone keyboard input prediction that enables model training without directly revealing user data to the server. Because of federated learning's own privacy-disclosure problems, many schemes use cryptographic primitives or differential privacy to achieve privacy-preserving federated learning.
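As background, the FedAvg aggregation mentioned above can be sketched as a size-weighted average of client parameters. This toy illustration (function name, vector shapes, and weights all hypothetical) shows the baseline only, not this patent's dual-server protocol:

```python
# Hedged sketch of FedAvg-style server aggregation (background only; not the
# dual-server scheme described in this patent). Each client's parameter
# vector is weighted by its local dataset size and averaged.
def fedavg(client_params, client_sizes):
    """Return the size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[j] * s for p, s in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]
```

With equal dataset sizes this reduces to a plain average of the client vectors.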
However, these solutions face major limitations. On one hand, for privacy-preserving federated learning, cryptography-based schemes place a significant burden on the federated learning system. For example, schemes based on homomorphic encryption or secret sharing add computational complexity for the participants, reducing the model's training efficiency. More importantly, edge nodes generally have limited computing power and poor support for complex cryptographic algorithms. On the other hand, Byzantine-robust schemes require access to the participants' model updates in order to compare them, and so lack protection of participant privacy. Moreover, in edge computing scenarios the edge nodes are fragile and may drop out frequently, so such schemes are not robust.
Disclosure of Invention
The invention provides a privacy-preserving federated learning method for edge computing scenarios, aimed at solving the technical problem that existing schemes cannot protect privacy while also defending against Byzantine nodes.
To solve this technical problem, the invention provides a privacy-preserving federated learning method for edge computing scenarios, applied to a framework comprising an aggregation server SP, a Byzantine detection server TP, and federated learning clients; the method comprises the following steps:
s1: the aggregation server SP and the federal learning client side carry out synchronization of initial global model parameters;
s2: the federated learning client performs iterative training by using the corresponding local data set and the initial global model parameters to obtain new local model parameters;
s3: the federated learning client obtains a first local model parameter and a second local model parameter based on the new local model parameter, and, in a secret-sharing manner, uploads the first local model parameter to the Byzantine detection server TP and the second local model parameter to the aggregation server SP, wherein the first and second local model parameters are a secret sharing of the new local model parameter, which can be recovered from the two shares;
s4: the aggregation server SP and the Byzantine detection server TP carry out cooperative Byzantine node detection based on the first local model parameter and the second local model parameter to obtain a quasi-aggregation parameter;
s5: and the aggregation server SP and the Byzantine detection server TP carry out cooperative model aggregation based on the quasi-aggregation parameters to obtain a global model training result.
In one embodiment, step S1 includes:
s1.1: federated learning client P_i accesses the federated learning training network and sends its identity id to the aggregation server SP, where P_i denotes the i-th federated learning client and i is the client's index;
s1.2: the aggregation server SP issues the initial global model parameters to the corresponding federal learning client according to the identity id;
s1.3: and the federated learning client receives the corresponding initial global model parameters to realize parameter synchronization.
In one embodiment, step S2 includes:
s2.1: federated learning client P_i computes a gradient using the received initial global model parameters and its local data set:

g_i = ∇F(w, D^(i))

where D^(i) is federated learning client P_i's local data set, w denotes the initial global model parameters, g_i is the gradient obtained by training on data set D^(i), and ∇ denotes the gradient operator;
s2.2: federated learning client P_i updates its original local model parameters using the learning rate and the gradient: w'_i = w_i − η·g_i, where w_i denotes the original local model parameters, η is the learning rate, and w'_i denotes the new local model parameters.
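The local update in steps S2.1 and S2.2 amounts to one gradient-descent step. A minimal sketch, with an illustrative quadratic loss standing in for the client's real objective F(w, D^(i)) (which the patent does not fix), might look like:

```python
# Sketch of the client-side update w'_i = w_i - eta * g_i from step S2.2.
# The quadratic loss sum_j (w_j - 3)^2 is an assumed stand-in for F(w, D^(i)).
def local_gradient(w):
    return [2.0 * (wj - 3.0) for wj in w]      # gradient of sum_j (w_j - 3)^2

def local_update(w, eta=0.25):
    g = local_gradient(w)                               # S2.1: compute gradient
    return [wj - eta * gj for wj, gj in zip(w, g)]      # S2.2: descent step
```

At the minimizer the gradient vanishes and the update leaves the parameters unchanged.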
In one embodiment, step S3 includes:
s3.1: the federated learning client obtains the first and second local model parameters according to the following formulas:

w_i^(1) = w'_i − r_i
w_i^(2) = r_i

where w_i^(1) denotes the first local model parameter, w_i^(2) denotes the second local model parameter, w'_i denotes the new local model parameter, and r_i is random noise generated by federated learning client P_i itself;
s3.2: the first local model parameter is uploaded to the Byzantine detection server TP, and the second local model parameter is uploaded to the aggregation server SP.
In one embodiment, step S4 includes:
s4.1: the Byzantine detection server TP sums the received w_i^(1) values to obtain z_1, and the aggregation server SP sums the received w_i^(2) values to obtain z_2; TP then transmits z_1 to SP;
s4.2: the aggregation server SP computes this round's model parameters as

w̄ = (z_1 + z_2) / n

where w̄ denotes this round's model parameters and n is the number of clients selected in this round; SP also computes an intermediate value A and transmits it to TP;
s4.3: using A, the Byzantine detection server TP computes the distance

s_i = ‖w'_i − w̄‖

between each client's uploaded model parameters and this round's model parameters, and then computes the median of these distances;
s4.4: the Byzantine detection server TP orders the parameters by the difference between s_i and the median, from smallest to largest, and selects the first k model parameters as quasi-aggregation parameters, where k is a preset integer.
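The selection rule of steps S4.3 and S4.4 can be sketched on plaintext vectors as below. Note two assumptions: the Euclidean norm (the patent's exact distance formula appears only as images), and the fact that in the real scheme this distance is computed cooperatively over the shares via the intermediate value A, whereas here it is shown directly for clarity:

```python
import math

# Hedged sketch of steps S4.3-S4.4: compute each client's distance to the
# round mean, then keep the k clients whose distances are closest to the
# median distance. Outliers far from the median are filtered out.
def select_quasi_aggregation(params, k):
    n, dim = len(params), len(params[0])
    mean = [sum(p[j] for p in params) / n for j in range(dim)]   # round mean
    dists = [math.dist(p, mean) for p in params]                 # s_i values
    med = sorted(dists)[n // 2]                                  # median distance
    order = sorted(range(n), key=lambda i: abs(dists[i] - med))
    return sorted(order[:k])                                     # kept client indices
```

A client pushing an extreme update sits far from the median distance and is excluded from the quasi-aggregation set.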
In one embodiment, step S5 includes:
s5.1: the Byzantine detection server TP computes the first global parameter from the selected quasi-aggregation parameters:

W_1 = (1/k) Σ_{j∈S} w_j^(1)

where k is the preset integer, S is the index set of the k selected clients, and W_1, the first global parameter, is obtained by summing the first secret shares of the selected k quasi-aggregation parameters;
s5.2: the aggregation server SP computes the second global parameter from the selected quasi-aggregation parameters:

W_2 = (1/k) Σ_{j∈S} w_j^(2)

where W_2, the second global parameter, is obtained by summing the second secret shares of the selected k quasi-aggregation parameters;
s5.3: the Byzantine detection server TP sends W_1 to the aggregation server SP, and SP computes W = W_1 + W_2; W is the global model training result, i.e., the model parameters obtained by this round of training.
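Under the same assumed additive share form, step S5's cooperative aggregation reduces to each server averaging its selected shares and SP summing the two partial results; the division by k (averaging rather than plain summation) is an assumed reading of the formulas:

```python
# Hedged sketch of step S5: TP averages the selected first shares (W1), SP
# averages the selected second shares (W2), and SP outputs W = W1 + W2.
def aggregate(shares_tp, shares_sp, selected):
    k = len(selected)
    dim = len(shares_tp[0])
    w1 = [sum(shares_tp[i][j] for i in selected) / k for j in range(dim)]  # at TP
    w2 = [sum(shares_sp[i][j] for i in selected) / k for j in range(dim)]  # at SP
    return [a + b for a, b in zip(w1, w2)]                                 # W at SP
```

Because each client's two shares sum to its true parameters, W equals the average of the selected clients' plaintext parameters, even though neither server ever sees an individual client's parameters.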
One or more technical solutions in the embodiments of the present application have at least the following technical effects:
First, the aggregation server issues initial global model parameters to the federated learning clients; the clients then perform multiple rounds of iterative training with their local data sets and the initial global model parameters to obtain the current round's training results; next, each client secret-shares its training result and uploads the shares to the two different servers; finally, the two servers cooperatively detect Byzantine nodes to obtain quasi-aggregation parameters and, on that basis, cooperatively aggregate the model to obtain the current round's global model. This process iterates until an optimal solution is trained. Through the dual-server architecture, the method defends against Byzantine nodes while protecting data privacy, with low computation and communication overhead; it is a lightweight, secure method that can solve the collaborative training problem in edge computing scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a privacy protection federal learning method under an edge-oriented computing scenario according to an embodiment of the present invention;
fig. 2 is a scene schematic diagram of a federal learning method according to an embodiment of the present invention.
Detailed Description
The invention provides a lightweight privacy-preserving federated learning method for edge computing scenarios that uses a dual-server architecture (an aggregation server SP and a Byzantine detection server TP) to perform model aggregation and achieve Byzantine robustness, defending against Byzantine nodes while protecting data privacy, with the technical effect of low computation and communication overhead.
In order to achieve the technical effects, the main concept of the invention is as follows:
firstly, an aggregation server issues initial model parameters to a federal learning client; secondly, the client side carries out repeated iterative training by using the local data set and the initial parameters, and obtains the training result of the current round; then, the client carries out secret sharing processing on the training results and uploads the training results to different servers respectively; and finally, the double servers carry out cooperative Byzantine node detection to obtain quasi-aggregation parameters, and carry out cooperative model aggregation on the basis to obtain the training result of the global model of the current round. The above process is iterated continuously until an optimal solution is trained.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention provides a privacy-preserving federated learning method for edge computing scenarios, applied to a framework comprising an aggregation server SP, a Byzantine detection server TP, and federated learning clients; the method comprises the following steps:
s1: the aggregation server SP and the federal learning client side carry out synchronization of initial global model parameters;
s2: the federated learning client performs iterative training by using the corresponding local data set and the initial global model parameters to obtain new local model parameters;
s3: the federated learning client obtains a first local model parameter and a second local model parameter based on the new local model parameter, and, in a secret-sharing manner, uploads the first local model parameter to the Byzantine detection server TP and the second local model parameter to the aggregation server SP, wherein the first and second local model parameters are a secret sharing of the new local model parameter, which can be recovered from the two shares;
s4: the aggregation server SP and the Byzantine detection server TP carry out cooperative Byzantine node detection based on the first local model parameter and the second local model parameter to obtain a quasi-aggregation parameter;
s5: and the aggregation server SP and the Byzantine detection server TP carry out cooperative model aggregation based on the quasi-aggregation parameters to obtain a global model training result.
Referring to fig. 1 and fig. 2, fig. 1 is a flowchart of a privacy protection federal learning method in an edge-oriented computing scenario provided in an embodiment, and fig. 2 is a scenario diagram of the federal learning method in the embodiment of the present invention.
In federal learning, there are multiple clients (e.g., edge nodes, internet of things devices, and smartphones) and one service provider called an aggregator. Specifically, each participant has a local data set and trains a model locally, and then only exchanges intermediate parameters with the aggregator, not privacy-sensitive training data. The aggregator then aggregates the model parameters of the different participants. In this way, the service provider can perform model training without transmitting training data.
In fig. 2, each federal learning client may be a smart phone, an automobile, or the like, the locally trained model is a local model, and the aggregation server SP and the byzantine detection server TP aggregate the local models to obtain a global model.
Specifically, step S1 is synchronization of the initial global model parameters, and the aggregation server issues the initial global model parameters to the federal learning client; step S2, the federal learning client side carries out a plurality of times of iterative training by using the local data set and the initial global model parameters, and obtains the training result of the current round; then, in step S3, the client performs secret sharing processing on the training result, and uploads the training result to different servers (SP and TP), respectively; and finally, the double servers carry out cooperative Byzantine node detection to obtain quasi-aggregation parameters, and carry out cooperative model aggregation on the basis to obtain the training result of the global model of the current round.
The above process (steps S1 to S5) is iterated until the optimal solution is trained, i.e., the model is saved after all rounds of training are completed. Through the dual-server architecture, the method addresses the problems that most existing federated learning schemes incur excessive overhead, have limited practicality, and struggle to achieve model robustness and privacy protection at the same time.
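The iterated S1-S5 loop can be exercised end to end on a toy scalar problem. Everything here is an illustrative assumption: the quadratic loss (w − 3)², the share form w^(1) = w' − r and w^(2) = r, and the absence of Byzantine clients (so no share is filtered out in S4):

```python
import random

# Toy end-to-end run of the iterated S1-S5 loop on a single scalar parameter.
def train(rounds=50, n_clients=4, eta=0.2, seed=1):
    rng = random.Random(seed)
    w = 10.0                                      # S1: initial global parameter
    for _ in range(rounds):
        shares_tp, shares_sp = [], []
        for _ in range(n_clients):
            w_local = w - eta * 2.0 * (w - 3.0)   # S2: one local gradient step
            r = rng.uniform(-50.0, 50.0)          # S3: client blinding noise
            shares_tp.append(w_local - r)         #     first share, to TP
            shares_sp.append(r)                   #     second share, to SP
        z1, z2 = sum(shares_tp), sum(shares_sp)   # S4: per-server share sums
        w = (z1 + z2) / n_clients                 # S5: recombined global model
    return w
```

Despite large per-client noise, the blinding cancels every round and the global parameter converges to the loss minimum at w = 3.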
In one embodiment, step S1 includes:
s1.1: federated learning client P_i accesses the federated learning training network and sends its identity id to the aggregation server SP, where P_i denotes the i-th federated learning client and i is the client's index;
s1.2: the aggregation server SP issues the initial global model parameters to the corresponding federal learning client according to the identity id;
s1.3: and the federated learning client receives the corresponding initial global model parameters to realize parameter synchronization.
In one embodiment, step S2 includes:
s2.1: federated learning client P_i computes a gradient using the received initial global model parameters and its local data set:

g_i = ∇F(w, D^(i))

where D^(i) is federated learning client P_i's local data set, w denotes the initial global model parameters, g_i is the gradient obtained by training on data set D^(i), and ∇ denotes the gradient operator;
s2.2: federated learning client P_i updates its original local model parameters using the learning rate and the gradient: w'_i = w_i − η·g_i, where w_i denotes the original local model parameters, η is the learning rate, and w'_i denotes the new local model parameters.
In one embodiment, step S3 includes:
s3.1: the federated learning client obtains the first and second local model parameters according to the following formulas:

w_i^(1) = w'_i − r_i
w_i^(2) = r_i

where w_i^(1) denotes the first local model parameter, w_i^(2) denotes the second local model parameter, w'_i denotes the new local model parameter, and r_i is random noise generated by federated learning client P_i itself;
s3.2: the first local model parameter is uploaded to the Byzantine detection server TP, and the second local model parameter is uploaded to the aggregation server SP.
Specifically, w_i^(1) and w_i^(2) are secret shares of w'_i, and w'_i can be recovered from w_i^(1) and w_i^(2).
In this embodiment, the federated learning client generates a random number (random noise) and computes the two shares of w'_i from it. In this way, when the servers sum the received w_i^(1) and w_i^(2) values during aggregation (direct summation), the random numbers cancel out, and the servers obtain the sum of the local parameters uploaded by the clients (i.e., the global parameter) without learning any individual client's parameters.
Compared with prior-art hierarchical federated learning methods that apply differential privacy, this method protects data privacy mainly through a blinding idea: a blinding factor (random noise) is added to the uploaded parameters and then cancelled out during dual-server aggregation. Because no differential-privacy noise is injected into the local training parameters, the model's accuracy is unaffected. Differential privacy, by contrast, has the drawback of degrading the model's prediction accuracy, so training accuracy cannot be guaranteed.
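A quick numeric check of this cancellation (share form w^(1) = w' − r, w^(2) = r and all values assumed for illustration):

```python
import random

# Summing what the two servers receive cancels every client's noise r_i,
# yielding the exact sum of local parameters, while each server individually
# sees only heavily blinded values.
rng = random.Random(42)
true_params = [1.5, -0.5, 2.0]                           # three clients' w'_i
noise = [rng.uniform(-1000.0, 1000.0) for _ in true_params]
shares_tp = [w - r for w, r in zip(true_params, noise)]  # held by TP
shares_sp = list(noise)                                  # held by SP
joint_sum = sum(shares_tp) + sum(shares_sp)              # equals sum(true_params)
```

Even with blinding factors three orders of magnitude larger than the parameters, the joint sum matches the true total up to floating-point rounding.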
In addition, beyond robustness against data pollution, the technical solution of the invention focuses on protecting privacy in federated learning. Prior-art schemes do not consider data privacy and model robustness simultaneously during training; the invention proposes a dual-server architecture to manage the model training process, ensuring the participants' data privacy while preventing the global model from being influenced by malicious attacks.
In one embodiment, step S4 includes:
s4.1: the Byzantine detection server TP sums the received w_i^(1) values to obtain z_1, and the aggregation server SP sums the received w_i^(2) values to obtain z_2; TP then transmits z_1 to SP;
s4.2: the aggregation server SP computes this round's model parameters as

w̄ = (z_1 + z_2) / n

where w̄ denotes this round's model parameters and n is the number of clients selected in this round; SP also computes an intermediate value A and transmits it to TP;
s4.3: using A, the Byzantine detection server TP computes the distance

s_i = ‖w'_i − w̄‖

between each client's uploaded model parameters and this round's model parameters, and then computes the median of these distances;
s4.4: the Byzantine detection server TP orders the parameters by the difference between s_i and the median, from smallest to largest, and selects the first k model parameters as quasi-aggregation parameters, where k is a preset integer.
Specifically, the purpose of the intermediate value A is to enable the distance computation in step S4.3: A aggregates, on TP's behalf, the quantities that depend on the second shares held by SP. SP obtains A by subtracting the relevant second-share sum from this round's model parameters and then sends A to TP, so that TP, which holds only the first shares w_i^(1), can directly evaluate each distance s_i. In step S4.3, the distance refers to the distance between each client's uploaded model parameters and this round's model parameters. The value of k can be preset according to the actual situation.
It should be noted that the above process (steps S1 to S5) is iterated continuously until an optimal solution is trained; the criterion for ending training is reaching a set number of training rounds or an accuracy threshold, either of which may be set according to the actual situation.
In one embodiment, step S5 includes:
s5.1: the Byzantine detection server TP computes the first global parameter from the selected quasi-aggregation parameters:

W_1 = (1/k) Σ_{j∈S} w_j^(1)

where k is the preset integer, S is the index set of the k selected clients, and W_1, the first global parameter, is obtained by summing the first secret shares of the selected k quasi-aggregation parameters;
s5.2: the aggregation server SP computes the second global parameter from the selected quasi-aggregation parameters:

W_2 = (1/k) Σ_{j∈S} w_j^(2)

where W_2, the second global parameter, is obtained by summing the second secret shares of the selected k quasi-aggregation parameters;
s5.3: the Byzantine detection server TP sends W_1 to the aggregation server SP, and SP computes W = W_1 + W_2; W is the global model training result, i.e., the model parameters obtained by this round of training.
Compared with the prior art, the invention has the beneficial effects that:
(1) Unlike other prior-art federated learning methods, this method does not use complex techniques such as homomorphic encryption, secure multi-party computation, or differential privacy, which effectively reduces the computation and communication overhead of a secure federated learning scheme.
(2) Existing federated learning methods rarely consider data privacy and model robustness in the training process at the same time. The invention proposes a dual-server architecture to manage the model training process, preventing the global model from being influenced by malicious attacks while ensuring the participants' data privacy.
(3) The method can be used for collaborative training in the edge computing scene, the privacy of the edge computing node is protected, the dynamic exit of the edge node is supported, and the actual use requirement is met.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. A privacy-preserving federated learning method for edge computing scenarios, applied to a framework comprising an aggregation server SP, a Byzantine detection server TP, and federated learning clients, the method comprising the following steps:
s1: the aggregation server SP and the federal learning client side carry out synchronization of initial global model parameters;
s2: the federated learning client performs iterative training by using the corresponding local data set and the initial global model parameters to obtain new local model parameters;
s3: the federated learning client obtains a first local model parameter and a second local model parameter based on the new local model parameter, and, in a secret-sharing manner, uploads the first local model parameter to the Byzantine detection server TP and the second local model parameter to the aggregation server SP, wherein the first and second local model parameters are a secret sharing of the new local model parameter, which can be recovered from the two shares;
s4: the aggregation server SP and the Byzantine detection server TP carry out cooperative Byzantine node detection based on the first local model parameter and the second local model parameter to obtain a quasi-aggregation parameter;
s5: and the aggregation server SP and the Byzantine detection server TP carry out cooperative model aggregation based on the quasi-aggregation parameters to obtain a global model training result.
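The round described by steps S1–S5 can be simulated end to end. Below is a minimal single-process sketch in NumPy; the least-squares loss, client count, noise scale, and the planted Byzantine client are all illustrative assumptions, not the patent's implementation, and the two servers are simulated in one process.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_clients, k = 3, 5, 3          # model size, clients this round, keep-k

# S1: SP synchronizes the initial global model with every client
w_global = np.zeros(d)

# S2: each client trains locally (one toy gradient step on private data);
# client 4 is Byzantine and uploads garbage
locals_ = []
for i in range(n_clients):
    X = rng.normal(size=(16, d)); y = X @ np.array([1.0, -2.0, 0.5])
    g = X.T @ (X @ w_global - y) / len(y)
    locals_.append(w_global - 0.1 * g)
locals_[4] = np.array([50.0, 50.0, 50.0])      # Byzantine update

# S3: additive secret sharing — share1 goes to TP, share2 to SP
noise = [rng.normal(size=d) for _ in range(n_clients)]
share1 = [w - r for w, r in zip(locals_, noise)]   # held by TP
share2 = noise                                     # held by SP

# S4: cooperative Byzantine detection: round mean from the share sums,
# then keep the k updates whose distance to the mean is closest to the
# median distance
w_bar = (sum(share1) + sum(share2)) / n_clients
dist = np.array([np.linalg.norm(a + b - w_bar)
                 for a, b in zip(share1, share2)])
keep = np.argsort(np.abs(dist - np.median(dist)))[:k]

# S5: each server averages its shares of the kept clients; W = W1 + W2
W1 = sum(share1[i] for i in keep) / k
W2 = sum(share2[i] for i in keep) / k
W = W1 + W2
```

In this toy run the Byzantine client's distance to the round mean is far from the median, so it is excluded from the k kept updates, and W equals the plain average of the surviving clients' parameters.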
2. The privacy-preserving federated learning method as claimed in claim 1, wherein step S1 comprises:
s1.1: federated learning client P_i accesses the federated learning training network and sends its identity id to the aggregation server SP, where P_i denotes the i-th federated learning client and i is the client index;
s1.2: the aggregation server SP issues the initial global model parameters to the corresponding federated learning client according to the identity id;
s1.3: the federated learning client receives the corresponding initial global model parameters, completing parameter synchronization.
3. The privacy-preserving federated learning method as claimed in claim 1, wherein step S2 comprises:
s2.1: federated learning client P_i computes a gradient using the received initial global model parameters and its local dataset, the calculation formula being
g_i = ∇_w F(w, D^(i)),
where D^(i) is the local dataset of federated learning client P_i, w denotes the initial global model parameters, F is the local loss function, g_i is the gradient obtained by P_i training on dataset D^(i), and ∇_w denotes the gradient with respect to w;
s2.2: federated learning client P_i updates its original local model parameters according to the learning rate and the gradient via w'_i = w_i − η·g_i, where w_i are the original local model parameters, η is the learning rate, and w'_i are the new local model parameters.
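Steps S2.1–S2.2 can be sketched as a plain gradient step. A minimal illustration in NumPy, assuming a least-squares loss (the claim does not fix a particular loss function, and all names here are illustrative):

```python
import numpy as np

def local_update(w, X, y, eta=0.1):
    """One local iteration for client P_i: gradient g_i of a mean-squared
    loss F on the private dataset D^(i), then the update w'_i = w_i - eta*g_i."""
    g = X.T @ (X @ w - y) / len(y)      # g_i = grad_w F(w, D^(i))
    return w - eta * g                  # w'_i

# toy run: repeated local updates recover the generating weights
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = np.zeros(3)
for _ in range(300):
    w = local_update(w, X, y)
```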
4. The privacy-preserving federated learning method as claimed in claim 1, wherein step S3 comprises:
s3.1: the federated learning client computes the first and second local model parameters according to the following formulas,
w_i^(1) = w'_i − r_i,
w_i^(2) = r_i,
where w_i^(1) is the first local model parameter, w_i^(2) is the second local model parameter, w'_i are the new local model parameters, and r_i is random noise generated by federated learning client P_i itself;
s3.2: the first local model parameters are uploaded to the Byzantine detection server TP and the second local model parameters are uploaded to the aggregation server SP.
5. The privacy-preserving federated learning method as claimed in claim 1, wherein step S4 comprises:
s4.1: the Byzantine detection server TP sums the received w_i^(1) values to obtain z_1, the aggregation server SP sums the received w_i^(2) values to obtain z_2, and the Byzantine detection server TP then transmits z_1 to the aggregation server SP;
s4.2: the aggregation server SP computes the model parameters of the current round by the formula
w̄ = (z_1 + z_2) / n,
where n is the number of clients selected in the current round, then computes an intermediate value A and transmits it to the Byzantine detection server TP;
s4.3: the Byzantine detection server TP computes, by the formula
s_i = ‖w'_i − w̄‖,
the distance between each client's uploaded model parameters and the current round's model parameters, and on this basis computes the median of these distances;
s4.4: the Byzantine detection server TP selects, in ascending order of the difference between s_i and the median, k model parameters as the quasi-aggregation parameters, where k is a preset integer.
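The median-distance filter of step S4 can be sketched as follows. This is a simplified single-process illustration: in the protocol TP holds the first shares, SP holds the second shares, and only the sums and an intermediate value are exchanged; here both share lists are visible to one function for clarity, and all names are illustrative.

```python
import numpy as np

def select_quasi_aggregation(share1, share2, k):
    """Returns the indices of the k quasi-aggregation parameters: those
    whose distance to the round mean is closest to the median distance."""
    n = len(share1)
    z1, z2 = sum(share1), sum(share2)                  # S4.1: per-server sums
    w_bar = (z1 + z2) / n                              # S4.2: round parameters
    dist = np.array([np.linalg.norm(a + b - w_bar)     # S4.3: distances s_i
                     for a, b in zip(share1, share2)])
    med = np.median(dist)
    return sorted(np.argsort(np.abs(dist - med))[:k].tolist())  # S4.4

# toy check: one extreme outlier among four clients is filtered out
rng = np.random.default_rng(2)
updates = [np.array([1.0, 1.0]), np.array([1.1, 0.9]),
           np.array([0.9, 1.1]), np.array([100.0, -100.0])]
noise = [rng.normal(size=2) for _ in updates]
s1 = [u - r for u, r in zip(updates, noise)]
s2 = noise
chosen = select_quasi_aggregation(s1, s2, k=3)
```

The outlier's distance to the round mean deviates far more from the median than the honest clients' distances do, so only the three honest indices are chosen.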
6. The privacy-preserving federated learning method as claimed in claim 1, wherein step S5 comprises:
s5.1: the Byzantine detection server TP computes a first global parameter from the selected quasi-aggregation parameters,
W_1 = (1/k) Σ_{i∈S} w_i^(1),
where k is a preset integer, S is the set of k selected clients, the summation runs over the first secret shares of the selected k quasi-aggregation parameters, and W_1 is the first global parameter;
s5.2: the aggregation server SP computes a second global parameter from the selected quasi-aggregation parameters,
W_2 = (1/k) Σ_{i∈S} w_i^(2),
where W_2 is the second global parameter and the summation runs over the second secret shares of the selected k quasi-aggregation parameters;
s5.3: the Byzantine detection server TP sends W_1 to the aggregation server SP, and SP computes W = W_1 + W_2, where W, the global model training result, is the model parameters obtained by the current round of training.
CN202210157685.5A 2022-02-21 2022-02-21 Privacy protection federation learning method oriented to edge computing scene Active CN114595830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210157685.5A CN114595830B (en) 2022-02-21 2022-02-21 Privacy protection federation learning method oriented to edge computing scene

Publications (2)

Publication Number Publication Date
CN114595830A true CN114595830A (en) 2022-06-07
CN114595830B CN114595830B (en) 2024-07-05

Family

ID=81805930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210157685.5A Active CN114595830B (en) 2022-02-21 2022-02-21 Privacy protection federation learning method oriented to edge computing scene

Country Status (1)

Country Link
CN (1) CN114595830B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117808082A (en) * 2024-02-29 2024-04-02 华侨大学 Privacy-preserving federated learning method, device, equipment and medium resistant to Byzantine attacks
CN118133328A (en) * 2024-05-10 2024-06-04 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Decentralized learning method, system and related equipment
CN118468041A (en) * 2024-07-11 2024-08-09 齐鲁工业大学(山东省科学院) Federated learning Byzantine node detection method and device based on generative adversarial networks, and computer-readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111600707A (en) * 2020-05-15 2020-08-28 华南师范大学 Decentralized federal machine learning method under privacy protection
CN113361694A (en) * 2021-06-30 2021-09-07 哈尔滨工业大学 Layered federated learning method and system applying differential privacy protection
CN113806768A (en) * 2021-08-23 2021-12-17 北京理工大学 Lightweight federated learning privacy protection method based on decentralized security aggregation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FANG Junjie; LEI Kai: "Survey of Blockchain Technology for Edge Artificial Intelligence Computing", Journal of Applied Sciences, no. 01, 30 January 2020 (2020-01-30) *

Also Published As

Publication number Publication date
CN114595830B (en) 2024-07-05

Similar Documents

Publication Publication Date Title
CN114595830B (en) Privacy protection federation learning method oriented to edge computing scene
Nguyen et al. Federated learning for industrial internet of things in future industries
Xiong et al. Toward lightweight, privacy-preserving cooperative object classification for connected autonomous vehicles
Wan et al. Reinforcement learning based mobile offloading for cloud-based malware detection
Liu et al. Privacy-preserving federated k-means for proactive caching in next generation cellular networks
CN112714106A (en) Block chain-based federal learning casual vehicle carrying attack defense method
CN109347829B (en) Group intelligence perception network truth value discovery method based on privacy protection
CN113065866B (en) Internet of things edge computing system and method based on block chain
Kefayati et al. Secure consensus averaging in sensor networks using random offsets
CN115841133A (en) Method, device and equipment for federated learning and storage medium
Yang et al. Efficient and secure federated learning with verifiable weighted average aggregation
CN112954680A (en) Tracing attack resistant lightweight access authentication method and system for wireless sensor network
CN113792890B (en) Model training method based on federal learning and related equipment
CN116340986A (en) Block chain-based privacy protection method and system for resisting federal learning gradient attack
Li et al. An Adaptive Communication‐Efficient Federated Learning to Resist Gradient‐Based Reconstruction Attacks
CN114760023A (en) Model training method and device based on federal learning and storage medium
CN117171814B (en) Federal learning model integrity verification method, system, equipment and medium based on differential privacy
Lyu et al. Secure and efficient federated learning with provable performance guarantees via stochastic quantization
Yin et al. Ginver: Generative model inversion attacks against collaborative inference
CN117609621A (en) Method for resource recommendation in multiple nodes
CN117113413A (en) Robust federal learning privacy protection system based on block chain
CN117349685A (en) Clustering method, system, terminal and medium for communication data
CN115510472B (en) Multi-difference privacy protection method and system for cloud edge aggregation system
CN116305186A (en) Security aggregation method with low communication overhead and decentralization
CN112183612B (en) Joint learning method, device and system based on parameter expansion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant