CN117574429A - Federal deep learning method for privacy enhancement in edge computing network - Google Patents

Federal deep learning method for privacy enhancement in edge computing network Download PDF

Info

Publication number
CN117574429A
CN117574429A (application number CN202311531854.8A)
Authority
CN
China
Prior art keywords
model
training
value
network
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311531854.8A
Other languages
Chinese (zh)
Inventor
汪晓丁
阙友雄
林晖
江水
镇子航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Normal University
Institute of Tropical Bioscience and Biotechnology Chinese Academy of Tropical Agricultural Sciences
Original Assignee
Fujian Normal University
Institute of Tropical Bioscience and Biotechnology Chinese Academy of Tropical Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Normal University, Institute of Tropical Bioscience and Biotechnology Chinese Academy of Tropical Agricultural Sciences filed Critical Fujian Normal University
Priority to CN202311531854.8A priority Critical patent/CN117574429A/en
Publication of CN117574429A publication Critical patent/CN117574429A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/098 Distributed learning, e.g. federated learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a federated deep learning method for privacy enhancement in an edge computing network. Selected clients alternately perform local training and model compression, sample their local data sets using custom labels to accelerate training, add local differential privacy noise to the model after training is completed, and send the model to a server. The server collects the models uploaded by the clients, performs an adaptive update, and sends the aggregated model back to each client. Learning stops when the model meets the requirements or the maximum number of training rounds is reached. On the premise of preserving model accuracy, the invention effectively reduces the privacy budget and solves the problem of privacy disclosure caused by inference attacks.

Description

Federated deep learning method for privacy enhancement in an edge computing network
Technical Field
The invention relates to the fields of differential privacy and federated learning in artificial intelligence, and in particular to a federated deep learning method for privacy enhancement in an edge computing network.
Background
With the rapid development of the Internet of Things, artificial intelligence, big data and related fields, more and more devices, sensors and end users generate massive amounts of data. Traditional cloud computing models face high latency, insufficient bandwidth, and security issues. Edge computing is an emerging computing architecture and model that deploys computing resources such as processing power, data storage, and network equipment at the network edge, closer to users, to achieve faster response and lower latency. It should be noted that edge computing is not a substitute for cloud computing but a complement to it; by combining edge computing and cloud computing, an enterprise can create an efficient and secure IT infrastructure. Edge computing suits industries that require fast data processing, such as healthcare, finance, and manufacturing. Applying federated learning in edge computing networks to enhance computing performance is very common.
Federated learning is a powerful distributed learning paradigm in the field of artificial intelligence that provides efficient model training and a degree of privacy protection. It involves two main participants: clients and a server. In federated learning, each client is responsible for training a local model starting from an initial model and transmitting the trained model to the server for aggregation. The server then returns the aggregated model to all clients as the new initial model. This process iterates until a convergence condition is reached. However, because conventional federated learning cannot fully guarantee the privacy and security of users, and resources in an edge network are often limited, combining conventional federated learning with edge computing may prevent the training process from completing correctly.
In federated learning, the server cannot access the data of any single client, and clients do not share data with one another. This inherent design provides a degree of privacy protection. However, with the development of artificial intelligence, attack strategies have also become intelligent and diversified. In inference attacks in particular, attackers use artificial intelligence techniques to reverse-engineer the model and infer sensitive information, which poses a threat to data privacy. This type of attack can leak data privacy while the model is uploaded from the client to the server: when an attacker intercepts an uploaded local model, they may access the client data embedded in it. Therefore, the conventional federated learning method cannot effectively protect the data privacy of the client.
Differential privacy is a privacy protection method first proposed by Dwork et al. (Differential privacy. 33rd International Colloquium on Automata, Languages and Programming, 2006). Its core idea is to protect personal data by adding random noise or perturbation, ensuring that individual information cannot be accurately traced in the results of data analysis or publication. Depending on whether the noise is added at the client or at the server, differential privacy is divided into local differential privacy and global differential privacy. Because local differential privacy can be trained and enforced at the client, it is better suited to federated learning. To address the privacy disclosure caused by inference attacks, local differential privacy has already been introduced into federated learning. However, combining differential privacy with federated learning raises another problem: the privacy budget grows with the size of the model and the number of communication rounds. Since the noise required by differential privacy is inversely proportional to the privacy budget, and the number of parameters in a neural network (NN) is large, adding noise to an NN model can significantly reduce model accuracy. Existing approaches mostly alleviate the privacy budget explosion by shrinking the local model. For example, Hu et al. (Federated learning with sparsification-amplified privacy and adaptive optimization. International Joint Conference on Artificial Intelligence, 2021) use model sparsification to reduce the size of the transmitted model; Liu et al. (FLAME: differentially private federated learning in the shuffle model. CoRR, 2020) address the privacy budget problem by shrinking the model through sampling of its parameters. However, these methods treat all model parameters uniformly and do not take the varying importance of the parameters into account. Therefore, designing a technique that balances model accuracy against the degree of compression is the technical problem to be solved.
Disclosure of Invention
The invention aims to provide a federated deep learning method for privacy enhancement in an edge computing network, in which the model is compressed while model accuracy is preserved and privacy disclosure is prevented.
The technical scheme adopted by the invention is as follows:
a federated deep learning method for privacy enhancement in an edge computing network, comprising the following steps:
S1, a server initializes a global model and distributes it to all clients; a subset of the clients is randomly selected, and each selected client updates the model with its local private data to obtain a local model;
S2, each client trains the model using a stochastic gradient descent algorithm, feeds the half-trained model into a compression algorithm based on deep reinforcement learning, and alternates training with compression; at the same time, each client samples its local data set through custom labels to accelerate training, until model accuracy and model size are balanced; after the client's training is completed, random noise is added to the trained model, which is then uploaded to the server;
S3, the server collects the models uploaded by the clients, updates them adaptively, and sends the aggregated model back to the clients.
Further, the specific process of distributing the model by the server in step S1 is as follows:
S10, the server first initializes a model θ_0 and sends it to n clients, where n is the total number of clients;
S11, in each round the server randomly selects k clients, where k is the number of clients selected by the server for local training and k ≤ n.
Further, the specific process of alternating local training and model compression in step S2 is as follows:
S20, parameter definition: the client trains the model with a gradient descent algorithm, and the deep-reinforcement-learning compression algorithm, which follows the deterministic policy gradient (actor-critic) framework, comprises an actor network π(·|θ^π), a critic network Q(·|θ^Q), a target actor network π(·|θ^π′), and a target critic network Q(·|θ^Q′); three basic elements are defined at the same time:
State: the state is defined on the basis of the CNN layers; each layer l is represented by 11 feature parameters, specifically:
S_l: (l, n, c, h, w, stride, k, FLOPs[l], reduced, rest, a(l-1)) (1);
where the kernel dimension is n×c×k×k and the input dimension is c×h×w; l is the layer index, n the number of input channels, c the number of output channels, k the convolution kernel size, and stride the sliding step of the convolution kernel over the input data; FLOPs[l] is the number of floating-point operations in layer l, reduced the number of FLOPs already reduced in the previous layers, rest the number of FLOPs remaining in the subsequent layers, and a(l-1) the action performed in the previous layer; the parameters are scaled to [0,1] before being passed to the agent (an illustrative sketch follows the reward definitions below);
Action: model compression is treated as an action performed in the continuous space a ∈ (0,1), enabling fine-grained, accurate compression;
Reward: a reward function sensitive to model accuracy is set; based on the error, the following reward functions, which decrease with the error and with log(FLOPs) and log(#Param), are designed:
R_FLOPs = -Error×log(FLOPs) (2);
R_Param = -Error×log(#Param) (3);
where FLOPs is the number of floating-point operations of the model, i.e., the total number of floating-point operations the model must execute during inference; #Param is the number of parameters of the model, i.e., the total number of weights and biases to be learned; Error is the error of the model, i.e., its performance error when executing the task; log(FLOPs) and log(#Param) denote the logarithms of the operation count and the parameter count, respectively;
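For illustration only, the following is a minimal Python sketch, not the patent's implementation, of how the 11-dimensional layer state of formula (1) and the rewards of formulas (2) and (3) could be assembled; the per-feature maxima used for scaling and the helper names are assumptions.

```python
import math

def layer_state(l, n, c, h, w, stride, k, flops, reduced, rest, prev_action, max_vals):
    """Assemble the 11-dimensional state of formula (1) for CNN layer l.

    max_vals holds assumed per-feature maxima used to scale every entry into
    [0, 1] before the state is passed to the agent.
    """
    raw = [l, n, c, h, w, stride, k, flops, reduced, rest, prev_action]
    return [v / m if m else 0.0 for v, m in zip(raw, max_vals)]

def reward_flops(error, flops):
    # Formula (2): R_FLOPs = -Error * log(FLOPs)
    return -error * math.log(flops)

def reward_params(error, num_params):
    # Formula (3): R_Param = -Error * log(#Param)
    return -error * math.log(num_params)
```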
S21, model compression: the actor network π(·|θ^π) learns the optimal policy π to obtain the most reasonable model compression ratio, taking the state as input and the action as output, giving formula (4):
π(s|θ^π) → a (4);
S22, local training: in the sampling process of the gradient descent algorithm, an action is selected in the current state s(t) based on the noise and the current policy; after the action is executed, the reward value is calculated and a new state s(t+1) is obtained; finally, the experience is stored in the experience pool;
S23, customizing labels and adjusting the sampling probability: the training speed is increased by adjusting the sampling probability, the sampling probability set for the majority classes in the samples being larger than that set for the minority classes, giving the following sampling probability formula:
where k denotes the client index, y_j denotes the label index, and N denotes the total number of label classes; n denotes the total number of labels of a given class, i denotes the i-th label class; j is the j-th label index of a given class;
S24, adding noise and uploading the model to the server: after the client's training is completed, random noise is added to the trained model, which is then uploaded to the server.
Further, in S21 the action is generated with a truncated normal function based on the Ornstein-Uhlenbeck process and noise generation:
π′(s_t) ~ TN(π(s(t)|θ^π_t), σ², 0, 1) (5);
where σ is the noise scale, which decays exponentially during the iterations after initialization.
Further, σ is initialized to 0.5 in step S21.
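Below is a minimal sketch of the truncated-normal action sampling of formula (5) using scipy.stats.truncnorm; the decay factor and the example actor output are assumptions, since the text only states that σ starts at 0.5 and decays during the iterations.

```python
from scipy.stats import truncnorm

def explore(mu, sigma, low=0.0, high=1.0):
    # Sample a ~ TN(mu, sigma^2, low, high): a truncated normal centred on the
    # actor output mu = pi(s(t)|theta^pi), restricted to the action range (0, 1).
    a, b = (low - mu) / sigma, (high - mu) / sigma
    return float(truncnorm.rvs(a, b, loc=mu, scale=sigma))

sigma = 0.5      # initial noise scale, as stated in the text
decay = 0.95     # hypothetical exponential decay factor (not specified in the text)
for step in range(3):
    action = explore(mu=0.6, sigma=sigma)  # mu = 0.6 is an illustrative actor output
    print(step, round(action, 3), round(sigma, 3))
    sigma *= decay
```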
Further, in step S22 an actor network and a critic network are trained simultaneously in the deep reinforcement learning, so as to maximize the cumulative reward and minimize the error between the evaluated value and the target value.
The loss function of the critic network is:
L(θ^Q) = (1/N) Σ_t [y(t) - Q(s(t), a_λ(t)|θ^Q)]² (6);
where Q(s(t), a_λ(t)|θ^Q) denotes the Q value of the predicted action a_λ(t) in the current state s(t) under the current Q-function parameters θ^Q; L(θ^Q) is the Q-learning loss function, minimized by adjusting the Q-function parameters θ^Q; N is the number of samples, i.e., the total number of samples summed over when computing the average; t is the time step, indexing the time-step sequence in reinforcement learning; y(t) is the target value, i.e., the target Q value in Q learning, equal to the sum of the current reward r(t) and the discounted Q value of the next state;
The Q value is calculated as follows:
Q(s(t), a_λ(t)) = E[r(s(t), a_λ(t)) + γQ(s(t+1), π(s(t+1)))] (7);
where r(s(t), a_λ(t)) denotes the immediate reward obtained after executing action a_λ(t) in the current state s(t); γ is the discount factor, i.e., the decay factor of future rewards, typically between 0 and 1; s(t+1) denotes the state at the next time step; π(s(t+1)) denotes the next action predicted according to policy π in state s(t+1);
y(t) is calculated as follows:
y(t) = r(t) + γQ′(s(t+1), π′(s(t+1)|θ^π′)|θ^Q′) (8);
where Q′(s(t+1), π′(s(t+1)|θ^π′)|θ^Q′) denotes the Q value corresponding to the next action π′(s(t+1)|θ^π′) predicted according to the target policy π′ in the next state s(t+1); θ^Q′ is the Q-function parameter of the target Q network; r(t) denotes the immediate reward obtained after the action is executed in the current state s(t);
The reward value is calculated from the current state value s(t) and the action value a(t), and Q′ is calculated by the target critic network based on the next state value s(t+1);
The loss function of the actor network is:
L(θ^π) = -(1/N) Σ_t Q(s(t), π(s(t)|θ^π)|θ^Q) (9);
The parameters of the target networks are updated by soft updating, using formula (10) and formula (11):
θ^Q′ ← τθ^Q + (1-τ)θ^Q′ (10);
θ^π′ ← τθ^π + (1-τ)θ^π′ (11).
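Below is a minimal numpy sketch of the target value of formula (8), the critic loss of formula (6), and the soft updates of formulas (10) and (11); the networks are stand-ins (plain callables and dictionaries of arrays), and the discount factor and soft-update rate are illustrative defaults rather than values taken from the text.

```python
import numpy as np

def td_target(r_t, s_next, target_actor, target_critic, gamma=0.99):
    # Formula (8): y(t) = r(t) + gamma * Q'(s(t+1), pi'(s(t+1)|theta^pi') | theta^Q')
    a_next = target_actor(s_next)
    return r_t + gamma * target_critic(s_next, a_next)

def critic_loss(batch, critic, target_actor, target_critic, gamma=0.99):
    # Formula (6): mean squared error between the target y(t) and Q(s(t), a(t)|theta^Q)
    errors = [
        (td_target(r, s2, target_actor, target_critic, gamma) - critic(s, a)) ** 2
        for (s, a, r, s2) in batch
    ]
    return float(np.mean(errors))

def soft_update(target_params, online_params, tau=0.01):
    # Formulas (10) and (11): theta' <- tau * theta + (1 - tau) * theta'
    for name, theta in online_params.items():
        target_params[name] = tau * theta + (1.0 - tau) * target_params[name]
    return target_params
```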
further, the random noise in S24 satisfies a multidimensional gaussian normal distribution
By adopting the above technical scheme, the invention has the following beneficial effects: the invention provides a federated deep learning method for privacy enhancement in an edge computing network which, in a resource-constrained setting, compresses the model through deep reinforcement learning to reduce the communication cost of model uploading and reduces the privacy budget while ensuring model accuracy. At the same time, label sampling is used to accelerate model training, and Gaussian noise is added to the model to resist inference attacks.
Drawings
The invention is described in further detail below with reference to the drawings and the detailed description.
FIG. 1 is a general flow chart of a federated deep learning method for privacy enhancement in an edge computing network according to an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of the system model of the federated deep learning method for privacy enhancement in an edge computing network according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of model compression and training in the federated deep learning method for privacy enhancement in an edge computing network according to an embodiment of the present invention.
Description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in the embodiments of the present application.
As shown in FIGS. 1 to 3, the invention discloses a federated deep learning method for privacy enhancement in an edge computing network,
the method comprising the following steps:
S1, the server distributes a global model to all clients and randomly selects a subset of the clients; each selected client updates the model with its local private data to obtain a local model;
S2, each client trains the model using a stochastic gradient descent algorithm, feeds the half-trained model into a compression algorithm based on deep reinforcement learning (a deterministic policy gradient algorithm), and alternates training with compression; at the same time, each client samples its local data set through custom labels to accelerate training, until model accuracy and model size are balanced. After the client's training is completed, random noise is added to the trained model, which is then uploaded to the server;
S3, the server collects the models uploaded by the clients, updates them adaptively, and sends the aggregated model back to each client (a high-level sketch of one communication round follows).
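Below is a high-level Python sketch of one communication round of steps S1 to S3, assuming models are numpy arrays and that each client object exposes a hypothetical local_train_and_compress method standing in for step S2; the plain averaging shown for step S3 is a simplification of the adaptive server update described later.

```python
import numpy as np

def federated_round(global_model, clients, k, sigma):
    """One round: distribute the model, locally train/compress with noise, then aggregate."""
    selected = np.random.choice(len(clients), size=k, replace=False)   # S1: pick k of n clients
    uploads = []
    for idx in selected:
        local = global_model.copy()                                    # S1: start from the global model
        local = clients[idx].local_train_and_compress(local)           # S2: alternate training and compression (MCT)
        local = local + np.random.normal(0.0, sigma, size=local.shape) # S2: add Gaussian (local DP) noise
        uploads.append(local)
    return np.mean(uploads, axis=0)                                    # S3: aggregate (plain average for illustration)
```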
From the above description, the beneficial effects of the invention are as follows: the invention provides a federated deep learning method for privacy enhancement in an edge computing network which, in a resource-constrained setting, compresses the model through deep reinforcement learning to reduce the communication cost of model uploading and reduces the privacy budget while ensuring model accuracy. At the same time, label sampling is used to accelerate model training, and Gaussian noise is added to the model to resist inference attacks.
Further, the model distribution by the server in step S1 mainly includes the following steps:
S10, the server first initializes a model θ_0 and sends it to n clients,
where n is the total number of clients;
S11, in each round the server randomly selects k clients,
where k is the number of clients selected by the server for local training and k ≤ n;
further, the alternation of local training and model compression by the client in step S2 includes the following steps:
the alternating model compression and training procedure is referred to herein as Model Training while Compressing (MCT);
S20, parameter definition: the client trains the compression agent with a deterministic policy gradient algorithm. Because this algorithm is based on the actor-critic (AC) framework, it has an actor network and a critic network, as in the AC framework, together with the corresponding target networks. It therefore contains four networks: an actor network π(·|θ^π), a critic network Q(·|θ^Q), a target actor network π(·|θ^π′), and a target critic network Q(·|θ^Q′).
The following three basic elements are defined at the same time:
State: the local model considered by the invention is a convolutional neural network (CNN), which the MCT algorithm compresses layer by layer. This means the state is defined on the basis of the CNN layers. For each layer l, 11 feature parameters are used to represent the state:
S_l: (l, n, c, h, w, stride, k, FLOPs[l], reduced, rest, a(l-1)) (1);
where the kernel dimension is n×c×k×k and the input dimension is c×h×w; FLOPs[l] is the number of floating-point operations in layer l, reduced the number of FLOPs already reduced in the previous layers, rest the number of FLOPs remaining in the subsequent layers, and a(l-1) the action performed in the previous layer. These parameters are scaled to [0,1] before being transmitted to the agent;
Action: model compression is treated as an action performed in the continuous space a ∈ (0,1), which enables fine-grained, accurate compression;
Reward: in order to compress the model as much as possible while maintaining model accuracy, a reward function sensitive to model accuracy is set. The following reward functions, which decrease with the error and with log(FLOPs) and log(#Param), are designed:
R_FLOPs = -Error×log(FLOPs) (2);
R_Param = -Error×log(#Param) (3);
S21, model compression: MCT learns the optimal policy π through the actor network π(·|θ^π) to obtain the most reasonable model compression ratio, taking the state as input and the action as output, giving formula (4):
π(s|θ^π) → a (4);
To search the compression ratio more effectively, the invention generates the action with a truncated normal function based on the Ornstein-Uhlenbeck process and noise generation:
π′(s_t) ~ TN(π(s(t)|θ^π_t), σ², 0, 1) (5);
where σ is the noise scale, initialized to 0.5 and decayed exponentially during the iterations.
S22, local training: in the sampling process of the deterministic policy gradient algorithm, an action is selected in the current state s(t) based on the noise and the current policy; after the action is executed, the reward value is calculated and a new state s(t+1) is obtained; finally, the experience is stored in the experience pool.
An actor network and a critic network are trained simultaneously in the deep reinforcement learning to maximize the cumulative reward and minimize the error between the evaluated value and the target value. For the critic network, the loss function is given by formula (6).
The Q value in formula (6) is calculated as in formula (7):
Q(s(t), a_λ(t)) = E[r(s(t), a_λ(t)) + γQ(s(t+1), π(s(t+1)))] (7);
y(t) in formula (6) is calculated as in formula (8):
y(t) = r(t) + γQ′(s(t+1), π′(s(t+1)|θ^π′)|θ^Q′) (8);
The reward value can be calculated from the current state value s(t) and the action value a(t), and Q′ is calculated by the target critic network based on the next state value s(t+1). For the actor network, the loss function is given by formula (9).
The parameters in the target networks are updated by soft updates, using formula (10) and formula (11):
θ^Q′ ← τθ^Q + (1-τ)θ^Q′ (10);
θ^π′ ← τθ^π + (1-τ)θ^π′ (11);
FIG. 3 is a schematic diagram of the model compression and training process in the federated deep learning method for privacy enhancement in an edge computing network according to an embodiment of the present invention. The agent first receives the state value s(t) of layer L(t) from the environment as input and then outputs a sparsity ratio as the action. The channels of that layer are pruned according to the action, and the target channel number is rounded to the nearest feasible value. The agent then moves to the next layer L(t+1) and continues to receive the state value s(t+1). This is repeated until the last layer L(T) is finished; finally, the reward, namely the accuracy evaluated on the validation set, is returned to the agent.
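Below is a minimal sketch of this layer-by-layer interaction; the agent, evaluate, and prune_layer objects are placeholders for the environment described above, not the patent's code.

```python
def compress_episode(layers, agent, evaluate, prune_layer):
    """One MCT episode: one action (sparsity ratio) per CNN layer, reward at the end."""
    transitions = []
    for t, layer in enumerate(layers):
        s_t = layer.state()           # state s(t) of layer L(t), as in formula (1)
        a_t = agent.act(s_t)          # sparsity ratio in (0, 1)
        prune_layer(layer, a_t)       # prune channels; round to the nearest feasible count
        transitions.append((s_t, a_t))
    reward = evaluate()               # accuracy-based reward evaluated on the validation set
    agent.store(transitions, reward)  # feed the finished episode back to the agent
    return reward
```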
S23, customizing a label and adjusting sampling probability
In steps S20 to S22 above, the local model is compressed using deep reinforcement learning, which takes a certain amount of time and may lead to a long training period for FED-PEMC. The invention therefore accelerates training by adjusting the sampling probability: a larger sampling probability is set for the majority classes in the samples and a smaller one for the minority classes, giving the sampling probability formula (12) (an illustrative sampling sketch follows the term definitions below).
where k denotes the client index, y_j denotes the label index, and N denotes the total number of label classes;
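The exact sampling-probability formula (12) is not reproduced in this text; the sketch below only illustrates the idea of drawing local samples with per-label probabilities, assuming, for illustration only, weights proportional to label frequency so that majority classes are sampled more often.

```python
import numpy as np

def sample_indices(labels, batch_size, rng=np.random.default_rng()):
    """Draw sample indices with label-dependent probabilities (assumed frequency weights)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes, counts / counts.sum()))   # assumed frequency-proportional class weights
    p = np.array([freq[y] for y in labels], dtype=float)
    p /= p.sum()                                       # per-sample probability, larger for majority labels
    return rng.choice(len(labels), size=batch_size, replace=True, p=p)
```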
and S24, after the client training is completed, adding a random noise uploading server into the trained model.
Wherein the random noise satisfies a multidimensional Gaussian normal distribution
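Below is a minimal sketch of step S24, adding zero-mean multidimensional Gaussian noise to the model parameters before upload; the noise scale sigma would in practice be derived from the desired local differential privacy guarantee, which the text does not specify, so here it is a free parameter.

```python
import numpy as np

def add_gaussian_noise(model_params, sigma, rng=np.random.default_rng()):
    # Perturb every weight tensor with i.i.d. N(0, sigma^2) noise before uploading.
    return {name: w + rng.normal(0.0, sigma, size=w.shape)
            for name, w in model_params.items()}
```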
In this embodiment, compression and training are performed alternately to find the balance point between model accuracy and model size; label sampling accelerates model training, and the random noise prevents inference attacks.
Wherein, step S3 further comprises:
the server side adopts Adam algorithm to adaptively change learning rate to optimize training process.
Further, the step S3 specifically includes:
adam's algorithm is an extension of the random gradient algorithm, which keeps the learning rate unchanged, and optimizes the training process by continuously adaptively changing the learning rate during the training process. In edge computing networks, client resources are limited, and Adam's algorithm is not an optimal choice for local training. Therefore, the invention is trained by the Adam algorithm at the server side. The server side keeps updating the two momentum vectors u, m e Rd once per round. Especially, in the t-th round, after the client finishes local training and uploads the local model, the server updates the global model according to the formula (13)
Wherein alpha is 1 And alpha 2 Is a momentum parameter of the fluid,η g is the global learning rate, λ is the adaptation;
from the above, adaptive update is realized at the server side through Adam algorithm.
In summary, the invention has the following beneficial effects. Protecting the data privacy of federated learning clients: noise is added only to the compressed model, which alleviates the privacy budget explosion caused by an excessive number of model parameters. Guaranteeing accuracy during model compression: the model is compressed with deep reinforcement learning while it is being trained, and custom label sampling accelerates the training process. Compressing the model also improves the communication efficiency of federated learning.
It will be apparent that the embodiments described are some, but not all, of the embodiments of the present application. Embodiments and features of embodiments in this application may be combined with each other without conflict. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

Claims (7)

1. A federated deep learning method for privacy enhancement in an edge computing network, characterized in that it comprises the following steps:
S1, a server initializes a global model and distributes it to all clients; a subset of the clients is randomly selected, and each selected client updates the model with its local private data to obtain a local model;
S2, each client trains the model using a stochastic gradient descent algorithm, feeds the half-trained model into a compression algorithm based on deep reinforcement learning, and alternates training with compression; at the same time, each client samples its local data set through custom labels to accelerate training, until model accuracy and model size are balanced; after the client's training is completed, random noise is added to the trained model, which is then uploaded to the server;
S3, the server collects the models uploaded by the clients, updates them adaptively, and sends the aggregated model back to the clients.
2. The federated deep learning method for privacy enhancement in an edge computing network of claim 1, wherein the specific process of distributing the model by the server in step S1 is as follows:
S10, the server first initializes a model θ_0 and sends it to n clients, where n is the total number of clients;
S11, in each round the server randomly selects k clients, where k is the number of clients selected by the server for local training and k ≤ n.
3. The federated deep learning method for privacy enhancement in an edge computing network of claim 2, wherein the specific process of alternating local training and model compression in step S2 is as follows:
S20, parameter definition: the client trains the model with a gradient descent algorithm, and the deep-reinforcement-learning compression algorithm, which follows the deterministic policy gradient (actor-critic) framework, comprises an actor network π(·|θ^π), a critic network Q(·|θ^Q), a target actor network π(·|θ^π′), and a target critic network Q(·|θ^Q′); three basic elements are defined at the same time:
State: the state is defined on the basis of the CNN layers; each layer l is represented by 11 feature parameters, specifically:
S_l: (l, n, c, h, w, stride, k, FLOPs[l], reduced, rest, a(l-1)) (1);
where the kernel dimension is n×c×k×k and the input dimension is c×h×w; l is the layer index, n the number of input channels, c the number of output channels, k the convolution kernel size, and stride the sliding step of the convolution kernel over the input data; FLOPs[l] is the number of floating-point operations in layer l, reduced the number of FLOPs already reduced in the previous layers, rest the number of FLOPs remaining in the subsequent layers, and a(l-1) the action performed in the previous layer; the parameters are scaled to [0,1] before being passed to the agent;
Action: model compression is treated as an action performed in the continuous space a ∈ (0,1), enabling fine-grained, accurate compression;
Reward: a reward function sensitive to model accuracy is set; based on the error, the following reward functions, which decrease with the error and with log(FLOPs) and log(#Param), are designed:
R_FLOPs = -Error×log(FLOPs) (2);
R_Param = -Error×log(#Param) (3);
where FLOPs is the number of floating-point operations of the model, i.e., the total number of floating-point operations the model must execute during inference; #Param is the number of parameters of the model, i.e., the total number of weights and biases to be learned; Error is the error of the model, i.e., its performance error when executing the task; log(FLOPs) and log(#Param) denote the logarithms of the operation count and the parameter count, respectively;
S21, model compression: the actor network π(·|θ^π) learns the optimal policy π to obtain the most reasonable model compression ratio, taking the state as input and the action as output, giving formula (4):
π(s|θ^π) → a (4);
S22, local training: in the sampling process of the gradient descent algorithm, an action is selected in the current state s(t) based on the noise and the current policy; after the action is executed, the reward value is calculated and a new state s(t+1) is obtained; finally, the experience is stored in the experience pool;
S23, customizing labels and adjusting the sampling probability: the training speed is increased by adjusting the sampling probability, the sampling probability set for the majority classes in the samples being larger than that set for the minority classes, giving the following sampling probability formula:
where k denotes the client index, y_j denotes the label index, and N denotes the total number of label classes; n denotes the total number of labels of a given class, i denotes the i-th label class; j is the j-th label index of a given class;
S24, adding noise and uploading the model to the server: after the client's training is completed, random noise is added to the trained model, which is then uploaded to the server.
4. The federated deep learning method for privacy enhancement in an edge computing network according to claim 3, wherein in S21 the action is generated with a truncated normal function based on the Ornstein-Uhlenbeck process and noise generation:
π′(s_t) ~ TN(π(s(t)|θ^π_t), σ², 0, 1) (5);
where σ is the noise scale, which decays exponentially during the iterations after initialization.
5. The federated deep learning method for privacy enhancement in an edge computing network of claim 4, wherein σ is initialized to 0.5 in step S21.
6. The federated deep learning method for privacy enhancement in an edge computing network according to claim 3, wherein in step S22 an actor network and a critic network are trained simultaneously in the deep reinforcement learning to maximize the cumulative reward and minimize the error between the evaluated value and the target value,
the loss function of the critic network being:
L(θ^Q) = (1/N) Σ_t [y(t) - Q(s(t), a_λ(t)|θ^Q)]² (6);
where Q(s(t), a_λ(t)|θ^Q) denotes the Q value of the predicted action a_λ(t) in the current state s(t) under the current Q-function parameters θ^Q; L(θ^Q) is the Q-learning loss function, minimized by adjusting the Q-function parameters θ^Q; N is the number of samples, i.e., the total number of samples summed over when computing the average; t is the time step, indexing the time-step sequence in reinforcement learning; y(t) is the target value, i.e., the target Q value in Q learning, equal to the sum of the current reward r(t) and the discounted Q value of the next state; the Q value is calculated as follows:
Q(s(t), a_λ(t)) = E[r(s(t), a_λ(t)) + γQ(s(t+1), π(s(t+1)))] (7);
where r(s(t), a_λ(t)) denotes the immediate reward obtained after executing action a_λ(t) in the current state s(t); γ is the discount factor, i.e., the decay factor of future rewards, typically between 0 and 1; s(t+1) denotes the state at the next time step; π(s(t+1)) denotes the next action predicted according to policy π in state s(t+1);
where y(t) is calculated as follows:
y(t) = r(t) + γQ′(s(t+1), π′(s(t+1)|θ^π′)|θ^Q′) (8);
where Q′(s(t+1), π′(s(t+1)|θ^π′)|θ^Q′) denotes the Q value corresponding to the next action π′(s(t+1)|θ^π′) predicted according to the target policy π′ in the next state s(t+1); θ^Q′ is the Q-function parameter of the target Q network; r(t) denotes the immediate reward obtained after the action is executed in the current state s(t);
the reward value is calculated from the current state value s(t) and the action value a(t), and Q′ is calculated by the target critic network based on the next state value s(t+1);
the loss function of the actor network is:
L(θ^π) = -(1/N) Σ_t Q(s(t), π(s(t)|θ^π)|θ^Q) (9);
the parameters of the target networks are updated by soft updating, using formula (10) and formula (11):
θ^Q′ ← τθ^Q + (1-τ)θ^Q′ (10);
θ^π′ ← τθ^π + (1-τ)θ^π′ (11).
7. The federated deep learning method for privacy enhancement in an edge computing network according to claim 3, wherein the random noise in S24 satisfies a multidimensional Gaussian normal distribution.
CN202311531854.8A 2023-11-16 2023-11-16 Federal deep learning method for privacy enhancement in edge computing network Pending CN117574429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311531854.8A CN117574429A (en) 2023-11-16 2023-11-16 Federal deep learning method for privacy enhancement in edge computing network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311531854.8A CN117574429A (en) 2023-11-16 2023-11-16 Federal deep learning method for privacy enhancement in edge computing network

Publications (1)

Publication Number Publication Date
CN117574429A true CN117574429A (en) 2024-02-20

Family

ID=89887501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311531854.8A Pending CN117574429A (en) 2023-11-16 2023-11-16 Federal deep learning method for privacy enhancement in edge computing network

Country Status (1)

Country Link
CN (1) CN117574429A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117793805A (en) * 2024-02-27 2024-03-29 厦门宇树康信息技术有限公司 Dynamic user random access mobile edge computing resource allocation method and system
CN117793805B (en) * 2024-02-27 2024-04-26 厦门宇树康信息技术有限公司 Dynamic user random access mobile edge computing resource allocation method and system
CN117874829A (en) * 2024-03-13 2024-04-12 北京电子科技学院 Federal learning method based on self-adaptive differential privacy
CN117874829B (en) * 2024-03-13 2024-05-17 北京电子科技学院 Federal learning method based on self-adaptive differential privacy

Similar Documents

Publication Publication Date Title
JP6959308B2 (en) Sparse and compressed neural networks based on sparse constraints and distillation of knowledge
CN113408743B (en) Method and device for generating federal model, electronic equipment and storage medium
CN117574429A (en) Federal deep learning method for privacy enhancement in edge computing network
US11537898B2 (en) Generative structure-property inverse computational co-design of materials
CA2941352C (en) Neural network and method of neural network training
US20230116117A1 (en) Federated learning method and apparatus, and chip
EP4350572A1 (en) Method, apparatus and system for generating neural network model, devices, medium and program product
CN115271099A (en) Self-adaptive personalized federal learning method supporting heterogeneous model
CN116741388B (en) Method for constructing cardiovascular critical severe disease large model based on federal learning
CN115344883A (en) Personalized federal learning method and device for processing unbalanced data
CN112183742A (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
Shan et al. Residual learning of deep convolutional neural networks for image denoising
CN116739079A (en) Self-adaptive privacy protection federal learning method
Xu et al. Agic: Approximate gradient inversion attack on federated learning
Lobel et al. Optimistic initialization for exploration in continuous control
CN112101555A (en) Method and device for multi-party combined training model
Wang et al. A new ann-snn conversion method with high accuracy, low latency and good robustness
CN116645130A (en) Automobile order demand prediction method based on combination of federal learning and GRU
CN115131605A (en) Structure perception graph comparison learning method based on self-adaptive sub-graph
CN115134114A (en) Longitudinal federated learning attack defense method based on discrete confusion self-encoder
Liu et al. Model design and parameter optimization of CNN for side-channel cryptanalysis
Scabini et al. Improving Deep Neural Network Random Initialization Through Neuronal Rewiring
Saini et al. Image compression using APSO
Zhang et al. Confined gradient descent: Privacy-preserving optimization for federated learning
CN117350373B (en) Personalized federal aggregation algorithm based on local self-attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination