CN114913390A - Method for improving personalized federated learning performance based on data augmentation of conditional GAN - Google Patents
Method for improving personalized federated learning performance based on data augmentation of conditional GAN
- Publication number
- CN114913390A CN114913390A CN202210486378.1A CN202210486378A CN114913390A CN 114913390 A CN114913390 A CN 114913390A CN 202210486378 A CN202210486378 A CN 202210486378A CN 114913390 A CN114913390 A CN 114913390A
- Authority
- CN
- China
- Prior art keywords
- model
- data
- GAN
- federated learning
- personalized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses a method for improving personalized federated learning performance by data augmentation based on a conditional GAN, comprising the following steps: establishing a personalized federated learning pFedMe model; establishing a conditional GAN model and adding it into the pFedMe model in the personalized federated learning manner to obtain a conditional-GAN-based pFedMe model; acquiring the CIFAR10 data set and performing data augmentation through the conditional-GAN-based pFedMe model; and obtaining the accuracy of the conditional-GAN-based data augmentation method on the test set. The method shows that the conditional GAN effectively improves personalized federated learning performance, and it has practical value for using a conditional GAN to perform data augmentation for such models.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for improving personalized federated learning performance based on data augmentation with a conditional generative adversarial network (conditional GAN).
Background
Federated learning is a privacy-preserving machine learning technique in which a set of clients collaboratively learns a global model with a server without sharing client data. One of the core challenges in federated learning is overcoming the performance loss caused by statistical heterogeneity between clients, which studies have shown prevents the global model from providing good performance on every client's task. The personalized federated learning algorithm pFedMe, which uses the Moreau envelope as a client regularization loss function, helps the optimization of personalized models.
Today, the development of federated learning is stimulated by the large amount of data generated on a large number of handheld devices. A federated learning scenario involves many clients connected to a server, with the goal of building a global model in a privacy-preserving and communication-efficient manner. Despite its advantages of data privacy protection and low communication overhead, federated learning faces a challenge that affects both its performance and its convergence speed: statistical heterogeneity, meaning that the data distribution differs between clients. A global model trained on such non-uniformly distributed data can hardly perform well on every client's data, and as statistical heterogeneity increases, the generalization error of the global model on each client's local data also increases significantly. On the other hand, purely local learning without federated cooperation may also suffer large generalization errors due to insufficient data.
Training deep learning models requires a large number of training samples; insufficient training data leads to severe overfitting and reduced model accuracy. In practice, collecting a large number of samples to train a deep learning model requires time and domain knowledge, which is both expensive and difficult. Data augmentation is one of the common techniques for solving these problems: it increases the relevant data in the data set and lets the model learn more data-related characteristics, thereby effectively avoiding overfitting, making the trained model more robust, and markedly improving generalization. Data augmentation is widely used to improve the performance of image and text classification tasks, and advanced methods such as the GAN and the conditional GAN have been designed and optimized as data augmentation schemes for these tasks with good results. However, the impact of data augmentation on personalized federated learning has not been fully studied.
Disclosure of Invention
Purpose of the invention: in order to overcome the defects of the prior art, the invention provides a method for improving the performance of personalized federated learning based on data augmentation with a conditional GAN, which can improve the performance of a personalized federated learning model to the greatest extent.
Technical scheme: to achieve the above object, the invention provides a method for improving personalized federated learning performance based on data augmentation of a conditional GAN, comprising the following steps:
step 1, establishing a pFedMe model for personalized federated learning;
step 2, adding the conditional GAN model into the pFedMe model of step 1 in the personalized federated learning manner;
step 3, acquiring the training set of the CIFAR10 data set and training it through the conditional-GAN-based pFedMe model of step 2;
step 4, on the test set of the CIFAR10 data set, comparing the performance of the original pFedMe model of step 1 (which uses no data augmentation), the pFedMe model trained on data augmented by repeated sampling, and the conditional-GAN-based pFedMe model trained in step 3.
Further, in the present invention, said step 1 comprises the following steps:
step 1-1, in traditional federated learning, there are N clients communicating with one server to find a global model ω by solving the following problem:

$$\min_{\omega \in \mathbb{R}^d} f(\omega) := \frac{1}{N}\sum_{i=1}^{N} f_i(\omega)$$

where the function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde f_i(\omega; \xi_i)\big]$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i, and $\tilde f_i(\omega; \xi_i)$ is the loss for that sample and ω. In federated learning, clients may have non-independent and non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ (j ≠ i) differ;
step 1-2, in the personalized federated learning problem under consideration, the optimization objective is not the optimal solution of the problem in step 1-1; rather, the hope is to provide a specific model for users with different data distributions. For this, each client uses a regularized loss function with the $\ell_2$ norm, as follows:

$$F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d}\Big\{ f_i(\theta_i) + \frac{\lambda}{2}\,\|\theta_i - \omega\|^2 \Big\}$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter controlling the strength with which ω pulls on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data to prioritize personalization. Note that λ ∈ (0, ∞), to avoid the extremes λ → 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow clients to pursue their own models in different gradient directions while ensuring that a user's local model does not stray far from the global "reference point" ω. Based on this, the personalized federated learning objective can be written as the following bilevel optimization problem:

$$\min_{\omega \in \mathbb{R}^d} F(\omega) := \frac{1}{N}\sum_{i=1}^{N} F_i(\omega)$$
In pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized at the inner level with respect to client i's data distribution while keeping a bounded distance from ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization literature, which facilitates the design of various learning algorithms. The optimized personalized model $\hat\theta_i(\omega)$ is the solution of the pFedMe inner problem, defined as follows:

$$\hat\theta_i(\omega) = \operatorname{prox}_{f_i/\lambda}(\omega) = \arg\min_{\theta_i \in \mathbb{R}^d}\Big\{ f_i(\theta_i) + \frac{\lambda}{2}\,\|\theta_i - \omega\|^2 \Big\}$$
step 1-3, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after the clients perform R local updates, the server uniformly samples a subset $S^t$ of clients from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2 to obtain its personalized model $\hat\theta_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i in global round t and local round r. The purpose of the local model is to facilitate the construction of the global model and to reduce the number of communications between client and server. In the outer optimization, client i performs gradient-descent local updates on $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^{t} = \omega_{i,r}^{t} - \eta\,\nabla F_i(\omega_{i,r}^{t}) = \omega_{i,r}^{t} - \eta\lambda\big(\omega_{i,r}^{t} - \hat\theta_i(\omega_{i,r}^{t})\big)$$
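As a rough illustration of this client-side procedure (not the patent's actual implementation), the inner proximal step and the outer gradient update on $F_i$ can be sketched in NumPy; the quadratic toy loss, step sizes, and iteration counts below are illustrative assumptions:

```python
import numpy as np

def approx_personalized_model(omega, grad_f, lam, steps=50, lr=0.05):
    # Inner problem: theta_hat = argmin_theta f_i(theta) + (lam/2)||theta - omega||^2,
    # solved here by plain gradient descent (a stand-in for the accelerated solver).
    theta = omega.copy()
    for _ in range(steps):
        theta -= lr * (grad_f(theta) + lam * (theta - omega))
    return theta

def client_local_updates(omega_t, grad_f, lam=1.0, eta=0.5, R=5):
    # R outer local updates on the Moreau envelope, using
    # grad F_i(omega) = lam * (omega - theta_hat_i(omega)).
    omega = omega_t.copy()
    for _ in range(R):
        theta_hat = approx_personalized_model(omega, grad_f, lam)
        omega -= eta * lam * (omega - theta_hat)
    return omega

# Toy client loss f_i(theta) = 0.5 * ||theta - c||^2 with minimizer c.
c = np.ones(3)
omega_new = client_local_updates(np.zeros(3), grad_f=lambda th: th - c)
```

With this toy loss the local model contracts toward the client's own minimizer c at each outer round while the prox term keeps it anchored to the incoming global model.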
step 1-4, for the practical algorithm, we use $\tilde\theta_i(\omega)$ to denote a δ-approximation of $\hat\theta_i(\omega)$. Computing the exact gradient $\nabla f_i$ would require the full distribution of $\xi_i$. In practice, an unbiased estimate is obtained by sampling a mini-batch of data $\mathcal{D}_i$:

$$\nabla \tilde f_i(\theta_i; \mathcal{D}_i) = \frac{1}{|\mathcal{D}_i|} \sum_{\xi \in \mathcal{D}_i} \nabla \tilde f_i(\theta_i; \xi)$$

so that $\mathbb{E}\big[\nabla \tilde f_i(\theta_i; \mathcal{D}_i)\big] = \nabla f_i(\theta_i)$. Moreover, a closed-form solution for $\hat\theta_i(\omega)$ is in general not available. Instead, a first-order iterative method is used to obtain a high-precision approximation $\tilde\theta_i(\omega)$ by minimizing

$$h_i(\theta_i;\, \omega_{i,r}^{t}) = \tilde f_i(\theta_i; \mathcal{D}_i) + \frac{\lambda}{2}\,\|\theta_i - \omega_{i,r}^{t}\|^2.$$

Assuming λ is chosen such that $h_i$ is strongly convex, Nesterov's accelerated gradient descent is applied to obtain $\tilde\theta_i(\omega)$ such that $\|\nabla h_i(\tilde\theta_i;\, \omega_{i,r}^{t})\|^2 \le \nu$, where O(·) hides constants. δ can then be controlled by adjusting the mini-batch size $|\mathcal{D}|$ (the sampling noise) and the accuracy level ν.
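The mini-batch gradient estimate of step 1-4 can be sketched as follows; the Gaussian toy data and squared-error loss are illustrative assumptions, not the patent's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def minibatch_grad(theta, batch):
    # Unbiased estimate of grad f_i(theta): average the per-sample gradients
    # over a uniformly drawn mini-batch (squared-error loss: grad = theta - xi).
    return np.mean([theta - xi for xi in batch], axis=0)

data = rng.normal(loc=3.0, size=(1000, 2))                 # client i's local samples
theta = np.zeros(2)
batch = data[rng.choice(len(data), size=64, replace=False)]
g_batch = minibatch_grad(theta, batch)                     # noisy estimate
g_full = minibatch_grad(theta, data)                       # full empirical gradient
```

Averaging over a larger mini-batch reduces the sampling noise, which is exactly the batch-size knob the text mentions for controlling δ.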
Further, in the present invention, said step 2 comprises the following steps:
step 2-1, the GAN is introduced as a framework for training generative models. The GAN framework can train any kind of generator network, whereas most other generative models require the generator to take some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models, such as variational autoencoders, can also learn the data distribution, studies have shown that the images they generate tend to be relatively blurred. A GAN, in contrast, can learn a model (a neural network) that produces sample points close to the real data, i.e., the distribution learned by a GAN is very close to the real distribution.
A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be non-linear mapping functions, such as multi-layer perceptrons.
To learn the generator distribution $p_g$ over data x, the generator maps a prior noise distribution $p_z(z)$ to data space through the mapping function $G(z; \theta_g)$. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D are trained: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, and the parameters of D are adjusted to maximize $\log D(x)$, as if the two were playing a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$
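A minimal sketch of this alternating min-max training, under the deliberately simple assumption of a one-dimensional affine generator and a logistic discriminator (far simpler than the multi-layer perceptrons the text mentions):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# G(z) = a*z + b maps prior noise to data space; D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0          # generator parameters (illustrative init)
w, c = 0.1, 0.0          # discriminator parameters
lr = 0.01

for _ in range(300):
    x = rng.normal(4.0, 1.0, 64)          # real data ~ N(4, 1)
    z = rng.normal(0.0, 1.0, 64)          # prior noise p_z
    gx = a * z + b
    # Discriminator ascent on log D(x) + log(1 - D(G(z)))
    d_real, d_fake = sigmoid(w * x + c), sigmoid(w * gx + c)
    w += lr * (np.mean((1 - d_real) * x) - np.mean(d_fake * gx))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))
    # Generator descent on log(1 - D(G(z))); d/d(gx) = -D(G(z)) * w
    d_fake = sigmoid(w * (a * z + b) + c)
    a += lr * np.mean(d_fake * w * z)
    b += lr * np.mean(d_fake * w)
```

After training, the generator's mean output (a·0 + b) has drifted toward the real data mean, i.e., the generator update pushes $p_g$ toward $p_{\text{data}}$.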
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from a different modality.
When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.
In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework great flexibility. In the discriminator, x and y are presented jointly as inputs to the discriminative function. The objective function V(D, G) of the two-player min-max game becomes:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big]$$
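The conditioning mechanism (the joint hidden representation) can be sketched as follows; the one-hot label encoding, concatenation, and dimensions are illustrative assumptions:

```python
import numpy as np

def one_hot(y, num_classes=10):
    v = np.zeros(num_classes)
    v[y] = 1.0
    return v

def generator_input(z, y, num_classes=10):
    # Conditional GAN: prior noise z and condition y are combined into a
    # joint hidden representation (here by simple concatenation).
    return np.concatenate([z, one_hot(y, num_classes)])

def discriminator_input(x, y, num_classes=10):
    # The discriminator likewise receives x and y jointly.
    return np.concatenate([np.ravel(x), one_hot(y, num_classes)])

z = np.random.default_rng(2).normal(size=100)
g_in = generator_input(z, y=3)          # 100 noise dims + 10 label dims
```

Because the label is part of every input, the trained generator can later be asked for samples of a specific CIFAR10 class by fixing y.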
step 2-3, the data of the N clients is passed through the respective trained conditional GAN models; each client then sends its specific local model values to the cloud server, the cloud server computes the aggregation average of the models and sends it back to the N clients, and finally the clients update their respective local model values. Adding the trained conditional GAN model into the pFedMe model of step 1 in the personalized federated learning manner thus realizes data augmentation and yields the conditional-GAN-based pFedMe model.
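The communication pattern of step 2-3 (clients upload local model values, the server averages them and broadcasts the result back) can be sketched as follows; the flat parameter vectors and client count are an illustrative simplification:

```python
import numpy as np

def server_aggregate(local_models):
    # Cloud server: compute the aggregation average of the N local models.
    return np.mean(local_models, axis=0)

def communication_round(local_models):
    # The average is sent back and every client overwrites its local value.
    omega = server_aggregate(local_models)
    return [omega.copy() for _ in local_models]

rng = np.random.default_rng(3)
locals_before = [rng.normal(size=5) for _ in range(4)]   # N = 4 clients (toy)
locals_after = communication_round(locals_before)
```

Only model values cross the network, never raw client data, which is what preserves privacy in this scheme.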
Further, in the present invention: the step 3 further comprises the step of,
step 3-1, CIFAR10 is a data set containing 60000 color images of size 32 × 32, belonging to 10 different classes. We distribute the complete data set to N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate the data sets for the personalized federated learning task; each user's data set is split by random sampling into 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, the training samples are first fed into the conditional-GAN-based pFedMe model of step 2, with the test set samples kept the same as in step 3-2, to obtain the accuracy of the conditional-GAN-based data augmentation method on the CIFAR10 test set;
step 3-4, the training samples are directly copied by repeated sampling to enlarge the data, which is then fed into the pFedMe model of step 1, with the CIFAR10 test set samples kept the same as in step 3-2, to obtain the accuracy of the repeated-sampling augmentation method on the test set.
Further, in the present invention: the step 4 further comprises the step of,
step 4-1, the CIFAR10 test set of step 3 is passed directly through the step-1 pFedMe model, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data augmentation methods from step 3 are compared pairwise to obtain the method that best improves personalized federated learning performance.
Beneficial effects: compared with the prior art, the invention has the following beneficial effects:
(1) a method for improving personalized federated learning performance is obtained by comparing the influence on personalized federated learning of using no data augmentation and of using each of the two data augmentation methods;
(2) the invention proposes augmenting the training data set for personalized federated learning using a generative model based on a conditional GAN, which, as a generative model, can generate data under fixed conditions. The invention provides a communication scheme based on personalized federated learning that adds the conditional GAN into the personalized federated learning model to realize data augmentation, rather than simply augmenting the data set offline.
Drawings
FIG. 1 is the overall flow chart of the method for improving personalized federated learning performance based on conditional GAN data augmentation according to the present invention;
FIG. 2 is a schematic structural diagram of the conditional-GAN-based pFedMe model of the present invention;
FIG. 3 compares the performance curves of the personalized federated learning model on the CIFAR10 data set under three conditions: no data augmentation, augmentation by repeated sampling, and the conditional-GAN-based data augmentation method of the present invention.
Detailed Description
The technical scheme of the invention is explained in further detail below with reference to the drawings and the detailed embodiments.
As shown in FIG. 1, the overall flow of the method for improving personalized federated learning performance based on conditional GAN data augmentation comprises the following steps:
step 1, establishing a pFedMe model for personalized federated learning, in which N clients communicate with a server;
specifically, the step 1 further comprises the following steps,
step 1-1, a personalized federated learning model pFedMe with better performance is constructed on the basis of traditional federated learning;
Further, the construction of the pFedMe model comprises the following steps:
step 1-1-1, there are N clients communicating with one server to find a global model ω by solving the following problem:

$$\min_{\omega \in \mathbb{R}^d} f(\omega) := \frac{1}{N}\sum_{i=1}^{N} f_i(\omega)$$

where the function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde f_i(\omega; \xi_i)\big]$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i, and $\tilde f_i(\omega; \xi_i)$ is the loss for that sample and ω. In federated learning, clients may have non-independent and non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ (j ≠ i) differ;
step 1-1-2, each client uses a regularized loss function with the $\ell_2$ norm, as follows:

$$F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d}\Big\{ f_i(\theta_i) + \frac{\lambda}{2}\,\|\theta_i - \omega\|^2 \Big\}$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter controlling the strength with which ω pulls on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data to prioritize personalization. Note that λ ∈ (0, ∞), to avoid the extremes λ → 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow clients to pursue their own models in different gradient directions while ensuring that a user's local model does not stray far from the global "reference point" ω. Based on this, the personalized federated learning objective can be written as the following bilevel optimization problem:

$$\min_{\omega \in \mathbb{R}^d} F(\omega) := \frac{1}{N}\sum_{i=1}^{N} F_i(\omega)$$
step 1-2-1, in pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized at the inner level with respect to client i's data distribution while keeping a bounded distance from ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization literature, which facilitates the design of various learning algorithms. The optimized personalized model $\hat\theta_i(\omega)$ is the solution of the pFedMe inner problem, defined as follows:

$$\hat\theta_i(\omega) = \operatorname{prox}_{f_i/\lambda}(\omega) = \arg\min_{\theta_i \in \mathbb{R}^d}\Big\{ f_i(\theta_i) + \frac{\lambda}{2}\,\|\theta_i - \omega\|^2 \Big\}$$
step 1-2-2, for the practical algorithm, we use $\tilde\theta_i(\omega)$ to denote a δ-approximation of $\hat\theta_i(\omega)$. Computing the exact gradient $\nabla f_i$ would require the full distribution of $\xi_i$. In practice, an unbiased estimate is obtained by sampling a mini-batch of data $\mathcal{D}_i$:

$$\nabla \tilde f_i(\theta_i; \mathcal{D}_i) = \frac{1}{|\mathcal{D}_i|} \sum_{\xi \in \mathcal{D}_i} \nabla \tilde f_i(\theta_i; \xi)$$

so that $\mathbb{E}\big[\nabla \tilde f_i(\theta_i; \mathcal{D}_i)\big] = \nabla f_i(\theta_i)$. Moreover, a closed-form solution for $\hat\theta_i(\omega)$ is in general not available. Instead, a first-order iterative method is used to obtain a high-precision approximation $\tilde\theta_i(\omega)$ by minimizing

$$h_i(\theta_i;\, \omega_{i,r}^{t}) = \tilde f_i(\theta_i; \mathcal{D}_i) + \frac{\lambda}{2}\,\|\theta_i - \omega_{i,r}^{t}\|^2.$$

Assuming λ is chosen such that $h_i$ is strongly convex, Nesterov's accelerated gradient descent is applied to obtain $\tilde\theta_i(\omega)$ such that $\|\nabla h_i(\tilde\theta_i;\, \omega_{i,r}^{t})\|^2 \le \nu$, with a number of gradient computations of order $O\big(\log(d/\nu)\big)$, where d is the diameter of the search space, ν is the accuracy level, and O(·) hides constants. δ can then be controlled by adjusting the mini-batch size $|\mathcal{D}|$ (the sampling noise) and the accuracy level ν.
step 1-3-1, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after the clients perform R local updates, the server uniformly samples a subset $S^t$ of clients from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2-1 to obtain its personalized model $\hat\theta_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i in global round t and local round r. The purpose of the local model is to facilitate the construction of the global model and to reduce the number of communications between client and server. In the outer optimization, client i performs gradient-descent local updates on $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^{t} = \omega_{i,r}^{t} - \eta\,\nabla F_i(\omega_{i,r}^{t}) = \omega_{i,r}^{t} - \eta\lambda\big(\omega_{i,r}^{t} - \hat\theta_i(\omega_{i,r}^{t})\big)$$
Step 2, adding the conditional GAN model into the pFedMe model of step 1 in the personalized federated learning manner;
specifically, the step 2 further comprises the following steps,
step 2-1, the GAN is introduced as a framework for training generative models. The GAN framework can train any kind of generator network, whereas most other generative models require the generator to take some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models, such as variational autoencoders, can also learn the data distribution, studies have shown that the images they generate tend to be blurred. A GAN, in contrast, can learn a model (a neural network) that produces sample points close to the real data, i.e., the distribution learned by a GAN is very close to the real distribution.
A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be non-linear mapping functions, such as multi-layer perceptrons.
To learn the generator distribution $p_g$ over data x, the generator maps a prior noise distribution $p_z(z)$ to data space through the mapping function $G(z; \theta_g)$. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D are trained: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, and the parameters of D are adjusted to maximize $\log D(x)$, as if the two were playing a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from a different modality.
When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.
In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework great flexibility. In the discriminator, x and y are presented jointly as inputs to the discriminative function. The objective function V(D, G) of the two-player min-max game becomes:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big]$$
step 2-3, the data of the N clients is passed through the respective trained conditional GAN models; each client then sends its specific local model values to the cloud server, the cloud server computes the aggregation average of the models and sends it back to the N clients, and finally the clients update their respective local model values. That is, the trained conditional GAN model is added into the pFedMe model of step 1 in the personalized federated learning manner, thereby realizing data augmentation and obtaining the conditional-GAN-based pFedMe model, as shown in FIG. 2.
Further, in this embodiment, the main parameters of the conditional GAN neural network for the personalized federated learning task are set as in Table 1 below:
Table 1: conditional GAN parameter settings
In the experiments, the ReLU function is used as the activation function of the hidden layers of the conditional GAN neural network, and the trained conditional GAN model is added into the pFedMe model in the manner of FIG. 2 to achieve data augmentation.
Step 3, the training set of the CIFAR10 data set is trained through the conditional-GAN-based pFedMe model of step 2; separately, the training set is enlarged by repeated sampling and fed into the pFedMe model for training;
specifically, the step 3 further comprises the following steps,
step 3-1, CIFAR10 is a data set containing 60000 color images of size 32 × 32, belonging to 10 different classes. Due to the limited size of CIFAR10, we distribute the complete data set to N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate the data sets for the personalized federated learning task; the data are randomly split, 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, the training samples are first fed into the conditional-GAN-based pFedMe model of step 2, with the test set samples kept the same as in step 3-2, to obtain the accuracy of the conditional-GAN-based data augmentation method on the CIFAR10 test set;
step 3-4, the training samples are directly copied to enlarge the data, which is then fed into the pFedMe model of step 1, with the CIFAR10 test set samples kept the same as in step 3-2, to obtain the accuracy of the direct-copy augmentation method on the test set.
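The 75%/25% per-client split of step 3-2 can be sketched as follows; integer ids stand in for CIFAR10 images, and the random seed is an illustrative assumption:

```python
import numpy as np

def split_client_data(samples, train_frac=0.75, seed=0):
    # Random, non-overlapping 75%/25% train/test split for one client.
    idx = np.random.default_rng(seed).permutation(len(samples))
    cut = int(train_frac * len(samples))
    return samples[idx[:cut]], samples[idx[cut:]]

client_share = np.arange(60000 // 20)   # 3000 samples per client for N = 20
train, test = split_client_data(client_share)
```

Permuting indices before cutting guarantees the two partitions are disjoint while still being drawn by random sampling.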
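The repeated-sampling baseline of step 3-4 merely copies existing samples; a sketch, where the doubling factor is an illustrative assumption:

```python
import numpy as np

def augment_by_resampling(train_samples, factor=2, seed=0):
    # Enlarge the training set by directly copying samples drawn with
    # replacement -- no genuinely new information is added.
    rng = np.random.default_rng(seed)
    n_extra = (factor - 1) * len(train_samples)
    extra = train_samples[rng.integers(0, len(train_samples), size=n_extra)]
    return np.concatenate([train_samples, extra])

train = np.arange(100)
augmented = augment_by_resampling(train, factor=2)
```

Every element of the enlarged set already appears in the original training set, which is consistent with the later observation that this baseline cannot achieve data augmentation in the true sense.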
As shown in FIG. 3, the total number of global communication rounds is set to 800 in the simulation, and accuracy on the test set is expressed as a percentage. The curve of the conditional-GAN-based data augmentation method shows the test-set accuracy increasing with the number of global communication rounds, finally reaching 65%. The accuracy curve of the repeated-sampling augmentation method fluctuates strongly when the number of global communication rounds is small; as the number of rounds increases, the test-set accuracy grows more slowly, finally reaching 55% (10 percentage points lower than the conditional-GAN-based method).
specifically, the step 4 further comprises the following steps,
step 4-1, the CIFAR10 test set of step 3 is passed directly through the step-1 pFedMe model, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data augmentation methods from step 3 are compared pairwise to obtain the method that best improves personalized federated learning performance.
Comparing the test-set accuracy curves in FIG. 3, it can be seen that once the number of global communication rounds exceeds about 180, the accuracy of the conditional-GAN-based data augmentation method is significantly higher than that of the method that directly copies data, i.e., the conditional-GAN-based method improves personalized federated learning performance more effectively. It can also be seen that the test-set accuracy is essentially the same with no augmentation and with direct copying, except that when the number of global communication rounds is small (about 20 or fewer), direct copying even reduces the performance of the original model. Directly copying data therefore cannot achieve data augmentation in the true sense and cannot improve model performance. In contrast, the conditional-GAN-based data augmentation method greatly improves the performance of personalized federated learning.
The invention realizes a scheme for improving personalized federal learning performance through data augmentation based on a deep conditional GAN, aimed at image processing problems. Building on a personalized federal learning method, adding a conditional GAN to the model for data augmentation in the personalized federal learning manner improves personalized federal learning performance, and provides useful guidance for data augmentation with conditional GANs.
It should be noted that the above-mentioned examples only represent some embodiments of the present invention, and the description thereof should not be construed as limiting the scope of the present invention. It should be noted that, for those skilled in the art, various modifications can be made without departing from the spirit of the present invention, and these modifications should fall within the scope of the present invention.
Claims (5)
1. The method for improving the personalized federal learning performance based on the data augmentation of the conditional GAN is characterized by comprising the following steps,
step 1, establishing a pFedMe model for personalized federal learning;
step 2, adding the conditional GAN model into the pFedMe model of step 1 according to the personalized federal learning mode;
step 3, acquiring the training set of the CIFAR10 data set, and training the conditional-GAN-based pFedMe model of step 2 on the training set;
and 4, carrying out comparative performance analysis on the test set of the CIFAR10 data set across three models: the original pFedMe model of step 1, which uses no data augmentation method; a model trained on data augmented by repeated sampling; and the pFedMe model of step 3 trained on the augmented data.
2. The method for data augmentation and improvement of personalized federal learning performance based on conditional GAN as claimed in claim 1, wherein: said step 1 further comprises the step of,
step 1-1, in traditional federal learning, N clients communicate with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N}\sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model $\omega$; the function $f_i(\omega) := \mathbb{E}\big[\tilde{f}_i(\omega; \xi_i)\big]$ represents the expected loss over the data distribution of client $i$, wherein $\xi_i$ is a random data sample drawn according to the distribution of client $i$, and $\tilde{f}_i(\omega; \xi_i)$ is the loss of that sample at $\omega$; in federal learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
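The global objective of step 1-1 can be sketched numerically. A minimal sketch, assuming a squared-error loss as a stand-in for the unspecified client losses (all names below are hypothetical):

```python
import numpy as np

def client_loss(omega, data):
    # f_i(omega): empirical squared-error loss over client i's samples
    x, y = data
    return float(np.mean((x @ omega - y) ** 2))

def global_objective(omega, clients):
    # f(omega) = (1/N) * sum_i f_i(omega)
    return sum(client_loss(omega, d) for d in clients) / len(clients)

rng = np.random.default_rng(0)
# two clients with deliberately different (non-IID) data distributions
clients = [
    (rng.normal(0, 1, (50, 3)), rng.normal(0, 1, 50)),
    (rng.normal(2, 1, (50, 3)), rng.normal(5, 1, 50)),
]
omega = np.zeros(3)
print(global_objective(omega, clients))
```

A single global ω must trade off both clients' objectives here, which is exactly the tension personalization addresses.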
step 1-2, in the personalized federal learning problem considered here, the optimization objective is not the optimal solution of the problem in step 1-1; instead, each user with a different data distribution is to be provided with its own model; for this purpose, each client uses a loss function regularized by an $l_2$ norm, as follows:

$$F_i(\omega) := \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2}\|\theta_i - \omega\|^2 \Big\}$$

wherein $\theta_i$ denotes the personalized model of client $i$, and $\lambda$ is a regularization parameter that controls how strongly $\omega$ influences the personalized model; a large $\lambda$ lets clients with unreliable data benefit from the rich aggregated data, while a small $\lambda$ lets clients with enough useful data prioritize personalization; note that $\lambda \in (0, \infty)$, avoiding the extreme cases $\lambda = 0$ and $\lambda = \infty$; the regularized loss function lets each client build a model following its own gradient direction while ensuring that its local model does not stray far from the global "reference point" $\omega$; based on this, the personalized federal learning objective is written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N}\sum_{i=1}^{N} F_i(\omega) \Big\}$$
In pFedMe, the outer level finds the global model $\omega$ by aggregating data from multiple clients, while the inner level optimizes $\theta_i$ with respect to the data distribution of client $i$ while keeping it within a bounded distance of $\omega$; the definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization literature and helpful in the design of many learning algorithms; the optimized personalized model $\hat{\theta}_i(\omega)$ is the solution of the inner problem of pFedMe, defined as follows:

$$\hat{\theta}_i(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2}\|\theta_i - \omega\|^2 \Big\}$$
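The inner personalization problem of step 1-2 can be sketched with plain gradient descent. A toy quadratic client loss with a known closed-form minimizer is assumed so the result can be checked; all names are hypothetical:

```python
import numpy as np

def personalized_model(omega, grad_f, lam, lr=0.05, steps=200):
    # theta_hat_i(omega) = argmin_theta { f_i(theta) + (lam/2)*||theta - omega||^2 }
    theta = omega.copy()
    for _ in range(steps):
        # gradient of the regularized loss: grad f_i(theta) + lam*(theta - omega)
        theta -= lr * (grad_f(theta) + lam * (theta - omega))
    return theta

# toy client loss f_i(theta) = 0.5*||theta - c_i||^2, whose minimizer is c_i
c_i = np.array([3.0, -1.0])
grad_f = lambda theta: theta - c_i

omega = np.zeros(2)
lam = 1.0
theta_hat = personalized_model(omega, grad_f, lam)
# for this quadratic, the closed form is (c_i + lam*omega) / (1 + lam)
print(theta_hat)
```

With λ = 1 and ω = 0 the personalized model lands halfway between the global reference point and the client's own optimum, illustrating how λ interpolates between aggregation and personalization.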
step 1-3, similar to traditional federal learning algorithms such as FedAvg, in each communication round $t$ the server broadcasts the latest global model $\omega^t$ to all clients; then, after all clients perform $R$ local updates, the server uniformly samples a subset $S^t$ of clients and receives their latest local models for model averaging;
in the inner-level optimization, each client $i$ solves the personalization problem of step 1-2 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, wherein $\omega_{i,r}^t$ denotes the local model of client $i$ at global round $t$ and local round $r$; the purpose of the local model is to help construct the global model while reducing the number of communications between client and server; second, in the outer-level optimization, client $i$ performs a gradient-descent local update on $F_i$ as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t)$$
computing $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ would normally require the gradient $\nabla f_i$, and hence the distribution of $\xi_i$; instead, a mini-batch $\mathcal{T}_i$ is sampled and the following unbiased estimate of $\nabla f_i(\theta_i)$ is used:

$$\nabla \tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \nabla \tilde{f}_i(\theta_i; \xi_i)$$
so that $\mathbb{E}\big[\nabla \tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = \nabla f_i(\theta_i)$; second, a first-order iterative method is used to obtain a high-precision approximation $\tilde{\theta}_i(\omega_{i,r}^t)$ of $\hat{\theta}_i(\omega_{i,r}^t)$, defining:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2}\|\theta_i - \omega_{i,r}^t\|^2$$
suppose $\lambda$ is chosen such that the loss function $\tilde{h}_i$ is strongly convex; then $\tilde{\theta}_i(\omega_{i,r}^t)$ is obtained by applying gradient descent such that:

$$\big\|\nabla \tilde{h}_i\big(\tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i\big)\big\|^2 \le \nu$$

where $\nu$ is an accuracy threshold.
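The inner/outer loop of steps 1-2 and 1-3 can be sketched end to end. The toy quadratic loss, step sizes, and iteration counts below are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def pfedme_local_rounds(omega_t, grad_f_batch, lam, eta, R=5, inner_lr=0.05, K=100):
    # Outer update: omega <- omega - eta * grad F_i(omega),
    # with grad F_i(omega) = lam * (omega - theta_tilde(omega)).
    omega = omega_t.copy()
    for _ in range(R):
        # Inner solve: theta_tilde approximately minimizes
        # f~_i(theta) + (lam/2)*||theta - omega||^2 by first-order iterations
        theta = omega.copy()
        for _ in range(K):
            theta -= inner_lr * (grad_f_batch(theta) + lam * (theta - omega))
        omega -= eta * lam * (omega - theta)
    return omega

c_i = np.array([4.0, 2.0])            # minimizer of the toy quadratic loss
grad_f_batch = lambda th: th - c_i    # stands in for a mini-batch gradient estimate
omega = pfedme_local_rounds(np.zeros(2), grad_f_batch, lam=2.0, eta=0.3, R=50)
print(omega)  # the local model drifts toward the client optimum c_i
```

Running many local rounds on a single client pulls ω toward that client's optimum; in the full algorithm the server's averaging over the sampled subset $S^t$ balances this drift across clients.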
3. The method for data augmentation and improvement of personalized federal learning performance based on conditional GAN according to claim 1 or 2, wherein: said step 2 further comprises the step of,
step 2-1, GAN is introduced as a framework for training generative models; GAN can train an arbitrary generator network; moreover, GAN learns a model that generates sample points only near the real data, i.e., the distribution learned by GAN is very close to the real distribution;
GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G; both G and D may be nonlinear mapping functions;
to learn a generator distribution $p_g$ over data $x$, the generator defines a mapping $G(z; \theta_g)$ from a prior noise distribution $p_z(z)$ to data space; the discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that $x$ came from the real data rather than from $p_g$;
both G and D need to be trained: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, and the parameters of D are adjusted to maximize $\log D(x)$, as if the two were playing a game with value function $V(G, D)$, i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
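The value function V(G, D) can be illustrated on a toy one-dimensional example with known Gaussian densities, using the standard closed-form optimal discriminator $D^*(x) = p_{data}(x) / (p_{data}(x) + p_g(x))$; everything below is a sketch under these toy assumptions:

```python
import numpy as np

def value_fn(D, real, fake):
    # V(G, D) = E_x[log D(x)] + E_z[log(1 - D(G(z)))], estimated from samples
    eps = 1e-9  # numerical guard for log(0)
    return float(np.mean(np.log(D(real) + eps)) + np.mean(np.log(1.0 - D(fake) + eps)))

rng = np.random.default_rng(1)
real = rng.normal(0.0, 1.0, 1000)   # samples from p_data = N(0, 1)
fake = rng.normal(4.0, 1.0, 1000)   # samples from p_g = N(4, 1), a fixed toy generator

pdf = lambda x, mu: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
D_star = lambda x: pdf(x, 0.0) / (pdf(x, 0.0) + pdf(x, 4.0))  # optimal discriminator
D_blind = lambda x: np.full_like(x, 0.5)  # cannot tell real from fake at all

# the optimal D attains a higher value of V than the uninformative one
print(value_fn(D_star, real, fake), value_fn(D_blind, real, fake))
```

When $p_g = p_{data}$, the best D can do is 0.5 everywhere and V collapses to $2\log 0.5$, which is the equilibrium the min-max game drives toward.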
step 2-2, with an unconditional generative model there is no control over the type of data being generated; however, by conditioning the model on additional information, the data generation process can be guided; the condition can be class labels, some prior information, or even data from a different modality;
when both the generator and the discriminator are conditioned on some extra information $y$, the GAN is extended to a conditional model, i.e., the conditional GAN; $y$ can be any type of auxiliary information; the conditioning is performed by feeding $y$ into both the discriminator and the generator as an additional input layer;
in the generator, the prior input noise $p_z(z)$ and $y$ are combined into a joint hidden representation, which gives the adversarial training framework considerable flexibility; in the discriminator, $x$ and $y$ are presented as inputs to the discriminant function; the objective function $V(D, G)$ of the two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x \mid y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y)))]$$
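One common way to realize the joint representation is to concatenate a one-hot encoding of y with the noise z (for G) or the flattened data x (for D); this sketch shows only the input construction, with all sizes chosen for illustration:

```python
import numpy as np

def joint_input(a, y, num_classes):
    # combine a batch of vectors with one-hot label information along the feature axis
    one_hot = np.eye(num_classes)[y]
    return np.concatenate([a, one_hot], axis=1)

rng = np.random.default_rng(0)
batch, z_dim, num_classes = 8, 100, 10
z = rng.normal(size=(batch, z_dim))            # prior noise from p_z(z)
y = rng.integers(0, num_classes, size=batch)   # conditioning class labels
g_in = joint_input(z, y, num_classes)          # joint hidden input fed to G(z|y)

x = rng.normal(size=(batch, 32 * 32 * 3))      # flattened CIFAR10-sized images
d_in = joint_input(x, y, num_classes)          # x and y presented together to D(x|y)
print(g_in.shape, d_in.shape)
```

Because the label travels with every input, the trained generator can be asked for samples of a specific class, which is what makes class-balanced data augmentation possible.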
step 2-3, the N clients each pass their own data through the trained conditional GAN model and then send their unique local model values to the cloud server; the cloud server computes the aggregated average of the models and sends it back to the N clients, and finally each client updates its local model; the trained conditional GAN model is added to the pFedMe model of step 1 in the personalized federal learning manner, thereby achieving data augmentation and obtaining the conditional-GAN-based pFedMe model.
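In its simplest form, the server-side step described above reduces to averaging the clients' local model parameters; this sketch assumes plain averaging (the full pFedMe update may additionally mix the average with the previous global model):

```python
import numpy as np

def server_aggregate(local_models):
    # aggregate the clients' latest local model values into one global model
    return np.stack(local_models).mean(axis=0)

# three clients send their unique local model values to the cloud server
locals_ = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
global_model = server_aggregate(locals_)
print(global_model)  # sent back to all clients, which then update their local copies
```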
4. The method for data augmentation and improvement of personalized federal learning performance based on conditional GAN as claimed in claim 3, wherein: the step 3 further comprises the step of,
step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all images fall into 10 different categories; the complete data set is distributed to N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed and a data set is generated for the personalized federal learning task; each user's data set is divided by random sampling into 75% for training and 25% for testing, with no overlap between the training and test data;
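The 75%/25% random split of step 3-2 can be sketched as follows; the per-client sample count assumes CIFAR10's 60000 images divided evenly over N = 20 clients, and the feature shape is a toy stand-in:

```python
import numpy as np

def split_client_data(x, y, train_frac=0.75, seed=0):
    # random, non-overlapping train/test split for one client's data
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    cut = int(train_frac * len(x))
    tr, te = idx[:cut], idx[cut:]
    return (x[tr], y[tr]), (x[te], y[te])

x = np.zeros((3000, 8))   # 60000 / 20 = 3000 samples per client; 8 toy features
y = np.arange(3000)       # unique ids stand in for labels, to verify disjointness
(train_x, train_y), (test_x, test_y) = split_client_data(x, y)
print(len(train_x), len(test_x))
```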
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, with the test-set samples kept the same as in step 3-2, to obtain the accuracy of the conditional-GAN-based pFedMe data augmentation method on the CIFAR10 test set;
and 3-4, the training samples are expanded by repeated sampling (direct copying) and then fed to the pFedMe model of step 1, with the CIFAR10 test set kept the same as in step 3-2, to obtain the test-set accuracy of the repeated-sampling data augmentation method.
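The repeated-sampling baseline of step 3-4 simply copies existing training samples; this sketch makes explicit that the enlarged set contains no new information:

```python
import numpy as np

def augment_by_resampling(x, y, factor=2, seed=0):
    # 'repeated sampling' baseline: enlarge the training set by drawing
    # existing samples again with replacement (no new samples are created)
    rng = np.random.default_rng(seed)
    extra = rng.integers(0, len(x), size=(factor - 1) * len(x))
    return np.concatenate([x, x[extra]]), np.concatenate([y, y[extra]])

x = np.arange(10).reshape(10, 1).astype(float)
y = np.arange(10)
aug_x, aug_y = augment_by_resampling(x, y, factor=2)
print(len(aug_x))                            # twice the original size
print(set(aug_x.ravel()) <= set(x.ravel()))  # only copies of existing values
```

This is exactly why, in fig. 3, the copied data set behaves like the un-augmented one: the model sees the same points more often rather than a richer distribution.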
5. The method for data augmentation and improvement of personalized federal learning performance based on conditional GAN as claimed in claim 4, wherein: the step 4 further comprises the step of,
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
and 4-2, the accuracy of step 4-1 is compared pairwise with the accuracies of the two data augmentation methods of step 3, thereby obtaining a method for improving personalized federal learning performance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210486378.1A CN114913390A (en) | 2022-05-06 | 2022-05-06 | Method for improving personalized federal learning performance based on data augmentation of conditional GAN |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210486378.1A CN114913390A (en) | 2022-05-06 | 2022-05-06 | Method for improving personalized federal learning performance based on data augmentation of conditional GAN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114913390A true CN114913390A (en) | 2022-08-16 |
Family
ID=82767022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210486378.1A Pending CN114913390A (en) | 2022-05-06 | 2022-05-06 | Method for improving personalized federal learning performance based on data augmentation of conditional GAN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114913390A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021120676A1 (en) * | 2020-06-30 | 2021-06-24 | 平安科技(深圳)有限公司 | Model training method for federated learning network, and related device |
CN113468521A (en) * | 2021-07-01 | 2021-10-01 | 哈尔滨工程大学 | Data protection method for federal learning intrusion detection based on GAN |
CN113762530A (en) * | 2021-09-28 | 2021-12-07 | 北京航空航天大学 | Privacy protection-oriented precision feedback federal learning method |
CN114021738A (en) * | 2021-11-23 | 2022-02-08 | 湖南三湘银行股份有限公司 | Distributed generation countermeasure model-based federal learning method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Thapa et al. | Splitfed: When federated learning meets split learning | |
CN107392255B (en) | Generation method and device of minority picture sample, computing equipment and storage medium | |
CN108229479B (en) | Training method and device of semantic segmentation model, electronic equipment and storage medium | |
US20200073968A1 (en) | Sketch-based image retrieval techniques using generative domain migration hashing | |
CN108399428B (en) | Triple loss function design method based on trace ratio criterion | |
CN107506822B (en) | Deep neural network method based on space fusion pooling | |
US11636314B2 (en) | Training neural networks using a clustering loss | |
CN111460528B (en) | Multi-party combined training method and system based on Adam optimization algorithm | |
CN111260754B (en) | Face image editing method and device and storage medium | |
CN110598806A (en) | Handwritten digit generation method for generating countermeasure network based on parameter optimization | |
EP4350572A1 (en) | Method, apparatus and system for generating neural network model, devices, medium and program product | |
CN113850272A (en) | Local differential privacy-based federal learning image classification method | |
CN108197561B (en) | Face recognition model optimization control method, device, equipment and storage medium | |
US20210019654A1 (en) | Sampled Softmax with Random Fourier Features | |
CN110929839A (en) | Method and apparatus for training neural network, electronic device, and computer storage medium | |
WO2024027164A1 (en) | Adaptive personalized federated learning method supporting heterogeneous model | |
WO2023061169A1 (en) | Image style migration method and apparatus, image style migration model training method and apparatus, and device and medium | |
CN115271101A (en) | Personalized federal learning method based on graph convolution hyper-network | |
CN115587633A (en) | Personalized federal learning method based on parameter layering | |
CN115204416A (en) | Heterogeneous client-oriented joint learning method based on hierarchical sampling optimization | |
CN111324731B (en) | Computer-implemented method for embedding words of corpus | |
CN114913390A (en) | Method for improving personalized federal learning performance based on data augmentation of conditional GAN | |
CN116561622A (en) | Federal learning method for class unbalanced data distribution | |
CN115759297A (en) | Method, device, medium and computer equipment for federated learning | |
US20220108220A1 (en) | Systems And Methods For Performing Automatic Label Smoothing Of Augmented Training Data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||