CN114913390A - Method for improving personalized federated learning performance based on data augmentation of conditional GAN - Google Patents

Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Info

Publication number
CN114913390A
Authority
CN
China
Prior art keywords
model
data
gan
federated learning
personalized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210486378.1A
Other languages
Chinese (zh)
Inventor
杨绿溪
李林育
张征明
李春国
黄永明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority claimed from CN202210486378.1A
Publication of CN114913390A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The invention discloses a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, which comprises the following steps: establishing a personalized federated learning pFedMe model; establishing a conditional GAN model and adding it to the pFedMe model in the personalized federated learning manner to obtain a conditional-GAN-based pFedMe model; acquiring the CIFAR10 data set and realizing data augmentation through the conditional-GAN-based pFedMe model; and obtaining the accuracy of the conditional-GAN-based data augmentation method on the test set. The method shows that the conditional GAN effectively improves personalized federated learning performance, and it has practical value for using conditional GANs to augment the data of such models.

Description

Method for improving personalized federated learning performance based on data augmentation of conditional GAN
Technical Field
The invention relates to the technical field of image processing, in particular to a method for improving personalized federated learning performance through data augmentation based on a conditional generative adversarial network (GAN).
Background
Federated learning is a privacy-preserving machine learning technique in which a set of clients collaboratively learns a global model with a server without sharing client data. One of the core challenges of federated learning is overcoming the performance loss caused by statistical heterogeneity between clients, which studies have shown prevents the global model from providing good performance on each client's task. The personalized federated learning algorithm pFedMe, which uses the Moreau envelope as the client's regularized loss function, aids the optimization of the personalized model.
Today, the development of federated learning is stimulated by the large amount of data generated on a large number of handheld devices. The federated learning scenario involves a large number of clients connected to a server, with the goal of building a global model in a privacy-preserving and communication-efficient manner. Despite its advantages of data privacy protection and low communication overhead, federated learning faces a challenge that affects its performance and convergence speed: statistical heterogeneity, meaning that the data distribution differs between clients. A global model trained on such non-uniformly distributed data struggles to perform well on each client's data, and as statistical heterogeneity increases, the generalization error of the global model on the clients' local data also increases significantly. On the other hand, local learning without federated learning (i.e., without client cooperation) may also suffer large generalization errors due to insufficient data.
Training deep learning models requires a large number of training samples, and insufficient training data leads to severe overfitting and reduced model accuracy. In practice, collecting a large number of samples to train a deep learning model requires time and domain knowledge, which is both expensive and difficult. Data augmentation is a common technique for addressing these problems: it enlarges the relevant data in the data set and lets the model learn more data-related characteristics, thereby effectively avoiding overfitting, making the trained model more robust, and clearly improving generalization. Data augmentation is widely used to improve the performance of image and text classification tasks, and advanced methods such as GANs and conditional GANs have produced well-designed, optimized augmentation schemes for these tasks with good results. However, the impact of data augmentation on personalized federated learning has not been fully studied.
Disclosure of Invention
The purpose of the invention is as follows: in order to overcome the defects in the prior art, the invention provides a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, which improves the performance of a personalized federated learning model to the greatest extent.
The technical scheme is as follows: in order to achieve the above object, the present invention provides a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, comprising the following steps:
step 1, establishing a pFedMe model for personalized federated learning;
step 2, adding the conditional GAN model to the pFedMe model of step 1 in a personalized federated learning manner;
step 3, acquiring the training set of the CIFAR10 data set and training through the conditional-GAN-based pFedMe model of step 2;
step 4, on the test set of the CIFAR10 data set, comparing the performance of the original pFedMe model that uses no data augmentation (i.e., the pFedMe model of step 1), the pFedMe model trained on data augmented by repeated sampling, and the pFedMe model trained with the data augmentation of step 3.
Further, in the present invention, said step 1 further comprises the following steps:
step 1-1, in traditional federated learning, there are N clients communicating with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω. The function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω. In federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-2, in the federated learning personalization problem under consideration, the optimization objective is not the optimal solution of the problem in step 1-1; rather, we wish to provide a specific model for users with different data distributions. For this we use, for each client, a loss function regularized with the $\ell_2$ norm, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization. Note that λ ∈ (0, ∞), which avoids the extremes of λ = 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω. On this basis, the optimization objective of personalized federated learning can be written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\}.$$

In pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and, at the inner level, is kept within a bounded distance of ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which facilitates the design of various learning algorithms. The optimized personalized model $\hat{\theta}_i(\omega)$ is the solution of the inner problem of pFedMe, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-3, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i at global round t and local round r. The purpose of the local model is to help construct the global model while reducing the number of communications between clients and the server. Then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and the gradient $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ can be calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$;
steps 1-4, for the practical algorithm, we use $\tilde{\theta}_i(\omega)$ to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$. Obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$. In practice, one samples a mini-batch of data $\mathcal{T}_i$ and uses the following unbiased estimate of $f_i(\theta_i)$:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i).$$

Second, in general, obtaining a closed-form solution of $\hat{\theta}_i(\omega)$ is not simple. Instead, a first-order iterative method is typically used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$. Define:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2.$$

Suppose λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex; gradient descent (Nesterov's accelerated gradient descent) is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

where ν is the accuracy level and O(·) hides constants. δ can then be adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
Further, in the present invention, said step 2 further comprises the following steps:
step 2-1, GAN is introduced as a framework for training generative models. A GAN can train any kind of generator network, whereas most other generative models require the generator to have some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models such as variational autoencoders can also learn the data distribution, studies have shown that the images they generate tend to be relatively blurred. A GAN, by contrast, can learn a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution.

A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be nonlinear mapping functions, such as multilayer perceptrons.

To learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D require training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, while the parameters of D are adjusted to maximize $\log D(x)$, so the two play a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big].$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from different modalities.

When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. Here y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.

In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility. In the discriminator, x and y are presented as inputs to the discriminant function. The objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big].$$
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends this value back to the N clients, and finally the clients update their respective local model values. The trained conditional GAN model is thus added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and yielding the conditional-GAN-based pFedMe model.
Further, in the present invention, said step 3 further comprises the following steps:
step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all images belong to 10 different categories. We distribute the complete data set across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; each user's data set is randomly split into 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated by repeated sampling to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the repeated-sampling data augmentation method.
Further, in the present invention, said step 4 further comprises the following steps:
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to identify the method that improves personalized federated learning performance.
Beneficial effects: compared with the prior art, the invention has the following beneficial effects:
(1) the method for improving personalized federated learning performance is obtained by comparing the influence on personalized federated learning performance of using no data augmentation against using the two data augmentation methods;
(2) the invention proposes augmenting the training data set for personalized federated learning with a generative model based on a conditional GAN, which, as a generative model, can be used to generate data under fixed conditions. The invention provides a communication scheme based on personalized federated learning that adds the conditional GAN into the personalized federated learning model to realize data augmentation, rather than simply augmenting the data set directly.
Drawings
FIG. 1 is a schematic overall flow chart of the method for improving personalized federated learning performance through conditional-GAN-based data augmentation according to the present invention;
FIG. 2 is a schematic structural diagram of the conditional-GAN-based pFedMe model in the present invention;
FIG. 3 is a schematic diagram comparing the performance curves of the personalized federated learning model on the CIFAR10 data set under three conditions: no data augmentation, data augmentation by repeated sampling, and the conditional-GAN-based data augmentation method of the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail below with reference to the drawings and the detailed embodiments:
As shown in FIG. 1, the overall flow of the method for improving personalized federated learning performance through conditional-GAN-based data augmentation comprises the following steps:
step 1, establishing a pFedMe model for personalized federated learning, in which N clients communicate with one server;
specifically, the step 1 further comprises the following steps,
step 1-1, constructing the personalized federated learning model pFedMe, which performs better than traditional federated learning;
further, the construction of the pFedMe model comprises the following steps:
step 1-1-1, there are N clients communicating with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω. The function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω. In federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-1-2, for each client, a loss function regularized with the $\ell_2$ norm is used, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization. Note that λ ∈ (0, ∞), which avoids the extremes of λ = 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω. On this basis, the optimization objective of personalized federated learning can be written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\}.$$
Step 1-2, solving each client's optimal personalized model $\hat{\theta}_i(\omega)$;

further, the calculation of the optimal personalized model $\hat{\theta}_i(\omega)$ comprises the following steps:

step 1-2-1, in pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and, at the inner level, is kept within a bounded distance of ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which facilitates the design of various learning algorithms. The optimized personalized model $\hat{\theta}_i(\omega)$ is the solution of the inner problem of pFedMe, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-2-2, for the practical algorithm, we use $\tilde{\theta}_i(\omega)$ to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$. Obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$. In practice, one samples a mini-batch of data $\mathcal{T}_i$ and uses the following unbiased estimate of $f_i(\theta_i)$:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i).$$

Second, in general, obtaining a closed-form solution of $\hat{\theta}_i(\omega)$ is not simple. Instead, a first-order iterative method is typically used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$. Define:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2.$$

Suppose λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex; gradient descent (Nesterov's accelerated gradient descent) is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

which requires a number of computations

$$K = O\big(\sqrt{\kappa}\,\log(d/\nu)\big),$$

where κ is the condition number of $\tilde{h}_i$, d is the diameter of the search space, ν is the accuracy level, and O(·) hides constants. δ can then be adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
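To make the inner solve concrete, the following is a minimal sketch (not the patented implementation) of computing the approximate personalized model $\tilde{\theta}_i(\omega)$ by running a few gradient steps on $\tilde{h}_i$; plain gradient descent is used instead of Nesterov's accelerated variant, and the function names, the flat-parameter representation, and all hyperparameter values are illustrative assumptions:

```python
import torch

def approx_personalized_model(minibatch_loss, w, lam=15.0, inner_lr=0.01, inner_steps=5):
    """Approximately minimize h~(theta) = f~(theta; T_i) + (lam/2)||theta - w||^2.

    minibatch_loss: callable mapping a flat parameter tensor theta to the scalar
    mini-batch loss f~_i(theta; T_i); w: flat tensor of the current local model
    (detached). Returns the approximate personalized model theta~_i(w).
    """
    theta = w.clone().requires_grad_(True)
    for _ in range(inner_steps):
        # strongly convex for sufficiently large lam, as assumed in step 1-2-2
        h = minibatch_loss(theta) + 0.5 * lam * torch.sum((theta - w) ** 2)
        (grad,) = torch.autograd.grad(h, theta)
        with torch.no_grad():
            theta -= inner_lr * grad  # one plain gradient step on h~
    return theta.detach()
```

In practice one would stop once $\|\nabla \tilde{h}_i\|^2 \le \nu$ rather than after a fixed number of steps.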
Step 1-3, solving the global model $\omega^t$;

further, the calculation of the global model $\omega^t$ comprises the following steps:

step 1-3-1, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2-1 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i at global round t and local round r. The purpose of the local model is to help construct the global model while reducing the number of communications between clients and the server. Then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and the gradient $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ can be calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$.
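Putting the inner and outer updates together, the sketch below shows one communication round in this style, reusing `approx_personalized_model` from the sketch above; the `client` interface, the round constants, and the plain averaging at the server are simplifying assumptions of this illustration:

```python
import random
import torch

def pfedme_round(w_global, clients, lam=15.0, eta=0.005, R=20, sample_size=5):
    """One global round: R local updates per client, then server-side averaging."""
    for client in clients:
        w = w_global.clone()
        for _ in range(R):
            # fresh mini-batch loss f~_i( . ; T_i) for this local round (assumed helper)
            loss_fn = client.make_minibatch_loss()
            theta = approx_personalized_model(loss_fn, w, lam=lam)
            w = w - eta * lam * (w - theta)  # w <- w - eta * grad F_i(w)
        client.w_local = w
    # the server samples a subset S^t uniformly and averages the latest local models
    sampled = random.sample(clients, k=sample_size)
    return torch.stack([c.w_local for c in sampled]).mean(dim=0)
```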
Step 2, adding the conditional GAN model to the pFedMe model of step 1 in the personalized federated learning manner;
specifically, the step 2 further comprises the following steps,
step 2-1, GAN is introduced as a framework for training generative models. A GAN can train any kind of generator network, whereas most other generative models require the generator to have some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models such as variational autoencoders can also learn the data distribution, studies have shown that the images they generate tend to be blurred. A GAN, by contrast, can learn a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution.

A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be nonlinear mapping functions, such as multilayer perceptrons.

To learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D require training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, while the parameters of D are adjusted to maximize $\log D(x)$, so the two play a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big].$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from different modalities.

When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. Here y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.

In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility. In the discriminator, x and y are presented as inputs to the discriminant function. The objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big].$$
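As an illustration of this conditioning, the following is a minimal PyTorch sketch of a conditional generator and discriminator in which an embedded class label y is concatenated to the input of each network; the MLP architecture and layer sizes are assumptions made for illustration and do not reproduce the parameter settings of Table 1 below:

```python
import torch
import torch.nn as nn

NZ, NCLASS, IMG_DIM = 100, 10, 32 * 32 * 3  # noise dim, CIFAR10 classes, flat image

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NCLASS, NCLASS)  # label y as an extra input layer
        self.net = nn.Sequential(
            nn.Linear(NZ + NCLASS, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, y):  # G(z | y)
        return self.net(torch.cat([z, self.embed(y)], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NCLASS, NCLASS)
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + NCLASS, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid(),  # probability that x is real given y
        )

    def forward(self, x, y):  # D(x | y)
        return self.net(torch.cat([x, self.embed(y)], dim=1))
```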
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends it back to the N clients, and finally the clients update their respective local model values. That is, the trained conditional GAN model is added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and yielding the conditional-GAN-based pFedMe model, as shown in FIG. 2.
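The communication pattern of step 2-3 can be sketched as below; the FedAvg-style state-dict averaging, the `clients` list, and the `generator` attribute are assumptions of this illustration rather than the patent's exact protocol:

```python
import torch

def average_state_dicts(state_dicts):
    """Element-wise mean of the clients' model parameters (plain averaging)."""
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}

# server side: aggregate the N clients' conditional-GAN generators ...
avg_g = average_state_dicts([c.generator.state_dict() for c in clients])
# ... and send the average back; each client updates its local model values
for c in clients:
    c.generator.load_state_dict(avg_g)  # likewise for the discriminator
```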
Further, in this embodiment, the main parameters of the conditional-GAN neural network in the personalized federated learning task are set as in Table 1 below:
Table 1: conditional GAN parameter settings
(Table 1 is reproduced as an image in the original publication; the parameter values are not recoverable from the text.)
During the experiments, the ReLU function is used as the activation function of the hidden layers of the conditional-GAN neural network, and the trained conditional GAN model is added to the pFedMe model as shown in FIG. 2 to achieve the effect of data augmentation.
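For completeness, one adversarial training step for the networks sketched above could look as follows, using the standard binary-cross-entropy surrogate of the min-max objective; the Adam settings are illustrative assumptions and not the values of Table 1:

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

def cgan_step(x_real, y_real):
    """x_real: (b, IMG_DIM) flat images in [-1, 1]; y_real: (b,) class labels."""
    b = x_real.size(0)
    z = torch.randn(b, NZ)
    y_fake = torch.randint(0, NCLASS, (b,))
    # discriminator: ascend log D(x|y) + log(1 - D(G(z|y')))
    opt_d.zero_grad()
    loss_d = (bce(D(x_real, y_real), torch.ones(b, 1)) +
              bce(D(G(z, y_fake).detach(), y_fake), torch.zeros(b, 1)))
    loss_d.backward()
    opt_d.step()
    # generator: the usual non-saturating surrogate for descending log(1 - D(G(z|y')))
    opt_g.zero_grad()
    loss_g = bce(D(G(z, y_fake), y_fake), torch.ones(b, 1))
    loss_g.backward()
    opt_g.step()
```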
Step 3, training on the training set of the CIFAR10 data set through the conditional-GAN-based pFedMe model of step 2; and, separately, acquiring the training set of the CIFAR10 data set, expanding the data by repeatedly sampling the training set, and feeding the result into the pFedMe model for training;
specifically, the step 3 further comprises the following steps,
in step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all pictures belong to 10 different categories. Given the limited size of CIFAR10, we distribute the complete data set across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; the whole data set is randomly split, with 75% used for training and 25% for testing, and no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the direct-duplication data augmentation method.
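A minimal sketch of this experimental setup is given below: the data are partitioned across N = 20 clients, each client's share is split 75/25 into train and test, and the two augmentation variants of steps 3-3 and 3-4 are built (synthetic samples from the trained conditional generator of the earlier sketch versus plain duplication). The data path, the use of only the 50000-image training split, and the omission of image reshaping and normalization are simplifications of this illustration:

```python
import numpy as np
import torch
import torchvision

data = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
idx = np.random.permutation(len(data))
client_indices = np.array_split(idx, 20)          # N = 20 clients

def split_75_25(indices):
    cut = int(0.75 * len(indices))
    return indices[:cut], indices[cut:]           # disjoint train / test indices

train_i, test_i = split_75_25(client_indices[0])  # one client's split

# step 3-4 baseline: "augment" by repeated sampling (plain duplication)
dup_train_i = np.concatenate([train_i, train_i])

# step 3-3: augment with labelled samples from the trained conditional generator
z = torch.randn(len(train_i), NZ)
y = torch.randint(0, NCLASS, (len(train_i),))
synthetic_flat = G(z, y).detach()                 # extra (image, label) pairs in [-1, 1]
```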
As shown in FIG. 3, the total number of global communication rounds is set to 800 in the simulation, and the accuracy on the test set is expressed as a percentage. The curve for the conditional-GAN-based data augmentation method shows that the test-set accuracy increases steadily with the number of global communication rounds and finally reaches 65%. The test-set accuracy curve of the repeated-sampling data augmentation method fluctuates heavily when the number of global communication rounds is small; as the number of rounds increases, the test-set accuracy grows more slowly and finally reaches 55% (10 percentage points below the conditional-GAN-based method).
Step 4, comparing the accuracy on the CIFAR10 test set of the original pFedMe model that uses no data augmentation (i.e., the pFedMe model of step 1) against the accuracies of the two data-augmented pFedMe models of step 3;
specifically, the step 4 further comprises the following steps,
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to identify the method that improves personalized federated learning performance.
Comparing the test-set accuracy curves in FIG. 3 shows that once the number of global communication rounds exceeds about 180, the accuracy of the conditional-GAN-based data augmentation method on the test set is significantly higher than that of the method that augments the data set by direct duplication; that is, the conditional-GAN-based data augmentation method is better at improving personalized federated learning performance. It can also be seen that the test-set accuracy is essentially the same with no data augmentation as with direct duplication, except that when the number of global communication rounds is small (roughly 20 or fewer), direct duplication even degrades the performance of the original model. Direct duplication therefore does not achieve data augmentation in any real sense and cannot improve model performance. By contrast, we conclude that the conditional-GAN-based data augmentation method can greatly improve the performance of personalized federated learning.
For the image processing problem, the invention realizes a scheme for improving personalized federated learning performance through data augmentation based on a deep conditional GAN. Building on the personalized federated learning method, adding the conditional GAN into the model for data augmentation in the personalized federated learning manner improves personalized federated learning performance and provides important guidance for data augmentation with conditional GANs.
It should be noted that the above examples represent only some embodiments of the present invention, and their description should not be construed as limiting the scope of the invention. For those skilled in the art, various modifications can be made without departing from the spirit of the present invention, and such modifications shall fall within the scope of the present invention.

Claims (5)

1. A method for improving personalized federated learning performance through data augmentation based on a conditional GAN, characterized by comprising the following steps:

step 1, establishing a pFedMe model for personalized federated learning;
step 2, adding the conditional GAN model to the pFedMe model of step 1 in a personalized federated learning manner;
step 3, acquiring the training set of the CIFAR10 data set and training through the conditional-GAN-based pFedMe model of step 2;
step 4, on the test set of the CIFAR10 data set, comparing the performance of the original pFedMe model that uses no data augmentation, namely the pFedMe model of step 1, the pFedMe model trained on data augmented by repeated sampling, and the pFedMe model trained with the data augmentation of step 3.
2. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 1, wherein said step 1 further comprises the steps of:
step 1-1, in traditional federated learning, N clients communicate with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω; the function $f_i : \mathbb{R}^d \to \mathbb{R}$ represents the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω; in federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-2, in the federated learning personalization problem under consideration, the optimization objective is not to obtain the optimal solution of the problem in step 1-1 above, but rather to provide a specific model for users with different data distributions; for this purpose, a loss function regularized with the $\ell_2$ norm is used for each client, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model; a large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization; note that λ ∈ (0, ∞), which avoids the extreme cases λ = 0 and λ → ∞; the regularized loss function allows each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω; on this basis, the optimization objective of personalized federated learning is written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$

in pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and kept within a bounded distance of ω at the inner level; the definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which helps the design of various learning algorithms; the optimized personalized model $\hat{\theta}_i(\omega)$ is the best solution of the pFedMe inner problem, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-3, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients; then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, receiving the latest local models for model averaging;

in the inner optimization, each client i solves the personalization problem of step 1-2 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ represents the local model of client i at global round t and local round r; the purpose of the local model is to help construct the global model and reduce the number of communications between clients and the server; then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$, as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ is calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$;
steps 1-4, for the practical algorithm, $\tilde{\theta}_i(\omega)$ is used to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$; obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$; by sampling a mini-batch of data $\mathcal{T}_i$, the following unbiased estimate of $f_i(\theta_i)$ is used:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i);$$

second, a first-order iterative method is used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$; defining:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2,$$

and supposing λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex, gradient descent is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

where ν is the accuracy level and O(·) hides constants; δ is then adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
3. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 1 or 2, wherein said step 2 further comprises the steps of:
step 2-1, GAN is introduced as a framework for training a generative model; the GAN can train any generator network; in addition, the GAN learns a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution;

the GAN consists of a pair of "adversarial" models: a generative model G capturing the data distribution, and a discriminative model D estimating the probability that a sample came from the real data instead of from the generative model G; both G and D may be nonlinear mapping functions;

to learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space; the discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$;

both G and D need training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$ and the parameters of D are adjusted to maximize $\log D(x)$, the two following a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big];$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled; however, by conditioning the model on additional information, the data generation process is guided; this conditioning is based on class labels, some prior information, or even data from different modalities;

when both the generator and the discriminator are conditioned on some extra information y, the GAN is extended to a conditional model, i.e., a conditional GAN; y is any type of auxiliary information; the conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer;

in the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility; in the discriminator, x and y are presented as inputs to the discriminant function; the objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big];$$
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends the value back to the N clients, and finally the clients update their respective local model values; the trained conditional GAN model is thus added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and obtaining the conditional-GAN-based pFedMe model.
4. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 3, wherein said step 3 further comprises the steps of:
step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all images belong to 10 different categories; the complete data set is distributed across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; each user's data set is randomly split into 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated by repeated sampling to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the repeated-sampling data augmentation method.
5. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 4, wherein said step 4 further comprises the steps of:
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to obtain the method that improves personalized federated learning performance.
CN202210486378.1A 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210486378.1A CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210486378.1A CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Publications (1)

Publication Number Publication Date
CN114913390A true CN114913390A (en) 2022-08-16

Family

ID=82767022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210486378.1A Pending CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federal learning performance based on data augmentation of conditional GAN

Country Status (1)

Country Link
CN (1) CN114913390A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021120676A1 (en) * 2020-06-30 2021-06-24 平安科技(深圳)有限公司 Model training method for federated learning network, and related device
CN113468521A (en) * 2021-07-01 2021-10-01 哈尔滨工程大学 Data protection method for federal learning intrusion detection based on GAN
CN113762530A (en) * 2021-09-28 2021-12-07 北京航空航天大学 Privacy protection-oriented precision feedback federal learning method
CN114021738A (en) * 2021-11-23 2022-02-08 湖南三湘银行股份有限公司 Distributed generation countermeasure model-based federal learning method


Similar Documents

Publication Publication Date Title
Thapa et al. Splitfed: When federated learning meets split learning
CN107392255B (en) Generation method and device of minority picture sample, computing equipment and storage medium
CN108229479B (en) Training method and device of semantic segmentation model, electronic equipment and storage medium
US20200073968A1 (en) Sketch-based image retrieval techniques using generative domain migration hashing
CN108399428B (en) Triple loss function design method based on trace ratio criterion
CN107506822B (en) Deep neural network method based on space fusion pooling
US11636314B2 (en) Training neural networks using a clustering loss
CN111460528B (en) Multi-party combined training method and system based on Adam optimization algorithm
CN111260754B (en) Face image editing method and device and storage medium
CN110598806A (en) Handwritten digit generation method for generating countermeasure network based on parameter optimization
EP4350572A1 (en) Method, apparatus and system for generating neural network model, devices, medium and program product
CN113850272A (en) Local differential privacy-based federal learning image classification method
CN108197561B (en) Face recognition model optimization control method, device, equipment and storage medium
US20210019654A1 (en) Sampled Softmax with Random Fourier Features
CN110929839A (en) Method and apparatus for training neural network, electronic device, and computer storage medium
WO2024027164A1 (en) Adaptive personalized federated learning method supporting heterogeneous model
WO2023061169A1 (en) Image style migration method and apparatus, image style migration model training method and apparatus, and device and medium
CN115271101A (en) Personalized federal learning method based on graph convolution hyper-network
CN115587633A (en) Personalized federal learning method based on parameter layering
CN115204416A (en) Heterogeneous client-oriented joint learning method based on hierarchical sampling optimization
CN111324731B (en) Computer-implemented method for embedding words of corpus
CN114913390A (en) Method for improving personalized federal learning performance based on data augmentation of conditional GAN
CN116561622A (en) Federal learning method for class unbalanced data distribution
CN115759297A (en) Method, device, medium and computer equipment for federated learning
US20220108220A1 (en) Systems And Methods For Performing Automatic Label Smoothing Of Augmented Training Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination