CN114913390A - Method for improving personalized federated learning performance based on data augmentation of conditional GAN - Google Patents

Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Info

Publication number
CN114913390A
Authority
CN
China
Prior art keywords
model
data
gan
federated learning
personalized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210486378.1A
Other languages
Chinese (zh)
Inventor
杨绿溪
李林育
张征明
李春国
黄永明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority claimed from CN202210486378.1A
Publication of CN114913390A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The invention discloses a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, which comprises the following steps: establishing a personalized federated learning pFedMe model; establishing a conditional GAN model and adding it to the pFedMe model in the personalized federated learning manner to obtain a conditional-GAN-based pFedMe model; acquiring the CIFAR10 data set and realizing data augmentation through the conditional-GAN-based pFedMe model; and obtaining the accuracy of the conditional-GAN-based data augmentation method on the test set. The method shows that the conditional GAN effectively improves personalized federated learning performance, and it has practical value for using conditional GANs to augment the data of such models.

Description

Method for improving personalized federated learning performance based on data augmentation of conditional GAN
Technical Field
The invention relates to the technical field of image processing, in particular to a method for improving personalized federated learning performance through data augmentation based on a conditional generative adversarial network (GAN).
Background
Federated learning is a privacy-preserving machine learning technique in which a set of clients collaboratively learns a global model with a server without sharing client data. One of the core challenges of federated learning is overcoming the performance loss caused by statistical heterogeneity between clients, which studies have shown prevents the global model from providing good performance on each client's task. The personalized federated learning algorithm pFedMe, which uses the Moreau envelope as the client's regularized loss function, aids the optimization of the personalized model.
Today, the development of federated learning is stimulated by the large amount of data generated on a large number of handheld devices. The federated learning scenario involves a large number of clients connected to a server, with the goal of building a global model in a privacy-preserving and communication-efficient manner. Despite its advantages of data privacy protection and low communication overhead, federated learning faces a challenge that affects its performance and convergence speed: statistical heterogeneity, meaning that the data distribution differs between clients. A global model trained on such non-uniformly distributed data struggles to perform well on each client's data, and as statistical heterogeneity increases, the generalization error of the global model on the clients' local data also increases significantly. On the other hand, local learning without federated learning (i.e., without client cooperation) may also suffer large generalization errors due to insufficient data.
Training deep learning models requires a large number of training samples, and insufficient training data leads to severe overfitting and reduced model accuracy. In practice, collecting a large number of samples to train a deep learning model requires time and domain knowledge, which is both expensive and difficult. Data augmentation is a common technique for addressing these problems: it enlarges the relevant data in the data set and lets the model learn more data-related characteristics, thereby effectively avoiding overfitting, making the trained model more robust, and clearly improving generalization. Data augmentation is widely used to improve the performance of image and text classification tasks, and advanced methods such as GANs and conditional GANs have produced well-designed, optimized augmentation schemes for these tasks with good results. However, the impact of data augmentation on personalized federated learning has not been fully studied.
Disclosure of Invention
The purpose of the invention is as follows: in order to overcome the defects in the prior art, the invention provides a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, which improves the performance of a personalized federated learning model to the greatest extent.
The technical scheme is as follows: in order to achieve the above object, the present invention provides a method for improving personalized federated learning performance through data augmentation based on a conditional GAN, comprising the following steps:
step 1, establishing a pFedMe model for personalized federated learning;
step 2, adding the conditional GAN model to the pFedMe model of step 1 in a personalized federated learning manner;
step 3, acquiring the training set of the CIFAR10 data set and training through the conditional-GAN-based pFedMe model of step 2;
step 4, on the test set of the CIFAR10 data set, comparing the performance of the original pFedMe model that uses no data augmentation (i.e., the pFedMe model of step 1), the pFedMe model trained on data augmented by repeated sampling, and the pFedMe model trained with the data augmentation of step 3.
Further, in the present invention, said step 1 further comprises the following steps:
step 1-1, in traditional federated learning, there are N clients communicating with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω. The function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω. In federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-2, in the federated learning personalization problem under consideration, the optimization objective is not the optimal solution of the problem in step 1-1; rather, we wish to provide a specific model for users with different data distributions. For this we use, for each client, a loss function regularized with the $\ell_2$ norm, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization. Note that λ ∈ (0, ∞), which avoids the extremes of λ = 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω. On this basis, the optimization objective of personalized federated learning can be written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\}.$$

In pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and, at the inner level, is kept within a bounded distance of ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which facilitates the design of various learning algorithms. The optimized personalized model $\hat{\theta}_i(\omega)$ is the solution of the inner problem of pFedMe, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-3, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i at global round t and local round r. The purpose of the local model is to help construct the global model while reducing the number of communications between clients and the server. Then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and the gradient $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ can be calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$;
steps 1-4, for the practical algorithm, we use $\tilde{\theta}_i(\omega)$ to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$. Obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$. In practice, one samples a mini-batch of data $\mathcal{T}_i$ and uses the following unbiased estimate of $f_i(\theta_i)$:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i).$$

Second, in general, obtaining a closed-form solution of $\hat{\theta}_i(\omega)$ is not simple. Instead, a first-order iterative method is typically used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$. Define:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2.$$

Suppose λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex; gradient descent (Nesterov's accelerated gradient descent) is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

where ν is the accuracy level and O(·) hides constants. δ can then be adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
Further, in the present invention, said step 2 further comprises the following steps:
step 2-1, GAN is introduced as a framework for training generative models. A GAN can train any kind of generator network, whereas most other generative models require the generator to have some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models such as variational autoencoders can also learn the data distribution, studies have shown that the images they generate tend to be relatively blurred. A GAN, by contrast, can learn a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution.

A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be nonlinear mapping functions, such as multilayer perceptrons.

To learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D require training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, while the parameters of D are adjusted to maximize $\log D(x)$, so the two play a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big].$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from different modalities.

When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. Here y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.

In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility. In the discriminator, x and y are presented as inputs to the discriminant function. The objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big].$$
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends this value back to the N clients, and finally the clients update their respective local model values. The trained conditional GAN model is thus added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and yielding the conditional-GAN-based pFedMe model.
Further, in the present invention, said step 3 further comprises the following steps:
step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all images belong to 10 different categories. We distribute the complete data set across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; each user's data set is randomly split into 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated by repeated sampling to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the repeated-sampling data augmentation method.
Further, in the present invention, said step 4 further comprises the following steps:
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to identify the method that improves personalized federated learning performance.
Beneficial effects: compared with the prior art, the invention has the following beneficial effects:
(1) the method for improving personalized federated learning performance is obtained by comparing the influence on personalized federated learning performance of using no data augmentation against using the two data augmentation methods;
(2) the invention proposes augmenting the training data set for personalized federated learning with a generative model based on a conditional GAN, which, as a generative model, can be used to generate data under fixed conditions. The invention provides a communication scheme based on personalized federated learning that adds the conditional GAN into the personalized federated learning model to realize data augmentation, rather than simply augmenting the data set directly.
Drawings
FIG. 1 is a schematic overall flow chart of the method for improving personalized federated learning performance through conditional-GAN-based data augmentation according to the present invention;
FIG. 2 is a schematic structural diagram of the conditional-GAN-based pFedMe model in the present invention;
FIG. 3 is a schematic diagram comparing the performance curves of the personalized federated learning model on the CIFAR10 data set under three conditions: no data augmentation, data augmentation by repeated sampling, and the conditional-GAN-based data augmentation method of the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail below with reference to the drawings and the detailed embodiments:
As shown in FIG. 1, the overall flow of the method for improving personalized federated learning performance through conditional-GAN-based data augmentation comprises the following steps:
step 1, establishing a pFedMe model for personalized federated learning, in which N clients communicate with one server;
specifically, the step 1 further comprises the following steps,
step 1-1, constructing the personalized federated learning model pFedMe, which performs better than traditional federated learning;
further, the construction of the pFedMe model comprises the following steps:
step 1-1-1, there are N clients communicating with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω. The function $f_i : \mathbb{R}^d \to \mathbb{R}$, $i = 1, \dots, N$, denotes the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω. In federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-1-2, for each client, a loss function regularized with the $\ell_2$ norm is used, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model. A large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization. Note that λ ∈ (0, ∞), which avoids the extremes of λ = 0 (i.e., no federated learning) and λ → ∞ (i.e., no personalization). The main idea of the regularized loss function is to allow each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω. On this basis, the optimization objective of personalized federated learning can be written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\}.$$
Step 1-2, solving each client's optimal personalized model $\hat{\theta}_i(\omega)$;

further, the calculation of the optimal personalized model $\hat{\theta}_i(\omega)$ comprises the following steps:

step 1-2-1, in pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and, at the inner level, is kept within a bounded distance of ω. The definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which facilitates the design of various learning algorithms. The optimized personalized model $\hat{\theta}_i(\omega)$ is the solution of the inner problem of pFedMe, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-2-2, for the practical algorithm, we use $\tilde{\theta}_i(\omega)$ to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$. Obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$. In practice, one samples a mini-batch of data $\mathcal{T}_i$ and uses the following unbiased estimate of $f_i(\theta_i)$:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i).$$

Second, in general, obtaining a closed-form solution of $\hat{\theta}_i(\omega)$ is not simple. Instead, a first-order iterative method is typically used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$. Define:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2.$$

Suppose λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex; gradient descent (Nesterov's accelerated gradient descent) is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

which requires a number of computations

$$K = O\big(\sqrt{\kappa}\,\log(d/\nu)\big),$$

where κ is the condition number of $\tilde{h}_i$, d is the diameter of the search space, ν is the accuracy level, and O(·) hides constants. δ can then be adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
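To make the inner solve concrete, the following is a minimal sketch (not the patented implementation) of computing the approximate personalized model $\tilde{\theta}_i(\omega)$ by running a few gradient steps on $\tilde{h}_i$; plain gradient descent is used instead of Nesterov's accelerated variant, and the function names, the flat-parameter representation, and all hyperparameter values are illustrative assumptions:

```python
import torch

def approx_personalized_model(minibatch_loss, w, lam=15.0, inner_lr=0.01, inner_steps=5):
    """Approximately minimize h~(theta) = f~(theta; T_i) + (lam/2)||theta - w||^2.

    minibatch_loss: callable mapping a flat parameter tensor theta to the scalar
    mini-batch loss f~_i(theta; T_i); w: flat tensor of the current local model
    (detached). Returns the approximate personalized model theta~_i(w).
    """
    theta = w.clone().requires_grad_(True)
    for _ in range(inner_steps):
        # strongly convex for sufficiently large lam, as assumed in step 1-2-2
        h = minibatch_loss(theta) + 0.5 * lam * torch.sum((theta - w) ** 2)
        (grad,) = torch.autograd.grad(h, theta)
        with torch.no_grad():
            theta -= inner_lr * grad  # one plain gradient step on h~
    return theta.detach()
```

In practice one would stop once $\|\nabla \tilde{h}_i\|^2 \le \nu$ rather than after a fixed number of steps.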
Step 1-3, solving the global model $\omega^t$;

further, the calculation of the global model $\omega^t$ comprises the following steps:

step 1-3-1, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients. Then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, from which it receives the latest local models for model averaging.

In the inner optimization, each client i solves the personalization problem of step 1-2-1 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ denotes the local model of client i at global round t and local round r. The purpose of the local model is to help construct the global model while reducing the number of communications between clients and the server. Then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$ (instead of $f_i$), as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and the gradient $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ can be calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$.
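Putting the inner and outer updates together, the sketch below shows one communication round in this style, reusing `approx_personalized_model` from the sketch above; the `client` interface, the round constants, and the plain averaging at the server are simplifying assumptions of this illustration:

```python
import random
import torch

def pfedme_round(w_global, clients, lam=15.0, eta=0.005, R=20, sample_size=5):
    """One global round: R local updates per client, then server-side averaging."""
    for client in clients:
        w = w_global.clone()
        for _ in range(R):
            # fresh mini-batch loss f~_i( . ; T_i) for this local round (assumed helper)
            loss_fn = client.make_minibatch_loss()
            theta = approx_personalized_model(loss_fn, w, lam=lam)
            w = w - eta * lam * (w - theta)  # w <- w - eta * grad F_i(w)
        client.w_local = w
    # the server samples a subset S^t uniformly and averages the latest local models
    sampled = random.sample(clients, k=sample_size)
    return torch.stack([c.w_local for c in sampled]).mean(dim=0)
```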
Step 2, adding the conditional GAN model to the pFedMe model of step 1 in the personalized federated learning manner;
specifically, the step 2 further comprises the following steps,
step 2-1, GAN is introduced as a framework for training generative models. A GAN can train any kind of generator network, whereas most other generative models require the generator to have some specific functional form, such as requiring the output to be Gaussian distributed. In addition, although other generative models such as variational autoencoders can also learn the data distribution, studies have shown that the images they generate tend to be blurred. A GAN, by contrast, can learn a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution.

A GAN consists of a pair of "adversarial" models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the real data rather than from G. Both G and D may be nonlinear mapping functions, such as multilayer perceptrons.

To learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space. The discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$.

Both G and D require training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$, while the parameters of D are adjusted to maximize $\log D(x)$, so the two play a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big].$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled. However, by conditioning the model on additional information, the data generation process can be guided. Such conditioning may be based on class labels, some prior information, or even data from different modalities.

When both the generator and the discriminator are conditioned on some extra information y, the GAN can be extended to a conditional model, i.e., a conditional GAN. Here y may be any type of auxiliary information, such as class labels or data from other modalities. The conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer.

In the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility. In the discriminator, x and y are presented as inputs to the discriminant function. The objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big].$$
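As an illustration of this conditioning, the following is a minimal PyTorch sketch of a conditional generator and discriminator in which an embedded class label y is concatenated to the input of each network; the MLP architecture and layer sizes are assumptions made for illustration and do not reproduce the parameter settings of Table 1 below:

```python
import torch
import torch.nn as nn

NZ, NCLASS, IMG_DIM = 100, 10, 32 * 32 * 3  # noise dim, CIFAR10 classes, flat image

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NCLASS, NCLASS)  # label y as an extra input layer
        self.net = nn.Sequential(
            nn.Linear(NZ + NCLASS, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, y):  # G(z | y)
        return self.net(torch.cat([z, self.embed(y)], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NCLASS, NCLASS)
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + NCLASS, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid(),  # probability that x is real given y
        )

    def forward(self, x, y):  # D(x | y)
        return self.net(torch.cat([x, self.embed(y)], dim=1))
```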
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends it back to the N clients, and finally the clients update their respective local model values. That is, the trained conditional GAN model is added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and yielding the conditional-GAN-based pFedMe model, as shown in FIG. 2.
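The communication pattern of step 2-3 can be sketched as below; the FedAvg-style state-dict averaging, the `clients` list, and the `generator` attribute are assumptions of this illustration rather than the patent's exact protocol:

```python
import torch

def average_state_dicts(state_dicts):
    """Element-wise mean of the clients' model parameters (plain averaging)."""
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}

# server side: aggregate the N clients' conditional-GAN generators ...
avg_g = average_state_dicts([c.generator.state_dict() for c in clients])
# ... and send the average back; each client updates its local model values
for c in clients:
    c.generator.load_state_dict(avg_g)  # likewise for the discriminator
```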
Further, in this embodiment, the main parameters of the conditional-GAN neural network in the personalized federated learning task are set as in Table 1 below:
Table 1: conditional GAN parameter settings
(Table 1 is reproduced as an image in the original publication; the parameter values are not recoverable from the text.)
During the experiments, the ReLU function is used as the activation function of the hidden layers of the conditional-GAN neural network, and the trained conditional GAN model is added to the pFedMe model as shown in FIG. 2 to achieve the effect of data augmentation.
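For completeness, one adversarial training step for the networks sketched above could look as follows, using the standard binary-cross-entropy surrogate of the min-max objective; the Adam settings are illustrative assumptions and not the values of Table 1:

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

def cgan_step(x_real, y_real):
    """x_real: (b, IMG_DIM) flat images in [-1, 1]; y_real: (b,) class labels."""
    b = x_real.size(0)
    z = torch.randn(b, NZ)
    y_fake = torch.randint(0, NCLASS, (b,))
    # discriminator: ascend log D(x|y) + log(1 - D(G(z|y')))
    opt_d.zero_grad()
    loss_d = (bce(D(x_real, y_real), torch.ones(b, 1)) +
              bce(D(G(z, y_fake).detach(), y_fake), torch.zeros(b, 1)))
    loss_d.backward()
    opt_d.step()
    # generator: the usual non-saturating surrogate for descending log(1 - D(G(z|y')))
    opt_g.zero_grad()
    loss_g = bce(D(G(z, y_fake), y_fake), torch.ones(b, 1))
    loss_g.backward()
    opt_g.step()
```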
Step 3, training on the training set of the CIFAR10 data set through the conditional-GAN-based pFedMe model of step 2; and, separately, acquiring the training set of the CIFAR10 data set, expanding the data by repeatedly sampling the training set, and feeding the result into the pFedMe model for training;
specifically, the step 3 further comprises the following steps,
in step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all pictures belong to 10 different categories. Given the limited size of CIFAR10, we distribute the complete data set across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; the whole data set is randomly split, with 75% used for training and 25% for testing, and no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the direct-duplication data augmentation method.
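A minimal sketch of this experimental setup is given below: the data are partitioned across N = 20 clients, each client's share is split 75/25 into train and test, and the two augmentation variants of steps 3-3 and 3-4 are built (synthetic samples from the trained conditional generator of the earlier sketch versus plain duplication). The data path, the use of only the 50000-image training split, and the omission of image reshaping and normalization are simplifications of this illustration:

```python
import numpy as np
import torch
import torchvision

data = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
idx = np.random.permutation(len(data))
client_indices = np.array_split(idx, 20)          # N = 20 clients

def split_75_25(indices):
    cut = int(0.75 * len(indices))
    return indices[:cut], indices[cut:]           # disjoint train / test indices

train_i, test_i = split_75_25(client_indices[0])  # one client's split

# step 3-4 baseline: "augment" by repeated sampling (plain duplication)
dup_train_i = np.concatenate([train_i, train_i])

# step 3-3: augment with labelled samples from the trained conditional generator
z = torch.randn(len(train_i), NZ)
y = torch.randint(0, NCLASS, (len(train_i),))
synthetic_flat = G(z, y).detach()                 # extra (image, label) pairs in [-1, 1]
```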
As shown in FIG. 3, the total number of global communication rounds is set to 800 in the simulation, and the accuracy on the test set is expressed as a percentage. The curve for the conditional-GAN-based data augmentation method shows that the test-set accuracy increases steadily with the number of global communication rounds and finally reaches 65%. The test-set accuracy curve of the repeated-sampling data augmentation method fluctuates heavily when the number of global communication rounds is small; as the number of rounds increases, the test-set accuracy grows more slowly and finally reaches 55% (10 percentage points below the conditional-GAN-based method).
Step 4, comparing the accuracy on the CIFAR10 test set of the original pFedMe model that uses no data augmentation (i.e., the pFedMe model of step 1) against the accuracies of the two data-augmented pFedMe models of step 3;
specifically, the step 4 further comprises the following steps,
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to identify the method that improves personalized federated learning performance.
Comparing the test-set accuracy curves in FIG. 3 shows that once the number of global communication rounds exceeds about 180, the accuracy of the conditional-GAN-based data augmentation method on the test set is significantly higher than that of the method that augments the data set by direct duplication; that is, the conditional-GAN-based data augmentation method is better at improving personalized federated learning performance. It can also be seen that the test-set accuracy is essentially the same with no data augmentation as with direct duplication, except that when the number of global communication rounds is small (roughly 20 or fewer), direct duplication even degrades the performance of the original model. Direct duplication therefore does not achieve data augmentation in any real sense and cannot improve model performance. By contrast, we conclude that the conditional-GAN-based data augmentation method can greatly improve the performance of personalized federated learning.
For the image processing problem, the invention realizes a scheme for improving personalized federated learning performance through data augmentation based on a deep conditional GAN. Building on the personalized federated learning method, adding the conditional GAN into the model for data augmentation in the personalized federated learning manner improves personalized federated learning performance and provides important guidance for data augmentation with conditional GANs.
It should be noted that the above examples represent only some embodiments of the present invention, and their description should not be construed as limiting the scope of the invention. For those skilled in the art, various modifications can be made without departing from the spirit of the present invention, and such modifications shall fall within the scope of the present invention.

Claims (5)

1. A method for improving personalized federated learning performance through data augmentation based on a conditional GAN, characterized by comprising the following steps:

step 1, establishing a pFedMe model for personalized federated learning;
step 2, adding the conditional GAN model to the pFedMe model of step 1 in a personalized federated learning manner;
step 3, acquiring the training set of the CIFAR10 data set and training through the conditional-GAN-based pFedMe model of step 2;
step 4, on the test set of the CIFAR10 data set, comparing the performance of the original pFedMe model that uses no data augmentation, namely the pFedMe model of step 1, the pFedMe model trained on data augmented by repeated sampling, and the pFedMe model trained with the data augmentation of step 3.
2. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 1, wherein said step 1 further comprises the steps of:
step 1-1, in traditional federated learning, N clients communicate with one server to solve the following problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ f(\omega) := \frac{1}{N} \sum_{i=1}^{N} f_i(\omega) \Big\}$$

to find a global model ω; the function $f_i : \mathbb{R}^d \to \mathbb{R}$ represents the expected loss over the data distribution of client i:

$$f_i(\omega) = \mathbb{E}_{\xi_i}\big[\tilde{f}_i(\omega; \xi_i)\big],$$

where $\xi_i$ is a random data sample drawn according to the distribution of client i and $\tilde{f}_i(\omega; \xi_i)$ is the loss function for that sample and ω; in federated learning, clients may have non-independent, non-identically distributed data, i.e., the distributions of $\xi_i$ and $\xi_j$ ($j \neq i$) differ;
step 1-2, in the federated learning personalization problem under consideration, the optimization objective is not to obtain the optimal solution of the problem in step 1-1 above, but rather to provide a specific model for users with different data distributions; for this purpose, a loss function regularized with the $\ell_2$ norm is used for each client, as follows:

$$\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\},$$

where $\theta_i$ represents the personalized model of client i and λ is a regularization parameter that controls the strength of ω's influence on the personalized model; a large λ lets clients with unreliable data benefit from rich data aggregation, while a small λ helps clients with sufficient useful data prioritize personalization; note that λ ∈ (0, ∞), which avoids the extreme cases λ = 0 and λ → ∞; the regularized loss function allows each client to pursue its own model in its own gradient direction while ensuring that its local model does not stray far from the global "reference point" ω; on this basis, the optimization objective of personalized federated learning is written as the following two-level optimization problem:

$$\min_{\omega \in \mathbb{R}^d} \Big\{ F(\omega) := \frac{1}{N} \sum_{i=1}^{N} F_i(\omega) \Big\}, \quad \text{where } F_i(\omega) = \min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$

in pFedMe, while the global model ω is found at the outer level by aggregating data from multiple clients, $\theta_i$ is optimized with respect to client i's data distribution and kept within a bounded distance of ω at the inner level; the definition of $F_i(\omega)$ is the Moreau envelope, widely adopted in the optimization field, which helps the design of various learning algorithms; the optimized personalized model $\hat{\theta}_i(\omega)$ is the best solution of the pFedMe inner problem, defined as follows:

$$\hat{\theta}_i(\omega) = \operatorname{prox}_{f_i / \lambda}(\omega) := \arg\min_{\theta_i \in \mathbb{R}^d} \Big\{ f_i(\theta_i) + \frac{\lambda}{2} \|\theta_i - \omega\|^2 \Big\};$$
step 1-3, similar to traditional federated learning algorithms such as FedAvg, in each communication round t the server broadcasts the latest global model $\omega^t$ to all clients; then, after all clients perform R local updates, the server uniformly samples a subset $S^t$ of clients, receiving the latest local models for model averaging;

in the inner optimization, each client i solves the personalization problem of step 1-2 to obtain its personalized model $\hat{\theta}_i(\omega_{i,r}^t)$, where $\omega_{i,r}^t$ represents the local model of client i at global round t and local round r; the purpose of the local model is to help construct the global model and reduce the number of communications between clients and the server; then, in the outer optimization, client i performs a local gradient-descent update with respect to $F_i$, as follows:

$$\omega_{i,r+1}^t = \omega_{i,r}^t - \eta \nabla F_i(\omega_{i,r}^t),$$

where η is the learning rate and $\nabla F_i(\omega_{i,r}^t) = \lambda\big(\omega_{i,r}^t - \hat{\theta}_i(\omega_{i,r}^t)\big)$ is calculated from the current personalized model $\hat{\theta}_i(\omega_{i,r}^t)$;
steps 1-4, for the practical algorithm, $\tilde{\theta}_i(\omega)$ is used to denote a δ-approximation of $\hat{\theta}_i(\omega)$, i.e., a solution satisfying $\mathbb{E}\big[\|\tilde{\theta}_i(\omega) - \hat{\theta}_i(\omega)\|^2\big] \le \delta$; obtaining $\hat{\theta}_i(\omega)$ would usually require the gradient $\nabla f_i$; however, this requires the distribution of $\xi_i$; by sampling a mini-batch of data $\mathcal{T}_i$, the following unbiased estimate of $f_i(\theta_i)$ is used:

$$\tilde{f}_i(\theta_i; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\xi_i \in \mathcal{T}_i} \tilde{f}_i(\theta_i; \xi_i), \quad \text{so that} \quad \mathbb{E}\big[\tilde{f}_i(\theta_i; \mathcal{T}_i)\big] = f_i(\theta_i);$$

second, a first-order iterative method is used to obtain a highly accurate approximation $\tilde{\theta}_i(\omega)$; defining:

$$\tilde{h}_i(\theta_i; \omega_{i,r}^t, \mathcal{T}_i) := \tilde{f}_i(\theta_i; \mathcal{T}_i) + \frac{\lambda}{2} \|\theta_i - \omega_{i,r}^t\|^2,$$

and supposing λ is chosen such that the loss function $\tilde{h}_i$ is strongly convex, gradient descent is then applied to obtain $\tilde{\theta}_i(\omega_{i,r}^t)$ such that:

$$\big\| \nabla \tilde{h}_i\big( \tilde{\theta}_i(\omega_{i,r}^t); \omega_{i,r}^t, \mathcal{T}_i \big) \big\|^2 \le \nu,$$

where ν is the accuracy level and O(·) hides constants; δ is then adjusted by controlling the mini-batch size $|\mathcal{T}|$, which governs the sampling noise, and the accuracy level ν.
3. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 1 or 2, wherein said step 2 further comprises the steps of:
step 2-1, GAN is introduced as a framework for training a generative model; the GAN can train any generator network; in addition, the GAN learns a model that generates sample points close to the real data, i.e., the distribution learned by the GAN is very close to the real distribution;

the GAN consists of a pair of "adversarial" models: a generative model G capturing the data distribution, and a discriminative model D estimating the probability that a sample came from the real data instead of from the generative model G; both G and D may be nonlinear mapping functions;

to learn the generator distribution $p_g$ over data x, the generator defines a prior noise distribution $p_z(z)$ and a mapping $G(z; \theta_g)$ to data space; the discriminator $D(x; \theta_d)$ outputs a scalar representing the probability that x came from the real data rather than from $p_g$;

both G and D need training: the parameters of G are adjusted to minimize $\log(1 - D(G(z)))$ and the parameters of D are adjusted to maximize $\log D(x)$, the two following a game with value function V(G, D), i.e., the following min-max optimization problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big];$$
step 2-2, in an unconditional generative model, the type of data being generated cannot be controlled; however, by conditioning the model on additional information, the data generation process is guided; this conditioning is based on class labels, some prior information, or even data from different modalities;

when both the generator and the discriminator are conditioned on some extra information y, the GAN is extended to a conditional model, i.e., a conditional GAN; y is any type of auxiliary information; the conditioning is performed by feeding y into both the discriminator and the generator as an additional input layer;

in the generator, the prior input noise $p_z(z)$ and y are combined into a joint hidden representation, which gives the adversarial training framework greater flexibility; in the discriminator, x and y are presented as inputs to the discriminant function; the objective function V(D, G) of the resulting two-player min-max game is as follows:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big];$$
step 2-3, the data of each of the N clients is passed through the trained conditional GAN model; each client then sends its own local model values to the cloud server, the cloud server computes the aggregated average of the models and sends the value back to the N clients, and finally the clients update their respective local model values; the trained conditional GAN model is thus added to the pFedMe model of step 1 in the personalized federated learning manner, realizing data augmentation and obtaining the conditional-GAN-based pFedMe model.
4. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 3, wherein said step 3 further comprises the steps of:
step 3-1, CIFAR10 is a data set containing 60000 color pictures of size 32 × 32, and all images belong to 10 different categories; the complete data set is distributed across N = 20 clients;
step 3-2, a simulation environment consistent with the actual environment is constructed to generate a data set for the personalized federated learning task; each user's data set is randomly split into 75% for training and 25% for testing, with no overlap between the training and test data;
step 3-3, first, the training samples are fed to the conditional-GAN-based pFedMe model of step 2, keeping the test-set samples the same as in step 3-2, to obtain the accuracy on the CIFAR10 test set of the conditional-GAN-based data augmentation method;
step 3-4, the training samples are directly duplicated by repeated sampling to expand the data and then fed to the pFedMe model of step 1, keeping the CIFAR10 test-set samples the same as in step 3-2, to obtain the accuracy on the test set of the repeated-sampling data augmentation method.
5. The method for improving personalized federated learning performance through data augmentation based on a conditional GAN as claimed in claim 4, wherein said step 4 further comprises the steps of:
step 4-1, the CIFAR10 test set of step 3 is passed directly through the pFedMe model of step 1, which uses no data augmentation method;
step 4-2, the accuracy from step 4-1 and the accuracies of the two data-augmentation methods from step 3 are compared pairwise to obtain the method that improves personalized federated learning performance.
CN202210486378.1A 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210486378.1A CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210486378.1A CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federated learning performance based on data augmentation of conditional GAN

Publications (1)

Publication Number Publication Date
CN114913390A true CN114913390A (en) 2022-08-16

Family

ID=82767022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210486378.1A Pending CN114913390A (en) 2022-05-06 2022-05-06 Method for improving personalized federal learning performance based on data augmentation of conditional GAN

Country Status (1)

Country Link
CN (1) CN114913390A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021120676A1 (en) * 2020-06-30 2021-06-24 平安科技(深圳)有限公司 Model training method for federated learning network, and related device
CN113468521A (en) * 2021-07-01 2021-10-01 哈尔滨工程大学 Data protection method for federal learning intrusion detection based on GAN
CN113762530A (en) * 2021-09-28 2021-12-07 北京航空航天大学 Privacy protection-oriented precision feedback federal learning method
CN114021738A (en) * 2021-11-23 2022-02-08 湖南三湘银行股份有限公司 Distributed generation countermeasure model-based federal learning method


Similar Documents

Publication Publication Date Title
Thapa et al. Splitfed: When federated learning meets split learning
CN107392255B (en) Generation method and device of minority picture sample, computing equipment and storage medium
CN108229479B (en) Training method and device of semantic segmentation model, electronic equipment and storage medium
US20200073968A1 (en) Sketch-based image retrieval techniques using generative domain migration hashing
CN108399428B (en) Triple loss function design method based on trace ratio criterion
CN107506822B (en) Deep neural network method based on space fusion pooling
US11636314B2 (en) Training neural networks using a clustering loss
CN111460528B (en) Multi-party combined training method and system based on Adam optimization algorithm
CN111260754B (en) Face image editing method and device and storage medium
CN110598806A (en) Handwritten digit generation method for generating countermeasure network based on parameter optimization
EP4350572A1 (en) Method, apparatus and system for generating neural network model, devices, medium and program product
CN113850272A (en) Local differential privacy-based federal learning image classification method
CN108197561B (en) Face recognition model optimization control method, device, equipment and storage medium
US20210019654A1 (en) Sampled Softmax with Random Fourier Features
CN110929839A (en) Method and apparatus for training neural network, electronic device, and computer storage medium
WO2024027164A1 (en) Adaptive personalized federated learning method supporting heterogeneous model
WO2023061169A1 (en) Image style migration method and apparatus, image style migration model training method and apparatus, and device and medium
CN115271101A (en) Personalized federal learning method based on graph convolution hyper-network
CN115587633A (en) Personalized federal learning method based on parameter layering
CN115204416A (en) Heterogeneous client-oriented joint learning method based on hierarchical sampling optimization
CN111324731B (en) Computer-implemented method for embedding words of corpus
CN114913390A (en) Method for improving personalized federal learning performance based on data augmentation of conditional GAN
CN116561622A (en) Federal learning method for class unbalanced data distribution
CN115759297A (en) Method, device, medium and computer equipment for federated learning
US20220108220A1 (en) Systems And Methods For Performing Automatic Label Smoothing Of Augmented Training Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination