CN117473558A - Self-adaptive DPWGAN training method and system based on federal learning - Google Patents

Self-adaptive DPWGAN training method and system based on federal learning

Info

Publication number
CN117473558A
CN117473558A
Authority
CN
China
Prior art keywords
dpwgan
training
adaptive
client
generator parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311785715.8A
Other languages
Chinese (zh)
Inventor
周长利
江振宇
陈子康
朱文龙
陈祖希
程小刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority to CN202311785715.8A priority Critical patent/CN117473558A/en
Publication of CN117473558A publication Critical patent/CN117473558A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks

Abstract

The invention discloses a self-adaptive DPWGAN training method and system based on federal learning. In the method, a server broadcasts the discriminator, generator parameters and noise scale of an initialized WGAN to the clients; each client performs DPWGAN training with its local data set and perturbs the trained generator parameters with a differential privacy scheme of self-adaptive noise; the perturbed generator parameters are uploaded to the server; the server performs a weighted average operation on the perturbed generator parameters to obtain an aggregated global model, and computes the Inception Score and FID value of images generated by the aggregated global model; updating continues until the Inception Score and FID values reach the expected targets. The invention solves the problems of data islands and privacy protection in traditional GAN training.

Description

Self-adaptive DPWGAN training method and system based on federal learning
Technical Field
The invention relates to the field of federal learning, in particular to a self-adaptive DPWGAN training method and system based on federal learning.
Background
With the advent of the big data age, machine learning has entered a phase of significant growth. To accommodate different scenarios, many machine learning models have been developed, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Generative Adversarial Networks (GANs). GANs and their many variants excel at producing high-quality "synthetic" samples that are very difficult to distinguish from real data. Noteworthy applications include generating images from text descriptions, converting still images to video, and enhancing image resolution.
Unfortunately, GAN training faces three major challenges. First, recent findings emphasize that even complex, opaque generative models such as GANs, whose outputs are produced in ways that are hard to explain, are susceptible to privacy disclosure. Second, while deep learning achieves significant results, acquiring large amounts of labeled data remains a necessary prerequisite for building robust classifiers. Moreover, real-world data typically exists in isolated silos: although a large amount of data is available across various users, participants, and data owners, privacy and security concerns prevent it from being shared. Finally, current DPGAN training protocols employ a uniform noise scale and lack adaptivity. Surveys indicate that different data owners have different privacy requirements, so traditional DPGAN cannot train adaptively.
Therefore, in order to solve the problems of data islanding and privacy protection in conventional GAN training, it is highly desirable to provide a privacy-preserving self-adaptive DPWGAN training method or system.
Disclosure of Invention
The invention aims to provide a self-adaptive DPWGAN training method and system based on federal learning, which can solve the problems of data island and privacy protection in the traditional GAN training.
In order to achieve the above object, the present invention provides the following solutions:
an adaptive DPWGAN training method based on federal learning, comprising:
broadcasting the discriminator, generator parameters and noise scale in the initialized WGAN to each client by the server;
the client performs DPWGAN training by using a local data set, perturbs the trained generator parameters by using a differential privacy scheme of self-adaptive noise, and uploads the perturbed generator parameters to the server;
the server performs a weighted average operation on the perturbed generator parameters uploaded by all clients to obtain an aggregated global model; generating the Inception Score and FID value of the image according to the aggregated global model;
and when the Inception Score and FID values of the generated image do not reach the expected targets, broadcasting the generator parameters of the current aggregated global model to the clients, and returning to the step in which the client performs DPWGAN training with the local data set, perturbs the trained generator parameters with the self-adaptive noise differential privacy scheme, and uploads the perturbed generator parameters to the server, until the Inception Score and FID values of the generated image reach the expected targets.
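Taken together, the steps above can be sketched as a minimal federated training loop. This is an illustrative toy in numpy, not the patent's implementation: local_dpwgan_train and its simple mean-seeking update are hypothetical stand-ins for real DPWGAN training, and the Gaussian perturbation stands in for the adaptive-noise differential privacy step.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_dpwgan_train(params, data, noise_scale):
    # Stand-in for local DPWGAN training: one gradient-like update
    # toward the local data mean, followed by Gaussian perturbation
    # of the generator parameters (the differential privacy step).
    updated = params - 0.1 * (params - data.mean())
    return updated + rng.normal(0.0, noise_scale, size=params.shape)

def weighted_average(client_params, client_sizes):
    # Server-side aggregation: weights proportional to data amounts.
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, client_params))

# Server initializes and broadcasts generator parameters and a noise scale.
global_params = np.zeros(4)
noise_scale = 0.05
client_data = [rng.normal(m, 1.0, size=100) for m in (0.5, 1.0, 1.5)]
client_sizes = [len(d) for d in client_data]

for round_ in range(20):  # iterate until IS/FID targets would be met
    uploads = [local_dpwgan_train(global_params.copy(), d, noise_scale)
               for d in client_data]
    global_params = weighted_average(uploads, client_sizes)
```

In a real deployment the stopping condition would be the Inception Score and FID of images generated by the global model, rather than a fixed round count.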
Optionally, the Wasserstein distance is introduced as the loss function in the client's DPWGAN training with the local data set.
Optionally, differential privacy is introduced in the client's DPWGAN training with the local data set.
Optionally, the adaptive noise differential privacy scheme is to track the privacy consumption of the user by using moment accounting, and adaptively adjust the noise scale in the training process.
Optionally, the updating mode of the moment accounting is:
where L is the number of sampled data points; q is the sampling rate, satisfying q = L/N; N is the amount of data owned by each client; σ is the noise amplitude; λ is the order in the moments accountant; α is the moments-accountant quantity in differential privacy, used to measure privacy consumption; and exp(·) is the exponential function with base e.
Optionally, the noise scale formula is:
where σ is the noise scale, ε is the degree of privacy disclosure, and L_min is the minimum of the clients' sampled data amounts.
An adaptive DPWGAN training system based on federal learning, comprising:
the initialization module is used for the server to broadcast the discriminator, generator parameters and noise scale of the initialized WGAN to each client;
the training module is used for carrying out DPWGAN training by the client by utilizing the local data set and disturbing the trained generator parameters by utilizing the differential privacy scheme of the self-adaptive noise; uploading the disturbed generator parameters to a server;
the parameter aggregation module is used for the server to perform a weighted average operation on the perturbed generator parameters uploaded by all the clients to obtain an aggregated global model; generating the Inception Score and FID value of the image according to the aggregated global model;
and the iteration module is used for broadcasting the generator parameters of the current aggregated global model to the clients when the Inception Score and FID value of the generated image do not reach the expected target, and returning to the training module until the Inception Score and FID value of the generated image reach the expected target.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides a self-adaptive DPWGAN training method and a self-adaptive DPWGAN training system based on federal learning, and provides a federal learning method with data sharing and privacy. The client performs self-adaptive DPWGAN training, uploads the parameter information of the local training, the server performs aggregation update on the received parameter information, and the federal learning process does not need to depend on a trusted central aggregator, so that the problem of data islanding can be solved, and training data of each participant can be protected. The invention makes attacks such as inference attack in federal learning difficult to carry out through the differential privacy scheme of the self-adaptive noise, further strengthens the privacy protection of data, and can dynamically change the noise added in the differential privacy mechanism so as to improve the accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a federally learning-based adaptive DPWGAN training method according to the present invention.
Fig. 2 is a schematic diagram of the overall architecture of a federally learning-based adaptive DPWGAN training method according to the present invention.
Fig. 3 is a block diagram of a GAN network.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a self-adaptive DPWGAN training method and system based on federal learning, which not only can solve the problem of data islanding, but also can protect training data of each participant.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 1 and fig. 2, the adaptive DPWGAN training method based on federal learning provided by the present invention includes:
s101, broadcasting a discriminator, generator parameters and noise scales in the initialized WGAN to each client by the server; as shown in fig. 2, the system comprises a central server and N clients, data are distributed in the N clients, and the clients and the server only transmit parameters and not transmit data, wherein the server adopts a global model, and the clients adopt a local model.
S102, a client performs DPWGAN training by using a local data set, perturbs the trained generator parameters by using a differential privacy scheme of self-adaptive noise, and uploads the perturbed generator parameters to a server;
the wasperstein distance is introduced as a loss function in the client's DPWGAN training with the local data set.
The loss function is:

min_G max_D E_{x∼P_r}[D(x)] − E_{z∼P_z}[D(G(z))]

where D denotes the discriminator, G the generator, E the expectation operator, P_r the real data distribution, P_z the random noise distribution, min_G minimization over the generator, max_D maximization over the discriminator, x the real data, and z the random noise.
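As an illustration, the Wasserstein objective above can be evaluated with a toy linear critic. This is purely a sketch under simplifying assumptions: the real method uses neural networks for D and G, and the scalar forms of critic and generator here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def critic(x, w):
    # Toy linear critic D(x) = w * x (a real WGAN critic is a
    # 1-Lipschitz neural network).
    return w * x

def generator(z, theta):
    # Toy generator G(z) = z + theta, shifting noise toward the data.
    return z + theta

real = rng.normal(2.0, 1.0, size=1000)   # samples from P_r
noise = rng.normal(0.0, 1.0, size=1000)  # samples from P_z

w, theta = 1.0, 0.0
# Critic objective: E[D(x)] - E[D(G(z))]; the critic maximizes it,
# the generator minimizes it.
critic_value = critic(real, w).mean() - critic(generator(noise, theta), w).mean()
gen_loss = -critic(generator(noise, theta), w).mean()
```

With theta = 0 the generated distribution is far from the real one, so the critic objective is large; moving theta toward the real mean shrinks it, which is exactly the signal the generator trains on.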
Differential privacy is introduced into the client's DPWGAN training on the local data set, making attacks such as inference attacks in federal learning difficult to carry out and further strengthening the privacy protection of the data.
By introducing the self-adaptive noise mechanism, the noise added in the differential privacy mechanism can be dynamically changed, so that the accuracy is improved.
The differential privacy scheme of the self-adaptive noise is to track the privacy consumption of the user by utilizing moment accounting, and the noise scale is self-adaptively adjusted in the training process.
The updating mode of the moment accounting is as follows:
where L is the number of sampled data points; q is the sampling rate, satisfying q = L/N; N is the amount of data owned by each client; σ is the noise amplitude; λ is the order in the moments accountant; α is the moments-accountant quantity in differential privacy, used to measure privacy consumption; and exp(·) is the exponential function with base e.
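For reference, a standard form of the moments accountant from the differential privacy literature uses the same quantities λ, α, and exp(·); the patent's exact rendered update is not reproduced in this text and may differ:

```latex
\alpha_{\mathcal{M}}(\lambda) \;=\; \max_{\mathrm{aux},\,d,\,d'} \; \log\, \mathbb{E}_{o \sim \mathcal{M}(\mathrm{aux},\,d)} \!\left[ \exp\bigl( \lambda\, c(o;\, \mathcal{M}, \mathrm{aux}, d, d') \bigr) \right],
\qquad
\varepsilon \;=\; \min_{\lambda} \frac{\sum_{t} \alpha_t(\lambda) + \log(1/\delta)}{\lambda}
```

where c(·) is the privacy loss of outcome o and d, d′ are neighboring datasets differing in one record; the second expression converts the accumulated moments over training steps into an (ε, δ) guarantee.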
The noise scale formula is:
where σ is the noise scale, ε is the degree of privacy disclosure, and L_min is the minimum of the clients' sampled data amounts.
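A hedged sketch of the adaptive-noise accounting follows. The per-step bound used here is the well-known approximation α(λ) ≈ q²λ(λ+1)/σ² for the subsampled Gaussian mechanism from the DP-SGD literature, and the adapt_noise policy is an illustrative assumption, not necessarily the patent's exact formula:

```python
import math

def moment_bound(q, sigma, lam):
    # Approximate per-step log moment for the subsampled Gaussian
    # mechanism (an assumed DP-SGD-style bound, not the patent's).
    return q * q * lam * (lam + 1) / (sigma * sigma)

def epsilon_after(steps, q, sigma, delta, lambdas=range(1, 33)):
    # epsilon = min over orders of (T * alpha(lambda) + log(1/delta)) / lambda
    return min((steps * moment_bound(q, sigma, lam) + math.log(1.0 / delta)) / lam
               for lam in lambdas)

def adapt_noise(sigma, eps_spent, eps_budget, factor=1.1):
    # Illustrative adaptive policy: raise the noise scale as the
    # consumed privacy budget approaches its limit.
    return sigma * factor if eps_spent > 0.8 * eps_budget else sigma
```

Tracking ε this way shows the expected behavior: consumption grows with the number of training steps and shrinks as the noise scale σ increases, which is what makes adaptive adjustment of σ possible.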
S103, the server performs a weighted average operation on the perturbed generator parameters uploaded by all clients to obtain an aggregated global model, and computes the Inception Score and FID value of images generated by the aggregated global model.
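Step S103's weighted average can be sketched as follows; the choice of weights proportional to each client's data amount is a FedAvg-style assumption, since the weighting is not specified here:

```python
import numpy as np

def aggregate(client_params, client_sizes):
    """Weighted average of perturbed generator parameters.

    Weights are proportional to each client's data amount
    (an illustrative FedAvg-style choice).
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_params)          # shape: (num_clients, ...)
    return np.tensordot(weights, stacked, axes=1)

params = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
global_params = aggregate(params, [100, 300])  # weights 0.25 and 0.75
```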
And S104, when the Inception Score and FID value of the generated image do not reach the expected target, the generator parameters of the current aggregated global model are broadcast to the clients and the process returns to S102, until the Inception Score and FID value of the generated image reach the expected target.
The GAN network, configured as shown in fig. 3, mainly comprises two parts: a generator and a discriminator. The working strategy of a GAN is to first use the generator network to produce generated data, and then use the discriminator network to compare the generated data with the real data, computing the corresponding loss functions to train both networks. The whole network adopts the idea of adversarial game: the generator aims to produce data as realistic as possible, while the discriminator aims to improve its ability to distinguish real data from generated data, i.e., to score real data high and generated data low.
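The generator/discriminator game described above can be illustrated with a minimal numpy example. This is a toy with scalar data and hand-picked discriminator parameters, purely to show the scoring behavior; real GANs use deep networks trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(2)

def discriminator(x, a, b):
    # Sigmoid score in (0, 1): high for data judged real, low for fake.
    return 1.0 / (1.0 + np.exp(-(a * x + b)))

def generator(z, theta):
    return z + theta  # shifts noise toward the real distribution

real = rng.normal(3.0, 0.5, size=500)
fake = generator(rng.normal(0.0, 0.5, size=500), theta=0.0)

a, b = 2.0, -3.0  # a discriminator that thresholds near x = 1.5
d_real = discriminator(real, a, b).mean()
d_fake = discriminator(fake, a, b).mean()

# Binary cross-entropy loss the discriminator minimizes:
d_loss = -(np.log(discriminator(real, a, b)).mean()
           + np.log(1.0 - discriminator(fake, a, b)).mean())
```

Here the discriminator scores the real samples high and the untrained generator's samples low; as the generator improves (theta approaching the real mean), the two scores converge and the discriminator's loss rises, which is the adversarial dynamic the text describes.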
Corresponding to the method provided by the above embodiment, the adaptive DPWGAN training system based on federal learning provided by the present invention includes:
The initialization module is used for the server to broadcast the discriminator, generator parameters and noise scale of the initialized WGAN to each client.
The training module is used for the client to carry out DPWGAN training by using the local data set, disturbing the trained generator parameters by using the differential privacy scheme of the self-adaptive noise, and uploading the disturbed generator parameters to the server.
The parameter aggregation module is used for the server to perform a weighted average operation on the perturbed generator parameters uploaded by all the clients to obtain an aggregated global model, and to generate the Inception Score and FID value of the image according to the aggregated global model.
And the iteration module is used for broadcasting the generator parameters of the current aggregated global model to the clients when the Inception Score and FID value of the generated image do not reach the expected target, and returning to the training module until the Inception Score and FID value of the generated image reach the expected target.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other. For the system disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (7)

1. An adaptive DPWGAN training method based on federal learning, comprising:
broadcasting the discriminator, generator parameters and noise scale in the initialized WGAN to each client by the server;
the client performs DPWGAN training by using a local data set, perturbs the trained generator parameters by using a differential privacy scheme of self-adaptive noise, and uploads the perturbed generator parameters to the server;
the server performs a weighted average operation on the perturbed generator parameters uploaded by all clients to obtain an aggregated global model; generating the Inception Score and FID value of the image according to the aggregated global model;
and when the Inception Score and FID values of the generated image do not reach the expected targets, broadcasting the generator parameters of the current aggregated global model to the clients, and returning to the step in which the client performs DPWGAN training with the local data set, perturbs the trained generator parameters with the self-adaptive noise differential privacy scheme, and uploads the perturbed generator parameters to the server, until the Inception Score and FID values of the generated image reach the expected targets.
2. The adaptive DPWGAN training method based on federal learning of claim 1, wherein the Wasserstein distance is introduced as the loss function in the DPWGAN training of the client using the local data set.
3. The adaptive DPWGAN training method based on federal learning of claim 1, wherein differential privacy is introduced in the client performing DPWGAN training using a local data set.
4. The adaptive DPWGAN training method as claimed in claim 1, wherein the differential privacy scheme of the adaptive noise is to track the privacy consumption of the user by using moment accounting, and the noise scale is adaptively adjusted during the training process.
5. The adaptive DPWGAN training method based on federal learning of claim 4, wherein the moment accounting is updated in the following manner:
where L is the number of sampled data points; q is the sampling rate, satisfying q = L/N; N is the amount of data owned by each client; σ is the noise amplitude; λ is the order in the moments accountant; α is the moments-accountant quantity in differential privacy, used to measure privacy consumption; and exp(·) is the exponential function with base e.
6. The adaptive DPWGAN training method based on federal learning of claim 5, wherein the noise scale formula is:
where σ is the noise scale, ε is the degree of privacy disclosure, and L_min is the minimum of the clients' sampled data amounts.
7. An adaptive DPWGAN training system based on federal learning, comprising:
the initialization module is used for the server to broadcast the discriminator, generator parameters and noise scale of the initialized WGAN to each client;
the training module is used for carrying out DPWGAN training by the client by utilizing the local data set, disturbing the trained generator parameters by utilizing the differential privacy scheme of the self-adaptive noise, and uploading the disturbed generator parameters to the server;
the parameter aggregation module is used for the server to perform a weighted average operation on the perturbed generator parameters uploaded by all the clients to obtain an aggregated global model; generating the Inception Score and FID value of the image according to the aggregated global model;
and the iteration module is used for broadcasting the generator parameters of the current aggregated global model to the clients when the Inception Score and FID value of the generated image do not reach the expected target, and returning to the training module until the Inception Score and FID value of the generated image reach the expected target.
CN202311785715.8A 2023-12-25 2023-12-25 Self-adaptive DPWGAN training method and system based on federal learning Pending CN117473558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311785715.8A CN117473558A (en) 2023-12-25 2023-12-25 Self-adaptive DPWGAN training method and system based on federal learning


Publications (1)

Publication Number Publication Date
CN117473558A 2024-01-30

Family

ID=89639868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311785715.8A Pending CN117473558A (en) 2023-12-25 2023-12-25 Self-adaptive DPWGAN training method and system based on federal learning

Country Status (1)

Country Link
CN (1) CN117473558A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329388A (en) * 2022-10-17 2022-11-11 南京信息工程大学 Privacy enhancement method for federally generated countermeasure network
CN116167084A (en) * 2023-02-24 2023-05-26 北京工业大学 Federal learning model training privacy protection method and system based on hybrid strategy
CN117056785A (en) * 2023-08-31 2023-11-14 西安电子科技大学 Federal learning classification model training method based on self-adaptive model disturbance


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIAQI HUANG: "DPWGAN: High-Quality Load Profiles Synthesis With Differential Privacy Guarantees", IEEE Transactions on Smart Grid, vol. 14, no. 4, 20 December 2022 (2022-12-20), pages 3283-3295 *
史丹青: "Introduction to Generative Adversarial Networks" (生成对抗网络入门指南), vol. 2, 30 June 2021, China Machine Press, page 203 *
徐宗本: "Frontiers of Data Intelligence Research" (数据智能研究前沿), 31 December 2019, Shanghai Jiao Tong University Press, page 60 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination