CN116822590A - Forgetting measurement model based on GAN and working method thereof

Forgetting measurement model based on GAN and working method thereof

Info

Publication number
CN116822590A
CN116822590A
Authority
CN
China
Prior art keywords
forgetting
model
layer
data
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310739381.4A
Other languages
Chinese (zh)
Inventor
左知微
唐卓
李肯立
肖雄
向婷
刘梦涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202310739381.4A
Publication of CN116822590A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0475 Generative networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention discloses a GAN-based forgetting measurement model comprising a self-encoder, an original model obtained by training ResNet18 on a training set, and a forgetting model obtained by applying a forgetting strategy to the original model. The self-encoder comprises an encoder and a decoder. The first layer of the encoder is a two-dimensional convolutional layer with 3 input channels and 64 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1; the activation function is LeakyReLU, which outputs 0.2 times the input value when the input value is less than 0. The second layer is a two-dimensional convolutional layer with 64 input channels and 128 output channels; the convolution kernel size is 4x4 and the stride is 2. The invention solves two technical problems of existing forgetting measurement models: accuracy and recall cannot accurately measure whether the model's processing capacity for specific data has been completely eliminated after that data is forgotten, and existing indexes cannot accurately measure the change in the model's processing capacity on the original data.

Description

Forgetting measurement model based on GAN and working method thereof
Technical Field
The invention belongs to the technical field of machine learning, and in particular relates to a forgetting measurement model based on a generative adversarial network (Generative Adversarial Network, GAN) and a working method thereof.
Background
Users may want to revoke certain information they have published, but a model trained on such data does not automatically eliminate the influence of the revoked data, and therefore does not protect user privacy well. Simply deleting specific data from the training dataset does not achieve "machine forgetting" in the true sense, because the data has already affected the model parameters during learning, and those parameter changes carry the information of the specific data. Machine forgetting solves this problem by eliminating the influence of deleted data on the trained model; it has broad application prospects and has become an emerging topic in machine learning in recent years. Machine forgetting must consider not only how to realize forgetting but also how to measure the forgetting effect of a forgetting strategy.
Measurement of the machine forgetting effect mainly focuses on two aspects: on one hand, the model's performance on the data to be deleted, i.e., the forgetting effect; on the other hand, whether the model's processing capacity on the remaining data is preserved.
However, existing forgetting metrics generally suffer from some non-negligible drawbacks:
First, insufficient measurement depth: accuracy and recall cannot accurately measure whether the model's processing capacity for specific data has been completely eliminated after that data is forgotten, and existing indexes cannot accurately measure the change in the model's processing capacity on the original data.
Second, the balance between generalization and the forgetting effect: even if the model has completely eliminated the influence of the data to be forgotten, its generalization ability may still yield good accuracy, recall, and other performance on that data, making the forgetting effect difficult to measure accurately and intuitively.
Disclosure of Invention
Aiming at the above defects or improvement demands of the prior art, the invention provides a GAN-based forgetting measurement model and a working method thereof, which aim to solve the following technical problems: the accuracy and recall used by existing forgetting measurement models cannot accurately determine whether the model's processing capacity for specific data has been completely eliminated after that data is forgotten; existing indexes cannot accurately measure the change in the model's processing capacity on the original data; and because the model has a certain generalization ability, it may still achieve good accuracy, recall, and other performance on the data to be forgotten, so the forgetting effect is difficult to measure accurately and intuitively.
To achieve the above object, according to one aspect of the present invention, there is provided a GAN-based forgetting measurement model comprising three sequentially connected parts: a self-encoder, an original model obtained by training ResNet18 on a training set, and a forgetting model obtained by applying a forgetting strategy to the original model. The self-encoder comprises an encoder and a decoder, and the encoder has the following structure:
The first layer is a two-dimensional convolutional layer with 3 input channels (corresponding to the RGB color channels) and 64 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1. The activation function is LeakyReLU: when the input value is less than 0, the output is 0.2 times the input value.
The second layer is a two-dimensional convolutional layer with 64 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The third layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a LeakyReLU activation; it normalizes the 128 feature maps.
The fourth layer is a two-dimensional convolutional layer with 128 input channels and 256 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fifth layer is a two-dimensional batch normalization layer with 256 input and output channels, followed by a LeakyReLU activation; it normalizes the 256 feature maps.
The specific structure of the decoder is as follows:
The first layer is a two-dimensional transposed convolutional layer with 256 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1; it increases the size of the feature map.
The second layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a ReLU activation; it normalizes the 128 feature maps.
The third layer is a two-dimensional transposed convolutional layer with 128 input channels and 64 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fourth layer is a two-dimensional batch normalization layer with 64 input and output channels, followed by a ReLU activation; it normalizes the 64 feature maps.
The fifth layer is a two-dimensional convolutional layer with 64 input channels and 3 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1.
According to another aspect of the present invention, there is provided a working method of the GAN-based forgetting measurement model, comprising the following steps:
(1) Obtain a data set and divide it into a training set and a test set; train a ResNet18 model with the training set to obtain the original model; and divide the training set into a remaining data set and a forgetting data set in the ratio 8:2.
(2) Slightly perturb the parameters of the original model obtained in step (1) using the Random-K, Top-K, EU-K, and CF-K forgetting strategies in turn to obtain a forgetting model, and train the forgetting model with the remaining data set obtained in step (1) to obtain a trained forgetting model.
(3) Combine the original model obtained in step (1) and the forgetting model trained in step (2) into a joint discriminator, and train the generator of the GAN-based forgetting measurement model (hereinafter referred to as SPD-GAN) against the joint discriminator to obtain the trained generator;
(4) Input the forgetting data set obtained in step (1) into the generator obtained in step (3) to generate noise data for the forgetting data; add the generated noise to the forgetting data to obtain the corresponding anti-forgetting samples; and input the anti-forgetting samples into the forgetting model obtained in step (2) to obtain the difference between the forgetting model's performance on the anti-forgetting samples and its performance on the forgetting data set, thereby measuring the machine forgetting effect.
Preferably, step (3) comprises the sub-steps of:
(3-1) Input the training set obtained in step (1) into the generator G(·) of the GAN-based forgetting measurement model for encoding and decoding, thereby obtaining noise data G(x) of the same size as the corresponding training data.
(3-2) Add the noise data G(x) obtained in step (3-1) to the training data to obtain an adversarial sample x′;
specifically, the following formula is adopted in the present step:
x′=G(x)+x
(3-3) Input the adversarial sample x′ obtained in step (3-2) into the joint discriminator formed by the original model D_s obtained in step (1) and the forgetting model D_U obtained in step (2), and iteratively train the generator through the joint discriminator to obtain the final generator.
Preferably, the encoding and decoding processes are expressed by the following formulas:
Encoder:
h = f(Wx + b)
Decoder:
r = g(W′h + b′)
where x is the input data, h is the output of the encoder, r is the output of the decoder, W and W′ are the weight matrices, b and b′ are the bias vectors, and f and g are the activation functions.
Preferably, in the encoder of the generator, the activation function is the LeakyReLU function:
f(x) = max(αx, x)
where α = 0.2.
Preferably, in the decoder of the generator, the activation function is the ReLU function:
g(x) = max(0, x)
In the decoder of the generator, the last layer constrains the output values to the range [-1, 1] using the Tanh function:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
where e is the base of the natural logarithm.
Preferably, step (3-3) is specifically:
First, the loss loss_source of the adversarial sample x′ on the original model is obtained, and loss_source is minimized so that the adversarial sample x′ stays close to the data x in the training set.
Then, the loss loss_unlearn of the adversarial sample x′ on the forgetting model is obtained, and loss_unlearn is maximized so that the forgetting model classifies the adversarial sample x′ as inconsistently as possible with the data x in the training set, giving the loss of the GAN-based forgetting measurement model:
L = loss_source - λ·loss_unlearn
where λ is a coefficient weighting the expected risk on the forgetting model; it ranges from 0 to 1 and is preferably 0.005.
Finally, the above procedure is repeated 100 times to obtain the final generator.
In general, compared with the prior art, the above technical solutions conceived by the present invention achieve the following beneficial effects:
(1) Because the original model and the forgetting model are adopted together as a joint discriminator, the invention can guarantee the performance of the anti-forgetting data set on the original model; combined with measurement indexes such as accuracy, it reflects the model's processing capacity on different data sets at a finer granularity and accurately tracks and measures the performance change of the model during data forgetting, thereby solving the technical problem of insufficient measurement depth in existing models;
(2) The GAN-based forgetting measurement method of the invention breaks the independent-and-identically-distributed characteristic between the forgetting data set and the remaining data set, making the forgetting effect of the forgetting model more intuitive and solving the technical problem that the forgetting effect is difficult to measure accurately and intuitively in existing models.
Drawings
FIG. 1 is a schematic diagram of the method of operation of the GAN-based forgetting measure model of the present invention;
FIG. 2 is a network structure diagram of the forgetting measure model based on GAN of the present invention;
FIG. 3 shows the effect of the GAN-based forgetting measurement model of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and embodiments, in order to make the objects, technical solutions, and advantages of the present invention clearer. It should be understood that the specific embodiments described here are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not conflict.
The basic idea of the invention is twofold. On one hand, noise is introduced into the K parameters of the screened model that are most sensitive to disturbance, forcing the model to forget information; these methods constitute the forgetting strategies used by the invention. On the other hand, a generator is trained to add noise to the forgetting data such that the accuracy of the original model on the perturbed forgetting data remains comparable to its accuracy on the original forgetting data, while the accuracy of the forgetting model on the perturbed forgetting data drops sharply, so the effectiveness of model forgetting is displayed more intuitively.
In the experiments of the invention, a ResNet18 model is tested on the image dataset CIFAR10 with several forgetting strategies: K = 10% is set for Random-K; K = 25 is set for Top-K, i.e., the 25 most sensitive parameters of the model are slightly perturbed and the remaining data set is then used for training to complete the model's forgetting process; and for EU-K and CF-K, the final K layers of the model are perturbed, with K set to 5 and 10 respectively. The GAN-based forgetting measurement model provided by the invention can amplify the drop in accuracy of the perturbed samples on the forgetting model: while the accuracy of the adversarial samples on the original model exceeds 95%, their accuracy on the forgetting ResNet18 model is only about 10%.
As shown in FIG. 2, the present invention provides a GAN-based forgetting measurement model comprising three sequentially connected parts: a self-encoder, an original model obtained by training ResNet18 on a training set, and a forgetting model obtained by applying a forgetting strategy to the original model.
Specifically, the self-encoder includes an encoder and a decoder, and the encoder has the following structure:
The first layer is a two-dimensional convolutional layer with 3 input channels (corresponding to the RGB color channels) and 64 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1. The activation function is LeakyReLU: when the input value is less than 0, the output is 0.2 times the input value.
The second layer is a two-dimensional convolutional layer with 64 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The third layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a LeakyReLU activation; it normalizes the 128 feature maps.
The fourth layer is a two-dimensional convolutional layer with 128 input channels and 256 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fifth layer is a two-dimensional batch normalization layer with 256 input and output channels, followed by a LeakyReLU activation; it normalizes the 256 feature maps.
The specific structure of the decoder is as follows:
The first layer is a two-dimensional transposed convolutional layer with 256 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1; it increases the size of the feature map.
The second layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a ReLU activation; it normalizes the 128 feature maps.
The third layer is a two-dimensional transposed convolutional layer with 128 input channels and 64 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fourth layer is a two-dimensional batch normalization layer with 64 input and output channels, followed by a ReLU activation; it normalizes the 64 feature maps.
The fifth layer is a two-dimensional convolutional layer with 64 input channels and 3 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1. The output of this layer is the final output of the decoder; its channel number is set to 3, corresponding to the RGB color channels, and its activation function is Tanh, which constrains the output values to the range [-1, 1] to match the original input picture.
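For concreteness, the encoder-decoder structure described above can be written as the following minimal PyTorch sketch (the class name Generator and the layer grouping are illustrative, not mandated by the patent):

    import torch
    import torch.nn as nn

    class Generator(nn.Module):
        # Self-encoder used as the noise generator; layer sizes follow the text above.
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),     # encoder layer 1
                nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),   # encoder layer 2 (downsample)
                nn.BatchNorm2d(128),                                      # encoder layer 3
                nn.LeakyReLU(0.2),
                nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),  # encoder layer 4 (downsample)
                nn.BatchNorm2d(256),                                      # encoder layer 5
                nn.LeakyReLU(0.2),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # decoder layer 1 (upsample)
                nn.BatchNorm2d(128),                                               # decoder layer 2
                nn.ReLU(),
                nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # decoder layer 3 (upsample)
                nn.BatchNorm2d(64),                                                # decoder layer 4
                nn.ReLU(),
                nn.Conv2d(64, 3, kernel_size=3, stride=1, padding=1),              # decoder layer 5 (back to RGB)
                nn.Tanh(),                                                         # output constrained to [-1, 1]
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))  # noise G(x) with the same shape as x

For a 3x32x32 CIFAR10 image, the two stride-2 convolutions reduce the feature map to 8x8 and the two transposed convolutions restore it to 32x32, so G(x) can be added to x element-wise.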
As shown in FIG. 1, the invention provides a working method of the GAN-based forgetting measurement model, comprising the following steps:
(1) Obtain a data set and divide it into a training set and a test set; train a ResNet18 model with the training set to obtain the original model; and divide the training set into a remaining data set and a forgetting data set in the ratio 8:2.
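A minimal sketch of step (1), assuming CIFAR10 as in the experiments below; the helper train_model is an assumed placeholder for a standard supervised training loop, not part of the patent:

    import torch
    from torch.utils.data import random_split
    from torchvision import datasets, transforms, models

    transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    train_set = datasets.CIFAR10("data", train=True, download=True, transform=transform)
    test_set = datasets.CIFAR10("data", train=False, download=True, transform=transform)

    # Split the training set 8:2 into a remaining data set and a forgetting data set
    n_forget = len(train_set) // 5
    remain_set, forget_set = random_split(train_set, [len(train_set) - n_forget, n_forget])

    # Train ResNet18 on the full training set to obtain the original model D_s
    original_model = models.resnet18(num_classes=10)
    # train_model(original_model, train_set)  # assumed helper, standard supervised training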
(2) Slightly perturb the parameters of the original model obtained in step (1) using the Random-K, Top-K, EU-K, and CF-K forgetting strategies in turn to obtain a forgetting model, and train the forgetting model with the remaining data set obtained in step (1) to obtain a trained forgetting model.
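The patent names the strategies but not their internals. As one hedged illustration, a Top-K perturbation (perturb the k parameters most sensitive to the forgetting data, then fine-tune on the remaining set) might look as follows; the gradient-magnitude sensitivity measure and the noise scale are assumptions:

    import copy
    import torch
    import torch.nn.functional as F

    def top_k_forget(original_model, forget_loader, k=25, noise_std=0.01):
        # Sketch of a Top-K forgetting strategy (sensitivity measure is an assumption).
        model = copy.deepcopy(original_model)
        model.zero_grad()
        for x, y in forget_loader:  # accumulate gradients on the forgetting data
            F.cross_entropy(model(x), y).backward()
        grads = torch.cat([p.grad.abs().flatten() for p in model.parameters()])
        threshold = grads.topk(k).values.min()  # k-th largest sensitivity
        with torch.no_grad():
            for p in model.parameters():
                mask = (p.grad.abs() >= threshold).float()
                p.add_(noise_std * torch.randn_like(p) * mask)  # perturb only the top-k entries
        return model  # then fine-tune on the remaining data set to obtain the trained forgetting model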
(3) Combine the original model obtained in step (1) and the forgetting model trained in step (2) into a joint discriminator, and train the generator of the GAN-based forgetting measurement model (SPD-GAN) against the joint discriminator to obtain the trained generator;
the method specifically comprises the following steps:
(3-1) Input the training set obtained in step (1) into the generator G(·) of SPD-GAN for encoding and decoding, thereby obtaining noise data G(x) of the same size as the corresponding training data.
Specifically, the encoding and decoding processes are expressed by the following formulas:
Encoder:
h = f(Wx + b)
Decoder:
r = g(W′h + b′)
where x is the input data, h is the output of the encoder, r is the output of the decoder, W and W′ are the weight matrices, b and b′ are the bias vectors, and f and g are the activation functions.
Specifically, in the encoder of the generator, the activation function is the LeakyReLU function:
f(x) = max(αx, x)
where α = 0.2.
In the decoder of the generator, the activation function is the ReLU function:
g(x) = max(0, x)
In the decoder of the generator, the last layer constrains the output values to the range [-1, 1] using the Tanh function:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
where e is the base of the natural logarithm. Through the above processing, a noise matrix that has the same size as the input but very small magnitude is obtained.
(3-2) Add the noise data G(x) obtained in step (3-1) to the training data to obtain an adversarial sample x′;
specifically, the following formula is adopted in the present step:
x′=G(x)+x
The advantage of this step is that the GAN is used to generate noise that breaks the independent-and-identically-distributed characteristic of the data set, thereby rendering the model's generalization ineffective on the adversarial samples.
(3-3) Input the adversarial sample x′ obtained in step (3-2) into the joint discriminator formed by the original model D_s obtained in step (1) and the forgetting model D_U obtained in step (2), and iteratively train the generator through the joint discriminator to obtain the final generator.
The method specifically comprises the following steps:
First, the loss loss_source of the adversarial sample x′ on the original model is obtained, and loss_source is minimized so that the adversarial sample x′ stays close to the data x in the training set.
Then, the loss loss_unlearn of the adversarial sample x′ on the forgetting model is obtained, and loss_unlearn is maximized so that the forgetting model classifies the adversarial sample x′ as inconsistently as possible with the data x in the training set. The loss of SPD-GAN is then:
L = loss_source - λ·loss_unlearn
where λ is a coefficient weighting the expected risk on the forgetting model; it ranges from 0 to 1 and is preferably 0.005.
Finally, the above procedure is repeated 100 times to obtain the final generator.
The advantage of this step is that the original model serves as one discriminator to control the difference between the adversarial samples and the training set, while the forgetting model serves as the other discriminator to change the independent-and-identically-distributed characteristic of the data as much as possible, so that the performance of the adversarial samples on the forgetting model is reduced and the forgetting effect is shown more intuitively.
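Putting sub-steps (3-1) to (3-3) together, a condensed training-loop sketch could read as follows; the use of cross-entropy for both loss terms is an assumption consistent with the description, and both discriminators are kept frozen:

    import torch
    import torch.nn.functional as F

    def train_generator(G, D_s, D_u, train_loader, lam=0.005, epochs=100, lr=1e-4):
        # Train the generator against the joint discriminator (D_s, D_u):
        # minimizing L = loss_source - lambda * loss_unlearn maximizes loss_unlearn.
        opt = torch.optim.Adam(G.parameters(), lr=lr)
        D_s.eval(); D_u.eval()  # original and forgetting models are fixed
        for _ in range(epochs):  # "repeated 100 times"
            for x, y in train_loader:
                x_adv = G(x) + x  # adversarial sample x' = G(x) + x
                loss_source = F.cross_entropy(D_s(x_adv), y)   # keep x' consistent for the original model
                loss_unlearn = F.cross_entropy(D_u(x_adv), y)  # push x' inconsistent for the forgetting model
                loss = loss_source - lam * loss_unlearn
                opt.zero_grad(); loss.backward(); opt.step()
        return G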
(4) Input the forgetting data set obtained in step (1) into the generator obtained in step (3) to generate noise data for the forgetting data; add the generated noise to the forgetting data to obtain the corresponding anti-forgetting samples; and input the anti-forgetting samples into the forgetting model obtained in step (2) to obtain the difference between the forgetting model's performance on the anti-forgetting samples and its performance on the forgetting data set, thereby measuring the machine forgetting effect.
The advantage of this step is that SPD-GAN perturbs the distribution of the forgetting data set, breaking the independent-and-identically-distributed characteristic between the forgetting data set and the remaining data set; because the joint discriminator adds noise in a targeted manner, the performance of the perturbed forgetting data changes little on the original model, which in turn intuitively represents the forgetting effectiveness of the forgetting model.
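Step (4) then reduces to comparing accuracies on the perturbed forgetting data. A hedged sketch of that measurement (the function names are illustrative, not from the patent) is:

    import torch

    @torch.no_grad()
    def accuracy(model, batches):
        model.eval()
        correct = total = 0
        for x, y in batches:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
        return correct / total

    @torch.no_grad()
    def measure_forgetting(G, D_s, D_u, forget_loader):
        # Build anti-forgetting samples x' = G(x) + x and compare both models on them.
        adv = [(G(x) + x, y) for x, y in forget_loader]
        pos = accuracy(D_s, adv)  # PoS: should stay high (near 100% in FIG. 3)
        pou = accuracy(D_u, adv)  # PoU: drops to about 10% for an effective forgetting strategy
        return pos, pou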
Experimental results
The experimental environment of the invention: the algorithm is implemented with the PyTorch framework on an NVIDIA A100 GPU. The specific settings are as follows: the batch size is 128; K = 25 for Top-K; K = 10% for Random-K; K is set to 5 and 10 for EU-K and CF-K, respectively; the JS divergence coefficient is 0.1; and the forgetting model loss coefficient is 0.005.
To illustrate the effectiveness of the method and the retention of the remaining data set, 20% of the data in the CIFAR10 training set is randomly selected as forgetting data, with the rest forming the remaining data set. On this basis, the SPD-GAN model is used to measure the forgetting effectiveness of the forgetting models obtained under each forgetting strategy. The other measurement indexes are the forgetting data set accuracy D_UL Acc, the remaining data set accuracy D_RE Acc, the forgetting rate FR, the memory retention MMR, and the number of perturbed parameters. The evaluation results are shown in Table 1 below, where the arrow direction indicates whether a higher or lower index value is better:
TABLE 1
The table above shows the performance of the forgetting models obtained by training for 50 rounds on ResNet18. It can be seen that Random-K of the present invention is distinguished by the lowest forgetting data set accuracy (73.83%) and the highest forgetting rate (0.2617). However, Random-K compromises the generalization performance of the model due to its lower D_RE Acc (91.345). EU-K and CF-K perform well in memory retention. Notably, Top-K produced a 20% accuracy difference between the forgetting data set and the remaining set while perturbing only 25 model parameters, exceeding the other methods with the smallest perturbation rate. Top-K and Random-K both have higher forgetting rates, indicating that they achieve the forgetting effect faster.
To further examine whether the proposed SPD-GAN can more intuitively display the forgetting effect on the forgetting model when the independent-and-identically-distributed property of the forgetting data set is broken, ResNet18 is used as the model, and the accuracy PoU of the anti-forgetting data D_p on the forgetting model and the accuracy PoS of D_p on the original model are measured; the results are shown in FIG. 3.
When the SPD-GAN of the invention is applied to each forgetting strategy to generate adversarial samples and 10 rounds of training are performed, FIG. 3 shows that PoS (the accuracy of the perturbed forgetting data set on the original model) reaches nearly 100% under each forgetting strategy, while PoU (the accuracy of the perturbed forgetting data set on the forgetting model) is only about 10%, showing that once the independent-and-identically-distributed characteristic is broken, the forgetting effect of the forgetting model can be displayed more intuitively.
It will be readily appreciated by those skilled in the art that the foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention; any modifications, equivalents, improvements, or alternatives made within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (7)

1. A GAN-based forgetting measurement model comprising, connected in sequence, a self-encoder, an original model obtained by training ResNet18 on a training set, and a forgetting model obtained by applying a forgetting strategy to the original model, characterized in that:
the self-encoder comprises an encoder and a decoder, and the encoder has the following structure:
The first layer is a two-dimensional convolutional layer with 3 input channels (corresponding to the RGB color channels) and 64 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1. The activation function is LeakyReLU: when the input value is less than 0, the output is 0.2 times the input value.
The second layer is a two-dimensional convolutional layer with 64 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The third layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a LeakyReLU activation; it normalizes the 128 feature maps.
The fourth layer is a two-dimensional convolutional layer with 128 input channels and 256 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fifth layer is a two-dimensional batch normalization layer with 256 input and output channels, followed by a LeakyReLU activation; it normalizes the 256 feature maps.
The specific structure of the decoder is as follows:
The first layer is a two-dimensional transposed convolutional layer with 256 input channels and 128 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1; it increases the size of the feature map.
The second layer is a two-dimensional batch normalization layer with 128 input and output channels, followed by a ReLU activation; it normalizes the 128 feature maps.
The third layer is a two-dimensional transposed convolutional layer with 128 input channels and 64 output channels; the convolution kernel size is 4x4, the stride is 2, and the padding is 1.
The fourth layer is a two-dimensional batch normalization layer with 64 input and output channels, followed by a ReLU activation; it normalizes the 64 feature maps.
The fifth layer is a two-dimensional convolutional layer with 64 input channels and 3 output channels; the convolution kernel size is 3x3, the stride is 1, and the padding is 1.
2. A working method of the GAN-based forgetting measurement model according to claim 1, comprising the following steps:
(1) Obtain a data set and divide it into a training set and a test set; train a ResNet18 model with the training set to obtain the original model; and divide the training set into a remaining data set and a forgetting data set in the ratio 8:2.
(2) Slightly perturb the parameters of the original model obtained in step (1) using the Random-K, Top-K, EU-K, and CF-K forgetting strategies in turn to obtain a forgetting model, and train the forgetting model with the remaining data set obtained in step (1) to obtain a trained forgetting model.
(3) Combine the original model obtained in step (1) and the forgetting model trained in step (2) into a joint discriminator, and train the generator of the GAN-based forgetting measurement model against the joint discriminator to obtain the trained generator;
(4) Input the forgetting data set obtained in step (1) into the generator obtained in step (3) to generate noise data for the forgetting data; add the generated noise to the forgetting data to obtain the corresponding anti-forgetting samples; and input the anti-forgetting samples into the forgetting model obtained in step (2) to obtain the difference between the forgetting model's performance on the anti-forgetting samples and its performance on the forgetting data set, thereby measuring the machine forgetting effect.
3. The method of claim 2, wherein step (3) comprises the substeps of:
(3-1) Input the training set obtained in step (1) into the generator G(·) of the GAN-based forgetting measurement model for encoding and decoding, thereby obtaining noise data G(x) of the same size as the corresponding training data.
(3-2) Add the noise data G(x) obtained in step (3-1) to the training data to obtain an adversarial sample x′;
Specifically, the following formula is adopted in this step:
x′ = G(x) + x
(3-3) Input the adversarial sample x′ obtained in step (3-2) into the joint discriminator formed by the original model D_s obtained in step (1) and the forgetting model D_U obtained in step (2), and iteratively train the generator through the joint discriminator to obtain the final generator.
4. The working method of the GAN-based forgetting measurement model according to claim 3, wherein the encoding and decoding processes are expressed by the following formulas:
Encoder:
h = f(Wx + b)
Decoder:
r = g(W′h + b′)
where x is the input data, h is the output of the encoder, r is the output of the decoder, W and W′ are the weight matrices, b and b′ are the bias vectors, and f and g are the activation functions.
5. The method according to claim 3 or 4, wherein in the encoder of the generator, the activation function is the LeakyReLU function:
f(x) = max(αx, x)
where α = 0.2.
6. The method according to any one of claims 3 to 5, wherein,
in the decoder of the generator, the activation function is the ReLU function:
g(x) = max(0, x)
and the last layer of the decoder constrains the output values to the range [-1, 1] using the Tanh function:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
where e is the base of the natural logarithm.
7. The method according to claim 3, wherein step (3-3) is specifically:
First, the loss loss_source of the adversarial sample x′ on the original model is obtained, and loss_source is minimized so that the adversarial sample x′ stays close to the data x in the training set.
Then, the loss loss_unlearn of the adversarial sample x′ on the forgetting model is obtained, and loss_unlearn is maximized so that the forgetting model classifies the adversarial sample x′ as inconsistently as possible with the data x in the training set, giving the loss of the GAN-based forgetting measurement model:
L = loss_source - λ·loss_unlearn
where λ is a coefficient weighting the expected risk on the forgetting model; it ranges from 0 to 1 and is preferably 0.005.
Finally, the above procedure is repeated 100 times to obtain the final generator.
CN202310739381.4A 2023-06-21 2023-06-21 Forgetting measurement model based on GAN and working method thereof Pending CN116822590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310739381.4A CN116822590A (en) 2023-06-21 2023-06-21 Forgetting measurement model based on GAN and working method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310739381.4A CN116822590A (en) 2023-06-21 2023-06-21 Forgetting measurement model based on GAN and working method thereof

Publications (1)

Publication Number Publication Date
CN116822590A (en) 2023-09-29

Family

ID=88114009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310739381.4A Pending CN116822590A (en) 2023-06-21 2023-06-21 Forgetting measurement model based on GAN and working method thereof

Country Status (1)

Country Link
CN (1) CN116822590A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349899A (en) * 2023-12-06 2024-01-05 湖北省楚天云有限公司 Sensitive data processing method, system and storage medium based on forgetting model
CN117349899B (en) * 2023-12-06 2024-04-05 湖北省楚天云有限公司 Sensitive data processing method, system and storage medium based on forgetting model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination