Image Data Augmentation Network Training Method, Training Device, and Storage Medium
Technical Field
The present invention belongs to the technical field of image processing, and in particular relates to a training method for an image data augmentation network, a training device thereof, and a computer-readable storage medium.
Background Art
In deep learning, many neural networks have a large number of parameters and therefore require large amounts of training data to effectively prevent overfitting. A high-quality dataset should contain enough categories, exhibit a certain diversity, and fully express the characteristics of the data.
However, in many practical situations it is very difficult to obtain large amounts of high-quality data. Specifically: 1) the data available for training is scarce and hard to obtain, requiring substantial manual effort; 2) the data is imbalanced across categories; 3) the data contains sensitive or personally private information and cannot be used publicly. These limitations are particularly evident in the field of medical image processing. Methods commonly used in deep learning, such as fine-tuning, are of little help when training on small samples that lack diversity. To improve training accuracy and effectively prevent overfitting, data augmentation is currently the most widely used approach in deep learning. Traditional image data augmentation methods mainly include translation, rotation, flipping, scaling, cropping, and adding noise. These operations are simple, fast, and reproducible, but the images they produce are strongly correlated with the originals, that is, little new effective information is added, so for complex images they do not adequately solve the problems caused by small samples.
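As an illustration only (a minimal numpy sketch, not part of the disclosed method), the traditional augmentations listed above can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))  # toy 32x32 RGB image, CIFAR-10 sized

flipped = img[:, ::-1, :]                   # horizontal flip
rotated = np.rot90(img, k=1, axes=(0, 1))   # 90-degree rotation
shifted = np.roll(img, shift=4, axis=1)     # crude 4-pixel translation
cropped = img[4:28, 4:28, :]                # central crop
noisy = np.clip(img + rng.normal(0.0, 0.05, img.shape), 0.0, 1.0)  # additive Gaussian noise
```

Each output is a deterministic (or near-deterministic) transform of the same source image, which is exactly why such samples are strongly correlated with the original.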
Generative adversarial networks (GANs) have shown great potential for image synthesis in recent years.
The original GAN structure is based on the multilayer perceptron (MLP) and consists of two neural networks: a generator G and a discriminator D. The input z of the generator G is drawn from a known distribution p(z), usually chosen as a Gaussian (normal) distribution. The generator produces outputs x_g that follow a distribution p_g(x), with the goal of achieving p_g(x) = p_r(x), where p_r(x) is the distribution of the real samples x_r. The discriminator outputs the probability that a sample is real. That is, the generator with parameters θ_g outputs the generated sample x_g = G(z; θ_g), and the discriminator with parameters θ_d outputs y = D(x; θ_d). The generator G and the discriminator D optimize a loss function through adversarial training so that (G, D) reaches a Nash equilibrium.
The loss function of the GAN is:

min_G max_D V(D, G) = E_{x~p_r(x)}[log D(x)] + E_{z~p(z)}[log(1 − D(G(z)))]

where E denotes the mathematical expectation. Analysis of the loss function of the original GAN structure shows that when the discriminator has been optimized to:

D*(x) = p_r(x) / (p_r(x) + p_g(x))
the above loss function is equivalent to minimizing the Jensen-Shannon divergence (JSD) between the real data distribution and the generated data distribution:

min_G V(D*, G) = 2 · JSD(p_r ‖ p_g) − 2 log 2
However, when the overlap between the supports of the two distributions is negligible, the JSD is a constant, so the generator receives no useful gradient and training cannot proceed. In practice, the distribution produced by a randomly initialized generator rarely has a non-negligible overlap with the real distribution, which leads to the problems of vanishing modes or mode collapse.
To address this problem, the optimal transport (OT) method measures the distance between two distributions by finding the minimum cost of transporting one onto the other, and this distance is well defined whether or not the supports of the two distributions overlap. This theory provides a way to remedy the deficiency of the loss function of the original GAN structure.
From the OT perspective, a GAN can be regarded as realizing the OT mapping through the generator and determining the distance between the real data distribution and the generated data distribution through the discriminator. The distance between the two distributions can be defined as:

W(p_r, p_g) = inf_{π ∈ Π(p_r, p_g)} E_{(x_r, x_g)~π}[c(x_r, x_g)]
Among existing techniques, Wasserstein GAN (WGAN) made a breakthrough in using OT to improve GANs. WGAN chooses the cost c(x_r, x_g) in the above definition to be the Euclidean distance, so that the distance between the two distributions is defined as:

W(p_r, p_g) = inf_{π ∈ Π(p_r, p_g)} E_{(x_r, x_g)~π}[‖x_r − x_g‖]

that is, the Wasserstein distance. The input of the WGAN generator is a noise sample z that follows a normal distribution on [-1, 1]; new data is synthesized through the OT mapping and by optimizing the Wasserstein distance, and the images generated by WGAN can be used for data augmentation.
However, the Wasserstein distance used in WGAN, which applies OT theory to the GAN structure, is defined in terms of the Euclidean distance. The Euclidean distance is sensitive to scale and to outliers, and the model is therefore sensitive to the influence of noise.
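A toy numpy comparison (illustrative only, not from the disclosure) makes this sensitivity concrete: uniformly rescaling a pair of vectors, e.g. a global intensity change, scales their Euclidean distance proportionally but leaves their cosine distance unchanged:

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    # d(x, y) = 1 - <x, y> / (|x| |y|)
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2])

print(euclidean(x, y), euclidean(10 * x, 10 * y))              # grows 10x with scale
print(cosine_distance(x, y), cosine_distance(10 * x, 10 * y))  # unchanged
```

This scale invariance is the intuition behind replacing the Euclidean cost with a cosine cost.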
Summary of the Invention
(1) Technical Problem to Be Solved by the Invention
The technical problem solved by the present invention is how to improve the robustness of the model to the influence of noise while also solving the technical problem of unstable training in existing adversarial networks.
(2) Technical Solution Adopted by the Invention
A training method for an image data augmentation network, the training method comprising:
obtaining noise samples and real data samples to be augmented;
inputting the noise samples into an image data augmentation network to obtain generated data samples;
inputting the real data samples and the generated data samples into the image data augmentation network to obtain multiple sets of cosine distance values, the image data augmentation network computing a loss function from the multiple sets of cosine distance values; and
updating network parameters of the image data augmentation network according to the loss function.
Optionally, the image data augmentation network includes a generator and a discriminator, wherein:
the method of inputting the noise samples into the image data augmentation network to obtain the generated data samples is: inputting the noise samples into the generator, the generator outputting the generated data samples; and
the method of computing the multiple sets of cosine distance values from the real data samples and the generated data samples is: inputting the real data samples and the generated data samples into the discriminator, the discriminator outputting the multiple sets of cosine distance values, the multiple sets of cosine distance values including a first cosine distance value, a second cosine distance value, and a third cosine distance value.
Optionally, the real data samples include a first real sub-sample x_r and a second real sub-sample x_r′ that follow the same distribution, and the generated data samples include a first generated sub-sample x_g and a second generated sub-sample x_g′ that follow the same distribution. The first cosine distance value d(x_r, x_g), the second cosine distance value d(x_r, x_r′), and the third cosine distance value d(x_g, x_g′) are computed as follows (shown for d(x_r, x_g); the other two values are computed analogously):

d(x_r, x_g) = 1 − (x_r · x_g) / (‖x_r‖ ‖x_g‖)
Optionally, the method of computing the loss function from the multiple sets of cosine distance values is that the discriminator computes the loss function L according to the following formula:

L = 2 · E[d(x_r, x_g)] − E[d(x_r, x_r′)] − E[d(x_g, x_g′)]

where E denotes the mathematical expectation and L is the loss function.
Optionally, the method of updating the network parameters of the image data augmentation network according to the loss function is:
performing a backward pass on the image data augmentation network according to the loss function, and updating the network parameters of the discriminator N times according to the stochastic gradient descent method; and
performing a backward pass on the image data augmentation network according to the loss function, and updating the network parameters of the generator once according to the stochastic gradient descent method.
The present application also discloses a training device for an image data augmentation network, the training device comprising:
an acquisition module, configured to acquire noise samples and real data samples to be augmented;
a first input module, configured to input the noise samples into an image data augmentation network to obtain generated data samples;
a second input module, configured to input the real data samples and the generated data samples into the image data augmentation network to obtain multiple sets of cosine distance values, the image data augmentation network computing a loss function from the multiple sets of cosine distance values; and
an update module, configured to update network parameters of the image data augmentation network according to the loss function.
Optionally, the image data augmentation network includes a generator and a discriminator, wherein:
the first input module is configured to input the noise samples into the generator, the generator outputting the generated data samples; and
the second input module is configured to input the real data samples and the generated data samples into the discriminator, the discriminator outputting multiple sets of cosine distance values, the multiple sets of cosine distance values including a first cosine distance value, a second cosine distance value, and a third cosine distance value.
Optionally, when updating the network parameters of the image data augmentation network according to the loss function, the update module is specifically configured to:
perform a backward pass on the image data augmentation network using the loss function, and update the network parameters of the discriminator N times according to the stochastic gradient descent method; and
perform a backward pass on the image data augmentation network using the loss function, and update the network parameters of the generator once according to the stochastic gradient descent method.
The present application also discloses a computer-readable storage medium storing a training program for an image data augmentation network, the training program, when executed by a processor, implementing the training method for an image data augmentation network described above.
(3) Beneficial Effects
The present invention discloses a training method for an image data augmentation network which, compared with conventional training methods, has the following technical effects:
(1) On the basis of combining OT theory with GANs, the cosine distance is used to define the distance between the real data distribution and the generated data distribution, thereby improving the stability of the network structure and the quality of the generated data and reducing the influence of noise on the network. The method proposed in this embodiment can perform data augmentation on small-sample data, and the generated results have high diversity, a large IS score, and a small FID score, which solves the problem of the high correlation of augmented data in traditional data augmentation methods.
Brief Description of the Drawings
Fig. 1 is a flowchart of the training method for an image data augmentation network according to Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the training device for an image data augmentation network according to Embodiment 2 of the present invention;
Fig. 3 is a comparison of images generated by different network models according to embodiments of the present invention;
Fig. 4 is a schematic diagram of a computer device according to Embodiment 3 of the present invention.
Detailed Description
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.
Before describing the embodiments of the present application in detail, the inventive concept of the present application is briefly described. In the prior art, OT theory is applied to adversarial-network training using the Euclidean distance, which is sensitive to noise and outliers. The present application computes multiple sets of cosine distance values between the real data samples and the generated data samples, constructs a loss function from these values, and updates the network parameters of the image data augmentation network accordingly, which enhances robustness to noise and outliers.
The OT theory measures the distance between two distributions by finding the minimum cost of transporting one onto the other, whether or not the supports of the two distributions overlap, and thereby provides a way to remedy the deficiency of the loss function of the original GAN structure. OT is defined as finding the optimal coupling π between the two distributions p_g(x) and p_r(x) under a cost function c(x_r, x_g):

W(p_r, p_g) = inf_{π ∈ Π(p_r, p_g)} E_{(x_r, x_g)~π}[c(x_r, x_g)]

where Π(p_r, p_g) is the set of all joint distributions π(x_r, x_g) with marginals p_r and p_g. From the OT perspective, a GAN can be regarded as realizing the OT mapping through the generator and determining the distance between the real data distribution and the generated data distribution through the discriminator; the distance between the two distributions is defined as above.
Among existing techniques, Wasserstein GAN (WGAN) made a breakthrough in using OT to improve GANs. WGAN chooses the cost c(x_r, x_g) in the above definition to be the Euclidean distance, so that the distance between the two distributions is defined as:

W(p_r, p_g) = inf_{π ∈ Π(p_r, p_g)} E_{(x_r, x_g)~π}[‖x_r − x_g‖]

that is, the Wasserstein distance. The input of the WGAN generator is a noise sample z that follows a normal distribution on [-1, 1]; new data is synthesized through the OT mapping and by optimizing the Wasserstein distance, and the images generated by WGAN can be used for data augmentation.
Embodiment 1
Specifically, as shown in Fig. 1, the training method for an image data augmentation network of Embodiment 1 includes the following steps:
Step S10: obtain noise samples and real data samples to be augmented.
Step S20: input the noise samples into the image data augmentation network to obtain generated data samples.
Step S30: input the real data samples and the generated data samples into the image data augmentation network to obtain multiple sets of cosine distance values; the image data augmentation network computes a loss function from the multiple sets of cosine distance values.
Step S40: update the network parameters of the image data augmentation network according to the loss function.
Specifically, the image data augmentation network of this embodiment includes a generator G and a discriminator D, both of which are convolutional neural networks. In step S20, the noise samples are input into the generator G, and the generator G outputs generated data samples. In step S30, the real data samples and the generated data samples are input into the discriminator D, and the discriminator D outputs multiple sets of cosine distance values, including a first cosine distance value, a second cosine distance value, and a third cosine distance value.
Further, the real data samples include a first real sub-sample x_r and a second real sub-sample x_r′ that follow the same distribution, and the generated data samples include a first generated sub-sample x_g and a second generated sub-sample x_g′ that follow the same distribution. Specifically, the real data samples follow the same distribution, and the first real sub-sample x_r and the second real sub-sample x_r′ are obtained by random sampling. The input noise samples are fixed; after passing through the generator G, generated data samples following a certain distribution are produced, for example a normal distribution, and the first generated sub-sample x_g and the second generated sub-sample x_g′ are randomly sampled from this distribution.
The first cosine distance value d(x_r, x_g), the second cosine distance value d(x_r, x_r′), and the third cosine distance value d(x_g, x_g′) are computed as follows (shown for d(x_r, x_g); the other two values are computed analogously):

d(x_r, x_g) = 1 − (x_r · x_g) / (‖x_r‖ ‖x_g‖)
Further, the discriminator D obtains the loss function L according to the following formula:

L = 2 · E[d(x_r, x_g)] − E[d(x_r, x_r′)] − E[d(x_g, x_g′)]

where E denotes the mathematical expectation and L is the loss function.
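A minimal numpy sketch of this computation (illustrative only: the batch shapes are assumed, and the 2x weighting of the cross term is an assumed energy-distance-style combination of the three values, which may differ from the exact formula of the disclosure):

```python
import numpy as np

def cosine_distance(a, b):
    # row-wise cosine distance d(x, y) = 1 - <x, y> / (|x| |y|)
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return 1.0 - num / den

rng = np.random.default_rng(0)
x_r, x_r2 = rng.normal(size=(64, 128)), rng.normal(size=(64, 128))  # real sub-samples
x_g, x_g2 = rng.normal(size=(64, 128)), rng.normal(size=(64, 128))  # generated sub-samples

d_rg = cosine_distance(x_r, x_g).mean()   # first cosine distance value (batch expectation)
d_rr = cosine_distance(x_r, x_r2).mean()  # second cosine distance value
d_gg = cosine_distance(x_g, x_g2).mean()  # third cosine distance value
L = 2.0 * d_rg - d_rr - d_gg              # assumed combination of the three values
print(L)
```

In practice the distances would be computed on discriminator features rather than raw pixels; the batch mean plays the role of the expectation E.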
Further, a backward pass is performed on the image data augmentation network according to the loss function L, the network parameters of the discriminator are updated N times according to the stochastic gradient descent method, and the network parameters of the generator are updated once according to the stochastic gradient descent method. The above steps are repeated until the generator and the discriminator reach equilibrium, completing the training of the image data augmentation network.
Specifically, the input of the image data augmentation network is the small-sample data to be augmented, i.e., the real data. The system is also provided with a training step size α, a batch size N, and the number of discriminator parameter updates n_c performed before each generator parameter update. The initial parameters of the discriminator are ω_0 and the initial parameters of the generator are θ_0. The input of the generator is a noise sample z that follows a normal distribution on [-1, 1].

While the generator parameters θ have not converged, the distance between the real data distribution p_r(x) and the generated data distribution p_g(x) is computed using the loss function formula above. Before each update of the generator parameters, the discriminator parameters ω are updated n_c times using the stochastic gradient descent method:

ω ← ω + α · ∇_ω L

after which the generator parameters are updated once using the stochastic gradient descent method:

θ ← θ − α · ∇_θ L

The above training steps are repeated until the generator parameters θ converge.
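The alternating schedule above can be sketched end to end. The following toy numpy loop is illustrative only: it assumes a linear generator and a linear discriminator feature map, uses finite-difference gradients in place of backpropagation, and uses an assumed energy-distance-style combination of the three cosine distances as the loss. It shows the n_c discriminator updates per generator update:

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_distance(a, b):
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return 1.0 - num / den

def loss(theta, omega, z1, z2, xr1, xr2):
    # generator: x_g = z @ theta ; discriminator feature map: f(x) = x @ omega
    fg1, fg2 = z1 @ theta @ omega, z2 @ theta @ omega
    fr1, fr2 = xr1 @ omega, xr2 @ omega
    return (2.0 * cosine_distance(fr1, fg1).mean()
            - cosine_distance(fr1, fr2).mean()
            - cosine_distance(fg1, fg2).mean())

def num_grad(f, p, eps=1e-5):
    # finite-difference gradient, standing in for backprop
    g = np.zeros_like(p)
    it = np.nditer(p, flags=["multi_index"])
    for _ in it:
        i = it.multi_index
        p[i] += eps; hi = f(p)
        p[i] -= 2 * eps; lo = f(p)
        p[i] += eps
        g[i] = (hi - lo) / (2 * eps)
    return g

alpha, n_c, N, dim = 0.05, 5, 64, 2
theta = rng.normal(size=(dim, dim))           # generator parameters (theta_0)
omega = rng.normal(size=(dim, dim))           # discriminator parameters (omega_0)
real = rng.normal(loc=3.0, size=(1000, dim))  # stand-in "real" small-sample data

for step in range(20):
    for _ in range(n_c):  # n_c discriminator updates per generator update
        z1, z2 = rng.uniform(-1, 1, (N, dim)), rng.uniform(-1, 1, (N, dim))
        xr1, xr2 = real[rng.choice(1000, N)], real[rng.choice(1000, N)]
        omega += alpha * num_grad(lambda w: loss(theta, w, z1, z2, xr1, xr2), omega)
    z1, z2 = rng.uniform(-1, 1, (N, dim)), rng.uniform(-1, 1, (N, dim))
    xr1, xr2 = real[rng.choice(1000, N)], real[rng.choice(1000, N)]
    theta -= alpha * num_grad(lambda t: loss(t, omega, z1, z2, xr1, xr2), theta)
```

The discriminator ascends the estimated distance while the generator descends it, mirroring the two update rules given above; a real implementation would replace the finite differences with automatic differentiation over convolutional networks.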
Further, the Inception Score (IS) and the Fréchet inception distance (FID) are used as evaluation metrics for data augmentation. Their formulas are:

IS = exp( E_X [ KL( p(l|X) ‖ p(l) ) ] )

FID = ‖m_r − m_g‖² + Tr( C_r + C_g − 2 (C_r C_g)^{1/2} )

where p(l|X) is the conditional label distribution of a generated sample X, KL is the Kullback-Leibler divergence, N is the number of samples in a batch over which the expectation is estimated, and m, C, and Tr denote the mean, the covariance, and the trace, respectively, with subscripts r and g referring to the real and generated images. A larger IS and a smaller FID indicate better quality and diversity of the generated images.
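The IS can be illustrated with a small numpy sketch (illustrative only; in practice the conditional distributions p(l|X) come from a pretrained Inception classifier):

```python
import numpy as np

def inception_score(p_lx):
    # p_lx: (num_samples, num_classes) conditional label distributions p(l|X)
    p_l = p_lx.mean(axis=0, keepdims=True)  # marginal p(l)
    kl = np.sum(p_lx * (np.log(p_lx + 1e-12) - np.log(p_l + 1e-12)), axis=1)
    return float(np.exp(kl.mean()))         # exp of the mean KL(p(l|X) || p(l))

# confident, evenly spread predictions -> high IS; uniform predictions -> IS near 1
print(inception_score(np.tile(np.eye(10), (50, 1))))  # ~10: high quality and diversity
print(inception_score(np.full((500, 10), 0.1)))       # ~1: no usable information
```

The score is maximal when each sample is classified confidently (sharp p(l|X)) while the samples cover all classes evenly (uniform p(l)), which is why a larger IS indicates both quality and diversity.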
After training of the image data augmentation network is complete, noise samples are input into the generator, which outputs synthetic data; the synthetic data and the real data samples together serve as training samples for subsequent model training, thereby achieving data augmentation.
On the basis of combining OT theory with GANs, the training method for an image data augmentation network disclosed in this embodiment uses the cosine distance to define the distance between the real data distribution and the generated data distribution, thereby improving the stability of the network structure and the quality of the generated data and reducing the influence of noise on the network. The method proposed in this embodiment can perform data augmentation on small-sample data, and the generated results have high diversity, a large IS score, and a small FID score, which solves the problem of the high correlation of augmented data in traditional data augmentation methods. It is of reference value for deep learning in different fields and can be used for training on small-sample datasets in various domains.
Embodiment 2
As shown in Fig. 2, the training device for an image data augmentation network of Embodiment 2 includes an acquisition module 100, a first input module 200, a second input module 300, and an update module 400. The acquisition module 100 is configured to acquire noise samples and real data samples to be augmented; the first input module 200 is configured to input the noise samples into the image data augmentation network to obtain generated data samples; the second input module 300 is configured to input the real data samples and the generated data samples into the image data augmentation network to obtain multiple sets of cosine distance values, the image data augmentation network computing a loss function from the multiple sets of cosine distance values; and the update module 400 is configured to update the network parameters of the image data augmentation network according to the loss function.
Further, the image data augmentation network includes a generator G and a discriminator D. The first input module 200 is configured to input the noise samples into the generator G, the generator G outputting generated data samples; the second input module 300 is configured to input the real data samples and the generated data samples into the discriminator D, the discriminator D outputting multiple sets of cosine distance values, including a first cosine distance value, a second cosine distance value, and a third cosine distance value. For the specific process by which the discriminator D computes the multiple sets of cosine distance values and the loss function, refer to Embodiment 1; it is not repeated here.
Further, when updating the network parameters of the image data augmentation network according to the loss function, the update module 400 is specifically configured to: perform a backward pass on the image data augmentation network using the loss function, and update the network parameters of the discriminator N times according to the stochastic gradient descent method; and perform a backward pass on the image data augmentation network using the loss function, and update the network parameters of the generator once according to the stochastic gradient descent method. For the updating procedure of the update module 400, refer to Embodiment 1; it is not repeated here.
Further, to demonstrate more intuitively the advantages of the image data augmentation network obtained with the training method of this embodiment, the applicant carried out experimental verification.
Specifically, the CIFAR-10 dataset was used for the experiments and verification. CIFAR-10 contains 60,000 32×32 color images in 10 classes, with 6,000 images per class.
All experiments were carried out with the Chainer-GAN-lib library. To better show the superiority of the proposed data augmentation system, the following existing network models were selected for comparison: GAN-OTD (an OT-based improvement of the original MLP-based GAN) and WGAN-GP (WGAN enhanced with a gradient penalty; the loss function still uses the Euclidean distance of the WGAN structure). The network structure of this embodiment is CNN-GAN-OTD. The default parameters of Chainer-GAN-lib were used for all experiments: the batch size is 64 and the maximum number of training iterations is 100,000. 5,000 randomly sampled generated images were used to compute the IS score; 50,000 randomly sampled real images and 10,000 randomly sampled generated images were used to compute the FID score.
(1) Analysis of the generated images:
With identical training parameters, images were synthesized from the CIFAR-10 data using the different methods; the IS and FID results are shown in Table 1. The present method achieves the best IS and FID results among the listed methods, which verifies the superiority of the image data augmentation network trained with the training method of this embodiment in terms of the quality and diversity of the generated images.
Table 1. Comparison of the quality of images generated by different methods on the CIFAR-10 dataset
(2) Analysis of the influence of noise:
Gaussian noise with zero mean and successively larger standard deviations was added to the CIFAR-10 dataset, and with identical training parameters the different network models were used to synthesize images from the noise-corrupted CIFAR-10 data. The IS and FID results are shown in Table 2, and the generated images are compared in Fig. 3. The maximum standard deviation was empirically chosen as 20.
The present method achieves the best IS and FID results among the listed methods, which verifies its robustness to the influence of noise.
In Fig. 3, (a), (e), and (i) are the original images with added Gaussian noise of standard deviation 2, 5, and 20, respectively; (b), (c), and (d) are images synthesized from (a) by WGAN-GP, DRAGAN, and CNN-GAN-OTD, respectively; (f), (g), and (h) are images synthesized from (e) by WGAN-GP, DRAGAN, and CNN-GAN-OTD, respectively; and (j), (k), and (l) are images synthesized from (i) by WGAN-GP, DRAGAN, and CNN-GAN-OTD, respectively.
Table 2. Comparison of the quality of images generated by different methods on the noise-corrupted CIFAR-10 dataset
Embodiment 3 further discloses a computer-readable storage medium storing a training program for an image data augmentation network; when the training program is executed by a processor, the training method for an image data augmentation network described above is implemented.
Embodiment 4 further discloses a computer device. At the hardware level, as shown in Fig. 4, the terminal includes a processor 12, an internal bus 13, a network interface 14, and a computer-readable storage medium 11. The processor 12 reads the corresponding computer program from the computer-readable storage medium and runs it, forming a request processing apparatus at the logical level. Of course, in addition to software implementations, one or more embodiments of this specification do not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the following processing flow is not limited to logic units and may also be hardware or logic devices. The computer-readable storage medium 11 stores a training program for an image data augmentation network; when the training program is executed by a processor, the training method for an image data augmentation network described above is implemented.
Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer-readable storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media, or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The specific embodiments of the present invention have been described in detail above. Although some embodiments have been shown and described, those skilled in the art should understand that these embodiments may be modified and refined without departing from the principle and spirit of the present invention, whose scope is defined by the claims and their equivalents; such modifications and refinements also fall within the protection scope of the present invention.