CN115690487A - Small sample image generation method
Publication number: CN115690487A (application CN202211230704.9A); Authority: CN (China); Legal status: Pending
Abstract
The invention discloses a small sample image generation method for generating images in data-limited scenes. The method comprises the following steps: randomly sampling from a dynamic Gaussian mixture distribution to obtain a dynamic Gaussian mixture latent code; inputting the dynamic Gaussian mixture latent code into a generation network and enhancing the intermediate features of the generation network, which are obtained by the generation network mapping the latent code, through a hybrid attention mechanism, then inputting the enhanced intermediate features back into the generation network to obtain a generated image set; inputting the generated image set and the real image set into a discrimination network to obtain image discrimination results for both sets; and updating the generation network and the discrimination network according to the image discrimination results and the target optimization functions of the two networks, obtaining the updated generation network and the updated discrimination network.
Description
Technical Field
The invention belongs to the field of computer vision and mainly addresses the problem of image generation in small sample scenes; it is mainly applied in fields such as image editing, generation, expansion and enhancement.
Background
With the continuous development of deep learning, remarkable progress has been made in computer vision, with applications across many fields. Deep image generation models use a deep network to learn and understand the content and distribution of pictures, learning to generate pictures similar to real ones; they are a hot topic in computer vision. Deep image generation models underpin tasks such as image restoration, editing and super-resolution, and can be applied in fields such as film and television media and creative design. However, training a deep image generation model usually requires a large amount of data and computation, which greatly limits the application of generative models in fields with only a small number of pictures, such as medical images and celebrity paintings. Applying generative models to small sample scenes is therefore a direction of great application and research significance: the generated data can realize data expansion in small sample scenes and can assist problems such as small sample classification and segmentation.
When trained on only a small number of pictures, a generative model often overfits and simply memorizes the training data, and cannot produce realistic and diverse pictures. To improve the realism and diversity of images generated in small sample scenes, researchers have proposed various methods to alleviate model overfitting. A direct approach borrows the idea of transfer learning: assume there exists a source domain similar to the small sample data but with abundant data, pre-train on the source domain, and then transfer its knowledge to the small sample target domain to improve the diversity and realism of small sample generation. However, this type of approach has two problems: first, pre-training on the source domain still incurs large computation and acquisition costs; second, when there is a certain deviation between the source domain and the small sample target domain, performance on the target domain actually degrades.
Another type of small sample generation method realizes data expansion through data enhancement techniques such as flipping and translating the small sample data, increasing the training data available to the model. This type of method also presents two problems: first, flipping and translating the original pictures can change the distribution of the original data, misleading the generative model into producing implausible pictures; second, the augmentation essentially operates on the same batch of data and does not change its intrinsic structure, so the model is still prone to overfitting.
Disclosure of Invention
Different from existing small sample image generation methods, the method provided by the invention starts from the data prior and the essence of the data. Based on richer prior information, it can provide more editable attribute assumptions for the model: it designs a dynamic Gaussian mixture latent-space code as the input signal of the generative model, providing more diversified mixture-Gaussian latent codes. Meanwhile, to further ensure the diversity and realism of the generated pictures, the invention designs a hybrid attention enhancement module that enhances the content and layout of the intermediate features in the generation process, ensuring the plausibility and integrity of local content and global layout. By fusing the two modules, the invention constructs a small sample image generation method and obtains excellent results on small sample data from different fields, including cartoon styles, real photos and the like.
The invention aims to provide a small sample image generation method, which improves the fidelity and richness of a generated picture in a small sample scene.
The invention relates to a small sample image generation method that provides more variable and editable attributes for the generative model by introducing dynamic Gaussian mixture latent codes, addressing the insufficient diversity of existing small sample generation methods. Meanwhile, to further improve the realism of the generated pictures, the invention proposes a hybrid attention mechanism that enhances the global layout and local content of the intermediate features in the generation process, effectively retaining their key information. By fusing these methods, the diversity and realism of small sample image generation are effectively improved, and problems such as unstable training and overfitting in small sample scenes are alleviated.
For convenience in describing the present disclosure, definitions of some commonly used terms are first given.
Definition 1: Generative Adversarial Network (GAN): the generative adversarial network is the most common and widely applied deep generative model, and the generation system of the invention is built with it as the base network. A generative adversarial network usually consists of a generation network (Generator, G) and a discrimination network (Discriminator, D): the generation network G maps a latent code (Latent Code) sampled from a specific distribution into a generated picture, the generated picture and real pictures are fed into the discrimination network D together, and the discrimination network learns to distinguish generated pictures from real ones. The training objective function of the generative adversarial network is:
min_G max_D E_{x∼I_real}[log D(x)] + E_{z}[log(1 − D(G(z)))]
In the above formula, I_real represents the real data distribution, G(z) represents the generated data, and log(·) represents the logarithmic loss. D and G respectively denote the discrimination network and the generation network: the discrimination network maximizes the classification loss between generated and real data, while the generation network minimizes the discrimination network's classification loss on the generated data. The two play a game against each other and finally reach an equilibrium state, in which the generation network can produce sufficiently realistic pictures and the discrimination network can no longer tell whether a picture is real or generated.
Definition 2: Latent Code: the latent code is the input to the generation network G, typically a fixed-length vector z randomly sampled from a particular distribution (e.g., Gaussian, uniform). The generation network G learns to map the latent code z to a generated picture.
Definition 3: Attention Mechanism: inspired by the human visual attention mechanism, an attention mechanism learns which contents of an image deserve more attention and increases the weight of the corresponding parts.
Definition 4: Gaussian Mixture Distribution: a Gaussian mixture distribution is a distribution formed by combining N Gaussian distributions. To provide more editable and variable attributes for the generation network, the invention randomly samples the latent code from a Gaussian mixture distribution as the input of the generation network, rather than from a single Gaussian distribution as in the prior art:
p(z) = Σ_{i=1}^{N} w_i · N(z; μ_i, Σ_i),  with Σ_{i=1}^{N} w_i = 1
In the above formula, w_i is the weight of the i-th Gaussian distribution, (μ_i, Σ_i) are respectively the mean and variance of the i-th Gaussian distribution, and z represents a sample from the Gaussian mixture distribution.
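As an illustration of Definition 4 (not part of the patent — the function name is hypothetical, components are assumed diagonal, and NumPy stands in for whatever framework the implementation actually uses), mixture sampling picks a component per draw and then samples from that Gaussian:

```python
import numpy as np

def sample_gmm(weights, means, stds, n, rng=None):
    """Draw n latent vectors from a mixture of N diagonal Gaussians.

    weights: (N,) mixture weights w_i summing to 1
    means:   (N, d) component means mu_i
    stds:    (N, d) component standard deviations sigma_i
    """
    rng = np.random.default_rng(rng)
    # pick a component index per sample according to the mixture weights
    comp = rng.choice(len(weights), size=n, p=weights)
    eps = rng.standard_normal((n, means.shape[1]))
    return means[comp] + stds[comp] * eps

# three components in a 4-dimensional latent space (toy values)
w = np.array([0.5, 0.3, 0.2])
mu = np.zeros((3, 4)); mu[1] += 5.0; mu[2] -= 5.0
sigma = np.ones((3, 4))
z = sample_gmm(w, mu, sigma, n=1000, rng=0)
```

Sampling a component index first and then drawing from that Gaussian is exactly the two-stage generative story the mixture density above describes.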
Definition 5: Reparameterization Trick: because directly sampling the latent code from the N Gaussian distributions makes the neural network non-differentiable, the invention reparameterizes the Gaussian mixture distribution: it first samples δ from a standard Gaussian distribution and then obtains z by reparameterization:
z = μ_i + σ_i·δ
In the above formula, μ_i and σ_i are network-learnable parameters, and δ is randomly sampled from a Gaussian distribution with mean 0 and variance 1, i.e., δ ∼ N(0, 1).
In order to provide more variable and editable attributes for the generation network, the invention further introduces a dynamic regulation factor λ that dynamically regulates the Gaussian components of the latent code:
z = λμ_i + (1 − λ)σ_i·δ
In the above formula, λ is the dynamic regulation factor, μ_i and σ_i are network-learnable parameters, and δ is randomly sampled from a Gaussian distribution with mean 0 and variance 1, i.e., δ ∼ N(0, 1). z denotes the resulting dynamic Gaussian mixture latent code.
Definition 6: Sigmoid activation function: the activation function produces a new output through a nonlinear transformation of the model's input, formally defined as:
g(z) = 1 / (1 + e^{−z})
In the above formula, e denotes the natural base, z is the input to the activation function, and g(z) is the Sigmoid activation output.
Definition 7: Convolution: convolution is a technique for processing an image that exploits the spatial dependence among the pixels of the input image.
Definition 8: Pooling: pooling compresses and reduces the dimensionality of an input feature map according to a chosen rule; common rules include maximum pooling and average pooling.
The invention provides a small sample image generation method, characterized by comprising the following steps:
randomly sampling from a dynamic Gaussian mixture distribution to obtain a dynamic Gaussian mixture latent code, wherein the dynamic Gaussian mixture distribution is a Gaussian mixture distribution with a dynamic regulation factor introduced;
inputting the dynamic Gaussian mixture latent code into a generation network and enhancing the intermediate features of the generation network through a hybrid attention mechanism, wherein the intermediate features are obtained by the generation network mapping the dynamic Gaussian mixture latent code, and inputting the enhanced intermediate features back into the generation network to obtain a generated image set;
inputting the generated image set and the real image set into a discrimination network to obtain image discrimination results for the generated image set and the real image set;
and updating the generation network and the discrimination network according to the image discrimination results and the target optimization functions of the two networks, obtaining the updated generation network and the updated discrimination network.
The small sample image generation method helps solve the overfitting and model collapse caused by limited data in small sample scenes and improves the realism and diversity of the generated pictures.
The invention relates to a small sample image generation method, characterized in that the dynamic Gaussian mixture distribution conforms to the following correspondence:
z = λμ_i + (1 − λ)σ_i·δ
wherein z is the dynamic Gaussian mixture latent code, λ is a dynamic regulation factor that can regulate the Gaussian components of the latent code, μ_i and σ_i are network-learnable parameters, and δ is randomly sampled from a Gaussian distribution with mean 0 and variance 1, i.e., δ ∼ N(0, 1).
The small sample image generation method is characterized in that the hybrid attention mechanism comprises a spatial attention mechanism and a channel attention mechanism. The hybrid attention mechanism enhances the intermediate features of the generation network, which are obtained by the generation network mapping the dynamic Gaussian mixture latent code and contain the global layout and local content information of the generated picture; this enhancement helps improve the realism of the generated picture.
The small sample image generation method is characterized in that the spatial attention mechanism focuses on which spatial positions of a feature map are most important and enhances those positions. The spatial attention mechanism first aggregates channel information using pooling to obtain two 2D feature maps, F^s_avg and F^s_max, representing the maps obtained with average pooling and maximum pooling, respectively. Next, the two feature maps are concatenated and a convolution operation produces the spatial attention map. The spatial attention mechanism is formally described as:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])) = σ(f^{7×7}([F^s_avg; F^s_max]))
where σ denotes the Sigmoid activation function, AvgPool and MaxPool denote average pooling and maximum pooling respectively, and f^{7×7} denotes a convolution operation with a 7×7 kernel. F is the input feature map, and F^s_avg and F^s_max are the feature maps obtained after average pooling and maximum pooling. Spatial attention focuses on global layout information, and enhancing it benefits the overall plausibility and realism of the generated image.
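A toy NumPy version of the spatial attention map described above, under assumptions not stated in the patent: a single 7×7 kernel over the two pooled maps, zero padding, and no bias; the random kernel weights stand in for learned parameters:

```python
import numpy as np

def spatial_attention(F, kernel=None, rng=None):
    """M_s(F) = sigmoid(conv7x7([AvgPool_c(F); MaxPool_c(F)])) for F of shape (C, H, W).

    Pooling is along the channel axis; the convolution is applied with zero
    padding so the attention map matches the spatial size of F.
    """
    rng = np.random.default_rng(rng)
    avg = F.mean(axis=0)                # (H, W) channel-average map
    mx = F.max(axis=0)                  # (H, W) channel-max map
    stacked = np.stack([avg, mx])       # (2, H, W) concatenated pooled maps
    if kernel is None:                  # random stand-in for a learned 7x7 kernel
        kernel = rng.standard_normal((2, 7, 7)) * 0.1
    H, W = avg.shape
    pad = np.pad(stacked, ((0, 0), (3, 3), (3, 3)))
    out = np.empty((H, W))
    for y in range(H):                  # naive sliding-window convolution
        for x in range(W):
            out[y, x] = np.sum(pad[:, y:y+7, x:x+7] * kernel)
    return 1.0 / (1.0 + np.exp(-out))   # Sigmoid -> per-position weights in (0, 1)

F = np.random.default_rng(0).standard_normal((16, 8, 8))
M = spatial_attention(F, rng=0)
enhanced = F * M[None, :, :]            # broadcast the map over all channels
```

A real implementation would use a framework convolution rather than the explicit loop; the loop is kept here only to make the sliding-window arithmetic visible.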
The small sample image generation method is characterized in that the channel attention mechanism concerns which contents of a feature map are worth attending to. To compute the channel attention map, spatial information is squeezed using average pooling and maximum pooling to obtain two feature vectors, F^c_avg and F^c_max; a shared network — a multi-layer perceptron with one hidden layer — then produces the channel attention map M_c ∈ R^{C×1×1}. The channel attention mechanism is formally described as:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))
where σ denotes the Sigmoid activation function, W_0 and W_1 are shared parameters, AvgPool and MaxPool denote average pooling and maximum pooling respectively, and MLP is the multi-layer perceptron. F^c_avg and F^c_max are the feature vectors obtained with average pooling and maximum pooling. Channel attention focuses on local content information, and enhancing it benefits the local realism and detail fidelity of the generated image.
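A corresponding NumPy sketch of the channel attention map, assuming — as is common but not spelled out in the patent — a reduction ratio in the hidden layer and a ReLU activation; the weights are random stand-ins for learned parameters:

```python
import numpy as np

def channel_attention(F, W0, W1):
    """M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))), MLP weights shared."""
    avg = F.mean(axis=(1, 2))            # (C,) squeeze spatial dims by average pooling
    mx = F.max(axis=(1, 2))              # (C,) squeeze spatial dims by max pooling
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)   # one hidden layer with ReLU
    logits = mlp(avg) + mlp(mx)
    return 1.0 / (1.0 + np.exp(-logits))  # (C,) per-channel weights in (0, 1)

rng = np.random.default_rng(0)
C, r = 16, 4                              # channels and reduction ratio (assumed)
W0 = rng.standard_normal((C // r, C)) * 0.1   # shared hidden-layer weights
W1 = rng.standard_normal((C, C // r)) * 0.1   # shared output-layer weights
F = rng.standard_normal((C, 8, 8))
weights = channel_attention(F, W0, W1)
enhanced = F * weights[:, None, None]     # rescale each channel of F
```

Sharing W_0 and W_1 between the average-pooled and max-pooled branches is what the "shared parameters" in the formula refer to.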
The small sample image generation method is characterized in that the parameters of the generation network and the discrimination network are updated according to the image discrimination results and the target optimization functions of the two networks, as follows:
substituting the image discrimination results into the target optimization function of the discrimination network and updating the parameters of the discrimination network, wherein the target optimization function of the discrimination network conforms to the following correspondence:
L_D = −E_{x∼I_real}[min(0, −1 + D(x))] − E_{x∼G(z)}[min(0, −1 − D(x))] + L_recons
wherein E denotes expectation, I_real is the real image set, z is the dynamic Gaussian mixture latent code, G(z) is the generated image set, D(x) is the image discrimination result, min(·) denotes minimization, and L_recons is the reconstruction loss, which helps improve the feature-extraction capability of the discrimination network and thus its discrimination capability.
Substituting the image discrimination results into the target optimization function of the generation network and updating the parameters of the generation network, wherein the target optimization function of the generation network conforms to the following correspondence:
L_G = −E_{x∼G(z)}[D(x)]
where E denotes expectation, z is the dynamic Gaussian mixture latent code, x ∼ G(z) is the generated image set, and D(x) is the image discrimination result.
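If the objectives above are read as hinge-style losses — an assumption consistent with the min(·) terms, and ignoring the auxiliary reconstruction term L_recons — they can be sketched in NumPy over a batch of discriminator scores:

```python
import numpy as np

def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss (reconstruction term L_recons omitted)."""
    return (-np.minimum(0.0, -1.0 + d_real).mean()
            - np.minimum(0.0, -1.0 - d_fake).mean())

def g_hinge_loss(d_fake):
    """Generator loss: maximize the discriminator's score on generated samples."""
    return -d_fake.mean()

d_real = np.array([1.5, 0.2])   # discriminator outputs on real images (toy values)
d_fake = np.array([-1.2, 0.4])  # discriminator outputs on generated images
dl = d_hinge_loss(d_real, d_fake)
gl = g_hinge_loss(d_fake)
```

Note how the hinge clips the contribution of confidently classified samples: the real score 1.5 and the fake score −1.2 already satisfy their margins and add nothing to L_D.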
The small sample image generation method according to the present invention is characterized in that, after the parameters of the generation network and the discrimination network are updated according to their target optimization functions, the method further includes: obtaining the updated generation network and the updated discrimination network, wherein the updated generation network is used for generating image data for at least one of data enhancement, classification and segmentation.
The small sample image generation method is characterized in that it is implemented by a system comprising a generator and a discriminator, wherein the generator is coupled with the generation network, the discriminator is coupled with the discrimination network, and the system executes the programs of the generator and the discriminator.
The invention has the following beneficial effects: the small sample image generation method takes the dynamic Gaussian mixture latent code as input, providing more editable and variable attributes for the generation network and improving the diversity of generated samples; the hybrid attention mechanism enhances the local content and global layout of the intermediate features during generation, improving the realism of generated samples. Fusing the two reduces the overfitting of the generative model, so that sufficiently realistic and diverse pictures can be generated in small sample scenes. The method is not limited to a specific generative model and can be adaptively embedded into other models, improving the diversity and realism of generated samples and avoiding problems such as mode collapse. The generated pictures can also be used in fields such as image classification and segmentation.
Drawings
Fig. 1 is a flowchart of an overall training process provided by an embodiment of the present invention.
Fig. 2 is an overall framework diagram provided by the embodiment of the present invention.
Fig. 3 is a schematic diagram of the hybrid attention mechanism provided by an embodiment of the present invention.
Fig. 4 is a diagram of the generation effect of the data set of the art painting landscape and the real animal photo provided by the embodiment of the invention.
Fig. 5 is a diagram of the generation effect of the photo data sets of the cartoon face and the real face according to the embodiment of the present invention.
Detailed Description
The following description will specifically describe the embodiments of the present invention with reference to the accompanying drawings, and the system implementation details will be described in detail for the purpose of specifying the implementation process of the present invention. However, these implementation details do not limit the invention to the described embodiments.
The invention relates to a small sample image generation method that uses dynamic Gaussian mixture latent codes as the input of the generation network, providing richer prior information and more editable attribute information. The generation network maps the latent codes into generated pictures; during generation, the intermediate feature representations contain the local content and global layout information of the final picture, and a hybrid attention mechanism enhances this content and layout information before the final picture is produced. The generated pictures and real pictures are input into the discrimination model, which must judge whether a given picture is generated or real. The generation network and the discrimination network are updated through the discrimination loss: the generation network learns to generate pictures as close to the real distribution as possible, while the discrimination network distinguishes real pictures from generated ones as well as possible. The two play against each other, improving with continued training, and finally reach an equilibrium state.
The invention learns and generates pictures with authenticity and diversity on a small sample image data set based on the generation network and the discrimination network. Training procedure see fig. 1, general framework see fig. 2.
The specific implementation process of the invention is as follows:
Step 1: parameter initialization
Initialize the training picture size, the training set P, the batch size, and the number of training iterations T; randomly initialize the generation network G and the discrimination network D;
Step 2: sampling dynamic Gaussian mixture latent codes and data set samples
Randomly sample m latent codes {z_1, …, z_m} from the dynamic Gaussian mixture distribution, and randomly sample m original training pictures {I_1, …, I_m} from the training set P. The dynamic Gaussian mixture distribution is:
z = λμ_i + (1 − λ)σ_i·δ,  δ ∼ N(0, 1)
wherein λ is a dynamic regulation factor that dynamically adjusts the Gaussian components of the mixture latent code, and δ is a vector randomly sampled from a Gaussian distribution with mean 0 and variance 1.
Step 3: preprocess the m original pictures: horizontally flip, randomly crop and normalize them, and express the data in tensor form;
and 4, step 4: inputting the hidden code into a generation network, removing the intermediate feature representation in the middle process of generation, enhancing the content and layout of the intermediate feature representation by using a mixed attention mechanism, continuously using the enhanced feature representation for generating the network, obtaining m generated pictures, and processing the generated pictures into a format { G (z) } which is the same as that of the training pictures 1 ),…,G(z m )};
The mixed attention mechanism in step 4 includes a spatial attention mechanism and a channel attention mechanism, and focuses on the local content and the whole layout information of the feature map, and the flow is shown in fig. 3.
Step 5: input the m generated pictures {G(z_1), …, G(z_m)} and the m real pictures {I_1, …, I_m} into the discrimination network, with real pictures labeled 'real' and generated pictures labeled 'fake', for the discrimination network to distinguish;
Step 6: training the discrimination network
Improve the discrimination capability of the discrimination network by minimizing its target loss function; back-propagate the loss with gradient descent and update the discrimination network parameters;
The loss function of the discrimination network is defined as:
L_D = −E_{x∼I_real}[min(0, −1 + D(x))] − E_{x∼G(z)}[min(0, −1 − D(x))] + L_recons
wherein E denotes expectation, I_real is the real training distribution, z is the dynamic Gaussian mixture latent code, G(z) is a generated sample, min(·) denotes minimization, D(x) is the image discrimination result, and L_recons is the reconstruction loss, which helps improve the feature-extraction capability of the discrimination network and thus its discrimination capability.
Step 7: training the generation network
The generation network continuously generates pictures under the guidance of the discrimination network; the generated pictures must resemble real pictures as closely as possible so as to confuse the discrimination network. The probability of misjudgment by the discrimination network is increased by minimizing the target loss function of the generation network; the loss is back-propagated with gradient descent and the generation network parameters are updated;
The loss function of the generation network is defined as:
L_G = −E_{z}[D(G(z))]
where E denotes expectation, z is the dynamic Gaussian mixture latent code, G(z) is the generated image set, and D(·) is the image discrimination result.
Step 8: check the iteration count. The total number of iterations is set to 50000; repeat steps 2-7 until the termination condition is reached, saving the model parameters every 10000 iterations to finally obtain 5 models. The saved models are used to read the generation network parameters and generate pictures for visual comparison and quantitative index comparison; the generated pictures can also be used for data enhancement to help improve tasks such as classification and segmentation.
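Steps 2-8 can be sketched as a toy training loop. The stub G and D, the tiny iteration counts, and the omission of actual gradient updates are all simplifications; SAVE_EVERY stands in for the 10000-iteration checkpoint interval:

```python
import numpy as np

rng = np.random.default_rng(0)
T, m, d = 30, 4, 8           # iterations, batch size, latent dim (toy values)
SAVE_EVERY = 10              # stands in for the 10000-iteration checkpoint interval
checkpoints = []

def G(z):                    # stub generator: maps latent codes to "images"
    return np.tanh(z)

def D(x):                    # stub discriminator: one scalar score per sample
    return x.mean(axis=1)

for t in range(1, T + 1):
    z = rng.standard_normal((m, d))      # step 2: sample latent codes
    real = rng.standard_normal((m, d))   # step 2: sample a real batch
    fake = G(z)                          # step 4: generate pictures
    d_loss = (-np.minimum(0, -1 + D(real)).mean()
              - np.minimum(0, -1 - D(fake)).mean())  # step 6: discriminator loss
    g_loss = -D(G(z)).mean()             # step 7: generator loss
    # (parameter updates by gradient descent are omitted in this stub)
    if t % SAVE_EVERY == 0:              # step 8: periodic checkpointing
        checkpoints.append(t)
```

With T = 30 and SAVE_EVERY = 10 the loop records three checkpoints, mirroring the 5 checkpoints the patent saves over 50000 iterations.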
Design of experiments
Experimental data set
The experimental data sets are small sample image data sets of different styles, including animation, painting, human faces and scenery, at resolutions of 256×256×3, 512×512×3 and 1024×1024×3. Every data set contains no more than 1000 pictures, so the data are extremely limited; a detailed description is given in table 1. These data sets are highly challenging and of strong application and research significance.
Table 1 introduction to the experimental data set
Comparison algorithm
The invention targets image generation in small sample scenes; the comparison algorithms include StyleGAN2, DiffAug, ADA and FastGAN, the best current methods in limited-sample scenes.
Evaluation index
A common indicator for evaluating the realism and diversity of generated pictures is FID. FID computes the distance between the distributions of real and generated pictures. Following common settings, the real training pictures are chosen as reference pictures, 5000 pictures are generated, and the distance between the two distributions is computed; the smaller the value, the closer the generated pictures are to the real pictures, i.e. the better the performance.
The calculation formula for FID is:
FID = ‖μ_r − μ_g‖² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^{1/2})
wherein μ_r and μ_g denote the feature means of the real pictures and the generated pictures respectively, Σ_r and Σ_g denote their covariance matrices, Tr denotes the trace, and ‖·‖₂ denotes the two-norm.
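Under the simplifying assumption of diagonal covariances (so the matrix square root has a closed form), the FID formula above reduces to a few NumPy lines; this illustrates the formula only, not the full Inception-feature pipeline used in the experiments:

```python
import numpy as np

def fid_diagonal(mu_r, mu_g, var_r, var_g):
    """FID between two Gaussians with diagonal covariances.

    With Sigma_r = diag(var_r) and Sigma_g = diag(var_g), the matrix square
    root (Sigma_r Sigma_g)^{1/2} reduces to diag(sqrt(var_r * var_g)), so
    FID = ||mu_r - mu_g||^2 + sum_i(var_r_i + var_g_i - 2*sqrt(var_r_i*var_g_i)).
    """
    diff = np.sum((mu_r - mu_g) ** 2)
    trace = np.sum(var_r + var_g - 2.0 * np.sqrt(var_r * var_g))
    return diff + trace

mu = np.array([0.5, -1.0]); var = np.array([1.0, 2.0])
same = fid_diagonal(mu, mu, var, var)        # identical distributions -> 0
shifted = fid_diagonal(mu, mu + 1.0, var, var)
```

For full covariance matrices the square root requires an eigendecomposition (e.g. `scipy.linalg.sqrtm`); the diagonal case suffices to see why FID is zero only when the two distributions coincide.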
Results of the experiment
Table 2 experimental results of the present and comparative methods on the 256×256×3 data sets
Table 3 experimental results of the present and comparative methods on the 512×512×3 data sets
Datasets | AnimeFace | ArtPainting | Moongate | Flat | Fauvism |
StyleGAN2 | 152.73 | 74.56 | 288.25 | 285.61 | 181.91 |
DiffAug | 135.85 | 49.25 | 136.12 | 310.14 | 223.58 |
ADA | 59.67 | 46.38 | 149.06 | 248.46 | 201.99 |
FastGAN | 59.38 | 45.08 | 122.29 | 240.24 | 182.14 |
Ours | 53.36 | 44.50 | 112.14 | 200.99 | 176.43 |
Table 4 experimental results of the present and comparative methods on the 1024×1024×3 data sets
Datasets | Pokemon | Skulls | Shells | FFHQ | Flowers |
StyleGAN2 | 190.23 | 127.98 | 241.37 | - | 45.23 |
DiffAug | 62.73 | 124.23 | 151.94 | 48.88 | 37.09 |
ADA | 66.41 | 97.05 | 136.52 | 40.63 | 27.36 |
FastGAN | 57.19 | 130.05 | 155.47 | 47.78 | 25.66 |
Ours | 47.04 | 99.02 | 134.20 | 44.01 | 24.06 |
Tables 2, 3 and 4 show the experimental results of the method of the present invention and the comparison methods on data sets of different resolutions. It can be seen that, in scenes with limited sample size, the present invention generates pictures with higher realism and diversity, demonstrating its effectiveness and superiority for small sample image generation.
Visual analysis
To better investigate the diversity and realism of the invention's results on small sample image data sets, visualization results were generated and collated for different resolutions; see fig. 4 and 5. As the visualizations show, the pictures generated by the method are closer to real pictures, perform well on data sets of different resolutions, and are quite plausible in terms of local content, global layout and other aspects.
In conclusion, the small sample image generation method provided by the invention significantly improves the diversity and realism of generated images in small sample scenes, and its effectiveness and practicality are verified by both quantitative and qualitative analyses. The pictures generated by the method can be used for a wide range of tasks in data-limited scenes, including data enhancement, classification and segmentation. In addition, the invention provides a reference for other related problems in the field; its principles and ideas can be extended to other related application scenes, offering good reference value and a very broad application prospect.
The above description is a specific embodiment of the present invention; the invention is not limited to the above examples. It is obvious to those skilled in the art that the invention can be adapted to various models and can be adjusted and modified according to the specific task. Any modification, replacement or improvement made within the scope of the principles of the present invention should be included in the scope of the claims.
Claims (8)
1. A small sample image generation method, comprising:
randomly sampling from dynamic Gaussian mixture distribution to obtain dynamic Gaussian mixture hidden codes, wherein the dynamic Gaussian mixture distribution is the Gaussian mixture distribution with dynamic regulation factors introduced;
inputting the dynamic Gaussian mixture hidden code into a generation network, enhancing the intermediate feature of the generation network through a mixed attention mechanism, wherein the intermediate feature is obtained by mapping the dynamic Gaussian mixture hidden code by the generation network, and the enhanced intermediate feature is input into the generation network to obtain a generated image set;
inputting the generated image set and the real image set into a discrimination network to obtain image discrimination results of the generated image set and the real image set;
and updating the generation network and the discrimination network according to the image discrimination result and the target optimization functions of the generation network and the discrimination network, to obtain the updated generation network and the updated discrimination network.
2. The small sample image generation method according to claim 1, wherein the dynamic gaussian mixture distribution conforms to the following correspondence:
z = λu_i + (1-λ)σ_iδ

wherein z is the dynamic Gaussian mixture hidden code; λ is the dynamic regulation factor, which adjusts the contribution of the Gaussian components in the dynamic Gaussian mixture hidden code; u_i and σ_i are network-learnable parameters (the mean and standard deviation of the i-th Gaussian component); and δ is a vector randomly sampled from a Gaussian distribution with mean 0 and variance 1, i.e., δ ∼ N(0,1).
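As an illustration, the sampling rule of claim 2 can be sketched in NumPy. The claim does not specify how the component index i is selected, so uniform selection is an assumption here, and `mu`/`sigma` stand in for the learnable parameters u_i and σ_i:

```python
import numpy as np

def sample_dynamic_gmm(mu, sigma, lam, rng=None):
    """Sample a hidden code z = lam * u_i + (1 - lam) * sigma_i * delta.

    mu, sigma : (K, D) arrays standing in for the learnable means and
        standard deviations of the K Gaussian components.
    lam : dynamic regulation factor, blending the component mean with
        the scaled standard-normal noise delta ~ N(0, 1).
    The component index i is chosen uniformly (an assumption; the
    claim does not state how i is selected).
    """
    rng = np.random.default_rng() if rng is None else rng
    i = rng.integers(len(mu))                 # pick a component
    delta = rng.standard_normal(mu.shape[1])  # delta ~ N(0, 1)
    return lam * mu[i] + (1.0 - lam) * sigma[i] * delta
```

Setting λ = 1 collapses the sample onto a component mean, while λ = 0 yields pure scaled Gaussian noise, which is how the factor trades off the learned mixture against fresh diversity.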
3. The small sample image generation method of claim 1, wherein the hybrid attention mechanism comprises a spatial attention mechanism and a channel attention mechanism.
4. The small sample image generation method according to claim 3, wherein the spatial attention mechanism focuses on and enhances the most important part of the content of a feature. The spatial attention mechanism first aggregates channel information using pooling to obtain two 2D feature maps, F_avg^s ∈ R^{1×H×W} and F_max^s ∈ R^{1×H×W}, representing the feature maps obtained using average pooling and maximum pooling, respectively. Next, the two feature maps are concatenated and passed through a convolution operation to obtain the spatial attention map. The spatial attention mechanism is formally described as:

M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])) = σ(f^{7×7}([F_avg^s; F_max^s]))

where σ denotes the activation function, AvgPool and MaxPool denote average pooling and maximum pooling respectively, f^{7×7} denotes a convolution operation with a kernel of size 7×7, F is the feature map, and F_avg^s and F_max^s are the feature maps obtained after average pooling and maximum pooling.
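The spatial attention computation described above can be sketched in plain NumPy. This is a naive, loop-based sketch for clarity: the 7×7 kernel and 'same' padding follow the claim, the sigmoid is assumed as the activation σ, and `conv_weight` stands in for weights that would be learned in practice:

```python
import numpy as np

def spatial_attention(feat, conv_weight):
    """CBAM-style spatial attention sketch.

    feat : (C, H, W) feature map F.
    conv_weight : (2, k, k) kernel of a single-output convolution applied
        to the stacked [avg-pooled; max-pooled] maps, 'same' padding.
    Returns the spatial attention map M_s of shape (H, W), in (0, 1).
    """
    avg = feat.mean(axis=0)            # F_avg^s: average over channels
    mx = feat.max(axis=0)              # F_max^s: max over channels
    stacked = np.stack([avg, mx])      # concatenation -> (2, H, W)
    k = conv_weight.shape[-1]
    p = k // 2
    padded = np.pad(stacked, ((0, 0), (p, p), (p, p)))
    H, W = avg.shape
    out = np.empty((H, W))
    for i in range(H):                 # naive 'same' convolution
        for j in range(W):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * conv_weight)
    return 1.0 / (1.0 + np.exp(-out))  # sigmoid activation σ
```

The resulting (H, W) map is broadcast-multiplied against the feature map to emphasize the informative spatial locations.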
5. The small sample image generation method according to claim 3, wherein the channel attention mechanism focuses on what is meaningful in a feature map. To compute the channel attention map, spatial information is first squeezed using average pooling and maximum pooling, resulting in two channel descriptors, F_avg^c ∈ R^{C×1×1} and F_max^c ∈ R^{C×1×1}. Both descriptors are then passed through a shared network to generate the channel attention map M_c ∈ R^{C×1×1}, the shared network being a multi-layer perceptron with one hidden layer. The channel attention mechanism is formally described as:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(MLP(F_avg^c) + MLP(F_max^c))

where σ denotes the activation function and MLP denotes the shared multi-layer perceptron.
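The channel attention computation can likewise be sketched in NumPy. The bias-free ReLU MLP is an assumption (the claim only says "a multi-layer perceptron with a hidden layer"), and the sigmoid is assumed as the activation σ:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """CBAM-style channel attention sketch.

    feat : (C, H, W) feature map F.
    w1 : (C, C_hidden) and w2 : (C_hidden, C) -- weights of the shared
        one-hidden-layer MLP (bias-free, ReLU hidden activation;
        an assumption not fixed by the claim).
    Returns M_c as a (C,) vector, i.e. the squeezed R^{C x 1 x 1} map.
    """
    avg = feat.mean(axis=(1, 2))  # F_avg^c: squeeze space by avg pooling
    mx = feat.max(axis=(1, 2))    # F_max^c: squeeze space by max pooling

    def mlp(v):
        return np.maximum(v @ w1, 0.0) @ w2

    return 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))  # sigmoid σ
```

The (C,) vector is then broadcast over the spatial dimensions to reweight each channel of the feature map.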
6. The method according to any one of claims 1 to 5, wherein the updating the parameters of the generation network and the discrimination network according to the image discrimination result and the target optimization functions of the generation network and the discrimination network comprises:
substituting the image discrimination result into the target optimization function of the discrimination network, and updating the parameters of the discrimination network, wherein the target optimization function of the discrimination network conforms to the following correspondence:

L_D = -E_{x∼I_real}[min(0, -1 + D(x))] - E_{x̂∼G(z)}[min(0, -1 - D(x̂))] + L_recons

wherein E represents expectation, I_real is the real image set, z is the dynamic Gaussian mixture hidden code, G(z) is the generated image set, D(x) is the image discrimination result, min(·) represents taking the minimum, and L_recons is the reconstruction loss; the reconstruction loss helps to improve the capability of the discrimination network to extract features, thereby improving its discrimination capability.
substituting the image discrimination result into the target optimization function of the generation network, and updating the parameters of the generation network, wherein the target optimization function of the generation network conforms to the following correspondence:

L_G = -E_{x∼G(z)}[D(x)]

where E represents expectation, z is the dynamic Gaussian mixture hidden code, x∼G(z) is the generated image set, and D(x) is the image discrimination result.
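Under the common hinge-loss reading of these objectives (an assumption consistent with the min(·) term and L_recons named in claim 6), both losses can be sketched as follows; L_recons is passed in as a precomputed scalar because its exact form is not given in the claims:

```python
import numpy as np

def discriminator_loss(d_real, d_fake, l_recons=0.0):
    """Hinge-loss objective for the discrimination network.

    d_real / d_fake : arrays of D's outputs on real / generated images.
    l_recons : reconstruction loss, supplied as a precomputed scalar
        (its definition is not specified in the claims).
    """
    loss_real = -np.minimum(0.0, -1.0 + d_real).mean()
    loss_fake = -np.minimum(0.0, -1.0 - d_fake).mean()
    return loss_real + loss_fake + l_recons

def generator_loss(d_fake):
    """L_G = -E_{x~G(z)}[D(x)]: push D's score on generated images up."""
    return -d_fake.mean()
```

Alternating gradient steps on these two losses implement the update of claim 1: the discrimination network is penalized unless real scores exceed +1 and fake scores fall below -1, while the generation network is rewarded for raising the fake scores.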
7. The method of any one of claims 1 to 6, wherein after the updating of the parameters of the generation network and the discrimination network according to the target optimization functions of the generation network and the discrimination network, the method further comprises:

generating image data with the updated generation network, wherein the generated image data is used for at least one of data enhancement, classification and segmentation.
8. A small sample image generation system, comprising a generator coupled to the generation network and a discriminator coupled to the discrimination network, the system being configured to execute programs of the generator and the discriminator so that the system performs the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211230704.9A CN115690487A (en) | 2022-10-09 | 2022-10-09 | Small sample image generation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115690487A true CN115690487A (en) | 2023-02-03 |
Family
ID=85065194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211230704.9A Pending CN115690487A (en) | 2022-10-09 | 2022-10-09 | Small sample image generation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115690487A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116664450A (en) * | 2023-07-26 | 2023-08-29 | 国网浙江省电力有限公司信息通信分公司 | Diffusion model-based image enhancement method, device, equipment and storage medium |
CN118447211A (en) * | 2023-10-16 | 2024-08-06 | 苏州飞舸数据科技有限公司 | Data preprocessing method and system based on image feature refinement |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Alqahtani et al. | Applications of generative adversarial networks (gans): An updated review | |
Qiu et al. | Semanticadv: Generating adversarial examples via attribute-conditioned image editing | |
CN108875935B (en) | Natural image target material visual characteristic mapping method based on generation countermeasure network | |
US11544880B2 (en) | Generating modified digital images utilizing a global and spatial autoencoder | |
CN115690487A (en) | Small sample image generation method | |
Li et al. | Globally and locally semantic colorization via exemplar-based broad-GAN | |
US20230245351A1 (en) | Image style conversion method and apparatus, electronic device, and storage medium | |
CN111695494A (en) | Three-dimensional point cloud data classification method based on multi-view convolution pooling | |
CN111598968A (en) | Image processing method and device, storage medium and electronic equipment | |
CN111931908B (en) | Face image automatic generation method based on face contour | |
CN112884893A (en) | Cross-view-angle image generation method based on asymmetric convolutional network and attention mechanism | |
Zhu et al. | Pyramid nerf: Frequency guided fast radiance field optimization | |
CN114581356A (en) | Image enhancement model generalization method based on style migration data augmentation | |
CN117576248B (en) | Image generation method and device based on gesture guidance | |
Yuan et al. | Explore double-opponency and skin color for saliency detection | |
AU2023204419A1 (en) | Multidimentional image editing from an input image | |
Padala et al. | Effect of input noise dimension in GANs | |
CN109858543A (en) | The image inferred based on low-rank sparse characterization and relationship can degree of memory prediction technique | |
Sun et al. | Channel attention networks for image translation | |
CN115482557A (en) | Human body image generation method, system, device and storage medium | |
CN117255998A (en) | Unsupervised learning of object representations from video sequences using spatial and temporal attention | |
Li et al. | OT-net: a reusable neural optimal transport solver | |
Li et al. | Neural style transfer based on deep feature synthesis | |
Atone et al. | Generative Adversarial Networks in Computer Vision: A Review of Variants, Applications, Advantages, and Limitations | |
Manisha et al. | Effect of input noise dimension in gans |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |