CN114882220B - Domain-adaptive prior-knowledge-guided GAN (generative adversarial network) image generation method and system - Google Patents
- Publication number
- CN114882220B (application CN202210548444.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- domain
- translated
- adaptive
- target domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/32—Normalisation of the pattern dimensions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image generation method and system based on a generative adversarial network (GAN) guided by domain-adaptive prior knowledge. The method comprises the following steps: data set preparation, data set preprocessing, training the source domain generator in the source domain network model, training the target domain generator in the target domain network model, image augmentation, and training the source domain discriminator in the source domain network model and the target domain discriminator in the target domain network model. In the GAN proposed by the invention, the generator includes a source domain branch and a target domain branch. The source domain branch learns content information from a large amount of data similar to the target domain, and affine parameter migration of the batch normalization (BN) layer, together with a domain mixing technique, transfers the knowledge of the source domain into the target domain, which alleviates the problem of limited target domain data. To further improve the quality of the generated image, a spatially adaptive normalization module is introduced into the target domain branch so that prior knowledge of the main target is injected during target domain image generation, which improves the accuracy of the target in the generated image.
Description
Technical Field
The invention relates to image generation technology, belongs to the fields of computer vision and artificial intelligence, and particularly relates to an image generation method and system based on a generative adversarial network guided by domain-adaptive prior knowledge.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Since the generative adversarial network (GAN) was proposed, image generation has become an active research field, and GAN-based image generation models have achieved satisfactory results in tasks such as style transfer, image restoration, super-resolution, and image translation.
In general, a GAN model consists of two sub-networks: a generator, which generates images, and a discriminator, which ensures that the generated images are consistent with the target images. Training is a process in which the two sub-networks compete with each other and are optimized jointly. The complex structure of a GAN gives it a large number of parameters, so training a GAN often requires a large amount of data. If the amount of data is insufficient, the quality of the generated images is low and the generated images suffer from mode collapse. However, in certain tasks (e.g., medical image generation), it is difficult to collect large amounts of data, which reduces model performance.
When data are limited, transfer learning is an effective way to improve network performance. Within transfer learning, domain adaptation has become a popular technique; it aligns the feature representations of the source domain training data and the target domain data in the latent space. The network can then extract the same or similar features from the data of the two domains, so features extracted from a large amount of source domain data can effectively assist training on the target domain data, improving the performance of a model trained with limited data.
Although a GAN can, under a suitable training strategy, generate data that follow the image distribution of the training set, the quality of the generated images is difficult to guarantee, and problems such as blurred content often occur. This is often because the normalization method used in the network is not appropriate. Spatially-Adaptive Normalization (SPADE) solves this problem to a certain extent: it obtains the affine parameters of the normalization layer by applying convolutions to an additional semantic segmentation label. Regions that contain instances in the semantic label become more prominent in the feature maps extracted by the network, which strengthens the semantic content of the feature maps and makes the generated images more realistic.
Disclosure of Invention
In order to solve the problem that paired data and data labels are difficult to obtain, the invention provides a GAN image generation method and system guided by domain-adaptive prior knowledge. In the GAN proposed by the invention, the generator includes two branches, a source domain branch and a target domain branch. The source domain branch learns content information from a large amount of data similar to the target domain, and affine parameter migration of the batch normalization layer, together with a domain mixing technique, transfers the knowledge of the source domain into the target domain, which alleviates the problem of limited target domain data. To further improve the quality of the generated image, a spatially adaptive normalization module is introduced into the target domain branch so that prior knowledge of the main target is injected during target domain image generation, which improves the accuracy of the target in the generated image. To increase the weight of important target regions in the discrimination process, a spatially adaptive normalization module is also introduced into the discriminator so that the discriminator can focus on the target region.
To achieve the above, the invention adopts the following technical solution:
The invention provides an image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge, which comprises the following steps:
S1, data set preparation: collecting paired images and the semantic segmentation labels corresponding to the images according to the task requirements, and using them as the target domain data during training; collecting unlabeled images that are similar or related to the translated images in the target domain from the Internet, and using them as the source domain data during training;
S2, data set preprocessing: unifying the sizes of all image data in the target domain data and the source domain data;
S3, training the source domain generator in the source domain network model: when the model is trained on the source domain data, its input is a noise vector; the noise vector is processed by a fully connected layer and the resulting vector is reshaped to the uniform image size; in this mode, the model uses batch normalization (Batch Normalization) as its normalization layer.
The aim of the invention is to enable the network to generate images similar to the translated images in the target domain. When a network has this capability, it can be said to hold the content information needed to generate the translated image.
S4, training the target domain generator in the target domain network model: when the model is trained on the target domain data, the input data it receives are the image to be translated and the semantic segmentation label of the image to be translated; the semantic segmentation label conditions the spatially adaptive normalization layer and strengthens the constraint that the image to be translated places on the generated translated image;
S5, image augmentation: adaptive discriminator augmentation is used; images are randomly augmented before they are input to the adaptive discriminator, and the adaptive discriminator only judges augmented images. This approach broadens the distribution of images and provides larger gradients to aid training;
S6, training the source domain discriminator in the source domain network model and the target domain discriminator in the target domain network model: the source domain discriminator and the target domain discriminator do not share normalization layers; when the target domain discriminator is trained, it receives a real or synthesized target domain image together with the semantic segmentation label of the real image, and the semantic segmentation label is likewise used to condition the spatially adaptive normalization layer, so that the discriminator focuses more on local objects.
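The non-shared normalization layers in step S6 can be pictured with a small sketch. The following NumPy code is an illustration under stated assumptions, not the patented implementation: the class name `DomainSpecificNorm`, the mode strings `"source"` and `"target"`, and the use of batch statistics for both domains are hypothetical simplifications (in the described method the target domain path is additionally conditioned on the semantic segmentation label).

```python
import numpy as np


class DomainSpecificNorm:
    """Batch-style normalization with one affine pair per domain (hypothetical helper)."""

    def __init__(self, channels: int, eps: float = 1e-5):
        self.eps = eps
        # Separate (gamma, beta) per domain: these are the parameters that are NOT shared.
        self.affine = {
            "source": (np.ones(channels), np.zeros(channels)),
            "target": (np.ones(channels), np.zeros(channels)),
        }
        self.mode = "source"

    def set_mode(self, mode: str) -> None:
        """Switch between the source domain mode and the target domain mode."""
        if mode not in self.affine:
            raise ValueError(f"unknown domain mode: {mode}")
        self.mode = mode

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # x has shape (batch, channels, height, width).
        mean = x.mean(axis=(0, 2, 3), keepdims=True)
        var = x.var(axis=(0, 2, 3), keepdims=True)
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        gamma, beta = self.affine[self.mode]
        return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)
```

Switching the mode changes only which affine pair is applied, so any convolution weights around such a layer could stay common to both domains while the normalization parameters remain domain specific.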
Preferably, in the data set preparation step, the collected images are divided into images to be translated, semantic segmentation labels of the images to be translated, translated images, and semantic segmentation labels of the translated images, which are placed in four corresponding folders as the target domain data; images related to the translated images are collected as source domain images, either from public data sets related to the translated images or from the Internet, and are placed in a separate folder.
Preferably, in the data set preprocessing step, a naming rule is set for each group of four items in the target domain data to facilitate grouping.
Preferably, in the step of training the source domain generator in the source domain network model, the noise vector is raised in dimension by a fully connected layer to a 65536-dimensional vector, which is then reshaped into a 256 × 256 matrix and input to the convolutional layers; batch normalization is applied after convolution; to generate an image, the source domain generator first down-samples and then up-samples to produce a 256 × 256 fake source domain image; the source domain discriminator receives either a real source domain image or a fake source domain image, and in both cases the image it receives has been augmented.
Preferably, in the step of training the target domain generator in the target domain network model, the image before translation and the translated image do not follow the same distribution, so the spatially adaptive normalization layer is not applied to the final up-sampling layers of the target domain generator and is applied only to the preceding down-sampling layers and feature extraction layers. The final layers can therefore be better guided by the target domain discriminator, so that the generated result is closer to the translated image while the characteristics of the image before translation are retained. Because the source domain has a large amount of data and the batch normalization layer can learn content-invariant information of the image domain, during training the affine parameters of the batch normalization layer of the corresponding layer are migrated into the spatially adaptive normalization layer, which helps strengthen the relation between the source domain and the target domain.
Preferably, in the step of training the target domain generator in the target domain network model, the target domain generator receives the image to be translated and its semantic segmentation label; the image to be translated is 256 × 256 pixels and enters the convolutional network directly, without the fully connected layer that is needed when training the source domain network; spatially adaptive normalization is applied to the feature map after convolution, and the base normalization of the feature map inside the spatially adaptive normalization is computed as instance normalization; the affine transformation is derived from the additionally input semantic segmentation label of the image to be translated: the label first passes through one convolution, and two further convolutions then produce two tensors that serve as the offset (β) and the scale (γ) of the affine transformation; the affine parameters of the source domain batch normalization layer are first used to restore the feature map distribution, and γ and β are then applied to the feature map by element-wise multiplication and addition to obtain the final output; the spatially adaptive normalization layer is used only in the down-sampling blocks and the intermediate convolution blocks, while the up-sampling layers share the batch normalization layer with the source domain network model; the fake translated image of the target domain is finally obtained.
Preferably, in the image augmentation step, adaptive discriminator augmentation is used: the augmentation is applied to the image before it is input to the adaptive discriminator, not before the adaptive generator; before an image is sent to the adaptive discriminator, it is randomly augmented by color change or random occlusion, with the augmentation probability fixed at a safe value of 0.8; the images judged by the discriminator are therefore images augmented with probability 0.8.
Preferably, in the step of training the source domain discriminator in the source domain network model and the target domain discriminator in the target domain network model, the adaptive discriminator receives a real translated image or a fake translated image, together with the semantic segmentation label of the real translated image; the image received by the adaptive discriminator enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image conditions the spatially adaptive normalization layer; the use of a spatially adaptive normalization layer in the adaptive discriminator enables it to focus on critical target regions; finally, the discrimination result is obtained.
The invention also provides an image generation system based on a generative adversarial network guided by domain-adaptive prior knowledge, which comprises a data set preparation module, a data set preprocessing module, a source domain network training module, a target domain generation network training module, an image augmentation module, and a target domain discrimination network training module. The data set preparation module collects paired images and the semantic segmentation labels corresponding to the images according to the task requirements as the target domain data during training, and collects unlabeled images that are similar or related to the translated images in the target domain from the Internet as the source domain data during training. The data set preprocessing module unifies the sizes of all image data in the target domain data and the source domain data. The source domain network training module trains the model on the source domain data; the input of the model is a noise vector, the noise vector is processed by a fully connected layer, the resulting vector is reshaped to the uniform image size, and the model uses a batch normalization layer. The target domain generation network training module trains the model on the target domain data; the input data received by the model are the images to be translated and their semantic segmentation labels, and the semantic segmentation labels condition the spatially adaptive normalization layer and strengthen the constraint that the image to be translated places on the generated translated image. The image augmentation module uses adaptive discriminator augmentation: images are randomly augmented before they are input to the adaptive discriminator, and the adaptive discriminator only judges augmented images. The target domain discrimination network training module uses an adaptive discriminator that receives a real translated image or a fake translated image, together with the semantic segmentation label of the real translated image; the real or synthesized image received by the adaptive discriminator enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image conditions the spatially adaptive normalization layer; the use of a spatially adaptive normalization layer in the adaptive discriminator enables it to focus on critical target regions.
Compared with the prior art, the invention has the beneficial effects that:
The image generation model is built on a GAN and is combined with domain adaptation to support image generation from small samples. Two branches are trained simultaneously. The first, the source domain generation branch, generates source domain data; the source domain data are similar to the translated images, and this branch is required to generate realistic images that resemble the translated images. It therefore holds a large amount of the information needed to generate such images. The second branch performs target domain image translation: the model receives an image to be translated as input, and the semantic segmentation label information of that image is injected through a spatially adaptive normalization layer. This additional information helps establish the relationship with the translated image. Because the first branch can generate images similar to the translated images, and the affine parameters of its batch normalization layers store information related to the distribution of the translated images, migrating these affine parameters helps the second branch approach the distribution of the translated images. The invention introduces domain adaptation into the GAN and improves the ability to train the network on small-sample data. In experiments, realistic images were generated using only 160 images.
Drawings
The drawings described below are provided for a clear understanding of the embodiments and technical solutions of the present invention. It should be noted that the exemplary embodiments and their descriptions serve only to explain the present invention and do not constitute an undue limitation of it.
FIG. 1 is a flow chart of training the model on one batch of data in the present invention.
FIG. 2 is a schematic structural diagram of the generator according to the present invention.
FIG. 3 is a schematic structural diagram of the adaptive discriminator according to the present invention.
Detailed Description
It is to be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; it should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments of the invention may be combined with each other without conflict.
The invention provides an image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge, which comprises the following steps:
S1, data set preparation:
Image data related to the task are collected and divided into images to be translated (images_B), semantic segmentation labels of the images to be translated (labels_B), translated images (images_A), and semantic segmentation labels of the translated images (labels_A); these are placed in four corresponding folders and used as the target domain data. Images related to the translated images images_A are collected, either from a public data set related to images_A or from the Internet, and are placed in a separate folder as the source domain images (source_images).
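The grouping of the four target domain folders can be sketched as follows. This is a minimal illustration that assumes a hypothetical naming rule in which the four files of one sample share the same file stem; the source only states that a naming rule is set to facilitate grouping, and the function name `group_target_samples` is likewise illustrative.

```python
from pathlib import PurePath

# Folder names follow the identifiers used in the description.
TARGET_FOLDERS = ("images_B", "labels_B", "images_A", "labels_A")


def group_target_samples(files_by_folder):
    """Group file names into samples of four, one file from each target domain folder.

    `files_by_folder` maps a folder name to the list of file names it contains.
    Assumption: files that belong together share the same stem (e.g. "001.png").
    Only stems present in all four folders are returned.
    """
    index = {}
    common_stems = None
    for folder in TARGET_FOLDERS:
        index[folder] = {
            PurePath(name).stem: name for name in files_by_folder.get(folder, [])
        }
        stems = set(index[folder])
        common_stems = stems if common_stems is None else common_stems & stems
    return [
        {folder: index[folder][stem] for folder in TARGET_FOLDERS}
        for stem in sorted(common_stems)
    ]
```

Samples that are missing any of the four items are dropped, which keeps every training group complete.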
S2, preprocessing a data set:
The main image area is cropped, the image is then resized to 256 × 256, and finally the image is normalized, with all normalization parameters set to 0.5.
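The preprocessing in S2 can be sketched in NumPy as below. Several details are assumptions made for illustration: a centered square crop stands in for "cropping the main image area", nearest-neighbour sampling stands in for the unspecified resizing method, and the normalization parameters of 0.5 are interpreted as a mean of 0.5 and a standard deviation of 0.5 applied to pixel values scaled to [0, 1].

```python
import numpy as np


def center_crop_square(img: np.ndarray) -> np.ndarray:
    """Crop the largest centered square (assumed stand-in for the main image area)."""
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return img[top:top + side, left:left + side]


def resize_nearest(img: np.ndarray, size: int = 256) -> np.ndarray:
    """Resize to size x size with nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def normalize(img_uint8: np.ndarray, mean: float = 0.5, std: float = 0.5) -> np.ndarray:
    """Scale uint8 pixels to [0, 1], then apply (x - mean) / std."""
    x = img_uint8.astype(np.float32) / 255.0
    return (x - mean) / std


def preprocess(img_uint8: np.ndarray) -> np.ndarray:
    """Crop, resize to 256 x 256, and normalize one image of shape (H, W, C)."""
    return normalize(resize_nearest(center_crop_square(img_uint8)))
```

Under these assumptions the output values lie in [-1, 1], a range commonly paired with a tanh output layer in GAN generators.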
S3, training a source domain generator in the source domain network model:
When the model is trained on the source domain data, it is set to the source domain mode. The source domain generator receives a noise vector as input; the noise vector is raised in dimension by a fully connected layer to a 65536-dimensional vector, which is then reshaped into a 256 × 256 matrix and input to the convolutional layers, with batch normalization applied after convolution. The generation process passes through down-sampling layers and size-preserving convolutional layers, and finally up-sampling layers that produce a 256 × 256 fake source domain image (fake source_images). The image received by the source domain discriminator is a real source domain image (real source_images) or a synthesized fake source domain image (fake source_images); in either case the image has been augmented, and the augmentation method is described in detail in S5. The source domain discriminator tries to distinguish whether the input image is a real image or a synthesized fake image. The parameters of the source domain generator and the source domain discriminator are updated alternately according to this feedback.
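The input stage of the source domain generator can be sketched as follows. The noise dimension of 64, the random weight initialization, and the single-channel layout are assumptions for illustration; the source specifies only the 65536-dimensional vector, the 256 × 256 reshape, and the use of batch normalization after convolution. The convolutional layers themselves are omitted here.

```python
import numpy as np

NOISE_DIM = 64          # assumed; the source does not state the noise dimension
MAP_SIZE = 256          # 256 * 256 = 65536, as in the description

rng = np.random.default_rng(0)
# Weights of the dimension-raising layer (randomly initialized for the sketch).
fc_weight = rng.normal(0.0, 0.02, size=(NOISE_DIM, MAP_SIZE * MAP_SIZE)).astype(np.float32)
fc_bias = np.zeros(MAP_SIZE * MAP_SIZE, dtype=np.float32)


def noise_to_map(z: np.ndarray) -> np.ndarray:
    """Map noise of shape (batch, NOISE_DIM) to a (batch, 1, 256, 256) tensor."""
    flat = z @ fc_weight + fc_bias            # (batch, 65536)
    return flat.reshape(-1, 1, MAP_SIZE, MAP_SIZE)


def batch_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    """Training-mode batch normalization over (batch, height, width) per channel."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)
```

The `gamma` and `beta` arrays of `batch_norm` correspond to the affine parameters that the description later migrates into the target domain branch.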
S4, training a target domain generator in the target domain network model:
When the network is trained on the target domain data, the target domain generator receives the images to be translated (images_B) and their semantic segmentation labels (labels_B), and the model is set to the target domain mode. The image to be translated is 256 × 256 pixels and enters the convolutional network directly, without the fully connected layer that is needed when training the source domain network. Spatially adaptive normalization is applied to the feature map after convolution; the base normalization used inside it is computed as instance normalization, and the affine transformation is derived from the additionally input semantic segmentation label labels_B. The label first passes through one convolution, and two further convolutions then produce two tensors that serve as the scale (γ) and the offset (β) of the affine transformation. The affine parameters of the batch normalization layer in the source domain model are first used to restore the feature map distribution, and γ and β are then applied to the feature map by element-wise multiplication and addition to obtain the final output. The spatially adaptive normalization layer is used only in the down-sampling blocks and the intermediate convolution blocks; the up-sampling layers share the BN (Batch Normalization) layer with the source domain; finally, the fake translated image of the target domain (fake images_A) is obtained.
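The normalization described in S4 can be sketched in NumPy as below. This is a simplified illustration, not the patented implementation: 1 × 1 convolutions replace whatever kernel size the actual layers use, a ReLU after the shared convolution is assumed, the label map is assumed to be one-hot and already at the spatial size of the feature map, and all function and parameter names are hypothetical.

```python
import numpy as np


def instance_norm(h: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each (sample, channel) plane of h with shape (N, C, H, W)."""
    mean = h.mean(axis=(2, 3), keepdims=True)
    var = h.var(axis=(2, 3), keepdims=True)
    return (h - mean) / np.sqrt(var + eps)


def conv1x1(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """1x1 convolution: weight has shape (C_out, C_in), bias has shape (C_out,)."""
    return np.einsum("oc,nchw->nohw", weight, x) + bias.reshape(1, -1, 1, 1)


def spade_with_bn_transfer(h, label_onehot, params, bn_gamma, bn_beta):
    """Spatially adaptive normalization with migrated source domain BN affine parameters.

    h:            feature map, shape (N, C, H, W)
    label_onehot: semantic label map, shape (N, K, H, W)
    params:       dict with (weight, bias) pairs under "shared", "gamma", "beta"
    bn_gamma/bn_beta: per-channel affine parameters copied from the source domain BN layer
    """
    # 1. Base normalization: instance normalization.
    normalized = instance_norm(h)
    # 2. Restore the distribution with the source domain BN affine parameters.
    restored = bn_gamma.reshape(1, -1, 1, 1) * normalized + bn_beta.reshape(1, -1, 1, 1)
    # 3. One convolution on the label, then two convolutions giving gamma and beta.
    shared = np.maximum(conv1x1(label_onehot, *params["shared"]), 0.0)  # ReLU is assumed
    gamma = conv1x1(shared, *params["gamma"])
    beta = conv1x1(shared, *params["beta"])
    # 4. Element-wise multiplication and addition.
    return gamma * restored + beta
```

Because γ and β vary per pixel, regions covered by a labeled object can be scaled and shifted differently from the background, which is how the label injects prior knowledge of the main target.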
S5, image augmentation:
Conventional augmentation approaches are not suitable for GANs, so the invention employs adaptive discriminator augmentation, which was proposed specifically for GANs. That is, the augmentation is applied to the image before it is input to the adaptive discriminator, not before the adaptive generator. Before an image is sent to the adaptive discriminator, it is randomly augmented by color change, random occlusion, and similar operations. In the original technique, the augmentation probability is adjusted automatically according to the degree of overfitting; however, that overfitting criterion is not suitable for the present method, so the invention uses a fixed, safe probability of 0.8. The images judged by the discriminator are therefore images augmented with probability 0.8.
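The augmentation step can be sketched as follows. The probability of 0.8 comes from the description; everything else is an assumption for illustration, including the specific color change (random brightness scale and shift), the occlusion patch size, the choice of exactly one operation per augmented image, and the function names.

```python
import numpy as np

AUG_PROB = 0.8  # fixed augmentation probability from the description


def color_change(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Random brightness scale and shift (assumed form of the color change)."""
    scale = rng.uniform(0.8, 1.2)
    shift = rng.uniform(-0.1, 0.1)
    return np.clip(img * scale + shift, -1.0, 1.0)


def random_occlusion(img: np.ndarray, rng: np.random.Generator,
                     frac: float = 0.25) -> np.ndarray:
    """Zero out one random rectangle covering `frac` of each side; img is (C, H, W)."""
    _, h, w = img.shape
    ph, pw = int(h * frac), int(w * frac)
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    out = img.copy()
    out[:, top:top + ph, left:left + pw] = 0.0
    return out


def augment_for_discriminator(batch: np.ndarray, rng: np.random.Generator,
                              p: float = AUG_PROB) -> np.ndarray:
    """Augment each image with probability p just before the discriminator sees it.

    The same function is meant to be applied to real and generated batches alike;
    generator inputs are never augmented.
    """
    out = []
    for img in batch:
        if rng.random() < p:
            op = color_change if rng.random() < 0.5 else random_occlusion
            img = op(img, rng)
        out.append(img)
    return np.stack(out)
```

In a training loop, both the real batch and the generator output would pass through `augment_for_discriminator` before the discriminator is called, so the discriminator never sees an image that bypassed the augmentation step.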
S6, training a source domain judger in the source domain network model and a target domain judger in the target domain network model:
The discriminator receives a real translated image (real images_A) or a fake translated image (fake images_A), together with the semantic segmentation label labels_A of the real translated image. The model is again set to the target domain mode. The received real or fake image enters the convolutional layers for feature extraction. labels_A conditions the spatially adaptive normalization layer, and the calculation proceeds as described in S4. Using spatially adaptive normalization in the discriminator enables it to focus on critical target regions. Finally, the discriminator outputs whether the input image is a real translated image or a fake translated image.
The invention also provides an image generation system based on a generative adversarial network guided by domain-adaptive prior knowledge, comprising a data set preparation module, a data set preprocessing module, a source domain network training module, a target domain generation network training module, an image augmentation module, and a target domain discrimination network training module. The data set preparation module collects, according to the task requirements, paired images and their corresponding semantic segmentation labels as target domain data for training, and collects unlabeled images from the Internet that are similar or related to the translated images in the target domain as source domain data for training. The data set preprocessing module unifies the size of all images in the target domain data and the source domain data. The source domain network training module trains the model on the source domain data; the input of the model is a noise vector, which is passed through a fully connected layer and then reshaped to the unified image size, and the model uses batch normalization as its normalization layer. The target domain generation network training module trains the model on the target domain data; the model receives the image to be translated and its semantic segmentation label, and the label is used for conditional normalization in the spatially adaptive normalization layer, strengthening the constraint that the image to be translated places on the generated translated image. The image augmentation module applies adaptive discriminator augmentation: images are randomly augmented before being input to the adaptive discriminator, so the adaptive discriminator judges only augmented images. The target domain discrimination network training module uses the adaptive discriminator to receive a real or fake translated image together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image is used for conditional normalization in the spatially adaptive normalization layer; and using spatially adaptive normalization in the adaptive discriminator enables it to focus on the key target regions.
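The augmentation module can be sketched as a probabilistic gate placed in front of the discriminator. This is a minimal sketch under stated assumptions: the probability 0.8 and the two augmentation types (color change, random occlusion) come from claim 6, while the function names, the color factor range, the patch size, and the equal split between the two augmentations are hypothetical choices made only for illustration.

```python
import numpy as np

AUGMENT_PROBABILITY = 0.8  # the "safe value" stated in claim 6

def color_change(image, rng):
    # Scale each color channel by a random factor (range is an assumption).
    factors = rng.uniform(0.8, 1.2, size=(1, 1, image.shape[2]))
    return np.clip(image * factors, 0.0, 1.0)

def random_occlusion(image, rng, size=64):
    # Zero out one randomly placed square patch (patch size is an assumption).
    height, width = image.shape[:2]
    top = rng.integers(0, height - size + 1)
    left = rng.integers(0, width - size + 1)
    occluded = image.copy()
    occluded[top:top + size, left:left + size, :] = 0.0
    return occluded

def augment_for_discriminator(image, rng, p=AUGMENT_PROBABILITY):
    # Applied to real and fake images alike, just before the discriminator;
    # the generator's own input is never augmented.
    if rng.random() >= p:
        return image
    augmentation = color_change if rng.random() < 0.5 else random_occlusion
    return augmentation(image, rng)

rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))  # stand-in for a real or generated image
discriminator_input = augment_for_discriminator(image, rng)
print(discriminator_input.shape)  # (256, 256, 3)
```

Keeping the gate outside the generator means the generator is still trained to produce clean images, while the discriminator sees the same augmentation distribution for real and fake inputs.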
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, this description is not intended to limit the scope of protection of the invention. Those skilled in the art should understand that various modifications and variations that can be made without inventive effort on the basis of the technical solution of the present invention remain within its scope of protection.
Claims (8)
1. An image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge, characterized by comprising the following steps:
S1, data set preparation: collecting, according to the task requirements, paired images and the semantic segmentation labels corresponding to the images, and using them as target domain data for training; collecting unlabeled images from the Internet that are similar or related to the translated images in the target domain as source domain data for training;
S2, data set preprocessing: unifying the size of all images in the target domain data and the source domain data;
S3, training the source domain generator in the source domain network model: when the model is trained on the source domain data, the input of the model is a noise vector, which is passed through a fully connected layer and then reshaped to the unified image size;
in the step of training the source domain generator in the source domain network model, the noise vector is raised by the fully connected layer to a 65536-dimensional vector, which is reshaped into a 256 × 256 matrix and fed into the convolutional layers, with batch normalization applied after each convolution; to generate an image, the source domain generator downsamples and then upsamples to produce a 256 × 256 fake source domain image; the source domain discriminator receives a real source domain image or a fake source domain image, in either case an augmented image;
S4, training the target domain generator in the target domain network model: when the model is trained on the target domain data, the input received by the model is the image to be translated and its semantic segmentation label; the semantic segmentation label is used for conditional normalization in the spatially adaptive normalization layer, strengthening the constraint that the image to be translated places on the generated translated image;
S5, image augmentation: adaptive discriminator augmentation is used; images are randomly augmented before being input to the adaptive discriminator, and the adaptive discriminator judges only augmented images;
S6, training the source domain discriminator in the source domain network model and the target domain discriminator in the target domain network model: the source domain discriminator and the target domain discriminator do not share normalization layers; when the target domain discrimination network is trained, the target domain discriminator receives a real target domain image or a synthesized image together with the semantic segmentation label of the real image, the semantic segmentation label again being used for conditional normalization in the spatially adaptive normalization layer; finally, the discrimination result is obtained.
2. The image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge according to claim 1, wherein in the data set preparation step, the collected images are divided into images to be translated, semantic segmentation labels of the images to be translated, translated images, and semantic segmentation labels of the translated images, which are placed in four corresponding folders as target domain data; images related to the translated images are collected as source domain images, either from public data sets related to the translated images or from the Internet, and are placed in a separate folder.
3. The method according to claim 1, wherein in the step of training the target domain generator in the target domain network model, because the image before translation and the image after translation do not follow the same distribution, spatially adaptive normalization does not constrain the final upsampling layer of the target domain generator and constrains only the other layers, which contain no upsampling.
4. The method according to claim 3, wherein in the step of training the target domain generator in the target domain network model, the affine parameters of the batch normalization layer of the corresponding layer are transferred into the spatially adaptive normalization, helping to strengthen the relationship between the source domain and the target domain.
5. The image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge according to claim 1, wherein in the step of training the target domain generator in the target domain network model, the target domain generator receives the image to be translated and its semantic segmentation label; the image to be translated is 256 × 256 pixels and enters the convolutional network directly, without first passing through a fully connected layer as in source domain network training; the feature map after convolution is normalized by spatially adaptive normalization, which computes an instance normalization and applies an affine transformation derived from the additionally input semantic segmentation label of the image to be translated: first, the label is convolved once, and the result is passed through two separate convolutions to obtain two tensors, which serve as the scale and the offset of the affine transformation; next, the feature map distribution is restored using the affine parameters of the batch normalization layer in the source domain network model; then the feature map is multiplied and added element-wise by the scale and the offset to obtain the final output; the spatially adaptive normalization layer is used only in the downsampling and intermediate convolution blocks, while the upsampling blocks share the batch normalization layers with the source domain, finally yielding a fake translated image of the target domain.
6. The image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge according to claim 1, wherein in the image augmentation step, adaptive discriminator augmentation is used, and the augmentation is applied to the image before it is input to the adaptive discriminator, not before the generator; before an image is sent to the adaptive discriminator, it is randomly subjected to a color change or random occlusion augmentation, with the random probability set to a safe value of 0.8; all images judged by the discriminator are therefore images augmented with probability 0.8.
7. The method according to claim 1, wherein in the step of training the source domain discriminator in the source domain network model and the target domain discriminator in the target domain network model, the adaptive discriminator receives a real translated image or a fake translated image, together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image is used for conditional normalization in the spatially adaptive normalization layer; using spatially adaptive normalization in the adaptive discriminator enables it to focus on the key target regions; finally, the discrimination result is obtained.
8. An image generation system based on a generative adversarial network guided by domain-adaptive prior knowledge, characterized by comprising a data set preparation module, a data set preprocessing module, a source domain network training module, a target domain generation network training module, an image augmentation module, and a target domain discrimination network training module, wherein the data set preparation module collects, according to the task requirements, paired images and the semantic segmentation labels corresponding to the images as target domain data for training, and collects unlabeled images from the Internet that are similar or related to the translated images in the target domain as source domain data for training; the data set preprocessing module unifies the size of all images in the target domain data and the source domain data; the source domain network training module trains the model on the source domain data, the input of the model being a noise vector that is passed through a fully connected layer and then reshaped to the unified image size; the noise vector is raised by the fully connected layer to a 65536-dimensional vector, which is reshaped into a 256 × 256 matrix and fed into the convolutional layers, with batch normalization applied after each convolution; to generate an image, the source domain generator downsamples and then upsamples to produce a 256 × 256 fake source domain image; the source domain discriminator receives a real source domain image or a fake source domain image, in either case an augmented image; the target domain generation network training module trains the model on the target domain data, the input received by the model being the image to be translated and its semantic segmentation label, the semantic segmentation label being used for conditional normalization in the spatially adaptive normalization layer, strengthening the constraint that the image to be translated places on the generated translated image; the image augmentation module applies adaptive discriminator augmentation, images being randomly augmented before being input to the adaptive discriminator, so that the adaptive discriminator judges only augmented images; the target domain discrimination network training module uses the adaptive discriminator to receive a real translated image or a fake translated image together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image is used for conditional normalization in the spatially adaptive normalization layer; and using spatially adaptive normalization in the adaptive discriminator enables it to focus on the key target regions.
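The input stage of the source domain generator recited in claims 1 and 8 (noise vector, fully connected layer, 65536-dimensional vector, 256 × 256 matrix) can be illustrated as follows. This is a sketch under stated assumptions: the claims do not give the dimension of the noise vector, so `LATENT_DIM = 16` is a hypothetical value, and the weights are random placeholders rather than trained parameters.

```python
import numpy as np

LATENT_DIM = 16                      # hypothetical; not stated in the claims
IMAGE_SIDE = 256
FLAT_DIM = IMAGE_SIDE * IMAGE_SIDE   # 65536, as recited in the claims

rng = np.random.default_rng(0)

# Placeholder fully connected layer: weight (LATENT_DIM, FLAT_DIM) and bias.
fc_weight = (0.02 * rng.standard_normal((LATENT_DIM, FLAT_DIM))).astype(np.float32)
fc_bias = np.zeros(FLAT_DIM, dtype=np.float32)

def noise_to_feature_map(z):
    # Raise the noise vector to 65536 dimensions, then reshape row by row
    # into a 256 x 256 matrix that the convolutional layers can consume.
    flat = z @ fc_weight + fc_bias
    return flat.reshape(IMAGE_SIDE, IMAGE_SIDE)

z = rng.standard_normal(LATENT_DIM).astype(np.float32)
feature_map = noise_to_feature_map(z)
print(feature_map.shape)  # (256, 256)
```

The target domain generator skips this stage because, per claim 5, its input is already a 256 × 256 image that enters the convolutional layers directly.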
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210548444.3A CN114882220B (en) | 2022-05-20 | 2022-05-20 | Domain-adaptive priori knowledge-based GAN (generative adversarial network) image generation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114882220A (en) | 2022-08-09 |
CN114882220B (en) | 2023-02-28 |
Family
ID=82678479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210548444.3A Active CN114882220B (en) | Domain-adaptive priori knowledge-based GAN (generative adversarial network) image generation method and system | 2022-05-20 | 2022-05-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114882220B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116385330B (en) * | 2023-06-06 | 2023-09-15 | 之江实验室 | Multi-mode medical image generation method and device guided by graph knowledge |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062753A (en) * | 2017-12-29 | 2018-05-22 | 重庆理工大学 | The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study |
CN110310221A (en) * | 2019-06-14 | 2019-10-08 | 大连理工大学 | A kind of multiple domain image Style Transfer method based on generation confrontation network |
CN110570433A (en) * | 2019-08-30 | 2019-12-13 | 北京影谱科技股份有限公司 | Image semantic segmentation model construction method and device based on generation countermeasure network |
CN111242157A (en) * | 2019-11-22 | 2020-06-05 | 北京理工大学 | Unsupervised domain self-adaption method combining deep attention feature and conditional opposition |
CN111597946A (en) * | 2020-05-11 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Processing method of image generator, image generation method and device |
CN113836330A (en) * | 2021-09-13 | 2021-12-24 | 清华大学深圳国际研究生院 | Image retrieval method and device based on generation antagonism automatic enhanced network |
CN113837290A (en) * | 2021-09-27 | 2021-12-24 | 上海大学 | Unsupervised unpaired image translation method based on attention generator network |
CN113888547A (en) * | 2021-09-27 | 2022-01-04 | 太原理工大学 | Non-supervision domain self-adaptive remote sensing road semantic segmentation method based on GAN network |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190707A (en) * | 2018-09-12 | 2019-01-11 | 深圳市唯特视科技有限公司 | A kind of domain adapting to image semantic segmentation method based on confrontation study |
CN111723780B (en) * | 2020-07-22 | 2023-04-18 | 浙江大学 | Directional migration method and system of cross-domain data based on high-resolution remote sensing image |
CN112150469B (en) * | 2020-09-18 | 2022-05-27 | 上海交通大学 | Laser speckle contrast image segmentation method based on unsupervised field self-adaption |
CN112308158B (en) * | 2020-11-05 | 2021-09-24 | 电子科技大学 | Multi-source field self-adaptive model and method based on partial feature alignment |
AU2020103905A4 (en) * | 2020-12-04 | 2021-02-11 | Chongqing Normal University | Unsupervised cross-domain self-adaptive medical image segmentation method based on deep adversarial learning |
CN113850813B (en) * | 2021-09-16 | 2024-05-28 | 太原理工大学 | Spatial resolution domain self-adaption based unsupervised remote sensing image semantic segmentation method |
Non-Patent Citations (2)
Title |
---|
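Context-related video anomaly detection via generative adversarial network; Daoheng L. et al.; Pattern Recognition Letters; 20220430; pp. 183-189 * |
Low-illumination image enhancement on water based on a local generative adversarial network; Liu Wen et al.; Computer Engineering; 20210208; pp. 1-10 * |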
Also Published As
Publication number | Publication date |
---|---|
CN114882220A (en) | 2022-08-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP02 | Change in the address of a patent holder | Address after: Room 1409, Floor 14, Building 1, High tech Zone Entrepreneurship Center, No. 177, Gaoxin 6th Road, Rizhao, Shandong 276801; Patentee after: Shandong Liju Robot Technology Co.,Ltd.; Address before: 276808 No.99, Yuquan 2nd Road, antonwei street, Lanshan District, Rizhao City, Shandong Province; Patentee before: Shandong Liju Robot Technology Co.,Ltd. |