CN114882220B - Domain-adaptive prior-knowledge-guided GAN image generation method and system - Google Patents


Info

Publication number
CN114882220B
CN114882220B (application CN202210548444.3A)
Authority
CN
China
Prior art keywords
image
domain
translated
adaptive
target domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210548444.3A
Other languages
Chinese (zh)
Other versions
CN114882220A (en)
Inventor
张凯
史洋
聂秀山
逯天斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Liju Robot Technology Co ltd
Original Assignee
Shandong Liju Robot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Liju Robot Technology Co ltd filed Critical Shandong Liju Robot Technology Co ltd
Priority to CN202210548444.3A priority Critical patent/CN114882220B/en
Publication of CN114882220A publication Critical patent/CN114882220A/en
Application granted granted Critical
Publication of CN114882220B publication Critical patent/CN114882220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image generation method and system based on a generative adversarial network (GAN) guided by domain-adaptive prior knowledge. The method comprises the following steps: data set preparation, data set preprocessing, training the source-domain generator in the source-domain network model, training the target-domain generator in the target-domain network model, image augmentation, and training the source-domain discriminator in the source-domain network model and the target-domain discriminator in the target-domain network model. In the proposed GAN, the generator comprises a source-domain branch and a target-domain branch. The source-domain branch learns content information from a large amount of data similar to the target domain, and the affine-parameter migration of the batch-normalization (BN) layers, together with a domain-mixing technique, transfers source-domain knowledge into the target domain, alleviating the problem of limited target-domain data. To further improve the quality of the generated images, a spatially-adaptive normalization module is introduced into the target-domain branch, injecting prior knowledge of the main target during target-domain image generation and thereby improving the accuracy of the target in the generated image.

Description

Domain-adaptive prior-knowledge-guided GAN image generation method and system
Technical Field
The invention relates to image generation technology, belongs to the fields of computer vision and artificial intelligence, and particularly relates to an image generation method and system based on a generative adversarial network guided by domain-adaptive prior knowledge.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Since the generative adversarial network (GAN) was proposed, image generation has become a research hotspot, and GAN-based image generation models have achieved satisfactory results in tasks such as style transfer, image restoration, super-resolution, and image translation.
In general, a GAN model consists of two sub-networks: a generator sub-network that produces images, and a discriminator sub-network that ensures the generated images are consistent with the target images. Training is a process in which the two sub-networks play an adversarial game and are jointly optimized. The complex structure of a GAN gives it a large number of parameters, so training a GAN usually requires a large amount of data. If the amount of data is insufficient, the quality of the generated images is low and mode collapse may occur. However, in certain tasks (e.g., medical image generation) it is difficult to collect large amounts of data, which degrades model performance.
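The adversarial game between the two sub-networks can be sketched as alternating updates. The following minimal 1-D illustration in plain NumPy uses a linear generator and a logistic discriminator; the network forms, learning rate, and target distribution are stand-ins for illustration only, not the patent's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in 1-D "networks" (illustrative, not the patent's models):
g_w = np.array([0.1, 0.0])   # generator:     x = g_w[0]*z + g_w[1]
d_w = np.array([0.1, 0.0])   # discriminator: P(real) = sigmoid(d_w[0]*x + d_w[1])

def generate(z):
    return g_w[0] * z + g_w[1]

def discriminate(x):
    return 1.0 / (1.0 + np.exp(-(d_w[0] * x + d_w[1])))

lr = 0.05
for _ in range(200):
    z = rng.standard_normal(64)
    real = rng.normal(3.0, 0.5, size=64)      # target distribution to imitate
    fake = generate(z)
    # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
    dr, df = discriminate(real), discriminate(fake)
    grad0 = np.mean((dr - 1.0) * real) + np.mean(df * fake)
    grad1 = np.mean(dr - 1.0) + np.mean(df)
    d_w -= lr * np.array([grad0, grad1])
    # Generator step: minimize -log D(fake) (non-saturating loss).
    df = discriminate(generate(z))
    coef = (df - 1.0) * d_w[0]                # dL/dfake by the chain rule
    g_w -= lr * np.array([np.mean(coef * z), np.mean(coef)])
```

After training, the generator's output distribution has been pushed toward the real data around 3.0 by the discriminator's feedback, which is the joint-optimization dynamic the paragraph above describes.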
When data are limited, transfer learning is an effective way to improve network performance. Within transfer learning, domain adaptation techniques have attracted wide attention: they align the feature representations of source-domain training data and target-domain data in a latent space. The network can then extract the same or similar features from data of both domains, so features learned from abundant source-domain data can effectively assist training on target-domain data, improving the performance of models trained with limited data.
Although a GAN can, under a suitable training strategy, generate data that follows the distribution of the training set, the quality of the generated images is hard to guarantee, and problems such as blurred content often occur. This is frequently caused by an inappropriate normalization scheme in the network. Spatially-Adaptive Normalization (SPADE) alleviates this problem to some extent: it obtains the affine parameters of the normalization layer by convolving an additional semantic segmentation label. Regions containing instances in the semantic label become more salient in the feature maps extracted by the network, which strengthens the semantics of the feature maps and makes the generated images more realistic.
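The SPADE computation just described can be sketched as follows. This is a minimal NumPy version in which the 1×1 convolutions of SPADE are written as channel-mixing tensor products; the channel counts and weight shapes are illustrative assumptions:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize each channel of x (C, H, W) over its spatial dimensions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def spade(x, seg, w_shared, w_gamma, w_beta):
    """Spatially-adaptive normalization (sketch).

    x:   (C, H, W) feature map
    seg: (K, H, W) one-hot semantic segmentation map
    w_shared: (Ch, K), w_gamma/w_beta: (C, Ch) -- 1x1-conv weights.
    The segmentation map yields per-pixel gamma/beta maps that modulate
    the normalized features, making labeled regions more salient.
    """
    hidden = np.maximum(0.0, np.tensordot(w_shared, seg, axes=([1], [0])))  # (Ch, H, W), ReLU
    gamma = np.tensordot(w_gamma, hidden, axes=([1], [0]))                  # (C, H, W)
    beta = np.tensordot(w_beta, hidden, axes=([1], [0]))                    # (C, H, W)
    return gamma * instance_norm(x) + beta
```

Because gamma and beta vary per pixel according to the segmentation map, the affine modulation differs between labeled and background regions, which is what distinguishes SPADE from an ordinary normalization layer with per-channel affine parameters.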
Disclosure of Invention
To solve the problem that paired data and data labels are difficult to obtain, the invention provides a domain-adaptive prior-knowledge-guided GAN image generation method and system. In the proposed GAN, the generator comprises two branches: a source-domain branch and a target-domain branch. The source-domain branch learns content information from a large amount of data similar to the target domain, and the affine-parameter migration of the batch-normalization layers, together with a domain-mixing technique, transfers source-domain knowledge into the target domain, alleviating the problem of limited target-domain data. To further improve the quality of the generated images, a spatially-adaptive normalization module is introduced into the target-domain branch, injecting prior knowledge of the main target during target-domain image generation and improving the accuracy of the target in the generated image. To increase the weight of important target regions during discrimination, a spatially-adaptive normalization module is also introduced into the discriminator so that it can focus on the target region.
To realize the above, the invention adopts the following technical scheme.
The invention provides an image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge, comprising the following steps:
S1, data set preparation: collecting paired images and their corresponding semantic segmentation labels according to the task requirements, to be used as target-domain data during training; collecting unlabeled images similar or related to the translated images in the target domain from the Internet, to be used as source-domain data during training;
s2, preprocessing a data set: unifying the sizes of all image data in the target domain data and the source domain data;
S3, training the source-domain generator in the source-domain network model: when the source-domain data are used to train the model, the input is a noise vector, which is processed by a fully connected layer and reshaped into a map of uniform size; at this stage the model uses batch normalization as its normalization layer.
The goal is to enable the network to generate images similar to the translated images in the target domain. When the network has this capability, it can be said to hold the content information needed to generate the translated images.
S4, training the target-domain generator in the target-domain network model: when the target-domain data are used to train the model, the inputs are the image to be translated and its semantic segmentation label; the segmentation label conditions the spatially-adaptive normalization layer and strengthens the constraint that the image to be translated imposes on the generated translated image;
S5, image augmentation: augmentation is applied on the discriminator side; images are randomly augmented before being fed into the adaptive discriminator, and the discriminator only judges the augmented images. This broadens the distribution of images and provides a larger gradient to aid training;
S6, training the source-domain discriminator in the source-domain network model and the target-domain discriminator in the target-domain network model: the two discriminators do not share normalization layers; when the target-domain discriminator is trained, it receives a real target-domain image or a synthesized image together with the semantic segmentation label of the real image, and the label is likewise used to condition the spatially-adaptive normalization layer, so that the discriminator focuses more on local targets.
Preferably, in the data set preparation step, the collected images are divided into images to be translated, semantic segmentation labels of the images to be translated, translated images, and semantic segmentation labels of the translated images, which are placed correspondingly in four folders as target-domain data; images related to the translated images are collected as source-domain images, either from public data sets related to the translated images or from the Internet, and are placed separately in a folder.
Preferably, in the data set preprocessing step, a naming rule is set for each group of four items in the target-domain data to facilitate grouping.
Preferably, in the step of training the source-domain generator, the noise vector is expanded by a fully connected layer into a 65,536-dimensional vector, which is reshaped into a 256 × 256 matrix and fed into the convolutional layers; batch normalization is applied after each convolution. The source-domain generator produces a 256 × 256 fake source-domain image through downsampling followed by upsampling. The source-domain discriminator receives either a real or a fake source-domain image, in both cases after augmentation.
Preferably, in the step of training the target-domain generator, the images before and after translation do not follow the same distribution, so the spatially-adaptive normalization layer does not constrain the last upsampling layers of the target-domain generator, only the earlier downsampling and feature-extraction layers. The last layers can thus be guided more effectively by the target-domain discriminator, so that the generated result is closer to the translated image while retaining characteristics of the image before translation. Because the source domain has a large amount of data and its batch-normalization layers learn content-invariant information of the image domain, during training the affine parameters of the corresponding batch-normalization layers are migrated into the spatially-adaptive normalization layers, helping to strengthen the connection between the source and target domains.
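The affine-parameter migration described here amounts to copying the learned batch-norm gamma/beta of each source-domain layer into the matching target-domain SPADE layer. A minimal sketch, in which the layer names and dictionary layout are illustrative assumptions:

```python
import numpy as np

def migrate_bn_affine(source_bn, target_spade):
    """Copy learned batch-norm affine parameters (gamma, beta) from
    source-domain layers into the matching SPADE layers of the
    target-domain branch (layer names and dict layout are illustrative)."""
    for name, params in source_bn.items():
        if name in target_spade:
            target_spade[name]["bn_gamma"] = params["gamma"].copy()
            target_spade[name]["bn_beta"] = params["beta"].copy()
    return target_spade

# Hypothetical parameter stores for one matching layer ("down1"):
source_bn = {"down1": {"gamma": np.full(64, 1.2), "beta": np.zeros(64)}}
target_spade = {"down1": {}}
migrate_bn_affine(source_bn, target_spade)
```

Copies (rather than shared references) keep the target branch free to fine-tune the migrated parameters without disturbing the source branch.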
Preferably, in the step of training the target-domain generator, the generator receives the image to be translated (256 × 256 pixels) and its semantic segmentation label; the image enters the convolutional network directly, without the fully connected layer used when training the source-domain network. The convolved feature maps use spatially-adaptive normalization, whose base normalization is computed in the instance-normalization manner; the affine transformation is driven by the additionally input segmentation label of the image to be translated. The input segmentation label first passes through one convolution, and the output then passes through two separate convolutions to obtain two tensors, used as the offset (beta) and scaling (gamma) of the affine transformation. The affine parameters of the source-domain batch-normalization layer are first applied to restore the feature-map distribution, and then gamma and beta multiply and add the feature map element-wise to produce the final output. The spatially-adaptive normalization layers are used only in the downsampling and intermediate convolution blocks, while the upsampling layers share batch-normalization layers with the source-domain network model, finally yielding the fake target-domain translated image.
Preferably, in the image augmentation step, augmentation is applied on the adaptive-discriminator side: the augmentation happens before the input to the discriminator, not before the generator. Before an image is sent to the discriminator, it randomly undergoes colour change or random occlusion, with a fixed random probability of 0.8 chosen as a safe value; the judged images are therefore augmented with probability 0.8.
Preferably, in the step of training the source-domain and target-domain discriminators, the adaptive discriminator receives a real or fake translated image together with the semantic segmentation label of the real translated image; the received image enters the convolutional layers for feature extraction; the segmentation label of the real translated image conditions the spatially-adaptive normalization layer; using spatially-adaptive normalization in the discriminator enables it to focus on key target regions; finally, the discrimination result is obtained.
The invention also provides an image generation system based on a generative adversarial network guided by domain-adaptive prior knowledge, comprising a data set preparation module, a data set preprocessing module, a source-domain network training module, a target-domain generation-network training module, an image augmentation module, and a target-domain discrimination-network training module. The data set preparation module collects paired images and their corresponding semantic segmentation labels according to the task requirements as target-domain data for training, and collects unlabeled images similar or related to the translated images in the target domain from the Internet as source-domain data. The data set preprocessing module unifies the sizes of all image data in the target-domain and source-domain data. The source-domain network training module trains the model with the source-domain data: the input is a noise vector, processed by a fully connected layer and reshaped to a uniform image size, and the model uses batch-normalization layers. The target-domain generation-network training module trains the model with the target-domain data: the inputs are the image to be translated and its semantic segmentation label, which conditions the spatially-adaptive normalization layer and strengthens the constraint of the image to be translated on the generated translated image. The image augmentation module randomly augments images before they are input into the adaptive discriminator, which only judges the augmented images. The target-domain discrimination-network training module employs an adaptive discriminator that receives a real or fake translated image together with the semantic segmentation label of the real translated image; the received real or synthesized image enters the convolutional layers for feature extraction; the segmentation label conditions the spatially-adaptive normalization layer, enabling the discriminator to focus on key target regions.
Compared with the prior art, the invention has the following beneficial effects:
The image generation model is built on a GAN and combines domain adaptation to help image generation with small samples. The model is trained along two branches simultaneously. The source-domain generation branch produces source-domain data, which is similar to the translated images; its task is to generate realistic images resembling the translated images, so this branch holds a large amount of the information needed to generate them. The other branch performs target-domain image translation: the model receives an image to be translated as input, and its semantic segmentation label information is injected through the spatially-adaptive normalization layer. This additional information helps establish the relationship with the translated image. Because the first branch can generate images similar to the translated images, the affine parameters of its batch-normalization layers store information about the distribution of the translated images, and migrating these affine parameters helps the second branch approach that distribution. The invention introduces domain adaptation into the GAN and improves the ability to train the network with small-sample data; in experiments, realistic images were generated using only 160 images.
Drawings
The drawings in the following description are intended to aid understanding of the embodiments and technical aspects of the invention; it should be noted that the exemplary embodiments and their descriptions only explain the invention and do not constitute an undue limitation of it.
FIG. 1 is a flow chart of a batch of data training models in the present invention.
Fig. 2 is a schematic structural diagram of a generator according to the present invention.
Fig. 3 is a schematic structural diagram of the adaptive discriminator according to the present invention.
Detailed Description
It is to be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments of the invention may be combined with each other without conflict.
The invention provides an image generation method based on a generative adversarial network guided by domain-adaptive prior knowledge, comprising the following steps:
s1, data set preparation:
Image data related to the task are collected and divided into images to be translated (images_B), their semantic segmentation labels (Labels_B), translated images (images_A), and semantic segmentation labels of the translated images (Labels_A); these are placed correspondingly in four folders and used as target-domain data. Images related to the translated images images_A are collected as source-domain images (source_images) in a separate folder, either from a public data set related to images_A or from the Internet.
S2, preprocessing a data set:
The main image region is cropped out, the image data are then resized to 256 × 256, and finally the images are normalized with all normalization parameters set to 0.5.
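A minimal sketch of this preprocessing step follows; the nearest-neighbour resize stands in for whatever interpolation the implementation actually uses, and the cropping of the main region is task-specific and omitted:

```python
import numpy as np

def nn_resize(img, size=256):
    """Nearest-neighbour resize of an (H, W, C) image to (size, size, C)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def preprocess(img):
    """Resize to 256x256 and normalize with mean=0.5, std=0.5 per channel,
    mapping [0, 1] pixel values to [-1, 1]."""
    resized = nn_resize(img.astype(np.float32), 256)
    return (resized - 0.5) / 0.5
```

Normalizing with mean and std both 0.5 centres pixel values on zero, matching the tanh-style output range commonly used by GAN generators.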
S3, training a source domain generator in the source domain network model:
when training the model on the source domain data, the model is set to the source domain mode. The source domain generator receives as input a noise vector that is scaled up through a full convolution layer to a 65536 dimensional vector, and then converts the 65536 dimensional vector to a 256 x 256 dimensional matrix, which is then input to the convolution layer, where batch regularization is used for the regularization after convolution. The generation process needs to go through a downsampling layer, a convolution layer with a constant size, and finally upsampling to generate 256 × 256 false source domain images (fake source _ images). The image received by the source domain decider is a real source domain image (real source _ images) or a synthesized fake source domain image (fake source _ images), but the image is an enhanced image, and the enhancing manner is described in detail in S5. The source domain decider tries to distinguish whether the input image is a real image or a synthesized false image. The parameters of the source domain generator and the source domain decider are alternately updated by the feedback.
S4, training a target domain generator in the target domain network model:
when the network is trained on the target domain data, the target domain generator receives images to be translated _ B and semantic segmentation labels _ B of the images to be translated, and the model is set to be in a target domain mode. The size of the image to be translated _ B is 256 pixels by 256 pixels, and the image to be translated _ B directly enters the convolutional layer network through the full connection layer without the need of training the source domain network. The feature graph after convolution uses space self-adaptive normalization, basic regularization used in the space self-adaptive normalization is a calculation mode of example regularization, affine transformation is carried out through additionally input image semantic segmentation labels _ B to be translated, firstly, the input image semantic segmentation labels _ B to be translated are output after being convoluted for one time, two tensors are obtained through two convolutions respectively, the two tensors are used as a scaling value (gamma) and an offset value (beta) in affine transformation parameters, then, the feature graph distribution is restored through affine parameters of a batch regularization layer in a source domain model, and then, element-level multiplication and addition are carried out on the feature graph through the gamma and the beta to obtain final output. The spatial adaptive Normalization layer is only used in the downsampling and intermediate volume blocks, the upsampling and source domain share a BN layer (Batch Normalization), and finally a translated image (fake images _ A) of a false target domain is obtained.
S5, image augmentation:
Conventional augmentation is not applicable to GAN training, so the invention adopts an adaptive-discriminator augmentation scheme designed specifically for GANs: the augmentation is applied before the input of the adaptive discriminator, not before the generator. Before an image is sent to the discriminator, it randomly undergoes augmentations such as colour change or random occlusion. In the original scheme the random probability is adjusted automatically according to the degree of overfitting, but that overfitting criterion does not suit this method, so a fixed safe value of 0.8 is used; the judged images are therefore augmented with probability 0.8.
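A minimal sketch of this discriminator-side augmentation follows; the jitter range and occlusion-patch size are illustrative choices, and only the fixed probability of 0.8 comes from the text:

```python
import numpy as np

def augment_for_discriminator(img, p=0.8, rng=None):
    """Randomly apply one augmentation (colour change or random occlusion)
    with probability p before the image reaches the discriminator; the
    generator's input is never augmented."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.copy()                                   # img: (C, H, W) in [0, 1]
    if rng.random() < p:
        if rng.random() < 0.5:
            # Colour change: per-channel brightness scaling (range assumed).
            scale = rng.uniform(0.8, 1.2, size=(img.shape[0], 1, 1))
            out = np.clip(out * scale, 0.0, 1.0)
        else:
            # Random occlusion: zero out a random patch (size assumed).
            c, h, w = out.shape
            ph, pw = h // 4, w // 4
            top = rng.integers(0, h - ph + 1)
            left = rng.integers(0, w - pw + 1)
            out[:, top:top + ph, left:left + pw] = 0.0
    return out
```

Because the same transform is applied to both real and fake images right before discrimination, the discriminator never sees an un-augmented distribution it could overfit to, which is the rationale for placing the augmentation on the discriminator side only.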
S6, training a source domain judger in the source domain network model and a target domain judger in the target domain network model:
the decider receives a real translated image (real images _ a) or a fake translated image (fake images _ B), and a semantic segmentation label Labels _ a of the real translated image. The model is also set to the target domain mode. And the received real image or false image enters the convolutional layer to extract features. Labels _ A is used as a spatial adaptive normalization layer for conditional regularization, and the calculation process is as described in S4. The use of spatially adaptive normalization in the decider may enable the decider to focus on critical target regions. And finally, whether the input image is a real translated image or a false translated image is obtained.
The invention also provides an image generation system based on a generative adversarial network guided by domain-adaptive prior knowledge, comprising a data set preparation module, a data set preprocessing module, a source-domain network training module, a target-domain generation-network training module, an image augmentation module, and a target-domain discrimination-network training module. The data set preparation module collects paired images and their corresponding semantic segmentation labels according to the task requirements as target-domain data for training, and collects unlabeled images similar or related to the translated images in the target domain from the Internet as source-domain data. The data set preprocessing module unifies the sizes of all image data in the target-domain and source-domain data. The source-domain network training module trains the model with the source-domain data: the input is a noise vector, processed by a fully connected layer and reshaped to a uniform image size, and the model uses batch normalization as its normalization layer. The target-domain generation-network training module trains the model with the target-domain data: the inputs are the image to be translated and its semantic segmentation label, which conditions the spatially-adaptive normalization layer and strengthens the constraint of the image to be translated on the generated translated image. The image augmentation module randomly augments images before they are input into the adaptive discriminator, which only judges the augmented images. The target-domain discrimination-network training module employs an adaptive discriminator that receives a real or fake translated image together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the segmentation label conditions the spatially-adaptive normalization layer, enabling the discriminator to focus on key target regions.
The above is only a preferred embodiment of the present invention and is not intended to limit it; various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in its protection scope.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they do not limit the scope of the present invention; those skilled in the art should understand that various modifications and variations made, without inventive effort, on the basis of the technical solution of the present invention still fall within its protection scope.

Claims (8)

1. An image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network, characterized by comprising the following steps:
S1, data set preparation: collecting, according to task requirements, paired images and the semantic segmentation labels corresponding to the images as target domain data for training; collecting unlabeled images similar or related to the translated images in the target domain from the Internet as source domain data for training;
S2, data set preprocessing: unifying the sizes of all image data in the target domain data and the source domain data;
S3, training a source domain generator in the source domain network model: when the model is trained with the source domain data, its input is a noise vector; after the noise vector is processed by a fully connected layer, the new vector is recombined into the unified image size;
in the step of training the source domain generator in the source domain network model, the noise vector is raised by the fully connected layer to a 65536-dimensional vector, which is reshaped into a 256 x 256 matrix and input into the convolutional layers, with batch regularization layers applied after each convolution; to generate an image, the source domain generator first downsamples and then upsamples to produce a 256 x 256 fake source domain image; the image received by the source domain decider is a real or fake source domain image, but always in augmented form;
S4, training a target domain generator in the target domain network model: when the model is trained with the target domain data, it receives as input the image to be translated and the semantic segmentation label of the image to be translated; the semantic segmentation label conditions the spatially adaptive normalization layer, strengthening the constraint of the image to be translated on the generated translated image;
S5, image augmentation: enhancement is performed with an adaptive decider; images are randomly augmented before being input to the adaptive decider, and the adaptive decider judges only augmented images;
S6, training a source domain decider in the source domain network model and a target domain decider in the target domain network model: the source domain decider and the target domain decider do not share regularization layers; when training the target domain decision network, the target domain decider receives a target domain real image or a synthetic image together with the semantic segmentation label of the real image, and the semantic segmentation label likewise conditions the spatially adaptive normalization layer; a judgment result is finally obtained.
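The input stem of the source domain generator in step S3 can be sketched as follows. The dimensions (65536 = 256 × 256, the 256 × 256 reshape, batch regularization after convolution) follow the claim; the noise dimensionality, channel count, and layer names are illustrative assumptions.

```python
# Minimal sketch of the source domain generator's input stem from S3:
# noise vector -> fully connected layer -> 65536-dim vector ->
# 256 x 256 single-channel map -> convolution -> batch regularization.
# noise_dim=128 and the 64-channel width are assumptions for illustration.
import torch
import torch.nn as nn

class SourceGeneratorStem(nn.Module):
    def __init__(self, noise_dim=128):
        super().__init__()
        self.fc = nn.Linear(noise_dim, 65536)       # 65536 = 256 * 256
        self.conv = nn.Conv2d(1, 64, 3, padding=1)  # first convolutional layer
        self.bn = nn.BatchNorm2d(64)                # batch regularization layer

    def forward(self, z):
        # Recombine the 65536-dim vector into the unified 256 x 256 image size.
        x = self.fc(z).view(-1, 1, 256, 256)
        return torch.relu(self.bn(self.conv(x)))

stem = SourceGeneratorStem()
feat = stem(torch.randn(2, 128))   # feat has shape (2, 64, 256, 256)
```

The subsequent downsampling/upsampling path that yields the 256 × 256 fake source domain image is omitted here; the sketch only shows how a 1-D noise vector becomes a 2-D feature map suitable for convolution.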
2. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 1, wherein in the data set preparation step the collected images are divided into images to be translated, semantic segmentation labels of images to be translated, translated images and semantic segmentation labels of translated images, placed correspondingly in four folders as target domain data; images associated with the translated images, either from public data sets or from the Internet, are collected as source domain images and placed separately in a folder.
3. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 1, wherein in the step of training the target domain generator, the image before translation and the image after translation do not follow the same distribution, so spatially adaptive normalization does not constrain the last upsampling layer of the target domain generator and constrains only the other, non-upsampling layers.
4. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 3, wherein in the step of training the target domain generator, the affine parameters of the corresponding batch regularization layer are migrated into the spatially adaptive normalization to help strengthen the relationship between the source domain and the target domain.
5. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 1, wherein in the step of training the target domain generator, the target domain generator receives an image to be translated, sized 256 x 256 pixels, together with its semantic segmentation label; the image enters the convolutional network directly, without passing through a fully connected layer as when training the source domain network; the convolved feature map is normalized by spatially adaptive normalization, which computes an instance regularization and applies an affine transformation derived from the additionally input semantic segmentation label: the label is first passed through one convolution, then through two separate convolutions yielding two tensors used as the scaling and offset in the affine transformation parameters; the feature map distribution is then restored with the affine parameters of the batch regularization layer in the source domain network model, and the scaling and offset are applied to the feature map by element-level multiplication and addition to obtain the final output; the spatially adaptive normalization layer is used only in the downsampling and intermediate convolution blocks, while the upsampling layers share batch regularization layers with the source domain, finally yielding a fake target domain translated image.
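The computation in claim 5 can be written out as a short function. This is a hedged sketch: the argument names, channel counts, and the exact order of the distribution-restoring affine step are assumptions; what follows the claim is the sequence instance-normalize, restore with the source network's batch-regularization affine parameters, then apply the label-derived scaling and offset element-wise.

```python
# Sketch of the spatially adaptive normalization computation of claim 5.
# All argument names are illustrative assumptions.
import torch
import torch.nn.functional as F

def spatial_adaptive_norm(feat, label, src_gamma, src_beta,
                          conv_shared, conv_scale, conv_shift):
    # Instance regularization of the convolved feature map (no learned affine).
    x = F.instance_norm(feat)
    # Restore the feature distribution using affine parameters migrated from
    # the source domain network's batch regularization layer (claim 4).
    x = x * src_gamma.view(1, -1, 1, 1) + src_beta.view(1, -1, 1, 1)
    # One shared convolution on the label, then two convolutions producing
    # the scaling and offset tensors of the affine transformation.
    h = F.relu(conv_shared(label))
    scale, shift = conv_scale(h), conv_shift(h)
    # Element-level multiplication and addition give the final output.
    return x * scale + shift
```

Note the label-derived `scale` and `shift` are full tensors, not scalars, so the modulation varies per spatial location and per channel, which is what makes the normalization "spatially adaptive."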
6. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 1, wherein in the image augmentation step, enhancement is performed with an adaptive decider, and the augmentation of the image occurs before input to the adaptive decider, not before the adaptive generator; before an image is sent to the adaptive decider, it randomly undergoes an augmentation of color change or random occlusion, with the random probability set to a safe value of 0.8; the judged images are therefore augmented with probability 0.8.
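The augmentation policy of claim 6 amounts to a simple pre-decider transform. The sketch below is illustrative: the specific color-jitter range, the occlusion-patch size, and the even split between the two augmentation modes are assumptions; only the fixed 0.8 probability and the color-change/random-occlusion choice come from the claim.

```python
# Illustrative pre-decider augmentation from claim 6: with probability 0.8
# the image receives either a color change or a random occlusion before it
# is judged; the generator's output itself is never augmented.
import random
import torch

def augment_before_decider(img, p=0.8):
    """img: (C, H, W) tensor in [0, 1]; returns a possibly-augmented copy."""
    if random.random() >= p:
        return img                         # pass through unchanged (prob 0.2)
    if random.random() < 0.5:              # color change: per-channel gain
        gain = torch.empty(img.shape[0], 1, 1).uniform_(0.8, 1.2)
        return (img * gain).clamp(0.0, 1.0)
    out = img.clone()                      # random occlusion: zero a patch
    c, h, w = out.shape
    y, x = random.randrange(h // 2), random.randrange(w // 2)
    out[:, y:y + h // 4, x:x + w // 4] = 0.0
    return out
```

Applying the transform only on the decider's input path (to both real and fake images) is what lets the decider see a consistent, augmented distribution without the augmentation leaking into the generated images themselves.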
7. The image generation method based on a domain-adaptive prior-knowledge-guided generative adversarial network according to claim 1, wherein in the step of training the source domain decider and the target domain decider, the adaptive decider receives a real translated image or a fake translated image together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image conditions the spatially adaptive normalization layer; using spatially adaptive normalization in the adaptive decider allows it to focus on the critical target regions; a judgment result is finally obtained.
8. An image generation system based on a domain-adaptive prior-knowledge-guided generative adversarial network, characterized by comprising a data set preparation module, a data set preprocessing module, a source domain network training module, a target domain generation network training module, an image augmentation module and a target domain decision network training module, wherein the data set preparation module collects, according to task requirements, paired images and the semantic segmentation labels corresponding to the images as target domain data for training, and collects unlabeled images similar or related to the translated images in the target domain from the Internet as source domain data for training; the data set preprocessing module unifies the sizes of all image data in the target domain data and the source domain data; the source domain network training module trains the model with the source domain data, the model's input being a noise vector which, after processing by a fully connected layer, is recombined into the unified image size; the noise vector is raised by the fully connected layer to a 65536-dimensional vector, which is reshaped into a 256 x 256 matrix and input into the convolutional layers, with batch regularization layers applied after each convolution; to generate an image, the source domain generator first downsamples and then upsamples to produce a 256 x 256 fake source domain image; the image received by the source domain decider is a real or fake source domain image, but always in augmented form; the target domain generation network training module trains the model with the target domain data, the model receiving as input the image to be translated and the semantic segmentation label of the image to be translated, the semantic segmentation label conditioning the spatially adaptive normalization layer and strengthening the constraint of the image to be translated on the generated translated image; the image augmentation module performs enhancement with an adaptive decider, images being randomly augmented before input to the adaptive decider, which judges only augmented images; the target domain decision network training module employs an adaptive decider that receives a real translated image or a fake translated image together with the semantic segmentation label of the real translated image; the received real or fake image enters the convolutional layers for feature extraction; the semantic segmentation label of the real translated image conditions the spatially adaptive normalization layer; using spatially adaptive normalization in the adaptive decider allows it to focus on the critical target regions.
CN202210548444.3A 2022-05-20 2022-05-20 Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system Active CN114882220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210548444.3A CN114882220B (en) 2022-05-20 2022-05-20 Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system


Publications (2)

Publication Number Publication Date
CN114882220A CN114882220A (en) 2022-08-09
CN114882220B true CN114882220B (en) 2023-02-28

Family

ID=82678479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210548444.3A Active CN114882220B (en) 2022-05-20 2022-05-20 Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system

Country Status (1)

Country Link
CN (1) CN114882220B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116385330B (en) * 2023-06-06 2023-09-15 之江实验室 Multi-mode medical image generation method and device guided by graph knowledge

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062753A (en) * 2017-12-29 2018-05-22 Chongqing University of Technology Unsupervised domain-adaptive brain tumor semantic segmentation method based on deep adversarial learning
CN110310221A (en) * 2019-06-14 2019-10-08 Dalian University of Technology Multi-domain image style transfer method based on generative adversarial networks
CN110570433A (en) * 2019-08-30 2019-12-13 Beijing Moviebook Technology Co., Ltd. Image semantic segmentation model construction method and device based on generative adversarial networks
CN111242157A (en) * 2019-11-22 2020-06-05 Beijing Institute of Technology Unsupervised domain adaptation method combining deep attention features and conditional adversarial learning
CN111597946A (en) * 2020-05-11 2020-08-28 Tencent Technology (Shenzhen) Co., Ltd. Image generator processing method, image generation method and device
CN113836330A (en) * 2021-09-13 2021-12-24 Tsinghua Shenzhen International Graduate School Image retrieval method and device based on a generative adversarial automatic augmentation network
CN113837290A (en) * 2021-09-27 2021-12-24 Shanghai University Unsupervised unpaired image translation method based on an attention generator network
CN113888547A (en) * 2021-09-27 2022-01-04 Taiyuan University of Technology Unsupervised domain-adaptive remote sensing road semantic segmentation method based on GAN networks

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190707A (en) * 2018-09-12 2019-01-11 Shenzhen Weiteshi Technology Co., Ltd. Domain-adaptive image semantic segmentation method based on adversarial learning
CN111723780B (en) * 2020-07-22 2023-04-18 Zhejiang University Directional migration method and system for cross-domain data based on high-resolution remote sensing images
CN112150469B (en) * 2020-09-18 2022-05-27 Shanghai Jiao Tong University Laser speckle contrast image segmentation method based on unsupervised domain adaptation
CN112308158B (en) * 2020-11-05 2021-09-24 University of Electronic Science and Technology of China Multi-source domain adaptation model and method based on partial feature alignment
AU2020103905A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Unsupervised cross-domain self-adaptive medical image segmentation method based on deep adversarial learning
CN113850813B (en) * 2021-09-16 2024-05-28 Taiyuan University of Technology Unsupervised remote sensing image semantic segmentation method based on spatial-resolution domain adaptation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Context-related video anomaly detection via generative adversarial network; Daoheng L. et al.; Pattern Recognition Letters; 2022-04-30; pp. 183-189 *
Low-illumination water-surface image enhancement based on local generative adversarial networks; Liu Wen et al.; Computer Engineering; 2021-02-08; pp. 1-10 *


Similar Documents

Publication Publication Date Title
CN111311518B (en) Image denoising method and device based on multi-scale mixed attention residual error network
Bulat et al. To learn image super-resolution, use a gan to learn how to do image degradation first
CN109919209B (en) Domain self-adaptive deep learning method and readable storage medium
CN111583210B (en) Automatic breast cancer image identification method based on convolutional neural network model integration
Ye et al. Underwater image enhancement using stacked generative adversarial networks
CN116309648A (en) Medical image segmentation model construction method based on multi-attention fusion
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
Cong et al. Discrete haze level dehazing network
CN110415176A (en) Text image super-resolution method
CN114882220B (en) Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system
CN113837942A (en) Super-resolution image generation method, device, equipment and storage medium based on SRGAN
Zheng et al. T-net: Deep stacked scale-iteration network for image dehazing
CN115713462A (en) Super-resolution model training method, image recognition method, device and equipment
CN117635771A (en) Scene text editing method and device based on semi-supervised contrast learning
CN113763300B (en) Multi-focusing image fusion method combining depth context and convolution conditional random field
Zhang et al. Dynamic multi-scale network for dual-pixel images defocus deblurring with transformer
Liu et al. Facial image inpainting using multi-level generative network
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN116958766B (en) Image processing method and computer readable storage medium
CN115731214A (en) Medical image segmentation method and device based on artificial intelligence
CN116703719A (en) Face super-resolution reconstruction device and method based on face 3D priori information
CN116342385A (en) Training method and device for text image super-resolution network and storage medium
CN113421212B (en) Medical image enhancement method, device, equipment and medium
Deng et al. UCT‐GAN: underwater image colour transfer generative adversarial network
Wang et al. Automated segmentation of intervertebral disc using fully dilated separable deep neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: Room 1409, Floor 14, Building 1, High tech Zone Entrepreneurship Center, No. 177, Gaoxin 6th Road, Rizhao, Shandong 276801

Patentee after: Shandong Liju Robot Technology Co.,Ltd.

Address before: 276808 No.99, Yuquan 2nd Road, antonwei street, Lanshan District, Rizhao City, Shandong Province

Patentee before: Shandong Liju Robot Technology Co.,Ltd.