CN112529772A - Unsupervised image conversion method under zero sample setting - Google Patents

Unsupervised image conversion method under zero sample setting

Info

Publication number
CN112529772A
Authority
CN
China
Prior art keywords
attribute
image
unseen
space
visual
Prior art date
Legal status
Granted
Application number
CN202011501620.5A
Other languages
Chinese (zh)
Other versions
CN112529772B (en)
Inventor
陈元祺
余晓铭
刘杉
李革
Current Assignee
Institute Of Intelligent Video Audio Technology Longgang Shenzhen
Original Assignee
Institute Of Intelligent Video Audio Technology Longgang Shenzhen
Priority date
Filing date
Publication date
Application filed by Institute Of Intelligent Video Audio Technology Longgang Shenzhen
Priority to CN202011501620.5A
Priority claimed from CN202011501620.5A
Publication of CN112529772A
Application granted
Publication of CN112529772B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An unsupervised image conversion method under the zero sample setting comprises applying an attribute-visual correlation constraint and extending the attribute space with unseen class attributes, wherein the two steps are carried out synchronously. By applying the attribute-visual correlation constraint and extending the attribute space with unseen attributes, the model is encouraged to fully exploit the attribute features of each category, thereby achieving unsupervised image conversion under the zero sample setting.

Description

Unsupervised image conversion method under zero sample setting
Technical Field
The invention relates to the field of image generation and image conversion, in particular to an unsupervised image conversion method under the zero sample setting.
Background
In recent years, with the development of generative adversarial networks (GANs), generative models have received increasing attention. On the one hand, GAN-based generative models achieve striking results, producing high-resolution images realistic enough to pass for real ones; on the other hand, as the famous physicist Richard Feynman put it, "What I cannot create, I do not understand." Although recent machine learning models excel at tasks such as image classification, the success of these applications does not show that we truly understand images or have truly achieved intelligence. The ability to generate images is therefore significant for a deeper understanding of images.
Image-to-image translation is a branch of generative modeling; it belongs to the family of conditional generative models, with the input image serving as the condition. It addresses how to convert an image from one domain into the corresponding image in another domain, for example converting an image taken in the daytime into a night scene while keeping the scene unchanged. This is a challenging task. First, the output of the model should be both realistic and carry the characteristics of the target domain it is converted to; second, the model should preserve the individual characteristics of the input, rather than producing a completely different picture after conversion. The failure described in the second point is also called mode collapse: the outputs collapse into a few modes, and the network produces the same single result even when given different inputs.
The above problems can be solved well in the supervised case. When paired datasets are available (e.g., a daytime image and a nighttime image of the same scene), the image converted from the source domain to the target domain can be constrained to approximate its ground-truth counterpart. However, in many real-world scenarios, paired samples cannot be obtained at low cost, or do not exist at all. In this case, how to train an image conversion model without supervision is a difficulty. Furthermore, the mode collapse problem is particularly acute when some classes have few samples, or even no samples at all. In summary, unsupervised image conversion under the zero sample setting is a challenging problem.
Disclosure of Invention
The invention provides an unsupervised image conversion method under the zero sample setting, which realizes unsupervised image conversion with zero samples.
The technical scheme of the invention is as follows:
the unsupervised image conversion method under the zero sample setting comprises the steps of applying attribute-visual relevance constraint and extending an attribute space by using an unseen attribute, wherein the application of the attribute-visual relevance constraint and the extension of the attribute space by using the unseen attribute are synchronously carried out.
Preferably, in the above unsupervised image conversion method under the zero sample setting, applying the attribute-visual correlation constraint comprises the steps of: sampling two seen class attributes a_m and a_n from the attribute space, and computing the correlation s(a_m, a_n) between them; computing, according to the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen class attributes a_m and a_n, and computing the correlation s(w_m, w_n) between them; and applying the correlation constraint: for the two seen class attributes a_m and a_n and the visual features w_m and w_n they determine, imposing the constraint regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2.
Preferably, in the unsupervised image conversion method under the zero sample setting, extending the attribute space with unseen attributes comprises the following steps: sampling an unseen class attribute a_u and an input image x_i, and generating an image x_t with the generator; constraining the generated image x_t, through an attribute loss function, to exhibit the features of the unseen class attribute a_u; and performing attribute regression with the discriminator to extend the attribute space.
According to the technical scheme of the invention, the beneficial effects are as follows:
the method of the invention promotes the model to fully utilize the attribute characteristics of the category by applying the attribute-visual relevance constraint and utilizing the unseen attribute to expand the attribute space, thereby realizing the unsupervised image conversion under the zero sample.
For a better understanding of the concept, working principle, and effects of the invention, the invention is described in detail below through specific embodiments with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention.
Fig. 1 is an overall framework diagram of the unsupervised image conversion method under the zero sample setting of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without any creative effort, shall fall within the protection scope of the present invention.
The image conversion model involved in the unsupervised image conversion method under the zero sample setting of the present invention is based on a generative adversarial network and comprises a generator and a discriminator (shown in Fig. 1). Training a generative adversarial network is a minimax game: the goal of the generator is to produce samples realistic enough to fool the discriminator, while the discriminator tries to distinguish samples from the true data distribution from generated ones. When training reaches a stable stage, the generator is able to produce higher-quality samples, which the discriminator also finds difficult to distinguish from real samples.
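To make the adversarial training concrete, the following is a minimal sketch of one training step, assuming a PyTorch implementation with a conditional generator G(image, attribute) and a discriminator D; the binary cross-entropy loss and all names here are illustrative assumptions, not the invention's exact formulation.

```python
# Minimal, illustrative GAN training step (PyTorch). G, D, the optimizers and
# the data batch are assumed to be defined elsewhere; the adversarial loss is
# an assumption, not the patent's exact loss.
import torch
import torch.nn.functional as F

def gan_step(G, D, real_images, attrs, opt_g, opt_d):
    # Discriminator step: score real samples as real, generated samples as fake.
    fake_images = G(real_images, attrs).detach()
    d_real, d_fake = D(real_images), D(fake_images)
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make the discriminator score generated samples as real.
    d_fake = D(G(real_images, attrs))
    loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```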
Under the zero sample setting, image samples are missing for a subset of the classes, referred to as the unseen classes. For unseen classes, the only available knowledge is the attribute features each of them holds. A zero-sample image conversion model takes an image to be converted and a class attribute as input, and converts the image into the target class. The key to the problem is how to transfer knowledge from the seen classes to the unseen classes, and how to make the model perform the conversion conditioned on the class attributes.
The working principle of the invention is as follows: by applying the attribute-visual correlation constraint and extending the attribute space with unseen attributes, the model is encouraged to fully exploit class attribute features, thereby achieving unsupervised image conversion under the zero sample setting.
The attribute-visual correlation constraint requires that the correlation of a pair of attributes in the attribute space and the correlation of the pair of images converted according to those attributes be consistent. Since no image samples of the unseen classes are available in the training phase, the attribute vectors are used to provide effective guidance for image conversion. Introducing the attribute-visual correlation constraint guides the learned visual space to mirror the structure of the attribute space and facilitates image conversion for unseen classes.
Extending the attribute space with unseen attributes means using the attributes of unseen classes during training. Although image samples of the unseen classes cannot be obtained, their class attributes are available. In the training phase, an unseen attribute is fed into the image conversion model together with the input image, and the model is required to recover the unseen attribute from the converted image. This strategy mitigates the mapping bias in zero-sample image conversion, namely the tendency of the conversion model to map unseen classes onto similar seen classes, thereby narrowing the conversion performance gap between seen and unseen classes.
Fig. 1 is the overall framework diagram of the unsupervised image conversion method under the zero sample setting of the present invention. As shown in the figure, the method comprises two strategies, namely Strategy 1: applying the attribute-visual correlation constraint, and Strategy 2: extending the attribute space with unseen attributes, where the two strategies are carried out synchronously.
Applying the attribute-visual correlation constraint comprises the following steps (a minimal code sketch is given after this list):
1) Sample two seen class attributes a_m and a_n from the attribute space (i.e., attribute 1 and attribute 2 in the attribute space of Fig. 1), and compute the correlation s(a_m, a_n) between them.
2) Following the adaptive instance normalization (AdaIN) method from style transfer, compute the visual features w_m and w_n of the visual space determined by a_m and a_n (corresponding to image 1 and image 2 in the visual space of Fig. 1), and compute the correlation s(w_m, w_n) between them.
3) Apply the correlation constraint: for the two seen class attributes a_m and a_n and the visual features w_m and w_n they determine, impose the constraint regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2.
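As referenced above, the following minimal sketch illustrates the correlation constraint. The text does not specify the correlation measure s(·,·) (its formulas appear only as images in the original publication), so cosine similarity is assumed here purely for illustration; `mapping` stands for the assumed AdaIN-style network that maps a class attribute to a visual feature.

```python
# Hedged sketch of the attribute-visual correlation regularizer L_reg.
import torch
import torch.nn.functional as F

def correlation_regularizer(a_m, a_n, mapping):
    # Correlation of the two seen class attributes in attribute space
    # (cosine similarity is an assumption; the patent does not give the formula).
    s_attr = F.cosine_similarity(a_m, a_n, dim=-1)
    # Visual features determined by the two attributes (AdaIN-style mapping).
    w_m, w_n = mapping(a_m), mapping(a_n)
    s_vis = F.cosine_similarity(w_m, w_n, dim=-1)
    # L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2
    return ((s_attr - s_vis) ** 2).sum()
```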
Extending the attribute space with unseen attributes comprises the following steps (a minimal code sketch is given after this list):
1) Sample an unseen class attribute a_u and an input image x_i, and generate an image x_t with the generator.
2) Through an attribute loss function, constrain the generated image x_t to exhibit the features of the unseen class attribute a_u.
3) Perform attribute regression with the discriminator to extend the attribute space.
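As referenced above, the following minimal sketch illustrates this strategy. The exact attribute loss is given only as an image in the original publication, so a plain L2 attribute-regression loss and a hypothetical discriminator attribute head `regress_attr` are assumed here for illustration.

```python
# Hedged sketch of extending the attribute space with an unseen attribute.
def unseen_attribute_step(G, D, x_i, a_u):
    # 1) Convert the input image x_i toward the unseen class attribute a_u.
    x_t = G(x_i, a_u)
    # 2)-3) Constrain x_t to carry the unseen attribute: a (hypothetical)
    # attribute-regression head of the discriminator should recover a_u from x_t.
    a_pred = D.regress_attr(x_t)
    loss_attr = ((a_pred - a_u) ** 2).mean()  # assumed L2 regression loss
    return x_t, loss_attr
```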
Compared with existing image conversion methods, the method provided by the invention achieves better conversion accuracy and better generation quality. The two concepts of conversion accuracy and generation quality in image conversion, together with their related evaluation metrics, are explained below.
Conversion accuracy: measures whether the converted image belongs to the target domain. Generally, a pre-trained classifier is used to estimate the probability that the converted image belongs to the target domain. The evaluation metrics include Top-1 and Top-5 classification accuracy: for a given picture, the prediction is counted as correct if the correct class is among the top one (or top five) classes ranked by predicted probability.
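For concreteness, a minimal sketch of Top-k classification accuracy follows, assuming PyTorch; the names `logits` (pre-trained classifier scores for the converted images) and `targets` (target-domain labels) are illustrative.

```python
# Top-k classification accuracy: correct if the target class is among the
# k highest-scoring classes.
import torch

def topk_accuracy(logits, targets, k=5):
    topk = logits.topk(k, dim=-1).indices            # (N, k) class indices
    hits = (topk == targets.unsqueeze(-1)).any(dim=-1)
    return hits.float().mean().item()
```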
Generation quality: measures whether the converted image has high image quality. Evaluation metrics are divided into objective and subjective ones. The Fréchet Inception Distance (FID) is a commonly used objective measure of generation quality. To compute the FID of an image conversion model, a batch of converted images is first generated with the model, and a batch of real images is sampled from the dataset for comparison. Features are then extracted from both batches, their statistics are computed, and the difference between the distributions of the generated and real images, measured from these statistics, serves as the assessment of generation quality. For subjective evaluation, the conversion results of several models are typically presented to subjects at the same time, and each subject is asked to pick the image with the highest quality. After a large number of trials, the model with the higher selection rate is judged to have the higher generation quality.
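The objective FID computation described above can be sketched as follows, assuming the standard Fréchet distance between two Gaussians fitted to the extracted features; extracting `feats_real` and `feats_fake` with a pre-trained feature network is assumed to happen elsewhere.

```python
# FID from feature statistics: ||mu1 - mu2||^2 + Tr(C1 + C2 - 2*sqrt(C1*C2)).
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    # Matrix square root of the covariance product; keep the real part to
    # discard small imaginary components caused by numerical error.
    covmean = linalg.sqrtm(c1 @ c2).real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(c1 + c2 - 2.0 * covmean))
```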
Table 1 compares the objective metrics of the present invention with those of other algorithms, including the conversion accuracy for seen and unseen classes and the generation quality metric FID. Compared with existing models (where FUNIT-1 and FUNIT-5 are not under the zero sample setting, making the comparison unfair in their favor, while StarGAN is under the zero sample setting), the method of the present invention obtains better results on both the CUB and FLO datasets, and the improvement is especially remarkable for the unseen classes.
Table 1. Comparison of the present invention with other algorithms on objective metrics.
(The contents of Table 1 are provided only as an image in the original publication.)
As shown in Table 2, for the subjective evaluation, when the conversion results of several models were presented to subjects at the same time, the selection rate of the present invention was much higher than that of StarGAN, which is likewise under the zero sample setting; the present invention also shows competitive results against FUNIT-1 and FUNIT-5, which operate under a few-shot setting.
Table 2. Comparison of the present invention with other algorithms on subjective metrics (selection rate).
Model          CUB dataset    FLO dataset
FUNIT-1        27.8%          21.8%
FUNIT-5        34.2%          27.8%
StarGAN        7.8%           14.3%
The invention  30.2%          36.1%
The present invention is not limited to the above preferred embodiments, and any modifications, equivalent replacements, improvements, etc. within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. An unsupervised image conversion method under the zero sample setting, characterized by comprising applying an attribute-visual correlation constraint and extending the attribute space with unseen attributes, wherein applying the attribute-visual correlation constraint and extending the attribute space with unseen attributes are carried out synchronously.
2. The unsupervised image conversion method under the zero sample setting of claim 1, wherein said applying the attribute-visual correlation constraint comprises the steps of:
sampling two seen class attributes a_m and a_n from the attribute space, and computing the correlation s(a_m, a_n) between them;
computing, according to the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen class attributes a_m and a_n, and computing the correlation s(w_m, w_n) between them; and
applying the correlation constraint: for the two seen class attributes a_m and a_n and the visual features w_m and w_n they determine, imposing the constraint regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2.
3. The unsupervised image conversion method under the zero sample setting according to claim 1, wherein said extending the attribute space with unseen attributes comprises the steps of:
sample not found class attribute auAnd an input image xiGenerating an image x with a generatort
Passing loss function
Figure FDA0002843778760000013
Constraining the image xtMake it have the unseen category attribute auThe features of (1); and
and performing attribute regression by using a discriminator to expand an attribute space.
CN202011501620.5A 2020-12-18 Unsupervised image conversion method under zero sample setting Active CN112529772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011501620.5A CN112529772B (en) 2020-12-18 Unsupervised image conversion method under zero sample setting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011501620.5A CN112529772B (en) 2020-12-18 Unsupervised image conversion method under zero sample setting

Publications (2)

Publication Number Publication Date
CN112529772A 2021-03-19
CN112529772B 2024-05-28




Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740888A (en) * 2016-01-26 2016-07-06 Tianjin University Joint embedding model for zero-shot learning
CN109359670A (en) * 2018-09-18 2019-02-19 Beijing University of Technology Automatic detection method for individual association strength based on traffic big data
CN109598279A (en) * 2018-09-27 2019-04-09 Tianjin University Zero-shot learning method based on auto-encoding adversarial generative networks
CN109582960A (en) * 2018-11-27 2019-04-05 Shanghai Jiao Tong University Zero-shot learning method based on structured association semantic embedding
CN110097095A (en) * 2019-04-15 2019-08-06 Tianjin University Zero-shot classification method based on multi-view generative adversarial networks
CN110163796A (en) * 2019-05-29 2019-08-23 North Minzu University Unsupervised multi-modal adversarial auto-encoding image generation method and framework
CN110795585A (en) * 2019-11-12 2020-02-14 Fuzhou University Zero-shot image classification model based on generative adversarial networks and method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOU Yubing: "Research on Image Style Transfer Methods", China New Telecommunications, no. 17, 5 September 2020 (2020-09-05) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI769820B (en) * 2021-05-19 2022-07-01 Hon Hai Precision Industry Co., Ltd. Method for optimizing a generative adversarial network and electronic equipment

Similar Documents

Publication Publication Date Title
Cheng et al. An analysis of generative adversarial networks and variants for image synthesis on MNIST dataset
Wang et al. Minegan: effective knowledge transfer from gans to target domains with few images
Hoshen et al. Non-adversarial image synthesis with generative latent nearest neighbors
WO2020216033A1 (en) Data processing method and device for facial image generation, and medium
Jolicoeur-Martineau On relativistic f-divergences
Yuan et al. Neighborloss: a loss function considering spatial correlation for semantic segmentation of remote sensing image
CN113255895A (en) Graph neural network representation learning-based structure graph alignment method and multi-graph joint data mining method
Teo et al. Fair generative models via transfer learning
CN113505855A (en) Training method for anti-attack model
Shariff et al. Artificial (or) fake human face generator using generative adversarial network (gan) machine learning model
Ning et al. Continuous learning of face attribute synthesis
CN112529772A (en) Unsupervised image conversion method under zero sample setting
Zhang et al. Improved procedures for training primal wasserstein gans
CN112529772B (en) Unsupervised image conversion method under zero sample setting
CN116232699A (en) Training method of fine-grained network intrusion detection model and network intrusion detection method
CN114494819B (en) Anti-interference infrared target identification method based on dynamic Bayesian network
CN115309985A (en) Fairness evaluation method and AI model selection method of recommendation algorithm
Cao et al. Searching for better spatio-temporal alignment in few-shot action recognition
Li et al. A method for face fusion based on variational auto-encoder
CN114170426A (en) Algorithm model for classifying rare tumor category small samples based on cost sensitivity
Lyu et al. DeCapsGAN: generative adversarial capsule network for image denoising
Hu et al. Crowd R-CNN: An object detection model utilizing crowdsourced labels
Wang et al. Real-time and accurate face detection networks based on deep learning
Zaji et al. Wheat spike counting using regression and localization approaches
CN116821408B (en) Multi-task consistency countermeasure retrieval method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant