CN111461159A - Decoupling representation learning algorithm based on similarity constraint

Decoupling representation learning algorithm based on similarity constraint

Info

Publication number
CN111461159A
Authority
CN
China
Prior art keywords
similarity
representation learning
learning algorithm
model
data set
Prior art date
Legal status
Pending
Application number
CN201910598166.0A
Other languages
Chinese (zh)
Inventor
李晓强 (Li Xiaoqiang)
陈亮波 (Chen Liangbo)
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN201910598166.0A
Publication of CN111461159A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2155 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting, characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/088 - Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a decoupling representation learning algorithm based on similarity constraint, and aims to overcome the shortcomings of InfoGAN, an existing unsupervised representation learning model. The method comprises the following specific steps: step one, preparing a data set; step two, selecting a model structure: in order to generate pictures with better visual quality, a WGAN structure with gradient penalty is adopted; step three, applying similarity constraint to the model: on the basis of the WGAN structure with gradient penalty, constraints are imposed between the similarity of the generated pictures and the factors. Compared with the InfoGAN model, the similarity-constrained generative adversarial network (SCGAN) has a simple structure, since only the similarity constraint needs to be added to the original generative adversarial network. At the same time, SCGAN is more robust and remains stable even on processed data sets. Because SCGAN is an unsupervised representation learning model, expensive labeling work can be avoided, so the application prospect is broad.

Description

Decoupling representation learning algorithm based on similarity constraint
Technical Field
The invention relates to the field of picture representation, in particular to a decoupling representation learning algorithm based on similarity constraint.
Background
In probability and statistics, a generative model is a model that can estimate a probability distribution from training data and randomly generate new observations from that distribution. For a machine learning algorithm to understand the intrinsic rules of the data, the algorithm needs to learn to create, which is why generating data is important. Representation learning is a popular field within generative learning and has attracted much attention from researchers. The efficient representations obtained by representation learning can assist many discriminative tasks in machine learning, such as classification, segmentation and detection. Decoupled representation learning is a sub-branch of representation learning whose goal is to learn factors that control the high-level semantic information of a picture. Supervised models such as CGAN (conditional generative adversarial network) explicitly provide labels so that the factors learn to control the object class. Unsupervised models such as InfoGAN (information-maximizing generative adversarial network) measure the relationship between the factors and the picture representation through mutual information, maximize a lower bound of the mutual information with a variational technique, and thereby let the factors learn to control latent representations of the picture such as illumination and color.
Conditional generative models require labels for representation learning, and in most cases obtaining labels is expensive. Moreover, because the labels prescribe which representations are to be learned, the representations captured by such models are limited, for example to capturing the digit class on a handwritten-digit data set.
InfoGAN is a relatively classical unsupervised representation learning model. Its idea is to maximize the mutual information between the factor and the picture: since the factor controls some representation of the picture, the factor must be closely related to the picture, and mutual information can measure this relation. However, the InfoGAN model is complex: in order to maximize the mutual information it resorts to a variational technique and adds an extra neural network to maximize the lower bound. Moreover, InfoGAN training is unstable, and mode collapse easily occurs on some processed data sets (randomly translated handwritten-digit data sets), which is why related research continues.
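For reference, the quantity that InfoGAN maximizes is the variational lower bound of the mutual information between the factor c and the generated picture G(z, c), which is where the additional neural network Q comes from:

    L_I(G, Q) = E_{c ~ P(c), x ~ G(z, c)}[ log Q(c | x) ] + H(c) <= I(c; G(z, c)),

where Q approximates the posterior distribution of the factor given the picture and H(c) is the entropy of the factor distribution. The similarity constraint proposed below avoids this auxiliary network entirely.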
Disclosure of Invention
An object of the embodiments of the present invention is to provide a decoupling representation learning algorithm based on similarity constraint, so as to solve the problems proposed in the above background art.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
a decoupling representation learning algorithm based on similarity constraint specifically comprises the following steps:
step one, preparing a data set;
step two, selecting a model structure: the training process of WGAN is more stable than that of the original GAN and converges more easily. In order for the discriminator to estimate the Wasserstein distance, the discriminator must satisfy the 1-Lipschitz constraint, whose essence is that the change in the discriminator's output must be no greater than the change in its input. Several techniques can make the discriminator approximately satisfy the 1-Lipschitz constraint, such as weight clipping and gradient penalty; in practice the gradient penalty works far better than weight clipping, so the gradient penalty is preferred (a minimal sketch of the gradient penalty is given after step three);
step three, applying similarity constraint to the model: on the basis of the WGAN structure with gradient penalty, constraints are imposed between the similarity of the generated pictures and the factors. The core idea is that when the factors are the same, the difference between the representations of the generated pictures should be as small as possible; conversely, when the factors are different, the difference should be as large as possible. The similarity constraint thus creates a repulsion between different factors and an attraction between identical factors, so that different factors come to control different representations, which achieves the goal of representation learning.
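As an illustration of step two, the following minimal PyTorch sketch shows a gradient-penalty term that approximately enforces the 1-Lipschitz constraint on the discriminator. The random interpolation between real and generated pictures (assumed to be 4-D tensors of shape batch x channel x height x width) and the coefficient lambda_gp = 10 follow the common WGAN-GP recipe and are illustrative assumptions rather than values fixed by this invention.

import torch


def gradient_penalty(discriminator, real, fake, lambda_gp=10.0):
    """Penalize deviations of the discriminator's gradient norm from 1."""
    real, fake = real.detach(), fake.detach()
    batch_size = real.size(0)
    # Random points on the straight lines between real and generated pictures.
    eps = torch.rand(batch_size, 1, 1, 1, device=real.device)
    interpolated = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = discriminator(interpolated)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )[0]
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    # Push the gradient norm towards 1, which softly enforces the 1-Lipschitz limit.
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()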
As a further scheme of the embodiment of the invention: the data set in step one comprises simple data sets and complex data sets with rich samples, so that the model is provided with samples that are as comprehensive as possible.
As a further scheme of the embodiment of the invention: the simple data sets comprise MNIST, Fashion-MNIST and SVHN, in which the target is centered; the SVHN data set contains colored digit pictures with richer background colors, and several digits may appear in each picture.
As a further scheme of the embodiment of the invention: the complex data sets comprise CIFAR-10 and CelebA; the CIFAR-10 data set contains pictures of real scenes covering 10 object classes, and even objects of the same class can differ greatly.
As a further scheme of the embodiment of the invention: since the difference in representation of the generated picture is not easily obtained, it is proposed to assume that the generated picture is composed of two parts of content and representation, and when the control content is the same, the difference in representation of the generated picture is equivalent to the difference in similarity of the generated picture, thereby converting the constraint from the difference in representation of the generated picture to the difference in similarity of the generated picture.
As a further scheme of the embodiment of the invention: the MNIST and Fashion-MNIST data sets contain gray-scale pictures with a black background and a single centered object.
As a further scheme of the embodiment of the invention: the CelebA data set contains a large number of celebrity faces with rich facial representations.
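The patent only requires that pictures generated from the same factor be similar and pictures generated from different factors be dissimilar. The following PyTorch sketch shows one plausible concrete form of such a constraint; the pixel-space cosine similarity, the margin value and the assumption of a discrete factor c are illustrative choices, not the exact loss defined by the invention.

import torch
import torch.nn.functional as F


def similarity_constraint(fake_images, factors, margin=0.5):
    """Attract pictures generated from the same factor, repel the others."""
    flat = fake_images.reshape(fake_images.size(0), -1)
    # Pairwise similarity of the generated pictures (here: cosine similarity in pixel space).
    sim = F.cosine_similarity(flat.unsqueeze(1), flat.unsqueeze(0), dim=2)
    # Pairwise mask: 1 where two samples share the same discrete factor, 0 otherwise.
    same = (factors.unsqueeze(1) == factors.unsqueeze(0)).float()
    attract = same * (1.0 - sim)                   # same factor: push similarity towards 1
    repel = (1.0 - same) * F.relu(sim - margin)    # different factor: push similarity below the margin
    return (attract + repel).mean()

The attraction term pulls pictures that share a factor together and the repulsion term pushes pictures with different factors apart, which matches the attraction/repulsion behavior described in step three.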
Compared with the prior art, the embodiment of the invention has the beneficial effects that:
Compared with the InfoGAN model, the SCGAN model (similarity-constrained generative adversarial network) has a simple structure: only the similarity constraint needs to be added to the original generative adversarial network. At the same time, SCGAN is more robust and remains stable on processed data sets. Moreover, SCGAN is an unsupervised representation learning model, so expensive labeling work can be avoided and the application prospect is broad.
Drawings
Fig. 1 is a workflow diagram of embodiment 1 in a decoupling representation learning algorithm based on similarity constraints.
Detailed Description
The technical solution of the present patent will be described in further detail with reference to the following embodiments.
Example 1
A decoupling representation learning algorithm based on similarity constraint specifically comprises the following steps:
Step 1, noise z is randomly sampled from a Gaussian distribution, and a factor c is sampled from a multinomial distribution or a uniform distribution; the factor c may be discrete or continuous. The noise z and the factor c are input into the generator to generate a picture.
Step 2, a batch of real pictures x is randomly sampled from the data set and a batch of generated pictures G(z, c) is taken from the output of the generator; both are input into the discriminator to distinguish real from fake.
Step 3, the similarity constraint is applied to the factor c and the generated pictures G(z, c).
Step 4, the parameters of the discriminator and the generator are updated separately using a gradient descent algorithm.
Step 5, repeating steps 1-4 until the generator is able to generate a sufficiently realistic picture and different factors c are able to control different representations of the picture.
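For concreteness, the following is a minimal PyTorch sketch of the training loop of steps 1 to 5. The Generator and Discriminator modules, the optimizer settings, the loss weight lambda_sim and the use of a discrete factor c are illustrative assumptions, and the data loader is assumed to yield (picture, label) pairs whose labels are ignored; gradient_penalty and similarity_constraint refer to the sketches given earlier in this description.

import torch


def train(generator, discriminator, dataloader, n_factors=10, z_dim=62,
          epochs=50, lambda_sim=1.0, device="cpu"):
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    for _ in range(epochs):                              # step 5: repeat steps 1-4
        for real, _ in dataloader:                       # labels are ignored (unsupervised)
            real = real.to(device)
            b = real.size(0)
            # Step 1: sample noise z from a Gaussian and a (here discrete) factor c.
            z = torch.randn(b, z_dim, device=device)
            c = torch.randint(0, n_factors, (b,), device=device)
            fake = generator(z, c)
            # Step 2: discriminator update with the Wasserstein loss plus gradient penalty.
            d_loss = (discriminator(fake.detach()).mean()
                      - discriminator(real).mean()
                      + gradient_penalty(discriminator, real, fake))
            opt_d.zero_grad()
            d_loss.backward()
            opt_d.step()
            # Steps 3-4: generator update with the adversarial loss plus similarity constraint.
            fake = generator(z, c)
            g_loss = (-discriminator(fake).mean()
                      + lambda_sim * similarity_constraint(fake, c))
            opt_g.zero_grad()
            g_loss.backward()
            opt_g.step()

Training stops once the generator produces sufficiently realistic pictures and different factors c control different representations, as stated in step 5.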
The working principle of the embodiment of the invention is as follows: the method is based on a generative adversarial network and a similarity constraint, and can learn decoupled representations without supervision. Because of the unsupervised learning mode, the pictures in the data set do not need to be labeled in advance. The representation of a picture is a relatively fuzzy concept, and it is very difficult for people to define clear boundaries for it, for example how to divide the brightness levels of a picture. In the unsupervised mode, the latent characteristics of the data set can be analyzed automatically, and changes in the more salient picture representations can be learned.
The model structure of the method is very simple: only the similarity constraint needs to be added to the original generative adversarial network. In contrast, InfoGAN, another unsupervised model for decoupled representation learning, needs to introduce an additional neural network to optimize the lower bound of the mutual information, which increases the complexity of the model. In addition, in terms of training difficulty, the model of the method converges more easily; on some processed data sets (randomly translated handwritten-digit data sets) InfoGAN collapses during training, whereas the model of the method remains very robust.
The model of the method also has good extensibility. Since the similarity constraint acts only between the generated pictures and the factor, any new kind of generative adversarial network can integrate the similarity constraint. The method can therefore generate pictures of higher quality by using such newer models, while the similarity constraint ensures that the factor captures changes in the picture representation.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment contains only a single independent technical solution; this manner of description is merely for clarity, and those skilled in the art should consider the description as a whole, since the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (7)

1. A decoupling representation learning algorithm based on similarity constraint is characterized by comprising the following specific steps:
step one, preparing a data set;
step two, selecting a model structure: a WGAN structure with gradient penalty is adopted;
step three, applying similarity constraint to the model: on the basis of the WGAN structure with gradient penalty, constraints are imposed between the similarity of the generated pictures and the factors.
2. The similarity constraint-based decoupled representation learning algorithm of claim 1, wherein the first-step data set comprises a simple data set and a complex data set.
3. The similarity constraint-based decoupled representation learning algorithm of claim 2, wherein the simple dataset comprises MNIST, Fashion-MNIST, and SVHN.
4. The similarity constraint-based decoupled representation learning algorithm according to claim 2 or 3, characterized in that the complex dataset comprises CIFAR-10 and CelebA.
5. The similarity constraint-based decoupled representation learning algorithm of claim 1, wherein the generated picture is composed of two parts, namely content and representation.
6. The decoupled representation learning algorithm based on similarity constraints of claim 3 wherein the MNIST and Fashion-MNIST data sets contain gray scale maps, black backgrounds and objects.
7. The decoupling representation learning algorithm based on similarity constraint according to claim 4, wherein the data set CelebA contains a large number of celebrity faces.
CN201910598166.0A, filed 2019-07-04 (priority date 2019-07-04): Decoupling representation learning algorithm based on similarity constraint. Status: Pending. Published as CN111461159A (en).

Priority Applications (1)

Application Number: CN201910598166.0A; Priority Date: 2019-07-04; Filing Date: 2019-07-04; Title: Decoupling representation learning algorithm based on similarity constraint

Applications Claiming Priority (1)

Application Number: CN201910598166.0A; Priority Date: 2019-07-04; Filing Date: 2019-07-04; Title: Decoupling representation learning algorithm based on similarity constraint

Publications (1)

Publication Number: CN111461159A; Publication Date: 2020-07-28

Family

ID=71679116

Family Applications (1)

Application Number: CN201910598166.0A; Title: Decoupling representation learning algorithm based on similarity constraint; Priority Date: 2019-07-04; Filing Date: 2019-07-04; Status: Pending

Country Status (1)

Country Link
CN (1) CN111461159A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763857A * 2018-05-29 2018-11-06 Zhejiang University of Technology Process soft-sensor modeling method based on a similarity generative adversarial network
CN109086437A * 2018-08-15 2018-12-25 Chongqing University Image retrieval method fusing Faster-RCNN and a Wasserstein autoencoder

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIAOQIANG LI: "SCGAN: Disentangled Representation Learning by Adding Similarity Constraint on Generative Adversarial Nets" *
Lin Yilun et al.: "The new frontier of artificial intelligence research: generative adversarial networks" (人工智能研究的新前线: 生成式对抗网络) *
Wang Wanliang; Li Zhuorong: "Research progress of generative adversarial networks" (生成式对抗网络研究进展) *

Similar Documents

Publication Publication Date Title
CN108960409B (en) Method and device for generating annotation data and computer-readable storage medium
CN111915540B (en) Rubbing oracle character image augmentation method, rubbing oracle character image augmentation system, computer equipment and medium
CN100550038C (en) Image content recognizing method and recognition system
CN101315663B Natural scene image classification method based on regional latent semantic features
CN111898696A (en) Method, device, medium and equipment for generating pseudo label and label prediction model
CN113449594A (en) Multilayer network combined remote sensing image ground semantic segmentation and area calculation method
CN112132197A (en) Model training method, image processing method, device, computer equipment and storage medium
CN110399518A Visual question answering enhancement method based on graph convolution
Zhao et al. A malware detection method of code texture visualization based on an improved faster RCNN combining transfer learning
CN109726712A (en) Character recognition method, device and storage medium, server
CN111652233A (en) Text verification code automatic identification method for complex background
CN113157800A (en) Identification method for discovering dynamic target in air in real time
CN113435335B (en) Microscopic expression recognition method and device, electronic equipment and storage medium
CN110097616B (en) Combined drawing method and device, terminal equipment and readable storage medium
CN112381082A (en) Table structure reconstruction method based on deep learning
CN116309992A (en) Intelligent meta-universe live person generation method, equipment and storage medium
CN115984653B (en) Construction method of dynamic intelligent container commodity identification model
Aiwan et al. Image spam filtering using convolutional neural networks
CN110738239A (en) search engine user satisfaction evaluation method based on mouse interaction sequence region behavior joint modeling
Ver Hoef et al. A primer on topological data analysis to support image analysis tasks in environmental science
Zhang et al. Multi-weather classification using evolutionary algorithm on efficientnet
CN116777925B (en) Image segmentation domain generalization method based on style migration
CN111461159A (en) Decoupling representation learning algorithm based on similarity constraint
CN116415152A (en) Diffusion model-based self-supervision contrast learning method for human motion recognition
CN116681921A (en) Target labeling method and system based on multi-feature loss function fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200728