CN111476294B - Zero-shot image recognition method and system based on a generative adversarial network - Google Patents
- Publication number
- CN111476294B (application CN202010263452.4A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- visual
- discriminator
- features
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a zero-shot image recognition method and system based on a generative adversarial network. The method comprises the following steps: acquiring training image samples with annotation information and test image samples without annotation information; constructing a generative adversarial network model that comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; constructing a multi-objective loss function comprising a cycle consistency loss function, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator; taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model; and inputting the test image samples into the trained model to obtain a recognition result. The invention can recognize sketches without annotation information, and its zero-shot recognition accuracy is high.
Description
Technical Field
The invention relates to the field of weakly supervised and semi-supervised image recognition, and in particular to a zero-shot image recognition method and system based on a generative adversarial network.
Background
The concept of zero-shot learning (ZSL) was first proposed by H. Larochelle et al. in 2008. It addresses how to correctly classify and recognize new, unknown objects when the labeled training samples do not cover all object classes. If a classifier is learned on the training set with a traditional supervised learning method and then applied to the test set, classification is poor because the sample distributions of the two domains differ. This image recognition problem is called zero-shot recognition.
Zero-shot recognition uses only labeled samples of known classes to predict unknown classes. The main idea is to introduce class semantic information as an intermediate-layer feature that links visual features with semantic features. At the feature level, the key problems of zero-shot recognition are therefore: 1) finding visual features that fully express the visual information of an image and semantic features that fully represent class semantics; 2) relating visual features to class semantic information.
For key problem 1), finding visual features that sufficiently express the visual information of an image is one of the challenges of zero-shot recognition. With the rise of deep learning, researchers extract discriminative image features with deep convolutional neural networks. Zero-shot image recognition requires not only visual features of the image but also semantic features that represent the semantics of image classes, so that known classes can be linked to unknown classes. The most widely used semantic features are attribute features and text features. Attribute features are annotated manually, so their accuracy is poor. In recent years, with the development of natural language processing, research that uses text description features instead of attribute features has received much attention. Text description features can be extracted directly from a corpus, with each class corresponding to a vector in the text description space. Compared with attribute features, text vectors for arbitrary words can be obtained from an unlabeled text corpus, so text description features scale better. A commonly used text vector extraction method is Word2Vec.
Existing semantic feature spaces can be divided into three categories: (1) attribute-based semantic feature spaces; (2) text-based semantic feature spaces; (3) common semantic feature spaces. Once the semantic feature space is selected, establishing the mapping between visual features and semantic features is the second key problem of zero-shot recognition.
For key problem 2), after the semantic features of the known and unknown classes are extracted in a given semantic space, the semantic correlation between classes can be obtained from the similarity between their semantic features. However, sample images are represented by visual features in the visual space, and because of the semantic gap they cannot be linked directly to the semantic features of the semantic space. Most existing methods learn a mapping function from the visual space to the semantic space using the visual features of known-class images and the semantic features of the corresponding labels. The visual features of a test image are then mapped to the semantic space through this mapping function to obtain predicted semantic features. Finally, the unknown class whose semantic features are closest to the predicted semantic features is taken as the class of the test image.
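The final nearest-class lookup described above can be sketched in a few lines of Python. This is a minimal illustration only: the text does not say which distance is used, so cosine similarity is an assumption, and the class names and 3-dimensional vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_unseen_class(predicted_semantic, class_semantics):
    """Return the unseen class whose semantic vector is most similar
    to the predicted semantic feature of a test image."""
    return max(
        class_semantics,
        key=lambda name: cosine_similarity(predicted_semantic, class_semantics[name]),
    )

# Hypothetical semantic vectors for two unseen classes (not from the patent).
unseen = {"zebra": [1.0, 0.0, 1.0], "whale": [0.0, 1.0, 0.0]}
print(predict_unseen_class([0.9, 0.1, 0.8], unseen))  # zebra
```

Any other similarity or distance (for example Euclidean distance) could be substituted in `predict_unseen_class` without changing the overall procedure.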
In zero-shot image recognition, the known classes and the unknown classes are disjoint, so directly applying a model learned on the training set to the test set leaves a large deviation between the mapping of test samples in the semantic space and the true class semantics. This is called domain shift. Many methods have recently been proposed to address the domain shift problem in zero-shot learning, such as data augmentation, self-training, and hubness correction.
Zero-shot recognition has received wide attention from researchers in China and abroad in recent years, and related algorithms have begun to be applied in practice. Previous zero-shot learning methods mainly recognize targets in the conventional zero-shot setting, in which test images come only from the target classes. In a real scenario, however, test images may come from the source classes as well as the target classes. Data from both should then be taken into account, which is why the generalized zero-shot setting has been introduced in recent years. Recognition accuracy under generalized zero-shot learning is nevertheless much lower than under conventional zero-shot learning. Existing generalized zero-shot recognition methods therefore suffer from low recognition accuracy.
Disclosure of Invention
Based on the above, there is a need for a zero-shot image recognition method and system based on a generative adversarial network that can recognize test images from both the target classes and the source classes with high accuracy.
To achieve this purpose, the invention provides the following scheme:
a zero-shot image recognition method based on a generative adversarial network, comprising the following steps:
acquiring training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator is used for generating pseudo-semantic features from the real visual features; the visual feature generator is used for generating pseudo visual features from the pseudo-semantic features; the semantic discriminator is used for discriminating between the real semantic features and the pseudo-semantic features; the visual discriminator is used for discriminating between the real visual features and the pseudo visual features;
constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real visual features and the pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
taking the training image samples as the input of the generative adversarial network model, and iteratively training the generative adversarial network model based on the multi-objective loss function to obtain a trained generative adversarial network model;
and inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
The invention also provides a zero-shot image recognition system based on a generative adversarial network, which comprises:
a sample acquisition module, used for acquiring training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
a network model construction module, used for constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator is used for generating pseudo-semantic features from the real visual features; the visual feature generator is used for generating pseudo visual features from the pseudo-semantic features; the semantic discriminator is used for discriminating between the real semantic features and the pseudo-semantic features; the visual discriminator is used for discriminating between the real visual features and the pseudo visual features;
a loss function construction module, used for constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real visual features and the pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
a training module, used for taking the training image samples as the input of the generative adversarial network model and iteratively training the generative adversarial network model based on the multi-objective loss function to obtain a trained generative adversarial network model;
and a test recognition module, used for inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a zero-shot image recognition method and system based on a generative adversarial network. The method comprises constructing a generative adversarial network model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; constructing a multi-objective loss function comprising a cycle consistency loss function, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator; taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain the trained generative adversarial network model; and inputting the test image samples into the trained model to obtain a recognition result. The method can recognize sketches without annotation information, improves zero-shot recognition accuracy, and improves the generalization ability of the model.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a zero-shot image recognition method based on a generative adversarial network according to an embodiment of the present invention;
FIG. 2 is a diagram of the network structure of the semantic feature generator G1 according to an embodiment of the present invention;
FIG. 3 is a diagram of the network structure of the visual feature generator G2 according to an embodiment of the present invention;
FIG. 4 is a diagram of the network structure of the semantic discriminator D1 according to an embodiment of the present invention;
FIG. 5 is a diagram of the network structure of the visual discriminator D2 according to an embodiment of the present invention;
FIG. 6 is a block diagram of the trained generative adversarial network model according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a zero-shot image recognition system based on a generative adversarial network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
To improve generalized zero-shot recognition accuracy, two problems need to be solved: on the one hand, mapping visual information to the semantic space requires either aligned image pairs or inefficient feature fusion; on the other hand, when an autoencoder is used to extract semantic information from Wikipedia, redundant noisy text is present and degrades recognition.
FIG. 1 is a flowchart of a zero-shot image recognition method based on a generative adversarial network according to an embodiment of the present invention. Referring to FIG. 1, the zero-shot image recognition method of this embodiment includes:
Step 101: obtaining training image samples and test image samples.
The training image samples are sample images with annotation information, and the test image samples are sample images without annotation information.
Step 102: constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator.
The semantic feature generator is used for generating pseudo-semantic features from the real visual features; the visual feature generator is used for generating pseudo visual features from the pseudo-semantic features; the semantic discriminator is used for discriminating between the real semantic features and the pseudo-semantic features; the visual discriminator is used for discriminating between the real visual features and the pseudo visual features.
Before this step is performed, two things are also needed: 1) the texts in Wikipedia are input into a hierarchical model to obtain the useful information of the texts, which is then input into an autoencoder to obtain the real semantic features; 2) the training image samples are input into an attention-based CNN model to obtain the real visual features.
Step 103: constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real visual features and the pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator.
Step 104: taking the training image samples as the input of the generative adversarial network model, and iteratively training the generative adversarial network model based on the multi-objective loss function to obtain the trained generative adversarial network model.
Step 105: inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
Step 101 is the initial training stage of this embodiment. The initial training stage of the recognition model is carried out under the TensorFlow deep learning framework. The training image samples and test image samples are obtained as follows:
The training image samples and the test image samples in this embodiment may be selected from Sketchy and TU-Berlin, two widely used sketch datasets.
The Sketchy dataset is a large collection of sketches. It covers 125 different classes, each with 100 photos. Sketches of the objects appearing in these 12,500 photos were collected by crowdsourcing, resulting in 75,471 sketches. The dataset also contains fine-grained correspondences (alignments) between particular photos and sketches, as well as various data augmentations for deep-learning-based methods. The dataset was later extended by adding 60,502 photos, yielding a total of 73,002 photos. We randomly draw 25 classes of sketches as the unseen test set for zero-shot recognition (without using their annotation information); the remaining 100 classes are used for training (with annotation information).
The TU-Berlin dataset (extended) contains 250 categories with 20,000 sketches, and was extended with 204,489 natural images corresponding to the sketch classes. 30 classes of sketches are randomly selected as the test set (without using their annotation information); the remaining 220 classes are used for training (using annotation information).
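The class-level split described above (25 of 125 Sketchy classes, 30 of 250 TU-Berlin classes held out as unseen) can be sketched as follows. The function name, the fixed seed and the placeholder class names are assumptions for illustration; the text only states that the test classes are drawn at random.

```python
import random

def split_classes(class_names, num_unseen, seed=0):
    """Randomly hold out `num_unseen` classes as unseen test classes.

    Returns (seen, unseen); the two lists never share a class.
    """
    rng = random.Random(seed)      # fixed seed only for reproducibility
    shuffled = list(class_names)
    rng.shuffle(shuffled)
    unseen = sorted(shuffled[:num_unseen])
    seen = sorted(shuffled[num_unseen:])
    return seen, unseen

# Placeholder class names; the real datasets use category names.
sketchy_classes = [f"class_{i:03d}" for i in range(125)]
seen, unseen = split_classes(sketchy_classes, num_unseen=25)
print(len(seen), len(unseen))  # 100 25
```

Splitting by class rather than by image is what keeps the training and test label sets disjoint, which is the defining property of the zero-shot setting.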
Step 102 is the middle training stage of this embodiment, namely constructing the structure of the generative adversarial network model, which includes a semantic feature generator G1, a visual feature generator G2, a semantic discriminator D1 and a visual discriminator D2. The specific construction process is as follows:
1) Construction of the generator network:
The generator network consists of two generators: the semantic feature generator G1 and the visual feature generator G2. As shown in FIG. 2, G1 comprises 2 groups of convolution modules and 2 groups of fully connected modules. Each convolution module consists of a convolution layer (Conv), a max pooling layer (Max Pooling) and a normalization layer (Normalization); each fully connected module consists of a fully connected layer (FC) and a Leaky ReLU. As shown in FIG. 3, G2 comprises two groups of fully connected modules, three 4096-dimensional fully connected layers (FC 4096), a reshape layer (Reshape) and 5 groups of up-sampling modules. Each fully connected module consists of a fully connected layer and a Leaky ReLU; each up-sampling module consists of two up-sampling layers (Upconv) and two Leaky ReLUs, connected alternately. The input of G2 is the semantic features output by G1.
Specifically, after an image is input into G1, the first convolution module applies a convolution layer with kernel size 11 and stride 4, then max pooling with window size 3 and stride 2, which reduces the shift in the estimated mean caused by convolution-layer parameter errors, and then a normalization layer that normalizes the input data. The second convolution module applies a convolution layer with kernel size 5 and stride 1, max pooling with window size 3 and stride 2, and normalization in the same way, and its output is fed into a 1024-dimensional fully connected module. Finally, semantic features are generated from the input visual features through two fully connected modules of the same size.
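To make the kernel and stride settings of G1 concrete, the following sketch traces the spatial size of a feature map through the two convolution modules. The text does not state the input resolution or the padding, so the 227 × 227 input and zero padding used here are assumptions chosen only for illustration.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

def g1_conv_sizes(input_size=227):
    """Trace the spatial size through G1's two convolution modules.

    input_size=227 and padding=0 are assumptions, not values from the text.
    """
    layers = [
        (11, 4),  # conv, kernel 11, stride 4
        (3, 2),   # max pooling, window 3, stride 2
        (5, 1),   # conv, kernel 5, stride 1
        (3, 2),   # max pooling, window 3, stride 2
    ]
    sizes = [input_size]
    for kernel, stride in layers:
        sizes.append(out_size(sizes[-1], kernel, stride))
    return sizes

print(g1_conv_sizes())  # [227, 55, 27, 23, 11]
```

Normalization layers are omitted from the trace because they do not change the spatial size.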
Specifically, the semantic features generated by the semantic feature generator are input into the visual feature generator G2 and first pass through two 1024-dimensional fully connected modules. Three 4096-dimensional fully connected layers then extract a 4096-dimensional feature vector from the input data. A reshape layer then reshapes the feature vector to 4 × 256. Finally, 5 up-sampling modules with kernel size 4 and stride 2 up-sample the features, applying an activation function after each up-sampling step to prevent vanishing gradients, and the resulting feature vector is output.
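The effect of the reshape and up-sampling stages of G2 on the feature-map size can be sketched as below. Several points are assumptions rather than statements from the text: the 4096-dimensional vector is taken to be reshaped to a 4 × 4 grid with 256 channels (since 4096 = 4 · 4 · 256), each up-sampling module is treated as one stride-2 transposed convolution step, and a padding of 1 is assumed.

```python
def upconv_out_size(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution (no output padding, dilation 1).

    padding=1 is an assumed value; the text gives only kernel 4 and stride 2.
    """
    return (size - 1) * stride - 2 * padding + kernel

def g2_upsampling_sizes(start=4, num_modules=5):
    """Trace the spatial size through the up-sampling modules of G2.

    start=4 assumes the reshape produces a 4 x 4 grid with 256 channels.
    """
    sizes = [start]
    for _ in range(num_modules):
        sizes.append(upconv_out_size(sizes[-1]))
    return sizes

assert 4 * 4 * 256 == 4096  # the assumed reshape preserves the element count
print(g2_upsampling_sizes())  # [4, 8, 16, 32, 64, 128]
```

With kernel 4, stride 2 and padding 1, every step exactly doubles the spatial size, which is why this combination is a common choice for up-sampling layers.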
2) Construction of a discriminator network:
The discriminator network consists of two discriminators: the semantic discriminator D1 and the visual discriminator D2. D1 comprises two branches: one for 0/1 (real or fake) classification and the other for classifying the category of the input label. The first branch comprises a group of fully connected modules and a 2-way fully connected layer; the other branch comprises a group of fully connected modules and an n-way fully connected layer. D2 comprises a group of fully connected modules and a fully connected layer. In every case a fully connected module consists of a fully connected layer and a Leaky ReLU. In both D1 and D2, the final fully connected layer serves as the classifier of the overall convolutional neural network.
As shown in FIG. 4, the semantic discriminator D1 includes two branches, one for 0/1 binary classification and the other for class label classification. It receives the real semantic features extracted by the autoencoder and the pseudo-semantic features generated by the semantic feature generator G1. In the binary classification branch, features are first extracted by a group of 1024-dimensional fully connected modules, an activation function then stabilizes the gradient, and a final fully connected layer performs 0/1 binary classification to judge whether the input features are real or fake. In the n-way classification branch, the last fully connected layer classifies the input data into n classes.
As shown in FIG. 5, the visual discriminator D2 discriminates between the pseudo visual features generated by the visual feature generator G2 and the real visual features extracted by the CNN. The pseudo visual features are input into D2, where a 1024-dimensional fully connected layer first extracts features, an activation function then prevents vanishing gradients, and a final fully connected layer performs binary classification to judge whether the input features are real or fake.
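The two-branch layout of D1 (a real/fake branch and an n-way class branch, each a fully connected module followed by a final fully connected layer) can be illustrated with a small dependency-free forward pass. Everything numeric here is an assumption for illustration: the Leaky ReLU slope of 0.2, the random initialisation, the softmax on both outputs, and the small layer sizes (the text uses 1024-dimensional modules).

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def leaky_relu(x, slope=0.2):
    """Leaky ReLU; the slope 0.2 is an assumed value."""
    return [v if v > 0 else slope * v for v in x]

def softmax(x):
    shift = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def make_d1(feature_dim, hidden_dim, num_classes, seed=0):
    """Parameters of a two-branch semantic discriminator sketch."""
    rng = random.Random(seed)
    return {
        "rf_hidden": init_layer(rng, feature_dim, hidden_dim),
        "rf_out": init_layer(rng, hidden_dim, 2),             # 2-way layer
        "cls_hidden": init_layer(rng, feature_dim, hidden_dim),
        "cls_out": init_layer(rng, hidden_dim, num_classes),  # n-way layer
    }

def d1_forward(params, semantic_feature):
    """Return (real/fake probabilities, class probabilities)."""
    h_rf = leaky_relu(dense(semantic_feature, *params["rf_hidden"]))
    h_cls = leaky_relu(dense(semantic_feature, *params["cls_hidden"]))
    real_fake = softmax(dense(h_rf, *params["rf_out"]))
    classes = softmax(dense(h_cls, *params["cls_out"]))
    return real_fake, classes

params = make_d1(feature_dim=8, hidden_dim=16, num_classes=5)
real_fake, classes = d1_forward(params, [0.5] * 8)
print(len(real_fake), len(classes))  # 2 5
```

D2 would correspond to the real/fake branch alone, applied to visual rather than semantic features.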
In step 103 a multi-objective loss function is constructed. Its purpose is as follows: guided by the convergence of the loss value, the parameters of the zero-shot recognition network model can be updated and optimized more effectively, finally yielding an optimized generative adversarial network model that recognizes the images to be recognized in a real dataset more accurately. Specifically:
The adversarial loss function is divided into two parts. The first is the adversarial loss of CTGAN, which evaluates the synthesized semantic features; it imposes an additional constraint on the gradient penalty to improve the quality of the synthesized features. The second is the adversarial loss of a general GAN, which evaluates the synthesized pseudo visual features; the general adversarial mechanism reduces the domain difference well.
The cycle consistency loss function captures the degree of match between the visual features extracted by the attention-based CNN and the generated pseudo visual features.
The classifier is attached to the semantic discriminator D1, so that it can effectively classify the class-label data and thereby accomplish the zero-shot image recognition task. The adversarial loss function of the semantic discriminator D1 in the generative adversarial network model is as follows:
where x denotes the real visual features and a denotes the real semantic features; G1(x) denotes the semantic generator with the visual feature x as input; D1(G1(x)) denotes the semantic discriminator with G1(x) as input; D1(a) denotes the semantic discriminator with the semantic feature a as input; Pf denotes the prior distribution of the real visual features; Pr denotes the prior distribution of the real semantic features; the interpolated feature is a linear interpolation between features and follows Pr,f, the prior distribution over the real visual features and the real semantic features. The first term is the expectation over the pseudo-feature distribution and the second term is the expectation over the real feature distribution; the difference between the first term and the second term is the Wasserstein distance between the feature distributions. The gradient penalty term enforces the Lipschitz constraint; λ2CT|x',x'' is the consistency (continuity) term that adds a further constraint to the gradient penalty; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency (continuity) term. In the consistency term,
x' and x'' both denote perturbed data near the real visual features (perturbed data drawn arbitrarily near the real samples); c is a fixed constant; D(x') denotes the semantic discriminator with x' as input and D(x'') denotes the semantic discriminator with x'' as input; ||D(x') - D(x'')|| is the distance between the two discriminator outputs, and ||x' - x''|| is the distance between the two perturbed features. The consistency term is used to approximate the gradient and to limit it to be less than c.
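Taken together, the definitions above point to a loss of the CT-GAN form. The following is a reconstruction from those definitions, not the patent's own typesetting: the interpolated feature is written \hat{x}, the gradient penalty is assumed to take the usual squared-deviation-from-one form, and the hinge max(0, ·) in the consistency term is inferred from the statement that the gradient is limited to be less than c.

```latex
\begin{align}
L_{CTGAN} &= \mathbb{E}_{x \sim P_f}\big[D_1(G_1(x))\big]
           - \mathbb{E}_{a \sim P_r}\big[D_1(a)\big]
           + \lambda_1\, \mathbb{E}_{\hat{x} \sim P_{r,f}}
             \Big[\big(\lVert \nabla_{\hat{x}} D_1(\hat{x}) \rVert_2 - 1\big)^2\Big]
           + \lambda_2\, CT|_{x',x''} \\
CT|_{x',x''} &= \mathbb{E}\left[\max\left(0,\
             \frac{\lVert D(x') - D(x'') \rVert}{\lVert x' - x'' \rVert} - c\right)\right]
\end{align}
```

Under this reading, the first two terms estimate the Wasserstein distance, the λ1 term penalizes gradients at interpolated points, and the λ2 term penalizes steep changes of the discriminator between nearby perturbed samples.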
Construct the adversarial loss function of the visual discriminator:
where ã denotes the pseudo-semantic features; D2(x) denotes the visual discriminator with the visual feature x as input; G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input; and D2(G2(ã)) denotes the visual discriminator with G2(ã) as input. The network is continually optimized through this loss function so that the generated pseudo-visual features come closer and closer to the real visual features x.
The adversarial loss analyzes the real feature distribution and the generated feature distribution as a whole and outputs a feedback signal to the generator network, which is used to adjust and optimize the network parameters.
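As a rough illustration of an ordinary GAN adversarial loss applied to the visual discriminator, the sketch below evaluates it on batches of discriminator outputs. The log-likelihood form log D2(x) + log(1 - D2(G2(ã))), the function name `visual_adversarial_loss`, and the toy probabilities are assumptions made for illustration; the text only states that an ordinary GAN adversarial loss is used.

```python
import math


def visual_adversarial_loss(d_real, d_fake):
    """Mean of log D2(x) over real outputs plus mean of log(1 - D2(G2(a~))) over fake outputs.

    d_real: discriminator probabilities for real visual features.
    d_fake: discriminator probabilities for generated pseudo-visual features.
    """
    eps = 1e-12  # guards against log(0)
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that separates real from fake well scores higher
# than one that outputs 0.5 for everything.
confident = visual_adversarial_loss([0.9, 0.8], [0.1, 0.2])
confused = visual_adversarial_loss([0.5, 0.5], [0.5, 0.5])
```

In this formulation the discriminator is trained to increase the value, while the visual feature generator is trained to decrease the second term.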
Construct the cycle consistency loss function of the real visual features and the pseudo-visual features:
where E[||G2(G1(x)) - x||1] is the distribution expectation of the cycle-consistency distance between the two visual features, and the second expectation term is the corresponding distribution expectation for the two semantic features. The cycle consistency loss Lcyc is used to optimize the network parameters so that the real visual features x and the pseudo-semantic features match better.
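A minimal sketch of a two-way cycle consistency loss follows. The visual term ||G2(G1(x)) - x||1 comes from the text; the semantic term is assumed here to be ||G1(G2(a)) - a||1, and the toy generators `g1`, `g2`, and `g2_biased` are hypothetical stand-ins chosen so that the result can be checked by hand.

```python
def l1_distance(u, v):
    """L1 distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cycle_consistency_loss(g1, g2, visual_batch, semantic_batch):
    """Batch mean of ||G2(G1(x)) - x||_1 plus batch mean of ||G1(G2(a)) - a||_1."""
    visual_term = sum(l1_distance(g2(g1(x)), x) for x in visual_batch) / len(visual_batch)
    semantic_term = sum(l1_distance(g1(g2(a)), a) for a in semantic_batch) / len(semantic_batch)
    return visual_term + semantic_term


# Hypothetical toy generators: g1 halves every component, g2 doubles it,
# so the round trip is exact and the loss vanishes.
def g1(x):
    return [0.5 * v for v in x]


def g2(a):
    return [2.0 * v for v in a]


# A visual generator with a constant offset breaks the round trip.
def g2_biased(a):
    return [2.0 * v + 0.25 for v in a]


visual_batch = [[1.0, 2.0]]
semantic_batch = [[4.0, 8.0]]
perfect = cycle_consistency_loss(g1, g2, visual_batch, semantic_batch)
biased = cycle_consistency_loss(g1, g2_biased, visual_batch, semantic_batch)
```

With the biased generator, each visual component is off by 0.25 and each semantic component by 0.125, so the loss is 0.5 + 0.25 = 0.75.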
Construct the classification loss function of the semantic classifier:
Lcls = -E[log P(b | G1(a); θ)];
where P(b | G1(a); θ) denotes the class-conditional probability of the class label, G1(a) denotes the semantic generator with the semantic feature a as input, θ denotes the parameters of the classification network, and b is the class label of a. Minimizing the classification loss of the generated features improves the classification accuracy of the class labels.
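The classification loss is a mean negative log-likelihood of the true class label. The sketch below assumes that the classification network outputs one logit per class and that the class-conditional probability is obtained with a softmax; the softmax, the function names, and the toy logits are assumptions, not details given in the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(batch_logits, labels):
    """Mean of -log P(b | features; theta) over the batch."""
    losses = [-math.log(softmax(logits)[b]) for logits, b in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


# Uniform logits over four classes give a loss of log(4).
uniform_loss = classification_loss([[0.0, 0.0, 0.0, 0.0]], [1])

# A strong logit on the true class gives a lower loss than on a wrong class.
right_loss = classification_loss([[10.0, 0.0, 0.0]], [0])
wrong_loss = classification_loss([[10.0, 0.0, 0.0]], [1])
```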
Step 104: iteratively train the constructed generative adversarial network model, updating and optimizing its parameters, to obtain the trained generative adversarial network model. Specifically, the training image samples are used as input to the semantic feature generator, and the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator are jointly trained by backpropagation according to the multi-objective loss function, so that their parameters are continually updated and optimized and a trained generative adversarial network model is obtained. Fig. 6 is a structural diagram of the training of the generative adversarial network model according to an embodiment of the present invention. The specific iterative training steps are as follows:
Training sample data from the Sketchy dataset and the TU-Berlin dataset are input into the attention-based CNN to extract the visual feature information of the training samples; this visual feature information is then input into the semantic feature generator G1 to generate pseudo-semantic features.
The pseudo-semantic features obtained in the previous step are input into the visual feature generator G2 to generate pseudo-visual features.
To better measure the similarity between the sketch and the real image during training, the cycle consistency loss constraint of Cycle-GAN is introduced. Cycle-GAN consists of two generators and two discriminators. The semantic features and the visual features are treated as data from two different domains: the semantic feature generator G1 generates pseudo-semantic features from the real visual features x, and the visual feature generator G2 maps the resulting pseudo-semantic features back to pseudo-visual features. The cycle consistency loss is then used to measure the similarity between the real visual features and the pseudo-visual features.
Text from Wikipedia is input into the hierarchical model to obtain the useful information in the text; this information is then input into the autoencoder to extract the real semantic information of the Wikipedia text. The real semantic information a is used as input to the discriminator D1 and is trained adversarially against the pseudo-semantic features generated by G1.
CTGAN, a variant of WGAN, is used as the discriminator D1 to improve the accuracy of zero-shot image recognition. The gradient penalty of WGAN has a weakness: if the real sample distribution and the generated pseudo-sample distribution are far apart, the gradient penalty often fails to check continuity in the region near the real samples, so the discriminator may violate Lipschitz continuity there. On top of WGAN, CTGAN adds a consistency term that constrains the gradient of the real sample distribution, thereby strengthening Lipschitz continuity near the data sample distribution.
The pseudo-visual features generated by the visual feature generator G2 and the real visual features x are used as inputs to the visual discriminator D2. D2 judges whether the visual features are real or fake, which produces the adversarial loss; the network parameters are updated and optimized through the loss function so that the pseudo-visual features come closer and closer to the real visual features x.
Based on the feature information of the Wikipedia text and of the sketches, construct the adversarial loss function LCTGAN of the discriminator D1 and the adversarial loss function Ladv of the discriminator D2; based on the real visual features and the pseudo-visual features of the sketches, construct the cycle consistency loss function Lcyc; then construct the loss function Lcls for classifying the label categories.
The specific update and optimization procedure is as follows: with the generator network parameters fixed, train the discriminator network to obtain a trained discriminator network model; then, with the trained discriminator network parameters fixed, train the generator network by backpropagation to obtain an optimized generator network model; repeat these steps to obtain the optimal generative adversarial network model.
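The alternating schedule just described can be sketched without any deep-learning library. In the sketch below, `Net` is a hypothetical stand-in for a network, and incrementing `updates` is a placeholder for a real backpropagation step; only the order of the phases and which group is frozen in each phase reflect the text.

```python
class Net:
    """Stand-in for a network; `updates` counts the update steps it received."""

    def __init__(self, name):
        self.name = name
        self.trainable = True
        self.updates = 0


def run_phase(active, frozen, log):
    """Freeze one group of networks, then update each network in the other group once."""
    for net in frozen:
        net.trainable = False
    for net in active:
        net.trainable = True
        net.updates += 1  # placeholder for a real backpropagation step
        log.append(net.name)


def train_alternating(generators, discriminators, rounds):
    """Alternate discriminator and generator phases for a number of rounds."""
    log = []
    for _ in range(rounds):
        run_phase(discriminators, generators, log)  # generators fixed
        run_phase(generators, discriminators, log)  # discriminators fixed
    return log


generators = [Net("G1"), Net("G2")]
discriminators = [Net("D1"), Net("D2")]
order = train_alternating(generators, discriminators, rounds=2)
```

Each round updates D1 and D2 first and G1 and G2 second, so after two rounds every network has received two updates.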
The zero-shot image recognition method based on a generative adversarial network in this embodiment has the following advantages. First, a semantically aligned cycle consistency loss constraint is introduced into the generative model to address the problem that, in real scenes, common semantic knowledge cannot be shared between training images and test images; the constraint measures the correlation between the visual features and the semantic features, and a classification network parallel to the discriminator is added at the discriminator's output so that the class labels are classified correctly. Second, CTGAN, a variant of WGAN, is used for adversarial learning between the real features and the synthesized features; it adds a consistency term to WGAN in order to constrain the gradient of the real feature distribution. Third, zero-shot learning incurs high training cost and training complexity when recognition is based on the entire attribute set of the features; the proposed autoencoder extraction scheme, based on Wikipedia text and a hierarchical structure, extracts features from subsets of the attributes, uses the hierarchical structure to partition the subsets, screens out the useful information, and extracts the important feature information from the text. This reduces training cost and training complexity, and recognition on attribute subsets is more effective than recognition on the whole attribute set.
The method of this embodiment uses a generative adversarial network to perform zero-shot recognition; it can recognize sketches that carry no annotation information, improves zero-shot recognition accuracy, and improves the generalization ability of the model.
Fig. 7 is a schematic structural diagram of a zero-shot image recognition system based on a generative adversarial network according to an embodiment of the present invention. Referring to Fig. 7, the system includes:
a sample acquisition module 201, configured to acquire training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information.
A network model construction module 202, configured to construct a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator generates pseudo-semantic features from the real visual features; the visual feature generator generates pseudo-visual features from the pseudo-semantic features; the semantic discriminator discriminates between the real semantic features and the pseudo-semantic features; the visual discriminator discriminates between the real visual features and the pseudo-visual features.
A loss function construction module 203, configured to construct a multi-objective loss function; the multi-objective loss function comprises the cycle consistency loss function of the real visual features and the pseudo-visual features, the adversarial loss function of the semantic discriminator, the adversarial loss function of the visual discriminator, and the classification loss function of the semantic discriminator.
A training module 204, configured to take the training image samples as input to the generative adversarial network model and to iteratively train the model based on the multi-objective loss function to obtain a trained generative adversarial network model.
A test recognition module 205, configured to input the test image samples into the trained generative adversarial network model to obtain a recognition result.
As an optional implementation, the zero-shot image recognition system based on a generative adversarial network further includes:
a real semantic feature acquisition module, configured to input text from Wikipedia into the hierarchical model to obtain the useful information in the text, and to input that information into the autoencoder to obtain the real semantic features.
A real visual feature acquisition module, configured to input the training image samples into an attention-based CNN model to obtain the real visual features.
As an optional implementation, the network model construction module 202 specifically includes:
a first generator construction unit, configured to construct the semantic feature generator; the semantic feature generator comprises two convolution modules and two fully connected modules; each convolution module comprises a convolution layer, a max-pooling layer and a normalization layer connected in sequence; each fully connected module comprises a fully connected layer and a Leaky ReLU layer.
A second generator construction unit, configured to construct the visual feature generator; the visual feature generator comprises two fully connected modules, three 4096-dimensional fully connected layers, a resampling layer and five upsampling modules connected in sequence; each upsampling module comprises two upsampling layers and two Leaky ReLU layers, with the upsampling layers and the Leaky ReLU layers connected alternately.
A first discriminator construction unit, configured to construct the semantic discriminator; the semantic discriminator comprises one fully connected module, a two-way fully connected layer, an n-way fully connected layer, a binary classifier and an input-label classifier.
A second discriminator construction unit, configured to construct the visual discriminator; the visual discriminator comprises one fully connected module, a fully connected layer and a classifier.
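The layer inventory of the four networks can be written down as plain lists, which makes the module counts easy to check. This is only a bookkeeping sketch: the layer labels are hypothetical names, layer sizes other than the 4096-dimensional layers are not given in the text, and a flat list does not capture how the two output heads of the semantic discriminator branch.

```python
def repeat(block, times):
    """Concatenate `times` copies of a block of layer labels."""
    return [layer for _ in range(times) for layer in block]


# Building blocks as described in the text.
CONV_MODULE = ["conv", "max_pool", "norm"]
FC_MODULE = ["fc", "leaky_relu"]
UPSAMPLE_MODULE = ["upsample", "leaky_relu", "upsample", "leaky_relu"]

semantic_generator = repeat(CONV_MODULE, 2) + repeat(FC_MODULE, 2)

visual_generator = (
    repeat(FC_MODULE, 2)
    + ["fc_4096"] * 3
    + ["resample"]
    + repeat(UPSAMPLE_MODULE, 5)
)

# Components only; the two-way and n-way heads are parallel branches in practice.
semantic_discriminator = FC_MODULE + [
    "fc_2way", "fc_nway", "binary_classifier", "label_classifier",
]

visual_discriminator = FC_MODULE + ["fc", "classifier"]
```

Listing the layers this way shows, for example, that the visual feature generator contains ten upsampling layers in total, two in each of its five upsampling modules.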
As an optional implementation, the loss function construction module 203 specifically includes:
a first loss function construction unit, configured to construct the adversarial loss function of the semantic discriminator
where x denotes the real visual features and a denotes the real semantic features; G1(x) denotes the semantic generator with the visual feature x as input; D1(G1(x)) denotes the semantic discriminator with G1(x) as input; D1(a) denotes the semantic discriminator with the semantic feature a as input; Pf denotes the prior distribution of the real visual features; Pr denotes the prior distribution of the real semantic features; the interpolated feature is a linear interpolation between features and follows Pr,f, the prior distribution over the real visual features and the real semantic features; the first expectation term is the expectation over the pseudo-feature distribution; the second expectation term is the expectation over the real feature distribution; the gradient penalty term enforces the Lipschitz constraint; λ2CT|x',x'' is the consistency (continuity) term that adds a further constraint to the gradient penalty; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency (continuity) term; wherein,
x' and x'' both denote perturbed data near the real visual features; c is a fixed constant; D(x') denotes the semantic discriminator with x' as input and D(x'') denotes the semantic discriminator with x'' as input; ||D(x') - D(x'')|| is the distance between the two discriminator outputs, and ||x' - x''|| is the distance between the two perturbed features.
A second loss function construction unit, configured to construct the adversarial loss function of the visual discriminator
where ã denotes the pseudo-semantic features; D2(x) denotes the visual discriminator with the visual feature x as input; G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input; and D2(G2(ã)) denotes the visual discriminator with G2(ã) as input.
A third loss function construction unit, configured to construct the cycle consistency loss function of the real visual features and the pseudo-visual features
where E[||G2(G1(x)) - x||1] is the distribution expectation of the cycle-consistency distance between the two visual features, and the second expectation term is the corresponding distribution expectation for the two semantic features.
A fourth loss function construction unit, configured to construct the classification loss function of the semantic discriminator
Lcls = -E[log P(b | G1(a); θ)];
where P(b | G1(a); θ) denotes the class-conditional probability of the class label, G1(a) denotes the semantic generator with the semantic feature a as input, θ denotes the parameters of the classification network, and b is the class label of a.
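Among the four losses built by these units, the consistency term of the first is the least standard. A minimal sketch of that term for a single pair of perturbed samples follows, assuming a hinge max(0, ·), Euclidean distance between the perturbed features, and a hypothetical linear discriminator `toy_disc`; none of these choices is fixed by the text.

```python
import math


def consistency_term(disc, x1, x2, c):
    """max(0, |D(x') - D(x'')| / ||x' - x''|| - c) for one pair of perturbed samples."""
    dist_x = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
    if dist_x == 0.0:
        return 0.0  # identical inputs: the slope is undefined, so contribute nothing
    slope = abs(disc(x1) - disc(x2)) / dist_x
    return max(0.0, slope - c)


# Hypothetical toy discriminator: a linear score with slope 3 along the first axis.
def toy_disc(x):
    return 3.0 * x[0]


x_prime = [1.0, 0.0]
x_double_prime = [2.0, 0.0]

# Slope 3 exceeds c = 1, so the pair is penalized by the excess.
penalty = consistency_term(toy_disc, x_prime, x_double_prime, c=1.0)

# With a looser bound c = 5 the same pair incurs no penalty.
no_penalty = consistency_term(toy_disc, x_prime, x_double_prime, c=5.0)
```

In training, this quantity would be averaged over many perturbed pairs and scaled by λ2 before being added to the discriminator loss.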
As an optional implementation, the training module 204 specifically includes:
a training unit, configured to take the training image samples as input to the semantic feature generator and to jointly train the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by backpropagation according to the multi-objective loss function, so that the parameters of the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator are continually updated and optimized and a trained generative adversarial network model is obtained.
The zero-shot image recognition system based on a generative adversarial network in this embodiment uses a generative adversarial network to perform zero-shot recognition; it can recognize sketches that carry no annotation information, improves zero-shot recognition accuracy, and improves the generalization ability of the model.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.
Specific examples are used herein to explain the principles and embodiments of the present invention; the description of the embodiments is intended only to help in understanding the method and the core idea of the present invention. A person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and to the scope of application. In view of the above, the content of this specification should not be construed as limiting the present invention.
Claims (8)
1. A zero-shot image recognition method based on a generative adversarial network, characterized by comprising the following steps:
acquiring training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator generates pseudo-semantic features from the real visual features; the visual feature generator generates pseudo-visual features from the pseudo-semantic features; the semantic discriminator discriminates between the real semantic features and the pseudo-semantic features; the visual discriminator discriminates between the real visual features and the pseudo-visual features;
constructing a multi-objective loss function; the multi-objective loss function comprises the cycle consistency loss function of the real visual features and the pseudo-visual features, the adversarial loss function of the semantic discriminator, the adversarial loss function of the visual discriminator, and the classification loss function of the semantic discriminator;
taking the training image samples as input to the generative adversarial network model, and iteratively training the generative adversarial network model based on the multi-objective loss function to obtain a trained generative adversarial network model;
inputting the test image samples into the trained generative adversarial network model to obtain a recognition result;
wherein constructing the multi-objective loss function specifically comprises:
constructing the adversarial loss function of the semantic discriminator
where x denotes the real visual features and a denotes the real semantic features; G1(x) denotes the semantic generator with the visual feature x as input; D1(G1(x)) denotes the semantic discriminator with G1(x) as input; D1(a) denotes the semantic discriminator with the semantic feature a as input; Pf denotes the prior distribution of the real visual features; Pr denotes the prior distribution of the real semantic features; the interpolated feature is a linear interpolation between features and follows Pr,f, the prior distribution over the real visual features and the real semantic features;
the first expectation term is the expectation over the pseudo-feature distribution; the second expectation term is the expectation over the real feature distribution; the gradient penalty term enforces the Lipschitz constraint; λ2CT|x',x'' is the consistency (continuity) term that adds a further constraint to the gradient penalty; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency (continuity) term; wherein,
x' and x'' both denote perturbed data near the real visual features; c is a fixed constant; D(x') denotes the semantic discriminator with x' as input and D(x'') denotes the semantic discriminator with x'' as input; ||D(x') - D(x'')|| is the distance between the two discriminator outputs, and ||x' - x''|| is the distance between the two perturbed features;
constructing the adversarial loss function of the visual discriminator
where ã denotes the pseudo-semantic features; D2(x) denotes the visual discriminator with the visual feature x as input; G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input; and D2(G2(ã)) denotes the visual discriminator with G2(ã) as input;
constructing the cycle consistency loss function of the real visual features and the pseudo-visual features
where E[||G2(G1(x)) - x||1] is the distribution expectation of the cycle-consistency distance between the two visual features, and the second expectation term is the corresponding distribution expectation for the two semantic features;
constructing the classification loss function of the semantic classifier
Lcls = -E[log P(b | G1(a); θ)];
where P(b | G1(a); θ) denotes the class-conditional probability of the class label, G1(a) denotes the semantic generator with the semantic feature a as input, θ denotes the parameters of the classification network, and b is the class label of a.
2. The zero-shot image recognition method based on a generative adversarial network according to claim 1, further comprising, before constructing the generative adversarial network model:
inputting text from Wikipedia into a hierarchical model to obtain the useful information in the text, and inputting that information into an autoencoder to obtain the real semantic features;
and inputting the training image samples into an attention-based CNN model to obtain the real visual features.
3. The zero-shot image recognition method based on a generative adversarial network according to claim 1, wherein constructing the generative adversarial network model specifically comprises:
constructing the semantic feature generator; the semantic feature generator comprises two convolution modules and two fully connected modules; each convolution module comprises a convolution layer, a max-pooling layer and a normalization layer connected in sequence; each fully connected module comprises a fully connected layer and a Leaky ReLU layer;
constructing the visual feature generator; the visual feature generator comprises two fully connected modules, three 4096-dimensional fully connected layers, a resampling layer and five upsampling modules connected in sequence; each upsampling module comprises two upsampling layers and two Leaky ReLU layers, with the upsampling layers and the Leaky ReLU layers connected alternately;
constructing the semantic discriminator; the semantic discriminator comprises one fully connected module, a two-way fully connected layer, an n-way fully connected layer, a binary classifier and an input-label classifier;
constructing the visual discriminator; the visual discriminator comprises one fully connected module, a fully connected layer and a classifier.
4. The method according to claim 1, wherein taking the training image samples as input to the generative adversarial network model and iteratively training the generative adversarial network model based on the multi-objective loss function to obtain the trained generative adversarial network model specifically comprises:
taking the training image samples as input to the semantic feature generator, and jointly training the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by backpropagation according to the multi-objective loss function, so that the parameters of the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator are continually updated and optimized and a trained generative adversarial network model is obtained.
5. A zero-shot image recognition system based on a generative adversarial network, comprising:
a sample acquisition module, configured to acquire training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
a network model construction module, configured to construct a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator generates pseudo-semantic features from the real visual features; the visual feature generator generates pseudo-visual features from the pseudo-semantic features; the semantic discriminator discriminates between the real semantic features and the pseudo-semantic features; the visual discriminator discriminates between the real visual features and the pseudo-visual features;
a loss function construction module, configured to construct a multi-objective loss function; the multi-objective loss function comprises the cycle consistency loss function of the real visual features and the pseudo-visual features, the adversarial loss function of the semantic discriminator, the adversarial loss function of the visual discriminator, and the classification loss function of the semantic discriminator;
a training module, configured to take the training image samples as input to the generative adversarial network model and to iteratively train the generative adversarial network model based on the multi-objective loss function to obtain a trained generative adversarial network model;
a test recognition module, configured to input the test image samples into the trained generative adversarial network model to obtain a recognition result;
wherein the loss function construction module specifically comprises:
a first loss function construction unit, configured to construct the adversarial loss function of the semantic discriminator
where x denotes the real visual features and a denotes the real semantic features; G1(x) denotes the semantic generator with the visual feature x as input; D1(G1(x)) denotes the semantic discriminator with G1(x) as input; D1(a) denotes the semantic discriminator with the semantic feature a as input; Pf denotes the prior distribution of the real visual features; Pr denotes the prior distribution of the real semantic features; the interpolated feature is a linear interpolation between features and follows Pr,f, the prior distribution over the real visual features and the real semantic features; the first expectation term is the expectation over the pseudo-feature distribution; the second expectation term is the expectation over the real feature distribution; the gradient penalty term enforces the Lipschitz constraint; λ2CT|x',x'' is the consistency (continuity) term that adds a further constraint to the gradient penalty; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency (continuity) term; wherein,
x' and x'' both denote perturbed data near the real visual features; c is a fixed constant; D(x') denotes the semantic discriminator with x' as input and D(x'') denotes the semantic discriminator with x'' as input; ||D(x') - D(x'')|| is the distance between the two discriminator outputs, and ||x' - x''|| is the distance between the two perturbed features;
a second loss function construction unit, configured to construct the adversarial loss function of the visual discriminator
where ã denotes the pseudo-semantic features; D2(x) denotes the visual discriminator with the visual feature x as input; G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input; and D2(G2(ã)) denotes the visual discriminator with G2(ã) as input;
a third loss function construction unit, configured to construct the cycle consistency loss function of the real visual features and the pseudo-visual features
where E[||G2(G1(x)) - x||1] is the distribution expectation of the cycle-consistency distance between the two visual features, and the second expectation term is the corresponding distribution expectation for the two semantic features;
a fourth loss function construction unit, configured to construct the classification loss function of the semantic discriminator
Lcls = -E[log P(b | G1(a); θ)];
where P(b | G1(a); θ) denotes the class-conditional probability of the class label, G1(a) denotes the semantic generator with the semantic feature a as input, θ denotes the parameters of the classification network, and b is the class label of a.
6. The system according to claim 5, further comprising:
a real semantic feature acquisition module, configured to input text from Wikipedia into a hierarchical model to obtain the useful information in the text, and to input that information into an autoencoder to obtain the real semantic features;
and a real visual feature acquisition module, configured to input the training image samples into an attention-based CNN model to obtain the real visual features.
7. The zero-shot image recognition system based on a generative adversarial network according to claim 5, wherein the network model construction module specifically comprises:
a first generator construction unit, configured to construct the semantic feature generator; the semantic feature generator comprises two convolution modules and two fully connected modules; each convolution module comprises a convolution layer, a max-pooling layer and a normalization layer connected in sequence; each fully connected module comprises a fully connected layer and a Leaky ReLU layer;
a second generator construction unit, configured to construct the visual feature generator; the visual feature generator comprises two fully connected modules, three 4096-dimensional fully connected layers, a resampling layer and five upsampling modules connected in sequence; each upsampling module comprises two upsampling layers and two Leaky ReLU layers, with the upsampling layers and the Leaky ReLU layers connected alternately;
a first discriminator construction unit, configured to construct the semantic discriminator; the semantic discriminator comprises one fully connected module, a two-way fully connected layer, an n-way fully connected layer, a binary classifier and an input-label classifier;
a second discriminator construction unit, configured to construct the visual discriminator; the visual discriminator comprises one fully connected module, a fully connected layer and a classifier.
8. The system according to claim 5, wherein the training module specifically comprises:
a training unit, configured to take the training image samples as input to the semantic feature generator and to jointly train the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by backpropagation according to the multi-objective loss function, so that the parameters of the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator are continually updated and optimized and a trained generative adversarial network model is obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010263452.4A CN111476294B (en) | 2020-04-07 | 2020-04-07 | Zero sample image identification method and system based on generation countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111476294A (en) | 2020-07-31 |
CN111476294B (en) | 2022-03-22 |
Family
ID=71749908
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010263452.4A (Expired - Fee Related) CN111476294B (en) | 2020-04-07 | 2020-04-07 | Zero sample image identification method and system based on generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111476294B (en) |
Families Citing this family (66)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111950619B (en) * | 2020-08-05 | 2022-09-09 | 东北林业大学 | Active learning method based on dual-generation countermeasure network |
CN112069397B (en) * | 2020-08-21 | 2023-08-04 | 三峡大学 | Rumor detection method combining self-attention mechanism and generation of countermeasure network |
CN112001122B (en) * | 2020-08-26 | 2023-09-26 | 合肥工业大学 | Non-contact physiological signal measurement method based on end-to-end generation countermeasure network |
CN112199479B (en) * | 2020-09-15 | 2024-08-02 | 北京捷通华声科技股份有限公司 | Method, device, equipment and storage medium for optimizing language semantic understanding model |
CN112132197B (en) * | 2020-09-15 | 2024-07-09 | 腾讯科技(深圳)有限公司 | Model training, image processing method, device, computer equipment and storage medium |
CN112149802B (en) * | 2020-09-17 | 2022-08-09 | 广西大学 | Image content conversion method with consistent semantic structure |
CN112101470B (en) * | 2020-09-18 | 2023-04-11 | 上海电力大学 | Guide zero sample identification method based on multi-channel Gauss GAN |
CN112199637B (en) * | 2020-09-21 | 2024-04-12 | 浙江大学 | Regression modeling method for generating contrast network data enhancement based on regression attention |
CN112308113A (en) * | 2020-09-23 | 2021-02-02 | 济南浪潮高新科技投资发展有限公司 | Target identification method, device and medium based on semi-supervision |
CN112232378A (en) * | 2020-09-23 | 2021-01-15 | 中国人民解放军战略支援部队信息工程大学 | Zero-order learning method for fMRI visual classification |
CN112364138A (en) * | 2020-10-12 | 2021-02-12 | 上海交通大学 | Visual question-answer data enhancement method and device based on anti-attack technology |
CN112287779B (en) * | 2020-10-19 | 2022-03-25 | 华南农业大学 | Low-illuminance image natural illuminance reinforcing method and application |
CN112364894B (en) * | 2020-10-23 | 2022-07-08 | 天津大学 | Zero sample image classification method of countermeasure network based on meta-learning |
CN112415514B (en) * | 2020-11-16 | 2023-05-02 | 北京环境特性研究所 | Target SAR image generation method and device |
CN113191381B (en) * | 2020-12-04 | 2022-10-11 | 云南大学 | Image zero-order classification model based on cross knowledge and classification method thereof |
CN112560034B (en) * | 2020-12-11 | 2024-03-29 | 宿迁学院 | Malicious code sample synthesis method and device based on feedback type deep countermeasure network |
CN112667496B (en) * | 2020-12-14 | 2022-11-18 | 清华大学 | Black box countermeasure test sample generation method and device based on multiple prior |
CN112580722B (en) * | 2020-12-20 | 2024-06-14 | 大连理工大学人工智能大连研究院 | Generalized zero sample image recognition method based on conditional countermeasure automatic encoder |
CN112731327B (en) * | 2020-12-25 | 2023-05-23 | 南昌航空大学 | HRRP radar target identification method based on CN-LSGAN, STFT and CNN |
CN112700408B (en) * | 2020-12-28 | 2023-09-08 | 中国银联股份有限公司 | Model training method, image quality evaluation method and device |
CN112767505B (en) * | 2020-12-31 | 2023-12-22 | 深圳市联影高端医疗装备创新研究院 | Image processing method, training device, electronic terminal and storage medium |
CN112767507B (en) * | 2021-01-15 | 2022-11-18 | 大连理工大学 | Cartoon sketch coloring method based on dynamic memory module and generation confrontation network |
CN112766366A (en) * | 2021-01-18 | 2021-05-07 | 深圳前海微众银行股份有限公司 | Training method for resisting generation network and image processing method and device thereof |
CN112766386B (en) * | 2021-01-25 | 2022-09-20 | 大连理工大学 | Generalized zero sample learning method based on multi-input multi-output fusion network |
CN112818995B (en) * | 2021-01-27 | 2024-05-21 | 北京达佳互联信息技术有限公司 | Image classification method, device, electronic equipment and storage medium |
CN113283423B (en) * | 2021-01-29 | 2022-08-16 | 南京理工大学 | Natural scene distortion text image correction method and system based on generation network |
CN113221948B (en) * | 2021-04-13 | 2022-08-05 | 复旦大学 | Digital slice image classification method based on countermeasure generation network and weak supervised learning |
CN113222002B (en) * | 2021-05-07 | 2024-04-05 | 西安交通大学 | Zero sample classification method based on generative discriminative contrast optimization |
CN113140020B (en) * | 2021-05-13 | 2022-10-14 | 电子科技大学 | Method for generating image based on text of countermeasure network generated by accompanying supervision |
CN113269274B (en) * | 2021-06-18 | 2022-04-19 | 南昌航空大学 | Zero sample identification method and system based on cycle consistency |
CN113726545B (en) * | 2021-06-23 | 2022-12-23 | 清华大学 | Network traffic generation method and device for generating countermeasure network based on knowledge enhancement |
CN113378959B (en) * | 2021-06-24 | 2022-03-15 | 中国矿业大学 | Zero sample learning method for generating countermeasure network based on semantic error correction |
CN113706645A (en) * | 2021-06-30 | 2021-11-26 | 酷栈(宁波)创意科技有限公司 | Information processing method for landscape painting |
CN113609569B (en) * | 2021-07-01 | 2023-06-09 | 湖州师范学院 | Distinguishing type generalized zero sample learning fault diagnosis method |
CN113361646A (en) * | 2021-07-01 | 2021-09-07 | 中国科学技术大学 | Generalized zero sample image identification method and model based on semantic information retention |
CN113537322B (en) * | 2021-07-02 | 2023-04-18 | 电子科技大学 | Zero sample visual classification method for cross-modal semantic enhancement generation countermeasure network |
CN113505845A (en) * | 2021-07-23 | 2021-10-15 | 黑龙江省博雅智睿科技发展有限责任公司 | Deep learning training set image generation method based on language |
CN113706379B (en) * | 2021-07-29 | 2023-05-26 | 山东财经大学 | Interlayer interpolation method and system based on medical image processing |
CN113642621B (en) * | 2021-08-03 | 2024-06-28 | 南京邮电大学 | Zero sample image classification method based on generation countermeasure network |
CN113657272B (en) * | 2021-08-17 | 2022-06-28 | 山东建筑大学 | Micro video classification method and system based on missing data completion |
CN113746087B (en) * | 2021-08-19 | 2023-03-21 | 浙江大学 | Power grid transient stability sample controllable generation and evaluation method and system based on CTGAN |
CN113763442B (en) * | 2021-09-07 | 2023-06-13 | 南昌航空大学 | Deformable medical image registration method and system |
CN113762180B (en) * | 2021-09-13 | 2023-09-01 | 中国科学技术大学 | Training method and system for human body activity imaging based on millimeter wave radar signals |
CN113806584B (en) * | 2021-09-17 | 2022-10-14 | 河海大学 | Self-supervision cross-modal perception loss-based method for generating command actions of band |
CN114154550B (en) * | 2021-10-12 | 2024-10-18 | 清华大学 | Domain name countermeasure sample generation method and device |
CN114067195B (en) * | 2021-10-20 | 2024-08-13 | 北京航天自动控制研究所 | Target detector learning method based on generated countermeasure |
CN114373077A (en) * | 2021-12-07 | 2022-04-19 | 燕山大学 | Sketch identification method based on double-layer structure |
CN114359659B (en) * | 2021-12-17 | 2024-09-06 | 华南理工大学 | Attention disturbance-based automatic image labeling method, system and medium |
CN114176549B (en) * | 2021-12-23 | 2024-04-16 | 杭州电子科技大学 | Fetal heart rate signal data enhancement method and device based on generation type countermeasure network |
CN114387444B (en) * | 2021-12-24 | 2024-10-15 | 大连理工大学 | Zero sample classification method based on negative boundary triplet loss and data enhancement |
CN114005005B (en) * | 2021-12-30 | 2022-03-22 | 深圳佑驾创新科技有限公司 | Double-batch standardized zero-instance image classification method |
CN114511737B (en) * | 2022-01-24 | 2022-09-09 | 北京建筑大学 | Training method of image recognition domain generalization model |
CN114519118A (en) * | 2022-02-21 | 2022-05-20 | 安徽大学 | Zero sample sketch retrieval method based on multiple times of GAN and semantic cycle consistency |
CN114998124B (en) * | 2022-05-23 | 2024-06-18 | 北京航空航天大学 | Image sharpening processing method for target detection |
CN115187467B (en) * | 2022-05-31 | 2024-07-02 | 北京昭衍新药研究中心股份有限公司 | Enhanced virtual image data generation method based on generation countermeasure network |
CN114723611B (en) * | 2022-06-10 | 2022-09-30 | 季华实验室 | Image reconstruction model training method, reconstruction method, device, equipment and medium |
CN114757342B (en) * | 2022-06-14 | 2022-09-09 | 南昌大学 | Electronic data information evidence-obtaining method based on confrontation training |
CN115314254B (en) * | 2022-07-07 | 2023-06-23 | 中国人民解放军战略支援部队信息工程大学 | Semi-supervised malicious traffic detection method based on improved WGAN-GP |
CN115308705A (en) * | 2022-08-05 | 2022-11-08 | 北京理工大学 | Multi-pose extremely narrow pulse echo generation method based on generation countermeasure network |
CN115222752B (en) * | 2022-09-19 | 2023-01-24 | 之江实验室 | Pathological image feature extractor training method and device based on feature decoupling |
CN115424119B (en) * | 2022-11-04 | 2023-03-24 | 之江实验室 | Image generation training method and device capable of explaining GAN based on semantic fractal |
CN115527216B (en) * | 2022-11-09 | 2023-05-23 | 中国矿业大学(北京) | Text image generation method based on modulation fusion and antagonism network generation |
CN116579414B (en) * | 2023-03-24 | 2024-04-02 | 浙江医准智能科技有限公司 | Model training method, MRI thin layer data reconstruction method, device and equipment |
CN117541883B (en) * | 2024-01-09 | 2024-04-09 | 四川见山科技有限责任公司 | Image generation model training, image generation method, system and electronic equipment |
CN117610614B (en) * | 2024-01-11 | 2024-03-22 | 四川大学 | Attention-guided generation countermeasure network zero sample nuclear power seal detection method |
CN117934930B (en) * | 2024-01-12 | 2024-09-10 | 西南计算机有限责任公司 | Target identification method based on unmanned platform |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108875818A (en) * | 2018-06-06 | 2018-11-23 | 西安交通大学 | Based on variation from code machine and confrontation network integration zero sample image classification method |
CN109460814A (en) * | 2018-09-28 | 2019-03-12 | 浙江工业大学 | A kind of deep learning classification method for attacking resisting sample function with defence |
CN109816032A (en) * | 2019-01-30 | 2019-05-28 | 中科人工智能创新技术研究院(青岛)有限公司 | Zero sample classification method and apparatus of unbiased mapping based on production confrontation network |
CN110334781A (en) * | 2019-06-10 | 2019-10-15 | 大连理工大学 | A kind of zero sample learning algorithm based on Res-Gan |
CN110443293A (en) * | 2019-07-25 | 2019-11-12 | 天津大学 | Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing |
CN110490946A (en) * | 2019-07-15 | 2019-11-22 | 同济大学 | Text generation image method based on cross-module state similarity and generation confrontation network |
CN110795585A (en) * | 2019-11-12 | 2020-02-14 | 福州大学 | Zero sample image classification model based on generation countermeasure network and method thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10810767B2 (en) * | 2018-06-12 | 2020-10-20 | Siemens Healthcare Gmbh | Machine-learned network for Fourier transform in reconstruction for medical imaging |
Non-Patent Citations (3)
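Title |
---|
"Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks"; Jun-Yan Zhu et al.; 2017 IEEE International Conference on Computer Vision (ICCV); 2017-12-25; pp. 2242-2251 * |
"Zero-shot attribute recognition based on de-redundant features and semantic relationship constraints"; Zhang Guimei et al.; Pattern Recognition and Artificial Intelligence; 2021-09-30; Vol. 34, No. 9; pp. 809-823 * |
"Zero-shot text recognition combining transfer guidance and a bidirectional cycle-structure GAN"; Zhang Guimei et al.; Pattern Recognition and Artificial Intelligence; 2020-12-31; Vol. 33, No. 12; pp. 1083-1096 * |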
Also Published As
Publication number | Publication date |
---|---|
CN111476294A (en) | 2020-07-31 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111476294B (en) | Zero sample image identification method and system based on generation countermeasure network | |
CN110147457B (en) | Image-text matching method, device, storage medium and equipment | |
CN108875818B (en) | Zero sample image classification method based on combination of variational self-coding machine and antagonistic network | |
CN110059217B (en) | Image text cross-media retrieval method for two-stage network | |
CN111581405A (en) | Cross-modal generalization zero sample retrieval method for generating confrontation network based on dual learning | |
CN112966127A (en) | Cross-modal retrieval method based on multilayer semantic alignment | |
CN110232395B (en) | Power system fault diagnosis method based on fault Chinese text | |
Chen | Model reprogramming: Resource-efficient cross-domain machine learning | |
CN112232053B (en) | Text similarity computing system, method and storage medium based on multi-keyword pair matching | |
CN112732916A (en) | BERT-based multi-feature fusion fuzzy text classification model | |
CN110287354A (en) | A kind of high score remote sensing images semantic understanding method based on multi-modal neural network | |
Wang et al. | Advanced Multimodal Deep Learning Architecture for Image-Text Matching | |
CN117725261A (en) | Cross-modal retrieval method, device, equipment and medium for video text | |
Tang et al. | Class-level prototype guided multiscale feature learning for remote sensing scene classification with limited labels | |
CN115204171A (en) | Document-level event extraction method and system based on hypergraph neural network | |
Nijhawan et al. | VTnet+ Handcrafted based approach for food cuisines classification | |
CN113222002A (en) | Zero sample classification method based on generative discriminative contrast optimization | |
Li et al. | Evaluating BERT on cloud-edge time series forecasting and sentiment analysis via prompt learning | |
Fang | Detection of white blood cells using YOLOV3 network | |
Xie et al. | Full-view salient feature mining and alignment for text-based person search | |
CN115640418A (en) | Cross-domain multi-view target website retrieval method and device based on residual semantic consistency | |
Li et al. | ViT2CMH: Vision Transformer Cross-Modal Hashing for Fine-Grained Vision-Text Retrieval. | |
CN113723111B (en) | Small sample intention recognition method, device, equipment and storage medium | |
Singh et al. | Visual content generation from textual description using improved adversarial network | |
Wang et al. | Contrastive embedding-based feature generation for generalized zero-shot learning |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220322 |