CN111914929B - Zero sample learning method - Google Patents
Zero sample learning method
- Publication number
- CN111914929B (application CN202010750578.4A)
- Authority
- CN
- China
- Prior art keywords
- network
- visual
- features
- zero
- visual feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention provides a zero sample learning method that identifies unseen data classes by migrating knowledge from visible class samples to invisible class samples. The method mainly comprises the following steps: acquiring a training feature data set; building a zero sample learning model based on a generation network, a noise self-encoder, a regression network and a discrimination network; training the generation network and the noise self-encoder; training the discrimination network; and obtaining a total objective function and iterating to optimize the algorithm. Through knowledge migration, the method fuses two kinds of semantic features, attributes and word vectors, trains under a countermeasure mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network, thereby effectively alleviating the domain shift problem of the model's predictions, enabling samples that are difficult to label to be identified, and reducing the identification cost.
Description
Technical Field
The invention relates to a zero sample learning method, and belongs to the field of pattern recognition.
Background
With the development of deep learning, the performance of computer vision and machine learning methods has improved greatly, and deep learning models have achieved surprising success in image classification, even rivaling human recognition capability. Humans, however, have a natural advantage in recognizing novel objects: a person can identify an object heard of or seen only a few times before, and possibly even an object never encountered. The most fundamental reason for this difference is that deep models rely on fully supervised learning. Training neural networks therefore requires a large amount of labeled data, and in practice, because there are tens of thousands of species in nature, collecting and annotating visual data is cumbersome and expensive. This motivates a new task: solving the image annotation problem by transferring knowledge from visible class samples to invisible class samples, so that invisible class samples can be identified.
Zero sample learning, in which the sets of visible and invisible classes are generally assumed to be disjoint, is currently receiving increasing attention. Some samples in the feature space are labeled; these are called visible class samples, and only their visual instances are used to train the model. The remaining unlabeled sample instances in the feature space belong to what are called invisible classes. The feature space consists of vectors extracted from samples by a neural network, and each sample belongs to one category. To link visible class samples with invisible class samples, semantic features are typically introduced in zero sample learning. Attributes are the most common semantic features, but manually labeling semantic attributes for the visual features of every class is a time-consuming and labor-intensive task. Natural language processing techniques provide alternative semantic features (such as GloVe word vectors) that can be obtained directly from text such as Wikipedia articles; however, because these semantic features are coarser and are acquired without visual supervision, their performance is poorer than that of attribute features.
In view of the above, it is necessary to provide a zero sample learning method to solve the above problems.
Disclosure of Invention
The invention aims to provide a zero sample learning method that can identify samples which are difficult to label, thereby reducing the identification cost.
In order to achieve the above object, the present invention provides a zero sample learning method for performing knowledge migration from visible class samples to invisible class samples to identify unseen data classes, which mainly comprises the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network, the noise self-encoder, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through a countermeasure mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
step 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the purpose of algorithm optimization.
Optionally, in step 1, the tags include quantity tags and category tags, and the semantic features include word vectors and attribute features.
Optionally, in step 2, a feedforward neural network is used for data transmission among the generation network, the noise self-encoder, the regression network, and the discrimination network.
Optionally, in step 3, the generation network generates a first visual feature from an attribute feature and Gaussian random noise; the noise self-encoder generates a second visual feature from a word vector, a latent variable, and Gaussian random noise.
Optionally, in step 3, the formula for fusing the first visual feature and the second visual feature into the pseudo visual feature is:

x_f = λ·x_1 + (1 − λ)·x_2,

where x_f is the pseudo visual feature, λ is the corresponding weight, x_1 is the first visual feature and x_2 is the second visual feature; the weights of the two parts sum to 1.
Optionally, in step 4, the countermeasure mechanism may be expressed as:

L_WGAN = E[D(x)] − E[D(x_f)] − λ_1·E[(‖∇_x̂D(x̂)‖_2 − 1)^2],

where x̂ = αx + (1 − α)x_f with α ~ U(0, 1), and λ_1 weights the gradient penalty.
Optionally, in step 4, the distributions of the pseudo visual features and the real visual features are both constrained by a least squares loss:

L_1 = E[‖x − x_f‖_2^2],

where x is the real visual feature and x_f is the pseudo visual feature.
Optionally, the total objective function in step 6 is:

L = L_WGAN + L_1 + λ_2·L_R,

where λ_2 is a hyperparameter that assigns weights to the different parts.
Optionally, in step 6, Adam is used as an optimizer to perform algorithm optimization.
Optionally, the method further comprises step 7: the generation network trained in step 3 is used to generate visual features for the invisible class samples, and the invisible class samples are classified to test the total objective function in step 6.
The invention has the following beneficial effects: through knowledge migration, the method fuses two kinds of semantic features, attributes and word vectors, trains under a countermeasure mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network, thereby effectively alleviating the domain shift problem of the model's predictions, enabling samples that are difficult to label to be identified, and reducing the identification cost.
Drawings
FIG. 1 is a flow chart of the zero sample learning method of the present invention.
FIG. 2 is a flow chart of generating the first visual feature in the zero sample learning method of the present invention.
Fig. 3 is a flow chart of the generation of the second visual feature in the zero sample learning method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, the present invention discloses a zero sample learning method for performing knowledge migration from visible class samples to invisible class samples to identify unseen data classes, which mainly comprises the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network, the noise self-encoder, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through a countermeasure mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
step 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the purpose of algorithm optimization.
Step 1 to step 6 will be described in detail below.
In step 1, the training feature data set consists of 2048-dimensional visual features extracted by a deep convolutional neural network, stored as a set of vectors; the labels comprise quantity labels and category labels, and the semantic features comprise word vectors and attribute features. The 2048-dimensional visual features, taken from the top-layer pooling unit of the deep convolutional neural network model ResNet-101, show excellent performance. For the AwA1 and AwA2 databases, in addition to using attributes as semantic features, each category is represented by a word vector of dimension 1000. Specifically, a natural language processing technique is used to extract a word vector for each class from a large text corpus. As for the attribute features, the dimensions of the continuous-valued semantic attributes are shown in Table 1 below.
TABLE 1: dimensions of the semantic attributes for each dataset
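By way of illustration only (this code is not part of the patent), 2048-dimensional features like those described above could be extracted from the top-layer pooling unit of ResNet-101 roughly as follows; the torchvision weight choice, preprocessing constants and file path are assumptions:

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Minimal sketch: extract a 2048-d visual feature from the top-layer
# pooling unit of ResNet-101 by dropping its final classification layer.
resnet = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
extractor = torch.nn.Sequential(*list(resnet.children())[:-1])  # drop the FC head
extractor.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def extract_feature(image_path: str) -> torch.Tensor:
    """Return the 2048-d visual feature vector for one image."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        feat = extractor(img)          # shape (1, 2048, 1, 1)
    return feat.flatten(1).squeeze(0)  # shape (2048,)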
In step 2, the generation network, the noise autoencoder, the regression network and the discrimination network use a feedforward neural network for data transfer.
The noise self-encoder has two hidden fully connected layers of 1200 and 600 units in its encoder, and its decoder is realized with hidden fully connected layers of 2048 and 4096 units; the discrimination network is realized with a single 512-unit hidden fully connected layer; the regression network has only one hidden layer of 600 units. All noise dimensions in the invention are 100. Pseudo visual features can thus be synthesized from two different semantic features respectively, while the regression network and the discrimination network simultaneously perform semantic reasoning and the related constraints.
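One possible PyTorch reading of the layer sizes above is sketched below; only the unit counts follow the description, while the activations, the conditioning of the discrimination network on the attribute, and the generator's 4096-unit hidden width are assumptions:

```python
import torch
import torch.nn as nn

NOISE_DIM = 100   # all noise dimensions are 100
FEAT_DIM = 2048   # ResNet-101 visual feature dimension

class Generator(nn.Module):
    """Generation network: (attribute feature, Gaussian noise) -> first visual feature."""
    def __init__(self, attr_dim, hidden=4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + NOISE_DIM, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, FEAT_DIM), nn.ReLU())
    def forward(self, a, z):
        return self.net(torch.cat([a, z], dim=1))

class NoiseEncoder(nn.Module):
    """Noise self-encoder, encoder half: hidden FC layers of 1200 and 600 units."""
    def __init__(self, word_dim, latent_dim=NOISE_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + word_dim, 1200), nn.LeakyReLU(0.2),
            nn.Linear(1200, 600), nn.LeakyReLU(0.2),
            nn.Linear(600, latent_dim))
    def forward(self, x, w):
        return self.net(torch.cat([x, w], dim=1))

class NoiseDecoder(nn.Module):
    """Decoder half: (word vector, latent variable, noise) -> second visual feature,
    with 2048- and 4096-unit hidden FC layers."""
    def __init__(self, word_dim, latent_dim=NOISE_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(word_dim + latent_dim + NOISE_DIM, 2048), nn.LeakyReLU(0.2),
            nn.Linear(2048, 4096), nn.LeakyReLU(0.2),
            nn.Linear(4096, FEAT_DIM), nn.ReLU())
    def forward(self, w, z_lat, z_noise):
        return self.net(torch.cat([w, z_lat, z_noise], dim=1))

class Critic(nn.Module):
    """Discrimination network: one 512-unit hidden FC layer, outputs a WGAN critic score."""
    def __init__(self, attr_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + attr_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1))
    def forward(self, x, a):
        return self.net(torch.cat([x, a], dim=1))

class Regressor(nn.Module):
    """Regression network: one 600-unit hidden layer, visual -> semantic."""
    def __init__(self, attr_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, 600), nn.LeakyReLU(0.2),
            nn.Linear(600, attr_dim))
    def forward(self, x):
        return self.net(x)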
In step 3, the purpose of the generation network is to learn the probability distribution of the data points, so that samples can be drawn from it for the data enhancement mechanism. As one of the most promising generative models, generative adversarial models are widely studied.
As shown in FIG. 2, given the training data D_tr of the seen classes, the aim is to learn a generation network G: Z × C → X that takes Gaussian random noise z and an attribute feature a_i ∈ R^q as input and outputs the first visual feature. Once the generation network has learned to generate first visual features from the visible class samples, attribute features can be embedded to generate visual features of any invisible class. The generation network can be learned with the following optimization function:

L_G = −E[D(x_1)],

where x_1 = G(z, a_i) is the i-th generated first visual feature in visual space, with corresponding attribute feature a_i and noise z.
The invention also uses a noise self-encoder, taking a word vector as the semantic feature, to obtain the second visual feature. The word vector is another kind of semantic information: it describes the invisible class samples from a different angle and supplements the attribute features. To this end, the WAE is extended into a conditional WAE.
As shown in FIG. 3, given the conditional information (the word vector), the noise self-encoder is used to learn a probability distribution Q(Z|X), where Q is the distribution over the latent space and P_Z is an isotropic Gaussian prior on Z. Specifically, a discrimination network is introduced in the latent space Z, whose goal is to distinguish the "fake" points sampled from the prior P_Z from the "true" points sampled from Q(Z|X). The decoder takes the word vector w_i ∈ R^p and a latent variable drawn from Q(Z|X) as input and then generates the second visual feature. The loss function of the conditional WAE is defined as:

L_WAE = E[c(x, x_2)] + λ·D_Z(Q_Z, P_Z),

where x_2 is the i-th generated second visual feature in visual space, with corresponding word vector w_i, and c(·, ·) is a reconstruction cost. D_Z(Q_Z, P_Z) = D_JS(Q_Z, P_Z) is chosen and estimated with adversarial training, and λ > 0 is a hyperparameter.
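A minimal sketch of these two loss terms, assuming a squared-error reconstruction cost and a standard GAN-style estimator of the latent JS divergence (both assumptions; `latent_disc` is a hypothetical small latent-space discriminator):

```python
import torch
import torch.nn.functional as F

def conditional_wae_losses(x, x2_hat, z_q, latent_disc, lam=1.0):
    """Reconstruction cost c(x, x_2) plus lam * D_Z(Q_Z, P_Z), with the
    divergence between encoded samples z_q ~ Q(Z|X) and the isotropic
    Gaussian prior P_Z estimated adversarially in latent space."""
    z_p = torch.randn_like(z_q)          # samples from the prior P_Z
    # Latent discriminator: tell prior samples apart from encoded samples.
    s_p = latent_disc(z_p)
    s_qd = latent_disc(z_q.detach())
    d_loss = (F.binary_cross_entropy_with_logits(s_p, torch.ones_like(s_p))
              + F.binary_cross_entropy_with_logits(s_qd, torch.zeros_like(s_qd)))
    # Encoder/decoder: reconstruct x and push Q_Z toward P_Z by fooling the critic.
    recon = F.mse_loss(x2_hat, x)
    s_q = latent_disc(z_q)
    adv = F.binary_cross_entropy_with_logits(s_q, torch.ones_like(s_q))
    return recon + lam * adv, d_loss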
The invention chooses to fuse the first visual feature and the second visual feature. On the one hand, based on human experience in learning to identify new objects, attribute features contain more effective semantic information than word vectors. On the other hand, compared with the real visual features, the generated pseudo visual features contain a large amount of invalid information, and this invalid information can be suppressed through feature fusion so as to preserve the validity of the information.
Based on the above, the pseudo visual features generated from the attribute features and from the word vector should be given different weights. The formula for fusing the first visual feature and the second visual feature into the pseudo visual feature is:

x_f = λ·x_1 + (1 − λ)·x_2,

where x_f is the pseudo visual feature, λ is the corresponding weight, x_1 is the first visual feature and x_2 is the second visual feature; the weights of the two parts sum to 1.
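As a sketch (the weight value 0.7 is an illustrative assumption; the description only requires that the two weights sum to 1):

```python
import torch

def fuse_features(x1: torch.Tensor, x2: torch.Tensor, lam: float = 0.7) -> torch.Tensor:
    """Weighted fusion into the pseudo visual feature x_f = lam*x1 + (1-lam)*x2."""
    return lam * x1 + (1.0 - lam) * x2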
In step 4, a discrimination network is used to classify the pseudo visual features and the real visual features. The pseudo visual features should deceive the discrimination network as successfully as possible, and as the discrimination capability of the network keeps improving, the generation network and the discrimination network are optimized through the countermeasure mechanism, so the quality of the generated pseudo visual features keeps improving as well. The invention uses the improved WGAN for adversarial training, and the adversarial process of training the discrimination network can be expressed as:

L_WGAN = E[D(x)] − E[D(x_f)] − λ_1·E[(‖∇_x̂D(x̂)‖_2 − 1)^2],

where x is the real visual feature, x_f = G(a, w, z), x̂ = αx + (1 − α)x_f, and α ~ U(0, 1); the first two terms in the equation approximate the Wasserstein distance, while the third term is the gradient penalty, forcing the gradient of D to have unit norm along straight lines between real and generated points.
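A sketch of this critic loss, assuming the usual improved-WGAN gradient-penalty coefficient of 10 and the `Critic` interface from the architecture sketch above:

```python
import torch

def wgan_gp_critic_loss(D, x_real, x_fake, a, gp_weight=10.0):
    """The first two terms approximate the Wasserstein distance; the third
    is the gradient penalty on interpolates x_hat = alpha*x + (1-alpha)*x_f,
    alpha ~ U(0, 1). Minimizing this trains the critic D."""
    alpha = torch.rand(x_real.size(0), 1, device=x_real.device)
    x_hat = (alpha * x_real + (1 - alpha) * x_fake).requires_grad_(True)
    grads = torch.autograd.grad(D(x_hat, a).sum(), x_hat, create_graph=True)[0]
    penalty = ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
    return D(x_fake, a).mean() - D(x_real, a).mean() + gp_weight * penalty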
Although the countermeasure mechanism enables the generation network to produce realistic visual features with a similar distribution, this alone is not sufficient to ensure that the fused pseudo visual features are valid. Therefore, the distributions of the fused pseudo visual features and the real visual features are further constrained by a least squares loss:

L_1 = E[‖x − x_f‖_2^2].
in step 5, the regression network takes the pseudo-visual features as input and then converts the pseudo-visual features into semantic features. The generation network and the regression network together form a dual learning framework so they can learn each other. In the present invention, the main task is to generate visual features conditioned on class embedding, while the dual task is to return visual features to the corresponding class semantic space.
The real visual feature sampled from the training feature data set is x, and the fused pseudo visual feature is x_f. With paired training data (x, a), the regression network R can be trained under the supervised loss:

L_R = E[‖R(x_f) − a‖_2^2].
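A minimal sketch of this supervised regression loss, assuming mean squared error as the distance:

```python
import torch.nn.functional as F

def regression_loss(R, x_f, a):
    """Map the fused pseudo visual feature back to its paired class
    semantics a and penalize the gap."""
    return F.mse_loss(R(x_f), a)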
in step 6, the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network are added to obtain a total objective function:
L=L WGAN +L 1 +λ 2 *L R ,
wherein λ is 2 Is a hyper-parameter that assigns weights in different parts.
Adam is selected as the optimizer, with parameters (β_1, β_2) set to (0.9, 0.999). First, the discrimination network is trained and its parameters optimized with a learning rate of 0.00001; the discrimination network parameters are then fixed, and the generation network and the regression network are trained and optimized with a learning rate of 0.0001. All modules of the zero sample learning model are trained with a batch size of 128; each data set is trained for 1000 epochs, the model parameters are saved every 10 epochs, and the test set is evaluated.
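Putting the pieces together, a skeleton of this schedule might look as follows; `loader`, `lambda2` and the exact alternation are assumptions, `G`, `D`, `R`, `encoder`, `decoder` and the loss helpers are the earlier sketches, and the conditional-WAE terms are omitted here for brevity:

```python
import torch
import torch.nn.functional as F

# Adam with betas (0.9, 0.999); critic lr 0.00001, generator/regressor lr
# 0.0001; batch size 128; 1000 epochs with checkpoints every 10 epochs.
opt_d = torch.optim.Adam(D.parameters(), lr=1e-5, betas=(0.9, 0.999))
opt_g = torch.optim.Adam(
    list(G.parameters()) + list(R.parameters())
    + list(encoder.parameters()) + list(decoder.parameters()),
    lr=1e-4, betas=(0.9, 0.999))

for epoch in range(1000):
    for x_real, a, w in loader:                  # batches of 128 samples
        z = torch.randn(x_real.size(0), NOISE_DIM)
        x1 = G(a, z)                             # first visual feature
        x2 = decoder(w, encoder(x_real, w), z)   # second visual feature
        x_f = fuse_features(x1, x2)              # fused pseudo visual feature

        # 1) train the discrimination network, then fix its parameters
        opt_d.zero_grad()
        wgan_gp_critic_loss(D, x_real, x_f.detach(), a).backward()
        opt_d.step()

        # 2) train the generation, autoencoder and regression networks
        opt_g.zero_grad()
        loss = (-D(x_f, a).mean()                # adversarial term
                + F.mse_loss(x_f, x_real)        # least-squares constraint L_1
                + lambda2 * regression_loss(R, x_f, a))
        loss.backward()
        opt_g.step()

    if epoch % 10 == 0:                          # save model parameters
        torch.save({"G": G.state_dict(), "R": R.state_dict()},
                   f"checkpoint_{epoch}.pt")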
Preferably, the invention further comprises step 7, specifically: the generation network trained in step 3 is used to generate visual features for the invisible class samples, and these features are classified to test the total objective function in step 6.
After the model is trained, to predict the labels of the invisible class samples, new samples are first generated for each invisible class. These synthesized samples are then combined with the other samples in the training data, after which any new classifier covering both visible and invisible classes can be trained on this new data set.
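A sketch of this synthesis step, together with the simple 1-NN classification used for the comparisons below; the number of synthesized features per class is an illustrative assumption:

```python
import torch

def synthesize_unseen(G, unseen_attrs, n_per_class=300):
    """Synthesize visual features for every invisible class from its
    attribute vector with the trained generation network, so that any new
    classifier over visible and invisible classes can be trained on the
    enlarged set."""
    feats, labels = [], []
    for label, a in unseen_attrs.items():        # {class id: attribute vector}
        z = torch.randn(n_per_class, NOISE_DIM)
        with torch.no_grad():
            feats.append(G(a.expand(n_per_class, -1), z))
        labels += [label] * n_per_class
    return torch.cat(feats), torch.tensor(labels)

def one_nn_predict(x, class_means, class_ids):
    """Simple 1-NN classifier over per-class mean features;
    class_ids is a 1-D tensor of class labels."""
    dists = torch.cdist(x, class_means)          # (n_samples, n_classes)
    return class_ids[dists.argmin(dim=1)]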
Further, the present invention was compared with 15 methods, including: SSE, LATEM, ALE, DEVISE, SJE, ESZSL, SYNC, SAE, DEM, Relation Net, PSR-ZSL, SP-AEN, CAPD, CVAE, and GDAN.
For a fair comparison with these baselines, a simple 1-NN classifier is applied for testing. The results in Table 2 below compare GDFN with state-of-the-art generalized zero sample learning methods.
TABLE 2 results of the generalized zero sample learning method on four reference datasets
From the results, it can be seen that the present invention achieves good results on the generalized zero sample learning datasets.
For the CUB dataset, the invention achieves good results on the invisible categories and the highest accuracy on the visible categories. It also performs well in terms of the harmonic mean, which again indicates that the invention maintains a good prediction balance between visible and invisible classes and shows better performance than previous models.
For the AwA2 dataset, the invention performs better than the latest methods (e.g., SP-AEN and PSR-ZSL) in terms of unseen-class accuracy and harmonic mean, and shows higher accuracy on the visible class samples.
For the SUN dataset, the method identifies both visible and invisible category samples with high accuracy, and clearly improves the identification and classification of the visible category samples.
For the aPY dataset, the similarity between the attribute variance of the unrelated training images and that of the test images is much smaller than for the other datasets, indicating that the unseen classes are difficult to synthesize and classify. Although prior methods have relatively low accuracy on invisible class identification, good results are obtained when testing this dataset with the present invention. Prior methods achieve high accuracy on the visible class samples; the invention likewise achieves high accuracy on the visible classes while realizing a balance between visible and invisible classes, providing an accurate aPY harmonic mean accuracy.
In summary, the invention fuses the two semantic features, attributes and word vectors, through knowledge migration, trains under a countermeasure mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network, thereby effectively alleviating the domain shift problem of the model's predictions, identifying samples that are difficult to label, and reducing the identification cost.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.
Claims (10)
1. A zero sample learning method for migrating knowledge from visible class samples to invisible class samples so as to identify data classes that have never been seen, mainly comprising the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network GAN, the noise self-encoder WAE, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through a countermeasure mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
and 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the purpose of algorithm optimization.
2. The zero-sample learning method according to claim 1, characterized in that: in step 1, the labels include quantity labels and category labels, and the semantic features include word vectors and attribute features.
3. The zero-sample learning method according to claim 1, characterized in that: in step 2, a feedforward neural network is used for data transmission among the generation network, the noise self-encoder, the regression network and the discrimination network.
4. The zero-sample learning method according to claim 2, characterized in that: in step 3, the generation network generates a first visual feature from an attribute feature and Gaussian random noise; the noise self-encoder generates a second visual feature from a word vector, a latent variable, and Gaussian random noise.
5. The zero-sample learning method according to claim 1, wherein in step 3, the formula for fusing the first visual feature and the second visual feature into the pseudo visual feature is: x_f = λ·x_1 + (1 − λ)·x_2, where x_f is the pseudo visual feature, λ is the corresponding weight, x_1 is the first visual feature and x_2 is the second visual feature, the weights of the two parts summing to 1.
6. The zero-sample learning method according to claim 1, wherein in step 4, the countermeasure mechanism is expressed as: L_WGAN = E[D(x)] − E[D(x_f)] − λ_1·E[(‖∇_x̂D(x̂)‖_2 − 1)^2], where x̂ = αx + (1 − α)x_f and α ~ U(0, 1).
7. The zero-sample learning method of claim 6, characterized in that: in step 4, the distributions of the pseudo visual features and the real visual features are both constrained by the least squares loss: L_1 = E[‖x − x_f‖_2^2], where x is the real visual feature and x_f is the pseudo visual feature.
8. The zero-sample learning method of claim 7, wherein the total objective function in step 6 is: L = L_WGAN + L_1 + λ_2·L_R, where λ_2 is a hyperparameter that assigns weights to the different parts.
9. The zero-sample learning method according to claim 1, characterized in that: in step 6, algorithm optimization is performed using Adam as an optimizer.
10. The zero-sample learning method according to claim 1, characterized in that: the method further comprises step 7, in which the generation network trained in step 3 is used to generate visual features for the invisible class samples, and the invisible class samples are classified to test the total objective function in step 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010750578.4A CN111914929B (en) | 2020-07-30 | 2020-07-30 | Zero sample learning method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010750578.4A CN111914929B (en) | 2020-07-30 | 2020-07-30 | Zero sample learning method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111914929A CN111914929A (en) | 2020-11-10 |
CN111914929B true CN111914929B (en) | 2022-08-23 |
Family
ID=73286794
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010750578.4A Active CN111914929B (en) | 2020-07-30 | 2020-07-30 | Zero sample learning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111914929B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191381B (en) * | 2020-12-04 | 2022-10-11 | 云南大学 | Image zero-order classification model based on cross knowledge and classification method thereof |
CN112580722B (en) * | 2020-12-20 | 2024-06-14 | 大连理工大学人工智能大连研究院 | Generalized zero sample image recognition method based on conditional countermeasure automatic encoder |
CN112674709B (en) * | 2020-12-22 | 2022-07-29 | 泉州装备制造研究所 | Amblyopia detection method based on anti-noise |
CN112766386B (en) * | 2021-01-25 | 2022-09-20 | 大连理工大学 | Generalized zero sample learning method based on multi-input multi-output fusion network |
CN113222002B (en) * | 2021-05-07 | 2024-04-05 | 西安交通大学 | Zero sample classification method based on generative discriminative contrast optimization |
CN113269274B (en) * | 2021-06-18 | 2022-04-19 | 南昌航空大学 | Zero sample identification method and system based on cycle consistency |
CN113378959B (en) * | 2021-06-24 | 2022-03-15 | 中国矿业大学 | Zero sample learning method for generating countermeasure network based on semantic error correction |
CN113723106B (en) * | 2021-07-29 | 2024-03-12 | 北京工业大学 | Zero sample text classification method based on label extension |
CN114266307B (en) * | 2021-12-21 | 2024-08-09 | 复旦大学 | Method for identifying noise samples in parallel based on non-zero mean shift parameters |
CN115424262A (en) * | 2022-08-04 | 2022-12-02 | 暨南大学 | Method for optimizing zero sample learning |
CN116051909B (en) * | 2023-03-06 | 2023-06-16 | 中国科学技术大学 | Direct push zero-order learning unseen picture classification method, device and medium |
CN117893743B (en) * | 2024-03-18 | 2024-05-31 | 山东军地信息技术集团有限公司 | Zero sample target detection method based on channel weighting and double-comparison learning |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107679556A (en) * | 2017-09-18 | 2018-02-09 | 天津大学 | The zero sample image sorting technique based on variation autocoder |
CN109492662A (en) * | 2018-09-27 | 2019-03-19 | 天津大学 | A kind of zero sample classification method based on confrontation self-encoding encoder model |
CN110175251A (en) * | 2019-05-25 | 2019-08-27 | 西安电子科技大学 | The zero sample Sketch Searching method based on semantic confrontation network |
Non-Patent Citations (1)
Title |
---|
Zero-shot image recognition; Lan Hong et al.; Journal of Electronics & Information Technology, Issue 05; full text *
Also Published As
Publication number | Publication date |
---|---|
CN111914929A (en) | 2020-11-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |