CN111914929B - Zero sample learning method - Google Patents

Zero sample learning method

Info

Publication number
CN111914929B
CN111914929B CN202010750578.4A
Authority
CN
China
Prior art keywords
network
visual
features
zero
visual feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010750578.4A
Other languages
Chinese (zh)
Other versions
CN111914929A (en)
Inventor
罗新新
蔡子赟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202010750578.4A
Publication of CN111914929A
Application granted
Publication of CN111914929B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Filters That Use Time-Delay Elements (AREA)

Abstract

The invention provides a zero sample learning method that identifies never-seen data classes by transferring knowledge from visible class samples to invisible class samples. The method mainly comprises the following steps: acquiring a training feature data set; building a zero sample learning model from a generation network, a noise self-encoder, a regression network, and a discrimination network; training the generation network and the noise self-encoder; training the discrimination network; and obtaining a total objective function and iterating to optimize the algorithm. Through knowledge transfer, the method fuses two kinds of semantic features, attributes and word vectors, trains under an adversarial mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network. This effectively alleviates the domain-shift problem in the model's predictions, makes it possible to identify samples that are difficult to label, and reduces the identification cost.

Description

Zero sample learning method
Technical Field
The invention relates to a zero sample learning method, and belongs to the field of pattern recognition.
Background
With the development of deep learning, the performance of computer vision and machine learning methods has improved greatly, and deep models have achieved remarkable success in image classification, even rivaling human recognition ability. Humans, however, have a natural advantage in recognizing novel objects: things they have only heard of or seen a few times before, and possibly objects they have never encountered at all. The most fundamental reason for this gap is that deep models rely on fully supervised learning. Training neural networks therefore requires a large amount of labeled data, and in practice, because nature contains tens of thousands of species, collecting and annotating visual data is cumbersome and expensive. This motivates a new task: solving the image annotation problem by transferring knowledge from visible class samples to invisible class samples, so that invisible class samples can be identified.
Zero sample learning (zero-shot learning), in which the sets of visible and invisible classes are generally assumed to be disjoint, is currently receiving increasing attention. Some samples in the feature space are labeled; these are called visible class samples, and only their visual instances are used to train the model. The remaining unlabeled sample instances in the feature space belong to what are called invisible classes. The feature space consists of vectors extracted from the samples by a neural network, and each sample belongs to one category. To link visible class samples to invisible class samples, semantic features are typically introduced for zero sample learning. Attributes are the most common semantic features in zero sample learning, but manually annotating each semantic attribute for the visual features is a time-consuming and labor-intensive task. Natural language processing techniques offer alternative semantic features (e.g., word vectors such as GloVe) that can be acquired directly from text such as Wikipedia articles; however, because these features are coarser and noisier to acquire, their performance is worse than that of attribute features.
In view of the above, it is necessary to provide a zero sample learning method to solve the above problems.
Disclosure of Invention
The invention aims to provide a zero sample learning method that identifies samples which are difficult to label and reduces the identification cost.
In order to achieve the above object, the present invention provides a zero sample learning method that transfers knowledge from visible class samples to invisible class samples to identify never-seen data classes, the method mainly comprising the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network, the noise self-encoder, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through an adversarial mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
and 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the purpose of algorithm optimization.
Optionally, in step 1, the labels include quantity labels and category labels, and the semantic features include word vectors and attribute features.
Optionally, in step 2, a feedforward neural network is used for data transmission among the generation network, the noise self-encoder, the regression network, and the discrimination network.
Optionally, in step 3, the generation network generates the first visual feature from an attribute feature and Gaussian random noise; the noise self-encoder generates the second visual feature from a word vector, a latent variable, and Gaussian random noise.
Optionally, in step 3, the first visual feature and the second visual feature are fused into the pseudo visual feature by:

$$x_f = \lambda x_1 + (1 - \lambda)\,\hat{x}_2,$$

where x_f is the pseudo-visual feature, λ is the corresponding weight, x_1 is the first visual feature representation, and $\hat{x}_2$ is the second visual feature representation; the weights of the two parts sum to 1.
Optionally, in step 4, the adversarial mechanism may be expressed as:

$$L_{WGAN} = \mathbb{E}[D(x)] - \mathbb{E}[D(x_f)] - \lambda\,\mathbb{E}\big[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2\big],$$

where x is the true visual feature, $x_f = G(a, w, z)$, $\hat{x} = \alpha x + (1 - \alpha) x_f$, and α ~ U(0, 1).
Optionally, in step 4, the distributions of the pseudo visual features and the real visual features are both constrained by a least-squares loss formula:

$$L_1 = \mathbb{E}\,\lVert x - x_f \rVert_2^2,$$

where x is the true visual feature and x_f is the pseudo-visual feature.
Optionally, the total objective function in step 6 is:

$$L = L_{WGAN} + L_1 + \lambda_2 L_R,$$

where $\lambda_2$ is a hyper-parameter that weights the different parts.
Optionally, in step 6, Adam is used as an optimizer to perform algorithm optimization.
Optionally, the method further includes step 7: the generation network trained in step 3 is used to generate visual features for the invisible class samples, and these are classified to test the total objective function in step 6.
The invention has the beneficial effects that: through knowledge transfer, the method fuses the two semantic features of attributes and word vectors, trains under an adversarial mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network. This effectively alleviates the domain-shift problem in the model's predictions, makes it possible to identify samples that are difficult to label, and reduces the identification cost.
Drawings
FIG. 1 is a flow chart of the zero sample learning method of the present invention.
FIG. 2 is a flow chart of generating the first visual feature in the zero sample learning method of the present invention.
FIG. 3 is a flow chart of generating the second visual feature in the zero sample learning method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, the present invention discloses a zero sample learning method for transferring knowledge from visible class samples to invisible class samples to identify never-seen data classes, which mainly comprises the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network, the noise self-encoder, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through an adversarial mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
and 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the aim of algorithm optimization.
Step 1 to step 6 will be described in detail below.
In step 1, the training feature data set consists of 2048-dimensional visual features extracted by a deep convolutional neural network and is organized as a group of vectors; the labels comprise quantity labels and category labels, and the semantic features comprise word vectors and attribute features. The visual features are the 2048-dimensional outputs of the top-layer pooling unit of the deep convolutional neural network ResNet-101, which show excellent performance. For the AwA1 and AwA2 databases, in addition to attribute features, each category is represented by a word vector of dimension 1000. Specifically, natural language processing techniques are used to extract a word vector for each class from a large language corpus. As for the attribute features, the dimensions of the continuous-valued semantic attributes are shown in Table 1 below.
TABLE 1
[Table 1: semantic attribute dimensions per data set — rendered as an image in the original document]
In step 2, the generation network, the noise autoencoder, the regression network and the discrimination network use a feedforward neural network for data transfer.
The noise self-encoder has two hidden fully-connected layers of 1200 and 600 units for its encoder, and its decoder is realized with hidden fully-connected layers of 2048 and 4096 units; the discrimination network is realized with a single 512-unit hidden fully-connected layer; the regression network has only one hidden layer of 600 units. All noise dimensions in the invention are 100. Pseudo-visual features can thus be synthesized from each of the two different semantic features, while the regression network and the discrimination network simultaneously perform semantic reasoning and the related constraints.
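For concreteness, these modules can be sketched as small feedforward networks. The PyTorch sketch below is illustrative only: the class names, activation choices, and the assignment of the 2048/4096-unit layers to the decoder are our assumptions, while the layer widths (1200/600 encoder, 512 discriminator, 600 regressor) and the 100-dimensional noise follow the text.

```python
import torch
import torch.nn as nn

VIS_DIM, NOISE_DIM = 2048, 100   # visual-feature and noise dimensions from the text

class Generator(nn.Module):
    """Maps (attribute, noise) to a first visual feature x1."""
    def __init__(self, attr_dim, hidden=4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + NOISE_DIM, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, VIS_DIM), nn.ReLU())
    def forward(self, a, z):
        return self.net(torch.cat([a, z], dim=1))

class NoiseEncoder(nn.Module):
    """Noise self-encoder (encoder half): visual feature + noise -> latent code."""
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VIS_DIM + NOISE_DIM, 1200), nn.LeakyReLU(0.2),
            nn.Linear(1200, 600), nn.LeakyReLU(0.2),
            nn.Linear(600, latent_dim))
    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

class Decoder(nn.Module):
    """Decoder half: (word vector, latent code) -> second visual feature x2."""
    def __init__(self, word_dim, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(word_dim + latent_dim, 2048), nn.LeakyReLU(0.2),
            nn.Linear(2048, 4096), nn.LeakyReLU(0.2),
            nn.Linear(4096, VIS_DIM), nn.ReLU())
    def forward(self, w, code):
        return self.net(torch.cat([w, code], dim=1))

class Discriminator(nn.Module):
    """Single 512-unit hidden layer; outputs an unbounded critic score (WGAN)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VIS_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1))
    def forward(self, x):
        return self.net(x)

class Regressor(nn.Module):
    """Maps a visual feature back to the attribute space (600-unit hidden layer)."""
    def __init__(self, attr_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VIS_DIM, 600), nn.LeakyReLU(0.2),
            nn.Linear(600, attr_dim))
    def forward(self, x):
        return self.net(x)
```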
In step 3, the purpose of the generation network is to learn the probability distribution of the data points so that samples can be drawn from it as a data enhancement mechanism. As one of the most promising classes of generative models, generative adversarial networks are widely studied.
As shown in FIG. 2, given the training data D_tr of the visible classes, the aim is to learn a generation network G: Z × C → X. Gaussian random noise z ~ N(0, I) and an attribute feature a_i ∈ R^q are taken as input, and the first visual feature is then output. Once the generation network has learned to generate first visual features from the visible class samples, attribute features can be embedded to generate visual features for any invisible class. The generation network can be learned through the following adversarial optimization:

$$\min_G \max_D \; \mathbb{E}_x[D(x)] - \mathbb{E}_{z,a}[D(G(z, a_i))],$$

where $x_1 = G(z, a_i)$ is the i-th generated first visual feature representation in visual space, with corresponding attribute feature a_i and noise z.
The invention also uses a noise self-encoder, which takes a word vector as the semantic feature, to obtain the second visual feature. The word vector is another kind of semantic information: it describes the invisible class samples from another angle and complements the attribute features. To this end, the WAE (Wasserstein auto-encoder) is extended into a conditional WAE.
As shown in FIG. 3, given certain conditional information (the word vector), the noise self-encoder is used to produce a probability distribution Q(Z|X), where Q_Z is the induced distribution over the latent space and P_Z is an isotropic Gaussian prior distribution over Z. Specifically, a discriminant network is introduced in the latent space Z with the goal of distinguishing "true" points sampled from the prior P_Z from "fake" points sampled from Q(Z|X). The decoder takes the word vector w_i ∈ R^p and a latent variable drawn from Q(Z|X) as input and then generates the second visual feature. The loss function of the conditional WAE is defined as follows:

$$L_{WAE} = \mathbb{E}_{P_X}\,\mathbb{E}_{Q(Z|X)}\big[c(x, \hat{x}_2)\big] + \lambda\, D_Z(Q_Z, P_Z),$$

where $\hat{x}_2$ is the i-th generated second visual feature representation in visual space, with corresponding word vector w_i. Meanwhile, $D_Z(Q_Z, P_Z) = D_{JS}(Q_Z, P_Z)$ is chosen and estimated with adversarial training, and λ > 0 is a hyper-parameter.
The invention chooses to fuse the first visual feature and the second visual feature. On the one hand, based on human experience in learning to identify new objects, attribute features contain more effective semantic information than word vectors. On the other hand, compared with the real visual features, the generated pseudo-visual features carry a large amount of invalid information, and feature fusion can remove this invalid information so as to preserve the validity of the information.
Based on the above, different weights should be given to the pseudo-visual features generated from the attribute features and from the word vector. The first visual feature and the second visual feature are fused into the pseudo-visual feature by:

$$x_f = \lambda x_1 + (1 - \lambda)\,\hat{x}_2,$$

where x_f is the pseudo-visual feature, λ is the corresponding weight, x_1 is the first visual feature representation, and $\hat{x}_2$ is the second visual feature representation; the weights of the two parts sum to 1.
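In code the fusion is a single convex combination; λ = 0.7 here is a placeholder, since the patent leaves the weight tunable, and x1, x2 are the features from the earlier sketches.

```python
lam = 0.7                             # example value; the weight λ is a design choice
x_f = lam * x1 + (1.0 - lam) * x2     # fused pseudo-visual feature; weights sum to 1
```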
In step 4, a discrimination network is used to classify the pseudo-visual features and the real visual features. The pseudo-visual features should deceive the discrimination network as successfully as possible, and as the discriminative ability of the network keeps improving, the generation network and the discrimination network are optimized against each other through the adversarial mechanism, so the quality of the generated pseudo-visual features also keeps improving. The invention uses the improved WGAN for adversarial training, and the adversarial process of training the discrimination network can be expressed as:

$$L_{WGAN} = \mathbb{E}[D(x)] - \mathbb{E}[D(x_f)] - \lambda\,\mathbb{E}\big[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2\big],$$

where x is the real visual feature, $x_f = G(a, w, z)$, $\hat{x} = \alpha x + (1 - \alpha) x_f$, and α ~ U(0, 1); the first two terms approximate the Wasserstein distance, while the third is a gradient penalty term forcing the gradient of D to have unit norm along straight lines between real and generated features.
Although the adversarial mechanism gives the generation network the ability to generate realistic visual features with a similar distribution, by itself it is not sufficient to ensure that the fused pseudo-visual features are valid. Therefore, the distributions of the fused pseudo-visual features and the real visual features are further constrained by a least-squares loss:

$$L_1 = \mathbb{E}\,\lVert x - x_f \rVert_2^2.$$
in step 5, the regression network takes the pseudo-visual features as input and then converts the pseudo-visual features into semantic features. The generation network and the regression network together form a dual learning framework so they can learn each other. In the present invention, the main task is to generate visual features conditioned on class embedding, while the dual task is to return visual features to the corresponding class semantic space.
Let x be a real visual feature sampled from the training feature data set and x_f the fused pseudo-visual feature. With paired training data (x, a), the regression network R can be trained under a supervised loss:

$$L_R = \mathbb{E}\,\lVert R(x) - a \rVert_2^2 + \mathbb{E}\,\lVert R(x_f) - a \rVert_2^2.$$
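A sketch of this supervised loss under our reading (the exact formula is an image in the source): the regressor R is applied to both the real and the fused pseudo features and pulled toward the paired attribute vector a.

```python
import torch.nn.functional as F

def regression_loss(R, x_real, x_fake, a):
    # Dual-task loss: map both real and fused pseudo visual features
    # back to the attribute vector a (our assumed form of L_R).
    return F.mse_loss(R(x_real), a) + F.mse_loss(R(x_fake), a)
```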
in step 6, the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network are added to obtain a total objective function:
L=L WGAN +L 12 *L R ,
wherein λ is 2 Is a hyper-parameter that assigns weights in different parts.
Adam is selected as the optimizer, with its parameters β_1 and β_2 set to (0.9, 0.999). The discrimination network is trained first and its parameters optimized with a learning rate of 0.00001; its parameters are then fixed while the generation network and the regression network are trained and optimized with a learning rate of 0.0001. All modules in the zero sample learning model are trained with a batch size of 128; each data set is trained for 1000 epochs, the model parameters are saved every 10 epochs, and the test set is then evaluated.
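The alternating schedule might look as follows. This composes the earlier sketches and quotes the stated hyper-parameters (betas, learning rates, 1000 epochs); the generator's adversarial term is written in the usual WGAN form, which the patent does not spell out, and loader is an assumed DataLoader of (visual, attribute, word-vector) batches of size 128.

```python
import torch
import torch.nn.functional as F

# Optimizers with the quoted settings: critic lr 1e-5, generator/regressor lr 1e-4.
opt_d = torch.optim.Adam(D.parameters(), lr=1e-5, betas=(0.9, 0.999))
opt_g = torch.optim.Adam(list(G.parameters()) + list(R.parameters()),
                         lr=1e-4, betas=(0.9, 0.999))

for epoch in range(1000):                        # 1000 training epochs per data set
    for x, a, w in loader:                       # assumed batches of (x, a, w), size 128
        z = torch.randn(x.size(0), NOISE_DIM)
        x1 = G(a, z)                             # first visual feature
        x2 = dec(w, enc(x, torch.randn_like(z))) # second visual feature via the WAE
        x_f = 0.7 * x1 + 0.3 * x2                # fusion with the placeholder weight
        # (the WAE pair has its own update, omitted here; see the WAE sketch above)

        # 1) update the critic with the generator side detached
        opt_d.zero_grad()
        critic_loss(D, x, x_f.detach()).backward()
        opt_d.step()

        # 2) update generator + regressor against the (now fixed) critic
        opt_g.zero_grad()
        g_loss = (-D(x_f).mean()                          # usual WGAN generator term
                  + F.mse_loss(x_f, x)                    # least-squares constraint L1
                  + 0.1 * regression_loss(R, x, x_f, a))  # λ2·L_R with placeholder λ2
        g_loss.backward()
        opt_g.step()
```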
Preferably, the present invention further comprises step 7, specifically: the generation network trained in step 3 is used to generate visual features for the invisible class samples, and these are classified to test the total objective function in step 6.
After the model is trained, to predict the labels of the invisible class samples, new samples are first synthesized for each invisible class. These synthesized samples are then combined with the other samples in the training data, after which any new classifier covering both visible and invisible classes can be trained on this new data set. A sketch of this protocol appears below.
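The following sketch illustrates the test-time protocol with a nearest-centroid (1-NN style) classifier, the simple classifier also used in the comparison below; the helper name and the number of synthesized samples per class are our own.

```python
import torch

@torch.no_grad()
def predict_unseen(G, unseen_attrs, x_test, n_per_class=300):
    """Label test features by nearest synthesized class centroid.
    unseen_attrs: (num_unseen, attr_dim), one attribute vector per unseen class."""
    centroids = []
    for a in unseen_attrs:
        z = torch.randn(n_per_class, NOISE_DIM)
        a_rep = a.unsqueeze(0).expand(n_per_class, -1)
        centroids.append(G(a_rep, z).mean(dim=0))        # mean of synthesized features
    centroids = torch.stack(centroids)                   # (num_unseen, 2048)
    return torch.cdist(x_test, centroids).argmin(dim=1)  # index of nearest class
```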
Further, the present invention was compared with 15 methods, including: SSE, LATEM, ALE, DEVISE, SJE, ESZSL, SYNC, SAE, DEM, Relation Net, PSR-ZSL, SP-AEN, CAPD, CVAE, and GDAN.
For a fair comparison with the other baselines, a simple 1-NN classifier is applied at test time. Table 2 below compares GDFN with state-of-the-art generalized zero sample learning methods.
TABLE 2 results of the generalized zero sample learning method on four reference datasets
[Table 2: generalized zero sample learning results on the four benchmark data sets — rendered as an image in the original document]
From the results, it can be seen that the present invention achieves good results on a generalized zero sample learning data set.
For the CUB dataset, the invention achieves good results on the invisible categories and the highest accuracy on the visible categories, and it also performs well in terms of the harmonic mean. This indicates that the invention maintains a good prediction balance between the visible and invisible classes and outperforms previous models.
For the AwA2 dataset, the invention performs better than the latest methods (e.g., SP-AEN and PSR-ZSL) in terms of unseen-class accuracy and harmonic mean, and shows higher accuracy on the visible class samples.
For the SUN dataset, the method attains high accuracy in identifying both visible and invisible category samples, with an obvious improvement in the recognition and classification of visible class samples.
For the aPY dataset, the similarity between the attribute variances of the unrelated training images and the test images is much smaller than for the other datasets, indicating that the unseen classes are difficult to synthesize and classify. Although prior methods have relatively low accuracy on invisible class identification, good results are obtained when this dataset is tested with the present invention. Prior methods achieve higher accuracy on the visible class samples, whereas the invention balances the visible and invisible classes and provides an accurate harmonic-mean accuracy on aPY.
In summary, through knowledge transfer the invention fuses the two semantic features of attributes and word vectors, trains under an adversarial mechanism to minimize the distribution difference between real samples and generated samples, and maps visual features to semantic features through a regression network, thereby effectively alleviating the domain-shift problem in the model's predictions, identifying samples that are difficult to label, and reducing the identification cost.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (10)

1. A zero sample learning method for transferring knowledge from visible class samples to invisible class samples to identify never-seen data classes, essentially comprising the following steps:
step 1, obtaining a training feature data set, wherein the training feature data set comprises visible class samples, and the visible class samples comprise labels, real visual features and semantic features;
step 2, building a zero sample learning model based on the generation network GAN, the noise self-encoder WAE, the regression network and the discrimination network, and initializing the generation network, the noise self-encoder, the regression network and the discrimination network in the zero sample learning model;
step 3, training a generating network and a noise self-encoder to respectively generate a first visual feature and a second visual feature, and fusing the first visual feature and the second visual feature into a pseudo visual feature according to different weights;
step 4, training a discrimination network to classify the pseudo visual features and the real visual features, and optimizing the generation network and the discrimination network through an adversarial mechanism;
step 5, training a regression network, and taking the pseudo-visual features as input so as to map the pseudo-visual features to semantic features;
and 6, adding the loss functions of the generation network, the noise self-encoder, the regression network and the discrimination network to obtain a total objective function, and iterating to achieve the purpose of algorithm optimization.
2. The zero-sample learning method according to claim 1, characterized in that: in step 1, the labels include quantity labels and category labels, and the semantic features include word vectors and attribute features.
3. The zero-sample learning method according to claim 1, characterized in that: in step 2, a feedforward neural network is used for data transmission among the generation network, the noise self-encoder, the regression network and the discrimination network.
4. The zero-sample learning method according to claim 2, characterized in that: in step 3, the generation network generates the first visual feature from an attribute feature and Gaussian random noise; the noise self-encoder generates the second visual feature from a word vector, a latent variable, and Gaussian random noise.
5. The zero-sample learning method according to claim 1, wherein in step 3, the first visual feature and the second visual feature are fused into the pseudo visual feature by:

$$x_f = \lambda x_1 + (1 - \lambda)\,\hat{x}_2,$$

where x_f is the pseudo-visual feature, λ is the corresponding weight, x_1 is the first visual feature representation, and $\hat{x}_2$ is the second visual feature representation; the weights of the two parts sum to 1.
6. The zero-sample learning method according to claim 5, wherein in step 4, the adversarial mechanism can be expressed as:

$$L_{WGAN} = \mathbb{E}[D(x)] - \mathbb{E}[D(x_f)] - \lambda\,\mathbb{E}\big[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2\big],$$

where x is the true visual feature, $x_f = G(a, w, z)$, $\hat{x} = \alpha x + (1 - \alpha) x_f$, and α ~ U(0, 1).
7. The zero-sample learning method of claim 6, characterized in that: in step 4, the distributions of the pseudo visual features and the real visual features are both constrained by a least-squares loss formula:

$$L_1 = \mathbb{E}\,\lVert x - x_f \rVert_2^2,$$

where x is the true visual feature and x_f is the pseudo-visual feature.
8. The zero-sample learning method of claim 7, characterized in that: the total objective function in step 6 is:

$$L = L_{WGAN} + L_1 + \lambda_2 L_R,$$

where $\lambda_2$ is a hyper-parameter that weights the different parts.
9. The zero-sample learning method according to claim 1, characterized in that: in step 6, algorithm optimization is performed using Adam as an optimizer.
10. The zero-sample learning method according to claim 1, characterized in that: the method further comprises step 7, in which the generation network trained in step 3 is used to generate visual features for the invisible class samples, and these are classified to test the total objective function in step 6.
CN202010750578.4A 2020-07-30 2020-07-30 Zero sample learning method Active CN111914929B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010750578.4A 2020-07-30 2020-07-30 Zero sample learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010750578.4A 2020-07-30 2020-07-30 Zero sample learning method

Publications (2)

Publication Number Publication Date
CN111914929A CN111914929A (en) 2020-11-10
CN111914929B (en) 2022-08-23

Family

ID=73286794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010750578.4A Active CN111914929B (en) 2020-07-30 2020-07-30 Zero sample learning method

Country Status (1)

Country Link
CN (1) CN111914929B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191381B (en) * 2020-12-04 2022-10-11 云南大学 Image zero-order classification model based on cross knowledge and classification method thereof
CN112580722B (en) * 2020-12-20 2024-06-14 大连理工大学人工智能大连研究院 Generalized zero sample image recognition method based on conditional countermeasure automatic encoder
CN112674709B (en) * 2020-12-22 2022-07-29 泉州装备制造研究所 Amblyopia detection method based on anti-noise
CN112766386B (en) * 2021-01-25 2022-09-20 大连理工大学 Generalized zero sample learning method based on multi-input multi-output fusion network
CN113222002B (en) * 2021-05-07 2024-04-05 西安交通大学 Zero sample classification method based on generative discriminative contrast optimization
CN113269274B (en) * 2021-06-18 2022-04-19 南昌航空大学 Zero sample identification method and system based on cycle consistency
CN113378959B (en) * 2021-06-24 2022-03-15 中国矿业大学 Zero sample learning method for generating countermeasure network based on semantic error correction
CN113723106B (en) * 2021-07-29 2024-03-12 北京工业大学 Zero sample text classification method based on label extension
CN114266307B (en) * 2021-12-21 2024-08-09 复旦大学 Method for identifying noise samples in parallel based on non-zero mean shift parameters
CN115424262A (en) * 2022-08-04 2022-12-02 暨南大学 Method for optimizing zero sample learning
CN116051909B (en) * 2023-03-06 2023-06-16 中国科学技术大学 Direct push zero-order learning unseen picture classification method, device and medium
CN117893743B (en) * 2024-03-18 2024-05-31 山东军地信息技术集团有限公司 Zero sample target detection method based on channel weighting and double-comparison learning


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679556A (en) * 2017-09-18 2018-02-09 天津大学 The zero sample image sorting technique based on variation autocoder
CN109492662A (en) * 2018-09-27 2019-03-19 天津大学 A kind of zero sample classification method based on confrontation self-encoding encoder model
CN110175251A (en) * 2019-05-25 2019-08-27 西安电子科技大学 The zero sample Sketch Searching method based on semantic confrontation network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zero-sample image recognition; Lan Hong et al.; Journal of Electronics & Information Technology (《电子与信息学报》), Issue 05; full text *

Also Published As

Publication number Publication date
CN111914929A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111914929B (en) Zero sample learning method
CN111476294B (en) Zero sample image identification method and system based on generation countermeasure network
CN110414377B (en) Remote sensing image scene classification method based on scale attention network
CN112905822B (en) Deep supervision cross-modal counterwork learning method based on attention mechanism
CN109993072A (en) The low resolution pedestrian weight identifying system and method generated based on super resolution image
CN113642621A (en) Zero sample image classification method based on generation countermeasure network
CN109739844A (en) Data classification method based on decaying weight
CN113886626B (en) Visual question-answering method of dynamic memory network model based on multi-attention mechanism
CN110163117A (en) A kind of pedestrian's recognition methods again based on autoexcitation identification feature learning
CN114038007A (en) Pedestrian re-recognition method combining style transformation and attitude generation
CN113837229A (en) Knowledge-driven text-to-image generation method
CN113947101A (en) Unsupervised pedestrian re-identification method and system based on softening similarity learning
CN114764939A (en) Heterogeneous face recognition method and system based on identity-attribute decoupling
CN114723994A (en) Hyperspectral image classification method based on dual-classifier confrontation enhancement network
Nijhawan et al. VTnet+ Handcrafted based approach for food cuisines classification
CN114579794A (en) Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion
CN117593538A (en) Generalized zero sample recognition model based on noise marking data
CN110210562A (en) Image classification method based on depth network and sparse Fisher vector
CN113191381B (en) Image zero-order classification model based on cross knowledge and classification method thereof
CN114429648A (en) Pedestrian re-identification method and system based on comparison features
CN114037866A (en) Generalized zero sample image classification method based on synthesis of distinguishable pseudo features
CN113627522A (en) Image classification method, device and equipment based on relational network and storage medium
Miyauchi et al. Shape-conditioned image generation by learning latent appearance representation from unpaired data
CN117349500B (en) Method for detecting interpretable false news of double-encoder evidence distillation neural network
CN116152885B (en) Cross-modal heterogeneous face recognition and prototype restoration method based on feature decoupling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant