CN113191381B - Image zero-shot classification model based on cross knowledge and classification method thereof - Google Patents

Image zero-shot classification model based on cross knowledge and classification method thereof Download PDF

Info

Publication number
CN113191381B
CN113191381B (application CN202011402935.4A)
Authority
CN
China
Prior art keywords
visual
semantic
features
level
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011402935.4A
Other languages
Chinese (zh)
Other versions
CN113191381A (en)
Inventor
曾婷
向鸿鑫
谢诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU filed Critical Yunnan University YNU
Priority to CN202011402935.4A priority Critical patent/CN113191381B/en
Publication of CN113191381A publication Critical patent/CN113191381A/en
Application granted granted Critical
Publication of CN113191381B publication Critical patent/CN113191381B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Abstract

The invention discloses an image zero-shot classification model based on cross knowledge, which comprises a biological classification tree module for constructing a biological classification tree from all categories in the data set; a visual feature extraction module for converting the images in the data set into one-dimensional visual features; a semantic feature extraction module for converting the texts or attributes in the data set into one-dimensional semantic features; a cross knowledge learning module for enriching the semantic information of the categories; and a generative adversarial network module comprising a generator and a discriminator, wherein the generator generates pseudo-visual features from semantic features and the discriminator discriminates the authenticity and the category of an image. Cross knowledge learning trains more relevant semantic features, improving the semantic-to-visual feature embedding in ZSL and enriching the semantic features in the cross-modal learning process. The model and the method are simple and efficient, and achieve high-accuracy classification results on several authoritative data sets.

Description

Image zero-shot classification model based on cross knowledge and classification method thereof
Technical Field
The invention relates to the technical field of image classification, in particular to an image zero-shot classification model based on cross knowledge and a classification method thereof.
Background
The field of image classification has become increasingly attractive with the rapid expansion of data sizes and the explosion of machine learning models; however, collecting sufficient data sets is time-consuming and laborious, and some data sets are simply unavailable. How to classify certain categories correctly and efficiently when part of the data set is missing has become one of the main challenges facing the image classification field.
To address the problem of incomplete data sets, the mainstream line of work in this field first proposed the concept of Zero-Shot Learning (ZSL). ZSL can identify, at test time, new classes that did not appear in the training phase, i.e. it handles the situation where the labeled training samples do not cover all object classes. Zero-shot classification reduces the image classification problem with missing samples to a conventional image classification problem. Generative ZSL attempts to learn the relationship between semantic features and visual features from the seen classes and then to synthesize visual features for the unseen classes. Currently, the mainstream generative zero-shot image classification methods include FeatGen, GAZSL, ZSLPP, CIZSL, etc., which use a generative adversarial network as the basic structure of the deep model and reduce the task to a conventional image classification task by generating pseudo samples. For example, GDAN (Generative Dual Adversarial Network) uses a dual generative adversarial network to accomplish the semantic-to-visual bidirectional mapping.
In the field of animal and plant image classification, although the above methods have achieved good results even without a sufficient training set, two challenges remain:
1. Cross-modality problem: the cross-modal gap in visual-semantic embedding causes semantic features and visual features to be expressed incompletely during embedding. In particular, two very similar classes may be indistinguishable in the embedding space, so the model struggles to tell them apart and its performance degrades considerably;
2. Cross-domain problem: the seen classes may intersect the unseen classes very little (or not at all), and the visual appearance of the same attribute or text description may differ significantly across unseen classes, which makes it difficult for the model to distinguish the unseen classes accurately when embedding semantic vectors into the visual space.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an image zero-shot classification model based on cross knowledge and a classification method thereof.
In order to achieve the purpose of the invention, the invention is realized by the following technical scheme:
A zero-shot image classification model based on cross knowledge, comprising:
A biological classification tree module: constructing a biological classification tree according to all categories in the data set;
a visual feature extraction module: used for converting the images in the data set into one-dimensional visual features;
a semantic feature extraction module: used for converting the texts or attributes in the data set into one-dimensional semantic features;
a cross knowledge learning module: used for enriching the semantic information of the categories;
a generative adversarial network module: comprising a generator and a discriminator, wherein the generator generates pseudo-visual features from semantic features, and the discriminator discriminates the authenticity and the category of an image.
Preferably, the biological classification tree includes a Family level, a Genus level and a Species level, the Species level including all categories in the data set.
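For illustration, a minimal sketch of how such a three-level tree might be represented is given below; the species-to-genus-to-family mapping is a hypothetical example, not data from the patent.

```python
# Minimal sketch of a Family -> Genus -> Species tree (assumed structure;
# the taxonomy mapping below is a hypothetical example).
from collections import defaultdict

# species -> (genus, family), e.g. derived from the class names in the data set
taxonomy = {
    "Indigo Bunting": ("Passerina", "Cardinalidae"),
    "Lazuli Bunting": ("Passerina", "Cardinalidae"),
    "Painted Bunting": ("Passerina", "Cardinalidae"),
}

def build_tree(taxonomy):
    """Group all data set categories (Species level) under Genus and Family nodes."""
    tree = defaultdict(lambda: defaultdict(list))
    for species, (genus, family) in taxonomy.items():
        tree[family][genus].append(species)
    return tree

tree = build_tree(taxonomy)
# tree["Cardinalidae"]["Passerina"] -> ["Indigo Bunting", "Lazuli Bunting", "Painted Bunting"]
```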
Preferably, the visual feature extraction module adopts ResNet101, and the semantic feature extraction module adopts Term Frequency-Inverse Document Frequency (TF-IDF).
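A sketch of the two extractors follows, assuming standard ImageNet preprocessing and default TF-IDF settings; the patent does not specify these details.

```python
# Sketch of the visual (ResNet101) and semantic (TF-IDF) extractors.
# Preprocessing and vectorizer settings are assumptions.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.feature_extraction.text import TfidfVectorizer

# Visual branch: ResNet101 with its classification head removed -> 2048-d vectors.
resnet = models.resnet101(weights="DEFAULT")
backbone = torch.nn.Sequential(*list(resnet.children())[:-1]).eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def visual_features(pil_images):
    batch = torch.stack([preprocess(im) for im in pil_images])
    return backbone(batch).flatten(1)          # (N, 2048) one-dimensional features

# Semantic branch: TF-IDF over per-class Wikipedia articles (or attribute text).
def semantic_features(class_documents):
    vec = TfidfVectorizer()
    return vec.fit_transform(class_documents).toarray()  # (num_classes, vocab_size)
```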
Preferably, the cross knowledge learning module merges the classes into the Family level and the Genus level of the biological classification tree using biological taxonomy, and cross learning is performed within the Family level, the Genus level and the Species level respectively.
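The patent describes the cross-selection only at a high level; the sketch below shows one plausible reading, pairing the semantic vector of one class with the visual vector of a sibling class that shares the same Family or Genus node. Treat it as an assumed interpretation, not the definitive implementation.

```python
# Hedged sketch of cross-selection within one taxonomy level.
import random

def cross_pairs(node_to_classes, sem, vis):
    """node_to_classes: node id -> list of class ids sharing that Family/Genus node;
    sem: class id -> semantic vector; vis: class id -> list of visual vectors."""
    pairs = []
    for classes in node_to_classes.values():
        for c in classes:
            # pick a sibling class in the same node (fall back to the class itself)
            sibling = random.choice([s for s in classes if s != c] or [c])
            pairs.append((sem[c], random.choice(vis[sibling])))
    return pairs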
Preferably, the generator loss is expressed as

L_G = -E_{z~P_z}[D_ω(G_θ(t_s, A, z))] + L_cls(D_ω(G_θ(t_s, A, z))) + L_T

[The discriminator loss L_D, the term L_T, and the quantity K_{s,F} are given only as formula images in the original publication.]
preferably, the generator employs taxonomic regularization.
Preferably, the data set is animal and plant image data.
Preferably, the training method of the model comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
Preferably, the termination condition is the number of iterations set before training, the number of iterations being 5000 to 10000. A minimal training-loop sketch covering steps S6-S9 is given below.
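In the sketch, the network architectures, the WGAN-style real/fake loss, the noise dimension (128) and the TR weight are all assumptions not fixed by the patent text; the discriminator D is assumed to return a (real/fake score, class logits) pair, and tr_fn can be a closure over the taxonomic_regularizer sketched earlier.

```python
# Hedged sketch of one training iteration (steps S6-S9); loss forms beyond the
# names L_G, L_D, classification, real/fake and TR are assumptions.
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, sem, vis, labels, tr_fn, lambda_tr=0.1):
    z = torch.randn(sem.size(0), 128)            # noise input to the generator
    fake = G(torch.cat([sem, z], dim=1))         # pseudo-visual features (S7)

    # --- discriminator update: real/fake loss + classification loss (S8) ---
    rf_real, cls_real = D(vis)
    rf_fake, _ = D(fake.detach())
    loss_D = rf_fake.mean() - rf_real.mean() + F.cross_entropy(cls_real, labels)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- generator update: adversarial + classification + TR (S7, S9) ---
    rf_fake, cls_fake = D(fake)
    loss_G = (-rf_fake.mean()
              + F.cross_entropy(cls_fake, labels)
              + lambda_tr * tr_fn(fake, labels))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_G.item(), loss_D.item()
```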
Preferably, the classification method comprises the following steps:
step S1: inputting image data to be classified into a visual feature extraction module and a semantic feature extraction module to respectively obtain visual features and semantic features;
step S2: inputting the semantic features into a generator to obtain pseudo visual features;
step S3: calculating the similarity between the visual features and the pseudo-visual features and finding the highest similarity; the category to which the pseudo-visual feature with the highest similarity belongs is the category to which the animal or plant image belongs. An inference sketch of these steps follows.
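The sketch below covers steps S1-S3 at inference time; the cosine similarity measure, the noise dimension and the per-class sampling count are assumptions, since the patent does not fix the similarity measure.

```python
# Minimal inference sketch: generate pseudo-visual features per candidate class,
# then assign the class whose pseudo feature is most similar (cosine assumed).
import torch
import torch.nn.functional as F

@torch.no_grad()
def classify(G, visual_feat, class_semantics, n_samples=32):
    """visual_feat: (d,) real visual feature of the query image;
    class_semantics: (C, s) one semantic vector per candidate class."""
    best_class, best_sim = None, -1.0
    for c, sem in enumerate(class_semantics):
        z = torch.randn(n_samples, 128)
        sem_rep = sem.unsqueeze(0).expand(n_samples, -1)
        pseudo = G(torch.cat([sem_rep, z], dim=1))       # (n_samples, d)
        sim = F.cosine_similarity(pseudo, visual_feat.unsqueeze(0)).max()
        if sim > best_sim:
            best_class, best_sim = c, sim.item()
    return best_class
```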
Compared with the prior art, the invention has the beneficial effects that:
1. Cross Knowledge Learning (CKL) is adopted to train more relevant semantic features, improving the semantic-to-visual feature embedding in ZSL and enriching the semantic features in the cross-modal learning process;
2. the invention uses taxonomic regularization to generate more universal visual features so as to increase the overlap with unseen classes in ZSL, thereby significantly relieving the adverse effect of the cross-domain problem;
3. the model and the method are simple and efficient, and high-accuracy classification results are obtained on a plurality of authoritative data sets.
Drawings
FIG. 1 is a diagram of a classification model framework according to the present invention;
FIG. 2 is a block diagram of the classification method according to the present invention.
Detailed Description
The invention will be further described with reference to specific embodiments, and the advantages and features of the invention will become apparent as the description proceeds. The examples are illustrative only and do not limit the scope of the present invention in any way. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention, and that such changes and substitutions are intended to be within the scope of the invention.
As shown in FIG. 1, a zero-shot image classification model based on cross knowledge comprises:
A biological classification tree module: constructing a biological classification tree according to all categories in the data set;
a visual feature extraction module: used for converting the images in the data set into one-dimensional visual features;
a semantic feature extraction module: used for converting the texts or attributes in the data set into one-dimensional semantic features;
a cross knowledge learning module: used for enriching the semantic information of the categories;
a generative adversarial network module: comprising a generator and a discriminator, wherein the generator generates pseudo-visual features from semantic features, and the discriminator discriminates the authenticity and the category of an image.
Preferably, the biological classification tree includes a Family level, a Genus level and a Species level, the Species level including all categories in the data set.
Preferably, the visual feature extraction module adopts ResNet101, and the semantic feature extraction module adopts Term Frequency-Inverse Document Frequency (TF-IDF).
Preferably, the cross knowledge learning module merges the classes into the Family level and the Genus level of the biological classification tree using biological taxonomy, and cross learning is performed within the Family level, the Genus level and the Species level respectively.
Preferably, the generator loss is expressed as

L_G = -E_{z~P_z}[D_ω(G_θ(t_s, A, z))] + L_cls(D_ω(G_θ(t_s, A, z))) + L_T

[The discriminator loss L_D, the term L_T, and the quantity K_{s,F} are given only as formula images in the original publication.]
preferably, the generator employs taxonomic regularization.
Preferably, the data set is animal and plant image data.
Preferably, the training method of the model comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
Preferably, the termination condition is the number of iterations set before training, and the number of iterations is 5000-10000.
Preferably, the classification method, as shown in FIG. 2, includes the following steps:
step S1: inputting image data to be classified into a visual feature extraction module and a semantic feature extraction module to respectively obtain visual features and semantic features;
step S2: inputting the semantic features into a generator to obtain pseudo visual features;
step S3: calculating the similarity between the visual features and the pseudo-visual features and finding the highest similarity; the category to which the pseudo-visual feature with the highest similarity belongs is the category to which the animal or plant image belongs.
Example 1
A zero-shot image classification model training method based on cross knowledge comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
Preferably, the termination condition is the number of iterations set before training, and the number of iterations is 5000.
Example 2
A zero-shot image classification model training method based on cross knowledge comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
Preferably, the termination condition is the number of iterations set before training, and the number of iterations is 7500.
Example 3
A zero-shot image classification model training method based on cross knowledge comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
Preferably, the termination condition is the number of iterations set before training, and the number of iterations is 10000.
Example 4
This embodiment was evaluated on four data sets: CUB (Caltech-UCSD Birds-200-2011), NAB (North America Birds), aPY (Attribute Pascal and Yahoo) and AwA2 (Animals with Attributes 2).
Data set details are shown in Table 1. We use the CUB and NAB data sets as Wikipedia-based ZSL benchmarks and the AwA2 and aPY data sets as attribute-based ZSL benchmarks. For Wikipedia-based ZSL, we use TF-IDF to extract 7551-dimensional features for the CUB data set and 13217-dimensional features for NAB. For attribute-based ZSL, we directly use the attributes provided with the original data sets as semantic features. Likewise, we directly use the visual features provided with the original data sets, extracted with a pre-trained ResNet101.
1. Data set partitioning scenarios
Table 1 data set details
[Table 1 appears only as images in the original publication.]
Note: S denotes the dimension of the semantic features, Type the type of the semantic features, X the number of images, Y the total number of seen and unseen classes, Ys the number of seen classes, and Yu the number of unseen classes.
There are two partitioning strategies for the CUB and NAB data sets: Super-Category-Shared (SCS, easy) and Super-Category-Exclusive (SCE, hard). The two strategies differ in whether seen and unseen classes share the same parent class. In the SCS partition, every unseen class shares a parent class with one or more seen classes; for example, the seen class "Indigo Bunting" and the unseen class "Lazuli Bunting" have the same parent class "Bunting". In the SCE partition, unseen classes never share a parent class with seen classes. Seen and unseen classes are therefore highly correlated under SCS and only weakly correlated under SCE, so zero-shot classification and retrieval are more difficult under SCE than under SCS. The check below illustrates the distinction.
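A small helper makes the distinction concrete; the parent_of mapping from each class to its super-category is a hypothetical input, not data from the patent.

```python
# Sketch of the property distinguishing the two splits: under SCS every unseen
# class shares a parent (super-category) with some seen class; under SCE none do.
def split_type(seen, unseen, parent_of):
    seen_parents = {parent_of[c] for c in seen}
    shared = [c for c in unseen if parent_of[c] in seen_parents]
    if len(shared) == len(unseen):
        return "SCS (Super-Category-Shared)"
    if not shared:
        return "SCE (Super-Category-Exclusive)"
    return "mixed"
```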
2. Two evaluation criteria are used in this embodiment (a sketch of both metrics follows the list):
Top-1 accuracy: the rate at which the top-ranked predicted category matches the ground truth;
Area Under Seen-Unseen Accuracy Curve (AUSUC): the area under the curve that trades off seen-class accuracy against unseen-class accuracy.
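The patent only names these metrics; the AUSUC computation below follows the common calibrated-stacking construction (sweeping a calibration factor that penalizes seen-class scores), which is an assumption rather than a detail from the patent.

```python
import numpy as np

def top1_accuracy(scores, labels):
    """Top-1 accuracy: fraction of samples whose highest-scoring class is correct."""
    return float(np.mean(scores.argmax(axis=1) == labels))

def ausuc(scores, labels, seen_mask, gammas=np.linspace(-5.0, 5.0, 201)):
    """Area Under the Seen-Unseen accuracy Curve via calibrated stacking.
    scores: (n, C) class scores; labels: (n,) ground truth; seen_mask: (C,) bool."""
    sample_is_seen = seen_mask[labels]
    acc_s, acc_u = [], []
    for g in gammas:
        pred = (scores - g * seen_mask).argmax(axis=1)  # penalize seen-class scores
        acc_s.append(np.mean(pred[sample_is_seen] == labels[sample_is_seen]))
        acc_u.append(np.mean(pred[~sample_is_seen] == labels[~sample_is_seen]))
    order = np.argsort(acc_u)  # integrate seen accuracy over unseen accuracy
    return float(np.trapz(np.array(acc_s)[order], np.array(acc_u)[order]))
```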
Table 2 compares the present method with the current best-accuracy methods under the different splits of CUB and NAB. The superscripts attached to the numbers denote the increase or decrease in accuracy relative to the corresponding baseline method. The invention clearly outperforms CIZSL and the other baseline methods.
TABLE 2 comparison of ZSL image classification results on CUB and NAB
[Table 2 appears only as images in the original publication.]
To show that our approach is effective under different semantic representations, we follow the GBU setting and replace the Wikipedia textual semantic representation with attributes. As shown in Table 3, our approach significantly outperforms all other methods on the AwA2 and aPY data sets.
TABLE 3 comparison of ZSL image classifications on AwA2 and aPY datasets
[Table 3 appears only as images in the original publication.]
As shown in Tables 2 and 3, the experimental results improve significantly over the baselines once CKL and TR are used. This shows that the invention not only effectively helps the network distinguish seen categories from unseen categories, but also reduces the prediction space of generalized zero-shot image classification, mitigating the cross-domain and cross-modal problems.
In summary, compared with the baseline methods, the method of this embodiment achieves better results on the evaluation indices, verifying the effectiveness of CKL and TR. In addition, the model has good modularity: the zero-shot image classification model and the generative adversarial model are trained separately.

Claims (8)

1. A zero-shot image classification model based on cross knowledge, characterized by comprising:
a biological classification tree module: constructing a biological classification tree according to all categories in the data set;
a visual feature extraction module: used for converting the images in the data set into one-dimensional visual features;
a semantic feature extraction module: used for converting the texts or attributes in the data set into one-dimensional semantic features;
a cross knowledge learning module: used for enriching the category semantic information, which processes the data obtained by the biological classification tree module, the visual feature extraction module and the semantic feature extraction module and transmits them to the generative adversarial network module;
a generative adversarial network module: comprising a generator and a discriminator, wherein the generator generates pseudo-visual features from semantic features, and the discriminator discriminates the authenticity and the category of an image according to the pseudo-visual features and the visual features;
the biological classification tree comprises a Family level, a Genus level and a Species level, the Species level comprising all categories in the data set; the cross knowledge learning module merges the classes into the Family level and the Genus level of the biological classification tree using biological taxonomy, and performs cross learning within the Family level, the Genus level and the Species level respectively.
2. The cross-knowledge-based image zero-shot classification model of claim 1, wherein the visual feature extraction module adopts ResNet101 and the semantic feature extraction module adopts Term Frequency-Inverse Document Frequency (TF-IDF).
3. The cross-knowledge-based image zero-shot classification model of claim 1, wherein the generator loss is expressed as

L_G = -E_{z~P_z}[D_ω(G_θ(t_s, A, z))] + L_cls(D_ω(G_θ(t_s, A, z))) + L_T

[The discriminator loss L_D, the term L_T, and the quantity K_{s,F} are given only as formula images in the original publication.]
Visual feature data sets of the Family level, the Genus level and the Species level are constructed according to the biological classification tree and the visual vectors.
4. The cross-knowledge-based image zero-shot classification model according to claim 3, wherein the generator employs taxonomic regularization.
5. The cross-knowledge-based image zero-shot classification model according to claim 1, wherein the data set is animal and plant image data.
6. The training method of the cross-knowledge-based image zero-shot classification model according to claim 1, wherein the training method comprises the following steps:
step S1: constructing a biological classification tree according to all class names in the data set;
step S2: respectively inputting the images and the texts or attribute descriptions in the data set into the visual feature extraction module and the semantic feature extraction module, and extracting visual vectors and semantic vectors;
step S3: constructing visual feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the visual vectors;
step S4: constructing semantic feature data sets of the Family level, the Genus level and the Species level according to the biological classification tree and the semantic vectors;
step S5: initializing the discriminator and the generator with taxonomic regularization;
step S6: cross-selecting semantic features and visual features from the visual feature data set and the semantic feature data set of the same level, and cross-training with the losses L_G and L_D of the generative adversarial network;
step S7: inputting the semantic features into the generator to obtain pseudo-visual features, and calculating the TR regularization term from the pseudo-visual features and the visual features;
step S8: inputting the pseudo-visual features and the visual features into the discriminator, and calculating the classification loss and the real/fake loss of L_D;
step S9: calculating gradients from TR, the classification loss and the real/fake loss, and updating L_G and L_D;
step S10: iterating S6-S9 until the termination condition is reached.
7. The cross-knowledge-based training method for the image zero-shot classification model according to claim 6, wherein the termination condition is the number of iterations set before training, the number of iterations being 5000 to 10000.
8. The classification method of the cross-knowledge-based image zero-shot classification model according to claim 1, characterized by comprising the following steps:
step S1: inputting image data to be classified into a visual feature extraction module and a semantic feature extraction module to respectively obtain visual features and semantic features;
step S2: inputting the semantic features into a generator to obtain pseudo visual features;
step S3: calculating the similarity between the visual features and the pseudo-visual features and finding the highest similarity; the category to which the pseudo-visual feature with the highest similarity belongs is the category to which the image belongs.
CN202011402935.4A 2020-12-04 2020-12-04 Image zero-shot classification model based on cross knowledge and classification method thereof Active CN113191381B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011402935.4A CN113191381B (en) 2020-12-04 2020-12-04 Image zero-shot classification model based on cross knowledge and classification method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011402935.4A CN113191381B (en) 2020-12-04 2020-12-04 Image zero-shot classification model based on cross knowledge and classification method thereof

Publications (2)

Publication Number Publication Date
CN113191381A CN113191381A (en) 2021-07-30
CN113191381B (en) 2022-10-11

Family

ID=76972575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011402935.4A Active CN113191381B (en) 2020-12-04 2020-12-04 Image zero-shot classification model based on cross knowledge and classification method thereof

Country Status (1)

Country Link
CN (1) CN113191381B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154645B (en) * 2021-12-03 2022-05-17 中国科学院空间应用工程与技术中心 Cross-center image joint learning method and system, storage medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598759A (en) * 2019-08-23 2019-12-20 天津大学 Zero sample classification method for generating countermeasure network based on multi-mode fusion
CN111563554A (en) * 2020-05-08 2020-08-21 河北工业大学 Zero sample image classification method based on regression variational self-encoder

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10908616B2 (en) * 2017-05-05 2021-02-02 Hrl Laboratories, Llc Attribute aware zero shot machine vision system via joint sparse representations
CN109492662B (en) * 2018-09-27 2021-09-14 天津大学 Zero sample image classification method based on confrontation self-encoder model
CN109816032B (en) * 2019-01-30 2020-09-11 中科人工智能创新技术研究院(青岛)有限公司 Unbiased mapping zero sample classification method and device based on generative countermeasure network
CN110443293B (en) * 2019-07-25 2023-04-07 天津大学 Zero sample image classification method for generating confrontation network text reconstruction based on double discrimination
CN110580501B (en) * 2019-08-20 2023-04-25 天津大学 Zero sample image classification method based on variational self-coding countermeasure network
CN110795585B (en) * 2019-11-12 2022-08-09 福州大学 Zero sample image classification system and method based on generation countermeasure network
CN111026898A (en) * 2019-12-10 2020-04-17 云南大学 Weak supervision image emotion classification and positioning method based on cross space pooling strategy
CN111476294B (en) * 2020-04-07 2022-03-22 南昌航空大学 Zero sample image identification method and system based on generation countermeasure network
CN111914929B (en) * 2020-07-30 2022-08-23 南京邮电大学 Zero sample learning method
CN112017182B (en) * 2020-10-22 2021-01-19 北京中鼎高科自动化技术有限公司 Industrial-grade intelligent surface defect detection method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598759A (en) * 2019-08-23 2019-12-20 天津大学 Zero sample classification method for generating countermeasure network based on multi-mode fusion
CN111563554A (en) * 2020-05-08 2020-08-21 河北工业大学 Zero sample image classification method based on regression variational self-encoder

Also Published As

Publication number Publication date
CN113191381A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN111581405B (en) Cross-modal generalization zero sample retrieval method for generating confrontation network based on dual learning
CN104899253B (en) Towards the society image across modality images-label degree of correlation learning method
CN107766933B (en) Visualization method for explaining convolutional neural network
Sousa et al. Sketch-based retrieval of drawings using spatial proximity
CN113065577A (en) Multi-modal emotion classification method for targets
CN105389326B (en) Image labeling method based on weak matching probability typical relevancy models
CN107862561A (en) A kind of method and apparatus that user-interest library is established based on picture attribute extraction
CN111324765A (en) Fine-grained sketch image retrieval method based on depth cascade cross-modal correlation
CN110472652A (en) A small amount of sample classification method based on semanteme guidance
CN110956044A (en) Attention mechanism-based case input recognition and classification method for judicial scenes
CN112818889A (en) Dynamic attention-based method for integrating accuracy of visual question-answer answers by hyper-network
CN114461890A (en) Hierarchical multi-modal intellectual property search engine method and system
Li et al. Multi-view pairwise relationship learning for sketch based 3D shape retrieval
CN113191381B (en) Image zero-order classification model based on cross knowledge and classification method thereof
CN113239159B (en) Cross-modal retrieval method for video and text based on relational inference network
Mi et al. Knowledge-aware cross-modal text-image retrieval for remote sensing images
CN107908749A (en) A kind of personage's searching system and method based on search engine
Averbuch‐Elor et al. Distilled collections from textual image queries
CN116579348A (en) False news detection method and system based on uncertain semantic fusion
CN116562280A (en) Literature analysis system and method based on general information extraction
Rao et al. Deep learning-based image retrieval system with clustering on attention-based representations
Liu et al. A method of measuring the semantic gap in image retrieval: Using the information theory
CN115359486A (en) Method and system for determining custom information in document image
Ye et al. Cross-modality pyramid alignment for visual intention understanding
Tian et al. Research on image classification based on a combination of text and visual features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant