CN114299326A - Small sample classification method based on conversion network and self-supervision - Google Patents
- Publication number
- CN114299326A (application CN202111483193.7A)
- Authority
- CN
- China
- Prior art keywords
- class
- feature
- embedding
- small sample
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a small sample (few-shot) classification method based on a conversion network and self-supervision. A conversion network module is added to a general classification model; different noises are injected for feature enhancement, and feature embeddings that are both discriminative and diverse are synthesized, so that the trained model adapts better to downstream small sample tasks. The method comprises the following steps: acquiring an image data set for training a feature extractor and a conversion network module; feeding the image data set into the network, using a feature enhancement method to obtain discriminative and diverse feature embeddings, and training the feature extractor and the conversion network module jointly with self-supervised learning by optimizing the sum of several cross-entropy losses and a KL divergence; and applying the trained feature extractor and conversion network module to a small sample classification task. The invention performs well on four small sample classification benchmarks (miniImageNet, tieredImageNet, CIFAR-FS and Caltech-UCSD), which demonstrates its effectiveness.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a small sample classification method that incorporates a conversion network and self-supervision.
Background
Small sample (few-shot) learning aims to recognize target classes from only a small number of samples per class. To accomplish this, many existing methods train a model on base classes, each of which contains a large number of labeled samples, and then apply the trained model to the test task. Depending on what is transferred from the base classes, existing small sample learning methods fall roughly into three categories: meta-learning-based methods, metric-learning-based methods, and data-augmentation-based methods.
Meta-learning-based methods try to learn a meta-learner that can adjust the optimization algorithm so that the model adapts quickly to a small sample task;
metric-learning-based methods learn a transferable distance metric function to evaluate the similarity between samples;
data-augmentation-based methods augment the data using general image transformation techniques or generative adversarial networks. However, the performance of such augmentation is not always satisfactory, because the augmented data lack the characteristics required by small sample tasks.
Classification in small sample learning is usually posed as a C-way K-shot problem: in the training stage, C classes are randomly drawn from the training set and K samples of each class (C × K samples in total) form the support set of the model; Q samples are then drawn from the remaining data of the same C classes to form the query set. The model must learn to distinguish the C classes from only these C × K samples.
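The C-way K-shot sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_episode`, the toy label list, and the choice of drawing Q query samples per class (the text does not say whether Q is per class or in total) are all hypothetical.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Return (support, query) lists of (image_index, class) pairs for one episode."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Randomly pick C classes, then K support and Q query images per class.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]  # drawn from the remaining data
    return support, query

# Toy data set (assumption): 10 classes with 20 images each.
labels = [i % 10 for i in range(200)]
support, query = sample_episode(labels, n_way=5, k_shot=1, n_query=3,
                                rng=random.Random(0))
```

With these settings the support set holds C × K = 5 samples and the query set holds 15, and no image appears in both.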
Disclosure of Invention
The invention provides a small sample classification method that incorporates a conversion network and self-supervision, so that the trained model adapts better to downstream small sample tasks. A conversion network module is added; it consists of an encoder-decoder pair and outputs a synthesized feature embedding. The method uses a simple feature synthesis technique to perturb the feature space and synthesizes feature embeddings that are both discriminative and diverse. Discriminability is achieved by requiring each synthesized feature embedding to be classified into the class of the original feature embedding; diversity is achieved by classifying the synthesized feature embeddings into different subclasses according to the noise that was added, which is where self-supervised learning is used. These are exactly the properties that small sample tasks require.
To achieve this purpose, the technical scheme of the invention is as follows:
A small sample classification method incorporating a conversion network and self-supervision comprises the following steps:
S1, acquiring an image data set for training the feature extractor and the conversion network module;
S2, feeding the image data set into the network, using a feature enhancement method to obtain discriminative and diverse feature embeddings, and training the feature extractor and the conversion network module jointly with self-supervised learning;
S3, using the trained model for a small sample classification task.
Further, in step S1, a base-class data set $D_{base} = \{(x_i, y_i)\}_{i=1}^{n}$ is given, where n is the total number of images in the data set, $x_i$ and $y_i$ denote the i-th image and its class label respectively, $y_i \in \{1, \ldots, C\}$, and C is the total number of classes, each class containing multiple images.
Further, step S2 specifically includes:
S21, deep neural network training proceeds in batches: a batch of image samples $B = \{(x_i, y_i)\}_{i=1}^{N_{bs}}$ is randomly sampled from the image data set, where the batch size $N_{bs}$ is preset;
S22, the image samples in batch B are fed into a model consisting of a backbone network and a classifier to obtain their predicted probabilities. With the cross-entropy (CE) loss, the optimization objective of the model is
$$\min_{\Theta} L_1 = \sum_{(x_i, y_i) \in B} L_{ce}\big(g(f(x_i)), y_i\big) + \lambda R(\Theta)$$
where f and g denote the feature extractor and the classifier respectively, Θ is the parameter set, $L_{ce}$ denotes the CE loss, R denotes the regularization term on the parameter set, and λ is a hyperparameter.
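As a rough illustration of the objective in S22, the sketch below evaluates a cross-entropy loss plus a regularization term on a toy batch. Several choices are assumptions rather than details from the text: the CE terms are summed over the batch, R(Θ) is taken to be the squared L2 norm of the parameters, class labels are 0-based, and the classifier outputs g(f(x_i)) are supplied directly as logits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """CE loss of one sample; `label` is a 0-based class index."""
    return -math.log(softmax(logits)[label])

def loss_l1(batch_logits, batch_labels, params, lam):
    """Sum of CE over the batch plus lam * R(params), with R = squared L2 norm (assumption)."""
    ce = sum(cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels))
    reg = sum(p * p for p in params)
    return ce + lam * reg

# Toy batch: one sample with uniform logits over two classes.
value = loss_l1([[0.0, 0.0]], [0], params=[1.0, 2.0], lam=0.1)
```

Here the CE term equals ln 2 (uniform prediction over two classes) and the regularizer contributes 0.1 × (1 + 4) = 0.5.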
S23, to make the synthesized feature embeddings discriminative, they are fed into the classification network used for the original visual features, and the predicted class is required to match the class of the original visual feature. The classification loss of the synthesized feature embeddings is
$$\min_{\Theta} L_2 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} L_{ce}\big(g(T(f(x_i), c_j)), y_{ij}\big)$$
where t is the number of additional synthesized feature embeddings, $c_j$ is the j-th noise, drawn from a Gaussian distribution, T is the conversion network module, $y_{ij}$ is the class label of the synthesized feature $T(f(x_i), c_j)$, which is the same as the class label of the original visual feature $f(x_i)$, and Θ denotes the parameter set of the entire model.
S24, to make the synthesized feature embeddings diverse, features synthesized with different noises are assigned to different subclasses. The original visual features and the synthesized feature embeddings are fed into a second classifier, distinct from the one above, which must assign them to different categories:
$$L_3 = \sum_{(x_i, y_i) \in B} \Big[ L_{ce}\big(h(f(x_i)), l_{i0}\big) + \sum_{j=1}^{t} L_{ce}\big(h(T(f(x_i), c_j)), l_{ij}\big) \Big]$$
where $l_{ij}$ is a self-supervised class label assigned according to the distribution of the added noise ($l_{i0}$ being the label of the original, noise-free feature), and h denotes the self-supervised classifier.
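The two kinds of labels used in S23 and S24 can be illustrated as follows. This is a sketch under explicit assumptions: the conversion network T is replaced by a simple stand-in that adds Gaussian noise to the feature (the actual module is an encoder-decoder pair), each noise distribution is identified by a hypothetical standard deviation, and the original feature is given the self-supervised label 0.

```python
import random

def synthesize(feature, noise_stds, rng):
    """Return [(embedding, j)]: j = 0 is the original feature, j >= 1 the j-th noise."""
    out = [(list(feature), 0)]
    for j, std in enumerate(noise_stds, start=1):
        noise = [rng.gauss(0.0, std) for _ in feature]
        # Stand-in for T(f(x_i), c_j); the real conversion network is learned.
        out.append(([a + b for a, b in zip(feature, noise)], j))
    return out

def build_targets(features, class_labels, noise_stds, rng):
    """Attach the class label y_ij (= y_i) and the self-supervised label l_ij (= j)."""
    rows = []
    for feat, y in zip(features, class_labels):
        for emb, j in synthesize(feat, noise_stds, rng):
            rows.append({"embedding": emb, "class_label": y, "ssl_label": j})
    return rows

# Toy features for two images of classes 0 and 1, with t = 2 noise distributions.
features = [[1.0, 2.0], [3.0, 4.0]]
class_labels = [0, 1]
rows = build_targets(features, class_labels, noise_stds=[0.1, 0.5],
                     rng=random.Random(0))
```

Every synthesized embedding keeps the class label of its source image, which is what the classifier g is trained on, while the self-supervised label records only which noise was used, which is what the classifier h is trained on.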
S25, the synthesized feature embeddings are regularized in the label space using real visual features, so that they preserve the inter-class relationships of the real visual features:
$$L_4 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} KL\big(g(f(x_{ij})) \,\|\, g(T(f(x_i), c_j))\big)$$
where KL denotes the Kullback-Leibler divergence and $x_{ij}$ is a real sample of class $y_i$. $f(x_{ij})$ acts as the supervisor of the synthesized feature embedding $T(f(x_i), c_j)$ and is not itself optimized.
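A minimal sketch of the KL regularizer in S25 is given below. It assumes that both the real feature and the synthesized embedding are mapped to class-probability vectors by the classifier and that the real sample's distribution is the fixed target; the direction KL(real ‖ synthesized) and the example logits are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical classifier outputs (logits) for a real sample and a synthesized embedding.
real_probs = softmax([2.0, 0.5, 0.1])    # supervisor: treated as a constant target
synth_probs = softmax([1.5, 0.9, 0.2])   # prediction for the synthesized embedding
penalty = kl_divergence(real_probs, synth_probs)
```

The penalty is zero only when the two distributions coincide, so minimizing it pulls the class distribution of the synthesized embedding toward that of the real sample, including the relative probabilities of the non-target classes.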
S26, the overall optimization objective is
$$L_{all} = L_1 + L_2 + \alpha L_3 + \beta L_4$$
Where α and β are hyperparameters.
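For concreteness, the weighted sum in S26 can be written as a one-line function. The loss values and the settings of α and β below are made-up numbers chosen only to show the arithmetic; the text does not report the values actually used.

```python
def total_loss(l1, l2, l3, l4, alpha, beta):
    """Overall objective: L_all = L1 + L2 + alpha * L3 + beta * L4."""
    return l1 + l2 + alpha * l3 + beta * l4

# Hypothetical per-batch loss values and hyperparameters.
example = total_loss(l1=1.2, l2=0.9, l3=0.5, l4=0.1, alpha=0.5, beta=2.0)
# 1.2 + 0.9 + 0.5 * 0.5 + 2.0 * 0.1 = 2.55
```

The two classification losses enter with unit weight, while α scales the self-supervised (diversity) term and β scales the KL regularizer.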
S27, training the deep neural network on the resulting total loss using a stochastic gradient descent (SGD) optimizer with momentum and the back-propagation algorithm;
S28, repeating steps S21 to S27 until the model converges.
Further, step S3 specifically includes:
S31, given a C-way K-shot classification task with support set S, a final feature representation of each support sample $x_u$ is first obtained through the feature extractor and the conversion network module;
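S32, the visual prototype of each class is calculated as
$$p_c = \frac{1}{|S_c|} \sum_{x_u \in S_c} z_u$$
where c denotes a class, $z_u$ denotes the final feature representation of $x_u$ obtained in S31, and $S_c$ and $|S_c|$ are the support set of class c and the number of samples in it;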
S33, for a test sample $x_u$ in the query set, the probability that it belongs to class c is
$$P(y = c \mid x_u) = \frac{\exp\big(d(z_u, p_c)\big)}{\sum_{c'} \exp\big(d(z_u, p_{c'})\big)}$$
where d is a similarity metric function, $z_u$ is the final feature representation of $x_u$ and $p_c$ is the visual prototype of class c. Finally, the class of the test sample is predicted from its probabilities over the C classes: the class with the highest probability is the predicted class.
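The prototype-based prediction of S31 to S33 can be sketched as below. The sketch assumes that final feature vectors are already available (how the feature extractor and conversion network outputs are combined is not reproduced here), uses cosine similarity for d as stated in claim 4, and turns similarities into probabilities with a softmax; the class names and feature values are toy data.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors (the class prototype)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def class_probabilities(query_feat, prototypes):
    """Softmax over the similarities between the query feature and each prototype."""
    classes = sorted(prototypes)
    sims = [cosine(query_feat, prototypes[c]) for c in classes]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}

# Toy 2-way 2-shot task with 2-dimensional final features (assumption).
support = {"cat": [[1.0, 0.1], [0.9, 0.0]], "dog": [[0.0, 1.0], [0.1, 0.9]]}
prototypes = {c: mean_vector(v) for c, v in support.items()}
probs = class_probabilities([0.8, 0.2], prototypes)
predicted = max(probs, key=probs.get)
```

Because cosine similarity is bounded in [-1, 1], practical implementations often multiply it by a temperature before the softmax; the text does not mention one, so none is used here.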
The small sample classification method incorporating a conversion network and self-supervision has the following advantages:
first, the method synthesizes visual features directly rather than input data, and ensures that the synthesized feature embeddings are discriminative and diverse by introducing self-supervised learning (SSL) supervision;
second, the method shows that synthesized feature embeddings provide an additional modality for feature representation, so that the model adapts better to downstream small sample tasks;
third, the method performs well on four small sample classification benchmarks (miniImageNet, tieredImageNet, CIFAR-FS and Caltech-UCSD), which demonstrates its effectiveness.
Drawings
Fig. 1 is a schematic flow chart of the small sample classification method incorporating a conversion network and self-supervision according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
On the contrary, the invention is intended to cover alternatives, modifications and equivalents which may be included within the spirit and scope of the invention as defined by the appended claims.
Referring to Fig. 1, in a preferred embodiment of the present invention, a small sample classification method incorporating a conversion network and self-supervision includes the following steps:
First, an image data set is obtained for training the feature extractor and the conversion network module.
Specifically, a base-class data set $D_{base} = \{(x_i, y_i)\}_{i=1}^{n}$ is given, where n is the total number of images in the data set, $x_i$ and $y_i$ denote the i-th image and its class label respectively, $y_i \in \{1, \ldots, C\}$, and C is the total number of classes, each class containing multiple images.
Then, the image data set is fed into the network, discriminative and diverse feature embeddings are obtained using a feature enhancement method, and the feature extractor and the conversion network module are trained jointly with self-supervised learning. This specifically comprises the following steps:
First, the deep neural network is trained in batches: a batch of image samples $B = \{(x_i, y_i)\}_{i=1}^{N_{bs}}$ is randomly sampled from the image data set, where the batch size $N_{bs}$ is given in advance.
Second, the image samples in batch B are fed into a model consisting of a backbone network and a classifier to obtain their predicted probabilities. With the cross-entropy (CE) loss, the optimization objective of the model is
$$\min_{\Theta} L_1 = \sum_{(x_i, y_i) \in B} L_{ce}\big(g(f(x_i)), y_i\big) + \lambda R(\Theta)$$
where f and g denote the feature extractor and the classifier respectively, Θ is the parameter set, $L_{ce}$ denotes the CE loss, R denotes the regularization term on the parameter set, and λ is a hyperparameter.
Third, to make the synthesized feature embeddings discriminative, they are fed into the classification network used for the original visual features, and the predicted class is required to match the class of the original visual feature. The classification loss of the synthesized feature embeddings is
$$\min_{\Theta} L_2 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} L_{ce}\big(g(T(f(x_i), c_j)), y_{ij}\big)$$
where t is the number of additional synthesized feature embeddings, $c_j$ is the j-th noise, drawn from a Gaussian distribution, T is the conversion network module, $y_{ij}$ is the class label of the synthesized feature $T(f(x_i), c_j)$, which is the same as the class label of the original visual feature $f(x_i)$, and Θ denotes the parameter set of the entire model.
Fourth, to make the synthesized feature embeddings diverse, features synthesized with different noises are assigned to different subclasses. The original visual features and the synthesized feature embeddings are fed into a second classifier, distinct from the one above, which must assign them to different categories:
$$L_3 = \sum_{(x_i, y_i) \in B} \Big[ L_{ce}\big(h(f(x_i)), l_{i0}\big) + \sum_{j=1}^{t} L_{ce}\big(h(T(f(x_i), c_j)), l_{ij}\big) \Big]$$
where $l_{ij}$ is a self-supervised class label assigned according to the distribution of the added noise ($l_{i0}$ being the label of the original, noise-free feature), and h denotes the self-supervised classifier.
Fifth, the synthesized feature embeddings are regularized in the label space using real visual features, so that they preserve the inter-class relationships of the real visual features:
$$L_4 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} KL\big(g(f(x_{ij})) \,\|\, g(T(f(x_i), c_j))\big)$$
where KL denotes the Kullback-Leibler divergence and $x_{ij}$ is a real sample of class $y_i$. $f(x_{ij})$ acts as the supervisor of the synthesized feature embedding $T(f(x_i), c_j)$ and is not itself optimized.
Sixth, the overall optimization objective is
$$L_{all} = L_1 + L_2 + \alpha L_3 + \beta L_4$$
Where α and β are hyperparameters.
Seventh, the deep neural network is trained on the resulting total loss using a stochastic gradient descent (SGD) optimizer with momentum and the back-propagation algorithm.
The above steps are repeated until the model converges.
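The update rule of SGD with momentum, repeated until convergence, can be sketched on a toy problem. The learning rate, momentum coefficient, step count and the quadratic objective are illustrative assumptions; in the method itself the gradients come from back-propagating the total loss through the network.

```python
def sgd_momentum_step(params, grads, velocity, lr, momentum):
    """One SGD-with-momentum update: v <- momentum * v - lr * g, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity

# Toy objective (assumption): minimise (w - 3)^2, whose gradient is 2 * (w - 3).
w, v = [0.0], [0.0]
for _ in range(300):
    grad = [2.0 * (w[0] - 3.0)]
    w, v = sgd_momentum_step(w, grad, v, lr=0.1, momentum=0.9)
```

After the loop `w[0]` is close to the minimiser 3. The velocity term accumulates past gradients, which damps oscillation between batches compared with plain SGD.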
Finally, the trained model is used for a small sample classification task.
Given a C-way K-shot classification task with support set S, a final feature representation of each support sample $x_u$ is first obtained through the feature extractor and the conversion network module.
Then, the visual prototype of each class is calculated as
$$p_c = \frac{1}{|S_c|} \sum_{x_u \in S_c} z_u$$
where c denotes a class, $z_u$ denotes the final feature representation of $x_u$, and $S_c$ and $|S_c|$ are the support set of class c and the number of samples in it.
Further, for a test sample $x_u$ in the query set, the probability that it belongs to class c is
$$P(y = c \mid x_u) = \frac{\exp\big(d(z_u, p_c)\big)}{\sum_{c'} \exp\big(d(z_u, p_{c'})\big)}$$
where d is a similarity metric function. Finally, the class of the test sample is predicted from its probabilities over the C classes: the class with the highest probability is the predicted class.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (4)
1. A small sample classification method incorporating a conversion network and self-supervision, characterized by comprising the following steps:
S1, acquiring an image data set for training the feature extractor and the conversion network module;
S2, feeding the image data set into the network, using a feature enhancement method to obtain discriminative and diverse feature embeddings, and training the feature extractor and the conversion network module jointly with self-supervised learning;
S3, using the trained model for a small sample classification task.
2. The small sample classification method incorporating a conversion network and self-supervision according to claim 1, wherein in step S1 a base-class data set $D_{base} = \{(x_i, y_i)\}_{i=1}^{n}$ is given, where n is the total number of images in the data set, $x_i$ and $y_i$ denote the i-th image and its class label respectively, $y_i \in \{1, \ldots, C\}$, and C is the total number of classes, each class containing multiple images.
3. The small sample classification method incorporating a conversion network and self-supervision according to claim 2, wherein step S2 specifically includes:
S21, training proceeds in batches: a batch of image samples $B = \{(x_i, y_i)\}_{i=1}^{N_{bs}}$ is randomly sampled from the image data set, where the batch size $N_{bs}$ is preset;
S22, the image samples in batch B are fed into a model consisting of a backbone network and a classifier to obtain their predicted probabilities; with the cross-entropy (CE) loss, the optimization objective of the model is
$$\min_{\Theta} L_1 = \sum_{(x_i, y_i) \in B} L_{ce}\big(g(f(x_i)), y_i\big) + \lambda R(\Theta)$$
where f and g denote the feature extractor and the classifier respectively, Θ is the parameter set, $L_{ce}$ denotes the CE loss, R denotes the regularization term on the parameter set, and λ is a hyperparameter;
S23, to make the synthesized feature embeddings discriminative, they are fed into the classification network used for the original visual features, and the predicted class is required to match the class of the original visual feature; the classification loss of the synthesized feature embeddings is
$$\min_{\Theta} L_2 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} L_{ce}\big(g(T(f(x_i), c_j)), y_{ij}\big)$$
where t is the number of additional synthesized feature embeddings, $c_j$ is the j-th noise, drawn from a Gaussian distribution, T is the conversion network module, $y_{ij}$ is the class label of the synthesized feature $T(f(x_i), c_j)$, which is the same as the class label of the original visual feature $f(x_i)$, and Θ denotes the parameter set of the entire model;
S24, to make the synthesized feature embeddings diverse, features synthesized with different noises are assigned to different subclasses; the original visual features and the synthesized feature embeddings are fed into a second classifier, distinct from the one above, which must assign them to different categories:
$$L_3 = \sum_{(x_i, y_i) \in B} \Big[ L_{ce}\big(h(f(x_i)), l_{i0}\big) + \sum_{j=1}^{t} L_{ce}\big(h(T(f(x_i), c_j)), l_{ij}\big) \Big]$$
where $l_{ij}$ is a self-supervised class label assigned according to the distribution of the added noise ($l_{i0}$ being the label of the original, noise-free feature), and h denotes the self-supervised classifier;
S25, the synthesized feature embeddings are regularized in the label space using real visual features, so that they preserve the inter-class relationships of the real visual features:
$$L_4 = \sum_{(x_i, y_i) \in B} \sum_{j=1}^{t} KL\big(g(f(x_{ij})) \,\|\, g(T(f(x_i), c_j))\big)$$
where KL denotes the Kullback-Leibler divergence and $x_{ij}$ is a real sample of class $y_i$; $f(x_{ij})$ acts as the supervisor of the synthesized feature embedding $T(f(x_i), c_j)$ and is not itself optimized;
S26, the overall optimization objective is
$$L_{all} = L_1 + L_2 + \alpha L_3 + \beta L_4$$
where α and β are hyperparameters;
S27, training the deep neural network on the resulting total loss using a stochastic gradient descent (SGD) optimizer with momentum and the back-propagation algorithm;
S28, repeating steps S21 to S27 until the model converges.
4. The small sample classification method incorporating a conversion network and self-supervision according to any one of claims 1 to 3, wherein step S3 specifically includes:
S31, given a C-way K-shot classification task with support set S, a final feature representation of each support sample $x_u$ is first obtained through the feature extractor and the conversion network module;
S32, the visual prototype of each class is calculated as
$$p_c = \frac{1}{|S_c|} \sum_{x_u \in S_c} z_u$$
where c denotes a class, $z_u$ denotes the final feature representation of $x_u$ obtained in S31, and $S_c$ and $|S_c|$ are the support set of class c and the number of samples in it;
S33, for a test sample $x_u$ in the query set, the probability that it belongs to class c is
$$P(y = c \mid x_u) = \frac{\exp\big(d(z_u, p_c)\big)}{\sum_{c'} \exp\big(d(z_u, p_{c'})\big)}$$
where d is the similarity metric function, for which the cosine similarity function is used in the present invention. Finally, the class of the test sample is predicted from its probabilities over the C classes: the class with the highest probability is the predicted class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111483193.7A CN114299326A (en) | 2021-12-07 | 2021-12-07 | Small sample classification method based on conversion network and self-supervision |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111483193.7A CN114299326A (en) | 2021-12-07 | 2021-12-07 | Small sample classification method based on conversion network and self-supervision |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114299326A true CN114299326A (en) | 2022-04-08 |
Family
ID=80965005
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111483193.7A Pending CN114299326A (en) | 2021-12-07 | 2021-12-07 | Small sample classification method based on conversion network and self-supervision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114299326A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114936615A (en) * | 2022-07-25 | 2022-08-23 | 南京大数据集团有限公司 | Small sample log information anomaly detection method based on characterization consistency correction |
CN116071609A (en) * | 2023-03-29 | 2023-05-05 | 中国科学技术大学 | Small sample image classification method based on dynamic self-adaptive extraction of target features |
- 2021-12-07: CN application CN202111483193.7A (published as CN114299326A), status Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114936615A (en) * | 2022-07-25 | 2022-08-23 | 南京大数据集团有限公司 | Small sample log information anomaly detection method based on characterization consistency correction |
CN114936615B (en) * | 2022-07-25 | 2022-10-14 | 南京大数据集团有限公司 | Small sample log information anomaly detection method based on characterization consistency correction |
CN116071609A (en) * | 2023-03-29 | 2023-05-05 | 中国科学技术大学 | Small sample image classification method based on dynamic self-adaptive extraction of target features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||