CN114580566A - Small sample image classification method based on interval supervision contrast loss - Google Patents


Info

Publication number
CN114580566A
CN114580566A
Authority
CN
China
Prior art keywords
sample
samples
contrast loss
image
support
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210289086.9A
Other languages
Chinese (zh)
Inventor
杨赛
胡彬
杨慧
周伯俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University filed Critical Nantong University
Priority to CN202210289086.9A priority Critical patent/CN114580566A/en
Publication of CN114580566A publication Critical patent/CN114580566A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of small sample (few-shot) image classification, and in particular to a small sample image classification method based on an interval (margin) supervised contrast loss, which comprises the following steps: pre-train a model on the base class dataset with a novel interval supervised contrast loss function; fix the parameters of the encoder in the pre-trained model; extract features of the support image samples in the new class dataset and train an SVM classifier; finally, use the SVM to make the classification decision on the query samples. The interval supervised contrast loss function models the contrastive relations between base class samples rather than attending only to the class each base class sample belongs to, so the pre-trained backbone network transfers better. By increasing the margin parameter, the supervised contrast loss further reduces intra-class distances and enlarges inter-class distances, further improving classification performance.

Description

Small sample image classification method based on interval supervision contrast loss
Technical Field
The invention relates to the technical field of small sample image classification, in particular to a small sample image classification method based on interval supervision contrast loss.
Background
In recent years, deep learning has made significant breakthroughs in several fields of artificial intelligence, such as computer vision, speech recognition, natural language processing, and autonomous driving. However, the powerful learning capability of deep convolutional neural networks relies on large amounts of manually labeled data, and this data-hungry nature severely limits the application of deep learning in many practical situations: accurately labeling large amounts of data is very costly and sometimes impossible, for example for medical data of rare diseases or image data of rare species. Against this background, researchers have actively studied the small sample image classification problem, whose goal is to have a deep neural network classify and recognize images when the number of samples per category is very small.
Meta-learning acquires intrinsic meta-knowledge by learning over a large number of tasks so that new tasks can be handled rapidly, and it is the mainstream learning paradigm for small sample image classification. Recent studies, however, have shown that combining a feature extractor pre-trained on the entire base class dataset with a traditional classifier can achieve classification performance comparable to complex meta-learning models. Such methods adopt a two-stage transfer learning paradigm: a backbone network is first trained on a large labeled base class dataset, so that a good feature extractor can be provided for the new class data, which have only a few samples, in the subsequent post-processing stage. Generally, these methods optimize with a cross-entropy loss function, then remove the last base-class-specific layer of the backbone network, extract new class features and make the classification decision. For example, Chen et al. (Chen W Y, Liu Y C, Kira Z, Wang Y C F, Huang J B. A closer look at few-shot classification [C]//Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR, 2019: 1-8.) propose to optimize a neural network consisting of a backbone network and a cosine classifier on the base class dataset using the cross-entropy loss, then fix the backbone parameters and fine-tune the cosine classifier with the new class support samples to classify the query samples. Subsequently, several works studied how to further improve the generalization performance of the backbone network. Liu et al. (Liu B, Cao Y, Lin Y T, Li Q, Zhang Z, Long M S, Hu H. Negative margin matters: understanding margin in few-shot classification [C]//Proceedings of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 1-8.) also employ a model consisting of a backbone network and a cosine classifier in the pre-training phase, but optimize with a cross-entropy loss with margin; Mangla et al. (Mangla P, Kumari N, Sinha A, Singh M, Krishnamurthy B, Balasubramanian V N. Charting the right manifold: Manifold mixup for few-shot learning [C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Colorado, USA, 2020: 2218-2227.) and Tian et al. (Tian Y, Wang Y, Krishnan D, Tenenbaum J B, Isola P. Rethinking few-shot image classification: a good embedding is all you need? [C]//Proceedings of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020.) respectively introduce manifold mixup with self-supervised auxiliary objectives, and knowledge self-distillation, into the pre-training process; Rizve et al. (Rizve M N, Khan S, Khan F S, Shah M. Exploring complementary strengths of invariant and equivariant representations for few-shot learning [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021.) design a multi-task learning loss function that maintains equivariance and invariance to complete the pre-training of the backbone network.
After the pre-training phase, only the backbone network of the model is retained, since the last layer associated with the base classes is removed. To classify and recognize new class data, the decision is then made by a re-trained classifier. Owing to their simplicity and effectiveness, the non-parametric K-nearest-neighbor (KNN) classifier and the prototype classifier are common classification models in small sample image classification tasks. For example, Wang et al. (Wang Y, Chao W L, Weinberger K Q, van der Maaten L. SimpleShot: revisiting nearest-neighbor classification for few-shot learning. arXiv preprint arXiv:1911.04623, 2019: 1-8.) achieve good classification performance with a KNN classifier after whitening the new class data. Wang et al. (Wang Z Y, Zhao Y F, Li J, Tian Y H. Cooperative bi-path metric for few-shot learning [C]//Proceedings of the 28th ACM International Conference on Multimedia. New York, USA: ACM Press, 2020: 1524-1532.) also use a KNN classifier, but determine the neighbor samples using the correlations of the query samples with both the support samples and the base classes. In addition, parametric classifiers such as the SVM and logistic regression also achieve good classification performance in small sample image classification tasks.
The transfer learning methods above achieve classification performance comparable to complex meta-learning models. However, they generally pre-train on the entire base class dataset with a cross-entropy loss function, which attends mainly to the class each base class sample belongs to and therefore seriously limits the transferability of the learned features to the new class image domain.
In order to solve the above problems, the present invention provides a small sample image classification method based on interval supervision contrast loss.
Disclosure of Invention
In view of the above problems, the invention provides a small sample image classification method based on interval supervision contrast loss, which makes the backbone network more transferable by pre-training on the base class dataset with a supervised contrastive loss function with margin, thereby improving the recognition accuracy on new class data samples.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a small sample image classification method based on interval supervision contrast loss comprises the following steps:
step 1: for a given dataset D, determine the base class dataset Dbase and the new class dataset Dnovel to be processed, where

Dbase = {(xi, yi)},

xi denotes the i-th base class image sample and yi its corresponding label; on Dnovel, N-way-K-shot classification tasks are constructed, and each classification task consists of a support sample set

S = {(xs, ys), s = 1, 2, …, N×K}

and a query sample set

Q = {xq, q = 1, 2, …, Q},

where xs denotes the s-th support sample, ys its corresponding label, and xq the q-th query sample;
step 2: randomly sample M image samples from the base class dataset Dbase, expand them to 2M samples using image augmentation techniques, and input them to the encoder Eα(·) with parameters α and the projector Pβ(·) with parameters β to obtain a set of 2M data features

{ui, i = 1, 2, …, 2M};

taking any element of the set as the anchor, all samples in the set A(i) that share the anchor's label are taken as positive samples to construct the positive sample set P(i), and all remaining samples in A(i) are taken as negative samples to construct the negative sample set N(i);
step 3: compute the interval supervision contrast loss between the positive sample set P(i) and the negative sample set N(i), compute the gradient of the loss function with respect to the parameters, and optimize the parameters α of the encoder Eα(·) and β of the projector Pβ(·) by gradient descent;
step 4: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the support image samples in the new class dataset and train an SVM;
step 5: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the query image samples in the new class dataset, and test with the trained SVM.
Preferably, the specific steps of step 1 are as follows:
s11: for a given dataset D, partition it into three subsets, the training set Dtrain, the validation set Dval and the test set Dtest; the three subsets contain disjoint classification categories and together cover D, namely:

Dtrain ∩ Dval = Dtrain ∩ Dtest = Dval ∩ Dtest = ∅,
Dtrain ∪ Dval ∪ Dtest = D;

s12: take the training set Dtrain as the base class dataset Dbase and the test set Dtest as the new class dataset Dnovel; from Dnovel, randomly draw N classes and, within each class, randomly draw K samples to obtain the support sample set

S = {(xs, ys), s = 1, 2, …, N×K};

then draw a batch of samples from the remaining data of the N classes to obtain the query sample set

Q = {xq, q = 1, 2, …, Q}.
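The episodic sampling in s12 can be sketched as follows; this is a minimal illustration, and the function name, the list-of-(image, label) dataset format, and the seeded RNG are assumptions for the example, not part of the patent:

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way, k_shot, n_query, seed=None):
    """Sample one N-way-K-shot task: n_way random classes, k_shot support
    and n_query query samples per class, drawn without replacement."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in dataset:
        by_class[y].append(x)
    classes = rng.sample(sorted(by_class), n_way)   # the N novel classes
    support, query = [], []
    for y in classes:
        picks = rng.sample(by_class[y], k_shot + n_query)
        support += [(x, y) for x in picks[:k_shot]]  # S = {(xs, ys)}
        query += [(x, y) for x in picks[k_shot:]]    # Q (labels kept for evaluation)
    return support, query
```

For a 5-way-1-shot task with 5 queries per class, this returns 5 support pairs and 25 query pairs drawn from 5 distinct classes.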
Preferably, the specific steps of step 2 are as follows:
s21: draw M image samples from Dtrain as a training batch image sample set DM; for the i-th base class image sample xi, generate an augmented copy x̃i with an image enhancement technique and add it to the batch image sample set, giving 2M samples in total;

s22: input the samples of the dataset DM to the encoder Eα(·) and the projector Pβ(·) to obtain the corresponding d-dimensional feature vectors, where the encoder Eα(·) is a ResNet-12 and the projector Pβ(·) is a multilayer perceptron with a single hidden layer; the feature of the i-th base class image sample xi is then:

ui = Pβ(Eα(xi));

s23: through this feature extraction step, a set of 2M data features

{ui, i = 1, 2, …, 2M}

is obtained; taking ui as the anchor, all samples in the set A(i) whose labels equal yi are regarded as positive samples and form the positive sample set P(i), and the remaining samples are regarded as negative samples and form the negative sample set N(i).
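The index-set construction in s23 can be sketched in pure Python; the function name is an assumption for the example. For each anchor i, the candidate set A(i) contains all other indices in the 2M-sample batch, the same-label ones form P(i), and the rest form N(i):

```python
def positive_negative_sets(labels):
    """Given the labels of the 2M augmented batch samples, return for each
    anchor i the positive index set P(i) (same label, excluding i) and the
    negative index set N(i) (all remaining indices in A(i))."""
    sets = []
    for i, yi in enumerate(labels):
        p = [j for j, yj in enumerate(labels) if j != i and yj == yi]
        n = [j for j, yj in enumerate(labels) if j != i and yj != yi]
        sets.append((p, n))
    return sets
```

Because each original sample has an augmented copy with the same label, every anchor is guaranteed at least one positive.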
Preferably, the specific steps of step 3 are as follows:
s31: the interval supervision contrast loss function constructed from the positive sample set P(i) and the negative sample set N(i) is computed as:

L = Σi −(1/|P(i)|) Σp∈P(i) log[ exp((s(ui, up) − m)/τ) / ( exp((s(ui, up) − m)/τ) + Σa∈N(i) exp(s(ui, ua)/τ) ) ]

where τ is the temperature coefficient, m is the margin (interval) coefficient, up is the p-th sample in the positive sample set P(i), ua is a sample in the negative sample set N(i), and s(·,·) denotes the cosine similarity of any two samples:

s(ui, uj) = (ui · uj)/(‖ui‖‖uj‖);
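A sketch of one plausible implementation of the margin supervised contrastive loss described in s31, assuming the margin m is subtracted from each positive-pair cosine similarity before temperature scaling (the patent's formula images are not reproduced in this text, so this reading is an assumption); all names are illustrative:

```python
import numpy as np

def margin_supcon_loss(feats, labels, tau=0.1, m=0.2):
    """Margin supervised contrastive loss over a batch.
    feats: (2M, d) feature array; labels: length-2M label sequence."""
    u = feats / np.linalg.norm(feats, axis=1, keepdims=True)  # unit-normalize
    sim = u @ u.T                                             # cosine similarities s(ui, uj)
    labels = np.asarray(labels)
    total, n_anchors = 0.0, 0
    for i in range(len(labels)):
        pos = [j for j in range(len(labels)) if j != i and labels[j] == labels[i]]
        neg = [j for j in range(len(labels)) if labels[j] != labels[i]]
        if not pos:
            continue
        neg_term = np.exp(sim[i, neg] / tau).sum()            # Σ_a exp(s(ui,ua)/τ)
        loss_i = 0.0
        for p in pos:
            pos_term = np.exp((sim[i, p] - m) / tau)          # margin on positives
            loss_i += -np.log(pos_term / (pos_term + neg_term))
        total += loss_i / len(pos)
        n_anchors += 1
    return total / n_anchors
```

Increasing m shrinks the positive term, so the loss rises for a fixed embedding; minimizing it therefore forces same-class features to be at least m closer (in cosine similarity) than cross-class ones.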
s32: compute the gradient of the interval supervision contrast loss with respect to the parameters by the chain rule through the features:

∂L/∂α = Σi (∂L/∂ui)(∂ui/∂α),  ∂L/∂β = Σi (∂L/∂ui)(∂ui/∂β)

[the expanded gradient expressions are given as formula images in the original publication and are not reproduced here];
S33: update the parameters by gradient descent until the algorithm converges; the pre-training of the backbone network is then complete.
Preferably, the specific steps of step 4 are as follows:
s41: fix the parameters α of the encoder Eα(·) and extract the features of the support samples in the new class dataset; the d-dimensional feature of the s-th support sample is expressed as:

us = Pβ(Eα(xs))

s42: select a linear support vector machine to classify the new class data samples; the objective function of the optimization is:

min(w,b,ξ) (1/2)‖w‖² + C Σs ξs

s.t. ys(w·us + b) ≥ 1 − ξs,
ξs ≥ 0, s = 1, 2, …, 2K

where ys is the label corresponding to the s-th support sample, C is a penalty coefficient, b is the classification threshold, ξs is a slack variable, and w is the weight parameter of the classifier; solving this optimization yields L support vectors.
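The patent trains a standard linear SVM on the support features. As a self-contained sketch (not the solver the inventors used), a full-batch subgradient descent on the primal objective (1/2)‖w‖² + C·Σs ξs, with the hinge terms substituted for the slack variables:

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=300):
    """Minimize (1/2)||w||^2 + C * sum(max(0, 1 - y*(w.x + b))) by
    full-batch subgradient descent. X: (n, d) features; y: labels in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                      # samples with nonzero hinge loss
        grad_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

On linearly separable support features this converges to a separating hyperplane; for the N-way case, one-vs-rest copies of this binary classifier would be trained.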
Preferably, the specific steps of step 5 are as follows:
s51: fix the parameters α of the encoder Eα(·) and extract the features of the query samples in the new class dataset; the d-dimensional feature of the q-th query sample is expressed as:

uq = Pβ(Eα(xq))

s52: the classification decision of the trained support vector machine on a query sample is:

f(uq) = sign( Σl αl yl (ul · uq) + b )

where αl is the Lagrangian coefficient, ul is the feature of the l-th support vector, and yl is its corresponding label.
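The dual decision rule in s52 can be evaluated directly from the support vectors; a minimal sketch, where the multipliers αl, the offset b, and the feature values in the usage below are made up for the example:

```python
import numpy as np

def svm_decide(u_q, sv_feats, sv_labels, alphas, b):
    """Dual-form SVM decision: sign( Σ_l α_l * y_l * (u_l · u_q) + b )."""
    score = b
    for a, y, u in zip(alphas, sv_labels, sv_feats):
        score += a * y * float(np.dot(u, u_q))
    return 1 if score >= 0 else -1
```

With two symmetric support vectors on the x-axis, queries are labeled by the sign of their first coordinate.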
The beneficial effects of the invention are:
1. The method pre-trains the model on the base class dataset with a novel interval supervision contrast loss function, fixes the parameters of the encoder in the pre-trained model, extracts the features of the support image samples in the new class dataset to train an SVM classifier, and finally uses the SVM to make the classification decision on the query samples.
2. The interval supervision contrast loss function models the contrastive relations between base class samples rather than attending only to the class each base class sample belongs to, so the pre-trained backbone network transfers better.
3. By increasing the margin parameter, the supervised contrast loss further reduces intra-class distances and enlarges inter-class distances, further improving classification performance.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, so that those skilled in the art can better understand the advantages and features of the present invention, and thus the scope of the present invention is more clearly defined. The embodiments described herein are only a few embodiments of the present invention, rather than all embodiments, and all other embodiments that can be derived by one of ordinary skill in the art without inventive faculty based on the embodiments described herein are intended to fall within the scope of the present invention.
Referring to fig. 1, a small sample image classification method based on interval supervision contrast loss includes the following steps:
Step 1: for a given dataset D, determine the base class dataset Dbase and the new class dataset Dnovel to be processed, where

Dbase = {(xi, yi)},

xi denotes the i-th base class image sample and yi its corresponding label; on Dnovel, N-way-K-shot classification tasks are constructed, and each classification task consists of a support sample set

S = {(xs, ys), s = 1, 2, …, N×K}

and a query sample set

Q = {xq, q = 1, 2, …, Q},

where xs denotes the s-th support sample, ys its corresponding label, and xq the q-th query sample;

Step 2: randomly sample M image samples from the base class dataset Dbase, expand them to 2M samples using image augmentation techniques, and input them to the encoder Eα(·) with parameters α and the projector Pβ(·) with parameters β to obtain a set of 2M data features

{ui, i = 1, 2, …, 2M};

taking any element of the set as the anchor, all samples in the set A(i) that share the anchor's label are taken as positive samples to construct the positive sample set P(i), and all remaining samples in A(i) are taken as negative samples to construct the negative sample set N(i);

Step 3: compute the interval supervision contrast loss between the positive sample set P(i) and the negative sample set N(i), compute the gradient of the loss function with respect to the parameters, and optimize the parameters α of the encoder Eα(·) and β of the projector Pβ(·) by gradient descent;

Step 4: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the support image samples in the new class dataset and train an SVM;

Step 5: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the query image samples in the new class dataset, and test with the trained SVM.
Specifically, the specific steps of step 2 are as follows:
s21: draw M image samples from Dtrain as a training batch image sample set DM; for the i-th base class image sample xi, generate an augmented copy x̃i with an image enhancement technique and add it to the batch image sample set, giving 2M samples in total;

s22: input the samples of the dataset DM to the encoder Eα(·) and the projector Pβ(·) to obtain the corresponding d-dimensional feature vectors, where the encoder Eα(·) is a ResNet-12 and the projector Pβ(·) is a multilayer perceptron with a single hidden layer; the feature of the i-th base class image sample xi is then:

ui = Pβ(Eα(xi));

s23: through this feature extraction step, a set of 2M data features

{ui, i = 1, 2, …, 2M}

is obtained; taking ui as the anchor, all samples in the set A(i) whose labels equal yi are regarded as positive samples and form the positive sample set P(i), and the remaining samples are regarded as negative samples and form the negative sample set N(i).
Specifically, the specific steps of step 3 are as follows:
s31: the interval supervision contrast loss function constructed from the positive sample set P(i) and the negative sample set N(i) is computed as:

L = Σi −(1/|P(i)|) Σp∈P(i) log[ exp((s(ui, up) − m)/τ) / ( exp((s(ui, up) − m)/τ) + Σa∈N(i) exp(s(ui, ua)/τ) ) ]

where τ is the temperature coefficient, m is the margin (interval) coefficient, up is the p-th sample in the positive sample set P(i), ua is a sample in the negative sample set N(i), and s(·,·) denotes the cosine similarity of any two samples:

s(ui, uj) = (ui · uj)/(‖ui‖‖uj‖);
s32: compute the gradient of the interval supervision contrast loss with respect to the parameters by the chain rule through the features:

∂L/∂α = Σi (∂L/∂ui)(∂ui/∂α),  ∂L/∂β = Σi (∂L/∂ui)(∂ui/∂β)

[the expanded gradient expressions are given as formula images in the original publication and are not reproduced here];
S33: update the parameters by gradient descent until the algorithm converges; the pre-training of the backbone network is then complete.
Specifically, the specific steps of step 4 are as follows:
s41: fix the parameters α of the encoder Eα(·) and extract the features of the support samples in the new class dataset; the d-dimensional feature of the s-th support sample is expressed as:

us = Pβ(Eα(xs))

s42: select a linear support vector machine to classify the new class data samples; the objective function of the optimization is:

min(w,b,ξ) (1/2)‖w‖² + C Σs ξs

s.t. ys(w·us + b) ≥ 1 − ξs,
ξs ≥ 0, s = 1, 2, …, 2K

where ys is the label corresponding to the s-th support sample, C is a penalty coefficient, b is the classification threshold, ξs is a slack variable, and w is the weight parameter of the classifier; solving this optimization yields L support vectors.
Specifically, the specific steps of step 5 are as follows:
s51: fix the parameters α of the encoder Eα(·) and extract the features of the query samples in the new class dataset; the d-dimensional feature of the q-th query sample is expressed as:

uq = Pβ(Eα(xq))

s52: the classification decision of the trained support vector machine on a query sample is:

f(uq) = sign( Σl αl yl (ul · uq) + b )

where αl is the Lagrangian coefficient, ul is the feature of the l-th support vector, and yl is its corresponding label.
In summary, the invention utilizes a novel interval supervision contrast loss function to pre-train the model on the base class data set, fixes the parameters in the coder in the pre-training model, extracts the features of the support image samples in the new class data set and trains the SVM classifier, and finally utilizes the SVM to make classification decision on the query samples.
The embodiments of the present invention have been described in detail, but the description is only for the preferred embodiments of the present invention and should not be construed as limiting the scope of the present invention. All equivalent changes and modifications made within the scope of the present invention shall fall within the scope of the present invention.

Claims (6)

1. A small sample image classification method based on interval supervision contrast loss is characterized by comprising the following steps:
step 1: for a given dataset D, determine the base class dataset Dbase and the new class dataset Dnovel to be processed, where

Dbase = {(xi, yi)},

xi denotes the i-th base class image sample and yi its corresponding label; on Dnovel, N-way-K-shot classification tasks are constructed, and each classification task consists of a support sample set

S = {(xs, ys), s = 1, 2, …, N×K}

and a query sample set

Q = {xq, q = 1, 2, …, Q},

where xs denotes the s-th support sample, ys its corresponding label, and xq the q-th query sample;

step 2: randomly sample M image samples from the base class dataset Dbase, expand them to 2M samples using image augmentation techniques, and input them to the encoder Eα(·) with parameters α and the projector Pβ(·) with parameters β to obtain a set of 2M data features

{ui, i = 1, 2, …, 2M};

taking any element of the set as the anchor, all samples in the set A(i) that share the anchor's label are taken as positive samples to construct the positive sample set P(i), and all remaining samples in A(i) are taken as negative samples to construct the negative sample set N(i);

step 3: compute the interval supervision contrast loss between the positive sample set P(i) and the negative sample set N(i), compute the gradient of the loss function with respect to the parameters, and optimize the parameters α of the encoder Eα(·) and β of the projector Pβ(·) by gradient descent;

step 4: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the support image samples in the new class dataset and train an SVM;

step 5: use the encoder Eα(·) pre-trained in step 3 as a feature extractor to extract features of the query image samples in the new class dataset, and test with the trained SVM.
2. The method for classifying small sample images based on interval supervision contrast loss according to claim 1 is characterized in that the specific steps of step 1 are as follows:
s11: for a given dataset D, partition it into three subsets, the training set Dtrain, the validation set Dval and the test set Dtest; the three subsets contain disjoint classification categories and together cover D, namely:

Dtrain ∩ Dval = Dtrain ∩ Dtest = Dval ∩ Dtest = ∅,
Dtrain ∪ Dval ∪ Dtest = D;

s12: take the training set Dtrain as the base class dataset Dbase and the test set Dtest as the new class dataset Dnovel; from Dnovel, randomly draw N classes and, within each class, randomly draw K samples to obtain the support sample set

S = {(xs, ys), s = 1, 2, …, N×K};

then draw a batch of samples from the remaining data of the N classes to obtain the query sample set

Q = {xq, q = 1, 2, …, Q}.
3. The method for classifying small sample images based on interval supervision contrast loss according to claim 1 is characterized in that the specific steps of the step 2 are as follows:
S21: draw M image samples from Dtrain as a training batch image sample set DM; for the i-th base class image sample xi, generate an augmented copy x̃i with an image enhancement technique and add it to the batch image sample set, giving 2M samples in total;

S22: input the samples of the dataset DM to the encoder Eα(·) and the projector Pβ(·) to obtain the corresponding d-dimensional feature vectors, where the encoder Eα(·) is a ResNet-12 and the projector Pβ(·) is a multilayer perceptron with a single hidden layer; the feature of the i-th base class image sample xi is then:

ui = Pβ(Eα(xi));

S23: through this feature extraction step, a set of 2M data features

{ui, i = 1, 2, …, 2M}

is obtained; taking ui as the anchor, all samples in the set A(i) whose labels equal yi are regarded as positive samples and form the positive sample set P(i), and the remaining samples are regarded as negative samples and form the negative sample set N(i).
4. The method for classifying small sample images based on interval supervision contrast loss according to claim 1 is characterized in that the specific steps of the step 3 are as follows:
s31: the interval supervision contrast loss function constructed from the positive sample set P(i) and the negative sample set N(i) is computed as:

L = Σi −(1/|P(i)|) Σp∈P(i) log[ exp((s(ui, up) − m)/τ) / ( exp((s(ui, up) − m)/τ) + Σa∈N(i) exp(s(ui, ua)/τ) ) ]

where τ is the temperature coefficient, m is the margin (interval) coefficient, up is the p-th sample in the positive sample set P(i), ua is a sample in the negative sample set N(i), and s(·,·) denotes the cosine similarity of any two samples:

s(ui, uj) = (ui · uj)/(‖ui‖‖uj‖);
s32: compute the gradient of the interval supervision contrast loss with respect to the parameters by the chain rule through the features:

∂L/∂α = Σi (∂L/∂ui)(∂ui/∂α),  ∂L/∂β = Σi (∂L/∂ui)(∂ui/∂β)

[the expanded gradient expressions are given as formula images in the original publication and are not reproduced here];
S33: update the parameters by gradient descent until the algorithm converges; the pre-training of the backbone network is then complete.
5. The method for classifying small sample images based on interval supervision contrast loss according to claim 1 is characterized in that the specific steps of the step 4 are as follows:
s41: fix the parameters α of the encoder Eα(·) and extract the features of the support samples in the new class dataset; the d-dimensional feature of the s-th support sample is expressed as:

us = Pβ(Eα(xs))

s42: select a linear support vector machine to classify the new class data samples; the objective function of the optimization is:

min(w,b,ξ) (1/2)‖w‖² + C Σs ξs

s.t. ys(w·us + b) ≥ 1 − ξs,
ξs ≥ 0, s = 1, 2, …, 2K

where ys is the label corresponding to the s-th support sample, C is a penalty coefficient, b is the classification threshold, ξs is a slack variable, and w is the weight parameter of the classifier; solving this optimization yields L support vectors.
6. The method for classifying small sample images based on interval supervision contrast loss according to claim 1 is characterized in that the specific steps of the step 5 are as follows:
S51: fixed encoder EαThe parameter alpha in the (-) extracts the feature of the query sample in the new class data set, and the d-dimensional feature of the qth support sample is expressed as:
uq=Pβ(Eα(xq))
s52: the classification decision of the trained support vector machine on the query sample is as follows:
Figure FDA0003559427190000032
where α_l is the Lagrange multiplier, u_l is the feature of the l-th support vector, and y_l is its corresponding label.
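The dual-form decision rule of S52 can be written directly from the support vectors, their labels, and their Lagrange multipliers. In the sketch below all numeric values are hand-picked for illustration, not produced by an actual SVM solve:

```python
import numpy as np

def svm_decision(u_q, alpha, y_sv, u_sv, b=0.0):
    # y_hat = sign( sum_l alpha_l * y_l * (u_l . u_q) + b ), linear kernel
    score = sum(a * y * float(np.dot(u, u_q))
                for a, y, u in zip(alpha, y_sv, u_sv)) + b
    return 1 if score >= 0 else -1
```

Only the L support vectors contribute to the sum; all other support samples have α_l = 0 and drop out of the decision.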
CN202210289086.9A 2022-03-22 2022-03-22 Small sample image classification method based on interval supervision contrast loss Withdrawn CN114580566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210289086.9A CN114580566A (en) 2022-03-22 2022-03-22 Small sample image classification method based on interval supervision contrast loss


Publications (1)

Publication Number Publication Date
CN114580566A true CN114580566A (en) 2022-06-03

Family

ID=81783167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210289086.9A Withdrawn CN114580566A (en) 2022-03-22 2022-03-22 Small sample image classification method based on interval supervision contrast loss

Country Status (1)

Country Link
CN (1) CN114580566A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294386A (en) * 2022-07-06 2022-11-04 Nantong University Image classification method based on regularization supervision loss function
CN115294386B (en) * 2022-07-06 2023-11-24 Nantong University Image classification method based on regularization supervision loss function
CN115203420A (en) * 2022-07-25 2022-10-18 Tencent Technology (Shenzhen) Co., Ltd. Entity relationship classification model training method, entity relationship classification method and device
CN115203420B (en) * 2022-07-25 2024-04-26 Tencent Technology (Shenzhen) Co., Ltd. Entity relationship classification model training method, entity relationship classification method and device
CN115984621A (en) * 2023-01-09 2023-04-18 Ningbo Shiye Intelligent Technology Co., Ltd. Small sample remote sensing image classification method based on restrictive prototype comparison network
CN115984621B (en) * 2023-01-09 2023-07-11 Ningbo Shiye Intelligent Technology Co., Ltd. Small sample remote sensing image classification method based on restrictive prototype comparison network
CN116168257A (en) * 2023-04-23 2023-05-26 Anhui University Small sample image classification method, device and storage medium based on sample generation

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
Ruby et al. Binary cross entropy with deep learning technique for image classification
Wu et al. Unsupervised feature learning via non-parametric instance discrimination
Deng et al. Multitask emotion recognition with incomplete labels
Savchenko Efficient facial representations for age, gender and identity recognition in organizing photo albums using multi-output ConvNet
CN114580566A (en) Small sample image classification method based on interval supervision contrast loss
WO2022037233A1 (en) Small sample visual target identification method based on self-supervised knowledge transfer
Biswas et al. Simultaneous active learning of classifiers & attributes via relative feedback
CN112614538A (en) Antibacterial peptide prediction method and device based on protein pre-training characterization learning
Duong et al. Shrinkteanet: Million-scale lightweight face recognition via shrinking teacher-student networks
Ju et al. Fish species recognition using an improved AlexNet model
US20210319215A1 (en) Method and system for person re-identification
Ostyakov et al. Label denoising with large ensembles of heterogeneous neural networks
CN113554110B (en) Brain electricity emotion recognition method based on binary capsule network
CN113361278B (en) Small sample named entity identification method based on data enhancement and active learning
CN113672718B (en) Dialogue intention recognition method and system based on feature matching and field self-adaption
WO2021128704A1 (en) Open set classification method based on classification utility
Xu et al. Weakly supervised facial expression recognition via transferred DAL-CNN and active incremental learning
Okokpujie et al. Predictive modeling of trait-aging invariant face recognition system using machine learning
CN112434686B (en) End-to-end misplaced text classification identifier for OCR (optical character) pictures
CN116935411A (en) Radical-level ancient character recognition method based on character decomposition and reconstruction
CN116994092A (en) K-based mutual neighbor pseudo tag screening method
He et al. Addressing the Overfitting in Partial Domain Adaptation with Self-Training and Contrastive Learning
CN112200260B (en) Figure attribute identification method based on discarding loss function
CN113297376A (en) Legal case risk point identification method and system based on meta-learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220603