CN114092747A - Small sample image classification method based on depth element metric model mutual learning - Google Patents


Info

Publication number
CN114092747A
CN114092747A (application CN202111440323.9A)
Authority
CN
China
Prior art keywords
depth
meta
model
sample
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111440323.9A
Other languages
Chinese (zh)
Inventor
杨赛
杨慧
周伯俊
胡彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University filed Critical Nantong University
Priority to CN202111440323.9A priority Critical patent/CN114092747A/en
Publication of CN114092747A publication Critical patent/CN114092747A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/217 — Validation; performance evaluation; active pattern learning techniques
    • G06F 18/22 — Matching criteria, e.g. proximity measures
    • G06F 18/24133 — Classification techniques based on distances to prototypes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a small sample (few-shot) image classification method based on mutual learning between deep meta-metric models. The method comprises two meta-metric models with different parameters; each model predicts the query samples and provides a regularization term for the other network, the regularization term being the KL divergence between the two models' outputs. The method can be fused with a meta-metric model of any depth, which avoids the overfitting problem and improves the generalization of the extracted features; and through the mutual-learning technique the classification decision of any deep meta-metric model can be further pulled toward the optimal classification decision boundary.

Description

Small sample image classification method based on deep meta-metric model mutual learning
Technical Field
The invention belongs to the field of small sample (few-shot) image classification, and in particular relates to a small sample image classification method based on mutual learning between deep meta-metric models.
Background
With the emergence of big data and the rapid development of computer hardware, deep learning has made breakthrough progress in image classification. On large-scale benchmark databases such as ImageNet, classification models based on deep convolutional neural networks even reach human-level recognition. However, this success relies entirely on large-scale data, which severely limits its application in many scenarios: collecting a large amount of labeled data is very difficult and labor-intensive, and sometimes impossible, e.g. collecting medical data for rare diseases or large-scale manually annotated data. In contrast, humans need only a few images to recognize a large number of objects and can quickly understand and generalize new concepts. This efficient human learning ability has directly motivated researchers to study the small sample (few-shot) image classification problem extensively.
The small sample image classification task is to make a classification decision for a test image when the number of samples per class is very small. Deep meta-learning is a popular learning paradigm for this problem. The deep meta-metric model has the advantages of high training efficiency and good classification performance, and is currently among the most effective approaches to small sample image classification. Its basic idea is to project image samples into a feature space with a deep neural network, compute sample similarity in that space, and assign similar samples to the same class. Classical models include the Matching Networks proposed by Vinyals et al. (Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning [C]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain: NIPS, 2016: 1-8.), an end-to-end differentiable KNN network that extracts features with an attention LSTM and a bidirectional LSTM for the support set and the query set respectively, the final classifier output being a weighted sum of predictions between the support and query sets; and the Prototypical Networks proposed by Snell et al. (Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning [C]//Proceedings of the 31st Annual Conference on Neural Information Processing Systems. Long Beach, CA, USA: NIPS, 2017: 4077-4087.), which assume that the embeddings of each class cluster around a prototype, take the mean of the support embeddings as that prototype, and reduce classification to nearest-neighbor search in the embedding space.
These network models use fixed similarity measures, such as cosine similarity and Euclidean distance, and the learnable part is embodied in the feature embedding. Sung et al. (Sung F, Yang Y, Zhang L, et al. Learning to compare: relation network for few-shot learning [C]//Proceedings of the 31st Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE Press, 2018: 1199-.) go further and propose the Relation Network, which replaces the fixed metric with a learnable convolutional module that scores the similarity between query and support features.
Extracting effective features is a key step in image classification, and how to improve the feature representation capability of the deep meta-metric model has also received much attention. For example, Gidaris et al. (Gidaris S, Bursuc A, Komodakis N. Boosting few-shot visual learning with self-supervision [C]//Proceedings of the 17th International Conference on Computer Vision. Piscataway, NJ, USA: IEEE Press, 2019: 8059-.) propose to boost few-shot visual learning with self-supervised auxiliary tasks; Li et al. (Li H, Eigen D, Dodge S, et al. Finding task-relevant features for few-shot learning by category traversal [C]//Proceedings of the 32nd IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE, 2019: 1-10.) consider the information between categories and propose a category traversal module consisting of a concentrator, which extracts the features common within each category, and a projector, which removes irrelevant features; Simon et al. (Simon C, Koniusz P, Nock R, Harandi M. Adaptive subspaces for few-shot learning [C]//Proceedings of the 33rd IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE Press, 2020: 4135-.) propose adaptive subspace classifiers for few-shot learning; Li et al. (Li A, Huang W, Lan X, Feng J, Li Z, Wang L. Boosting few-shot learning with adaptive margin loss [C]//Proceedings of the 33rd IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE Press, 2020: 12573-.) propose an adaptive margin loss to boost few-shot learning; and Wu et al. (Wu F, Smith J S, Lu W, et al. Attentive prototype few-shot learning with capsule network-based embedding [C]//Proceedings of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 237-253.) propose to use a capsule network to encode the relative spatial relationships between features and a novel triplet loss to enhance the semantic features of the embedding network, so that similar samples lie closer together and samples of different classes lie farther apart.
Although these methods achieve good results in small sample image classification, the key difficulty remains that the number of training samples is too small to characterize the distribution of each class of images. The above methods therefore sample a large number of similar tasks for meta-training, and the commonly used network structures are kept relatively simple to avoid overfitting. To improve representational capability, these methods have gradually adopted more complex network structures as the base learner in the meta-training process. However, as network complexity increases, the search space of the network parameters also expands, which easily leads to overfitting.
Disclosure of Invention
The invention aims to provide a small sample image classification method based on deep meta-metric model mutual learning, so as to solve the technical problem that overfitting occurs when deep meta-metric models use increasingly complex backbone networks for feature extraction.
In order to solve the technical problems, the specific technical scheme of the invention is as follows:
A small sample image classification method based on deep meta-metric model mutual learning comprises the following steps:
Step 1, for a given data set D, construct a classification task on each episode; the classification task requires the classification model to distinguish N classes using K support samples of each class; each classification task consists of a support sample set S_train = {S_1, …, S_N} and a query sample set Q_train = {(x_j^q, y_j^q)}_{j=1}^M, where S_n denotes the support sample set of the n-th class, i.e. S_n = {(x_i^s, y_i^s)}_{i=1}^K; x_i^s denotes the i-th support sample and y_i^s denotes its corresponding label; x_j^q denotes the j-th query sample and y_j^q denotes its corresponding label;
Step 2, randomly initialize two deep meta-metric models for mutual learning, each comprising a feature extraction module f_φ with parameter φ and a similarity metric module g_θ with parameter θ; the two mutually learning deep meta-metric models are recorded as I_1 = (f_φ1, g_θ1) and I_2 = (f_φ2, g_θ2);
In each classification task, the support sample set S_train and the query sample set Q_train are input into the deep meta-metric models I_1 and I_2 respectively; the features of the i-th support sample image x_i^s and the j-th query sample image x_j^q obtained from the feature extraction module of model I_1 are denoted f_φ1(x_i^s) and f_φ1(x_j^q), and those obtained from the feature extraction module of model I_2 are denoted f_φ2(x_i^s) and f_φ2(x_j^q); the prototypes of the K support samples of each class are then computed, the n-th class prototype of each model being denoted P_n^1 and P_n^2; next, the similarity between each query sample and each class prototype is computed, the probability that a query sample belongs to the n-th class is obtained with a Softmax function, and the outputs of the two models are denoted p_{j,n}^1 and p_{j,n}^2;
Step 3, compute the cross-entropy loss functions L_CE1 and L_CE2 of the two models and the mutual-learning loss terms D_KL(p^2‖p^1) and D_KL(p^1‖p^2) between their outputs, obtaining the total loss functions L_1 and L_2;
Step 4, optimize the two models respectively with a gradient descent algorithm according to their loss functions to complete the meta-training process;
Step 5, construct the classification tasks of the meta-test stage; in each task, N classes are randomly drawn from the meta-test set D_test and K samples are randomly drawn from each class to obtain the support sample set S_test = {S_1^test, …, S_N^test}, where S_n^test denotes the support sample set of the n-th class in the test set, i.e. S_n^test = {(x_i^{s,test}, y_i^{s,test})}_{i=1}^K; x_i^{s,test} denotes the i-th support sample and y_i^{s,test} denotes its corresponding label; a batch of samples is then drawn from the remaining data of the N classes to obtain the query sample set Q_test = {(x_j^{q,test}, y_j^{q,test})}_{j=1}^M, where x_j^{q,test} denotes the j-th test query sample and y_j^{q,test} denotes its corresponding label; the trained deep meta-metric models I_1 and I_2 are applied to the meta-test set to obtain, for the j-th test query sample x_j^{q,test}, the probability output values p_{j,n}^{1,test} and p_{j,n}^{2,test} of belonging to the n-th class.
Further, step 1 specifically comprises the following steps:
Step 1.1, divide the given data set D into three subsets, namely the meta-training set D_train, the meta-validation set D_val, and the meta-test set D_test; the classification categories of the three subsets are mutually disjoint;
Step 1.2, randomly draw N classes from D_train and randomly draw K samples from each class to obtain the support sample set S_train; then draw a batch of samples from the remaining data of the N classes to obtain the query sample set Q_train.
Further, the depth metric model I described in step 21And I2The output is calculated as follows:
step 2.1, support sample set S for meta-trainingtrainThe nth class supports prototypes of samples
Figure BDA00033831093000000516
And
Figure BDA00033831093000000517
the calculation formula is as follows:
Figure BDA00033831093000000518
Figure BDA0003383109300000061
wherein
Figure BDA0003383109300000062
Representing the nth set of supported samples
Figure BDA0003383109300000063
The number of inner samples;
step 2.2, inquiring a sample set Q for meta-trainingtrainThe similarity calculation formula between the jth query sample and the nth prototype is as follows:
Figure BDA0003383109300000064
Figure BDA0003383109300000065
step 2.3, in the meta-training stage, a depth meta-metric model I1And I2The calculation formula of the output value of (a) is:
Figure BDA0003383109300000066
Figure BDA0003383109300000067
Further, the total loss functions L_1 and L_2 of the deep meta-metric models I_1 and I_2 in step 3 are computed as follows:
Step 3.1, the cross-entropy loss functions L_CE1 and L_CE2 of models I_1 and I_2 are computed as

L_CE1 = −(1/M) Σ_{j=1}^M Σ_{n=1}^N 1(y_j^q = n) log p_{j,n}^1
L_CE2 = −(1/M) Σ_{j=1}^M Σ_{n=1}^N 1(y_j^q = n) log p_{j,n}^2

Step 3.2, the KL divergence value from the output of deep meta-metric model I_2 to that of deep meta-metric model I_1 is computed as

D_KL(p^2‖p^1) = (1/M) Σ_{j=1}^M Σ_{n=1}^N p_{j,n}^2 log(p_{j,n}^2 / p_{j,n}^1)

and the KL divergence value from the output of deep meta-metric model I_1 to that of deep meta-metric model I_2 is computed as

D_KL(p^1‖p^2) = (1/M) Σ_{j=1}^M Σ_{n=1}^N p_{j,n}^1 log(p_{j,n}^1 / p_{j,n}^2)

Step 3.3, the total loss functions L_1 and L_2 of models I_1 and I_2 are computed as

L_1 = L_CE1 + D_KL(p^2‖p^1)
L_2 = L_CE2 + D_KL(p^1‖p^2)
Further, the iterative formula of the optimization in step 4 is

φ_k ← φ_k − γ ∂L_k/∂φ_k,  θ_k ← θ_k − γ ∂L_k/∂θ_k,  k = 1, 2

where γ denotes the learning-rate parameter and ∂ denotes the partial derivative operator.
Further, the meta-test procedure described in step 5 is described as follows:
step 5.1, utilizing the trained depth element measurement model I1And I2Feature extraction module pair support samples in (1)
Figure BDA0003383109300000079
And query samples
Figure BDA00033831093000000710
Extracting the features to obtain
Figure BDA00033831093000000711
And
Figure BDA00033831093000000712
step 5.2, support the sample set S for meta-testtestThe nth class supports prototypes of samples
Figure BDA00033831093000000713
And
Figure BDA00033831093000000714
the calculation formula is as follows:
Figure BDA00033831093000000715
Figure BDA0003383109300000081
step 5.3, query sample set Q for meta-testtestThe similarity calculation formula between the jth query sample and the nth prototype is as follows:
Figure BDA0003383109300000082
Figure BDA0003383109300000083
step 5.4, finally obtaining the output category of the query sample to be tested, namely the depth meta-metric model I1And I2The calculation formula of the output value of (a) is:
Figure BDA0003383109300000084
Figure BDA0003383109300000085
The small sample image classification method based on deep meta-metric model mutual learning has the following advantages:
1. The invention randomly initializes meta-metric models of arbitrary depth, and the mutual-learning method can be fused with any deep meta-metric model. The KL divergence between the outputs of the two meta-metric models provides a regularization term that avoids overfitting of the meta-metric models during learning and improves the generalization of the extracted features.
2. The KL divergence between the outputs of the two meta-metric models can further pull the classification decision of any deep meta-metric model toward the optimal classification decision boundary.
Drawings
FIG. 1 is a flowchart of the small sample image classification method based on deep meta-metric model mutual learning according to the present invention;
Detailed Description
In order to better understand the purpose, structure and function of the present invention, the small sample image classification method based on deep meta-metric model mutual learning is described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the small sample image classification method based on deep meta-metric model mutual learning comprises the following steps:
Step 1, deep meta-learning simulates the small sample classification test environment with an episodic training mode. For a given data set D, an N-way-K-shot classification task is constructed in each episode, i.e. the classification model is required to distinguish N classes using K support samples of each class. Each classification task consists of a support sample set S_train = {S_1, …, S_N} and a query sample set Q_train = {(x_j^q, y_j^q)}_{j=1}^M, where S_n denotes the support sample set of the n-th class, i.e. S_n = {(x_i^s, y_i^s)}_{i=1}^K; x_i^s denotes the i-th support sample and y_i^s denotes its corresponding label; x_j^q denotes the j-th query sample and y_j^q denotes its corresponding label.
This step is described as follows:
Step 1.1, the given data set D may be any small sample image classification data set, such as the MiniImageNet data set or the Caltech-UCSD Birds-200-2011 (CUB) data set. The former comprises 100 classes with 600 pictures per class, 60000 color pictures in total, each of size 84 × 84; 64, 16, and 20 of its classes are used for meta-training, meta-validation, and meta-testing, respectively. The latter provides image data for 200 different bird species, 11788 pictures in total; each image has 1 annotation box, 15 part key points, and 312 annotated attributes; 100, 50, and 50 of its classes are used for meta-training, meta-validation, and meta-testing, respectively.
Step 1.2, the given data set D is divided into three subsets, namely the meta-training set D_train, the meta-validation set D_val, and the meta-test set D_test, whose classification categories are mutually disjoint, namely:

D_train ∩ D_val = ∅
D_train ∩ D_test = ∅
D_val ∩ D_test = ∅
D_train ∪ D_val ∪ D_test = D.
Step 1.3, randomly draw N classes from D_train and randomly draw K samples from each class to obtain the support sample set, abbreviated as S_train = {S_1, …, S_N}; S_n denotes the support sample set of the n-th class, i.e. S_n = {(x_i^s, y_i^s)}_{i=1}^K; x_i^s denotes the i-th support sample and y_i^s denotes its corresponding label. Then a batch of samples is drawn from the remaining data of the N classes to obtain the query sample set, abbreviated as Q_train = {(x_j^q, y_j^q)}_{j=1}^M.
Step 2, randomly initialize two deep meta-metric models for mutual learning, each comprising a feature extraction module f_φ and a similarity metric module g_θ; the two mutually learning deep meta-metric models are recorded as I_1 = (f_φ1, g_θ1) and I_2 = (f_φ2, g_θ2).
In each classification task, the support sample set S_train and the query sample set Q_train are input into the deep meta-metric models I_1 and I_2 respectively. The features of the i-th support sample image x_i^s and the j-th query sample image x_j^q obtained from the feature extraction module of model I_1 are denoted f_φ1(x_i^s) and f_φ1(x_j^q); those obtained from the feature extraction module of model I_2 are denoted f_φ2(x_i^s) and f_φ2(x_j^q). The prototypes of the K support samples of each class are then computed, the n-th class prototype of each model being denoted P_n^1 and P_n^2; next, the similarity between each query sample and each class prototype is computed, the probability that a query sample belongs to the n-th class is obtained with a Softmax function, and the outputs of the two models are denoted p_{j,n}^1 and p_{j,n}^2.
The outputs of the deep meta-metric models I_1 and I_2 are computed as follows:
Step 2.1, for the meta-training support sample set S_train, the prototypes P_n^1 and P_n^2 of the n-th class support samples are computed as

P_n^1 = (1 / |S_n|) Σ_{x_i^s ∈ S_n} f_φ1(x_i^s)
P_n^2 = (1 / |S_n|) Σ_{x_i^s ∈ S_n} f_φ2(x_i^s)

where |S_n| denotes the number of samples in the n-th class support sample set S_n.
Step 2.2, for the meta-training query sample set Q_train, the similarity between the j-th query sample and the n-th prototype is computed as

s_{j,n}^1 = g_θ1(f_φ1(x_j^q), P_n^1)
s_{j,n}^2 = g_θ2(f_φ2(x_j^q), P_n^2)

Step 2.3, in the meta-training stage, the output values of the deep meta-metric models I_1 and I_2 are computed as

p_{j,n}^1 = exp(s_{j,n}^1) / Σ_{n'=1}^N exp(s_{j,n'}^1)
p_{j,n}^2 = exp(s_{j,n}^2) / Σ_{n'=1}^N exp(s_{j,n'}^2)
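The prototype, similarity, and Softmax computations of steps 2.1-2.3 can be sketched numerically as below. Note that the patent's similarity module g_θ is learnable; this sketch substitutes the fixed negative squared Euclidean distance (as in prototypical networks) purely for illustration, and the function names are assumptions.

```python
import numpy as np

def class_prototypes(feats, labels, n_way):
    # P_n = (1/|S_n|) Σ f(x_i^s): the mean support feature of each class n
    return np.stack([feats[labels == n].mean(axis=0) for n in range(n_way)])

def softmax_probs(query_feats, protos):
    # s_{j,n}: similarity of query j to prototype n; here the learnable
    # module g_θ is replaced by the negative squared Euclidean distance
    s = -((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    e = np.exp(s - s.max(axis=1, keepdims=True))   # numerically stable Softmax
    return e / e.sum(axis=1, keepdims=True)        # p_{j,n}, rows sum to 1
```

Each of the two models I_1 and I_2 runs this computation on its own features, yielding the two output matrices p^1 and p^2.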
Step 3, compute the cross-entropy loss functions L_CE1 and L_CE2 of the two models and the mutual-learning loss terms D_KL(p^2‖p^1) and D_KL(p^1‖p^2) between their outputs, thereby obtaining the total loss functions L_1 and L_2.
The total loss functions of the deep meta-metric models I_1 and I_2 are computed as follows:
Step 3.1, the cross-entropy loss functions L_CE1 and L_CE2 of models I_1 and I_2 are computed as

L_CE1 = −(1/M) Σ_{j=1}^M Σ_{n=1}^N 1(y_j^q = n) log p_{j,n}^1
L_CE2 = −(1/M) Σ_{j=1}^M Σ_{n=1}^N 1(y_j^q = n) log p_{j,n}^2

Step 3.2, the KL divergence value from the output of deep meta-metric model I_2 to that of deep meta-metric model I_1 is computed as

D_KL(p^2‖p^1) = (1/M) Σ_{j=1}^M Σ_{n=1}^N p_{j,n}^2 log(p_{j,n}^2 / p_{j,n}^1)

and the KL divergence value from the output of deep meta-metric model I_1 to that of deep meta-metric model I_2 is computed as

D_KL(p^1‖p^2) = (1/M) Σ_{j=1}^M Σ_{n=1}^N p_{j,n}^1 log(p_{j,n}^1 / p_{j,n}^2)

Step 3.3, the total loss functions L_1 and L_2 of models I_1 and I_2 are computed as

L_1 = L_CE1 + D_KL(p^2‖p^1)
L_2 = L_CE2 + D_KL(p^1‖p^2)
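The loss computation of steps 3.1-3.3 can be sketched as follows, where p1 and p2 are the M × N Softmax outputs of the two models and y holds the query labels; the function names are assumptions.

```python
import numpy as np

def cross_entropy(p, y):
    # L_CE = -(1/M) Σ_j log p_{j, y_j}
    return -np.log(p[np.arange(len(y)), y]).mean()

def kl_divergence(p_from, p_to):
    # D_KL(p_from ‖ p_to), averaged over the M query samples
    return (p_from * np.log(p_from / p_to)).sum(axis=1).mean()

def total_losses(p1, p2, y):
    # L_1 = L_CE1 + D_KL(p^2 ‖ p^1),  L_2 = L_CE2 + D_KL(p^1 ‖ p^2):
    # each model is supervised by the labels and regularized toward the other
    l1 = cross_entropy(p1, y) + kl_divergence(p2, p1)
    l2 = cross_entropy(p2, y) + kl_divergence(p1, p2)
    return l1, l2
```

When the two models agree exactly, both KL terms vanish and each loss reduces to the plain cross-entropy; any disagreement adds a positive regularization term, which is the mutual-learning mechanism.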
and 4, respectively optimizing the two models by using a gradient descent algorithm according to the loss function to complete the meta-training process.
The iterative formula of the described optimization calculation is:
Figure BDA0003383109300000127
where gamma represents a learning rate parameter,
Figure BDA0003383109300000128
is the partial derivative operator.
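The update rule above is plain gradient descent applied separately to each model's own total loss; a minimal sketch of one parameter update (the list-of-arrays parameter layout is an assumption):

```python
import numpy as np

def sgd_step(params, grads, gamma):
    # φ ← φ − γ ∂L/∂φ, applied to every parameter array of one model;
    # in the method, this step is run once per episode for each of I_1 and I_2
    return [p - gamma * g for p, g in zip(params, grads)]
```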
Step 5, in the meta-test stage, several N-way-K-shot classification tasks are constructed in the same way: in each classification task, N classes are randomly drawn from D_test and K samples are randomly drawn from each class to obtain the support sample set, abbreviated as S_test = {S_1^test, …, S_N^test}; S_n^test denotes the support sample set of the n-th class in the test set, i.e. S_n^test = {(x_i^{s,test}, y_i^{s,test})}_{i=1}^K; x_i^{s,test} denotes the i-th support sample and y_i^{s,test} denotes its corresponding label. Then a batch of samples is drawn from the remaining data of the N classes to obtain the query sample set, abbreviated as Q_test = {(x_j^{q,test}, y_j^{q,test})}_{j=1}^M; x_j^{q,test} denotes the j-th test query sample and y_j^{q,test} denotes its corresponding label. The trained deep meta-metric models I_1 and I_2 are applied to the meta-test set to obtain, for the j-th test query sample x_j^{q,test}, the probability output values p_{j,n}^{1,test} and p_{j,n}^{2,test} of belonging to the n-th class.
the described meta-test procedure is described as follows:
step 5.1, utilizing the trained depth element measurement model I1And I2Feature extraction module pair support samples in (1)
Figure BDA0003383109300000134
And query samples
Figure BDA0003383109300000135
Extracting the features to obtain
Figure BDA0003383109300000136
And
Figure BDA0003383109300000137
step 5.2, support the sample set S for meta-testtestThe nth class supports prototypes of samples
Figure BDA0003383109300000138
And
Figure BDA0003383109300000139
the calculation formula is as follows:
Figure BDA00033831093000001310
Figure BDA00033831093000001311
step 5.3, query sample set Q for meta-testtestThe similarity calculation formula between the jth query sample and the nth prototype is as follows:
Figure BDA00033831093000001312
Figure BDA00033831093000001313
step 5.4, finally obtaining the output category of the query sample to be tested, namely the depth meta-metric model I1And I2The calculation formula of the output value of (a) is:
Figure BDA00033831093000001314
Figure BDA00033831093000001315
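At meta-test time the procedure above yields two probability vectors, p^{1,test} and p^{2,test}, for each query sample; the patent does not fix a single final decision rule, so the averaging rule sketched below is only one natural assumed choice, not the claimed method.

```python
import numpy as np

def predict_labels(p1, p2):
    # average the two models' Softmax outputs (M x N each),
    # then take the arg max over the N classes for each query sample
    return ((p1 + p2) / 2.0).argmax(axis=1)
```

Either model's output could equally be used alone; mutual learning trains both toward better decision boundaries, so their test-time predictions tend to agree.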
it is to be understood that the present invention has been described with reference to certain embodiments, and that various changes in the features and embodiments, or equivalent substitutions may be made therein by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (6)

1. A small sample image classification method based on deep meta-metric model mutual learning, characterized by comprising the following steps:
Step 1, for a given data set D, construct a classification task on each episode; the classification task requires the classification model to distinguish N classes using K support samples of each class; each classification task consists of a support sample set S_train = {S_1, …, S_N} and a query sample set Q_train = {(x_j^q, y_j^q)}_{j=1}^M, where S_n denotes the support sample set of the n-th class, i.e. S_n = {(x_i^s, y_i^s)}_{i=1}^K; x_i^s denotes the i-th support sample and y_i^s denotes its corresponding label; x_j^q denotes the j-th query sample and y_j^q denotes its corresponding label;
Step 2, randomly initialize two deep meta-metric models for mutual learning, each comprising a feature extraction module f_φ with parameter φ and a similarity metric module g_θ with parameter θ; the two mutually learning deep meta-metric models are recorded as I_1 = (f_φ1, g_θ1) and I_2 = (f_φ2, g_θ2);
in each classification task, a sample set S is supportedtrainAnd query sample set QtrainRespectively input into a depth metric model I1And I2(ii) a Wherein the ith support sample image
Figure FDA00033831092900000113
And j query sample image
Figure FDA00033831092900000114
Input into model I1The features obtained after passing through the feature extraction module are expressed as
Figure FDA00033831092900000115
The ith viewSupporting sample images
Figure FDA00033831092900000116
And j query sample image
Figure FDA00033831092900000117
Input into model I2The features obtained after passing through the feature extraction module are expressed as
Figure FDA00033831092900000118
And then calculating prototypes corresponding to K supporting samples in each type of sample, wherein the prototype of the nth type of each model is represented as Pn1And Pn2Then, the similarity between each query sample and each type of prototype is calculated, the probability value of the query sample belonging to the nth type is calculated by utilizing a Softmax function, and the output of each model is obtained
Figure FDA00033831092900000119
And
Figure FDA00033831092900000120
step 3, respectively calculating the cross entropy loss function L corresponding to each modelCE1And LCE2And mutual information loss function D between themKL(p2|p1) And DKL(p1|p2) To obtain the total loss function
Figure FDA00033831092900000121
And
Figure FDA00033831092900000122
step 4, respectively optimizing the two models by using a gradient descent algorithm according to the loss function to complete the meta-training process;
step 5, constructing a classification task in a meta-test stage; the classification task is from meta training set DtestMiddle random drawingTaking N categories, randomly extracting K samples from each category to obtain a support sample set, which is abbreviated as
Figure FDA0003383109290000021
Figure FDA0003383109290000022
Set of supporting samples representing the nth class in the test set, i.e.
Figure FDA0003383109290000023
Figure FDA0003383109290000024
Indicates the ith sample to be supported,
Figure FDA0003383109290000025
indicates its corresponding tag; extracting a batch of samples from the residual data in the N categories to obtain a query sample set, and recording the query sample set as a query sample set
Figure FDA0003383109290000026
Figure FDA0003383109290000027
Representing the jth test query sample,
Figure FDA0003383109290000028
indicates its corresponding tag; utilizing a trained depth element metric model I1And I2Respectively testing the meta-test set to obtain the jth test query sample
Figure FDA0003383109290000029
Probability output value belonging to nth class
Figure FDA00033831092900000210
And
Figure FDA00033831092900000211
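The episode construction of step 1 can be sketched as follows. This is a minimal NumPy illustration under assumed array shapes (rows of pre-extracted features rather than raw images); the function name and default values are not from the patent.

```python
import numpy as np

def sample_episode(data, labels, n_way=5, k_shot=1, m_query=15, rng=None):
    """Draw one N-way K-shot classification task (episode).

    data   : (num_samples, feat_dim) array of image features
    labels : (num_samples,) integer class labels
    Returns support/query arrays with episode-local labels 0..n_way-1.
    """
    rng = rng or np.random.default_rng()
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    xs, ys, xq, yq = [], [], [], []
    for episode_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        xs.append(data[idx[:k_shot]])                  # K support samples
        xq.append(data[idx[k_shot:k_shot + m_query]])  # queries from the rest
        ys += [episode_label] * k_shot
        yq += [episode_label] * m_query
    return (np.concatenate(xs), np.array(ys),
            np.concatenate(xq), np.array(yq))
```

The same sampler serves both the meta-training episodes of step 1 and the meta-test episodes of step 5, with the class pool drawn from the respective subset.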
2. The small-sample image classification method based on depth meta-metric model mutual learning according to claim 1, characterized in that step 1 specifically comprises the following steps:

step 1.1, dividing the given data set D into three subsets, namely a meta-training set D_{train}, a meta-validation set D_{val} and a meta-test set D_{test}, the classification categories contained in the three subsets being mutually disjoint;

step 1.2, randomly drawing N classes from D_{train} and randomly drawing K samples from each class to obtain the support sample set S_{train} = \{S_1, S_2, \ldots, S_N\}; drawing a batch of samples from the remaining data of the N classes to obtain the query sample set Q_{train} = \{(x_j^q, y_j^q)\}_{j=1}^{M}.
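The class-disjoint partition of step 1.1 can be sketched as follows; a minimal NumPy illustration in which the split sizes and function name are assumptions, since the claim does not fix them.

```python
import numpy as np

def split_classes(labels, n_train, n_val, rng=None):
    """Step 1.1: partition the *classes* (not the samples) of data set D
    into disjoint meta-train / meta-validation / meta-test subsets."""
    rng = rng or np.random.default_rng()
    classes = rng.permutation(np.unique(labels))
    return (classes[:n_train],
            classes[n_train:n_train + n_val],
            classes[n_train + n_val:])
```

Splitting at the class level (rather than the sample level) is what makes the meta-test classes genuinely novel to the trained models.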
3. The small-sample image classification method based on depth meta-metric model mutual learning according to claim 2, characterized in that the outputs of the depth meta-metric models I_1 and I_2 described in step 2 are calculated as follows:

step 2.1, for the meta-training support sample set S_{train}, the prototypes P_{n1} and P_{n2} of the nth-class support samples are calculated as:

P_{n1} = \frac{1}{|S_n|} \sum_{(x_i^s, y_i^s) \in S_n} f_{\varphi_1}(x_i^s)

P_{n2} = \frac{1}{|S_n|} \sum_{(x_i^s, y_i^s) \in S_n} f_{\varphi_2}(x_i^s)

where |S_n| denotes the number of samples in the nth support sample set S_n;

step 2.2, for the meta-training query sample set Q_{train}, the similarity between the jth query sample and the nth prototype is calculated as:

s_{jn}^1 = g_{\theta_1}\left(f_{\varphi_1}(x_j^q), P_{n1}\right)

s_{jn}^2 = g_{\theta_2}\left(f_{\varphi_2}(x_j^q), P_{n2}\right)

step 2.3, in the meta-training stage, the output values of the depth meta-metric models I_1 and I_2 are calculated as:

p_1(y = n \mid x_j^q) = \frac{\exp(s_{jn}^1)}{\sum_{n'=1}^{N} \exp(s_{jn'}^1)}

p_2(y = n \mid x_j^q) = \frac{\exp(s_{jn}^2)}{\sum_{n'=1}^{N} \exp(s_{jn'}^2)}
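Steps 2.1 through 2.3 can be sketched as follows. The claim leaves the similarity module g_theta abstract (it may be a learned metric), so negative squared Euclidean distance is substituted here purely as a stand-in; this is an illustrative sketch, not the patented implementation.

```python
import numpy as np

def prototypes(support_feats, support_labels, n_way):
    """Step 2.1: class prototype = mean of the K support embeddings per class."""
    return np.stack([support_feats[support_labels == n].mean(axis=0)
                     for n in range(n_way)])

def class_probabilities(query_feats, protos):
    """Steps 2.2-2.3: similarity of each query to each prototype, then Softmax.

    Negative squared Euclidean distance stands in for the abstract metric
    module g_theta (an assumption; the patent may use a learned module).
    """
    sim = -((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    e = np.exp(sim - sim.max(axis=1, keepdims=True))  # numerically stable Softmax
    return e / e.sum(axis=1, keepdims=True)           # rows: p(y = n | x_j^q)
```

Each of the two models runs this pipeline on its own embeddings, producing the two output distributions that claim 4 feeds into the mutual-learning losses.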
4. The small-sample image classification method based on depth meta-metric model mutual learning according to claim 3, characterized in that the total loss functions L_1 and L_2 of the depth meta-metric models I_1 and I_2 described in step 3 are calculated as follows:

step 3.1, the cross-entropy loss functions L_{CE1} and L_{CE2} of the depth meta-metric models I_1 and I_2 are calculated as:

L_{CE1} = -\frac{1}{M} \sum_{j=1}^{M} \sum_{n=1}^{N} \mathbb{1}(y_j^q = n) \log p_1(y = n \mid x_j^q)

L_{CE2} = -\frac{1}{M} \sum_{j=1}^{M} \sum_{n=1}^{N} \mathbb{1}(y_j^q = n) \log p_2(y = n \mid x_j^q)

where \mathbb{1}(\cdot) is the indicator function and M is the number of query samples;

step 3.2, the KL divergence from the output distribution of depth meta-metric model I_2 to that of depth meta-metric model I_1 is calculated as:

D_{KL}(p_2 \| p_1) = \frac{1}{M} \sum_{j=1}^{M} \sum_{n=1}^{N} p_2(y = n \mid x_j^q) \log \frac{p_2(y = n \mid x_j^q)}{p_1(y = n \mid x_j^q)}

and the KL divergence from the output distribution of depth meta-metric model I_1 to that of depth meta-metric model I_2 is calculated as:

D_{KL}(p_1 \| p_2) = \frac{1}{M} \sum_{j=1}^{M} \sum_{n=1}^{N} p_1(y = n \mid x_j^q) \log \frac{p_1(y = n \mid x_j^q)}{p_2(y = n \mid x_j^q)}

step 3.3, the total loss functions of depth meta-metric models I_1 and I_2 are calculated as:

L_1 = L_{CE1} + D_{KL}(p_2 \| p_1)

L_2 = L_{CE2} + D_{KL}(p_1 \| p_2)
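The loss computation of steps 3.1 through 3.3 can be sketched as follows; a minimal NumPy version in which the 1e-12 smoothing constant and the per-batch averaging convention are assumptions.

```python
import numpy as np

def cross_entropy(p, y):
    """Step 3.1: mean cross-entropy of predicted distributions p (M x N)
    against the true integer labels y (M,)."""
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def kl_divergence(p_from, p_to):
    """Step 3.2: D_KL(p_from || p_to), averaged over the M query samples."""
    return (p_from * np.log((p_from + 1e-12) / (p_to + 1e-12))).sum(axis=1).mean()

def mutual_losses(p1, p2, y):
    """Step 3.3: L1 = L_CE1 + D_KL(p2 || p1),  L2 = L_CE2 + D_KL(p1 || p2)."""
    l1 = cross_entropy(p1, y) + kl_divergence(p2, p1)
    l2 = cross_entropy(p2, y) + kl_divergence(p1, p2)
    return l1, l2
```

Note the asymmetry: each model's total loss adds the divergence of the *other* model's distribution from its own, which is what couples the two models during meta-training.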
5. The small-sample image classification method based on depth meta-metric model mutual learning according to claim 4, characterized in that the iterative formula of the optimization calculation described in step 4 is:

\Phi_k \leftarrow \Phi_k - \gamma \frac{\partial L_k}{\partial \Phi_k}, \quad k = 1, 2

where \Phi_k = \{\varphi_k, \theta_k\} denotes the parameters of depth meta-metric model I_k, \gamma denotes the learning-rate parameter, and \partial is the partial derivative operator.
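The update rule of step 4 amounts to plain gradient descent applied to each model's own total loss; a minimal sketch follows (the list-of-arrays parameter representation is an assumption — in practice an autograd framework would supply the gradients).

```python
import numpy as np

def sgd_step(params, grads, gamma):
    """One update Phi <- Phi - gamma * dL/dPhi over a model's parameter arrays."""
    return [w - gamma * g for w, g in zip(params, grads)]

def mutual_training_round(params1, grads1, params2, grads2, gamma=0.01):
    """Step 4: in one meta-training iteration each model descends the gradient
    of its own total loss (L1 for I1, L2 for I2)."""
    return sgd_step(params1, grads1, gamma), sgd_step(params2, grads2, gamma)
```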
6. The small-sample image classification method based on depth meta-metric model mutual learning according to claim 5, characterized in that the meta-test process described in step 5 is as follows:

step 5.1, using the feature extraction modules of the trained depth meta-metric models I_1 and I_2 to extract the features of the support samples x_i^{s,test} and the query samples x_j^{q,test}, obtaining f_{\varphi_1}(x_i^{s,test}), f_{\varphi_1}(x_j^{q,test}) and f_{\varphi_2}(x_i^{s,test}), f_{\varphi_2}(x_j^{q,test});

step 5.2, for the meta-test support sample set S_{test}, the prototypes P_{n1}^{test} and P_{n2}^{test} of the nth-class support samples are calculated as:

P_{n1}^{test} = \frac{1}{|S_n^{test}|} \sum_{(x_i^{s,test},\, y_i^{s,test}) \in S_n^{test}} f_{\varphi_1}(x_i^{s,test})

P_{n2}^{test} = \frac{1}{|S_n^{test}|} \sum_{(x_i^{s,test},\, y_i^{s,test}) \in S_n^{test}} f_{\varphi_2}(x_i^{s,test})

step 5.3, for the meta-test query sample set Q_{test}, the similarity between the jth query sample and the nth prototype is calculated as:

s_{jn}^{1,test} = g_{\theta_1}\left(f_{\varphi_1}(x_j^{q,test}),\, P_{n1}^{test}\right)

s_{jn}^{2,test} = g_{\theta_2}\left(f_{\varphi_2}(x_j^{q,test}),\, P_{n2}^{test}\right)

step 5.4, the output category of the query sample under test is finally obtained; that is, the output values of the depth meta-metric models I_1 and I_2 are calculated as:

p_1(y = n \mid x_j^{q,test}) = \frac{\exp(s_{jn}^{1,test})}{\sum_{n'=1}^{N} \exp(s_{jn'}^{1,test})}

p_2(y = n \mid x_j^{q,test}) = \frac{\exp(s_{jn}^{2,test})}{\sum_{n'=1}^{N} \exp(s_{jn'}^{2,test})}
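The meta-test stage ends with two probability outputs per test query, one from each model; how they are combined into a single output category is not specified in this passage, so averaging the two distributions before the argmax is shown here purely as one plausible ensembling choice.

```python
import numpy as np

def meta_test_predict(p1, p2):
    """Average the two models' probability outputs p1, p2 (each M x N),
    then take the argmax as the output category for each test query.
    The averaging rule is an assumption, not taken from the patent."""
    return np.argmax((p1 + p2) / 2.0, axis=1)
```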
CN202111440323.9A 2021-11-30 2021-11-30 Small sample image classification method based on depth element metric model mutual learning Pending CN114092747A (en)

Publications (1)

Publication Number Publication Date
CN114092747A true CN114092747A (en) 2022-02-25



Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220225)