Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a method for improving few-sample classification accuracy by using a distribution strategy, and to solve the prior-art problem that, for the classification of few-sample image data, a model with good generalization performance cannot be obtained because the sample data are scarce. The aim is achieved by the following technical scheme:
A method for improving few-sample classification accuracy by using a distribution strategy, characterized by comprising the following steps:
S1, obtaining a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data, training a feature extractor, and obtaining the feature vector x_j of each photo by using the feature extractor;
S2, using the feature vectors x_j obtained in step S1 to calculate the mean and the variance of each group of sample data sets of different categories;
S3, using the feature vectors x_j again to calculate the mean and the variance of the few-sample data;
S4, calibrating the mean and the variance of the few-sample data in step S3 according to the mean and the variance of each group of sample data sets of different categories in step S2;
S5, randomly sampling from the calibrated mean and variance of step S4 to generate new feature data for training a linear classifier, and training the classifier through a cross-entropy loss to obtain the classifier model.
Preferably, step S1 is specifically implemented as follows: a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data are obtained and are collectively referred to as base class data; any feature extractor of comparably high performance to a ResNet is trained on the base class data; and the feature vector x_j of each photo in each group of sample data sets of the base class data is obtained with the feature extractor.
Preferably, the mean and the variance of each group of sample data sets of different categories in step S2 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 1 and formula 2 to calculate the mean and the variance of each group, respectively; where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i1} denotes the mean of each group of sample data sets, Σ_{i1} denotes the variance of each group of sample data sets, and x_j is the feature vector of each photo within the group.
Preferably, the mean and the variance of the few-sample data in step S3 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 3 and formula 4 to calculate the mean and the variance of the few-sample data, respectively; where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i2} denotes the mean of the few-sample data, Σ_{i2} denotes the variance of the few-sample data, and x_j is the feature vector of each photo within each group of sample data sets.
Preferably, the implementation process of the calibration in step S4 includes the following steps:
S41, first, the skewness of the distribution of the feature vectors x_j is reduced by formula 5, where formula 5 adopts the Tukey's Ladder of Powers Transformation (TLPT) method;
where λ is a set hyper-parameter: when λ is set to 1, the original feature distribution of the few-sample data is recovered; when λ is set to less than 1, the positive skewness of the feature distribution of the few-sample data is reduced; when λ is set to greater than 1, the positive skewness of the feature distribution of the few-sample data is increased;
S42, next, statistics are transferred to the few-sample data by using the feature vectors x_j of the base class data, and the calibrated mean and variance of the few-sample data are obtained.
Preferably, the specific implementation steps of step S42 are as follows:
S421, first, the k base class data samples whose appearance feature vectors are most similar to the sample are selected from the base class data;
S422, the set S_d of feature distances to the sample is obtained by using formula 6, and the set S_N of the k base class data sets nearest to the sample is obtained by using formula 7;
S423, the calibrated mean and variance of the few-sample data are obtained through S_N by comparison and calibration.
Preferably, the specific implementation steps of step S5 are as follows:
S51, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training a linear classifier: a set of data S_y is generated by sampling from the calibrated mean and variance of the few-sample data, giving a group of feature vectors labeled as class y;
S52, the total number of features generated for each group of sample data sets of different categories is set as a hyper-parameter; then the original class-y feature vectors in the few-sample data and the class-y feature vectors generated in step S51 are used together as the training data of the linear classifier, and the linear classifier is trained with the cross-entropy loss of formula 11 to obtain the classifier model:
Σ_{(x,y)} −log Pr(y | x; θ)  (formula 11)
The invention has the beneficial effects that:
1. Since similar feature representations generally have similar means and variances, the method can calibrate the distribution of the few-sample data according to these statistics; by transferring statistics from sufficiently sampled data of similar classes and then drawing enough samples from the calibrated distribution, the input of the classifier is expanded, so the accuracy of few-sample classification is greatly improved;
2. Compared with the traditional methods: for few-sample classification, traditional methods usually focus on developing a stronger model to fit the distribution of the few samples; the basic strategy is to train a model on only a few training samples, and by driving the training loss on those samples as low as possible the model tends to overfit, while the skew of the few samples weakens the generalization ability of the model, so the learned distribution can be far from the true distribution of the samples. The method of the invention calibrates the distribution of the few samples in the data set and feeds feature data sampled from the calibrated distribution into the trained model, so it does not merely fit the few samples, requires no additional parameters or modules, and improves the accuracy with which the classifier recognizes the few-sample classes.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments:
Example 1:
As shown in Fig. 1, a method for improving few-sample classification accuracy by using a distribution strategy includes the following steps:
S1, obtaining a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data, training a feature extractor, and obtaining the feature vector x_j of each photo by using the feature extractor;
The specific implementation of step S1 is as follows: a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data are obtained and are collectively referred to as base class data; any feature extractor of comparably high performance to a ResNet is trained on the base class data; and the feature vector x_j of each photo in each group of sample data sets of the base class data is obtained with the feature extractor.
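As a minimal sketch of step S1, assuming PyTorch and torchvision: note that the patent trains the extractor on the base class data, whereas this sketch substitutes a pretrained ResNet-18 for brevity, and the helper name extract_feature is illustrative, not part of the claimed method.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Illustrative stand-in for the trained feature extractor of step S1:
# a ResNet-18 with its classification head removed, so that the
# penultimate-layer activations serve as the feature vector x_j.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the 512-d features, drop the classifier
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_feature(path: str) -> torch.Tensor:
    """Hypothetical helper: return the feature vector x_j for one photo."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return backbone(img).squeeze(0)   # shape (512,)
```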
S2, using the feature vectors x_j obtained in step S1 to calculate the mean and the variance of each group of sample data sets of different categories;
The mean and the variance of each group of sample data sets of different categories in step S2 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 1 and formula 2 to calculate the mean and the variance of each group, respectively:
μ_{i1} = (1/n_i) Σ_j x_j  (formula 1)
Σ_{i1} = (1/(n_i − 1)) Σ_j (x_j − μ_{i1})(x_j − μ_{i1})^T  (formula 2)
where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i1} denotes the mean of each group of sample data sets, Σ_{i1} denotes the variance of each group of sample data sets, and x_j is the feature vector of each photo within the group.
S3, the feature vectors x_j are used again to calculate the mean and the variance of the few-sample data;
The mean and the variance of the few-sample data in step S3 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 3 and formula 4 to calculate the mean and the variance of the few-sample data, respectively:
μ_{i2} = (1/n_i) Σ_j x_j  (formula 3)
Σ_{i2} = (1/(n_i − 1)) Σ_j (x_j − μ_{i2})(x_j − μ_{i2})^T  (formula 4)
where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i2} denotes the mean of the few-sample data, Σ_{i2} denotes the variance of the few-sample data, and x_j is the feature vector of each photo within each group of sample data sets.
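A minimal numpy sketch of the statistics of steps S2 and S3 (formulas 1–4 above); the array shapes and the synthetic data are illustrative assumptions:

```python
import numpy as np

def class_stats(features: np.ndarray):
    """Mean (formulas 1/3) and variance (formulas 2/4) of one group's
    feature vectors x_j; `features` has shape (n_i, d)."""
    mu = features.mean(axis=0)              # (1/n_i) * sum_j x_j
    sigma = np.cov(features, rowvar=False)  # unbiased (d, d) covariance
    return mu, sigma

# Illustrative usage on synthetic features: 3 base classes, 64-d vectors.
rng = np.random.default_rng(0)
base_stats = [class_stats(rng.normal(loc=c, size=(100, 64))) for c in range(3)]  # step S2
support = rng.normal(size=(5, 64))          # few-sample data (5 photos)
mu2, sigma2 = class_stats(support)          # step S3
```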
S4, the mean and the variance of the few-sample data in step S3 are calibrated according to the mean and the variance of each group of sample data sets of different categories in step S2;
The implementation process of the calibration in step S4 includes the following steps:
S41, in order to make the feature distribution of the feature vectors x_j closer to a Gaussian distribution, the skewness of their distribution is first reduced by formula 5, where formula 5 adopts the Tukey's Ladder of Powers Transformation (TLPT):
x̃ = x^λ, if λ ≠ 0;  x̃ = log(x), if λ = 0  (formula 5)
where λ is a set hyper-parameter: when λ is set to 1, the original feature distribution of the few-sample data is recovered; when λ is set to less than 1, the positive skewness of the feature distribution of the few-sample data is reduced; when λ is set to greater than 1, the positive skewness of the feature distribution of the few-sample data is increased;
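Under the TLPT reading of formula 5 above, a minimal numpy sketch of step S41; the non-negativity assumption on feature values (e.g. post-ReLU features) and the example λ = 0.5 are illustrative:

```python
import numpy as np
from scipy.stats import skew

def tukey_transform(x: np.ndarray, lam: float = 0.5) -> np.ndarray:
    """Tukey's Ladder of Powers Transformation (formula 5).
    lam = 1 keeps the original distribution, lam < 1 reduces positive
    skewness, lam > 1 increases it. Assumes non-negative features."""
    if lam == 0:
        return np.log(x + 1e-12)   # epsilon guards log(0); illustrative choice
    return np.power(x, lam)

# Illustrative check: a positively skewed sample becomes more symmetric.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=10_000)
print(skew(x), skew(tukey_transform(x, 0.5)))   # skewness drops markedly
```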
S42, next, statistics are transferred to the few-sample data by using the feature vectors x_j of the base class data, and the calibrated mean and variance of the few-sample data are obtained; the transfer is based on the Euclidean distance between the transformed feature vector of the few-sample data and the feature means of the base class data;
the specific implementation steps of step S42 are as follows:
S421, first, the k base class data samples whose appearance feature vectors are most similar to the sample are selected from the base class data; k is a modifiable hyper-parameter that can be freely set according to the situation;
S422, the set S_d of feature distances to the sample is obtained by using formula 6, and the set S_N of the k base class data sets nearest to the sample is obtained by using formula 7:
S_d = { ‖μ_{i1} − x̃‖² : i = 1, …, N }  (formula 6)
S_N = { μ_{i1} : −‖μ_{i1} − x̃‖² ∈ topk(−S_d) }  (formula 7)
where x̃ is the transformed feature vector of the sample, N is the number of base class data sets, and topk(·) selects the k nearest base classes;
S423, the calibrated mean and variance of the few-sample data are obtained through S_N by comparison and calibration:
μ′ = ( Σ_{μ_{i1} ∈ S_N} μ_{i1} + x̃ ) / (k + 1)  (formula 8)
Σ′ = ( Σ_{μ_{i1} ∈ S_N} Σ_{i1} ) / k + α  (formula 9)
where α is also a modifiable hyper-parameter that controls the degree of dispersion of the calibrated distribution and is tuned according to the calibration result.
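Continuing the illustration, a minimal numpy sketch of steps S421–S423 as reconstructed in formulas 6–9 above; the helper name calibrate, the defaults k = 2 and α = 0.21, and the base_stats layout are all assumptions, not fixed by the patent:

```python
import numpy as np

def calibrate(x_tilde: np.ndarray, base_stats, k: int = 2, alpha: float = 0.21):
    """Calibrate few-sample statistics from the k nearest base classes
    (steps S421-S423). `x_tilde` is one transformed support feature of
    shape (d,); `base_stats` is a list of (mu_i1, sigma_i1) per base class."""
    mus = np.stack([mu for mu, _ in base_stats])         # (N, d) base class means
    dists = np.sum((mus - x_tilde) ** 2, axis=1)         # formula 6: distance set S_d
    nearest = np.argsort(dists)[:k]                      # formula 7: k nearest -> S_N
    mu_c = (mus[nearest].sum(axis=0) + x_tilde) / (k + 1)                   # formula 8
    sigma_c = np.mean([base_stats[i][1] for i in nearest], axis=0) + alpha  # formula 9
    return mu_c, sigma_c
```

In practice the calibrated covariance may need extra regularization (e.g. a small diagonal term) to stay positive semidefinite before sampling; the scalar α here plays that spreading role.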
S5, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training a linear classifier, and the classifier is trained through a cross-entropy loss to obtain the classifier model;
The specific implementation steps of step S5 are as follows:
S51, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training the linear classifier: a set of data S_y is generated by sampling from the calibrated mean and variance of the few-sample data, giving a group of feature vectors labeled as class y:
S_y = { (x, y) : x ~ N(μ′, Σ′) }  (formula 10)
S52, the total number of features generated for each group of sample data sets of different categories is set as a hyper-parameter; then the original class-y feature vectors in the few-sample data and the class-y feature vectors generated in step S51 are used together as the training data of the linear classifier, and the linear classifier is trained with the cross-entropy loss to obtain the classifier model:
Σ_{(x,y)} −log Pr(y | x; θ)  (formula 11)
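As a minimal sketch of steps S51–S52, assuming scikit-learn: LogisticRegression, which minimizes exactly the cross-entropy of formula 11, stands in for the linear classifier, and the per-sample calibrated statistics are assumed to come from a helper such as the calibrate() sketch above; n_generate is the generation-count hyper-parameter of step S52.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_few_shot_classifier(support_x, support_y, calibrated, n_generate=500):
    """Steps S51-S52. `support_x`: transformed few-sample features (n, d);
    `calibrated`: one (mu', sigma') pair per support sample, e.g. from the
    calibrate() sketch above; `n_generate` is the per-sample generation count."""
    rng = np.random.default_rng(0)
    feats, labels = [np.asarray(support_x)], [np.asarray(support_y)]
    for (mu_c, sigma_c), y in zip(calibrated, support_y):
        gen = rng.multivariate_normal(mu_c, sigma_c, size=n_generate)  # formula 10
        feats.append(gen)
        labels.append(np.full(n_generate, y))
    X, Y = np.concatenate(feats), np.concatenate(labels)
    # LogisticRegression minimizes the cross-entropy sum of
    # -log Pr(y | x; theta) over the training data, i.e. formula 11.
    return LogisticRegression(max_iter=1000).fit(X, Y)
```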
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.