Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a method for improving few-sample classification accuracy by using a distribution strategy, and to solve the prior-art problem that, for the classification of few-sample image data, a model with good generalization performance cannot be obtained because the sample data are scarce. The aim is achieved by the following technical scheme:
A method for improving few-sample classification accuracy by using a distribution strategy, characterized by comprising the following steps:
S1, obtaining a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data, training a feature extractor, and obtaining the feature vector x_j of each photo by using the feature extractor;
S2, using the feature vectors x_j obtained in step S1 to calculate the mean and the variance of each group of sample data sets of different categories;
S3, using the feature vectors x_j again to calculate the mean and the variance of the few-sample data;
S4, calibrating the mean and the variance of the few-sample data in step S3 according to the mean and the variance of each group of sample data sets of different categories in step S2;
S5, randomly sampling from the calibrated mean and variance of step S4 to generate new feature data for training a linear classifier, and training the classifier through a cross-entropy loss to obtain the classifier model.
Preferably, step S1 is specifically implemented as follows: a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data are obtained and are collectively referred to as base class data; any feature extractor of comparably high performance to a ResNet is trained on the base class data; and the feature vector x_j of each photo in each group of sample data sets of the base class data is obtained with the feature extractor.
Preferably, the mean and the variance of each group of sample data sets of different categories in step S2 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 1 and formula 2 to calculate the mean and the variance of each group, respectively; where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i1} denotes the mean of each group of sample data sets, Σ_{i1} denotes the variance of each group of sample data sets, and x_j is the feature vector of each photo within the group.
Preferably, the mean and the variance of the few-sample data in step S3 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 3 and formula 4 to calculate the mean and the variance of the few-sample data, respectively; where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i2} denotes the mean of the few-sample data, Σ_{i2} denotes the variance of the few-sample data, and x_j is the feature vector of each photo within each group of sample data sets.
Preferably, the implementation process of the calibration in step S4 includes the following steps:
S41, first, the skewness of the distribution of the feature vectors x_j is reduced by formula 5, where formula 5 adopts the Tukey's Ladder of Powers Transformation (TLPT) method;
where λ is a set hyper-parameter: when λ is set to 1, the original feature distribution of the few-sample data is recovered; when λ is set to less than 1, the positive skewness of the feature distribution of the few-sample data is reduced; when λ is set to greater than 1, the positive skewness of the feature distribution of the few-sample data is increased;
S42, next, statistics are transferred to the few-sample data by using the feature vectors x_j of the base class data, and the calibrated mean and variance of the few-sample data are obtained.
Preferably, the specific implementation steps of step S42 are as follows:
S421, first, the k base class data samples whose appearance feature vectors are most similar to the sample are selected from the base class data;
S422, the set S_d of feature distances to the sample is obtained by using formula 6, and the set S_N of the k base class data sets nearest to the sample is obtained by using formula 7;
S423, the calibrated mean and variance of the few-sample data are obtained through S_N by comparison and calibration.
Preferably, the specific implementation steps of step S5 are as follows:
S51, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training a linear classifier: a set of data S_y is generated by sampling from the calibrated mean and variance of the few-sample data, giving a group of feature vectors labeled as class y;
S52, the total number of features generated for each group of sample data sets of different categories is set as a hyper-parameter; then the original class-y feature vectors in the few-sample data and the class-y feature vectors generated in step S51 are used together as the training data of the linear classifier, and the linear classifier is trained with the cross-entropy loss of formula 11 to obtain the classifier model:
Σ_{(x,y)} −log Pr(y | x; θ)  (formula 11)
The invention has the beneficial effects that:
1. Since similar feature representations generally have similar means and variances, the method can calibrate the distribution of the few-sample data according to these statistics; by transferring statistics from sufficiently sampled data of similar classes and then drawing enough samples from the calibrated distribution, the input of the classifier is expanded, so the accuracy of few-sample classification is greatly improved;
2. Compared with the traditional methods: for few-sample classification, traditional methods usually focus on developing a stronger model to fit the distribution of the few samples; the basic strategy is to train a model on only a few training samples, and by driving the training loss on those samples as low as possible the model tends to overfit, while the skew of the few samples weakens the generalization ability of the model, so the learned distribution can be far from the true distribution of the samples. The method of the invention calibrates the distribution of the few samples in the data set and feeds feature data sampled from the calibrated distribution into the trained model, so it does not merely fit the few samples, requires no additional parameters or modules, and improves the accuracy with which the classifier recognizes the few-sample classes.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments:
Example 1:
As shown in Fig. 1, a method for improving few-sample classification accuracy by using a distribution strategy includes the following steps:
S1, obtaining a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data, training a feature extractor, and obtaining the feature vector x_j of each photo by using the feature extractor;
The specific implementation of step S1 is as follows: a plurality of groups of sample data sets of different categories having the highest similarity to the appearance feature vectors of the few-sample data are obtained and are collectively referred to as base class data; any feature extractor of comparably high performance to a ResNet is trained on the base class data; and the feature vector x_j of each photo in each group of sample data sets of the base class data is obtained with the feature extractor.
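As a minimal sketch of step S1, assuming PyTorch and torchvision: note that the patent trains the extractor on the base class data, whereas this sketch substitutes a pretrained ResNet-18 for brevity, and the helper name extract_feature is illustrative, not part of the claimed method.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Illustrative stand-in for the trained feature extractor of step S1:
# a ResNet-18 with its classification head removed, so that the
# penultimate-layer activations serve as the feature vector x_j.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the 512-d features, drop the classifier
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_feature(path: str) -> torch.Tensor:
    """Hypothetical helper: return the feature vector x_j for one photo."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return backbone(img).squeeze(0)   # shape (512,)
```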
S2, using the feature vectors x_j obtained in step S1 to calculate the mean and the variance of each group of sample data sets of different categories;
The mean and the variance of each group of sample data sets of different categories in step S2 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 1 and formula 2 to calculate the mean and the variance of each group, respectively:
μ_{i1} = (1/n_i) Σ_j x_j  (formula 1)
Σ_{i1} = (1/(n_i − 1)) Σ_j (x_j − μ_{i1})(x_j − μ_{i1})^T  (formula 2)
where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i1} denotes the mean of each group of sample data sets, Σ_{i1} denotes the variance of each group of sample data sets, and x_j is the feature vector of each photo within the group.
S3, the feature vectors x_j are used again to calculate the mean and the variance of the few-sample data;
The mean and the variance of the few-sample data in step S3 are specifically calculated as follows: the feature vectors x_j obtained in step S1 are substituted into formula 3 and formula 4 to calculate the mean and the variance of the few-sample data, respectively:
μ_{i2} = (1/n_i) Σ_j x_j  (formula 3)
Σ_{i2} = (1/(n_i − 1)) Σ_j (x_j − μ_{i2})(x_j − μ_{i2})^T  (formula 4)
where i denotes the group index, n_i denotes the number of samples in the i-th group, μ_{i2} denotes the mean of the few-sample data, Σ_{i2} denotes the variance of the few-sample data, and x_j is the feature vector of each photo within each group of sample data sets.
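A minimal numpy sketch of the statistics of steps S2 and S3 (formulas 1–4 above); the array shapes and the synthetic data are illustrative assumptions:

```python
import numpy as np

def class_stats(features: np.ndarray):
    """Mean (formulas 1/3) and variance (formulas 2/4) of one group's
    feature vectors x_j; `features` has shape (n_i, d)."""
    mu = features.mean(axis=0)              # (1/n_i) * sum_j x_j
    sigma = np.cov(features, rowvar=False)  # unbiased (d, d) covariance
    return mu, sigma

# Illustrative usage on synthetic features: 3 base classes, 64-d vectors.
rng = np.random.default_rng(0)
base_stats = [class_stats(rng.normal(loc=c, size=(100, 64))) for c in range(3)]  # step S2
support = rng.normal(size=(5, 64))          # few-sample data (5 photos)
mu2, sigma2 = class_stats(support)          # step S3
```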
S4, the mean and the variance of the few-sample data in step S3 are calibrated according to the mean and the variance of each group of sample data sets of different categories in step S2;
The implementation process of the calibration in step S4 includes the following steps:
S41, in order to make the feature distribution of the feature vectors x_j closer to a Gaussian distribution, the skewness of their distribution is first reduced by formula 5, where formula 5 adopts the Tukey's Ladder of Powers Transformation (TLPT):
x̃ = x^λ, if λ ≠ 0;  x̃ = log(x), if λ = 0  (formula 5)
where λ is a set hyper-parameter: when λ is set to 1, the original feature distribution of the few-sample data is recovered; when λ is set to less than 1, the positive skewness of the feature distribution of the few-sample data is reduced; when λ is set to greater than 1, the positive skewness of the feature distribution of the few-sample data is increased;
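Under the TLPT reading of formula 5 above, a minimal numpy sketch of step S41; the non-negativity assumption on feature values (e.g. post-ReLU features) and the example λ = 0.5 are illustrative:

```python
import numpy as np
from scipy.stats import skew

def tukey_transform(x: np.ndarray, lam: float = 0.5) -> np.ndarray:
    """Tukey's Ladder of Powers Transformation (formula 5).
    lam = 1 keeps the original distribution, lam < 1 reduces positive
    skewness, lam > 1 increases it. Assumes non-negative features."""
    if lam == 0:
        return np.log(x + 1e-12)   # epsilon guards log(0); illustrative choice
    return np.power(x, lam)

# Illustrative check: a positively skewed sample becomes more symmetric.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=10_000)
print(skew(x), skew(tukey_transform(x, 0.5)))   # skewness drops markedly
```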
S42, next, statistics are transferred to the few-sample data by using the feature vectors x_j of the base class data, and the calibrated mean and variance of the few-sample data are obtained; the transfer is based on the Euclidean distance between the transformed feature vector of the few-sample data and the feature means of the base class data;
the specific implementation steps of step S42 are as follows:
S421, first, the k base class data samples whose appearance feature vectors are most similar to the sample are selected from the base class data; k is a modifiable hyper-parameter that can be freely set according to the situation;
S422, the set S_d of feature distances to the sample is obtained by using formula 6, and the set S_N of the k base class data sets nearest to the sample is obtained by using formula 7:
S_d = { ‖μ_{i1} − x̃‖² : i = 1, …, N }  (formula 6)
S_N = { μ_{i1} : −‖μ_{i1} − x̃‖² ∈ topk(−S_d) }  (formula 7)
where x̃ is the transformed feature vector of the sample, N is the number of base class data sets, and topk(·) selects the k nearest base classes;
S423, the calibrated mean and variance of the few-sample data are obtained through S_N by comparison and calibration:
μ′ = ( Σ_{μ_{i1} ∈ S_N} μ_{i1} + x̃ ) / (k + 1)  (formula 8)
Σ′ = ( Σ_{μ_{i1} ∈ S_N} Σ_{i1} ) / k + α  (formula 9)
where α is also a modifiable hyper-parameter that controls the degree of dispersion of the calibrated distribution and is tuned according to the calibration result.
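Continuing the illustration, a minimal numpy sketch of steps S421–S423 as reconstructed in formulas 6–9 above; the helper name calibrate, the defaults k = 2 and α = 0.21, and the base_stats layout are all assumptions, not fixed by the patent:

```python
import numpy as np

def calibrate(x_tilde: np.ndarray, base_stats, k: int = 2, alpha: float = 0.21):
    """Calibrate few-sample statistics from the k nearest base classes
    (steps S421-S423). `x_tilde` is one transformed support feature of
    shape (d,); `base_stats` is a list of (mu_i1, sigma_i1) per base class."""
    mus = np.stack([mu for mu, _ in base_stats])         # (N, d) base class means
    dists = np.sum((mus - x_tilde) ** 2, axis=1)         # formula 6: distance set S_d
    nearest = np.argsort(dists)[:k]                      # formula 7: k nearest -> S_N
    mu_c = (mus[nearest].sum(axis=0) + x_tilde) / (k + 1)                   # formula 8
    sigma_c = np.mean([base_stats[i][1] for i in nearest], axis=0) + alpha  # formula 9
    return mu_c, sigma_c
```

In practice the calibrated covariance may need extra regularization (e.g. a small diagonal term) to stay positive semidefinite before sampling; the scalar α here plays that spreading role.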
S5, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training a linear classifier, and the classifier is trained through a cross-entropy loss to obtain the classifier model;
The specific implementation steps of step S5 are as follows:
S51, the calibrated mean and variance of step S4 are randomly sampled to generate new feature data for training the linear classifier: a set of data S_y is generated by sampling from the calibrated mean and variance of the few-sample data, giving a group of feature vectors labeled as class y:
S_y = { (x, y) : x ~ N(μ′, Σ′) }  (formula 10)
S52, the total number of features generated for each group of sample data sets of different categories is set as a hyper-parameter; then the original class-y feature vectors in the few-sample data and the class-y feature vectors generated in step S51 are used together as the training data of the linear classifier, and the linear classifier is trained with the cross-entropy loss to obtain the classifier model:
Σ_{(x,y)} −log Pr(y | x; θ)  (formula 11)
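As a minimal sketch of steps S51–S52, assuming scikit-learn: LogisticRegression, which minimizes exactly the cross-entropy of formula 11, stands in for the linear classifier, and the per-sample calibrated statistics are assumed to come from a helper such as the calibrate() sketch above; n_generate is the generation-count hyper-parameter of step S52.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_few_shot_classifier(support_x, support_y, calibrated, n_generate=500):
    """Steps S51-S52. `support_x`: transformed few-sample features (n, d);
    `calibrated`: one (mu', sigma') pair per support sample, e.g. from the
    calibrate() sketch above; `n_generate` is the per-sample generation count."""
    rng = np.random.default_rng(0)
    feats, labels = [np.asarray(support_x)], [np.asarray(support_y)]
    for (mu_c, sigma_c), y in zip(calibrated, support_y):
        gen = rng.multivariate_normal(mu_c, sigma_c, size=n_generate)  # formula 10
        feats.append(gen)
        labels.append(np.full(n_generate, y))
    X, Y = np.concatenate(feats), np.concatenate(labels)
    # LogisticRegression minimizes the cross-entropy sum of
    # -log Pr(y | x; theta) over the training data, i.e. formula 11.
    return LogisticRegression(max_iter=1000).fit(X, Y)
```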
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.