CN111738301B - Long-tail distribution image data identification method based on double-channel learning - Google Patents
- Publication number: CN111738301B (application CN202010465433.XA)
- Authority: CN (China)
- Prior art keywords: learning, channel, unbalanced, small sample, training
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
- Y02T10/40 — Engine management systems
Abstract
The invention discloses a long-tail distribution image data identification method based on double-channel learning, which comprises the following steps: 1) constructing a two-channel learning model combining unbalanced learning and small sample learning; 2) updating all parameters in the two-channel learning model using the total loss of the two-channel learning, and storing the optimal parameters of the two-channel learning model; 3) inputting the image data of the test set into the optimal two-channel learning model to obtain the predicted label of each image. The invention combines unbalanced learning and small sample learning to solve the problem of long-tail distributed image data identification: the unbalanced learning channel improves the identification accuracy on an unbalanced dataset, the small sample learning channel improves the feature representation the model learns, and the model focuses on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage, so that the identification accuracy of long-tail distributed image data is improved overall.
Description
Technical Field
The invention relates to the technical fields of unbalanced classification, small sample learning and long-tail distribution image data identification in machine learning, in particular to a long-tail distribution image data identification method based on double-channel learning.
Background
Long-tail distributed image data identification generally adopts techniques from unbalanced learning, which are mainly divided into the data level and the algorithm level. Techniques at the data level mainly include downsampling majority-class samples, upsampling minority-class samples, or a mixed sampling method combining both. However, resampled data cannot reflect the real data distribution characteristics: the downsampling method discards many majority-class samples, losing a lot of valuable information in the dataset, while the upsampling method causes overfitting and incurs great computational cost. Techniques at the algorithm level mainly readjust the weight of each category through cost-sensitive methods; these alleviate the long-tail identification problem to a certain extent, but do not fully consider that a large number of tail categories have very few samples, so tail-category identification accuracy remains low. In addition, feasible directions are to learn knowledge from the data-rich head classes and transfer it to the tail classes, to design loss functions suited to long-tail distributed image data identification, and to construct more reasonable long-tail distributed image data identification models.
Data in real life are often presented in a long-tail distribution form, however, the current research on long-tail distribution image data identification is still in a preliminary stage, and all long-tail distribution image data identification methods have limitations and cannot well improve the identification accuracy of tail categories.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art, and provides an effective, scientific and reasonable long-tail distribution image data identification method based on double-channel learning, wherein unbalanced learning and small-sample learning are combined to solve the problem of long-tail distribution image data identification, an unbalanced learning channel can improve the identification accuracy of a model to an unbalanced data set, a small-sample learning channel can improve the characteristic representation learned by the model, and the identification capability of the model to tail type image data is enhanced; the built double-channel learning total loss function enables the model to be focused on an unbalanced learning channel in the early training stage and focused on a small sample learning channel in the later training stage, so that the recognition accuracy of the model on long-tail image data is improved as a whole. The method provided by the invention is applicable to the problems of unbalanced multi-classification and long-tail distribution image data identification, and is a general method with stronger robustness.
In order to achieve the above purpose, the technical scheme provided by the invention is as follows: a long tail distribution image data identification method based on double-channel learning comprises the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing the long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating unbalanced learning channel loss; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) Using the total loss of the two-channel learning, and updating all parameters in the two-channel learning model by back propagation, namely training the two-channel learning model, and storing the optimal two-channel learning model parameters to obtain an optimal two-channel learning model;
3) Inputting the image data of the test set to an optimal two-channel learning model, and obtaining a prediction label, namely a prediction result, of the image.
In step 1), the case of the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel are sampled by a uniform sampler: in each training round T, each sample in the training set is sampled with equal probability and at most once. Define B as the number of samples per batch; the sampled input data are represented as {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample, 1 ≤ i ≤ B;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithmThe network model of the unbalanced classification algorithm is transplanted and comprises a feature extractor f φ ClassifierAnd an unbalance loss function L imb Three parts, the feature extractor f φ For extracting input data (x i imb ,y i imb ) Characteristic representation of +.>The feature representation is then ++>Input to classifier->Get predictive tag->Finally combine the defined unbalance loss function L imb Calculating unbalanced learning channel loss of corresponding batch of samples>
The small sample learning channel sampler is as follows:
the input data of the small sample learning channel are sampled by a meta sampler: in each training round T, the meta sampler first randomly samples N categories among all categories of the training set, and then randomly samples K_S samples and K_Q samples within each of the N categories, used respectively as the support set S = {(x_1^sup, y_1^sup), ..., (x_{N×K_S}^sup, y_{N×K_S}^sup)} and the query set Q = {(x_1^qry, y_1^qry), ..., (x_{N×K_Q}^qry, y_{N×K_Q}^qry)} of the small sample learning channel, where the superscripts sup and qry identify the support set and the query set respectively; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample, 1 ≤ i ≤ N×K_S, and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample, 1 ≤ i ≤ N×K_Q. The data of each batch consist of a support set S and a query set Q;
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm, whose network model is transplanted and comprises three parts: a feature extractor f_φ, a distance measure d, and a loss function L_fs; the feature extractor of the small sample learning channel network uses the same network architecture as that of the unbalanced learning channel network and shares its weight parameters. The input support-set sample data (x_i^sup, y_i^sup) and query-set sample data (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to extract the features z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry); then, according to the distance measure d, the distance between each query-set sample feature and the support-set sample features is computed, and the label of the support-set sample closest to a query-set sample is taken as the predicted label ŷ_i^qry of that query-set sample; finally, the small sample learning channel loss is calculated according to the defined small sample loss function L_fs;
The two-channel learning total loss function is as follows:
the two-channel learning total loss is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss:

L_total = α · L_imb + (1 − α) · L_fs

where α is a hyperparameter related to the training round T. α and the training round number T are in a parabolically decreasing relation: α is 1 at the beginning of training and gradually decreases to 0 as T increases, so that the two-channel learning model focuses on the unbalanced learning channel in the early stage of training and on the small sample learning channel in the later stage.
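The weighted combination above can be sketched in a few lines of plain Python; the function name is illustrative, not from the patent.

```python
def two_channel_total_loss(loss_imb, loss_fs, alpha):
    """Two-channel total loss: alpha weights the unbalanced-channel loss,
    (1 - alpha) weights the small-sample-channel loss."""
    return alpha * loss_imb + (1.0 - alpha) * loss_fs
```

With α = 1 early in training the gradient comes entirely from the unbalanced channel; as α decays toward 0, the small sample learning channel takes over.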
In step 2), when training the two-channel learning model, a maximum number of training rounds T_max, an optimizer type, and an initial learning rate are first set. In each round, the uniform sampler's data are input to the unbalanced learning channel network and the meta sampler's data are input to the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are computed simultaneously and then weighted and summed to obtain the two-channel learning total loss. Combined with the optimizer, back propagation of the total loss updates the parameters of the weight-shared feature extractor and of the unbalanced learning channel classifier. The hyperparameter α in the two-channel learning total loss function is in a parabolically decreasing relation with the training round number: it is 1 at the beginning of training and gradually decreases to 0 as the round number increases, so that the two-channel learning model focuses on the unbalanced learning channel early in training and on the small sample learning channel later;
the performance of the two-channel model is evaluated using the accuracy and recall of the Many-shot, Medium-shot, Few-shot, and Overall categories in the validation set of the long-tail distributed image dataset, where a Many-shot class has more than 100 samples, a Medium-shot class has between 20 and 100 samples, a Few-shot class has fewer than 20 samples, and the Overall category refers to all classes of the validation set. When the training round number reaches the set maximum T_max, training stops and the optimal two-channel learning model parameters are saved.
In step 3), inputting the image data of the test set into an optimal two-channel learning model, wherein the output of the last layer of classifier of the unbalanced learning channel network in the model is the final prediction result of the image data of the test set.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. Compared with methods that use only unbalanced learning, the invention combines an unbalanced learning channel with a small sample learning channel; the added small sample learning channel improves the feature representation, enhances intra-class compactness, and improves the two-channel learning model's ability to identify data-scarce tail-category image data.
2. The uniform sampler adopted by the unbalanced learning channel can keep the original distribution of the long-tail distribution image dataset, and is beneficial to the representation learning of the characteristics.
3. The meta sampler adopted by the small sample learning channel meta-samples all categories of the training set of the long-tail distributed image dataset; by treating small amounts of data from different categories, sampled in different rounds, as meta tasks, the two-channel learning model learns to adapt to identification tasks with only a few samples and makes full use of the dataset.
4. The two-channel learning total loss function constructed by the invention is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss. The two-channel learning model focuses on the unbalanced learning channel in the early training stage to learn a good decision boundary, and on the small sample learning channel in the later training stage: by pulling similar samples together and pushing dissimilar samples apart, it gradually repairs the feature representation damaged by unbalanced learning without harming the decision boundary learned by the unbalanced learning channel, so the recognition accuracy of the two-channel learning model on long-tail distributed image data is improved overall.
5. The long tail distribution image data identification method based on the double-channel learning uses the output of the last layer classifier of the unbalanced learning channel network as a final prediction result. And when the two-channel learning model is trained, the performance of the two-channel learning model is evaluated by using the accuracy and recall rates of the Many-shot class, the Medium-shot class and the Few-shot class in the verification set of the long-tail distributed image dataset, so that the change of the real performance of the model can be tracked better, and the trained model is more reliable.
Drawings
Fig. 1 is a diagram showing an example of input data according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a two-channel learning model structure according to the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples.
The Places365 dataset is a large image dataset covering 365 scene categories, each containing no more than 5000 training pictures, 50 validation pictures, and 900 test pictures. The original Places365 dataset is downsampled according to a Pareto distribution with power-exponent parameter 6; the training set of the resulting long-tail distributed image dataset contains 62500 pictures in total, with at most 4980 and at least 5 pictures per class, and the constructed training set Places-LT is shown in figure 1. The validation set of the long-tail distributed image dataset samples 20 pictures per class, used to track and evaluate the performance of the two-channel learning model. The test set samples 50 pictures per class, used to evaluate and compare the performance of the two-channel learning model against other image data identification models.
For a constructed long tail distributed image dataset, the data preprocessing operation is as follows: all pictures were first adjusted to 256 x 256, randomly cropped to 224 x 224 during training, then flipped horizontally with 50% probability, and randomly dithered in brightness, contrast, and saturation of the pictures to enhance the pictures, which were center cropped to 224 x 224 without further enhancement during verification and testing.
As shown in fig. 2, the long tail distribution image data identification method based on the dual-channel learning provided by the embodiment includes the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function, wherein:
imbalance learning channel sampler: the input data of the unbalanced learning channel is sampled from a uniform sampler. In each training round T, each sample in the training set of long-tail distributed image datasets is sampled with equal probability and at most once. Defining B as the number of samples sampled per batch, B is set to 128 in this embodiment, and the sampled input data is represented as { (x) 1 imb ,y 1 imb ),...,(x i imb ,y i imb ),...,(x B imb ,y B imb ) A }, wherein superscript imb is used to identify an imbalance learning channel, (x) i imb ,y i imb ) Image data and label data of the i (1. Ltoreq.i.ltoreq.B) th sample are represented, respectively.
Unbalanced learning channel network: based on the unbalanced classification algorithm, the network model of the unbalanced classification algorithm can be transplanted. In this embodiment the unbalanced learning channel network adopts the LDAM unbalanced classification network, where the feature extractor f_φ adopts a ResNet10 residual network, the classifier uses a fully connected layer, and the unbalanced loss function L_imb uses the LDAM loss. The feature extractor f_φ first extracts the feature representation z_i^imb = f_φ(x_i^imb) of the input data (x_i^imb, y_i^imb); the feature representation is then input to the classifier to obtain the predicted label ŷ_i^imb; finally, the LDAM loss function is used to compute the unbalanced learning channel loss of the batch of samples. Denote by z_j the classifier output (logit) of sample x_i^imb for class j, and by n_y the number of training-set samples of class y_i^imb; the hyperparameter C is set to 0.5. The LDAM loss function is as follows:

L_LDAM(x, y) = −log( e^(z_y − Δ_y) / ( e^(z_y − Δ_y) + Σ_{j≠y} e^(z_j) ) )

wherein the per-class margin is Δ_j = C / n_j^(1/4).
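A plain-Python sketch of the LDAM loss for a single sample, following the standard formulation in which a per-class margin Δ = C / n^(1/4) is subtracted from the target logit before the softmax cross-entropy; function and argument names are illustrative, not from the patent.

```python
import math

def ldam_loss(logits, label, class_counts, C=0.5):
    """LDAM loss for one sample: enforce a larger margin for classes
    with fewer training samples (margin = C / n_label**0.25)."""
    margin = C / class_counts[label] ** 0.25
    adjusted = [z - margin if j == label else z for j, z in enumerate(logits)]
    # numerically stable log-sum-exp of the adjusted logits
    m = max(adjusted)
    log_sum = m + math.log(sum(math.exp(z - m) for z in adjusted))
    return log_sum - adjusted[label]
```

Rare classes receive a larger margin, so with identical logits a tail-class sample incurs a larger loss than a head-class one, pushing the decision boundary away from the tail classes.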
Small sample learning channel sampler: the input data of the small sample learning channel are sampled by a meta sampler. In each training round T, the meta sampler first randomly samples N = 5 categories among all categories of the training set of the long-tail distributed image dataset, and then randomly samples K_S = 1 sample and K_Q = 1 sample within each of the 5 categories, used respectively as the support set S = {(x_1^sup, y_1^sup), ..., (x_{N×K_S}^sup, y_{N×K_S}^sup)} and the query set Q = {(x_1^qry, y_1^qry), ..., (x_{N×K_Q}^qry, y_{N×K_Q}^qry)} of the small sample learning channel, where the superscripts sup and qry identify the support set and the query set respectively; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample (1 ≤ i ≤ N×K_S), and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample (1 ≤ i ≤ N×K_Q). The data of each batch consist of a support set S and a query set Q.
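The episode sampling just described (N-way, K_S-shot support, K_Q-shot query) can be sketched as follows; names are illustrative.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, k_query=1, rng=random):
    """Meta-sampler sketch: pick N classes, then K_S support and K_Q query
    indices per class, with no overlap between support and query."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    eligible = [c for c, idxs in by_class.items()
                if len(idxs) >= k_shot + k_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picks = rng.sample(by_class[c], k_shot + k_query)
        support += [(i, c) for i in picks[:k_shot]]
        query += [(i, c) for i in picks[k_shot:]]
    return support, query
```

Each round draws a fresh episode, so over many rounds even the rarest classes are repeatedly posed as small recognition tasks.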
Small sample learning channel network: based on the small sample learning algorithm, the network model of the small sample learning algorithm can be transplanted. In this embodiment the small sample learning channel adopts a prototypical network model; its feature extractor f_φ uses the same ResNet10 network architecture as the feature extractor of the unbalanced learning channel and shares its weight parameters, and the small sample loss function L_fs uses the cross-entropy loss. The input support-set sample data (x_i^sup, y_i^sup) and query-set sample data (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to extract the feature representations z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry). After the features of the input batch are extracted, the feature center c_k of each class-k support sample set S_k is computed:

c_k = (1 / |S_k|) Σ_{(x_i^sup, y_i^sup) ∈ S_k} f_φ(x_i^sup)

Then, according to the Euclidean distance d(z_i^qry, c_k) between the query-set sample feature z_i^qry and the class feature center c_k, the probability that the query-set sample x_i^qry belongs to class k is computed:

p(y = k | x_i^qry) = exp(−d(z_i^qry, c_k)) / Σ_{k′} exp(−d(z_i^qry, c_{k′}))

Finally, the small sample learning channel loss is computed according to the small sample loss function L_fs, the cross-entropy over the query set:

L_fs = −(1 / (N × K_Q)) Σ_i log p(y = y_i^qry | x_i^qry)
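The prototype computation and query classification above can be sketched on plain Python lists, using squared Euclidean distance as the measure d (a common choice for prototypical networks; function and variable names are illustrative).

```python
import math

def proto_predict(support, query_feat):
    """Prototypical-network sketch: class centers from support features,
    then a softmax over negative squared Euclidean distances for one query.
    `support` maps class label -> list of feature vectors."""
    centers = {}
    for label, feats in support.items():
        dim = len(feats[0])
        centers[label] = [sum(f[d] for f in feats) / len(feats)
                          for d in range(dim)]
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    logits = {k: -sqdist(query_feat, c) for k, c in centers.items()}
    m = max(logits.values())
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}
```

The channel loss is then the cross-entropy of these probabilities against the query labels, averaged over the query set.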
Two-channel learning total loss function: the two-channel learning total loss is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss:

L_total = α · L_imb + (1 − α) · L_fs

where α is a hyperparameter related to the training round T. With the total number of training rounds defined as T_max, α follows a parabolically decreasing relation with T, equal to 1 at the start of training and 0 at round T_max:

α = 1 − (T / T_max)²
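Assuming the parabolic decay takes the common form α = 1 − (T/T_max)² (an assumption consistent with α starting at 1 and reaching 0 at T_max; the patent's own formula image is not reproduced here), the schedule is:

```python
def alpha_schedule(t, t_max):
    """Parabolic decay of the channel weight: 1 at round 0, 0 at round
    t_max, with the drop accelerating late in training."""
    return 1.0 - (t / t_max) ** 2
```

Under this form, α stays close to 1 for much of early training (favouring the unbalanced channel) and falls off quickly near the end (favouring the small sample channel).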
2) And (3) updating all parameters in the two-channel learning model by using the total loss of the two-channel learning through back propagation, namely training the two-channel learning model, and storing the optimal two-channel learning model parameters to obtain the optimal two-channel learning model.
In the process of training the two-channel learning model, the maximum training round number T max Set to 120, the optimizer uses an SGD optimizer, the learning rate is initialized to 0.1, the learning rate drops by a factor of 0.1 when the training round number T reaches 70, and the learning rate continues to drop by a factor of 0.1 when the training round number reaches 90. The super parameter alpha in the dual-channel learning total loss function and the training round number T are in parabolic decreasing relation, so that the dual-channel learning model is focused on an unbalanced learning channel in the early stage of training and focused on a small sample learning channel in the later stage of training.
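The step learning-rate schedule of this embodiment (initial rate 0.1, multiplied by 0.1 at rounds 70 and 90) can be sketched as a small lookup function; the name is illustrative.

```python
def learning_rate(t, base_lr=0.1, milestones=(70, 90), gamma=0.1):
    """Step schedule from the embodiment: start at base_lr and multiply
    by gamma at each milestone round reached so far."""
    lr = base_lr
    for m in milestones:
        if t >= m:
            lr *= gamma
    return lr
```

This matches a standard multi-step decay as found in deep-learning optimizers (e.g. SGD with milestone-based scheduling).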
When training the two-channel learning model, the performance of the two-channel model is evaluated using the accuracy and recall of the Many-shot, Medium-shot, Few-shot, and Overall categories in the validation set of the long-tail distributed image dataset. A Many-shot class has more than 100 samples, a Medium-shot class has between 20 and 100 samples, and a Few-shot class has fewer than 20 samples; the Overall category refers to all classes of the validation set. When the training round number T reaches the set maximum T_max, training stops and the optimal two-channel learning model parameters are saved.
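The shot-group evaluation above — average per-class recall within the Many-/Medium-/Few-shot groups, with groups decided by the training-set class counts — can be sketched as follows (names illustrative):

```python
def shot_group_recall(train_counts, correct, total):
    """Average per-class recall inside the Many-/Medium-/Few-shot groups.
    `train_counts`: class -> training-set sample count;
    `correct`/`total`: class -> correct and total validation samples."""
    def group(n):
        if n > 100:
            return "many"
        if n >= 20:
            return "medium"
        return "few"
    sums = {"many": [0.0, 0], "medium": [0.0, 0], "few": [0.0, 0]}
    for cls, n in train_counts.items():
        g = group(n)
        sums[g][0] += correct[cls] / total[cls]
        sums[g][1] += 1
    return {g: (s / c if c else 0.0) for g, (s, c) in sums.items()}
```

Reporting the groups separately makes tail-class progress visible even when head classes dominate the overall number.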
3) The image data of the test set of the long-tail distributed image dataset are input to the optimal two-channel learning model saved in the previous step; the output of the last-layer classifier of the unbalanced learning channel network in the model is the final prediction result for the test-set image data.
The following table compares the two-channel learning model with other image data recognition models on the Places-LT dataset. Among the compared models, DC-LTR denotes the two-channel learning model; except for the Plain Model, a naive deep convolutional neural network classifier, the other models are currently mainstream models for handling unbalanced or long-tail distributed image datasets. For a fair comparison, all models were trained with the Places-LT training set and the ResNet10 network structure, and Class-Balanced Accuracy and Macro F-measure were then computed on the Places-LT test set for the Many-shot, Medium-shot, Few-shot, and Overall categories, where Class-Balanced Accuracy represents the average recall per class and Macro F-measure represents the average precision per class.
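Class-Balanced Accuracy as defined here — the mean of per-class recalls, so every class counts equally regardless of its size — can be computed as follows (a sketch; names illustrative):

```python
def class_balanced_accuracy(per_class_correct, per_class_total):
    """Mean of per-class recalls: each class contributes equally,
    so tail classes are not drowned out by head classes."""
    recalls = [per_class_correct[c] / per_class_total[c]
               for c in per_class_total]
    return sum(recalls) / len(recalls)
```

Plain accuracy on a long-tail test set would be dominated by head classes; this metric exposes tail-class failures instead.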
TABLE 1 results of comparative experiments on Places-LT datasets
From the experimental results, the Class-Balanced Accuracy and Macro F-measure of the two-channel learning model DC-LTR on the Few-shot and Overall categories are clearly superior to those of the other compared models, showing that the two-channel learning model improves the recognition accuracy of data-scarce tail-category image data and thereby improves long-tail recognition accuracy overall. Although DC-LTR's result on the Medium-shot category is slightly reduced, its result on the Many-shot category is comparable to the other unbalanced-algorithm-based or long-tail image recognition models, showing that improving tail-category accuracy does not harm the recognition accuracy of data-rich head categories. The comparison across models verifies the effectiveness and superiority of the two-channel learning model.
The model is written in Python 3.7 on the deep learning framework PyTorch; the experiments ran on 2 NVIDIA GeForce GTX 1080Ti GPUs with 22 GB of video memory in total.
Long-tail identification on other datasets proceeds analogously to this method.
In summary, the invention combines unbalanced learning and small sample learning to solve the long-tail distributed image data identification problem. The unbalanced learning channel corrects the tendency of general algorithms to be overly biased toward head categories while learning a good classification decision boundary, improving the two-channel learning model's identification accuracy on unbalanced datasets; the small sample learning channel restores the feature representation capability damaged by the unbalanced learning channel by pulling similar samples together and pushing dissimilar samples apart, enhancing the model's ability to identify tail-category image data; and the constructed two-channel learning total loss function makes the model focus on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage, improving recognition accuracy on long-tail distributed image data overall. The invention therefore has practical application value and is worth popularizing.
The above embodiments are only preferred embodiments of the present invention and do not limit its scope of protection; any variation made according to the shape and principles of the present invention should be covered by that scope.
Claims (3)
1. A long tail distribution image data identification method based on double-channel learning is characterized by comprising the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing the long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating unbalanced learning channel loss; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel is sampled from a uniform sampler, and each sample in the training set is sampled with equal probability and at most once in each training round T; define B as the number of samples sampled per batch, the sampled input data is represented as { (x) 1 imb ,y 1 imb ),...,(x i imb ,y i imb ),...,(x B imb ,y B imb ) A superscript imb identifying an unbalanced learning path, (x) i imb ,y i imb ) Image data and label data representing the ith sample, i is more than or equal to 1 and less than or equal to B;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm, whose network model is transplanted; it comprises three parts: a feature extractor f_φ, a classifier, and an unbalanced loss function L_imb. The feature extractor f_φ extracts a feature representation z_i^imb = f_φ(x_i^imb) from the input data (x_i^imb, y_i^imb); the feature representation z_i^imb is then input to the classifier to obtain a predicted label ŷ_i^imb; finally, the defined unbalanced loss function L_imb is used to calculate the unbalanced learning channel loss for the corresponding batch of samples;
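The claim leaves the concrete feature extractor, classifier, and L_imb abstract. As a toy sketch of the channel's forward pass, the following uses a linear-plus-tanh feature extractor, a linear classifier, and class-frequency-weighted cross-entropy as one common choice of unbalanced loss (all of these are assumptions for illustration, not the patent's specific networks):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def unbalanced_channel_loss(x, y, W_feat, W_cls, class_counts):
    """Toy forward pass of the unbalanced learning channel.

    z = f_phi(x) is a linear+tanh feature extractor here; the classifier
    is a linear layer; L_imb is class-frequency-weighted cross-entropy,
    one common unbalanced loss (the claim does not fix L_imb).
    """
    z = np.tanh(x @ W_feat)               # feature representation z_i
    probs = softmax(z @ W_cls)            # classifier output
    weights = 1.0 / class_counts[y]       # rarer classes weigh more
    nll = -np.log(probs[np.arange(len(y)), y] + 1e-12)
    return float((weights * nll).sum() / weights.sum())
```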
The small sample learning channel sampler is as follows:
the input data of the small sample learning channel is sampled by a meta-sampler: in each training round T, it first randomly samples N categories from all categories of the training set, and then from each of the N categories randomly samples K_S samples and K_Q samples, which respectively form the support set S = {(x_1^sup, y_1^sup), ..., (x_g^sup, y_g^sup), ...} and the query set Q = {(x_1^qry, y_1^qry), ..., (x_p^qry, y_p^qry), ...} of the small sample learning channel, where the superscripts sup and qry identify the support set and the query set; (x_g^sup, y_g^sup) denotes the image data and label data of the g-th support-set sample, 1 ≤ g ≤ N×K_S; (x_p^qry, y_p^qry) denotes the image data and label data of the p-th query-set sample, 1 ≤ p ≤ N×K_Q. The data of each batch consists of the support set S and the query set Q;
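The N-way, K_S/K_Q episode construction above can be sketched as follows (names are hypothetical; `labels_by_class` maps each class to its training-set indices):

```python
import random

def sample_episode(labels_by_class, n_way, k_support, k_query, seed=0):
    """Meta-sampler sketch: pick N classes, then K_S support and K_Q
    query samples from each, returning (index, class) pair lists."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(labels_by_class), n_way)
    support, query = [], []
    for c in classes:
        # draw K_S + K_Q distinct indices so support and query are disjoint
        picks = rng.sample(labels_by_class[c], k_support + k_query)
        support += [(i, c) for i in picks[:k_support]]
        query += [(i, c) for i in picks[k_support:]]
    return support, query
```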
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm, whose network model is transplanted; it comprises three parts: a feature extractor f_φ, a distance metric d, and a loss function L_fs. The feature extractor used by the small sample learning channel network and the feature extractor used by the unbalanced learning channel network use the same network architecture and share weight parameters. The input support-set sample data (x_g^sup, y_g^sup) and query-set sample data (x_p^qry, y_p^qry) first pass through the feature extractor f_φ to extract features z_g^sup = f_φ(x_g^sup) and z_p^qry = f_φ(x_p^qry); then, according to the distance metric d, the distance d(z_p^qry, z_g^sup) between each query-set sample feature and each support-set sample feature is calculated, and the label of the support-set sample closest to a query-set sample is taken as the predicted label ŷ_p^qry of that query-set sample; finally, the small sample learning channel loss is calculated according to the defined small sample loss function L_fs;
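The nearest-support-feature prediction step can be sketched as below, taking squared Euclidean distance as one common choice for the metric d (the claim does not fix d):

```python
import numpy as np

def predict_query_labels(z_support, y_support, z_query):
    """Small sample channel prediction sketch: each query sample takes
    the label of the closest support-set feature under the metric d
    (squared Euclidean distance assumed here)."""
    # pairwise squared distances, shape (n_query, n_support)
    d = ((z_query[:, None, :] - z_support[None, :, :]) ** 2).sum(-1)
    # label of the nearest support feature for each query feature
    return y_support[d.argmin(axis=1)]
```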
The two-channel learning total loss function is as follows:
the two-channel learning total loss is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss, L_total = α·L_imb + (1−α)·L_fs,
wherein α is a hyperparameter related to the training round T; α decreases parabolically with the number of training rounds T, taking the value 1 at the start of training and gradually decreasing to 0 as T increases, so that the two-channel learning model focuses on the unbalanced learning channel early in training and on the small sample learning channel late in training;
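The text fixes only the endpoints of the schedule (α = 1 at round 0, α = 0 at the last round) and its parabolic shape; one plausible concrete form, assumed here, is α(t) = (1 − t/T_max)²:

```python
def alpha(t, t_max):
    """One plausible parabolic schedule consistent with the claim:
    alpha = 1 at t = 0, decreasing to 0 at t = t_max along a parabola.
    The exact parabola is not fixed by the text; this form is assumed."""
    return (1.0 - t / t_max) ** 2

def total_loss(loss_imb, loss_fs, t, t_max):
    """Two-channel total loss: L_total = alpha*L_imb + (1-alpha)*L_fs."""
    a = alpha(t, t_max)
    return a * loss_imb + (1.0 - a) * loss_fs
```

Early in training (α near 1) the unbalanced channel dominates the gradient; late in training (α near 0) the small sample channel does, matching the claimed focus shift.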
2) Using the total loss of the two-channel learning, and updating all parameters in the two-channel learning model by back propagation, namely training the two-channel learning model, and storing the optimal two-channel learning model parameters to obtain an optimal two-channel learning model;
3) Inputting the image data of the test set to an optimal two-channel learning model, and obtaining a prediction label, namely a prediction result, of the image.
2. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, characterized in that: in step 2), when training the two-channel learning model, a maximum number of training rounds T_max, the optimizer type, and the initial learning rate are first set. In each round, the data sampled by the uniform sampler is input into the unbalanced learning channel network and the data sampled by the meta-sampler is input into the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are calculated simultaneously and then weighted and summed to obtain the two-channel learning total loss; combining the total loss with the optimizer, back propagation updates the weight-shared feature extractor parameters of the two channels and the classifier parameters of the unbalanced learning channel. The hyperparameter α in the two-channel learning total loss function decreases parabolically with the number of training rounds: α is 1 at the start of training and gradually decreases to 0 as the number of rounds increases, so that the two-channel learning model focuses on the unbalanced learning channel early in training and on the small sample learning channel late in training;
the method comprises the steps of evaluating the performance of a dual-channel model by using the accuracy rate and recall rate of a Many-shot class, a Medium-shot class, a Few-shot class and an Overall class in a verification set of a long-tail distribution image dataset, wherein the number of samples of the Many-shot class is greater than 100, the number of samples of the Medium-shot class is between 20 and 100, the number of samples of the Few-shot class is less than 20, the Overall class refers to all classes of the verification set, and when the training round number reaches a set maximum round number T max And when the training is stopped, the optimal two-channel learning model parameters are saved.
3. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, characterized in that: in step 3), the image data of the test set is input into the optimal two-channel learning model, and the output of the last classifier layer of the unbalanced learning channel network in the model is the final prediction result for the test-set image data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010465433.XA CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010465433.XA CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111738301A CN111738301A (en) | 2020-10-02 |
CN111738301B true CN111738301B (en) | 2023-06-20 |
Family
ID=72647933
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010465433.XA Active CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111738301B (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022099600A1 (en) * | 2020-11-13 | 2022-05-19 | Intel Corporation | Method and system of image hashing object detection for image processing |
CN112560904A (en) * | 2020-12-01 | 2021-03-26 | 中国科学技术大学 | Small sample target identification method based on self-adaptive model unknown element learning |
CN112632319B (en) * | 2020-12-22 | 2023-04-11 | 天津大学 | Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning |
CN112632320A (en) * | 2020-12-22 | 2021-04-09 | 天津大学 | Method for improving speech classification tail recognition accuracy based on long tail distribution |
CN113076873B (en) * | 2021-04-01 | 2022-02-22 | 重庆邮电大学 | Crop disease long-tail image identification method based on multi-stage training |
CN113095304B (en) * | 2021-06-08 | 2021-09-03 | 成都考拉悠然科技有限公司 | Method for weakening influence of resampling on pedestrian re-identification |
CN113449613B (en) * | 2021-06-15 | 2024-02-27 | 北京华创智芯科技有限公司 | Multi-task long tail distribution image recognition method, system, electronic equipment and medium |
CN113255832B (en) * | 2021-06-23 | 2021-10-01 | 成都考拉悠然科技有限公司 | Method for identifying long tail distribution of double-branch multi-center |
CN113569960B (en) * | 2021-07-29 | 2023-12-26 | 北京邮电大学 | Small sample image classification method and system based on domain adaptation |
CN114283307B (en) * | 2021-12-24 | 2023-10-27 | 中国科学技术大学 | Network training method based on resampling strategy |
CN114511887B (en) * | 2022-03-31 | 2022-07-05 | 北京字节跳动网络技术有限公司 | Tissue image identification method and device, readable medium and electronic equipment |
CN114882273B (en) * | 2022-04-24 | 2023-04-18 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN114863193B (en) * | 2022-07-07 | 2022-12-02 | 之江实验室 | Long-tail learning image classification and training method and device based on mixed batch normalization |
CN115953631B (en) * | 2023-01-30 | 2023-09-15 | 南开大学 | Long-tail small sample sonar image classification method and system based on deep migration learning |
CN116203929B (en) * | 2023-03-01 | 2024-01-05 | 中国矿业大学 | Industrial process fault diagnosis method for long tail distribution data |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB823263A (en) * | 1956-09-05 | 1959-11-11 | Atomic Energy Authority Uk | Improvements in or relating to nuclear particle discriminators |
CN108830416A (en) * | 2018-06-13 | 2018-11-16 | 四川大学 | Ad click rate prediction framework and algorithm based on user behavior |
CN109800810A (en) * | 2019-01-22 | 2019-05-24 | 重庆大学 | A kind of few sample learning classifier construction method based on unbalanced data |
CN109961089A (en) * | 2019-02-26 | 2019-07-02 | 中山大学 | Small sample and zero sample image classification method based on metric learning and meta learning |
CN110580500A (en) * | 2019-08-20 | 2019-12-17 | 天津大学 | Character interaction-oriented network weight generation few-sample image classification method |
CN110633758A (en) * | 2019-09-20 | 2019-12-31 | 四川长虹电器股份有限公司 | Method for detecting and locating cancer region aiming at small sample or sample unbalance |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10740595B2 (en) * | 2017-09-28 | 2020-08-11 | Nec Corporation | Long-tail large scale face recognition by non-linear feature level domain adaption |
- 2020-05-28 CN CN202010465433.XA patent/CN111738301B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB823263A (en) * | 1956-09-05 | 1959-11-11 | Atomic Energy Authority Uk | Improvements in or relating to nuclear particle discriminators |
CN108830416A (en) * | 2018-06-13 | 2018-11-16 | 四川大学 | Ad click rate prediction framework and algorithm based on user behavior |
CN109800810A (en) * | 2019-01-22 | 2019-05-24 | 重庆大学 | A kind of few sample learning classifier construction method based on unbalanced data |
CN109961089A (en) * | 2019-02-26 | 2019-07-02 | 中山大学 | Small sample and zero sample image classification method based on metric learning and meta learning |
CN110580500A (en) * | 2019-08-20 | 2019-12-17 | 天津大学 | Character interaction-oriented network weight generation few-sample image classification method |
CN110633758A (en) * | 2019-09-20 | 2019-12-31 | 四川长虹电器股份有限公司 | Method for detecting and locating cancer region aiming at small sample or sample unbalance |
Non-Patent Citations (2)
Title |
---|
Enli Lin et al. Deep reinforcement learning for imbalanced classification. Applied Intelligence. 2020, pp. 2488-2502. * |
Chen Qiong et al. Transfer learning classification algorithm for imbalanced data. Journal of South China University of Technology. 2018, Vol. 46, No. 46, pp. 122-130. * |
Also Published As
Publication number | Publication date |
---|---|
CN111738301A (en) | 2020-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111738301B (en) | Long-tail distribution image data identification method based on double-channel learning | |
Asif et al. | Ensemble knowledge distillation for learning improved and efficient networks | |
CN109063565B (en) | Low-resolution face recognition method and device | |
CN112308158A (en) | Multi-source field self-adaptive model and method based on partial feature alignment | |
CN101968853B (en) | Improved immune algorithm based expression recognition method for optimizing support vector machine parameters | |
CN111696101A (en) | Light-weight solanaceae disease identification method based on SE-Inception | |
CN111160533A (en) | Neural network acceleration method based on cross-resolution knowledge distillation | |
Hara et al. | Towards good practice for action recognition with spatiotemporal 3d convolutions | |
CN110334243A (en) | Audio representation learning method based on multilayer timing pond | |
CN111738303A (en) | Long-tail distribution image identification method based on hierarchical learning | |
CN112232395B (en) | Semi-supervised image classification method for generating countermeasure network based on joint training | |
CN112766378A (en) | Cross-domain small sample image classification model method focusing on fine-grained identification | |
CN107832753B (en) | Face feature extraction method based on four-value weight and multiple classification | |
CN111860278A (en) | Human behavior recognition algorithm based on deep learning | |
CN112329536A (en) | Single-sample face recognition method based on alternative pair anti-migration learning | |
CN114882531A (en) | Cross-domain pedestrian re-identification method based on deep learning | |
Tong et al. | Automatic error correction for speaker embedding learning with noisy labels | |
CN114972753A (en) | Lightweight semantic segmentation method and system based on context information aggregation and assisted learning | |
CN109241315B (en) | Rapid face retrieval method based on deep learning | |
CN115100509B (en) | Image identification method and system based on multi-branch block-level attention enhancement network | |
US20140343945A1 (en) | Method of visual voice recognition by following-up the local deformations of a set of points of interest of the speaker's mouth | |
Li et al. | Adaptive multi-prototype relation network | |
US20140343944A1 (en) | Method of visual voice recognition with selection of groups of most relevant points of interest | |
CN113111774B (en) | Radar signal modulation mode identification method based on active incremental fine adjustment | |
CN110750672B (en) | Image retrieval method based on deep measurement learning and structure distribution learning loss |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |