CN111738301A - Long-tail distribution image data identification method based on two-channel learning - Google Patents
Long-tail distribution image data identification method based on two-channel learning
- Publication number
- CN111738301A (application number CN202010465433.XA)
- Authority
- CN
- China
- Prior art keywords
- channel
- learning
- unbalanced
- training
- small sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- Y02T10/40 — Engine management systems
Abstract
The invention discloses a long-tail distribution image data identification method based on two-channel learning, comprising the following steps: 1) construct a two-channel learning model that combines unbalanced learning and small-sample learning; 2) update all parameters of the two-channel learning model by back-propagating the total two-channel learning loss, and store the optimal model parameters; 3) input the test-set image data into the optimal two-channel learning model to obtain the predicted label of each image. The invention combines unbalanced learning and small-sample learning to address long-tail distribution image data identification: the unbalanced learning channel improves identification accuracy on an unbalanced data set, while the small-sample learning channel improves the feature representation the model learns. Because of the two-channel total loss, the model emphasizes the unbalanced learning channel early in training and the small-sample learning channel late in training, which improves the overall identification accuracy on long-tail distribution image data.
Description
Technical Field
The invention relates to the technical fields of unbalanced classification, small-sample learning and long-tail distribution image data identification in machine learning, and in particular to a long-tail distribution image data identification method based on two-channel learning.
Background
Long-tail distribution image data identification generally relies on imbalance-learning techniques, which fall into data-level and algorithm-level approaches. Data-level techniques mainly down-sample the majority classes, up-sample the minority classes, or combine both in a hybrid sampling scheme. However, the resampled data no longer reflect the real data distribution: down-sampling discards majority-class samples and thereby loses much valuable information in the data set, while up-sampling causes over-fitting and incurs substantial computational cost. Algorithm-level techniques mainly re-weight each category through cost-sensitive methods; these alleviate the long-tail identification problem to some extent, but do not fully account for the many tail categories that have only a few samples, so tail-category identification accuracy remains low. Other feasible directions include transferring knowledge learned from data-rich head classes to tail classes, designing loss functions suited to long-tail distribution image data identification, and constructing more reasonable long-tail identification models.
Real-world data often follow a long-tail distribution, yet research on long-tail distribution image data identification is still at a preliminary stage: existing methods all have limitations and do not substantially improve the identification accuracy of tail categories.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing an effective, scientific and reasonable long-tail distribution image data identification method based on two-channel learning. The method combines unbalanced learning and small-sample learning: the unbalanced learning channel improves the model's identification accuracy on an unbalanced data set, while the small-sample learning channel improves the feature representation the model learns and strengthens its ability to identify tail-category image data. The constructed total two-channel loss function makes the model emphasize the unbalanced learning channel early in training and the small-sample learning channel late in training, improving the overall identification accuracy on long-tail image data. The method suits both unbalanced multi-class classification and long-tail distribution image data identification, and is a general method with strong robustness.
In order to achieve the purpose, the technical scheme provided by the invention is as follows: a long-tail distribution image data identification method based on two-channel learning comprises the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing a long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating the loss of the unbalanced learning channel; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating the small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) updating all parameters in the two-channel learning model by back-propagating the total two-channel learning loss, i.e. training the two-channel learning model, and storing the optimal parameters to obtain the optimal two-channel learning model;
3) and inputting the image data of the test set to the optimal two-channel learning model to obtain a prediction label of the image, namely a prediction result.
In step 1), the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel are drawn from a uniform sampler: in each training round T, every sample in the training set is sampled with equal probability and at most once; define B as the number of samples per batch, so the sampled input data are {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample, 1 ≤ i ≤ B;
the case of the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm whose network model is transplanted, and comprises three parts: a feature extractor f_φ, a classifier, and an imbalance loss function L_imb. The feature extractor f_φ extracts the feature representation z_i^imb = f_φ(x_i^imb) of the input data (x_i^imb, y_i^imb); the feature z_i^imb is then input to the classifier to obtain the predicted label ŷ_i^imb; finally, the defined imbalance loss function L_imb is used to compute the unbalanced learning channel loss of the corresponding batch of samples;
The small sample learning channel sampler is as follows:
the input data for the small-sample learning channel are drawn from a meta-sampler: in each training round T, the meta-sampler first randomly samples N classes from all classes of the training set and then randomly samples K_S samples and K_Q samples from each of the N classes, used respectively as the support set S = {(x_i^sup, y_i^sup)} and the query set Q = {(x_i^qry, y_i^qry)} of the small-sample learning channel, where the superscripts sup and qry identify the support set and the query set; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample, 1 ≤ i ≤ N × K_S; (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample, 1 ≤ i ≤ N × K_Q; each batch of data consists of a support set S and a query set Q;
the case of the small sample learning channel network is as follows:
the small-sample learning channel network is based on a small-sample learning algorithm whose network model is transplanted, and comprises three parts: a feature extractor f_φ, a distance metric d, and a loss function L_fs; the feature extractor of the small-sample learning channel network uses the same network architecture as that of the unbalanced learning channel network and shares its weight parameters. The input support-set samples (x_i^sup, y_i^sup) and query-set samples (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to obtain the features z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry); then, according to the distance metric d, the distances between query-set features and support-set features are computed, and the label of the support-set sample closest to a query-set sample is taken as that query-set sample's predicted label ŷ_i^qry; finally, the defined small-sample loss function L_fs is used to compute the small-sample learning channel loss;
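The nearest-support-sample prediction described above can be sketched in a few lines. This is an illustrative pure-Python example, not the patent's code: the function names and the choice of metric passed in as `d` are assumptions.

```python
def nearest_support_label(query_feat, support_feats, support_labels, d):
    """Predict a query sample's label as the label of the support-set
    sample whose feature is closest under the distance metric d."""
    best = min(range(len(support_feats)),
               key=lambda i: d(query_feat, support_feats[i]))
    return support_labels[best]
```

Any metric can be supplied for `d`; with a squared-Euclidean lambda, a query feature near one support cluster receives that cluster's label.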
The two-channel learning total loss function is as follows:
the total loss of the two-channel learning is a weighted sum of the unbalanced learning channel loss and the small-sample learning channel loss (the original formula image is absent; reconstructed from the description): L_total = α · L_imb + (1 − α) · L_fs.
In the formula, α is a hyper-parameter tied to the training round T: α decreases parabolically with the number of training rounds, taking the value 1 at the start of training and gradually decreasing to 0 as T grows, so that the two-channel learning model emphasizes the unbalanced learning channel early in training and the small-sample learning channel late in training.
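The weighted total loss above can be sketched as a small function. The exact decay of α is not given beyond "parabolic, from 1 to 0"; the quadratic form below is an assumption, and all names are illustrative:

```python
def two_channel_total_loss(loss_imb, loss_fs, t, t_max):
    """Weighted sum of the two channel losses.

    alpha decays parabolically from 1 (round 0) to 0 (round t_max),
    so early rounds emphasise the unbalanced channel and late rounds
    the small-sample channel. The form (1 - t/t_max)**2 is an assumed
    instance of the parabolic decay the description requires.
    """
    alpha = (1.0 - t / t_max) ** 2
    return alpha * loss_imb + (1.0 - alpha) * loss_fs
```

At the first round the total loss equals the unbalanced-channel loss; at the last round it equals the small-sample-channel loss, with a smooth shift in between.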
In step 2), when training the two-channel learning model, first set the maximum number of training rounds T_max, the optimizer type, and the initial learning rate. In each round, input the uniform sampler's data to the unbalanced learning channel network and the meta-sampler's data to the small-sample learning channel network, and compute the unbalanced learning channel loss and the small-sample learning channel loss simultaneously; then take their weighted sum to obtain the total two-channel learning loss, and back-propagate it through the optimizer to update the weight-shared feature extractor parameters and the unbalanced learning channel classifier parameters. The hyper-parameter α in the total two-channel loss function decreases parabolically with the number of training rounds, starting at 1 and gradually falling to 0, so that the two-channel learning model emphasizes the unbalanced learning channel early in training and the small-sample learning channel late in training;
the performance of the two-channel model is evaluated with the precision and recall of the Many-shot, Medium-shot, Few-shot and Overall categories on the verification set of the long-tail distribution image data set, where Many-shot categories have more than 100 samples, Medium-shot categories between 20 and 100 samples, Few-shot categories fewer than 20 samples, and Overall refers to all categories of the verification set; when the number of training rounds reaches the set maximum T_max, training terminates and the optimal two-channel learning model parameters are stored.
In step 3), inputting the image data of the test set into an optimal two-channel learning model, wherein the output of the last layer of classifier of the unbalanced learning channel network in the model is the final prediction result of the image data of the test set.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention combines an unbalanced learning channel with a small-sample learning channel; compared with using an unbalanced learning method alone, the added small-sample learning channel improves the feature representation, strengthens intra-class compactness, and improves the two-channel learning model's ability to identify data-scarce tail-category image data.
2. The uniform sampler adopted by the unbalanced learning channel preserves the original distribution of the long-tail distribution image data set, which benefits feature representation learning.
3. The small-sample learning channel uses the meta-sampler to meta-sample all classes of the long-tail distribution image data set's training set, drawing a small amount of data from different classes in different rounds as meta-tasks, so that the two-channel learning model learns to adapt to few-sample recognition tasks while making full use of the data set.
4. The constructed total two-channel loss function is a weighted sum of the unbalanced learning channel loss and the small-sample learning channel loss. The two-channel learning model emphasizes the unbalanced learning channel early in training to learn a good decision boundary, and the small-sample learning channel late in training: by pulling same-class samples together and pushing different-class samples apart, it gradually repairs the feature representation damaged by unbalanced learning while ensuring the decision boundary learned by the unbalanced learning channel is not harmed, improving the model's overall identification accuracy on long-tail distribution image data.
5. The long-tail distribution image data identification method based on two-channel learning uses the output of the last classifier layer of the unbalanced learning channel network as the final prediction result. During training, evaluating the model with the precision and recall of the Many-shot, Medium-shot and Few-shot categories on the verification set of the long-tail distribution image data set tracks changes in the model's true performance better and makes the trained model more reliable.
Drawings
FIG. 1 is a diagram illustrating an example of input data according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a two-channel learning model structure according to the present invention.
Detailed Description
The present invention will be further described with reference to the following specific examples.
The Places365 dataset is a large image dataset covering 365 scene categories, each containing no more than 5000 training pictures, plus 50 validation pictures and 900 test pictures per category. The original Places365 dataset is down-sampled following a Pareto distribution with power-exponent parameter 6; the training set of the resulting long-tail distribution image dataset contains 62500 pictures in total, with at most 4980 and at least 5 pictures per class. The constructed long-tail distribution image dataset Places-LT is shown in Figure 1. From the validation set of the long-tail distribution image dataset, 20 pictures per class are sampled to track and evaluate the performance of the two-channel learning model; from the test set, 50 pictures per class are sampled to evaluate and compare the two-channel learning model against other image data identification models.
For the constructed long-tailed distribution image dataset, the data preprocessing is as follows: all pictures are first resized to 256 × 256; during training they are randomly cropped to 224 × 224, horizontally flipped with 50% probability, and augmented with random jitter of brightness, contrast and saturation; during validation and testing the images are center-cropped to 224 × 224 without further augmentation.
As shown in fig. 2, the method for identifying long-tail distribution image data based on two-channel learning according to this embodiment includes the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function, wherein:
Unbalanced learning channel sampler: the input data for the unbalanced learning channel are drawn from a uniform sampler. In each training round T, every sample in the training set of the long-tailed distribution image dataset is sampled with equal probability and at most once. Define B as the number of samples per batch (B is set to 128 in this embodiment); the sampled input data are {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample, 1 ≤ i ≤ B.
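The uniform sampler above amounts to an epoch shuffle: each index is drawn once per round, in batches. A minimal pure-Python sketch (function and parameter names are illustrative, not from the patent):

```python
import random

def uniform_batches(num_samples, batch_size, seed=None):
    """Yield index batches so that, within one training round, every
    sample is drawn with equal probability and at most once."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)          # equal-probability order
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]
```

The last batch may be smaller than B when the dataset size is not a multiple of the batch size; every sample appears exactly once per round.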
Unbalanced learning channel network: based on an unbalanced classification algorithm whose network model can be transplanted; in this embodiment the unbalanced learning channel network adopts the LDAM unbalanced classification network, in which the feature extractor f_φ is a ResNet-10 residual network, the classifier a fully connected network, and the imbalance loss function L_imb the LDAM loss. The feature extractor f_φ first extracts the feature z_i^imb = f_φ(x_i^imb) of the input data (x_i^imb, y_i^imb); the feature is then input to the classifier to obtain the predicted label ŷ_i^imb; finally the LDAM loss function computes the unbalanced learning channel loss of the batch. Denote by z the logit vector of sample x_i^imb with category y_i^imb, and by n_y the number of class-y samples in the training set. With the hyper-parameter C set to 0.5, the LDAM loss (the original formula image is absent; reconstructed from the LDAM definition) is:
L_imb(z, y) = −log( e^(z_y − Δ_y) / ( e^(z_y − Δ_y) + Σ_{j≠y} e^(z_j) ) ), where Δ_y = C / n_y^(1/4).
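The LDAM loss above can be illustrated per-sample in pure Python. This is a simplified sketch: it omits the logit scaling factor used in some LDAM implementations, and the function names are illustrative.

```python
import math

def ldam_loss(logits, label, class_counts, c=0.5):
    """LDAM loss for one sample: the true-class logit is reduced by a
    margin delta_y = c / n_y**0.25 (larger for rarer classes) before an
    ordinary softmax cross-entropy."""
    margins = [c / (n ** 0.25) for n in class_counts]
    shifted = list(logits)
    shifted[label] -= margins[label]   # only the true class is shifted
    denom = sum(math.exp(z) for z in shifted)
    return -math.log(math.exp(shifted[label]) / denom)
```

Because the margin shrinks as n_y^(1/4) grows, a tail class with 5 training samples is pushed further from the decision boundary than a head class with thousands of samples.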
Small-sample learning channel sampler: the input data for the small-sample learning channel are drawn from a meta-sampler. In each training round T, the meta-sampler first randomly samples N = 5 classes from all classes of the training set of the long-tail distribution image dataset, and then from each of the 5 classes randomly samples K_S = 1 sample and K_Q = 1 sample, used respectively as the support set S = {(x_i^sup, y_i^sup)} and the query set Q = {(x_i^qry, y_i^qry)} of the small-sample learning channel, where the superscripts sup and qry identify the support set and the query set, (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample (1 ≤ i ≤ N × K_S), and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample (1 ≤ i ≤ N × K_Q). Each batch of data consists of a support set S and a query set Q.
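The N-way, K_S-shot support / K_Q-shot query episode sampling can be sketched as follows. A pure-Python illustration (names are not from the patent); it returns dataset indices rather than images:

```python
import random

def sample_episode(labels, n_way, k_support, k_query, seed=None):
    """Meta-sampler sketch: pick n_way classes at random, then
    k_support + k_query disjoint samples per class, split into a
    support set and a query set (lists of dataset indices)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    # only classes with enough samples can be drawn
    eligible = [c for c, idxs in by_class.items()
                if len(idxs) >= k_support + k_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], k_support + k_query)
        support.extend(chosen[:k_support])
        query.extend(chosen[k_support:])
    return support, query
```

With n_way = 5 and k_support = k_query = 1 as in this embodiment, each episode yields a 5-sample support set and a disjoint 5-sample query set.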
Small-sample learning channel network: based on a small-sample learning algorithm whose network model can be transplanted; in this embodiment the small-sample learning channel adopts a prototypical network model, and its feature extractor f_φ uses the same ResNet-10 network architecture as the feature extractor of the unbalanced learning channel and shares its weight parameters. The small-sample loss function L_fs is the cross-entropy loss. The input support-set samples (x_i^sup, y_i^sup) and query-set samples (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to obtain the features z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry). From the support set, the per-class sample set S_k gives the class feature center c_k = (1/|S_k|) Σ_{(x_i^sup, y_i^sup) ∈ S_k} f_φ(x_i^sup); then, from the Euclidean distance d(z_i^qry, c_k) between the query feature z_i^qry and each class center c_k, the probability that query sample x_i^qry belongs to class k (the original formula image is absent; reconstructed from the prototypical-network definition) is:
p(y = k | x_i^qry) = exp(−d(z_i^qry, c_k)) / Σ_{k'} exp(−d(z_i^qry, c_{k'})).
Finally, according to the small-sample loss function L_fs, the small-sample learning channel loss is the average negative log-probability of the true class over the query set (the original formula image is absent; reconstructed from the cross-entropy definition):
L_fs = −(1 / (N × K_Q)) Σ_i log p(y = y_i^qry | x_i^qry).
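The prototype computation and class-probability softmax above can be sketched in pure Python. This is an illustrative example under the assumption of plain Euclidean distances, as in the description (some prototypical-network implementations use squared Euclidean distance instead); names are not from the patent:

```python
import math

def prototype(features):
    """Mean of a list of feature vectors: the class center c_k."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]

def proto_probs(query, prototypes):
    """Softmax over negative Euclidean distances to each class center."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    logits = [-dist(query, c) for c in prototypes]
    m = max(logits)                       # stabilise the softmax
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The channel loss for one query sample is then `-math.log(probs[true_class])`, averaged over the query set.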
Two-channel learning total loss function: the total two-channel learning loss is a weighted sum of the unbalanced learning channel loss and the small-sample learning channel loss (the original formula image is absent; reconstructed from the description):
L_total = α · L_imb + (1 − α) · L_fs,
where α is a hyper-parameter tied to the training round T. With T_max the total number of training rounds, α decreases parabolically from 1 at T = 0 to 0 at T = T_max; a form consistent with this description is α = (1 − T / T_max)².
2) Update all parameters in the two-channel learning model by back-propagating the total two-channel learning loss, i.e. train the two-channel learning model, and store the optimal parameters to obtain the optimal two-channel learning model.
In the process of training the two-channel learning model, the maximum number of training rounds T_max is set, the SGD optimizer is used, and the learning rate is initialized to 0.1, multiplied by 0.1 when the number of training rounds T reaches 70 and again by 0.1 when T reaches 90. The hyper-parameter α in the total two-channel loss function decreases parabolically with the number of training rounds T, so that the two-channel learning model emphasizes the unbalanced learning channel early in training and the small-sample learning channel late in training.
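The step learning-rate schedule in this embodiment can be written as a small function. Only the base rate and the decay points at rounds 70 and 90 come from the text; the function name is illustrative:

```python
def learning_rate(t, base_lr=0.1):
    """Step schedule from the embodiment: start at 0.1, multiply by
    0.1 when the round count reaches 70, and again at 90."""
    lr = base_lr
    if t >= 70:
        lr *= 0.1
    if t >= 90:
        lr *= 0.1
    return lr
```

In PyTorch the same schedule would typically be expressed as `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[70, 90], gamma=0.1)`.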
When training the two-channel learning model, the performance of the two-channel model is evaluated with the precision and recall of the Many-shot, Medium-shot, Few-shot and Overall categories on the verification set of the long-tailed distribution image dataset. Many-shot categories have more than 100 samples, Medium-shot categories between 20 and 100 samples, and Few-shot categories fewer than 20 samples; Overall refers to all categories of the verification set. When the number of training rounds T reaches the set maximum T_max, training terminates and the optimal two-channel learning model parameters are stored.
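The shot-group evaluation above can be sketched as follows: classes are binned by their training-set counts, and mean per-class recall is reported per group. A pure-Python illustration with assumed names (the boundary handling for counts of exactly 20 or 100 is an assumption, since the text only says "between 20 and 100"):

```python
def shot_group(train_count):
    """Map a class's training-set sample count to its evaluation group."""
    if train_count > 100:
        return "Many-shot"
    if train_count >= 20:
        return "Medium-shot"
    return "Few-shot"

def group_recall(y_true, y_pred, train_counts):
    """Mean per-class recall within each shot group, plus Overall.
    train_counts maps class -> number of training samples."""
    per_class = {}
    for c in train_counts:
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        total = sum(1 for t in y_true if t == c)
        if total:
            per_class[c] = hits / total
    groups = {"Many-shot": [], "Medium-shot": [], "Few-shot": [],
              "Overall": []}
    for c, r in per_class.items():
        groups[shot_group(train_counts[c])].append(r)
        groups["Overall"].append(r)
    return {g: sum(v) / len(v) for g, v in groups.items() if v}
```

The Overall figure here is the class-balanced average recall, i.e. the Class-Balanced Accuracy used in the comparison table.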
3) And inputting the image data of the test set of the long-tail distribution image data set into the optimal two-channel learning model stored in the last step, wherein the output of the last layer of classifier of the unbalanced learning channel network in the model is the final prediction result of the image data of the test set.
The following table compares the two-channel learning model with other image data recognition models on the Places-LT dataset. Among the compared models, DC-LTR denotes the two-channel learning model; apart from the Plain Model, a naive deep convolutional neural network classifier, the other models are current mainstream models for unbalanced or long-tail distribution image datasets. For a fair comparison, all models are trained on the Places-LT training set with a ResNet-10 network structure, and then the Class-Balanced Accuracy and Macro F-measure of the Many-shot, Medium-shot, Few-shot and Overall categories are computed on the Places-LT test set, where Class-Balanced Accuracy is the average per-class recall and Macro F-measure the average F-measure over classes.
TABLE 1 results of comparative experiments on Places-LT data set
The experimental results show that the Class-Balanced Accuracy and Macro F-measure of the two-channel learning model DC-LTR on the Few-shot and Overall categories clearly exceed those of the other compared models, indicating that the two-channel learning model improves identification accuracy on data-scarce tail-category image data and on long-tail distribution image data overall. DC-LTR holds the same advantage on the Medium-shot category; its Many-shot result drops slightly but remains comparable to the other imbalance-based or long-tail identification models, showing that the model improves tail-category accuracy without harming the identification accuracy of data-rich head categories. The comparison across models verifies the effectiveness and superiority of the two-channel learning model.
The model of the invention is implemented in Python 3.7 on the PyTorch deep learning framework; the experiments run on two NVIDIA GeForce GTX 1080 Ti GPUs with 22 GB of video memory in total.
The long-tail identification method applies to other datasets in the same way.
In conclusion, the invention combines unbalanced learning and small-sample learning to solve long-tail distribution image data identification. The unbalanced learning channel corrects the common bias toward head categories while learning a good classification decision boundary, improving the two-channel learning model's identification accuracy on unbalanced datasets; the small-sample learning channel restores the feature representation damaged by the unbalanced learning channel by pulling same-class samples together and pushing different-class samples apart, strengthening the two-channel learning model's recognition of tail-category image data; and the constructed total two-channel loss function makes the model emphasize the unbalanced learning channel early in training and the small-sample learning channel late in training, improving overall identification accuracy on long-tail distribution image data. The invention therefore has practical application value and is worth popularizing.
The above-described embodiments are merely preferred embodiments of the present invention and do not limit its scope; any change made according to the shape and principle of the present invention shall fall within its protection scope.
Claims (4)
1. A long tail distribution image data identification method based on dual-channel learning is characterized by comprising the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing a long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating the loss of the unbalanced learning channel; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating the small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) updating all parameters in the two-channel learning model by back-propagating the two-channel learning total loss, i.e., training the two-channel learning model, and saving the optimal parameters to obtain the optimal two-channel learning model;
3) inputting the image data of the test set to the optimal two-channel learning model to obtain the predicted labels of the images, i.e., the prediction results.
2. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, wherein: in step 1), the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel are sampled by a uniform sampler: in each training round T, every sample in the training set is sampled with equal probability and at most once; defining B as the number of samples per batch, the sampled input data are denoted {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel, and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample, 1 ≤ i ≤ B;
the case of the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm, transplants a network model of the unbalanced classification algorithm, and comprises a feature extractor fφSorterAnd an imbalance loss function LimbThree parts, the feature extractor fφFor extracting input data (x)i imb,yi imb) Is characterized byThen the features are expressedInput to a classifierObtaining a predictive tagFinally, the well-defined unbalanced loss function L is combinedimbCalculating the unbalanced learning channel loss of the corresponding batch of samples
The small sample learning channel sampler is as follows:
the input data for the small sample learning channel is sampled from a meta-sampler that first randomly samples N classes in all classes of the training set and then randomly samples K in each of the N classes in each training pass TSA sample and KQThe samples are respectively used as a support set of a small sample learning channelAnd query setThe superscript sup and the superscript qry are respectively used for identifying the support set and the query set;image data and label data representing the ith sample of the support set, 1 ≦ i ≦ N × KS;Image data and label data representing the ith sample of the query set, 1 ≦ i ≦ N × KQ(ii) a Each batch of data consists of a support set S and a query set Q;
the case of the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm, a network model of the small sample learning algorithm is transplanted, and the small sample learning channel network comprises a feature extractor fφDistance gauge d and loss function LfsThree parts, wherein the small sample learns the feature extractor and imbalance adopted by the channel networkThe feature extractors adopted by the learning channel network use the same network architecture and share weight parameters; input support set sample data (x)i sup,yi sup) And query set sample data (x)i qry,yi qry) First pass feature extractor fφExtracting feature zi sup=fφ(xi sup) And zi qry=fφ(xi qry) Then, according to the distance gauge d, the distance d (x) of the query set sample feature and the support set sample feature is calculatedi qry,yi sup) The label of the support set sample closest to the query set sample is the prediction label of the query set sampleFinally according to the defined small sample loss function LfsCalculating small sample learning channel losses
The two-channel learning total loss function is as follows:
the total loss of the two-channel learning is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss, as follows:

L_total = α · L_imb + (1 − α) · L_fs
in the formula, α is a hyper-parameter related to the training round T, α and the training round number T are in a parabolic decreasing relationship, and a value is 1 at the beginning of training and gradually decreases to 0 along with the increase of the training round number T, so that the dual-channel learning model is focused on an unbalanced learning channel in the early stage of training and focused on a small sample learning channel in the later stage of training.
3. The method for identifying long-tail distribution image data based on two-channel learning as claimed in claim 1, wherein: in step 2), when training the two-channel learning model, the maximum number of training rounds T_max, the optimizer type, and the initial learning rate are first set; in each round, the data sampled by the uniform sampler are input to the unbalanced learning channel network and the data sampled by the meta-sampler are input to the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are calculated simultaneously and weighted-summed to obtain the two-channel learning total loss; combined with the optimizer, back-propagation then updates the feature extractor parameters shared by the two channels and the classifier parameters of the unbalanced learning channel; the hyper-parameter α in the two-channel learning total loss function is in a parabolic decreasing relationship with the number of training rounds, taking the value 1 at the beginning of training and gradually decreasing to 0 as the number of training rounds increases, so that the two-channel learning model emphasizes the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage of training;
evaluating the performance of the two-channel model using the accuracy and recall of the Many-shot, Medium-shot, Few-shot, and Overall categories on the validation set of the long-tail distribution image data set, where a Many-shot class has more than 100 samples, a Medium-shot class has between 20 and 100 samples, a Few-shot class has fewer than 20 samples, and Overall refers to all classes of the validation set; when the number of training rounds reaches the set maximum T_max, training is terminated and the optimal two-channel learning model parameters are saved.
4. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, wherein: in step 3), the image data of the test set are input to the optimal two-channel learning model, and the output of the final classifier layer of the unbalanced learning channel network in the model is the final prediction result for the test-set image data.
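The weighting scheme in claims 2 and 3 can be sketched in a few lines. A minimal illustration, assuming the specific parabola α = 1 − (T/T_max)²; the claims state only that α decreases parabolically from 1 to 0, so the exact form and the function names here are assumptions:

```python
def alpha(t, t_max):
    # Assumed parabolic schedule: alpha = 1 at t = 0, alpha = 0 at t = t_max.
    # The claims specify only a parabolic decrease from 1 to 0.
    return 1.0 - (t / t_max) ** 2

def total_loss(loss_imb, loss_fs, t, t_max):
    # Weighted sum of the two channel losses: early rounds emphasize the
    # unbalanced learning channel, late rounds the small sample channel.
    a = alpha(t, t_max)
    return a * loss_imb + (1.0 - a) * loss_fs
```

At the start of training (t = 0) the total loss equals the unbalanced channel loss; at t = t_max it equals the small sample channel loss, matching the claimed emphasis shift.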
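The query-set prediction rule of claim 2 (each query sample takes the label of its nearest support sample under the distance metric d) can be sketched as follows; Euclidean distance and all names here are illustrative assumptions, since the claims leave the metric d unspecified:

```python
import math

def predict_queries(support_feats, support_labels, query_feats):
    # For each query feature, find the nearest support feature under
    # (assumed) Euclidean distance and return that support sample's label.
    preds = []
    for q in query_feats:
        dists = [math.dist(q, s) for s in support_feats]
        preds.append(support_labels[dists.index(min(dists))])
    return preds
```

In the patent's setting the features would come from the shared feature extractor f_φ; plain coordinate tuples stand in for them here.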
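The evaluation protocol of claim 3 amounts to bucketing per-sample results by the training-set frequency of each sample's true class (Many-shot > 100, Medium-shot 20–100, Few-shot < 20, plus Overall). A hedged sketch with hypothetical names; the boundary handling at exactly 20 and 100 samples is an assumption:

```python
def split_accuracies(y_true, y_pred, train_counts):
    # Bucket each prediction by the training-set frequency of its true class:
    # Many-shot (>100), Medium-shot (20-100), Few-shot (<20), plus Overall.
    buckets = {"Many-shot": [], "Medium-shot": [], "Few-shot": [], "Overall": []}
    for t, p in zip(y_true, y_pred):
        hit = 1 if t == p else 0
        n = train_counts[t]
        key = "Many-shot" if n > 100 else ("Medium-shot" if n >= 20 else "Few-shot")
        buckets[key].append(hit)
        buckets["Overall"].append(hit)
    # Report mean accuracy per non-empty bucket.
    return {k: sum(v) / len(v) for k, v in buckets.items() if v}
```

The same bucketing applied to per-class recall would give the recall figures that claim 3 also mentions.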
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010465433.XA CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010465433.XA CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111738301A true CN111738301A (en) | 2020-10-02 |
CN111738301B CN111738301B (en) | 2023-06-20 |
Family
ID=72647933
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010465433.XA Active CN111738301B (en) | 2020-05-28 | 2020-05-28 | Long-tail distribution image data identification method based on double-channel learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111738301B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112560904A (en) * | 2020-12-01 | 2021-03-26 | 中国科学技术大学 | Small sample target identification method based on self-adaptive model unknown element learning |
CN112632319A (en) * | 2020-12-22 | 2021-04-09 | 天津大学 | Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning |
CN112632320A (en) * | 2020-12-22 | 2021-04-09 | 天津大学 | Method for improving speech classification tail recognition accuracy based on long tail distribution |
CN113076873A (en) * | 2021-04-01 | 2021-07-06 | 重庆邮电大学 | Crop disease long-tail image identification method based on multi-stage training |
CN113095304A (en) * | 2021-06-08 | 2021-07-09 | 成都考拉悠然科技有限公司 | Method for weakening influence of resampling on pedestrian re-identification |
CN113255832A (en) * | 2021-06-23 | 2021-08-13 | 成都考拉悠然科技有限公司 | Method for identifying long tail distribution of double-branch multi-center |
CN113449613A (en) * | 2021-06-15 | 2021-09-28 | 北京华创智芯科技有限公司 | Multitask long-tail distribution image recognition method, multitask long-tail distribution image recognition system, electronic device and medium |
CN113569960A (en) * | 2021-07-29 | 2021-10-29 | 北京邮电大学 | Small sample image classification method and system based on domain adaptation |
CN114283307A (en) * | 2021-12-24 | 2022-04-05 | 中国科学技术大学 | Network training method based on resampling strategy |
CN114511887A (en) * | 2022-03-31 | 2022-05-17 | 北京字节跳动网络技术有限公司 | Tissue image identification method and device, readable medium and electronic equipment |
WO2022099600A1 (en) * | 2020-11-13 | 2022-05-19 | Intel Corporation | Method and system of image hashing object detection for image processing |
CN114863193A (en) * | 2022-07-07 | 2022-08-05 | 之江实验室 | Long-tail learning image classification and training method and device based on mixed batch normalization |
CN114882273A (en) * | 2022-04-24 | 2022-08-09 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN115953631A (en) * | 2023-01-30 | 2023-04-11 | 南开大学 | Long-tail small sample sonar image classification method and system based on deep migration learning |
CN116203929B (en) * | 2023-03-01 | 2024-01-05 | 中国矿业大学 | Industrial process fault diagnosis method for long tail distribution data |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB823263A (en) * | 1956-09-05 | 1959-11-11 | Atomic Energy Authority Uk | Improvements in or relating to nuclear particle discriminators |
CN108830416A (en) * | 2018-06-13 | 2018-11-16 | 四川大学 | Ad click rate prediction framework and algorithm based on user behavior |
US20190095700A1 (en) * | 2017-09-28 | 2019-03-28 | Nec Laboratories America, Inc. | Long-tail large scale face recognition by non-linear feature level domain adaption |
CN109800810A (en) * | 2019-01-22 | 2019-05-24 | 重庆大学 | A kind of few sample learning classifier construction method based on unbalanced data |
CN109961089A (en) * | 2019-02-26 | 2019-07-02 | 中山大学 | Small sample and zero sample image classification method based on metric learning and meta learning |
CN110580500A (en) * | 2019-08-20 | 2019-12-17 | 天津大学 | Character interaction-oriented network weight generation few-sample image classification method |
CN110633758A (en) * | 2019-09-20 | 2019-12-31 | 四川长虹电器股份有限公司 | Method for detecting and locating cancer region aiming at small sample or sample unbalance |
- 2020-05-28 CN CN202010465433.XA patent/CN111738301B/en active Active
Non-Patent Citations (2)
Title |
---|
ENLI LIN et al.: "Deep reinforcement learning for imbalanced classification" * |
CHEN Qiong et al.: "Transfer learning classification algorithm for imbalanced data" (不平衡数据的迁移学习分类算法) * |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022099600A1 (en) * | 2020-11-13 | 2022-05-19 | Intel Corporation | Method and system of image hashing object detection for image processing |
CN112560904A (en) * | 2020-12-01 | 2021-03-26 | 中国科学技术大学 | Small sample target identification method based on self-adaptive model unknown element learning |
CN112632319A (en) * | 2020-12-22 | 2021-04-09 | 天津大学 | Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning |
CN112632320A (en) * | 2020-12-22 | 2021-04-09 | 天津大学 | Method for improving speech classification tail recognition accuracy based on long tail distribution |
CN112632319B (en) * | 2020-12-22 | 2023-04-11 | 天津大学 | Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning |
CN113076873A (en) * | 2021-04-01 | 2021-07-06 | 重庆邮电大学 | Crop disease long-tail image identification method based on multi-stage training |
CN113095304A (en) * | 2021-06-08 | 2021-07-09 | 成都考拉悠然科技有限公司 | Method for weakening influence of resampling on pedestrian re-identification |
CN113449613A (en) * | 2021-06-15 | 2021-09-28 | 北京华创智芯科技有限公司 | Multitask long-tail distribution image recognition method, multitask long-tail distribution image recognition system, electronic device and medium |
CN113449613B (en) * | 2021-06-15 | 2024-02-27 | 北京华创智芯科技有限公司 | Multi-task long tail distribution image recognition method, system, electronic equipment and medium |
CN113255832A (en) * | 2021-06-23 | 2021-08-13 | 成都考拉悠然科技有限公司 | Method for identifying long tail distribution of double-branch multi-center |
CN113255832B (en) * | 2021-06-23 | 2021-10-01 | 成都考拉悠然科技有限公司 | Method for identifying long tail distribution of double-branch multi-center |
CN113569960A (en) * | 2021-07-29 | 2021-10-29 | 北京邮电大学 | Small sample image classification method and system based on domain adaptation |
CN113569960B (en) * | 2021-07-29 | 2023-12-26 | 北京邮电大学 | Small sample image classification method and system based on domain adaptation |
CN114283307B (en) * | 2021-12-24 | 2023-10-27 | 中国科学技术大学 | Network training method based on resampling strategy |
CN114283307A (en) * | 2021-12-24 | 2022-04-05 | 中国科学技术大学 | Network training method based on resampling strategy |
CN114511887B (en) * | 2022-03-31 | 2022-07-05 | 北京字节跳动网络技术有限公司 | Tissue image identification method and device, readable medium and electronic equipment |
CN114511887A (en) * | 2022-03-31 | 2022-05-17 | 北京字节跳动网络技术有限公司 | Tissue image identification method and device, readable medium and electronic equipment |
CN114882273A (en) * | 2022-04-24 | 2022-08-09 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN114882273B (en) * | 2022-04-24 | 2023-04-18 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN114863193A (en) * | 2022-07-07 | 2022-08-05 | 之江实验室 | Long-tail learning image classification and training method and device based on mixed batch normalization |
CN115953631B (en) * | 2023-01-30 | 2023-09-15 | 南开大学 | Long-tail small sample sonar image classification method and system based on deep migration learning |
CN115953631A (en) * | 2023-01-30 | 2023-04-11 | 南开大学 | Long-tail small sample sonar image classification method and system based on deep migration learning |
CN116203929B (en) * | 2023-03-01 | 2024-01-05 | 中国矿业大学 | Industrial process fault diagnosis method for long tail distribution data |
Also Published As
Publication number | Publication date |
---|---|
CN111738301B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111738301A (en) | Long-tail distribution image data identification method based on two-channel learning | |
CN109657584B (en) | Improved LeNet-5 fusion network traffic sign identification method for assisting driving | |
Xiang et al. | Fruit image classification based on Mobilenetv2 with transfer learning technique | |
CN112308158A (en) | Multi-source field self-adaptive model and method based on partial feature alignment | |
CN108764317B (en) | Residual convolutional neural network image classification method based on multipath feature weighting | |
CN108121975B (en) | Face recognition method combining original data and generated data | |
CN111696101A (en) | Light-weight solanaceae disease identification method based on SE-Inception | |
CN111985581A (en) | Sample-level attention network-based few-sample learning method | |
CN110942091A (en) | Semi-supervised few-sample image classification method for searching reliable abnormal data center | |
CN109344856B (en) | Offline signature identification method based on multilayer discriminant feature learning | |
CN111738303A (en) | Long-tail distribution image identification method based on hierarchical learning | |
CN113743505A (en) | Improved SSD target detection method based on self-attention and feature fusion | |
CN115205594A (en) | Long-tail image data classification method based on mixed samples | |
CN101414365B (en) | Vector code quantizer based on particle group | |
CN112766378A (en) | Cross-domain small sample image classification model method focusing on fine-grained identification | |
CN115984213A (en) | Industrial product appearance defect detection method based on deep clustering | |
CN116452862A (en) | Image classification method based on domain generalization learning | |
CN111462090A (en) | Multi-scale image target detection method | |
CN114898171A (en) | Real-time target detection method suitable for embedded platform | |
Zhang et al. | A new JPEG image steganalysis technique combining rich model features and convolutional neural networks | |
CN113255832B (en) | Method for identifying long tail distribution of double-branch multi-center | |
CN112528077B (en) | Video face retrieval method and system based on video embedding | |
CN113505120A (en) | Double-stage noise cleaning method for large-scale face data set | |
US20140343945A1 (en) | Method of visual voice recognition by following-up the local deformations of a set of points of interest of the speaker's mouth | |
CN115984946A (en) | Face recognition model forgetting method and system based on ensemble learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |