CN111738301A - Long-tail distribution image data identification method based on two-channel learning - Google Patents

Long-tail distribution image data identification method based on two-channel learning

Info

Publication number
CN111738301A
Authority
CN
China
Prior art keywords
channel
learning
unbalanced
training
small sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010465433.XA
Other languages
Chinese (zh)
Other versions
CN111738301B (en)
Inventor
陈琼
林恩禄
朱戈仁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202010465433.XA priority Critical patent/CN111738301B/en
Publication of CN111738301A publication Critical patent/CN111738301A/en
Application granted granted Critical
Publication of CN111738301B publication Critical patent/CN111738301B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/08: Neural networks; learning methods
    • Y02T10/40: Climate change mitigation technologies related to transportation; engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a long-tail distribution image data identification method based on dual-channel learning, comprising the following steps: 1) constructing a dual-channel learning model combining unbalanced learning and small sample learning; 2) back-propagating the total dual-channel learning loss to update all parameters of the dual-channel learning model, and saving the optimal model parameters; 3) inputting the test-set image data into the optimal dual-channel learning model to obtain the predicted label of each image. The invention combines unbalanced learning and small sample learning to solve the problem of long-tail distribution image data identification: the unbalanced learning channel improves identification accuracy on unbalanced datasets, the small sample learning channel improves the feature representations the model learns, and the dual-channel total loss makes the model emphasize the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage, thereby improving the identification accuracy of long-tail distribution image data as a whole.

Description

Long-tail distribution image data identification method based on two-channel learning
Technical Field
The invention relates to the technical fields of unbalanced classification, small sample learning and long-tail distribution image data identification in machine learning, and in particular to a long-tail distribution image data identification method based on dual-channel learning.
Background
Long-tail distribution image data identification generally adopts imbalance-learning techniques, which divide mainly into the data level and the algorithm level. Data-level techniques mainly include down-sampling majority-class samples, up-sampling minority-class samples, or hybrid sampling combining both. However, resampled data cannot reflect the real data distribution: down-sampling discards majority-class samples and thereby loses much valuable information in the dataset, while up-sampling causes over-fitting and brings large computational cost. Algorithm-level techniques mainly readjust the weight of each category through cost-sensitive methods; these alleviate the long-tail identification problem to some extent, but do not fully account for the fact that a large number of tail categories have only a few samples, so the identification accuracy of tail categories remains low. Other feasible ideas include transferring knowledge learned from data-rich head classes to tail classes, designing loss functions suited to long-tail distribution image data identification, and constructing more reasonable long-tail distribution image data identification models.
Data in real life often follow a long-tail distribution; however, research on long-tail distribution image data identification is still at a preliminary stage, existing identification methods all have limitations, and the identification accuracy of tail categories is not well improved.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing an effective, scientific and reasonable long-tail distribution image data identification method based on dual-channel learning. The method combines unbalanced learning and small sample learning to solve the long-tail distribution image data identification problem: the unbalanced learning channel improves the model's identification accuracy on unbalanced datasets, and the small sample learning channel improves the feature representations the model learns, enhancing its ability to identify tail-category image data. The constructed dual-channel total loss function makes the model emphasize the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage, improving the model's overall identification accuracy on long-tail image data. The method is suitable for unbalanced multi-classification and long-tail distribution image data identification, and is a general method with strong robustness.
In order to achieve the purpose, the technical scheme provided by the invention is as follows: a long-tail distribution image data identification method based on dual-channel learning comprises the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing a long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating the loss of the unbalanced learning channel; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating the small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) updating all parameters in the dual-channel learning model by utilizing the total loss of the dual-channel learning in a back propagation mode, namely training the dual-channel learning model, and storing the optimal parameters of the dual-channel learning model to obtain the optimal dual-channel learning model;
3) and inputting the image data of the test set to the optimal two-channel learning model to obtain a prediction label of the image, namely a prediction result.
In step 1), the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel is sampled from a uniform sampler, and each sample in the training set is sampled with equal probability and at most once in each training round T; define B as the number of samples sampled per batch, the input data for the samples is represented as { (x)1 imb,y1 imb),...,(xi imb,yi imb),...,(xB imb,yB imb) Where superscript imb is used to identify the unbalanced learning channel, (x)i imb,yi imb) Representing image data and label data of the ith sample, i is more than or equal to 1 and less than or equal to B;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm and transplants that algorithm's network model; it comprises three parts: a feature extractor f_φ, a classifier, and an imbalance loss function L_imb. The feature extractor f_φ extracts the feature representation z_i^imb = f_φ(x_i^imb) of the input (x_i^imb, y_i^imb); the feature z_i^imb is then input to the classifier to obtain the predicted label ŷ_i^imb; finally, the defined imbalance loss function L_imb is used to compute the unbalanced learning channel loss L^imb of the corresponding batch of samples;
The small sample learning channel sampler is as follows:
the input data of the small sample learning channel are sampled by a meta-sampler: in each training round T, the meta-sampler first randomly samples N classes from all classes of the training set, then randomly samples K_S samples and K_Q samples from each of the N classes, used respectively as the support set S = {(x_1^sup, y_1^sup), ..., (x_i^sup, y_i^sup), ..., (x_{N×K_S}^sup, y_{N×K_S}^sup)} and the query set Q = {(x_1^qry, y_1^qry), ..., (x_i^qry, y_i^qry), ..., (x_{N×K_Q}^qry, y_{N×K_Q}^qry)} of the small sample learning channel; the superscripts sup and qry identify the support set and the query set respectively; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample, 1 ≤ i ≤ N×K_S, and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample, 1 ≤ i ≤ N×K_Q; each batch of data consists of a support set S and a query set Q;
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm and transplants that algorithm's network model; it comprises three parts: a feature extractor f_φ, a distance metric d, and a loss function L_fs. The feature extractor of the small sample learning channel network and that of the unbalanced learning channel network use the same network architecture and share weight parameters. The input support-set samples (x_i^sup, y_i^sup) and query-set samples (x_i^qry, y_i^qry) first pass through the feature extractor f_φ, yielding features z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry); then, according to the distance metric d, the distances d(z_i^qry, z_j^sup) between query-set and support-set sample features are computed, and the label of the support-set sample closest to a query-set sample is taken as that query-set sample's predicted label ŷ_i^qry; finally, the defined small sample loss function L_fs is used to compute the small sample learning channel loss L^fs;
The two-channel learning total loss function is as follows:
the total loss of the two-channel learning is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss, as follows:

L_total = α · L^imb + (1 − α) · L^fs
in the formula, α is a hyper-parameter related to the training round T; α decreases parabolically with the number of training rounds T, taking the value 1 at the beginning of training and gradually decreasing to 0 as T increases, so that the dual-channel learning model focuses on the unbalanced learning channel in the early stage of training and on the small sample learning channel in the later stage.
In step 2), when training the two-channel learning model, the maximum number of training rounds T_max, the optimizer type and the initial learning rate are first set. In each round, the data sampled by the uniform sampler are input to the unbalanced learning channel network and the data sampled by the meta-sampler are input to the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are computed simultaneously and then weighted and summed to obtain the two-channel learning total loss. The total loss, combined with the optimizer, is back-propagated to update the feature-extractor parameters shared by the two channels and the classifier parameters of the unbalanced learning channel. The hyper-parameter α in the two-channel total loss function decreases parabolically with the number of training rounds, set to 1 at the beginning of training and gradually decreasing to 0 as the number of rounds increases, so that the dual-channel learning model emphasizes the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage;
the performance of the dual-channel model is evaluated with the accuracy and recall of the Many-shot, Medium-shot, Few-shot and Overall categories on the verification set of the long-tail distribution image dataset, where Many-shot categories have more than 100 samples, Medium-shot categories between 20 and 100, Few-shot categories fewer than 20, and Overall refers to all categories of the verification set. When the number of training rounds reaches the set maximum T_max, training terminates and the optimal dual-channel learning model parameters are saved.
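The Many/Medium/Few-shot split used for evaluation reduces to a threshold rule on the per-class training-sample count; a minimal sketch (function name is an illustrative assumption):

```python
def shot_group(n_samples):
    """Evaluation grouping by per-class sample count:
    >100 -> Many-shot, 20..100 -> Medium-shot, <20 -> Few-shot."""
    if n_samples > 100:
        return "Many-shot"
    if n_samples >= 20:
        return "Medium-shot"
    return "Few-shot"
```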
In step 3), inputting the image data of the test set into an optimal two-channel learning model, wherein the output of the last layer of classifier of the unbalanced learning channel network in the model is the final prediction result of the image data of the test set.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention combines an unbalanced learning channel with a small sample learning channel. Compared with using an unbalanced learning method alone, the added small sample learning channel improves the feature representation, enhances intra-class compactness, and improves the dual-channel learning model's ability to identify tail-category image data with scarce samples.
2. The uniform sampler adopted by the unbalanced learning channel preserves the original distribution of the long-tail distribution image dataset, which benefits feature representation learning.
3. The small sample learning channel uses the meta-sampler to meta-sample all classes of the training set of the long-tail distribution image dataset, sampling small amounts of data from different classes in different rounds as meta-tasks, so that the dual-channel learning model learns to adapt to recognition tasks with few samples while making full use of the dataset.
4. The constructed two-channel total loss function is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss. The dual-channel learning model emphasizes the unbalanced learning channel in the early stage of training to learn a good decision boundary, and emphasizes the small sample learning channel in the later stage; by pulling together same-class samples and pushing apart different-class samples, it gradually corrects the feature representation damaged by unbalanced learning while ensuring that the decision boundary learned by the unbalanced learning channel is not damaged, improving the model's overall identification accuracy on long-tail distribution image data.
5. The long-tail distribution image data identification method based on two-channel learning uses the output of the last classifier layer of the unbalanced learning channel network as the final prediction result. When training the dual-channel learning model, its performance is evaluated with the accuracy and recall of the Many-shot, Medium-shot and Few-shot categories on the verification set of the long-tail distribution image dataset, which better tracks changes in the model's true performance and makes the trained model more reliable.
Drawings
FIG. 1 is a diagram illustrating an example of input data according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a two-channel learning model structure according to the present invention.
Detailed Description
The present invention will be further described with reference to the following specific examples.
The Places365 dataset is a large image dataset covering 365 scene categories; each category contains no more than 5000 training pictures, 50 verification pictures and 900 test pictures. The original Places365 dataset is down-sampled according to a Pareto distribution with power-exponent parameter 6; the training set of the resulting long-tail distribution image dataset contains 62500 pictures in total, with at most 4980 and at least 5 pictures per class. The constructed long-tail distribution image dataset Places-LT is shown in FIG. 1. In the verification set of the long-tail distribution image dataset, 20 pictures are sampled per class to track and evaluate the performance of the two-channel learning model. In the test set, 50 pictures are sampled per class to evaluate and compare the dual-channel learning model against other image data identification models.
For the constructed long-tail distribution image dataset, the data preprocessing is as follows: all pictures are first resized to 256 × 256. During training, pictures are randomly cropped to 224 × 224, horizontally flipped with 50% probability, and randomly jittered in brightness, contrast and saturation for augmentation; during verification and testing, images are center-cropped to 224 × 224 without further augmentation.
As shown in fig. 2, the method for identifying long-tail distribution image data based on two-channel learning according to this embodiment includes the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function, wherein:
unbalanced learning channel sampler: the input data of the unbalanced learning channel are sampled by a uniform sampler. In each training round T, every sample in the training set of the long-tail distribution image dataset is sampled with equal probability and at most once. Define B as the number of samples per batch; B is set to 128 in this embodiment, and the sampled input data are denoted {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample (1 ≤ i ≤ B).
Unbalanced learning channel network: based on an unbalanced classification algorithm whose network model can be transplanted. In this embodiment, the unbalanced learning channel network adopts the LDAM unbalanced classification network, in which the feature extractor f_φ adopts a ResNet10 residual network, the classifier adopts a fully-connected network, and the imbalance loss function L_imb adopts the LDAM loss. The feature extractor f_φ first extracts the feature representation z_i^imb = f_φ(x_i^imb) of the input (x_i^imb, y_i^imb); the feature z_i^imb is then input to the classifier to obtain the predicted label ŷ_i^imb; finally, the LDAM loss function is used to compute the unbalanced learning channel loss L^imb of the batch of samples.
For a sample x_i^imb of class y_i^imb, denote by z_j the classifier's output (logit) for class j and by z_{y_i^imb} the output for the true class; denote by n_{y_i^imb} the number of training-set samples of class y_i^imb. The hyper-parameter C is set to 0.5, and the LDAM loss function is given by:

L_imb = −log( exp(z_{y_i^imb} − Δ_{y_i^imb}) / ( exp(z_{y_i^imb} − Δ_{y_i^imb}) + Σ_{j ≠ y_i^imb} exp(z_j) ) )

wherein:

Δ_j = C / n_j^{1/4}
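A minimal pure-Python sketch of the LDAM loss for a single sample (the embodiment uses a PyTorch implementation; the function and argument names here are illustrative assumptions):

```python
import math

def ldam_loss(logits, y, class_counts, C=0.5):
    """LDAM loss for one sample: subtract the class-dependent margin
    delta_j = C / n_j**(1/4) from the true-class logit, then apply
    cross-entropy, computed in a numerically stable way."""
    delta = C / class_counts[y] ** 0.25
    shifted = [z - delta if j == y else z for j, z in enumerate(logits)]
    m = max(shifted)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in shifted))
    return log_sum_exp - shifted[y]  # = -log softmax_y(shifted logits)
```

Because the margin grows as the class count shrinks, rarer classes must be separated from the rest by a wider gap before their loss vanishes.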
small sample learning channel sampler: the input data of the small sample learning channel are sampled by a meta-sampler. In each training round T, the meta-sampler first randomly samples N = 5 classes from all classes of the training set of the long-tail distribution image dataset, then randomly samples K_S = 1 sample and K_Q = 1 sample from each of the 5 classes, used respectively as the support set S = {(x_1^sup, y_1^sup), ..., (x_i^sup, y_i^sup), ...} and the query set Q = {(x_1^qry, y_1^qry), ..., (x_i^qry, y_i^qry), ...} of the small sample learning channel, where the superscripts sup and qry identify the support set and the query set; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample (1 ≤ i ≤ N×K_S), and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample (1 ≤ i ≤ N×K_Q). Each batch of data consists of a support set S and a query set Q.
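The meta-sampling step can be sketched in plain Python (the mapping from class id to sample indices is an assumed layout, and the names are illustrative):

```python
import random

def sample_episode(labels_by_class, n_way=5, k_support=1, k_query=1):
    """One episodic batch: pick n_way classes, then draw
    k_support + k_query disjoint sample indices per class; the first
    k_support go to the support set, the rest to the query set."""
    classes = random.sample(sorted(labels_by_class), n_way)
    support, query = [], []
    for c in classes:
        idx = random.sample(labels_by_class[c], k_support + k_query)
        support += [(i, c) for i in idx[:k_support]]
        query += [(i, c) for i in idx[k_support:]]
    return support, query
```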
Small sample learning channel network: based on a small sample learning algorithm whose network model can be transplanted. In this embodiment, the small sample learning channel adopts a prototype network model; its feature extractor f_φ uses the same ResNet10 network architecture as the feature extractor of the unbalanced learning channel and shares its weight parameters, and the small sample loss function L_fs adopts the cross-entropy loss. The input support-set samples (x_i^sup, y_i^sup) and query-set samples (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to obtain the feature representations z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry). After the features of the input batch are extracted, the feature center c_k of each class k is computed over the support-set sample set S_k of that class:

c_k = (1 / |S_k|) Σ_{(x_i^sup, y_i^sup) ∈ S_k} f_φ(x_i^sup)

Then, from the Euclidean distance d(z_i^qry, c_k) between the query-set sample feature z_i^qry and each class feature center c_k, the probability that query-set sample x_i^qry belongs to class k is computed:

p(y = k | x_i^qry) = exp(−d(z_i^qry, c_k)) / Σ_{k'} exp(−d(z_i^qry, c_{k'}))

Finally, the small sample learning channel loss L^fs is computed according to the small sample loss function L_fs:

L^fs = −(1 / (N·K_Q)) Σ_i log p(y = y_i^qry | x_i^qry)
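The prototype computation and distance-softmax classification can be sketched in plain Python, with feature vectors as lists (names are illustrative assumptions):

```python
import math

def prototypes(support):
    """Class prototypes: the mean feature vector per class.
    `support` is a list of (feature_vector, class_id) pairs."""
    sums, counts = {}, {}
    for z, y in support:
        acc = sums.setdefault(y, [0.0] * len(z))
        sums[y] = [a + b for a, b in zip(acc, z)]
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def proto_probs(z, protos):
    """Softmax over negative squared Euclidean distance to each prototype."""
    d2 = {y: sum((a - b) ** 2 for a, b in zip(z, c)) for y, c in protos.items()}
    m = min(d2.values())
    exps = {y: math.exp(-(v - m)) for y, v in d2.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}
```

The cross-entropy channel loss is then the mean of −log of the probability assigned to each query sample's true class.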
two-channel learning total loss function: the total loss of the two-channel learning is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss:

L_total = α · L^imb + (1 − α) · L^fs

where α is a hyper-parameter related to the training round T. Defining the total number of training rounds as T_max, α and T satisfy a parabolic decreasing relation: α equals 1 at the beginning of training and decreases to 0 at T = T_max, e.g. α = 1 − (T / T_max)².
2) and (3) updating all parameters in the dual-channel learning model by utilizing the total loss of the dual-channel learning and back propagation, namely training the dual-channel learning model, and storing the optimal parameters of the dual-channel learning model to obtain the optimal dual-channel learning model.
In the process of training the two-channel learning model, the maximum number of training rounds T_max is set; the SGD optimizer is used with the learning rate initialized to 0.1, decayed by a factor of 0.1 when the number of training rounds T reaches 70 and by a further factor of 0.1 at 90. The hyper-parameter α in the two-channel total loss function decreases parabolically with the number of training rounds T, so that the two-channel learning model emphasizes the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage.
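The step learning-rate schedule described above (decay by 0.1 at rounds 70 and 90) can be sketched as:

```python
def learning_rate(t, base_lr=0.1):
    """SGD step schedule: multiply the base rate by 0.1 at round 70
    and by a further 0.1 at round 90."""
    lr = base_lr
    if t >= 70:
        lr *= 0.1
    if t >= 90:
        lr *= 0.1
    return lr
```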
When training the dual-channel learning model, its performance is evaluated with the accuracy and recall of the Many-shot, Medium-shot, Few-shot and Overall categories on the verification set of the long-tail distribution image dataset. Many-shot categories have more than 100 samples, Medium-shot categories between 20 and 100, and Few-shot categories fewer than 20; Overall refers to all categories of the verification set. When the number of training rounds T reaches the set maximum T_max, training terminates and the optimal dual-channel learning model parameters are saved.
3) And inputting the image data of the test set of the long-tail distribution image data set into the optimal two-channel learning model stored in the last step, wherein the output of the last layer of classifier of the unbalanced learning channel network in the model is the final prediction result of the image data of the test set.
The following table compares the two-channel learning model with other image data recognition models on the Places-LT dataset. Among the compared models, DC-LTR denotes the two-channel learning model; except for the Plain Model, a naive deep convolutional neural network classifier, the other models are current mainstream models for unbalanced or long-tail distribution image datasets. For fair comparison, all models are trained on the Places-LT training set with a ResNet10 network structure, and the Class-Balanced Accuracy and Macro F-measure of the Many-shot, Medium-shot, Few-shot and Overall categories are then computed on the Places-LT test set, where Class-Balanced Accuracy denotes the average per-class recall and Macro F-measure denotes the average per-class accuracy.
TABLE 1 results of comparative experiments on Places-LT data set
(Table 1 is reproduced as an image in the original publication.)
From the experimental results, the Class-Balanced Accuracy and Macro F-measure of the dual-channel learning model DC-LTR on the Few-shot and Overall categories clearly exceed those of the other compared models, showing that the dual-channel learning model improves the identification accuracy of data-scarce tail categories and of long-tail distribution image data as a whole. DC-LTR shows the same advantage on the Medium-shot category; its result on the Many-shot category drops slightly but remains comparable to the other imbalance-based or long-tail identification models, indicating that the model improves tail-category accuracy without harming the identification accuracy of data-rich head categories. The comparison of different models verifies the effectiveness and superiority of the dual-channel learning model.
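Class-Balanced Accuracy, i.e. the mean per-class recall reported above, can be computed as follows (a sketch; the function name is illustrative):

```python
def class_balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: for each class, the fraction of its samples
    predicted correctly, averaged over classes with equal weight."""
    per_class = {}
    for t, p in zip(y_true, y_pred):
        hits, total = per_class.get(t, (0, 0))
        per_class[t] = (hits + (t == p), total + 1)
    return sum(h / n for h, n in per_class.values()) / len(per_class)
```

Weighting every class equally makes the metric sensitive to tail-category performance, unlike plain accuracy, which head categories dominate under a long-tail distribution.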
The model of the invention is implemented in Python 3.7 on the PyTorch deep learning framework; the experiments run on two NVIDIA GeForce GTX 1080Ti GPUs with 22 GB of video memory in total.
The long tail identification method for other data sets is similar to this method.
In conclusion, the invention combines unbalanced learning and small sample learning to solve the long-tail distribution image data identification problem. The unbalanced learning channel corrects the tendency of general algorithms to over-favor head categories while learning a good classification decision boundary, improving the dual-channel learning model's identification accuracy on unbalanced datasets; the small sample learning channel restores the feature-representation capability damaged by the unbalanced learning channel by pulling together same-class samples and pushing apart different-class samples, enhancing the model's recognition of tail-category image data; and the constructed two-channel total loss function makes the model emphasize the unbalanced learning channel in the early stage of training and the small sample learning channel in the later stage, improving the model's overall identification accuracy on long-tail distribution image data. The invention therefore has practical application value and is worth popularizing.
The above-mentioned embodiment is merely a preferred embodiment of the present invention; the scope of the present invention is not limited thereto, and changes made according to the shape and principle of the present invention shall fall within its protection scope.

Claims (4)

1. A long tail distribution image data identification method based on dual-channel learning is characterized by comprising the following steps:
1) constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing a long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating the loss of the unbalanced learning channel; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating the small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) updating all parameters in the dual-channel learning model by utilizing the total loss of the dual-channel learning in a back propagation mode, namely training the dual-channel learning model, and storing the optimal parameters of the dual-channel learning model to obtain the optimal dual-channel learning model;
3) inputting the image data of the test set into the optimal two-channel learning model to obtain the predicted labels of the images, namely the prediction results.
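For illustration, step 1) of claim 1 can be sketched as a single training step. Everything here is a hypothetical stand-in interface (the two samplers and the two channel loss computations are passed in as callables, and the weight alpha is given); the actual model is built on deep-learning components, but the control flow is the same.

```python
from typing import Any, Callable, Tuple

def dual_channel_step(sample_uniform: Callable[[], Tuple[Any, Any]],
                      sample_episode: Callable[[], Tuple[Any, Any]],
                      imb_channel_loss: Callable[[Any, Any], float],
                      fs_channel_loss: Callable[[Any, Any], float],
                      alpha: float) -> float:
    """One dual-channel training step: sample data for each channel,
    compute both channel losses, and return their weighted sum."""
    # Unbalanced learning channel: a uniformly sampled batch.
    x_imb, y_imb = sample_uniform()
    loss_imb = imb_channel_loss(x_imb, y_imb)
    # Small-sample learning channel: a support/query episode.
    support, query = sample_episode()
    loss_fs = fs_channel_loss(support, query)
    # In the real model, this total loss drives one backward pass.
    return alpha * loss_imb + (1.0 - alpha) * loss_fs
```

With alpha = 0.75, a batch loss of 2.0 and an episode loss of 4.0, the step returns 0.75 × 2.0 + 0.25 × 4.0 = 2.5.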
2. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, wherein: in step 1), the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel are sampled by a uniform sampler: in each training round T, every sample in the training set is sampled with equal probability and at most once; defining B as the number of samples per batch, the sampled input data are denoted {(x_1^imb, y_1^imb), ..., (x_i^imb, y_i^imb), ..., (x_B^imb, y_B^imb)}, where the superscript imb identifies the unbalanced learning channel, and (x_i^imb, y_i^imb) denotes the image data and label data of the i-th sample, 1 ≤ i ≤ B;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm, transplanting the network model of that algorithm, and comprises three parts: a feature extractor f_φ, a classifier, and an unbalanced loss function L_imb; the feature extractor f_φ extracts the feature representation z_i^imb = f_φ(x_i^imb) of the input data (x_i^imb, y_i^imb); the feature z_i^imb is then input to the classifier to obtain the predicted label ŷ_i^imb; finally, the defined unbalanced loss function L_imb is used to calculate the unbalanced learning channel loss of the corresponding batch of samples;
the small sample learning channel sampler is as follows:
the input data of the small sample learning channel are sampled by a meta-sampler: in each training round T, the meta-sampler first randomly samples N classes from all classes of the training set, and then randomly samples K_S samples and K_Q samples from each of the N classes, which serve respectively as the support set S = {(x_i^sup, y_i^sup)} (1 ≤ i ≤ N×K_S) and the query set Q = {(x_i^qry, y_i^qry)} (1 ≤ i ≤ N×K_Q) of the small sample learning channel; the superscripts sup and qry identify the support set and the query set respectively; (x_i^sup, y_i^sup) denotes the image data and label data of the i-th support-set sample, and (x_i^qry, y_i^qry) denotes the image data and label data of the i-th query-set sample; each batch of data consists of a support set S and a query set Q;
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm, transplanting the network model of that algorithm, and comprises three parts: a feature extractor f_φ, a distance metric d, and a loss function L_fs; the feature extractor of the small sample learning channel network and the feature extractor of the unbalanced learning channel network use the same network architecture and share weight parameters; the input support-set sample data (x_i^sup, y_i^sup) and query-set sample data (x_i^qry, y_i^qry) first pass through the feature extractor f_φ to extract the features z_i^sup = f_φ(x_i^sup) and z_i^qry = f_φ(x_i^qry); then, according to the distance metric d, the distance d(z_i^qry, z_i^sup) between each query-set sample feature and the support-set sample features is computed, and the label of the support-set sample closest to a query-set sample is taken as that query-set sample's predicted label ŷ_i^qry; finally, the defined small sample loss function L_fs is used to calculate the small sample learning channel loss;
the two-channel learning total loss function is as follows:
the two-channel learning total loss is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss:

L_total = α · L_imb + (1 − α) · L_fs

where α is a hyper-parameter related to the training round T; α and the number of training rounds T are in a parabolically decreasing relationship, with α taking the value 1 at the beginning of training and gradually decreasing to 0 as T increases, so that the dual-channel learning model emphasizes the unbalanced learning channel early in training and the small sample learning channel late in training.
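The meta-sampler and the nearest-support prediction rule of the small sample learning channel can be illustrated with a self-contained Python sketch. All names here are hypothetical, and the feature extractor is omitted (features are passed in directly); the claim only fixes the N-class, K_S-support, K_Q-query episode structure and the nearest-support labeling rule.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_support, k_query):
    """Hypothetical meta-sampler: pick N classes, then K_S support and
    K_Q query sample indices from each class (cf. N, K_S, K_Q above)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = random.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = random.sample(by_class[c], k_support + k_query)
        support += [(i, c) for i in picked[:k_support]]
        query += [(i, c) for i in picked[k_support:]]
    return support, query

def predict_by_nearest_support(z_query, support_feats, support_labels, dist):
    """Predicted label = label of the support feature nearest to the
    query feature under the distance metric d (sketch)."""
    nearest = min(range(len(support_feats)),
                  key=lambda j: dist(z_query, support_feats[j]))
    return support_labels[nearest]
```

For example, with an absolute-difference metric over 1-D features, `predict_by_nearest_support(0.9, [0.0, 1.0], ['a', 'b'], lambda u, v: abs(u - v))` returns `'b'`.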
3. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, wherein: in step 2), when training the dual-channel learning model, the maximum number of training rounds T_max, the optimizer type and the initial learning rate are first set; in each round, the data sampled by the uniform sampler are input to the unbalanced learning channel network and the data sampled by the meta-sampler are input to the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are computed simultaneously, then weighted and summed to obtain the dual-channel learning total loss; the total loss is combined with the optimizer to update, by back propagation, the parameters of the feature extractor shared by the two channels and the parameters of the unbalanced learning channel classifier; the hyper-parameter α in the total loss function and the number of training rounds are in a parabolically decreasing relationship, with α taking the value 1 at the beginning of training and gradually decreasing to 0 as the number of rounds increases, so that the model emphasizes the unbalanced learning channel early in training and the small sample learning channel late in training;
the performance of the dual-channel model is evaluated using the accuracy and recall of the Many-shot, Medium-shot, Few-shot and Overall categories on the validation set of the long-tail distribution image data set, where Many-shot categories have more than 100 samples, Medium-shot categories have between 20 and 100 samples, Few-shot categories have fewer than 20 samples, and the Overall category refers to all categories of the validation set; when the number of training rounds reaches the set maximum T_max, training is terminated and the optimal dual-channel learning model parameters are saved.
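The Many-/Medium-/Few-shot evaluation split described in claim 3 can be sketched as follows. The thresholds (more than 100, 20 to 100, fewer than 20 training samples) come directly from the claim; the function names and the dictionary-based layout are hypothetical.

```python
from collections import Counter

def shot_category(n_train_samples: int) -> str:
    """Bucket a class by its number of training samples, per the claim."""
    if n_train_samples > 100:
        return "many-shot"
    if n_train_samples >= 20:
        return "medium-shot"
    return "few-shot"

def per_category_accuracy(y_true, y_pred, train_counts):
    """Accuracy over the Many-/Medium-/Few-shot buckets and Overall.

    train_counts maps each class label to its training-set size."""
    hits, totals = Counter(), Counter()
    for yt, yp in zip(y_true, y_pred):
        cat = shot_category(train_counts[yt])
        totals[cat] += 1
        totals["overall"] += 1
        if yt == yp:
            hits[cat] += 1
            hits["overall"] += 1
    return {c: hits[c] / totals[c] for c in totals}
```

A model selection loop would track these validation numbers each round and keep the parameters of the best round up to T_max.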
4. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, wherein: in step 3), the image data of the test set are input into the optimal dual-channel learning model, and the output of the last classifier layer of the unbalanced learning channel network in the model is the final prediction result for the test-set image data.
CN202010465433.XA 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning Active CN111738301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010465433.XA CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010465433.XA CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Publications (2)

Publication Number Publication Date
CN111738301A true CN111738301A (en) 2020-10-02
CN111738301B CN111738301B (en) 2023-06-20

Family

ID=72647933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010465433.XA Active CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Country Status (1)

Country Link
CN (1) CN111738301B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB823263A (en) * 1956-09-05 1959-11-11 Atomic Energy Authority Uk Improvements in or relating to nuclear particle discriminators
US20190095700A1 (en) * 2017-09-28 2019-03-28 Nec Laboratories America, Inc. Long-tail large scale face recognition by non-linear feature level domain adaption
CN108830416A (en) * 2018-06-13 2018-11-16 四川大学 Ad click rate prediction framework and algorithm based on user behavior
CN109800810A (en) * 2019-01-22 2019-05-24 重庆大学 A kind of few sample learning classifier construction method based on unbalanced data
CN109961089A (en) * 2019-02-26 2019-07-02 中山大学 Small sample and zero sample image classification method based on metric learning and meta learning
CN110580500A (en) * 2019-08-20 2019-12-17 天津大学 Character interaction-oriented network weight generation few-sample image classification method
CN110633758A (en) * 2019-09-20 2019-12-31 四川长虹电器股份有限公司 Method for detecting and locating cancer region aiming at small sample or sample unbalance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ENLI LIN 等: "Deep reinforcement learning for imbalanced classification" *
CHEN Qiong et al.: "Transfer Learning Classification Algorithms for Imbalanced Data" *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022099600A1 (en) * 2020-11-13 2022-05-19 Intel Corporation Method and system of image hashing object detection for image processing
CN112560904A (en) * 2020-12-01 2021-03-26 中国科学技术大学 Small sample target identification method based on self-adaptive model unknown element learning
CN112632319A (en) * 2020-12-22 2021-04-09 天津大学 Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning
CN112632320A (en) * 2020-12-22 2021-04-09 天津大学 Method for improving speech classification tail recognition accuracy based on long tail distribution
CN112632319B (en) * 2020-12-22 2023-04-11 天津大学 Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning
CN113076873A (en) * 2021-04-01 2021-07-06 重庆邮电大学 Crop disease long-tail image identification method based on multi-stage training
CN113095304A (en) * 2021-06-08 2021-07-09 成都考拉悠然科技有限公司 Method for weakening influence of resampling on pedestrian re-identification
CN113449613A (en) * 2021-06-15 2021-09-28 北京华创智芯科技有限公司 Multitask long-tail distribution image recognition method, multitask long-tail distribution image recognition system, electronic device and medium
CN113449613B (en) * 2021-06-15 2024-02-27 北京华创智芯科技有限公司 Multi-task long tail distribution image recognition method, system, electronic equipment and medium
CN113255832A (en) * 2021-06-23 2021-08-13 成都考拉悠然科技有限公司 Method for identifying long tail distribution of double-branch multi-center
CN113255832B (en) * 2021-06-23 2021-10-01 成都考拉悠然科技有限公司 Method for identifying long tail distribution of double-branch multi-center
CN113569960A (en) * 2021-07-29 2021-10-29 北京邮电大学 Small sample image classification method and system based on domain adaptation
CN113569960B (en) * 2021-07-29 2023-12-26 北京邮电大学 Small sample image classification method and system based on domain adaptation
CN114283307B (en) * 2021-12-24 2023-10-27 中国科学技术大学 Network training method based on resampling strategy
CN114283307A (en) * 2021-12-24 2022-04-05 中国科学技术大学 Network training method based on resampling strategy
CN114511887B (en) * 2022-03-31 2022-07-05 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment
CN114511887A (en) * 2022-03-31 2022-05-17 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment
CN114882273A (en) * 2022-04-24 2022-08-09 电子科技大学 Visual identification method, device, equipment and storage medium applied to narrow space
CN114882273B (en) * 2022-04-24 2023-04-18 电子科技大学 Visual identification method, device, equipment and storage medium applied to narrow space
CN114863193A (en) * 2022-07-07 2022-08-05 之江实验室 Long-tail learning image classification and training method and device based on mixed batch normalization
CN115953631B (en) * 2023-01-30 2023-09-15 南开大学 Long-tail small sample sonar image classification method and system based on deep migration learning
CN115953631A (en) * 2023-01-30 2023-04-11 南开大学 Long-tail small sample sonar image classification method and system based on deep migration learning
CN116203929B (en) * 2023-03-01 2024-01-05 中国矿业大学 Industrial process fault diagnosis method for long tail distribution data

Also Published As

Publication number Publication date
CN111738301B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN111738301A (en) Long-tail distribution image data identification method based on two-channel learning
CN109657584B (en) Improved LeNet-5 fusion network traffic sign identification method for assisting driving
Xiang et al. Fruit image classification based on Mobilenetv2 with transfer learning technique
CN112308158A (en) Multi-source field self-adaptive model and method based on partial feature alignment
CN108764317B (en) Residual convolutional neural network image classification method based on multipath feature weighting
CN108121975B (en) Face recognition method combining original data and generated data
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN111985581A (en) Sample-level attention network-based few-sample learning method
CN110942091A (en) Semi-supervised few-sample image classification method for searching reliable abnormal data center
CN109344856B (en) Offline signature identification method based on multilayer discriminant feature learning
CN111738303A (en) Long-tail distribution image identification method based on hierarchical learning
CN113743505A (en) Improved SSD target detection method based on self-attention and feature fusion
CN115205594A (en) Long-tail image data classification method based on mixed samples
CN101414365B (en) Vector code quantizer based on particle group
CN112766378A (en) Cross-domain small sample image classification model method focusing on fine-grained identification
CN115984213A (en) Industrial product appearance defect detection method based on deep clustering
CN116452862A (en) Image classification method based on domain generalization learning
CN111462090A (en) Multi-scale image target detection method
CN114898171A (en) Real-time target detection method suitable for embedded platform
Zhang et al. A new JPEG image steganalysis technique combining rich model features and convolutional neural networks
CN113255832B (en) Method for identifying long tail distribution of double-branch multi-center
CN112528077B (en) Video face retrieval method and system based on video embedding
CN113505120A (en) Double-stage noise cleaning method for large-scale face data set
US20140343945A1 (en) Method of visual voice recognition by following-up the local deformations of a set of points of interest of the speaker's mouth
CN115984946A (en) Face recognition model forgetting method and system based on ensemble learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant