CN111738301B - Long-tail distribution image data identification method based on double-channel learning - Google Patents

Long-tail distribution image data identification method based on double-channel learning

Info

Publication number
CN111738301B
CN111738301B (application CN202010465433.XA)
Authority
CN
China
Prior art keywords
learning
channel
unbalanced
small sample
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010465433.XA
Other languages
Chinese (zh)
Other versions
CN111738301A (en)
Inventor
陈琼
林恩禄
朱戈仁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202010465433.XA
Publication of CN111738301A
Application granted
Publication of CN111738301B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a long-tail distributed image data recognition method based on two-channel learning, comprising the following steps: 1) construct a two-channel learning model combining unbalanced learning and small sample learning; 2) update all parameters in the two-channel learning model using the two-channel learning total loss, and save the optimal parameters of the two-channel learning model; 3) input the test set image data to the optimal two-channel learning model to obtain the predicted labels of the images. The invention combines unbalanced learning and small sample learning to solve the long-tail distributed image data recognition problem: the unbalanced learning channel improves recognition accuracy on unbalanced datasets, the small sample learning channel improves the feature representation learned by the model, and the model focuses on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage, so that recognition accuracy on long-tail distributed image data is improved overall.

Description

Long-tail distribution image data identification method based on double-channel learning
Technical Field
The invention relates to the technical fields of unbalanced classification, small sample learning and long-tail distribution image data identification in machine learning, in particular to a long-tail distribution image data identification method based on double-channel learning.
Background
Long-tail distributed image data recognition generally adopts techniques from unbalanced learning, which divide mainly into the data level and the algorithm level. Data-level techniques mainly include downsampling the majority classes, upsampling the minority classes, or mixed sampling combining both. However, resampled data cannot reflect the real data distribution: downsampling discards many majority-class samples and thus loses much valuable information in the dataset, while upsampling causes overfitting and heavy computational cost. Algorithm-level techniques mainly readjust the weight of each category through cost-sensitive methods; these alleviate the long-tail recognition problem to some extent, but do not fully account for the fact that a large number of tail categories have very few samples, so tail-category recognition accuracy remains low. In addition, feasible lines of thought are to learn knowledge from the data-rich head classes and transfer it to the tail classes, to design loss functions suited to long-tail distributed image data recognition, and to construct more reasonable long-tail recognition models.
Data in real life often follow a long-tail distribution; however, research on long-tail distributed image data recognition is still at a preliminary stage, and existing methods all have limitations and cannot substantially improve the recognition accuracy of tail categories.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art and provides an effective, scientific, and reasonable long-tail distributed image data recognition method based on two-channel learning. Unbalanced learning and small sample learning are combined to solve the long-tail recognition problem: the unbalanced learning channel improves the model's recognition accuracy on unbalanced datasets, while the small sample learning channel improves the feature representation learned by the model and enhances its ability to recognize tail-category image data. The constructed two-channel learning total loss function makes the model focus on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage, so that recognition accuracy on long-tail image data improves overall. The proposed method is applicable to both unbalanced multi-class classification and long-tail distributed image data recognition, and is a general method with strong robustness.
In order to achieve the above purpose, the technical scheme provided by the invention is as follows: a long tail distribution image data identification method based on double-channel learning comprises the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing the long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating unbalanced learning channel loss; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
2) Using the two-channel learning total loss, update all parameters in the two-channel learning model by back propagation, i.e., train the two-channel learning model, and save the optimal two-channel learning model parameters to obtain the optimal two-channel learning model;
3) Inputting the image data of the test set to an optimal two-channel learning model, and obtaining a prediction label, namely a prediction result, of the image.
In step 1), the case of the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel is sampled from a uniform sampler; in each training round $T$, each sample in the training set is sampled with equal probability and at most once. Define $B$ as the number of samples sampled per batch; the sampled input data is represented as $\{(x_1^{imb}, y_1^{imb}), \ldots, (x_i^{imb}, y_i^{imb}), \ldots, (x_B^{imb}, y_B^{imb})\}$, where the superscript $imb$ identifies the unbalanced learning channel and $(x_i^{imb}, y_i^{imb})$ denotes the image data and label data of the $i$-th sample, $1 \le i \le B$;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithmThe network model of the unbalanced classification algorithm is transplanted and comprises a feature extractor f φ Classifier
Figure BDA0002512483940000031
And an unbalance loss function L imb Three parts, the feature extractor f φ For extracting input data (x i imb ,y i imb ) Characteristic representation of +.>
Figure BDA0002512483940000032
The feature representation is then ++>
Figure BDA0002512483940000033
Input to classifier->
Figure BDA0002512483940000034
Get predictive tag->
Figure BDA0002512483940000035
Finally combine the defined unbalance loss function L imb Calculating unbalanced learning channel loss of corresponding batch of samples>
Figure BDA0002512483940000036
The small sample learning channel sampler is as follows:
the input data of the small sample learning channel is sampled from a meta-sampler. In each training round $T$, the meta-sampler first randomly samples $N$ categories among all categories of the training set, and then randomly samples $K_S$ samples and $K_Q$ samples within each of the $N$ categories, which serve respectively as the support set $S = \{(x_i^{sup}, y_i^{sup})\}_{i=1}^{N \times K_S}$ and the query set $Q = \{(x_i^{qry}, y_i^{qry})\}_{i=1}^{N \times K_Q}$ of the small sample learning channel, where the superscripts $sup$ and $qry$ identify the support set and the query set respectively; $(x_i^{sup}, y_i^{sup})$ denotes the image data and label data of the $i$-th sample of the support set, $1 \le i \le N \times K_S$, and $(x_i^{qry}, y_i^{qry})$ denotes the image data and label data of the $i$-th sample of the query set, $1 \le i \le N \times K_Q$; the data of each batch consists of the support set $S$ and the query set $Q$;
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm, and a network model of the small sample learning algorithm is transplanted and comprises a feature extractor f φ Distance gauge d and loss function L fs Three parts, wherein the feature extractor adopted by the small sample learning channel network and the feature extractor adopted by the unbalanced learning channel network use the same network architecture and share weight parameters; input support set sample data (x i sup ,y i sup ) And query set sample data (x i qry ,y i qry ) First pass feature extractor f φ Extracting feature z i sup =f φ (x i sup ) And z i qry =f φ (x i qry ) Then, according to the distance scale d, calculating the distance d (x) between the sample characteristics of the query set and the sample characteristics of the support set i qry ,y i sup ) The label of the support set sample closest to the query set sample is the prediction label of the query set sample
Figure BDA0002512483940000041
Finally according to the defined small sample loss function L fs Calculating the small sample learning channel loss->
Figure BDA0002512483940000042
The two-channel learning total loss function is as follows:
the two-channel learning total loss is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss, as follows:

$$L = \alpha L_{imb} + (1 - \alpha) L_{fs}$$
wherein, alpha is a super parameter related to the training round T, alpha and the training round number T are in parabolic decreasing relation, the value is 1 at the beginning of training, and the value gradually decreases to 0 along with the increase of the training round number T, so that the two-channel learning model focuses on an unbalanced learning channel in the early stage of training and focuses on a small sample learning channel in the later stage of training.
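The weighted combination described above can be sketched in a few lines of Python (a minimal illustration; the function name is ours, not the patent's):

```python
def two_channel_total_loss(loss_imb, loss_fs, alpha):
    """Two-channel total loss: L = alpha * L_imb + (1 - alpha) * L_fs.

    alpha = 1 recovers pure unbalanced-channel training (early rounds);
    alpha = 0 recovers pure small-sample-channel training (late rounds).
    """
    return alpha * loss_imb + (1.0 - alpha) * loss_fs
```

In a real training loop the two arguments would be the framework's loss tensors, so gradients flow into both channel networks through the same weighted sum.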
In step 2), when training the two-channel learning model, a maximum training round number $T_{max}$, an optimizer type, and an initial learning rate are first set. In each round, the data sampled by the uniform sampler is input to the unbalanced learning channel network and the data sampled by the meta-sampler is input to the small sample learning channel network; the unbalanced learning channel loss and the small sample learning channel loss are computed simultaneously and then weighted and summed to obtain the two-channel learning total loss. Combined with the optimizer, back propagation of the total loss updates the feature extractor parameters shared by the two channels and the classifier parameters of the unbalanced learning channel. The hyper-parameter $\alpha$ in the two-channel learning total loss function decreases parabolically with the training round number: it is 1 at the beginning of training and gradually decreases to 0 as the round number grows, so that the two-channel learning model focuses on the unbalanced learning channel early in training and on the small sample learning channel later in training;
the performance of the two-channel model is evaluated using the accuracy and recall of the Many-shot, Medium-shot, Few-shot, and Overall categories in the validation set of the long-tail distributed image dataset, where a Many-shot class has more than 100 samples, a Medium-shot class has between 20 and 100 samples, a Few-shot class has fewer than 20 samples, and the Overall category refers to all classes of the validation set. When the number of training rounds reaches the set maximum $T_{max}$, training is stopped and the optimal two-channel learning model parameters are saved.
In step 3), the image data of the test set is input to the optimal two-channel learning model; the output of the last-layer classifier of the unbalanced learning channel network in the model is the final prediction result for the test set image data.
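The prediction step reduces to taking the argmax of the unbalanced-channel classifier's outputs; a minimal pure-Python sketch (the function name is illustrative):

```python
def predict_labels(logits):
    """Predicted label of each test image: the index of the largest entry
    in the classifier's output row for that image."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]
```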
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. Compared with methods that use unbalanced learning alone, the invention combines an unbalanced learning channel with a small sample learning channel; the added small sample learning channel improves the feature representation, enhances intra-class compactness, and strengthens the two-channel learning model's ability to recognize tail-category image data for which data are scarce.
2. The uniform sampler adopted by the unbalanced learning channel can keep the original distribution of the long-tail distribution image dataset, and is beneficial to the representation learning of the characteristics.
3. The meta sampler adopted by the small sample learning channel is used for meta sampling all categories of a training set of a long-tail distributed image dataset, and learning is carried out by taking a small amount of data of different categories as meta tasks in different rounds of sampling, so that the two-channel learning model can learn the self-adaption capability of identifying tasks on a small amount of samples and fully utilize the dataset.
4. The two-channel learning total loss function constructed by the invention is the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss. The two-channel learning model focuses on the unbalanced learning channel in the early training stage so as to learn a good decision boundary, and on the small sample learning channel in the later training stage; by pulling similar samples together and pushing dissimilar samples apart, the feature representation damaged by unbalanced learning is gradually corrected without harming the decision boundary learned by the unbalanced learning channel, so that the recognition accuracy of the two-channel learning model on long-tail distributed image data is improved overall.
5. The long tail distribution image data identification method based on the double-channel learning uses the output of the last layer classifier of the unbalanced learning channel network as a final prediction result. And when the two-channel learning model is trained, the performance of the two-channel learning model is evaluated by using the accuracy and recall rates of the Many-shot class, the Medium-shot class and the Few-shot class in the verification set of the long-tail distributed image dataset, so that the change of the real performance of the model can be tracked better, and the trained model is more reliable.
Drawings
Fig. 1 is a diagram showing an example of input data according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a two-channel learning model structure according to the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples.
The Places365 dataset is a large image dataset covering 365 scene categories; each category contains no more than 5000 training pictures, 50 validation pictures, and 900 test pictures. The original Places365 dataset is downsampled according to a Pareto distribution with power exponent parameter 6; the training set of the resulting long-tail distributed image dataset contains 62500 pictures in total, with at most 4980 and at least 5 pictures per class. The constructed training set Places-LT is shown in figure 1. The validation set of the long-tail distributed image dataset samples 20 pictures per class for tracking and evaluating the performance of the two-channel learning model. The test set samples 50 pictures per class for evaluating and comparing the performance of the two-channel learning model against other image data recognition models.
For the constructed long-tail distributed image dataset, the data preprocessing operations are as follows: all pictures are first resized to 256×256; during training they are randomly cropped to 224×224, flipped horizontally with 50% probability, and randomly jittered in brightness, contrast, and saturation for augmentation; during validation and testing the pictures are center-cropped to 224×224 without further augmentation.
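Assuming the standard torchvision transforms API, the preprocessing above might be composed as follows; the jitter magnitudes (0.4) are not specified in the text and are purely illustrative:

```python
from torchvision import transforms

# Training-time pipeline: resize, random crop, horizontal flip, color jitter.
train_tf = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    # Jitter strengths are assumptions; the patent only says brightness,
    # contrast, and saturation are randomly perturbed.
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.ToTensor(),
])

# Validation/test pipeline: deterministic center crop, no augmentation.
eval_tf = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```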
As shown in fig. 2, the long tail distribution image data identification method based on the dual-channel learning provided by the embodiment includes the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function, wherein:
Imbalance learning channel sampler: the input data of the unbalanced learning channel is sampled from a uniform sampler. In each training round $T$, each sample in the training set of the long-tail distributed image dataset is sampled with equal probability and at most once. Defining $B$ as the number of samples sampled per batch ($B$ is set to 128 in this embodiment), the sampled input data is represented as $\{(x_1^{imb}, y_1^{imb}), \ldots, (x_i^{imb}, y_i^{imb}), \ldots, (x_B^{imb}, y_B^{imb})\}$, where the superscript $imb$ identifies the unbalanced learning channel and $(x_i^{imb}, y_i^{imb})$ denotes the image data and label data of the $i$-th sample, $1 \le i \le B$.
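Sampling each training example with equal probability and at most once per round is equivalent to batching a fresh random permutation each round; a sketch with illustrative names:

```python
import random

def uniform_sampler(num_samples, batch_size, seed=None):
    """Yield batches of sample indices for one training round: every sample
    appears with equal probability and at most once, i.e. the batches are
    consecutive slices of a fresh random permutation."""
    rng = random.Random(seed)
    order = list(range(num_samples))
    rng.shuffle(order)
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]
```

Calling it again for the next round reshuffles, so long-run sampling stays uniform while the original long-tail class proportions of each batch are preserved.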
Unbalanced learning channel network: based on an unbalanced classification algorithm, whose network model can be transplanted. In this embodiment, the unbalanced learning channel network adopts the LDAM unbalanced classification network, where the feature extractor $f_\phi$ adopts a ResNet10 residual network, the classifier uses a fully-connected layer, and the imbalance loss function $L_{imb}$ uses the LDAM loss. The feature extractor $f_\phi$ first extracts the feature representation $z_i^{imb} = f_\phi(x_i^{imb})$ of the input data $(x_i^{imb}, y_i^{imb})$; the feature representation is then input to the fully-connected classifier to obtain the predicted label $\hat{y}_i^{imb}$; finally, the LDAM loss function is used to calculate the unbalanced learning channel loss of the batch of samples. Denote by $z_{i,k}$ the logit of sample $x_i^{imb}$ for class $k$, write $y = y_i^{imb}$ for its true class, and let $n_y$ be the number of class-$y$ samples in the training set. With the hyper-parameter $C$ set to 0.5, the LDAM loss is:

$$L_{LDAM}(x_i^{imb}, y) = -\log \frac{e^{z_{i,y} - \Delta_y}}{e^{z_{i,y} - \Delta_y} + \sum_{k \neq y} e^{z_{i,k}}}$$

where:

$$\Delta_y = \frac{C}{n_y^{1/4}}$$
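A NumPy sketch of the LDAM loss as described above (the function name and array-based interface are ours; a real implementation would operate on framework tensors and batch statistics):

```python
import numpy as np

def ldam_loss(logits, labels, class_counts, C=0.5):
    """LDAM loss sketch: subtract a per-class margin Delta_y = C / n_y**0.25
    from each sample's true-class logit, then take the mean cross entropy.
    logits: (B, K) array; labels: (B,) ints; class_counts: (K,) train counts.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    margins = C / np.asarray(class_counts, dtype=float) ** 0.25
    shifted = logits.copy()
    shifted[np.arange(len(labels)), labels] -= margins[labels]
    # log-softmax of the margin-shifted logits, evaluated at the true class
    shifted -= shifted.max(axis=1, keepdims=True)  # numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Rarer classes (smaller `n_y`) get larger margins, so the loss demands a wider gap between their true-class logit and the rest.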
Small sample learning channel sampler: the input data of the small sample learning channel is sampled from a meta-sampler. In each training round $T$, the meta-sampler first randomly samples $N = 5$ categories among all categories of the training set of the long-tail distributed image dataset, and then randomly samples $K_S = 1$ sample and $K_Q = 1$ sample within each of the 5 categories, which serve respectively as the support set $S = \{(x_i^{sup}, y_i^{sup})\}_{i=1}^{N \times K_S}$ and the query set $Q = \{(x_i^{qry}, y_i^{qry})\}_{i=1}^{N \times K_Q}$ of the small sample learning channel, where the superscripts $sup$ and $qry$ identify the support set and the query set respectively; $(x_i^{sup}, y_i^{sup})$ denotes the image data and label data of the $i$-th sample of the support set ($1 \le i \le N \times K_S$), and $(x_i^{qry}, y_i^{qry})$ denotes the image data and label data of the $i$-th sample of the query set ($1 \le i \le N \times K_Q$). The data of each batch consists of the support set $S$ and the query set $Q$.
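The episodic sampling above ($N$-way with $K_S$ support and $K_Q$ query samples per class) can be sketched as follows; `labels_by_class` and the function name are illustrative:

```python
import random

def meta_sample(labels_by_class, n_way=5, k_support=1, k_query=1, seed=None):
    """Sample one episode: pick n_way classes, then k_support + k_query
    sample indices per class, split into support and query sets.
    labels_by_class maps class id -> list of sample indices."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(labels_by_class), n_way)
    support, query = [], []
    for c in classes:
        picks = rng.sample(labels_by_class[c], k_support + k_query)
        support += [(i, c) for i in picks[:k_support]]
        query += [(i, c) for i in picks[k_support:]]
    return support, query
```

Because `rng.sample` draws without replacement within a class, a sample never appears in both the support and query sets of the same episode.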
Small sample learning channel network: based on a small sample learning algorithm, whose network model can be transplanted. In this embodiment, the small sample learning channel adopts a prototypical network model; its feature extractor $f_\phi$ uses the same ResNet10 network architecture as, and shares weight parameters with, the feature extractor of the unbalanced learning channel. The small sample loss function $L_{fs}$ adopts the cross-entropy loss. The input support set sample data $(x_i^{sup}, y_i^{sup})$ and query set sample data $(x_i^{qry}, y_i^{qry})$ first pass through the feature extractor $f_\phi$ to extract the feature representations $z_i^{sup} = f_\phi(x_i^{sup})$ and $z_i^{qry} = f_\phi(x_i^{qry})$. After the features of the input batch are extracted, the feature center $c_k$ of each class-$k$ support sample set $S_k$ is calculated:

$$c_k = \frac{1}{|S_k|} \sum_{(x_i^{sup}, y_i^{sup}) \in S_k} f_\phi(x_i^{sup})$$

Then, according to the Euclidean distance $d(z_i^{qry}, c_k)$ between the query set sample features $z_i^{qry}$ and the class feature centers $c_k$, the probability that query set sample $x_i^{qry}$ belongs to class $k$ is calculated:

$$p(y = k \mid x_i^{qry}) = \frac{\exp(-d(z_i^{qry}, c_k))}{\sum_{k'} \exp(-d(z_i^{qry}, c_{k'}))}$$

Finally, the small sample learning channel loss is calculated according to the small sample loss function $L_{fs}$, the cross entropy over the query set samples:

$$L_{fs} = -\frac{1}{N K_Q} \sum_{i=1}^{N K_Q} \log p(y = y_i^{qry} \mid x_i^{qry})$$
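A NumPy sketch of the prototypical-network computation above, assuming episode labels are relabeled 0..N-1 (the interface and names are ours, not the patent's):

```python
import numpy as np

def prototype_loss(support_feats, support_labels, query_feats, query_labels):
    """Prototypical-network sketch: class centers from the support set,
    softmax over negative Euclidean distances, cross entropy on the query set.
    Episode labels are assumed to be 0..N-1 for the N sampled classes."""
    support_feats = np.asarray(support_feats, dtype=float)
    query_feats = np.asarray(query_feats, dtype=float)
    classes = np.unique(support_labels)
    # One feature center per class: mean of that class's support features.
    centers = np.stack([support_feats[np.asarray(support_labels) == k].mean(axis=0)
                        for k in classes])
    # (Q, N) matrix of Euclidean distances from each query to each center.
    dists = np.linalg.norm(query_feats[:, None, :] - centers[None, :, :], axis=2)
    logits = -dists
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(query_labels)), query_labels].mean()
```

The gradient of this loss pulls query features toward their own class center and away from the others, which is the "pulling similar samples and pushing dissimilar samples" behavior described in the text.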
Two-channel learning total loss function: the two-channel learning total loss is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss:

$$L = \alpha L_{imb} + (1 - \alpha) L_{fs}$$

where $\alpha$ is a hyper-parameter related to the training round $T$. Defining the total number of training rounds as $T_{max}$, $\alpha$ decreases parabolically with $T$:

$$\alpha = 1 - \left( \frac{T}{T_{max}} \right)^2$$
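The parabolic weight schedule (α = 1 at the first round, decaying to 0 at the final round) is one line of code; the original image-rendered formula is not fully legible, so the exact parabola below is the natural one satisfying the stated endpoints:

```python
def alpha_schedule(t, t_max):
    """Parabolic decay of the channel weight alpha = 1 - (t / t_max)**2:
    alpha(0) = 1 (pure unbalanced channel), alpha(t_max) = 0 (pure
    small-sample channel), decreasing slowly at first and faster later."""
    return 1.0 - (t / t_max) ** 2
```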
2) Using the two-channel learning total loss, update all parameters in the two-channel learning model by back propagation, i.e., train the two-channel learning model, and save the optimal two-channel learning model parameters to obtain the optimal two-channel learning model.
In the process of training the two-channel learning model, the maximum training round number T max Set to 120, the optimizer uses an SGD optimizer, the learning rate is initialized to 0.1, the learning rate drops by a factor of 0.1 when the training round number T reaches 70, and the learning rate continues to drop by a factor of 0.1 when the training round number reaches 90. The super parameter alpha in the dual-channel learning total loss function and the training round number T are in parabolic decreasing relation, so that the dual-channel learning model is focused on an unbalanced learning channel in the early stage of training and focused on a small sample learning channel in the later stage of training.
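The step schedule described (initial rate 0.1, multiplied by 0.1 at rounds 70 and 90) can be sketched as a small helper; in PyTorch the equivalent would be `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[70, 90], gamma=0.1)`:

```python
def learning_rate(epoch, base_lr=0.1, milestones=(70, 90), gamma=0.1):
    """Step learning-rate schedule from the embodiment: start at base_lr and
    multiply by gamma each time a milestone round is reached."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```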
And when the two-channel learning model is trained, the accuracy and recall of the Many-shot class, the Medium-shot class, the Few-shot class and the Overall class in the verification set of the long-tail distributed image dataset are used for evaluating the performance of the two-channel model. The number of samples of the Many-shot class is greater than 100, the number of samples of the Medium-shot class is between 20 and 100, the number of samples of the Few-shot class is less than 20, and the overlay class refers to all classes of the verification set. When the training round number T reaches the set maximum round number T max And when the training is stopped, the optimal two-channel learning model parameters are saved.
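The grouping of classes into evaluation categories by training-sample count is a simple threshold rule; a sketch (function name and group labels are ours):

```python
def shot_group(n_samples):
    """Assign a class to its evaluation group by training-sample count:
    Many-shot (>100 samples), Medium-shot (20 to 100), Few-shot (<20)."""
    if n_samples > 100:
        return "many"
    if n_samples >= 20:
        return "medium"
    return "few"
```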
3) The image data of the test set of the long-tail distributed image dataset is input to the optimal two-channel learning model saved in the previous step; the output of the last-layer classifier of the unbalanced learning channel network in the model is the final prediction result for the test set image data.
The following table compares the two-channel learning model with other image data recognition models on the Places-LT dataset. Among the compared models, DC-LTR denotes the two-channel learning model; apart from the Plain Model, which is a plain deep convolutional neural network classifier, the other models are current mainstream models for handling unbalanced or long-tail distributed image datasets. For a fair comparison, all models were trained with the Places-LT training set and the ResNet10 network structure, and Class-Balanced Accuracy and Macro F-measure for the Many-shot, Medium-shot, Few-shot, and Overall categories were then computed on the Places-LT test set, where Class-Balanced Accuracy denotes the average recall per class and Macro F-measure denotes the average F-measure per class.
TABLE 1 results of comparative experiments on Places-LT datasets
The experimental results show that the Class-Balanced Accuracy and Macro F-measure of the two-channel learning model DC-LTR on the Few-shot and Overall categories are clearly superior to those of the other compared models, indicating that the two-channel learning model improves the recognition accuracy of data-scarce tail categories and thereby raises the recognition accuracy of long-tail distributed image data overall. Although DC-LTR's result on the Medium-shot category is slightly lower, its result on the Many-shot category is comparable to the other unbalanced-algorithm-based or long-tail recognition models, showing that improving the recognition accuracy of data-scarce tail categories does not harm the recognition accuracy of data-rich head categories. Comparison with different models verifies the effectiveness and superiority of the two-channel learning model.
The model is implemented in Python 3.7 on the PyTorch deep learning framework; the experiments run on 2 NVIDIA GeForce GTX 1080Ti GPUs with 22 GB of video memory in total.
The long tail identification method of other data sets is similar to this method.
In summary, the invention combines unbalanced learning and small sample learning to solve the long-tail distributed image data recognition problem. The unbalanced learning channel corrects the common bias of general algorithms toward head categories while learning a good classification decision boundary, improving the recognition accuracy of the two-channel learning model on unbalanced datasets; the small sample learning channel restores the feature representation capability damaged by the unbalanced learning channel by pulling similar samples together and pushing dissimilar samples apart, enhancing the model's ability to recognize tail-category image data; and the constructed two-channel learning total loss function makes the model focus on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage, so that recognition accuracy on long-tail distributed image data improves overall. The invention therefore has practical application value and is worth popularizing.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit its scope of protection; variations in shape and principle made under the present invention shall therefore be covered by its scope of protection.

Claims (3)

1. A long tail distribution image data identification method based on double-channel learning is characterized by comprising the following steps:
1) Constructing a two-channel learning model consisting of an unbalanced learning channel sampler, an unbalanced learning channel network, a small sample learning channel sampler, a small sample learning channel network and a two-channel learning total loss function; dividing the long-tail distribution image data set into a training set, a verification set and a test set; sampling image data and label data from a training set by using an unbalanced learning channel sampler, inputting the data into an unbalanced learning channel network, and calculating unbalanced learning channel loss; sampling image data and label data from a training set by using a small sample learning channel sampler, inputting the data into a small sample learning channel network, and calculating small sample learning channel loss; then carrying out weighted summation on the unbalanced learning channel loss and the small sample learning channel loss to obtain a two-channel learning total loss;
the unbalanced learning channel sampler is as follows:
the input data of the unbalanced learning channel is sampled from a uniform sampler, and each sample in the training set is sampled with equal probability and at most once in each training round T; define B as the number of samples sampled per batch, the sampled input data is represented as { (x) 1 imb ,y 1 imb ),...,(x i imb ,y i imb ),...,(x B imb ,y B imb ) A superscript imb identifying an unbalanced learning path, (x) i imb ,y i imb ) Image data and label data representing the ith sample, i is more than or equal to 1 and less than or equal to B;
the unbalanced learning channel network is as follows:
the unbalanced learning channel network is based on an unbalanced classification algorithm whose network model is transplanted, and comprises three parts: a feature extractor f_φ, a classifier, and an unbalance loss function L_imb; the feature extractor f_φ extracts the feature representation z_i^imb = f_φ(x_i^imb) from the input data (x_i^imb, y_i^imb); the feature representation z_i^imb is then input to the classifier to obtain the predictive label ŷ_i^imb; finally, the defined unbalance loss function L_imb is applied to calculate the unbalanced learning channel loss of the corresponding batch of samples;
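The claim leaves the concrete unbalance loss function L_imb open (any unbalanced classification algorithm may be transplanted). One common hypothetical choice is a class-weighted cross-entropy; the sketch below, in plain Python, is an assumption for illustration, not the claimed algorithm:

```python
import math

def weighted_cross_entropy(logits, labels, class_weights):
    # Mean cross-entropy over the batch, with each sample's term scaled
    # by the weight of its true class (larger weights for rarer classes).
    total = 0.0
    for scores, y in zip(logits, labels):
        m = max(scores)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += class_weights[y] * (log_z - scores[y])
    return total / len(labels)
```

With uniform weights this reduces to ordinary cross-entropy; raising a tail class's weight increases its contribution to the channel loss.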
The small sample learning channel sampler is as follows:
the input data of the small sample learning channel are sampled by a meta-sampler: in each training round T, the meta-sampler first randomly samples N categories among all categories of the training set, and then randomly samples K_S samples and K_Q samples in each of the N categories to serve, respectively, as the support set S = {(x_g^sup, y_g^sup)} and the query set Q = {(x_p^qry, y_p^qry)} of the small sample learning channel, where the superscripts sup and qry identify the support set and the query set respectively; (x_g^sup, y_g^sup) are the image data and label data of the g-th support set sample, 1 ≤ g ≤ N×K_S; (x_p^qry, y_p^qry) are the image data and label data of the p-th query set sample, 1 ≤ p ≤ N×K_Q; each batch of data consists of the support set S and the query set Q;
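As a non-claimed sketch, the meta-sampler's episode construction — N categories, then K_S support and K_Q query samples per category, disjoint within a category — can be written as:

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_support, k_query, seed=None):
    # Group training-sample indices by category.
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Randomly pick N categories, then K_S + K_Q distinct samples in each.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_support + k_query)
        support += [(i, c) for i in picked[:k_support]]  # support set S
        query += [(i, c) for i in picked[k_support:]]    # query set Q
    return support, query
```

Each returned pair is (sample index, category label); one call yields one batch, i.e. one support set S and one query set Q.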
the small sample learning channel network is as follows:
the small sample learning channel network is based on a small sample learning algorithm whose network model is transplanted, and comprises three parts: a feature extractor f_φ, a distance metric d, and a loss function L_fs, wherein the feature extractor adopted by the small sample learning channel network and the feature extractor adopted by the unbalanced learning channel network use the same network architecture and share weight parameters; the input support set sample data x_g^sup and query set sample data x_p^qry first pass through the feature extractor f_φ to extract the features z_g^sup = f_φ(x_g^sup) and z_p^qry = f_φ(x_p^qry); then, according to the distance metric d, the distance d(z_p^qry, z_g^sup) between each query set sample feature and each support set sample feature is calculated, and the label of the support set sample closest to a query set sample is taken as that query set sample's predictive label ŷ_p^qry; finally, the small sample learning channel loss is calculated according to the defined small sample loss function L_fs;
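The metric-based prediction step — assign each query sample the label of its nearest support sample — can be sketched with a Euclidean distance as the metric d (the claim admits any distance metric; this choice and the function names are illustrative):

```python
import math

def nearest_support_label(query_feat, support_feats, support_labels):
    # Euclidean distance d between two feature vectors.
    def d(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    # The query sample takes the label of the closest support feature.
    dists = [d(query_feat, s) for s in support_feats]
    return support_labels[dists.index(min(dists))]
```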
The two-channel learning total loss function is as follows:
the two-channel learning total loss is a weighted sum of the unbalanced learning channel loss and the small sample learning channel loss, as follows:

L_total = α · L_imb + (1 − α) · L_fs
wherein α is a hyperparameter related to the training round T; α decreases parabolically with the training round number T, taking the value 1 at the beginning of training and gradually decreasing to 0 as T increases, so that the two-channel learning model focuses on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage;
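The claim fixes only the endpoints (α = 1 at the start of training, 0 at the end) and a parabolic decrease; a concrete sketch, assuming the parabola α = (1 − T/T_max)², which satisfies those constraints but whose exact form is not stated in the claim:

```python
def alpha_schedule(t, t_max):
    # Assumed parabolic decay: 1 at round 0, decreasing to 0 at round t_max.
    return (1.0 - t / t_max) ** 2

def total_loss(l_imb, l_fs, t, t_max):
    # Weighted sum of the two channel losses: early rounds emphasise the
    # unbalanced channel, later rounds the small sample channel.
    a = alpha_schedule(t, t_max)
    return a * l_imb + (1.0 - a) * l_fs
```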
2) Using the total loss of the two-channel learning, and updating all parameters in the two-channel learning model by back propagation, namely training the two-channel learning model, and storing the optimal two-channel learning model parameters to obtain an optimal two-channel learning model;
3) Inputting the image data of the test set to an optimal two-channel learning model, and obtaining a prediction label, namely a prediction result, of the image.
2. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, characterized in that: in step 2), when training the two-channel learning model, a maximum training round number T_max, an optimizer type and an initial learning rate are first set; in each round, the data sampled by the uniform sampler are input to the unbalanced learning channel network and the data sampled by the meta-sampler are input to the small sample learning channel network, and the unbalanced learning channel loss and the small sample learning channel loss are calculated simultaneously; the two-channel learning total loss is then calculated as the weighted sum of the unbalanced learning channel loss and the small sample learning channel loss and, combined with the optimizer, is back-propagated to update the parameters of the feature extractor shared by the two channels and of the unbalanced learning channel classifier; the hyperparameter α in the two-channel learning total loss function decreases parabolically with the training round number, being 1 at the beginning of training and gradually decreasing to 0 as the training round number increases, so that the two-channel learning model focuses on the unbalanced learning channel in the early training stage and on the small sample learning channel in the later training stage;
the method comprises the steps of evaluating the performance of a dual-channel model by using the accuracy rate and recall rate of a Many-shot class, a Medium-shot class, a Few-shot class and an Overall class in a verification set of a long-tail distribution image dataset, wherein the number of samples of the Many-shot class is greater than 100, the number of samples of the Medium-shot class is between 20 and 100, the number of samples of the Few-shot class is less than 20, the Overall class refers to all classes of the verification set, and when the training round number reaches a set maximum round number T max And when the training is stopped, the optimal two-channel learning model parameters are saved.
3. The long-tail distribution image data identification method based on two-channel learning as claimed in claim 1, characterized in that: in step 3), the image data of the test set are input into the optimal two-channel learning model, and the output of the last-layer classifier of the unbalanced learning channel network in the model is the final prediction result for the test set image data.
CN202010465433.XA 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning Active CN111738301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010465433.XA CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010465433.XA CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Publications (2)

Publication Number Publication Date
CN111738301A CN111738301A (en) 2020-10-02
CN111738301B true CN111738301B (en) 2023-06-20

Family

ID=72647933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010465433.XA Active CN111738301B (en) 2020-05-28 2020-05-28 Long-tail distribution image data identification method based on double-channel learning

Country Status (1)

Country Link
CN (1) CN111738301B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022099600A1 (en) * 2020-11-13 2022-05-19 Intel Corporation Method and system of image hashing object detection for image processing
CN112560904A (en) * 2020-12-01 2021-03-26 中国科学技术大学 Small sample target identification method based on self-adaptive model unknown element learning
CN112632319B (en) * 2020-12-22 2023-04-11 天津大学 Method for improving overall classification accuracy of long-tail distributed speech based on transfer learning
CN112632320A (en) * 2020-12-22 2021-04-09 天津大学 Method for improving speech classification tail recognition accuracy based on long tail distribution
CN113076873B (en) * 2021-04-01 2022-02-22 重庆邮电大学 Crop disease long-tail image identification method based on multi-stage training
CN113095304B (en) * 2021-06-08 2021-09-03 成都考拉悠然科技有限公司 Method for weakening influence of resampling on pedestrian re-identification
CN113449613B (en) * 2021-06-15 2024-02-27 北京华创智芯科技有限公司 Multi-task long tail distribution image recognition method, system, electronic equipment and medium
CN113255832B (en) * 2021-06-23 2021-10-01 成都考拉悠然科技有限公司 Method for identifying long tail distribution of double-branch multi-center
CN113569960B (en) * 2021-07-29 2023-12-26 北京邮电大学 Small sample image classification method and system based on domain adaptation
CN114283307B (en) * 2021-12-24 2023-10-27 中国科学技术大学 Network training method based on resampling strategy
CN114511887B (en) * 2022-03-31 2022-07-05 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment
CN114882273B (en) * 2022-04-24 2023-04-18 电子科技大学 Visual identification method, device, equipment and storage medium applied to narrow space
CN114863193B (en) * 2022-07-07 2022-12-02 之江实验室 Long-tail learning image classification and training method and device based on mixed batch normalization
CN115953631B (en) * 2023-01-30 2023-09-15 南开大学 Long-tail small sample sonar image classification method and system based on deep migration learning
CN116203929B (en) * 2023-03-01 2024-01-05 中国矿业大学 Industrial process fault diagnosis method for long tail distribution data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB823263A (en) * 1956-09-05 1959-11-11 Atomic Energy Authority Uk Improvements in or relating to nuclear particle discriminators
CN108830416A (en) * 2018-06-13 2018-11-16 四川大学 Ad click rate prediction framework and algorithm based on user behavior
CN109800810A (en) * 2019-01-22 2019-05-24 重庆大学 A kind of few sample learning classifier construction method based on unbalanced data
CN109961089A (en) * 2019-02-26 2019-07-02 中山大学 Small sample and zero sample image classification method based on metric learning and meta learning
CN110580500A (en) * 2019-08-20 2019-12-17 天津大学 Character interaction-oriented network weight generation few-sample image classification method
CN110633758A (en) * 2019-09-20 2019-12-31 四川长虹电器股份有限公司 Method for detecting and locating cancer region aiming at small sample or sample unbalance

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10740595B2 (en) * 2017-09-28 2020-08-11 Nec Corporation Long-tail large scale face recognition by non-linear feature level domain adaption

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB823263A (en) * 1956-09-05 1959-11-11 Atomic Energy Authority Uk Improvements in or relating to nuclear particle discriminators
CN108830416A (en) * 2018-06-13 2018-11-16 四川大学 Ad click rate prediction framework and algorithm based on user behavior
CN109800810A (en) * 2019-01-22 2019-05-24 重庆大学 A kind of few sample learning classifier construction method based on unbalanced data
CN109961089A (en) * 2019-02-26 2019-07-02 中山大学 Small sample and zero sample image classification method based on metric learning and meta learning
CN110580500A (en) * 2019-08-20 2019-12-17 天津大学 Character interaction-oriented network weight generation few-sample image classification method
CN110633758A (en) * 2019-09-20 2019-12-31 四川长虹电器股份有限公司 Method for detecting and locating cancer region aiming at small sample or sample unbalance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Enli Lin et al. Deep reinforcement learning for imbalanced classification. Applied Intelligence. 2020, pp. 2488-2502. *
Chen Qiong et al. Transfer learning classification algorithm for imbalanced data. Journal of South China University of Technology. 2018, Vol. 46 (No. 46), pp. 122-130. *

Also Published As

Publication number Publication date
CN111738301A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111738301B (en) Long-tail distribution image data identification method based on double-channel learning
Asif et al. Ensemble knowledge distillation for learning improved and efficient networks
CN109063565B (en) Low-resolution face recognition method and device
CN112308158A (en) Multi-source field self-adaptive model and method based on partial feature alignment
CN101968853B (en) Improved immune algorithm based expression recognition method for optimizing support vector machine parameters
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN111160533A (en) Neural network acceleration method based on cross-resolution knowledge distillation
Hara et al. Towards good practice for action recognition with spatiotemporal 3d convolutions
CN110334243A (en) Audio representation learning method based on multilayer timing pond
CN111738303A (en) Long-tail distribution image identification method based on hierarchical learning
CN112232395B (en) Semi-supervised image classification method for generating countermeasure network based on joint training
CN112766378A (en) Cross-domain small sample image classification model method focusing on fine-grained identification
CN107832753B (en) Face feature extraction method based on four-value weight and multiple classification
CN111860278A (en) Human behavior recognition algorithm based on deep learning
CN112329536A (en) Single-sample face recognition method based on alternative pair anti-migration learning
CN114882531A (en) Cross-domain pedestrian re-identification method based on deep learning
Tong et al. Automatic error correction for speaker embedding learning with noisy labels
CN114972753A (en) Lightweight semantic segmentation method and system based on context information aggregation and assisted learning
CN109241315B (en) Rapid face retrieval method based on deep learning
CN115100509B (en) Image identification method and system based on multi-branch block-level attention enhancement network
US20140343945A1 (en) Method of visual voice recognition by following-up the local deformations of a set of points of interest of the speaker's mouth
Li et al. Adaptive multi-prototype relation network
US20140343944A1 (en) Method of visual voice recognition with selection of groups of most relevant points of interest
CN113111774B (en) Radar signal modulation mode identification method based on active incremental fine adjustment
CN110750672B (en) Image retrieval method based on deep measurement learning and structure distribution learning loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant