CN115661549A - Fine-grained classification denoising training method based on prediction confidence - Google Patents

Fine-grained classification denoising training method based on prediction confidence

Info

Publication number
CN115661549A
CN115661549A CN202211452486.3A CN202211452486A
Authority
CN
China
Prior art keywords
prediction
training
sample
samples
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211452486.3A
Other languages
Chinese (zh)
Inventor
沈复民
姚亚洲
张传一
姚钰龙
孙泽人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Code Geek Technology Co ltd
Original Assignee
Nanjing Code Geek Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Code Geek Technology Co ltd filed Critical Nanjing Code Geek Technology Co ltd
Priority to CN202211452486.3A
Publication of CN115661549A
Legal status: Pending (Current)


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a fine-grained classification denoising training method based on prediction confidence, which comprises the following steps: S1, letting all training samples participate in warm-up training, and recording the most recent predictions of each sample as its historical prediction set; S2, generating a normalized prediction confidence for each sample from a histogram built from its historical prediction set; and S3, using the normalized prediction confidence to balance the weights of the sample label and the sample prediction, thereby dynamically correcting the loss value. In the invention, the dynamic loss replaces the common cross-entropy loss to distinguish out-of-distribution noise from other samples, so that out-of-distribution noise can be removed more reliably; when the model is trained on a noisy data set, denoising training is carried out within one framework through loss correction and a global sample selection strategy, and the classification accuracy of the fine-grained visual recognition model is significantly improved.

Description

Fine-grained classification denoising training method based on prediction confidence
Technical Field
The invention relates to the technical field of fine-grained image classification, in particular to a fine-grained classification denoising training method based on prediction confidence.
Background
Noise in noisy data sets is generally divided into two categories. The first type is in-distribution noise, i.e., the true label of the sample belongs to the label set of the data set, but the sample is mistakenly annotated with another label of the data set. The second type is out-of-distribution noise, where the true label of the sample is not in the label set of the data set at all; the image content of an out-of-distribution noise sample is only weakly related, or even completely unrelated, to the label it was assigned, and does not conform to the annotation principle. If a data set contains both types of noise, it is called an open-set noisy data set. Noisy data sets collected under natural conditions are almost always open-set data sets, whereas closed-set data sets are rather rare.
The research community has proposed various ideas for dealing with noise contained in the training data set. One class of methods is known as "loss correction" or "label correction". The conventional practice of loss correction is to add a correction term to the loss values during neural network training to avoid over-fitting the in-distribution noise samples. Some methods also correct in-distribution noise by learning a noise transition matrix, but they cannot simultaneously handle out-of-distribution noise correctly, and their effect on large-scale data is not ideal: the true label of an out-of-distribution noise sample is not in the label domain of the data set, so forcibly correcting such a sample with a noise transition matrix cannot yield a valid label.
Disclosure of Invention
The invention provides a neural-network denoising training method based on prediction confidence, which solves the problem that fine-grained image classification models are difficult to train on noisy data sets.
To achieve this purpose, the invention provides the following technical scheme: a fine-grained classification denoising training method based on prediction confidence, comprising the following steps:
S1, first letting all training samples participate in warm-up training, and recording the most recent predictions of each sample as its historical prediction set;
S2, generating a normalized prediction confidence for each sample from a histogram built from its historical prediction set, specifically:
S21, calculating, by formula (6.1), a histogram of the predicted labels over the total number of predictions;
S22, inferring the confidence of the correct label from the historical prediction results;
S23, performing a normalization operation on the basis of the entropy of the histogram;
and S3, using the normalized prediction confidence to balance the weights of the sample label and the sample prediction, thereby dynamically correcting the loss value.
Further, in S1, training must first go through several rounds of warm-up training. After the warm-up training process is completed, inference is performed on each sample x_i (i = 1, 2, ..., N) of the training set D to obtain a prediction result, N being the number of samples of the data set. The inference process obtains a Softmax probability distribution vector from the output of a backbone network composed of two convolutional neural networks, and then computes the prediction result p_i. Denote by P_i the historical prediction sequence of sample image x_i over the most recent l rounds of training, and by λ_i the prediction confidence of each sample image x_i. The dynamic correction loss based on the prediction confidence λ_i balances the weights of the one-hot encoded label and the prediction result and computes the cross entropy with the neural network output, yielding the corrected loss value. During training, the samples with higher prediction confidence are selected to form the training sample set D' that actually participates in training; the sample set containing n training examples updates the fine-grained image classification neural network model with its corrected loss values.
Further, deep neural networks tend to fit clean and simple samples first and only then begin to adapt to difficult and noisy samples. During the first T_w rounds, all training samples D are used and a warm-up strategy is employed to train the target neural network. The cross-entropy loss used for this training is:

L_CE = -(1/N) · Σ_{i=1}^{N} y_i · log s_i    (6.2)

where y_i is the one-hot label of sample x_i; the cross-entropy loss L_CE in equation (6.2) updates the neural network in the warm-up stage. In the formula, s_i is the output vector of the last softmax layer, calculated by equation (6.3):

s_i = softmax(z_i),  with  s_i^j = exp(z_i^j) / Σ_{c=1}^{k} exp(z_i^c)    (6.3)

where f(·; θ) is the mapping function of the neural network with parameters θ, z_i = f(x_i; θ) is the output of the fully connected layer preceding the last softmax layer, and k is the number of classes of the data set. The inference result p_i corresponding to each sample image x_i is calculated by:

p_i = argmax_j s_i^j    (6.4)

During the whole training process, the predictions of each sample image x_i in the training set over the most recent l rounds of training are recorded and updated, denoted P_i = {p_i^{T-l+1}, ..., p_i^{T}}, where p_i^T is the label the network predicts for sample image x_i in the T-th (i.e., current) round of training.
Further, after the warm-up training phase is completed, a prediction is performed on the samples in the training set D using the trained neural network, and the historical prediction sequence P_i is then established and updated. The PFL loss of every training sample in D is calculated using equation (6.15). After each round of training, a ratio δ (%) is used to control the number of discarded samples, i.e., the proportion of samples judged to be out-of-distribution noise in the total number of samples; the n samples with the smallest PFL loss are selected to form a new training sample set D' that is used to update the neural network model. After the warm-up training, the newly selected training sample set D' for each round of training is generated as shown in formula (6.5):

D' = { (x_i, y_i) ∈ D : L_PFL(x_i) is among the n smallest losses in D }    (6.5)

where y_i is the label corresponding to sample image x_i. In the global sample selection stage, formula (6.5) selects the clean samples and the in-distribution noise samples into the training set of the current round. To avoid falsely excluding useful samples, the samples with high loss under the prediction-stability metric are only excluded from the training set in the current round of training; the normalized prediction confidence λ_i of all samples is recalculated in the next round of global sample selection, and the historical prediction sequence P_i is updated as follows:

P_i ← { p_i^{T-l+1}, ..., p_i^{T} }    (6.6)
Furthermore, in S21, because the content of out-of-distribution noise samples in the noisy training data is unrelated to the content of clean samples and of in-distribution noise samples, the predictions for out-of-distribution noise samples keep changing during early training. Let h_i^j denote the frequency with which sample x_i is predicted as label j in its historical prediction sequence P_i, with h_i^j ∈ [0, 1] and j = 1, ..., k, where k is the number of categories of the data set. The histogram h_i can be calculated by equation (6.1):

h_i^j = (1/|P_i|) · Σ_{p ∈ P_i} 1[p = j],  j = 1, ..., k    (6.1)

where p_i is the prediction result of the sample image, |P_i| is the size of its historical prediction result set, i.e., the total number of predictions, and h_i = (h_i^1, ..., h_i^k) is the histogram of the predicted labels over the total number of predictions.
Further, in S22, the frequency with which a sample label appears in the prediction history is statistically positively correlated with the probability that the label is the true label; this likelihood that a prediction belongs to the true label is defined as the "confidence" with which the correct label is inferred from historical predictions.
The concept of entropy matches this notion of confidence and can be used to express the uncertainty of the prediction results of each sample image x_i, in the form:

u(P_i) = - Σ_{j=1}^{k} h_i^j · log h_i^j    (6.7)

where h_i^ŷ denotes in particular the frequency in the historical prediction sequence P_i of the most frequently predicted label ŷ. The histogram shape of the prediction history reflects how uncertain the inference history is about the label attribution. The following formula describes the case where the uncertainty of the prediction history is greatest, i.e., each h_i^j is smallest:

h_i^j = 1 / min(k, l)    (6.8)

where k is the total number of label classes in the data set and l is the length of the prediction history sequence. In commonly used web-image fine-grained classification data sets, the number of label categories is far greater than the configured length of the historical prediction record, i.e., k >> l; therefore min(k, l) = l, and the maximum uncertainty of the historical prediction can be calculated as:

u_max = - l · (1/l) · log(1/l) = log l    (6.9)
furthermore, in S23, the use of cross entropy is inconvenient, and the cross entropy itself has a lower bound, but the difference between the upper bounds of the cross entropy and the lower bounds of the cross entropy is very large under different conditions, so that the normalization operation needs to be performed on the basis of the cross entropy; given the maximum historical prediction uncertainty calculation method, one can define
Figure DEST_PATH_IMAGE045
For normalizing the historical prediction uncertainty to make the value range constant
Figure DEST_PATH_IMAGE046
To facilitate measurement;
Figure 26614DEST_PATH_IMAGE045
see formula (6.10):
Figure DEST_PATH_IMAGE047
(6.10)
in summary, we can calculate and update the prediction confidence for each input sample according to the prediction history, and sample image
Figure 43111DEST_PATH_IMAGE008
Normalized prediction confidence of
Figure 210263DEST_PATH_IMAGE007
The definition is shown in formula (6.11):
Figure DEST_PATH_IMAGE048
(6.11)。
further, in S3, the conventional idea of processing the noise sample is to identify the noise sample by modeling the distribution of the noise, and generally adopt a "loss correction" method;
the basic loss function of the neural network adopts cross entropy, and the addition of additional targets is divided into two basic modes: one is a soft scheme and the other is a hard scheme; the detailed structure of the soft scheme is as follows:
Figure DEST_PATH_IMAGE049
(6.12)
q is a prediction vector output by the neural network, t is a noise label vector, L is the total label category number, and beta is a value range constant in the range
Figure DEST_PATH_IMAGE050
The hyper-parameter between, corresponding to the "soft solution", is a hard solution, changing the regression objective to a known one
Figure DEST_PATH_IMAGE051
Is known as
Figure DEST_PATH_IMAGE052
In the case of the maximum value of q
Figure DEST_PATH_IMAGE053
Maximum posterior probability of, note
Figure DEST_PATH_IMAGE054
The formal formula is:
Figure DEST_PATH_IMAGE055
(6.13)
after a suitable loss correction method is selected, the network parameters are updated using a stochastic gradient descent optimization tool.
Furthermore, if the neural network is updated with the plain loss function, the noise samples are quickly over-fitted during training, which degrades performance. It is therefore desirable to compensate the loss value using both the label and the prediction, to mitigate the tendency of the neural network to fit the noise. With this compensation, equation (6.2) can be rewritten as:

L = -(1/N) · Σ_{i=1}^{N} [β · y_i + (1 - β) · p_i] · log s_i    (6.14)

where y_i is the label corresponding to sample image x_i, p_i is its (one-hot) prediction result, and s_i is the output of the last softmax layer defined by equation (6.3). In general, the parameter β is set to a fixed value, e.g., β = 0.8, to statically balance the label and the prediction result.
Further, the normalized prediction confidence λ_i defined by equation (6.11) is used to dynamically determine the weight relationship between each sample label and the prediction result, so as to dynamically correct the loss value.
Equation (6.14) can be rewritten into a dynamic loss based on prediction uncertainty by introducing equation (6.11), specifically:

L_PFL = -(1/N) · Σ_{i=1}^{N} [(1 - λ_i) · y_i + λ_i · p_i] · log s_i    (6.15)

where the normalized prediction confidence λ_i dynamically adjusts the degree of compensation of the current loss value. Equation (6.15) is referred to as the dynamic loss based on prediction confidence, abbreviated as the PFL loss (Prediction Fidelity Loss).
Compared with the prior art, the invention has the beneficial effects that:
in the application, a dynamic loss based on prediction confidence coefficient is used for replacing a common cross entropy loss to distinguish the noise outside the distribution from other samples (clean samples and noise inside the distribution), so that the noise outside the distribution can be better removed; analyzing a historical prediction result and dynamically correcting a loss value by using a method for realizing the correction of noise in distribution and identifying noise outside the distribution according to the prediction confidence coefficient so as to achieve the aim of relieving the interference of the noise in the distribution on training; when a model is trained on a noisy data set, global sample selection is carried out by utilizing the prediction confidence coefficient, and the strategy is integrated into a simple and effective noisy data set fine-grained image classification training frame, so that denoising training is carried out in one frame through loss correction and a global sample selection strategy, and the classification precision of a fine-grained visual recognition model is obviously improved.
Drawings
FIG. 1 is a flowchart of the fine-grained image classification algorithm based on prediction confidence according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention is a fine-grained classification denoising training method based on prediction confidence. The noisy fine-grained classification denoising algorithm based on prediction confidence is mainly divided into two parts: first, dynamic loss correction based on prediction confidence, and second, global sample selection based on prediction confidence; combining dynamic loss correction and global sample selection in one framework effectively improves the ability to learn a fine-grained image classification model from noisy internet image data sets. In the training phase, the framework of the method can be described simply as follows. Define (x_i, y_i) as the i-th sample in the training data set D, where x_i is an image and y_i is its corresponding label, with y_i ∈ {1, ..., k}, k being the number of classes in the data set, and i = 1, ..., N, N being the number of samples in the data set. Because of the presence of noise in the original training set, y_i is not always the true label corresponding to x_i. Let y_i* be the true label of sample image x_i: when the sample is clean, y_i* = y_i; when the sample is in-distribution noise, the noise sample should be corrected to the correct label, i.e., y_i* should replace y_i; and when the sample is out-of-distribution noise, it should be discarded from the training set.
The general framework of the algorithm of the present invention is shown in fig. 1. Training must first go through several rounds of warm-up training. After the warm-up training process is completed, inference is performed on each sample x_i (i = 1, ..., N) of the training set D to obtain a prediction result; the inference process obtains a Softmax probability distribution vector from the output of a backbone network composed of two convolutional neural networks, and then computes the prediction result p_i. Denote by P_i the historical prediction sequence of sample image x_i over the most recent l rounds of training, which yields the prediction confidence λ_i of each sample image x_i. The dynamic correction loss based on the prediction confidence λ_i then balances the weights of the one-hot encoded label and the prediction result and computes the cross entropy with the neural network output, yielding the corrected loss value. During training, the samples with higher prediction confidence are selected to form the training sample set D' that actually participates in training; the ratio δ is generally used to control the number of discarded samples, i.e., the proportion of samples judged to be out-of-distribution noise in the total number of samples. Finally, the sample set containing n training examples updates the fine-grained image classification neural network model with its corrected loss values.
In this embodiment, because the noisy training data contains out-of-distribution noise samples whose content is independent of that of the clean samples and the in-distribution noise samples, the predictions for these out-of-distribution noise samples keep changing during early training. Consider first the case where only in-distribution noise samples and clean samples exist in the data set, and let h_i^j denote the frequency with which sample x_i is predicted as label j in its historical prediction sequence P_i, with h_i^j ∈ [0, 1] and j = 1, ..., k, where k is the number of categories of the data set. The histogram h_i can be calculated by equation (6.1):

h_i^j = (1/|P_i|) · Σ_{p ∈ P_i} 1[p = j],  j = 1, ..., k    (6.1)

where p_i is the prediction result of the sample image, |P_i| is the size of its historical prediction result set, i.e., the total number of predictions, and h_i = (h_i^1, ..., h_i^k) is the histogram of the predicted labels over the total number of predictions.
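As an illustration only, the histogram of equation (6.1) can be computed from a recorded prediction history as in the following Python sketch; the names prediction_histogram, history and num_classes are not part of the patent and are chosen here for readability.

import numpy as np

def prediction_histogram(history, num_classes):
    """Equation (6.1): frequency h_i^j of each label j in the recorded
    prediction history P_i of one sample (a sequence of predicted label ids)."""
    counts = np.bincount(np.asarray(history, dtype=int), minlength=num_classes)
    return counts / len(history)          # h_i, entries sum to 1

# Example: a sample predicted 6 times over the last l = 6 rounds
h = prediction_histogram([3, 3, 7, 3, 7, 3], num_classes=10)
# h[3] = 4/6, h[7] = 2/6, all other entries are 0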
The frequency with which a sample label appears in the prediction history is statistically positively correlated with the likelihood that the label is the genuine label. This likelihood that a prediction belongs to the true label is defined as the "confidence" with which the correct label is inferred from historical predictions.
The concept of entropy is consistent with this notion of confidence and can be used to express the uncertainty of the prediction results of each sample image x_i, as shown in formula (6.7):

u(P_i) = - Σ_{j=1}^{k} h_i^j · log h_i^j    (6.7)

The histogram shape of the prediction history reflects how uncertain the inference history is about the label attribution: the flatter the histogram distribution, the stronger the uncertainty; the more concentrated it is, the weaker the uncertainty. Equation (6.8) describes the case where the uncertainty of the prediction history is greatest, i.e., each h_i^j is smallest:

h_i^j = 1 / min(k, l)    (6.8)

where k is the total number of label classes in the data set and l is the length of the prediction history sequence. In commonly used web-image fine-grained classification data sets, the number of label categories (mostly more than one hundred) is far greater than the length of the actual historical prediction record (about 10), i.e., k >> l. It can therefore be taken that min(k, l) = l, and from the above analysis the maximum uncertainty of the historical prediction can be calculated as shown in equation (6.9):

u_max = - l · (1/l) · log(1/l) = log l    (6.9)
the cross entropy is used only and inconvenient to use, the cross entropy has a lower bound, but the difference of the upper bound of the cross entropy is large under different conditions, so that the normalization operation needs to be executed on the basis of the cross entropy. Given the maximum historical prediction uncertainty calculation method, one can define
Figure DEST_PATH_IMAGE074
For normalizing the historical prediction uncertainty to make the value range constant
Figure 772285DEST_PATH_IMAGE050
To measure in a convenient way:
Figure 226400DEST_PATH_IMAGE074
the form of the formula is as follows:
Figure 5000DEST_PATH_IMAGE047
(6.10)
in summary, we can calculate and update the prediction confidence for each input sample according to the prediction history, sample image
Figure 544566DEST_PATH_IMAGE008
Normalized prediction confidence of
Figure 66814DEST_PATH_IMAGE007
The definition is shown in formula (6.11):
Figure 375436DEST_PATH_IMAGE048
(6.11)。
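A minimal sketch of equations (6.7)–(6.11), assuming the prediction_histogram helper above and a fixed history length l; the small epsilon added inside the logarithm is an implementation convenience, not part of the formulas.

import numpy as np

def normalized_prediction_confidence(histogram, history_len):
    """Equations (6.7)-(6.11): entropy of the prediction histogram,
    normalized by its maximum value log(l), then turned into a confidence."""
    eps = 1e-12
    u = -np.sum(histogram * np.log(histogram + eps))   # (6.7) uncertainty
    u_max = np.log(history_len)                        # (6.9), since k >> l
    u_norm = u / u_max                                 # (6.10), lies in [0, 1]
    return 1.0 - u_norm                                # (6.11) lambda_i

# A sample always predicted as the same label gets confidence close to 1;
# a sample whose l predictions are all different gets confidence close to 0.
lam = normalized_prediction_confidence(np.array([0.0, 1.0, 0.0]), history_len=6)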
in this embodiment, the conventional idea of processing noise samples is to identify the noise samples by modeling the distribution of the noise. Accurately modeling noise is difficult in many cases, and it is not possible to model noise in a large number of samples with a significant effect. The situation of explicitly raising the distribution of noise or giving reconstruction errors is not always in accordance with the actual situation, and the method is not common in the large-scale data training neural network scene, so that the model-based method is gradually replaced by other methods. A common way to deal with the noisy data training problem is to add terms to the loss function so that the loss function can be less affected by the noise samples under certain conditions, and the above idea is generally called a "loss correction" method.
One way to implement loss correction is to dynamically adjust the training target according to the current state of the neural network, for which a bootstrapping strategy can be introduced. The main idea is as follows: the prediction target is a convex combination of the (possibly wrong) label vector and the current prediction output of the neural network, and it is dynamically updated with the current state of the model. As training advances, the neural network should be increasingly inclined to trust its current prediction output. Because correctly labeled samples dominate the training set, the trained network's predictions maintain a certain distance from the erroneous labels, so this approach ultimately reduces the influence of mislabeled samples on training.
Following the common scheme, cross entropy is adopted as the basic loss function of the target neural network, and an additional optimization target needs to be added to it to reflect the current state of the model. Adding the additional target can be done in two basic ways: directly using the prediction vector output by the neural network, called the soft scheme; or using the one-hot prediction generated from the prediction vector, called the hard scheme. The detailed form of the "soft scheme" is shown in equation (6.12):

L_soft = - Σ_{j=1}^{L} [β · t_j + (1 - β) · q_j] · log q_j    (6.12)

It can be shown that the final optimization objective using equation (6.12) is equivalent to Softmax regression with a minimum-entropy regularization term, whose function is to make the model more confident in its predicted labels. Corresponding to the "soft scheme" is the hard scheme, which changes the regression target to the class with the maximum posterior probability given the current prediction q, denoted z_j = 1 if j = argmax_c q_c and z_j = 0 otherwise; its form is equation (6.13):

L_hard = - Σ_{j=1}^{L} [β · t_j + (1 - β) · z_j] · log q_j    (6.13)
after selecting a proper loss correction method, a normal neural network optimization process is executed next, data are fed into the neural network in batches, and network parameters are updated by using optimization tools such as random gradient descent and the like. The mode of updating the network parameters is similar to the process of an EM algorithm, a confidence label (correction label) corresponding to a sample is estimated by utilizing the convex combination of an original label and a model prediction label in the expectation stage, and the network parameters are updated in the maximization stage to enable the model to better predict the label generated in the last step.
If the neural network is updated directly with the plain loss function, the network quickly begins to over-fit the noise samples during training, resulting in degraded performance. It is therefore desirable to adopt the bootstrapping strategy and compensate the loss value using both the labels and the predictions, to mitigate the tendency of the neural network to fit the noise. With this compensation, equation (6.2) can be rewritten as equation (6.14):

L = -(1/N) · Σ_{i=1}^{N} [β · y_i + (1 - β) · p_i] · log s_i    (6.14)

where y_i is the label corresponding to sample image x_i, p_i is its (one-hot) prediction result, and s_i is the output of the last softmax layer defined by equation (6.3).
The normalized prediction confidence λ_i defined by equation (6.11) is used to dynamically determine the weight relationship between each sample label and the prediction result, so as to dynamically correct the loss value. The specific idea is as follows: during training, if the label prediction confidence of a sample image is very high, it is very likely a clean sample or an in-distribution noise sample, and its loss value should be corrected largely by the prediction result; conversely, if the prediction history of a sample varies frequently, giving a large prediction uncertainty, the sample is likely a difficult sample or out-of-distribution noise, and its loss value should depend largely on the original label. In summary, equation (6.14) can be rewritten into the dynamic loss based on prediction uncertainty by introducing equation (6.11), i.e., equation (6.15):

L_PFL = -(1/N) · Σ_{i=1}^{N} [(1 - λ_i) · y_i + λ_i · p_i] · log s_i    (6.15)

where the normalized prediction confidence λ_i dynamically adjusts the degree of compensation of the current loss value; equation (6.15) is referred to as the dynamic loss based on prediction confidence, abbreviated as the PFL loss (Prediction Fidelity Loss).
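A hedged sketch of the PFL loss of equation (6.15): compared with the static balance of equation (6.14), the fixed β is replaced by the per-sample normalized prediction confidence λ, so that high-confidence samples lean on their own prediction and unstable samples fall back on the original label. The names pfl_loss and lam are illustrative, not from the patent.

import torch
import torch.nn.functional as F

def pfl_loss(logits, labels, lam):
    """Equation (6.15): dynamic loss based on prediction confidence.
    logits: [batch, k] network outputs; labels: [batch] noisy label ids;
    lam: [batch] normalized prediction confidence values in [0, 1]."""
    k = logits.size(1)
    log_s = F.log_softmax(logits, dim=1)                        # log s_i
    y = F.one_hot(labels, num_classes=k).float()                # one-hot label
    p = F.one_hot(logits.argmax(dim=1), num_classes=k).float()  # one-hot prediction
    lam = lam.unsqueeze(1)                                      # [batch, 1]
    target = (1.0 - lam) * y + lam * p                          # dynamic balance
    per_sample = -(target * log_s).sum(dim=1)
    return per_sample.mean(), per_sample   # mean for backprop, per-sample for ranking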
In this embodiment, the cross-entropy loss boundaries between in-distribution noise samples, out-of-distribution noise samples and clean samples are not always well defined, so the three cannot be well distinguished from one another in every scenario. However, the prediction confidence of out-of-distribution noise samples during training is consistently lower than that of clean samples and in-distribution noise samples, and this phenomenon provides a valuable clue for distinguishing out-of-distribution noise from the other samples. The pattern of prediction changes of each sample during training thus suggests a feasible sample selection strategy, namely driving sample selection by measuring the historical prediction confidence. Adopting this strategy is more effective than simply using the cross-entropy loss, and clean samples and in-distribution noise samples can be fully utilized while out-of-distribution noise is identified.
In order to introduce this strategy smoothly, warm-up training needs to be adopted at the initial stage of the algorithm, i.e., in the first few rounds of training. A deep neural network tends to fit clean and simple samples first and only then starts to adapt to difficult and noisy samples. Inspired by this observation, during the first T_w rounds all training samples D are used and a warm-up strategy is employed to train the target neural network; the warm-up training process does not include any loss correction or sample selection, and the cross-entropy loss used for this training is shown in formula (6.2):

L_CE = -(1/N) · Σ_{i=1}^{N} y_i · log s_i    (6.2)

where y_i is the one-hot label of sample x_i; the cross-entropy loss L_CE in equation (6.2) is used to update the neural network only during the warm-up stage. In the formula, s_i is the output vector of the last softmax layer, calculated by equation (6.3):

s_i = softmax(z_i),  with  s_i^j = exp(z_i^j) / Σ_{c=1}^{k} exp(z_i^c)    (6.3)

where f(·; θ) is the mapping function of the neural network with parameters θ and z_i = f(x_i; θ) is the output of the fully connected layer preceding the last softmax layer. The inference result p_i corresponding to each sample image x_i can then be calculated by equation (6.4):

p_i = argmax_j s_i^j    (6.4)
during the whole training process (including the preheating training process), each sample image in the training set is recorded and updated
Figure 371039DEST_PATH_IMAGE002
In the near field
Figure 715432DEST_PATH_IMAGE006
Prediction in round training, note
Figure DEST_PATH_IMAGE080
. After the warm-up training phase is over, a prediction is performed on all samples of training set D using the trained neural network and then a historical prediction sequence is established and updated
Figure 354355DEST_PATH_IMAGE005
. The PFL loss for all training samples of training set D was calculated using the average equation (6.15). After completion of a round of training, PFL loss ranking can be selected
Figure 329265DEST_PATH_IMAGE010
Form a new training sample set
Figure 14324DEST_PATH_IMAGE009
And performing updating of the neural network model, selecting no useful sample in the mini-batch by the method, and selecting a sample participating in training on the whole training set by the algorithm so as to reduce the influence caused by the unbalanced noise distribution phenomenon across the mini-batch. To sum up, after the preheating trainingNewly selected training sample set for each round of training
Figure 365671DEST_PATH_IMAGE009
The generation process of (2) is shown in formula (6.5):
Figure 186996DEST_PATH_IMAGE028
(6.5)
wherein
Figure 598386DEST_PATH_IMAGE029
Representing a sample
Figure 505162DEST_PATH_IMAGE002
The clean samples and the distributed intra-noise samples are selected by equation (6.5) into the training set of the current round of training in the global sample selection phase. To avoid false exclusion of useful samples, the samples with high loss based on the prediction stability metric are only excluded from the training set in the current round of training, but the normalized prediction confidence of all samples is recalculated in the next round of global sample selection
Figure 394621DEST_PATH_IMAGE059
And updating the historical prediction sequence
Figure 336032DEST_PATH_IMAGE005
The process is shown in formula (6.6):
Figure 652744DEST_PATH_IMAGE031
(6.6)。
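A minimal sketch of the global sample selection of equation (6.5) and the history update of equation (6.6), assuming per-sample PFL losses have already been computed over the whole training set; drop_ratio corresponds to δ and history_len to l. These names are illustrative, not from the patent.

import numpy as np
from collections import deque

def select_training_subset(per_sample_losses, drop_ratio):
    """Equation (6.5): keep the (1 - delta) * N samples with the smallest PFL
    loss for this round; the rest are treated as out-of-distribution noise."""
    n_keep = int(round((1.0 - drop_ratio) * len(per_sample_losses)))
    order = np.argsort(per_sample_losses)       # ascending loss
    return order[:n_keep]                       # indices forming D'

def update_history(history, new_prediction, history_len):
    """Equation (6.6): append this round's prediction, keeping only the
    most recent history_len entries of P_i."""
    if not isinstance(history, deque):
        history = deque(history, maxlen=history_len)
    history.append(new_prediction)
    return history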
the loss correction based on prediction confidence and global sample selection algorithm is as follows:
inputting training sample set D training turns
Figure DEST_PATH_IMAGE081
Preheating training turns
Figure DEST_PATH_IMAGE082
Sample noise ratio
Figure DEST_PATH_IMAGE083
Length of history recorded
Figure DEST_PATH_IMAGE084
for
Figure DEST_PATH_IMAGE085
do
for training each training sample in the data set D
Figure 184832DEST_PATH_IMAGE002
do
Calculate the current sample according to equation (6.4)
Figure 81244DEST_PATH_IMAGE002
Predicted result of (2)
Figure 877161DEST_PATH_IMAGE004
if
Figure DEST_PATH_IMAGE086
then
Predicting the result of each sample after the training of the current round
Figure 99195DEST_PATH_IMAGE004
Adding to historical prediction result sequences
Figure 980563DEST_PATH_IMAGE005
To the end of (1);
else
using prediction results of current individual samples
Figure 211825DEST_PATH_IMAGE004
Replacing historical predictor sequences
Figure 862249DEST_PATH_IMAGE005
Of the earliest prediction record, guaranteed sequenceThe length is not more than
Figure 786343DEST_PATH_IMAGE084
end
if
Figure DEST_PATH_IMAGE087
then
Calculating the Cross entropy loss according to equation (6.2)
Figure DEST_PATH_IMAGE088
According to cross entropy loss
Figure 89760DEST_PATH_IMAGE088
The gradient is calculated and the update network is propagated backwards.
else
Calculate the normalized predicted stability of the samples in D according to equation (6.11)
Figure DEST_PATH_IMAGE089
Calculating dynamic loss based on predicted stability according to equation (6.15)
Figure DEST_PATH_IMAGE090
Selection according to equation (6.5)
Figure DEST_PATH_IMAGE091
The samples form an actual training set of the training round
Figure 531237DEST_PATH_IMAGE009
Calculating the corrected loss value according to equation (6.6)
Figure 36168DEST_PATH_IMAGE088
According to the corrected loss value
Figure 396742DEST_PATH_IMAGE088
Reverse directionSpreading;
End;
and (3) outputting: for training the loss of back propagation.
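Putting the pieces together, the algorithm above might be organized in Python roughly as follows. This skeleton assumes the helper functions sketched earlier (prediction_histogram, normalized_prediction_confidence, pfl_loss, select_training_subset, update_history) and a standard PyTorch model and optimizer; every name is illustrative, and details such as data loading, the two-branch backbone, device placement and logging are omitted.

import numpy as np
import torch
import torch.nn.functional as F

def train(model, optimizer, loader, num_classes, t_max, t_w, drop_ratio, history_len):
    histories = {}                                   # sample index -> recent predicted labels
    for epoch in range(1, t_max + 1):
        # (a) inference pass: record this round's predictions (eq. 6.4) into P_i (eq. 6.6)
        #     and, after warm-up, the per-sample PFL loss used for selection
        model.eval()
        sample_ids, sample_losses = [], []
        with torch.no_grad():
            for idx, images, labels in loader:       # loader yields (sample index, image, label)
                logits = model(images)
                preds = logits.argmax(dim=1)
                for i, p in zip(idx.tolist(), preds.tolist()):
                    histories[i] = update_history(histories.get(i, []), p, history_len)
                if epoch > t_w:
                    lam = torch.tensor(
                        [normalized_prediction_confidence(
                             prediction_histogram(histories[i], num_classes), history_len)
                         for i in idx.tolist()], dtype=torch.float32)
                    _, per_sample = pfl_loss(logits, labels, lam)
                    sample_ids.extend(idx.tolist())
                    sample_losses.extend(per_sample.tolist())

        # (b) warm-up epochs: plain cross entropy on all samples (eq. 6.2)
        model.train()
        if epoch <= t_w:
            for idx, images, labels in loader:
                loss = F.cross_entropy(model(images), labels)
                optimizer.zero_grad(); loss.backward(); optimizer.step()
            continue

        # (c) global sample selection over the whole training set (eq. 6.5)
        kept = select_training_subset(np.asarray(sample_losses), drop_ratio)
        kept_ids = {sample_ids[j] for j in kept}

        # (d) update only on D', with the dynamically corrected PFL loss (eq. 6.15)
        for idx, images, labels in loader:
            mask = torch.tensor([i in kept_ids for i in idx.tolist()])
            if not mask.any():
                continue
            lam = torch.tensor(
                [normalized_prediction_confidence(
                     prediction_histogram(histories[i], num_classes), history_len)
                 for i in idx.tolist()], dtype=torch.float32)
            _, per_sample = pfl_loss(model(images), labels, lam)
            loss = per_sample[mask].mean()
            optimizer.zero_grad(); loss.backward(); optimizer.step()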
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes in the embodiments and/or modifications of the invention can be made, and equivalents and modifications of some features of the invention can be made without departing from the spirit and scope of the invention.

Claims (10)

1. A fine-grained classification denoising training method based on prediction confidence, characterized by comprising the following steps:
S1, first letting all training samples participate in warm-up training, and recording the most recent predictions of each sample as its historical prediction set;
S2, generating a normalized prediction confidence for each sample from a histogram built from its historical prediction set, specifically:
S21, calculating, by a formula, a histogram of the predicted labels over the total number of predictions;
S22, inferring the confidence of the correct label from the historical prediction results;
S23, performing a normalization operation on the basis of the entropy of the histogram;
and S3, using the normalized prediction confidence to balance the weights of the sample label and the sample prediction, thereby dynamically correcting the loss value.
2. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 1, wherein in S1, training first goes through warm-up training; after the warm-up training process is completed, inference is performed on each sample x_i (i = 1, ..., N) of the training set D to obtain a prediction result, N being the number of samples of the data set; the inference process obtains a Softmax probability distribution vector from the output of a backbone network composed of two convolutional neural networks, and then computes the prediction result p_i; P_i denotes the historical prediction sequence of sample image x_i over training, and λ_i denotes the prediction confidence of each sample image x_i; the dynamic correction loss based on the prediction confidence λ_i balances the weights of the one-hot encoded label and the prediction result and computes the cross entropy with the neural network output, yielding the corrected loss value.
3. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 2, wherein, during the first T_w rounds, all training samples D are used and a warm-up strategy is employed to train the target neural network, the cross-entropy loss formula used for training being:

L_CE = -(1/N) · Σ_{i=1}^{N} y_i · log s_i    (6.2)

where y_i is the one-hot label of sample x_i, and the cross-entropy loss L_CE in equation (6.2) updates the neural network in the warm-up stage;
in the formula, s_i is the output vector of the last softmax layer, calculated by equation (6.3):

s_i = softmax(z_i),  with  s_i^j = exp(z_i^j) / Σ_{c=1}^{k} exp(z_i^c)    (6.3)

where f(·; θ) is the mapping function of the neural network with parameters θ, z_i = f(x_i; θ) is the output of the fully connected layer preceding the last softmax layer, and k is the number of classes of the data set; the inference result p_i corresponding to sample image x_i is calculated by:

p_i = argmax_j s_i^j    (6.4)

during the whole training process, the predictions of each sample image x_i in the training set over the most recent l rounds of training are recorded and updated, denoted P_i = {p_i^{T-l+1}, ..., p_i^{T}}, where p_i^T is the label the network predicts for sample image x_i in the T-th (i.e., current) round of training.
4. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 3, wherein, after the warm-up training phase is finished, a prediction is performed on the samples in the training set D using the trained neural network, and the historical prediction sequence P_i is then established and updated; the PFL loss of all training samples in D is calculated using equation (6.15); after each round of training, a ratio δ (%) is used to control the number of discarded samples, i.e., the proportion of samples judged to be out-of-distribution noise in the total number of samples, and the n training samples with the smallest PFL loss are selected to form a new training sample set D' used to update the neural network model; after the warm-up training, the newly selected training sample set D' for each round of training is generated as shown in formula (6.5):

D' = { (x_i, y_i) ∈ D : L_PFL(x_i) is among the n smallest losses in D }    (6.5)

in the global sample selection stage, formula (6.5) selects the clean samples and the in-distribution noise samples into the training set of the current round; the normalized prediction confidence λ_i of all samples is recalculated in the next round of global sample selection, and the historical prediction sequence P_i is updated, the process being:

P_i ← { p_i^{T-l+1}, ..., p_i^{T} }    (6.6)

where y_i is the label corresponding to sample image x_i.
5. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 3, wherein in S21, h_i^j denotes the frequency with which sample x_i is predicted as label j in its historical prediction sequence P_i, with h_i^j ∈ [0, 1] and j = 1, ..., k, where k is the number of categories of the data set; the histogram h_i can be calculated by equation (6.1):

h_i^j = (1/|P_i|) · Σ_{p ∈ P_i} 1[p = j],  j = 1, ..., k    (6.1)

where p_i is the prediction result of the sample image, |P_i| is the size of its historical prediction result set, i.e., the total number of predictions, and h_i is the histogram of the predicted labels over the total number of predictions.
6. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 5, wherein in S22, the frequency with which a sample label appears in the prediction history is statistically positively correlated with the probability that the label is the true label; this likelihood that a prediction belongs to the true label is defined as the "confidence" with which the correct label is inferred from historical predictions;
the concept of entropy matches this notion of confidence and can be used to express the uncertainty of the prediction results of each sample image x_i, in the form:

u(P_i) = - Σ_{j=1}^{k} h_i^j · log h_i^j    (6.7)

where h_i^ŷ denotes in particular the frequency in the historical prediction sequence P_i of the most frequently predicted label ŷ; the histogram shape of the prediction history reflects how uncertain the inference history is about the label attribution, and the following formula describes the case where the uncertainty of the prediction history is greatest, i.e., each h_i^j is smallest:

h_i^j = 1 / min(k, l)    (6.8)

where k is the total number of label classes in the data set and l is the length of the prediction history sequence; in commonly used web-image fine-grained classification data sets, the number of label categories is far greater than the configured length of the historical prediction record, i.e., k >> l; therefore min(k, l) = l, and the maximum uncertainty of the historical prediction can be calculated as:

u_max = - l · (1/l) · log(1/l) = log l    (6.9)
7. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 6, wherein in S23, a normalization operation is performed on the basis of the entropy; given the calculation of the maximum historical prediction uncertainty, a normalized historical prediction uncertainty ū(P_i) is defined so that its value always lies in [0, 1], which makes it convenient to measure; ū(P_i) takes the form:

ū(P_i) = u(P_i) / u_max = u(P_i) / log l    (6.10)

the prediction confidence of each input sample is calculated and updated from its prediction history, and the normalized prediction confidence λ_i of sample image x_i is defined by formula (6.11):

λ_i = 1 - ū(P_i)    (6.11)
8. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 7, wherein in S3, a "loss correction" method is adopted for processing noise samples;
the basic loss function of the neural network is the cross entropy, and adding an additional target to it can be done in two basic ways: one is a soft scheme, the other a hard scheme; the detailed form of the soft scheme is:

L_soft = - Σ_{j=1}^{L} [β · t_j + (1 - β) · q_j] · log q_j    (6.12)

where q is the prediction vector output by the neural network, t is the noisy label vector, L is the total number of label categories, and β is a hyper-parameter whose value lies in [0, 1]; corresponding to the "soft scheme" is the hard scheme, which changes the regression target to the class with the maximum posterior probability given the current prediction q, denoted z_j = 1 if j = argmax_c q_c and z_j = 0 otherwise; its formula is:

L_hard = - Σ_{j=1}^{L} [β · t_j + (1 - β) · z_j] · log q_j    (6.13)

after a loss correction method is selected, the network parameters are updated using a stochastic gradient descent optimizer.
9. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 6, wherein the neural network is updated with a loss function and, when compensation is introduced, formula (6.2) can be rewritten as:

L = -(1/N) · Σ_{i=1}^{N} [β · y_i + (1 - β) · p_i] · log s_i    (6.14)

where y_i is the label corresponding to sample image x_i, p_i is its prediction result, s_i is the output of the last softmax layer defined by equation (6.3), and the parameter β is set to a fixed value, e.g., β = 0.8, to statically balance the label and the prediction result.
10. The fine-grained classification denoising training method based on prediction confidence as claimed in claim 9, wherein the normalized prediction confidence λ_i defined by formula (6.11) is used to dynamically determine the weight relationship between each sample label and the prediction result, so as to dynamically correct the loss value;
formula (6.14) is rewritten into a dynamic loss based on prediction uncertainty by introducing formula (6.11), specifically:

L_PFL = -(1/N) · Σ_{i=1}^{N} [(1 - λ_i) · y_i + λ_i · p_i] · log s_i    (6.15)

where the normalized prediction confidence λ_i dynamically adjusts the degree of compensation of the current loss value; formula (6.15) is referred to as the dynamic loss based on prediction confidence, abbreviated as the PFL loss.
CN202211452486.3A 2022-11-21 2022-11-21 Fine-grained classification denoising training method based on prediction confidence Pending CN115661549A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211452486.3A CN115661549A (en) 2022-11-21 2022-11-21 Fine-grained classification denoising training method based on prediction confidence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211452486.3A CN115661549A (en) 2022-11-21 2022-11-21 Fine-grained classification denoising training method based on prediction confidence

Publications (1)

Publication Number Publication Date
CN115661549A true CN115661549A (en) 2023-01-31

Family

ID=85017297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211452486.3A Pending CN115661549A (en) 2022-11-21 2022-11-21 Fine-grained classification denoising training method based on prediction confidence

Country Status (1)

Country Link
CN (1) CN115661549A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567719A (en) * 2023-07-05 2023-08-08 北京集度科技有限公司 Data transmission method, vehicle-mounted system, device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111861909A (en) * 2020-06-29 2020-10-30 南京理工大学 Network fine-grained image denoising and classifying method
CN112232407A (en) * 2020-10-15 2021-01-15 杭州迪英加科技有限公司 Neural network model training method and device for pathological image sample
CN114190950A (en) * 2021-11-18 2022-03-18 电子科技大学 Intelligent electrocardiogram analysis method and electrocardiograph for containing noise label

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111861909A (en) * 2020-06-29 2020-10-30 南京理工大学 Network fine-grained image denoising and classifying method
CN112232407A (en) * 2020-10-15 2021-01-15 杭州迪英加科技有限公司 Neural network model training method and device for pathological image sample
CN114190950A (en) * 2021-11-18 2022-03-18 电子科技大学 Intelligent electrocardiogram analysis method and electrocardiograph for containing noise label

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567719A (en) * 2023-07-05 2023-08-08 北京集度科技有限公司 Data transmission method, vehicle-mounted system, device and storage medium
CN116567719B (en) * 2023-07-05 2023-11-10 北京集度科技有限公司 Data transmission method, vehicle-mounted system, device and storage medium

Similar Documents

Publication Publication Date Title
CN110321926B (en) Migration method and system based on depth residual error correction network
Yoon et al. Data valuation using reinforcement learning
CN113688949B (en) Network image data set denoising method based on dual-network joint label correction
CN111832627A (en) Image classification model training method, classification method and system for suppressing label noise
CN111784595B (en) Dynamic tag smooth weighting loss method and device based on historical record
WO2019091402A1 (en) Method and device for age estimation
CN113221903B (en) Cross-domain self-adaptive semantic segmentation method and system
CN115661549A (en) Fine-grained classification denoising training method based on prediction confidence
CN110110372B (en) Automatic segmentation prediction method for user time sequence behavior
CN112990385A (en) Active crowdsourcing image learning method based on semi-supervised variational self-encoder
CN113537630A (en) Training method and device of business prediction model
CN111105241A (en) Identification method for anti-fraud of credit card transaction
CN110059251B (en) Collaborative filtering recommendation method based on multi-relation implicit feedback confidence
CN110310199B (en) Method and system for constructing loan risk prediction model and loan risk prediction method
Ji et al. How to handle noisy labels for robust learning from uncertainty
JP2021149842A (en) Machine learning system and machine learning method
Li et al. Inter-domain mixup for semi-supervised domain adaptation
Peng et al. FaxMatch: Multi‐Curriculum Pseudo‐Labeling for semi‐supervised medical image classification
CN112541010A (en) User gender prediction method based on logistic regression
CN116486150A (en) Uncertainty perception-based regression error reduction method for image classification model
CN115829693A (en) Contextual slot machine delay feedback recommendation method and system based on causal counterfactual
Donini et al. An efficient method to impose fairness in linear models
CN113627538B (en) Method for training asymmetric generation of image generated by countermeasure network and electronic device
CN115861625A (en) Self-label modifying method for processing noise label
CN113191984B (en) Deep learning-based motion blurred image joint restoration and classification method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230131

RJ01 Rejection of invention patent application after publication