CN116051880A - Result prediction method based on uncertainty evaluation under label noise - Google Patents

Result prediction method based on uncertainty evaluation under label noise

Info

Publication number
CN116051880A
Authority
CN
China
Prior art keywords
training data
initial
training
uncertainty
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210529489.6A
Other languages
Chinese (zh)
Inventor
潘超
袁博
姚新
周维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Southwest University of Science and Technology
Original Assignee
Huawei Technologies Co Ltd
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, Southwest University of Science and Technology filed Critical Huawei Technologies Co Ltd
Priority to CN202210529489.6A
Publication of CN116051880A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a result prediction method based on uncertainty evaluation under label noise, which comprises the following steps: acquiring real scene data, wherein part of the data in the real scene data is distributed differently from the training data; inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights; and obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result. According to the embodiment of the invention, weights are added to the training data in the training stage and the training samples are screened according to the weights, so that the influence of label noise on the prediction uncertainty is eliminated without affecting the generalization of the deep neural network model, and the prediction result is more accurate.

Description

Result prediction method based on uncertainty evaluation under label noise
Technical Field
The invention relates to the technical field of machine learning, in particular to a result prediction method based on uncertainty evaluation under label noise.
Background
In the field of machine learning, the data used to train a model is generally referred to as In-distribution (ID) data, while data whose distribution is inconsistent with the training ID data is referred to as Out-of-distribution (OOD) data. In a real application scenario, the training data is limited and cannot cover all possible situations, so OOD data may appear at deployment time. For example, an elephant on the road or an overturned truck is typically absent from the training ID data but may well appear in a real scene.
Deep learning, represented by deep neural networks, has achieved excellent results in recent years and has been widely applied in many fields. In tasks such as image classification and speech recognition, neural networks can achieve excellent results when the input data follows the same distribution as the training ID data, but they often cannot recognize when their predictions may be erroneous. More seriously, the predictions of a neural network model are not trustworthy in the presence of OOD data: the model cannot provide correct predictions for OOD data, yet it does not assign them a low prediction confidence; instead, it usually misclassifies OOD data as some label of the ID data with high confidence.
Deep neural networks perform well at giving accurate predictions, but the lack of knowledge about the reliability of a given output is a major obstacle to deploying such models in safety-critical applications. In fields with high safety requirements, such as medical diagnosis and automatic driving, a neural network model needs to accurately evaluate how certain its predictions are, i.e., to quantify the prediction uncertainty. Furthermore, the model should give high confidence to data it predicts correctly, low confidence to data it predicts incorrectly, and low confidence to OOD data that it cannot predict because the data is inconsistent with the training distribution.
Various methods have recently been proposed to quantify prediction uncertainty in deep neural networks. These methods can be used for misclassification detection and OOD data detection of prediction results, and they perform well when the training data contains no label noise. In practical application scenarios, however, various labeling errors exist in the training data; even widely used datasets contain a small proportion of label noise. In this setting, existing prediction uncertainty quantification methods are not robust enough: even a small proportion of label noise greatly reduces the effectiveness of the prediction uncertainty for misclassification detection and OOD data detection.
Accordingly, there is a need for improvement and development in the art.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the invention is to provide uncertainty evaluation under label noise, aiming at the problem that prior-art prediction uncertainty quantification methods are not robust enough: even with a small proportion of label noise, the effectiveness of the prediction uncertainty for misclassification detection and OOD data detection is greatly reduced.
The technical scheme adopted by the invention for solving the problems is as follows:
in a first aspect, an embodiment of the present invention provides a method for predicting a result based on uncertainty evaluation under label noise, where the method includes:
acquiring real scene data; wherein part of the data in the real scene data is distributed differently from training data;
inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights;
And obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result.
In one implementation, the training process of the deep neural network model includes:
acquiring an initial training data set containing label noise; the label noise is noise caused by labeling training data with wrong labels;
inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, taking the intermediate training data set as the initial training data set, and repeating the step of inputting the initial training data set into the initial deep neural network model and the function model;
and stopping iterative training when the initial deep neural network model meets the preset training conditions, and obtaining a trained deep neural network model.
In one implementation, the step of inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, using the intermediate training data set as an initial training data set, and repeatedly performing the step of inputting the initial training data set into an initial deep neural network model and the function model includes:
For the t-th iterative training, acquiring a loss function value set corresponding to the initial training data set in the t-th iterative training and a second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, obtaining a training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set, and weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set; the training data weight set is used for representing the importance degree of training data in the initial deep neural network model; wherein t is greater than or equal to 1;
and inputting the intermediate training data set into an initial deep neural network model and the function model, and performing t+1st iteration training.
In one implementation, the initial training data set includes a number of initial training data; the obtaining the loss function value set corresponding to the initial training data set in the t-th iterative training and the second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, and obtaining the training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set includes:
Acquiring, for each initial training data, a loss function value corresponding to the initial training data in the t-th iterative training; inputting the initial training data into the initial deep neural network model multiple times to obtain a plurality of output results, and performing variance calculation on the plurality of output results based on the function model to obtain a second uncertainty prediction value corresponding to the initial training data; subtracting the loss function value from an obtained preset first value to obtain a difference value; multiplying the difference value by the second uncertainty prediction value to obtain a first product; taking the first product as an exponent and an obtained preset second value as the base to obtain an exponentiation result; taking the exponentiation result as the training data weight corresponding to the initial training data in the t-th iterative training;
and taking the training data weights corresponding to all the initial training data in the initial training data set as a training data weight set corresponding to the initial training data set.
In one implementation, the weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set includes:
Acquiring preset total iterative training times and the number of all initial training data in the initial training data set;
multiplying t by the number to obtain a second product;
dividing the second product by the total iterative training times to obtain a quotient;
and obtaining an intermediate training data set according to the quotient and the training data weight set.
In one implementation, the obtaining the intermediate training data set according to the quotient and the training data weight set includes:
searching the training data weight set for the weight corresponding to the quotient to obtain a reference weight;
resetting to 0 the training data weights in the training data weight set that are less than the reference weight, to obtain an intermediate training data set.
In one implementation, the obtaining the target prediction result according to the first uncertainty prediction value and the initial prediction result includes:
and when the first uncertainty prediction value is smaller than a preset threshold value, taking the initial prediction result as a target prediction result.
In a second aspect, an embodiment of the present invention further provides a result prediction apparatus based on uncertainty evaluation under label noise, where the apparatus includes: a real scene data acquisition module, configured to acquire real scene data, wherein part of the data in the real scene data is distributed differently from training data;
The prediction module is used for inputting the real scene data into the trained deep neural network model and a preset function model to obtain a first uncertainty predicted value and an initial predicted result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to training data in each iterative training process, and the training data are screened according to the weights;
and the target prediction result acquisition module is used for acquiring a target prediction result according to the first uncertainty prediction value and the initial prediction result.
In a third aspect, an embodiment of the present invention further provides an intelligent terminal, including a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the result prediction method based on uncertainty evaluation under label noise according to any one of the above.
In a fourth aspect, embodiments of the present invention further provide a non-transitory computer-readable storage medium storing instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the result prediction method based on uncertainty evaluation under label noise according to any one of the above.
The invention has the following beneficial effects: the embodiment of the invention first acquires real scene data, part of which is distributed differently from the training data; the real scene data is then input into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, where the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights; finally, a target prediction result is obtained according to the first uncertainty prediction value and the initial prediction result. In this way, by adding weights to the training data in the training stage and screening the training samples according to the weights, the embodiment of the invention eliminates the influence of label noise on the prediction uncertainty without affecting the generalization of the deep neural network model, so that the prediction result is more accurate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to the drawings without inventive effort to those skilled in the art.
Fig. 1 is a schematic flow chart of a result prediction method based on uncertainty evaluation under label noise provided by an embodiment of the invention.
Fig. 2 is a framework diagram of a result prediction method based on uncertainty evaluation under label noise according to an implementation manner provided by an embodiment of the present invention.
Fig. 3 is a schematic flow chart of a result prediction method based on uncertainty evaluation under label noise of an implementation manner according to an embodiment of the present invention.
Fig. 4 is a schematic block diagram of a result prediction apparatus based on uncertainty evaluation under label noise provided by an embodiment of the present invention.
Fig. 5 is a schematic block diagram of an internal structure of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
The invention discloses a result prediction method based on uncertainty evaluation under label noise. In order to make the purposes, technical schemes and effects of the invention clearer and more definite, the invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the prior art, one common approach to evaluating the prediction uncertainty of neural networks relies on Bayesian methods (e.g., variational Bayes or Markov chain Monte Carlo), whose core is to compute the posterior distribution of the neural network parameters. However, exact Bayesian inference is often intractable, so the posterior can only be computed approximately, and approximate Bayesian inference methods are typically used in practical scenarios.
Monte-Carlo Dropout (MC-Dropout) may be interpreted as an ensemble with shared network parameters, or as approximate Bayesian inference. The dropout technique randomly discards a proportion of neurons during training, acting as regularization to prevent overfitting. MC-Dropout also applies dropout during inference to estimate the predictive distribution. With dropout turned on, the result of each prediction is not fixed, and each forward pass can be regarded as sampling from the posterior distribution of the network weights. The mean and variance of multiple random forward passes give the prediction result and the prediction uncertainty, respectively.
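As a hedged illustration of this evaluation scheme, the following minimal PyTorch-style sketch assumes a classifier containing dropout layers; the function and argument names are illustrative, not part of this disclosure:

```python
# Minimal MC-Dropout sketch (PyTorch); all names here are illustrative
# assumptions, not part of the patent disclosure.
import torch

def mc_dropout_predict(model: torch.nn.Module, x: torch.Tensor,
                       n_passes: int = 20):
    """Mean and variance of multiple stochastic forward passes."""
    model.train()  # keep dropout layers active during inference
    with torch.no_grad():
        outputs = torch.stack([torch.softmax(model(x), dim=-1)
                               for _ in range(n_passes)])
    return outputs.mean(dim=0), outputs.var(dim=0)  # prediction, uncertainty
```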
When the training data contains no label noise, the prediction uncertainty obtained by MC-Dropout can be effectively used for misclassification detection and OOD data detection. However, when label noise exists in the training data, the performance of MC-Dropout on misclassification detection and OOD detection may be significantly degraded; even a small proportion of label noise has a large impact on detection performance. Because real-scene annotation inevitably contains some labeling errors and label noise, the applicability of MC-Dropout is greatly limited.
In order to solve the problems in the prior art, this embodiment provides a result prediction method based on uncertainty evaluation under label noise: weights are added to the training data during the training stage, training samples are screened according to the weights, and the influence of label noise on the prediction uncertainty is eliminated without harming the generalization of the deep neural network model, so that the prediction result is more accurate. In a specific implementation, real scene data is first acquired, part of which is distributed differently from the training data; the real scene data is then input into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, where the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights; finally, a target prediction result is obtained according to the first uncertainty prediction value and the initial prediction result.
Exemplary method
The embodiment provides a result prediction method based on uncertainty evaluation under label noise, which can be applied to an intelligent terminal for machine learning. As specifically shown in figs. 1-2, the method comprises:
step S100, acquiring real scene data; wherein part of the data in the real scene data is distributed differently from the training data;
Specifically, real scene data is acquired, part of which is distributed differently from the training data; that is, the real scene data includes Out-of-distribution (OOD) data whose distribution is inconsistent with that of the In-distribution (ID) training data.
After the real scene data is obtained, step S200 shown in fig. 1 may be performed: inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights;
Specifically, the trained deep neural network model is obtained based on a plurality of iterative training rounds in which weights are added to the training data and the training data are screened according to the weights, so that the influence of label noise on the prediction uncertainty is eliminated without harming the generalization of the trained deep neural network model. The real scene data is input into the trained deep neural network model, which outputs an initial prediction result; the real scene data is input into the trained deep neural network model multiple times, and a first uncertainty prediction value is obtained by performing variance calculation on the multiple initial prediction results based on the function model (MC-Dropout).
In one implementation, the training process of the deep neural network model includes the following steps: acquiring an initial training data set containing label noise, where the label noise is noise caused by labeling training data with wrong labels; inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, taking the intermediate training data set as the initial training data set, and repeating the step of inputting the initial training data set into the initial deep neural network model and the function model; and stopping the iterative training when the initial deep neural network model meets a preset training condition, obtaining a trained deep neural network model.
In particular, the performance of the widely used MC-Dropout-based uncertainty evaluation method can be significantly reduced, even to a completely random level, when a proportion of label noise is present in the training data. The technical problem to be solved by the invention is an uncertainty evaluation method that is robust to training data containing label noise, so as to realize accurate result prediction. The invention acquires an initial training data set containing label noise in preparation for subsequent training.
In this embodiment, the function model is MC-Dropout. That is, in a scenario where the initial training data set contains label noise, the method is based on the MC-Dropout framework. When the initial training data set is input to the initial deep neural network model once, there is one model output; when the same initial training data is input to the initial deep neural network model multiple times, the variance of the multiple model outputs, i.e., the second uncertainty prediction value, can be obtained through MC-Dropout. All the initial training data are input to the initial deep neural network model, and the second uncertainty prediction value set corresponding to the initial training data set is obtained through MC-Dropout. In order to further eliminate the influence of the label noise in the initial training data, a weight set is obtained according to the second uncertainty prediction value set; the weight set may be computed mathematically from the loss function value set together with the second uncertainty prediction value set, or by other methods, which are not specifically limited here. After the weight set is obtained, the initial training data set is weighted according to the weight set to obtain an updated initial training data set, namely an intermediate training data set. The intermediate training data set is then taken as the initial training data set and input into the deep neural network model again, and the step of inputting the initial training data set into the initial deep neural network model and the function model to obtain the corresponding second uncertainty prediction value set is repeated. That is, the invention adds weights to the training samples during model training and screens the samples according to the weights, so that the influence of label noise on the prediction uncertainty is eliminated without harming the generalization of the model.
In this embodiment, when the number of training iterations t of the initial deep neural network model reaches a preset maximum, for example T rounds, the iterative training is stopped, and the trained deep neural network model obtained at this point can be used in actual application scenarios.
In one implementation, the step of inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, using the intermediate training data set as an initial training data set, and repeatedly performing the step of inputting the initial training data set into the initial deep neural network model and the function model includes the following steps: for the t-th iterative training, acquiring a loss function value set corresponding to the initial training data set in the t-th iterative training and a second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, obtaining a training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set, and weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set; the training data weight set is used for representing the importance degree of training data in the initial deep neural network model; wherein t is greater than or equal to 1; and inputting the intermediate training data set into an initial deep neural network model and the function model, and performing t+1st iteration training.
In particular, in fields with high safety requirements, there is no way to ensure that the labels in the training data are completely correct. In order to guarantee the validity of the prediction uncertainty computed by the neural network model, the scheme shown in the flowchart of fig. 3 can be adopted. During training, the loss function value differs across samples. A small loss function value indicates that the sample is easy to learn. Samples with large loss function values fall into two classes: data on the decision boundary between different categories, and data with wrong labels; both are difficult to learn. It follows that a sample with a large loss function value may be either a mislabeled sample or a sample at a decision boundary. Mislabeled samples are detrimental to model training, while samples on the decision boundary contain more information and are key to model optimization. Samples with small loss function values typically contain no label noise and should be given higher weight during training. However, if training is performed using only samples with small loss function values (most of which have correct labels), the generalization of the resulting model is poor, because the number of samples is small, the samples are not at decision boundaries, and the amount of information they contain is small. Therefore, in addition to the loss function value of the sample, other factors should be considered when weighting the training samples, such as the prediction uncertainty, which can be used for misclassification detection and OOD data detection. Samples with large prediction uncertainty may lie in two regions: on the decision boundary between different categories, or in the region outside the training data distribution where OOD data is located. The training data is generally regarded as ID data, so when training the neural network model, every training sample point is considered to lie in the region of the ID data rather than the region of the OOD data; hence training samples with large prediction uncertainty can only be samples on decision boundaries. These samples contain more information and need to be given higher weights. In addition, the training phase also affects the elimination of label noise. Although the impact of wrong labels on the generalization of the neural network model can already be greatly reduced after weighting the training samples with loss function values and prediction uncertainties, even mislabeled data given a small weight still affects the accuracy of the prediction uncertainty. Meanwhile, mislabeled data helps the neural network model learn data features in the early stage of training, improving the generalization of the model, but in the later stage of training the deep neural network model memorizes the wrong labels, degrading performance. Therefore, the weights of the training samples also need to be dynamically adjusted according to the training phase: as training proceeds, the weights of more samples that are likely mislabeled are reset to 0, i.e., the training data are filtered according to the weights.
Taking into account the effects of the three factors described above (the loss function value of the training sample, the prediction uncertainty, and the training phase), in this embodiment, for the t-th iterative training, i.e., in the t-th epoch, the loss function value set $\{\ell_i\}$ corresponding to the initial training data set $(x_i, y_i)$ (where $x_i$ is the initial training data, $y_i$ is the label corresponding to the initial training data, and $i$ indexes the i-th initial training data) in the t-th iterative training and the second uncertainty prediction value set $\{\mathrm{Uncertainty}(x_i)\}$ corresponding to the initial training data set in the t-th iterative training are acquired. The training data weight set corresponding to the initial training data set in the t-th iterative training is obtained according to the loss function value set and the second uncertainty prediction value set, and based on the training data weight set $\{w_i\}$ the training data may be further screened to obtain the final training data weight set; the initial training data set in the t-th iterative training is then weighted to obtain an intermediate training data set. The updated initial training data set (the intermediate training data set) is then input into the initial deep neural network model and the function model for the (t+1)-th iterative training. In this way, through repeated iterative training, the initial training data set is continuously screened, so that the neural network not only retains sufficiently good generalization but also removes samples that are more likely to be mislabeled, improving the effectiveness of the prediction uncertainty.
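For concreteness, a minimal sketch of this per-epoch weight computation follows; the model handle, tensor shapes, the summed per-class variance score, and the base-e exponent are assumptions of this illustration (the formula itself is detailed in the next passage, and the training-phase screening is sketched separately below):

```python
# Hedged sketch of the per-epoch weighting; names, shapes, and the
# summed-variance score are illustrative assumptions, not the patent's
# reference implementation.
import torch
import torch.nn.functional as F

def epoch_weights(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                  n_passes: int = 10) -> torch.Tensor:
    """Return one training data weight per sample for the current epoch."""
    with torch.no_grad():
        model.eval()   # deterministic pass for the per-sample loss values
        losses = F.cross_entropy(model(x), y, reduction="none")
        model.train()  # stochastic passes (dropout active) for uncertainty
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(n_passes)])
        uncertainty = probs.var(dim=0).sum(dim=-1)  # Uncertainty(x_i) per sample
    # Weight formula as described: w_i = e^{(0 - loss_i) * Uncertainty(x_i)}
    return torch.exp((0.0 - losses) * uncertainty)
```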
In one implementation, the initial training data set includes a number of initial training data for the t-th iterative training; the obtaining a loss function value set corresponding to the initial training data set in the t-th iterative training and a second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, and obtaining a training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set includes the following steps: acquiring, for each initial training data, a loss function value corresponding to the initial training data in the t-th iterative training; inputting the initial training data into the initial deep neural network model multiple times to obtain a plurality of output results, and performing variance calculation on the plurality of output results based on the function model to obtain a second uncertainty prediction value corresponding to the initial training data; subtracting the loss function value from an obtained preset first value to obtain a difference value; multiplying the difference value by the second uncertainty prediction value to obtain a first product; taking the first product as an exponent and an obtained preset second value as the base to obtain an exponentiation result; taking the exponentiation result as the training data weight corresponding to the initial training data in the t-th iterative training; and taking the training data weights corresponding to all the initial training data in the initial training data set as the training data weight set corresponding to the initial training data set.
Specifically, for each initial training data in the initial training data set, the loss function value $\ell_i = \mathcal{L}(y_i, \hat{y}_i)$ of the initial training data is calculated, where $y_i$ is the label of the initial training data and $\hat{y}_i$ is the output of the deep neural network. The initial training data is input into the initial deep neural network model multiple times to obtain a plurality of output results, and variance calculation is performed on the plurality of output results based on the function model to obtain the second uncertainty prediction value $\mathrm{Uncertainty}(x_i)$ of the initial training data. Assuming that the preset first value is 0 (other values may be used in practice, which is not limited here), the loss function value is subtracted from 0 to obtain the difference $-\ell_i$; the difference $-\ell_i$ is multiplied by the second uncertainty prediction value $\mathrm{Uncertainty}(x_i)$ to obtain the first product $-\ell_i \cdot \mathrm{Uncertainty}(x_i)$. Assuming that the preset second value is the natural constant, denoted by e, e is taken as the base and the first product $-\ell_i \cdot \mathrm{Uncertainty}(x_i)$ as the exponent, giving the training data weight $w_i = e^{-\ell_i \cdot \mathrm{Uncertainty}(x_i)}$. In this way, the training data weights corresponding to all the initial training data in the initial training data set are taken as the training data weight set $\{w_i\}$ corresponding to the initial training data set.
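A quick numeric check of this formula, using made-up values:

```python
# Hypothetical numbers, purely illustrative of w_i = e^{(0 - l_i) * U_i}.
import math

loss_i, uncertainty_i = 0.5, 0.2
w_i = math.exp((0.0 - loss_i) * uncertainty_i)
print(round(w_i, 3))  # 0.905: a small loss and low uncertainty keep w_i near 1
```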
In one implementation, for the t-th iterative training, the weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set includes the following steps: acquiring the preset total iterative training times and the number of all initial training data in the initial training data set; multiplying t by the number to obtain a second product; dividing the second product by the total iterative training times to obtain a quotient; searching the training data weight set for the weight corresponding to the quotient to obtain a reference weight; and resetting to 0 the training data weights in the training data weight set that are less than the reference weight, to obtain an intermediate training data set.
Specifically, the influence of label noise on the generalization of the neural network model can be eliminated after samples with wrong labels are given smaller weights, but the influence on the prediction uncertainty is still large, so as training proceeds, further samples with wrong labels need to be removed. In this embodiment, the preset total iterative training times T and the number N of all initial training data in the initial training data set are acquired; t is multiplied by the number N to obtain the second product $t \times N$, and the second product is divided by the total iterative training times to obtain the quotient $\frac{t \times N}{T}$. The weight corresponding to the quotient $\frac{t \times N}{T}$ is searched for in the training data weight set to obtain the reference weight $w$, and the training data weights in the training data weight set $\{w_i\}$ that are smaller than the reference weight $w$ are reset to 0, obtaining the intermediate training data set. Thus, the larger the training epoch, the more sample weights are set to 0. After the training samples are screened in this repeated iterative manner, the neural network can retain sufficient generalization while samples that are more likely to be mislabeled are removed, improving the effectiveness of the prediction uncertainty.
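A hedged sketch of this screening step follows; reading "the weight corresponding to the quotient" as the weight at rank tN/T in ascending order is an assumption of this illustration, not a claimed implementation:

```python
# Illustrative screening sketch; the ascending-rank reading of the quotient
# lookup is an assumption.
import torch

def filter_weights(weights: torch.Tensor, t: int, N: int, T: int) -> torch.Tensor:
    """Reset to 0 every weight below the reference weight at rank t*N/T."""
    k = max(1, min(N, int(t * N / T)))      # quotient t*N/T used as a 1-based rank
    reference = weights.kthvalue(k).values  # k-th smallest training data weight
    return torch.where(weights < reference,
                       torch.zeros_like(weights), weights)
```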
After obtaining the first uncertainty prediction value and the initial prediction result, the following step S300 shown in fig. 1-2 may be performed, where a target prediction result is obtained according to the first uncertainty prediction value and the initial prediction result.
Specifically, a preset condition may be set, and the initial prediction result whose corresponding first uncertainty prediction value meets the preset condition is output as the target prediction result. That is, after the trained deep neural network model is deployed in an actual scene, it gives an initial prediction result together with a first uncertainty prediction value; the first uncertainty prediction value can be used for misclassification detection and OOD data detection, and whether the initial prediction result is finally output depends on the first uncertainty prediction value.
In one implementation, step S300 includes the steps of:
and S301, when the first uncertainty prediction value is smaller than a preset threshold value, taking the initial prediction result as a target prediction result.
Specifically, when the first uncertainty prediction value is smaller than the preset threshold value, i.e., the uncertainty is low, the initial prediction result is taken as the target prediction result; when the first uncertainty prediction value is greater than or equal to the preset threshold value, i.e., the uncertainty is high, the model refuses to make a prediction and no target prediction result is given.
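A minimal sketch of this decision rule, where the threshold value and the abstain convention are illustrative assumptions:

```python
# Illustrative decision rule; the threshold value is a hypothetical setting.
from typing import Optional

def target_prediction(initial_result: int, uncertainty: float,
                      threshold: float = 0.1) -> Optional[int]:
    """Accept the initial prediction only when its uncertainty is low."""
    if uncertainty < threshold:
        return initial_result  # low uncertainty: output the target prediction
    return None                # high uncertainty: refuse to make a prediction
```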
The invention has the following advantages:
1. Training samples are dynamically weighted during the training process and screened according to the weights, eliminating the influence of label noise on the prediction uncertainty.
2. Samples with low loss function values are given higher weights.
3. Samples with high prediction uncertainty, as calculated based on MC-Dropout, are given higher weights.
4. As training progresses, more and more low-weight samples are removed to reduce the impact of label noise on the prediction uncertainty.
Exemplary apparatus
As shown in fig. 4, an embodiment of the present invention provides a result prediction apparatus based on uncertainty evaluation under label noise, which includes a real scene data acquisition module 401, a prediction module 402 and a target prediction result acquisition module 403, wherein: the real scene data acquisition module 401 is configured to acquire real scene data, part of which is distributed differently from the training data;
the prediction module 402 is configured to input the real scene data into a trained deep neural network model and a preset function model to obtain a first uncertainty prediction value and an initial prediction result, where the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights;
and a target prediction result obtaining module 403, configured to obtain a target prediction result according to the first uncertainty prediction value and the initial prediction result.
Based on the above embodiment, the present invention further provides an intelligent terminal, a functional block diagram of which may be shown in fig. 5. The intelligent terminal comprises a processor, a memory, a network interface, a display screen and a temperature sensor which are connected through a system bus. The processor of the intelligent terminal is used for providing computing and control capabilities. The memory of the intelligent terminal comprises a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer program in the nonvolatile storage medium. The network interface of the intelligent terminal is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the result prediction method based on uncertainty evaluation under label noise. The display screen of the intelligent terminal can be a liquid crystal display screen or an electronic ink display screen, and the temperature sensor of the intelligent terminal is arranged inside the intelligent terminal in advance and used for detecting the running temperature of internal equipment.
It will be appreciated by those skilled in the art that the schematic diagram in fig. 5 is merely a block diagram of part of the structure related to the present invention and does not limit the intelligent terminal to which the present invention is applied; a specific intelligent terminal may include more or fewer components than shown in the drawing, combine some components, or have a different arrangement of components.
In one embodiment, a smart terminal is provided that includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring real scene data; wherein part of the data in the real scene data is distributed differently from the training data;
inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights;
and obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
In summary, the invention discloses a result prediction method based on uncertainty evaluation under label noise, which comprises: acquiring real scene data, part of which is distributed differently from the training data; inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, where the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights; and obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result. When label noise exists in the training data, existing prediction uncertainty evaluation methods do not perform well enough. Giving higher weight to samples with low loss function values retains correctly labeled samples while eliminating the influence of label noise; giving higher weight to samples with high prediction uncertainty retains samples on the decision boundary and maintains the generalization of the model; and as training progresses, more and more low-weight samples are gradually removed, ensuring that the reliability of the prediction uncertainty is not affected by wrong labels. After the training samples are screened with these methods, the prediction uncertainty remains robust under label noise. Even if annotation errors exist in the training data, a model trained by this method and deployed in an actual scene can give reliable prediction uncertainty. For a new sample, the neural network model gives a prediction result together with the uncertainty of that prediction: if the uncertainty is high, careful judgment is needed and the prediction may even be refused; if the uncertainty is low, the prediction result can be given.
Based on the above embodiments, the present invention discloses a result prediction method based on uncertainty evaluation under label noise. It should be understood that the application of the present invention is not limited to the above examples; those skilled in the art can make modifications or changes according to the above description, and all such modifications and changes should fall within the scope of the appended claims.

Claims (10)

1. A method for predicting a result based on uncertainty evaluation under label noise, the method comprising:
acquiring real scene data; wherein part of the data in the real scene data is distributed differently from training data;
inputting the real scene data into a trained deep neural network model and a function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained based on a plurality of iterative training processes, weights are added to the training data in each iterative training process, and the training data are screened according to the weights;
and obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result.
2. The method for predicting results based on uncertainty evaluation under label noise of claim 1, wherein the training process of the deep neural network model comprises:
Acquiring an initial training data set containing label noise; the label noise is noise caused by labeling training data with wrong labels;
inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, taking the intermediate training data set as the initial training data set, and repeating the step of inputting the initial training data set into the initial deep neural network model and the function model;
and stopping iterative training when the initial deep neural network model meets the preset training conditions, and obtaining a trained deep neural network model.
3. The method according to claim 2, wherein the step of inputting the initial training data set into an initial deep neural network model and the function model to obtain a second uncertainty prediction value set corresponding to the initial training data set, obtaining a weight set corresponding to the initial training data set according to the second uncertainty prediction value set, obtaining an intermediate training data set according to the weight set and the initial training data set, using the intermediate training data set as an initial training data set, and repeatedly performing the step of inputting the initial training data set into an initial deep neural network model and the function model includes:
For the t-th iterative training, acquiring a loss function value set corresponding to the initial training data set in the t-th iterative training and a second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, obtaining a training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set, and weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set; the training data weight set is used for representing the importance degree of training data in the initial deep neural network model; wherein t is greater than or equal to 1;
and inputting the intermediate training data set into an initial deep neural network model and the function model, and performing t+1st iteration training.
4. The method for predicting results based on uncertainty evaluation under label noise of claim 3, wherein the initial training data set comprises a plurality of initial training data items; and the acquiring the loss function value set corresponding to the initial training data set in the t-th iterative training and the second uncertainty prediction value set corresponding to the initial training data set in the t-th iterative training, and obtaining the training data weight set corresponding to the initial training data set in the t-th iterative training according to the loss function value set and the second uncertainty prediction value set comprises:
for each initial training data item: acquiring a loss function value corresponding to the initial training data item in the t-th iterative training; inputting the initial training data item into the initial deep neural network model multiple times to obtain a plurality of output results, and performing a variance calculation on the plurality of output results based on the function model to obtain a second uncertainty prediction value corresponding to the initial training data item; subtracting the loss function value from a preset first value to obtain a difference; multiplying the difference by the second uncertainty prediction value to obtain a first product; raising a preset second value, taken as the base, to the power of the first product, taken as the exponent, to obtain an exponentiation result; and taking the exponentiation result as the training data weight corresponding to the initial training data item in the t-th iterative training;
and taking the training data weights corresponding to all the initial training data items in the initial training data set as the training data weight set corresponding to the initial training data set.
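In symbols, claim 4 computes w_i = b^((c - l_i) * u_i), where l_i is the sample's loss, u_i the variance over repeated forward passes, and c and b the preset first and second values. A minimal sketch follows, assuming PyTorch with dropout left active so the passes differ; c = 1.0 and b = e are illustrative, since the claims do not fix either value:

    import math
    import torch

    def mc_uncertainty(model, x, passes=10):
        # Second uncertainty prediction value: the variance of several
        # stochastic forward passes on the same sample.
        model.train()  # e.g. dropout kept active so the passes differ
        with torch.no_grad():
            outs = torch.stack([model(x) for _ in range(passes)])
        return outs.var(dim=0).mean().item()

    def sample_weight(loss, uncertainty, first_value=1.0, base=math.e):
        # Claim 4's exponentiation: base ** ((first_value - loss) * uncertainty).
        return base ** ((first_value - loss) * uncertainty)

With this form, a larger loss drives the exponent down and hence the weight toward zero, which is what lets the subsequent screening step discard likely-noisy samples.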
5. The method of claim 4, wherein the weighting the initial training data set in the t-th iterative training based on the training data weight set to obtain an intermediate training data set comprises:
acquiring a preset total number of training iterations and the number of initial training data items in the initial training data set;
multiplying t by the number of initial training data items to obtain a second product;
dividing the second product by the total number of training iterations to obtain a quotient;
and obtaining an intermediate training data set according to the quotient and the training data weight set.
6. The method of claim 5, wherein the obtaining an intermediate training data set according to the quotient and the training data weight set comprises:
searching the training data weight set for the weight corresponding to the quotient to obtain a reference weight;
and resetting to zero each training data weight in the training data weight set that is less than the reference weight, to obtain the intermediate training data set.
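Claims 5 and 6 together describe a curriculum-style screen: the quotient k = t * N / T grows linearly with the iteration index, and the weight "corresponding to" it serves as a cutoff. A sketch under one plausible reading, where the quotient is treated as a rank into the ascending sort of the weights (an assumption; the claims do not say how the lookup is performed):

    import numpy as np

    def filter_by_weight(weights, t, total_iters):
        # The quotient k = t * N / T picks a reference weight; every
        # weight below the reference is reset to zero (claims 5 and 6).
        n = len(weights)
        k = min(int(t * n / total_iters), n - 1)
        ref = np.sort(weights)[k]  # assumed: the k-th smallest weight
        return np.where(weights < ref, 0.0, weights)

Under this reading, early iterations drop almost nothing and later ones discard an ever larger low-weight fraction, so suspect samples are pruned gradually rather than all at once.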
7. The method of claim 1, wherein the obtaining a target prediction result according to the first uncertainty prediction value and the initial prediction result comprises:
when the first uncertainty prediction value is smaller than a preset threshold, taking the initial prediction result as the target prediction result.
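The gating step is a one-liner; the sketch below is illustrative, with the threshold value and the behavior for high-uncertainty inputs both assumptions, since the claim specifies only the acceptance branch:

    def target_prediction(initial_pred, uncertainty, threshold=0.1):
        # Claim 7: accept the initial prediction only when the first
        # uncertainty prediction value is below the preset threshold.
        if uncertainty < threshold:
            return initial_pred
        return None  # high uncertainty: behavior not specified by the claim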
8. A result prediction apparatus based on uncertainty evaluation under label noise, the apparatus comprising:
the real scene data acquisition module is used for acquiring real scene data, wherein part of the real scene data follows a distribution different from that of the training data;
the prediction module is used for inputting the real scene data into the trained deep neural network model and a preset function model to obtain a first uncertainty prediction value and an initial prediction result, wherein the deep neural network model is obtained through a plurality of iterative training processes in which weights are assigned to the training data and the training data are screened according to the weights;
and the target prediction result acquisition module is used for acquiring a target prediction result according to the first uncertainty prediction value and the initial prediction result.
9. An intelligent terminal comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1-7.
10. A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of claims 1-7.
CN202210529489.6A 2022-05-16 2022-05-16 Result prediction method based on uncertainty evaluation under label noise Pending CN116051880A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210529489.6A CN116051880A (en) 2022-05-16 2022-05-16 Result prediction method based on uncertainty evaluation under label noise

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210529489.6A CN116051880A (en) 2022-05-16 2022-05-16 Result prediction method based on uncertainty evaluation under label noise

Publications (1)

Publication Number Publication Date
CN116051880A true CN116051880A (en) 2023-05-02

Family

ID=86124404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210529489.6A Pending CN116051880A (en) 2022-05-16 2022-05-16 Result prediction method based on uncertainty evaluation under label noise

Country Status (1)

Country Link
CN (1) CN116051880A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117131855A (en) * 2023-09-19 2023-11-28 中科(天津)智能科技有限公司 Meta-space activity data analysis method and system based on intelligent digital twin

Similar Documents

Publication Publication Date Title
CN109840588B (en) Neural network model training method, device, computer equipment and storage medium
CN112990432B (en) Target recognition model training method and device and electronic equipment
CN110909822B (en) Satellite anomaly detection method based on improved Gaussian process regression model
EP3848836A1 (en) Processing a model trained based on a loss function
CN110969200B (en) Image target detection model training method and device based on consistency negative sample
CN114332578A (en) Image anomaly detection model training method, image anomaly detection method and device
CN112287947B (en) Regional suggestion frame detection method, terminal and storage medium
CN109271957B (en) Face gender identification method and device
CN112613617A (en) Uncertainty estimation method and device based on regression model
CN116051880A (en) Result prediction method based on uncertainty evaluation under label noise
US20210019636A1 (en) Prediction model construction device, prediction model construction method and prediction model construction program recording medium
CN117057443A (en) Prompt learning method of visual language model and electronic equipment
CN116959216A (en) Experimental operation monitoring and early warning method, device and system
KR102192461B1 (en) Apparatus and method for learning neural network capable of modeling uncerrainty
CN111401563B (en) Machine learning model updating method and device
CN116665798A (en) Air pollution trend early warning method and related device
CN114495114B (en) Text sequence recognition model calibration method based on CTC decoder
CN116964588A (en) Target detection method, target detection model training method and device
KR102413588B1 (en) Object recognition model recommendation method, system and computer program according to training data
CN112699809B (en) Vaccinia category identification method, device, computer equipment and storage medium
EP3975071A1 (en) Identifying and quantifying confounding bias based on expert knowledge
CN112446428B (en) Image data processing method and device
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN114020905A (en) Text classification external distribution sample detection method, device, medium and equipment
CN115545150A (en) Decision reliability evaluation method, device and equipment based on credible graph neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination