CN110363228A - Noise label correcting method - Google Patents
Noise label correcting method
- Publication number
- CN110363228A (application CN201910562002.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- label
- noise
- samples
- probability
- Prior art date: 2019-06-26
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
Abstract
The present invention provides a noise label re-labeling (correction) method, comprising the following steps: step 1, classifying the observation samples with a base classifier, estimating the noise rates, and identifying the noise label data; step 2, re-labeling the noise label samples with the base classifier to obtain a clean sample data set in which the noise label samples have been corrected.
Description
Technical Field
The invention relates to data mining technology, and in particular to a noise label correction method.
Background
Conventional supervised classification problems typically assume that the labels of the data set are complete, i.e., that every sample carries a correct, noise-free label. In the real world, however, owing to the randomness of the labeling process, sample labels are easily contaminated by noise, making them inaccurate. The generation of noisy data is usually related to how the data set was acquired. For example, during labeling of the raw data, the amount of information provided to an annotator may be insufficient, causing the annotator to misclassify samples; the classification task itself may be subjective; or the annotator's expertise may not suffice to guarantee correct classification. The data-labeling platforms popular today are another source of noisy data: such platforms rely on large numbers of registered users to carry out crowd-sourced labeling work, e.g., Amazon's Amazon Mechanical Turk and data-service platforms such as Datatang and JD's micro-task platform. Because of the annotators' professional limitations and individual differences, the labels in data sets obtained this way do not fully agree with the ground truth, and different annotators may disagree on the same sample, producing different label results for it. According to where the noise arises, noise in a data set can be divided into feature noise and label noise, and noise in the labels generally harms model performance more than noise in the features (Mirylenka K, Giannakopoulos G, Do L M, et al. On classifier behavior in the presence of mislabeling noise. Data Mining and Knowledge Discovery, 2017). In binary classification, the PU (Positive-Unlabeled) learning problem was proposed according to the characteristics of the noise distribution across the positive and negative example sets (Khetan A, Lipton Z C, Anandkumar A. Learning From Noisy Singly-labeled Data. 2017). PU learning denotes a binary classification task in which only part of the positive training samples are labeled and no other samples are labeled. If all unlabeled samples are treated as negative examples, the PU learning problem is transformed into a noisy binary classification problem. Noise label data not only severely degrade the classification accuracy of the classifier model but also increase the complexity of the classifier. Designing classification learning algorithms that adapt to noise label data therefore has important research significance and application value.
For classification problems with noise labels, Frénay and Verleysen summarize a number of solutions, including noise cleaning algorithms, noise-label-robust methods, and noise label modeling methods (Frénay B, Verleysen M. Classification in the Presence of Label Noise: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 2014). Noise-label-robust methods rely on a model's inherent tolerance to noise; different models differ in their sensitivity to label noise, so a classifier insensitive to label noise must be selected for learning. For example, in the empirical risk minimization formulation of binary classification, a loss function measures the cost of misclassification and the classifier is learned by minimizing the empirical loss over the samples. A common loss is the 0-1 loss. Under uniform label noise, the 0-1 loss and the least-squares loss are noise-tolerant, whereas other loss functions are not noise-tolerant even under a uniform noise distribution, for example 1) the exponential loss, 2) the logarithmic loss, and 3) the hinge loss. Most learning algorithms in machine learning are not fully noise-tolerant and remain effective only when the training data are disturbed by a small amount of label noise. With the development of deep learning, neural networks are often applied to image classification with noisy labels; for example, Mnih proposed incorporating a noise model into the neural network, but considered only binary classification and assumed symmetric label noise (Mnih V, Hinton G E. Learning to Label Aerial Images from Noisy Data. 2012).
Solving the noise label learning problem with a noise cleaning strategy typically requires two steps: (1) estimating the noise rates and (2) using the estimated noise rates to correct the predictions. To estimate the noise rates, Scott et al. estimated the inversion noise rates by establishing lower bounds (Blanchard G, Flaska M, Handy G, et al. Classification with Asymmetric Label Noise: Consistency and Maximal Denoising. Journal of Machine Learning Research, 2013); however, the estimator obtained by this method may fail to converge. After adding further assumptions, Scott (2015) proposed a time-efficient noise rate estimation method, but its estimation performance is poor (Scott C. A Rate of Convergence for Mixture Proportion Estimation, with Application to Learning from Noisy Labels. 2015). Liu and Tao rewrote the loss function with importance weights, but the weights are derived from the predicted probabilities and may therefore be sensitive to inaccurate estimates (Liu T, Tao D. Classification with Noisy Labels by Importance Reweighting. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2014). Natarajan (2013) did not propose a noise-rate estimator but instead treated the noise rates as parameters to be tuned by cross-validation (Natarajan N, Dhillon I S, Ravikumar P K, et al. Learning with Noisy Labels. 2013). Natarajan proposed two ways to modify the loss function: the first constructs an unbiased estimator of the clean-distribution loss from the noisy distribution, but the estimator may be non-convex even when the original loss is convex; the second constructs a label-dependent loss such that, for the 0-1 loss, the minimizer of the risk under the noisy distribution coincides with the minimizer of the risk under the clean distribution. Northcutt proposed learning with confident examples: the noise rates are estimated from the classification probabilities of a base classifier on the noisy data, and the samples identified as noise label data are deleted according to the base classifier's ranked predictions; the method is called Rank Pruning (Northcutt C G, Wu T, Chuang I L. Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels. 2017).
Disclosure of Invention
The invention aims to provide a noise label correcting method.
The technical scheme for realizing the purpose of the invention is as follows: a noise label correction method, comprising the following steps:
step 1, predicting a sample by using a base classifier to obtain a sample prediction probability, respectively taking prediction probability expectation values of all samples in a positive example set and a negative example set as a lower bound threshold and an upper bound threshold, judging a real label of an observed sample by using the lower bound threshold and the upper bound threshold, and identifying noise label data;
step 2, re-labeling the noise label sample by using a base classifier to obtain a clean sample data set after the noise label sample is corrected; wherein
for a binary classification result, after the noise label samples are identified, the samples are sorted in ascending order of the prediction probability assigned to each sample by the base classifier; in the observed positive sample set, the labels of the first $k_1$ samples are re-labeled as 0, and in the observed negative sample set, the labels of the last $k_0$ samples are re-labeled as 1, $k_1$ and $k_0$ being the estimated numbers of noise samples in the two sets (defined in step 1.6 below);
for a multi-class classification result, according to the classification result matrix obtained by the base classifier's predictions on all sample data, the label of each noise sample is re-labeled, using the probability matrix, as the label class with the highest prediction probability other than the sample's current label.
Further, the step 1 specifically comprises the following steps:
step 1.1, the base classifier predicts each sample to obtain the sample prediction probability $g(x) = P(s=1 \mid x)$; let the noise rate $\rho_1 = P(s=0 \mid y=1)$ denote the probability that a sample whose true label is 1 is incorrectly labeled 0, and let

$\tilde{N}_{s=1,y=1}$ denote the number of samples whose observed label is 1 and whose true label is 1,

$\tilde{N}_{s=0,y=1}$ denote the number of samples whose observed label is 0 and whose true label is 1,

$\tilde{N}_{s=1,y=0}$ denote the number of samples whose observed label is 1 and whose true label is 0,

$\tilde{N}_{s=0,y=0}$ denote the number of samples whose observed label is 0 and whose true label is 0;

step 1.2, judging the true label of each sample from the classification result $g(x)$ of the base classifier: using the lower bound threshold $LB_{y=1}$ to judge whether the true label of a sample is 1, the true label of an observed sample is set to 1 when its prediction on the base classifier $g(x)$ is greater than the lower bound threshold; when the prediction of the observed sample on the base classifier is less than the upper bound threshold $UB_{y=0}$, the true label of the observed sample is set to 0;

step 1.3, calculating the counts

$\tilde{N}_{s=1,y=1} = |\{x \in \tilde{P} : g(x) > LB_{y=1}\}|$, $\tilde{N}_{s=0,y=1} = |\{x \in \tilde{N} : g(x) > LB_{y=1}\}|$,

$\tilde{N}_{s=1,y=0} = |\{x \in \tilde{P} : g(x) < UB_{y=0}\}|$, $\tilde{N}_{s=0,y=0} = |\{x \in \tilde{N} : g(x) < UB_{y=0}\}|$,

wherein $\tilde{P}$ is the observed positive sample set and $\tilde{N}$ is the observed negative sample set, and the lower and upper bound thresholds are set to the expected classification probability $g(x)$ of the positive and negative samples, respectively, on the base classifier:

$LB_{y=1} = \mathbb{E}_{x \in \tilde{P}}[g(x)]$, $UB_{y=0} = \mathbb{E}_{x \in \tilde{N}}[g(x)]$;

step 1.4, calculating the noise rate estimates

$\hat{\rho}_1 = \dfrac{\tilde{N}_{s=0,y=1}}{\tilde{N}_{s=0,y=1} + \tilde{N}_{s=1,y=1}}$ and $\hat{\rho}_0 = \dfrac{\tilde{N}_{s=1,y=0}}{\tilde{N}_{s=1,y=0} + \tilde{N}_{s=0,y=0}}$;

step 1.5, deducing the inversion noise rates from the estimated noise rates by Bayes' theorem:

$\hat{\pi}_1 = P(y=0 \mid s=1) = \dfrac{\hat{\rho}_0\,\hat{p}_{y=0}}{p_{s=1}}$, $\hat{\pi}_0 = P(y=1 \mid s=0) = \dfrac{\hat{\rho}_1\,\hat{p}_{y=1}}{1 - p_{s=1}}$,

wherein $p_{s=1} = P(s=1)$ is the fraction of positively labeled samples in the observed sample set and $\hat{p}_{y=1} = 1 - \hat{p}_{y=0} = \dfrac{p_{s=1} - \hat{\rho}_0}{1 - \hat{\rho}_1 - \hat{\rho}_0}$;

step 1.6, setting $k_1 = \hat{\pi}_1 |\tilde{P}|$, the number of samples in the observed positive sample set whose true label is 0, and $k_0 = \hat{\pi}_0 |\tilde{N}|$, the number of samples in the observed negative sample set whose true label is 1, and sorting the samples in ascending order of the prediction $g(x)$ of the base classifier on each sample: the first $k_1$ samples of the observed positive sample set $\tilde{P}$ are taken as the noise label samples in the positive sample set, and the last $k_0$ samples of the observed negative sample set $\tilde{N}$ are taken as the noise label samples in the negative sample set.
Further, for the binary classification case in step 2, the specific process of obtaining the clean sample data set after the noise label samples are corrected is as follows:

after the noise label samples are identified, the samples are sorted in ascending order of the prediction probability $g(x) = P(s=1 \mid x)$ of each sample on the base classifier; in the observed positive sample set $\tilde{P}$, the labels of the first $k_1$ samples are re-labeled as 0; in the observed negative sample set $\tilde{N}$, the labels of the last $k_0$ samples are re-labeled as 1;

the re-labeled subsets of the positive and negative sample sets are respectively

$\tilde{P}^{relabel} = \{x \in \tilde{P} : g(x) \le g^{\tilde{P}}_{(k_1)}\}$, $\tilde{N}^{relabel} = \{x \in \tilde{N} : g(x) \ge g^{\tilde{N}}_{(k_0)}\}$,

where $g^{\tilde{P}}_{(k_1)}$ is the $k_1$-th smallest value of $g(x)$ in the observed positive sample set and $g^{\tilde{N}}_{(k_0)}$ is the $k_0$-th largest value of $g(x)$ in the observed negative sample set.
Further, for the multi-class classification case in step 2, the clean sample data set after the noise label samples are corrected is obtained by re-labeling the noise samples, the specific process being as follows:

when predicting all sample data, the base classifier records the probability that each sample belongs to each class, yielding a classification result matrix $psx = \{p_{ij} \mid i = 1,\dots,N;\ j = 1,\dots,K\}$, an $N \times K$ probability matrix, where $N$ is the number of samples and $K$ is the number of label classes; the $i$-th row of the matrix, $p_i = (p_{i1}, p_{i2}, \dots, p_{iK})$, gives the probabilities that sample $x_i$ belongs to each label class under the base classifier $f(x)$, the entry $p_{ij}$ being the probability that sample $x_i$ belongs to class $k_j$;

when a sample $x_i$ is judged to carry a noise label, its label is re-labeled, using the probability matrix $psx$, as the label class with the highest prediction probability other than its current label:

$y_i^{relabel} = k_{max}$, where $k_{max} = \arg\max_{j \ne s_i} p_{ij}$,

wherein $k_{max}$ is the label class with the largest classification probability for sample $x_i$ under the base classifier, excluding the sample's original noise label $s_i$.
Compared with the prior art, the invention has the following advantages: (1) a general solution is provided for noise label learning, and the method is suitable for classifiers in any form; (2) the noise sample identification rate is high, all sample information is fully utilized, and the robustness of the classifier in a noise environment is improved; (3) the algorithm is applicable to binary classification and multi-class classification problems.
The invention is further described below with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of a process for identifying noise label data based on a base classifier.
Fig. 3 is a schematic diagram of a noise label sample re-labeling process.
Detailed Description
Referring to FIG. 1, the method proceeds in two steps. Step 1, referring to FIG. 2, classifies the observation samples with the base classifier, estimates the noise rates, and identifies the noise label data; the process is as follows:
in step 1.1, the base classifier clf predicts clf.fit (X, s) for the sample, and obtains a sample prediction probability g (X) ═ P (s ═ 1| X). The base classifier can select any existing classification algorithm as long as the prediction probability of the sample can be obtained.
For the noise rate, $\rho_1 = P(s=0 \mid y=1)$ denotes the probability that a sample with true label 1 is incorrectly labeled 0, i.e., the proportion of samples with observed label 0 among all samples whose correct label is 1. The number of samples in each case is represented by the following variables: $\tilde{N}_{s=1,y=1}$, the samples with observed label 1 and true label 1; $\tilde{N}_{s=0,y=1}$, the samples with observed label 0 and true label 1; $\tilde{N}_{s=1,y=0}$, the samples with observed label 1 and true label 0; and $\tilde{N}_{s=0,y=0}$, the samples with observed label 0 and true label 0.
Step 1.2, because the true distribution of the samples is unknown, the classification result $g(x)$ of the base classifier is used to judge the true label of each sample. The lower bound threshold $LB_{y=1}$ is used to judge whether the true label of a sample is 1: when the prediction of an observed sample on the base classifier $g(x)$ exceeds the lower bound threshold, the true label of that sample is assumed to be 1. Likewise, the upper bound threshold $UB_{y=0}$ is used to judge whether the true label of an observed sample is 0.
Step 1.3, the counts are calculated as

$\tilde{N}_{s=1,y=1} = |\{x \in \tilde{P} : g(x) > LB_{y=1}\}|$, $\tilde{N}_{s=0,y=1} = |\{x \in \tilde{N} : g(x) > LB_{y=1}\}|$,

$\tilde{N}_{s=1,y=0} = |\{x \in \tilde{P} : g(x) < UB_{y=0}\}|$, $\tilde{N}_{s=0,y=0} = |\{x \in \tilde{N} : g(x) < UB_{y=0}\}|$,

where $\tilde{P}$ is the observed positive sample set and $\tilde{N}$ is the observed negative sample set, and the thresholds are set to the expected classification probability $g(x)$ of the positive and negative samples on the base classifier:

$LB_{y=1} = \mathbb{E}_{x \in \tilde{P}}[g(x)]$, $UB_{y=0} = \mathbb{E}_{x \in \tilde{N}}[g(x)]$.
Step 1.4, the noise rate estimates $\hat{\rho}_1$ and $\hat{\rho}_0$ are calculated as

$\hat{\rho}_1 = \dfrac{\tilde{N}_{s=0,y=1}}{\tilde{N}_{s=0,y=1} + \tilde{N}_{s=1,y=1}}$, $\hat{\rho}_0 = \dfrac{\tilde{N}_{s=1,y=0}}{\tilde{N}_{s=1,y=0} + \tilde{N}_{s=0,y=0}}$.
Step 1.5, the inversion noise rates are deduced from the estimated noise rates by Bayes' theorem:

$\hat{\pi}_1 = P(y=0 \mid s=1) = \dfrac{\hat{\rho}_0\,\hat{p}_{y=0}}{p_{s=1}}$, $\hat{\pi}_0 = P(y=1 \mid s=0) = \dfrac{\hat{\rho}_1\,\hat{p}_{y=1}}{1 - p_{s=1}}$,

where $p_{s=1} = P(s=1)$ is the fraction of positively labeled samples in the observed sample set and $\hat{p}_{y=1} = 1 - \hat{p}_{y=0} = (p_{s=1} - \hat{\rho}_0)/(1 - \hat{\rho}_1 - \hat{\rho}_0)$. Since the inversion noise rates give the probability that the true label of an observed positive (respectively negative) sample is 0 (respectively 1), $k_1 = \hat{\pi}_1 |\tilde{P}|$ is the number of samples in the observed positive sample set whose true label is 0, i.e., the number of noise samples in the observed positive set; likewise, $k_0 = \hat{\pi}_0 |\tilde{N}|$ is the number of samples in the observed negative sample set whose true label is 1, i.e., the number of noise samples in the observed negative set. Finally, the samples are sorted in ascending order of the prediction $g(x)$ of the base classifier: the first $k_1$ samples of the observed positive sample set $\tilde{P}$ are taken as noise label samples in the positive sample set, and the last $k_0$ samples of the observed negative sample set $\tilde{N}$ are taken as noise label samples in the negative sample set.
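Continuing the sketch above, steps 1.2 through 1.6 can be written out as follows (illustrative only; variable names are not from the patent, and no guards for degenerate counts are included):

```python
# Sketch of steps 1.2-1.6, continuing X, s, g from the previous sketch.
import numpy as np

P_idx = np.flatnonzero(s == 1)     # observed positive sample set
N_idx = np.flatnonzero(s == 0)     # observed negative sample set

LB_y1 = g[P_idx].mean()            # lower bound threshold: E[g(x)] over positives
UB_y0 = g[N_idx].mean()            # upper bound threshold: E[g(x)] over negatives

# Step 1.3: counts of samples judged true-label-1 (g > LB) or true-label-0 (g < UB)
N_s1_y1 = np.sum(g[P_idx] > LB_y1)
N_s0_y1 = np.sum(g[N_idx] > LB_y1)
N_s1_y0 = np.sum(g[P_idx] < UB_y0)
N_s0_y0 = np.sum(g[N_idx] < UB_y0)

# Step 1.4: noise rate estimates
rho1 = N_s0_y1 / (N_s0_y1 + N_s1_y1)          # rho_1 = P(s=0 | y=1)
rho0 = N_s1_y0 / (N_s1_y0 + N_s0_y0)          # rho_0 = P(s=1 | y=0)

# Step 1.5: inversion noise rates via Bayes' theorem
p_s1 = s.mean()                               # p_{s=1} = P(s=1)
p_y1 = (p_s1 - rho0) / (1.0 - rho1 - rho0)    # implied P(y=1)
pi1 = rho0 * (1.0 - p_y1) / p_s1              # pi_1 = P(y=0 | s=1)
pi0 = rho1 * p_y1 / (1.0 - p_s1)              # pi_0 = P(y=1 | s=0)

# Step 1.6: rank by g(x); the k1 lowest-g positives and the k0 highest-g
# negatives are flagged as noise label samples.
k1 = int(round(pi1 * len(P_idx)))
k0 = int(round(pi0 * len(N_idx)))
noise_in_P = P_idx[np.argsort(g[P_idx])][:k1]
order_N = N_idx[np.argsort(g[N_idx])]
noise_in_N = order_N[len(order_N) - k0:] if k0 > 0 else order_N[:0]
```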
Step 2, referring to FIG. 3, the noise label samples are re-labeled using the classification results of the base classifier, yielding the clean sample data set after the noise label samples are corrected. The specific process is as follows:
step 2.1, for the binary classification case. After the noise label samples are identified, the samples are sorted in an ascending order according to the predicted probability value of each sample in the base classifier g (x) P (s 1| x). Sample set of observation right caseIn front ofThe label of each sample is re-labeled as 0; sample set of negative examples under observationIn, afterThe individual sample labels are relabeled as 1. Re-labeled positive sample setAnd negative sample setRespectively expressed as:
whereinRepresenting the g (x) value in the sample set of the observation positive caseThe small value of g (x) is,sample set g (x) values representing negative observationsLarge g (x) values.
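A short continuation of the same sketch for step 2.1: the identified noise labels are simply flipped.

```python
# Sketch of step 2.1 (continuing the variables above): correct the binary
# noise labels identified in step 1.6 by flipping them.
s_corrected = s.copy()
s_corrected[noise_in_P] = 0    # first k1 samples (lowest g) of positives -> 0
s_corrected[noise_in_N] = 1    # last  k0 samples (highest g) of negatives -> 1
# (X, s_corrected) is the clean sample data set after label correction.
```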
Step 2.2, for the multi-class classification case: when there are more than two label classes in total, re-labeling a noise sample requires deciding which class the sample most likely belongs to and assigning that label, and the new label must be selected according to the classification results of the base classifier on all samples. Therefore, when predicting all sample data, the base classifier records the probability that each sample belongs to each class, finally yielding a classification result matrix $psx = \{p_{ij} \mid i = 1,\dots,N;\ j = 1,\dots,K\}$, an $N \times K$ probability matrix, where $N$ is the number of samples and $K$ is the number of label classes. The $i$-th row of the matrix, $p_i = (p_{i1}, p_{i2}, \dots, p_{iK})$, gives the probabilities that sample $x_i$ belongs to each label class under the base classifier $f(x)$, the entry $p_{ij}$ being the probability that sample $x_i$ belongs to class $k_j$. When a sample $x_i$ is judged to carry a noise label, its label is re-labeled, using the probability matrix $psx$, as the label class with the highest prediction probability other than its current label, i.e., for a noise label sample $x_i$ the new label is

$y_i^{relabel} = k_{max}$, where $k_{max} = \arg\max_{j \ne s_i} p_{ij}$,

i.e., $k_{max}$ is the label class with the largest classification probability for sample $x_i$ under the base classifier, excluding the sample's original noise label $s_i$. The data obtained after re-labeling constitute the correct data set after the noise labels have been corrected.
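For step 2.2, a self-contained sketch, again assuming scikit-learn; the noise indices passed in are arbitrary stand-ins for the output of a multi-class analogue of the identification step:

```python
# Sketch of step 2.2 (illustrative): re-label each noise sample with the class
# of highest predicted probability other than its current label, per
# y_i = argmax_{j != s_i} p_ij.
import numpy as np
from sklearn.linear_model import LogisticRegression

def relabel_multiclass(clf, X, s, noise_idx):
    psx = clf.predict_proba(X)            # N x K classification result matrix
    classes = clf.classes_                # column j corresponds to classes[j]
    s_new = s.copy()
    for i in noise_idx:
        p = psx[i].copy()
        p[classes == s[i]] = -np.inf      # exclude the current (noise) label
        s_new[i] = classes[np.argmax(p)]  # k_max
    return s_new

# Toy usage; the noise indices here are arbitrary, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
s = rng.integers(0, 3, size=150)
clf = LogisticRegression(max_iter=1000).fit(X, s)
s_clean = relabel_multiclass(clf, X, s, noise_idx=[3, 17, 42])
```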
Claims (4)
1. A noise label correction method, comprising the steps of:
step 1, predicting a sample by using a base classifier to obtain a sample prediction probability, respectively taking prediction probability expectation values of all samples in a positive example set and a negative example set as a lower bound threshold and an upper bound threshold, judging a real label of an observed sample by using the lower bound threshold and the upper bound threshold, and identifying noise label data;
step 2, re-labeling the noise label sample by using a base classifier to obtain a clean sample data set after the noise label sample is corrected; wherein
for a binary classification result, after the noise label samples are identified, the samples are sorted in ascending order of the prediction probability assigned to each sample by the base classifier; in the observed positive sample set, the labels of the first $k_1$ samples are re-labeled as 0, and in the observed negative sample set, the labels of the last $k_0$ samples are re-labeled as 1, where $k_1$ and $k_0$ are the estimated numbers of noise label samples in the respective sets;
for a multi-class classification result, according to the classification result matrix obtained by the base classifier's predictions on all sample data, the label of each noise sample is re-labeled, using the probability matrix, as the label class with the highest prediction probability other than the sample's current label.
2. The method according to claim 1, wherein the specific steps of step 1 include:
step 1.1, the base classifier predicts each sample to obtain the sample prediction probability $g(x) = P(s=1 \mid x)$; let the noise rate $\rho_1 = P(s=0 \mid y=1)$ denote the probability that a sample whose true label is 1 is incorrectly labeled 0, and let

$\tilde{N}_{s=1,y=1}$ denote the number of samples whose observed label is 1 and whose true label is 1,

$\tilde{N}_{s=0,y=1}$ denote the number of samples whose observed label is 0 and whose true label is 1,

$\tilde{N}_{s=1,y=0}$ denote the number of samples whose observed label is 1 and whose true label is 0,

$\tilde{N}_{s=0,y=0}$ denote the number of samples whose observed label is 0 and whose true label is 0;

step 1.2, judging the true label of each sample from the classification result $g(x)$ of the base classifier: using the lower bound threshold $LB_{y=1}$ to judge whether the true label of a sample is 1, the true label of an observed sample is set to 1 when its prediction on the base classifier $g(x)$ is greater than the lower bound threshold; when the prediction of the observed sample on the base classifier is less than the upper bound threshold $UB_{y=0}$, the true label of the observed sample is set to 0;

step 1.3, calculating the counts

$\tilde{N}_{s=1,y=1} = |\{x \in \tilde{P} : g(x) > LB_{y=1}\}|$, $\tilde{N}_{s=0,y=1} = |\{x \in \tilde{N} : g(x) > LB_{y=1}\}|$,

$\tilde{N}_{s=1,y=0} = |\{x \in \tilde{P} : g(x) < UB_{y=0}\}|$, $\tilde{N}_{s=0,y=0} = |\{x \in \tilde{N} : g(x) < UB_{y=0}\}|$,

wherein $\tilde{P}$ is the observed positive sample set and $\tilde{N}$ is the observed negative sample set, and the lower and upper bound thresholds are set to the expected classification probability $g(x)$ of the positive and negative samples, respectively, on the base classifier:

$LB_{y=1} = \mathbb{E}_{x \in \tilde{P}}[g(x)]$, $UB_{y=0} = \mathbb{E}_{x \in \tilde{N}}[g(x)]$;

step 1.4, calculating the noise rate estimates

$\hat{\rho}_1 = \dfrac{\tilde{N}_{s=0,y=1}}{\tilde{N}_{s=0,y=1} + \tilde{N}_{s=1,y=1}}$ and $\hat{\rho}_0 = \dfrac{\tilde{N}_{s=1,y=0}}{\tilde{N}_{s=1,y=0} + \tilde{N}_{s=0,y=0}}$;

step 1.5, deducing the inversion noise rates from the estimated noise rates by Bayes' theorem:

$\hat{\pi}_1 = P(y=0 \mid s=1) = \dfrac{\hat{\rho}_0\,\hat{p}_{y=0}}{p_{s=1}}$, $\hat{\pi}_0 = P(y=1 \mid s=0) = \dfrac{\hat{\rho}_1\,\hat{p}_{y=1}}{1 - p_{s=1}}$,

wherein $p_{s=1} = P(s=1)$ is the fraction of positively labeled samples in the observed sample set and $\hat{p}_{y=1} = 1 - \hat{p}_{y=0} = \dfrac{p_{s=1} - \hat{\rho}_0}{1 - \hat{\rho}_1 - \hat{\rho}_0}$;

step 1.6, setting $k_1 = \hat{\pi}_1 |\tilde{P}|$, the number of samples in the observed positive sample set whose true label is 0, and $k_0 = \hat{\pi}_0 |\tilde{N}|$, the number of samples in the observed negative sample set whose true label is 1, and sorting the samples in ascending order of the prediction $g(x)$ of the base classifier on each sample: the first $k_1$ samples of the observed positive sample set $\tilde{P}$ are taken as the noise label samples in the positive sample set, and the last $k_0$ samples of the observed negative sample set $\tilde{N}$ are taken as the noise label samples in the negative sample set.
3. The method according to claim 2, wherein, for the binary classification case in step 2, the specific process of obtaining the clean sample data set after the noise label samples are corrected is:

after the noise label samples are identified, the samples are sorted in ascending order of the prediction probability $g(x) = P(s=1 \mid x)$ of each sample on the base classifier; in the observed positive sample set $\tilde{P}$, the labels of the first $k_1$ samples are re-labeled as 0; in the observed negative sample set $\tilde{N}$, the labels of the last $k_0$ samples are re-labeled as 1;

the re-labeled subsets of the positive and negative sample sets are respectively

$\tilde{P}^{relabel} = \{x \in \tilde{P} : g(x) \le g^{\tilde{P}}_{(k_1)}\}$, $\tilde{N}^{relabel} = \{x \in \tilde{N} : g(x) \ge g^{\tilde{N}}_{(k_0)}\}$,

where $g^{\tilde{P}}_{(k_1)}$ is the $k_1$-th smallest value of $g(x)$ in the observed positive sample set and $g^{\tilde{N}}_{(k_0)}$ is the $k_0$-th largest value of $g(x)$ in the observed negative sample set.
4. The method according to claim 2, wherein, for the multi-class classification case in step 2, the clean sample data set after the noise label samples are corrected is obtained by re-labeling the noise samples, the specific process being:

when predicting all sample data, the base classifier records the probability that each sample belongs to each class, yielding a classification result matrix $psx = \{p_{ij} \mid i = 1,\dots,N;\ j = 1,\dots,K\}$, an $N \times K$ probability matrix, where $N$ is the number of samples and $K$ is the number of label classes; the $i$-th row of the matrix, $p_i = (p_{i1}, p_{i2}, \dots, p_{iK})$, gives the probabilities that sample $x_i$ belongs to each label class under the base classifier $f(x)$, the entry $p_{ij}$ being the probability that sample $x_i$ belongs to class $k_j$;

when a sample $x_i$ is judged to carry a noise label, its label is re-labeled, using the probability matrix $psx$, as the label class with the highest prediction probability other than its current label:

$y_i^{relabel} = k_{max}$, where $k_{max} = \arg\max_{j \ne s_i} p_{ij}$,

wherein $k_{max}$ is the label class with the largest classification probability for sample $x_i$ under the base classifier, excluding the sample's original noise label $s_i$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910562002.2A CN110363228B (en) | 2019-06-26 | 2019-06-26 | Noise label correction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910562002.2A CN110363228B (en) | 2019-06-26 | 2019-06-26 | Noise label correction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110363228A (en) | 2019-10-22
CN110363228B (en) | 2022-09-06
Family
ID=68216503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910562002.2A Active CN110363228B (en) | 2019-06-26 | 2019-06-26 | Noise label correction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110363228B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814883A (en) * | 2020-07-10 | 2020-10-23 | 重庆大学 | Label noise correction method based on heterogeneous integration |
CN112101328A (en) * | 2020-11-19 | 2020-12-18 | 四川新网银行股份有限公司 | Method for identifying and processing label noise in deep learning |
CN113139628A (en) * | 2021-06-22 | 2021-07-20 | 腾讯科技(深圳)有限公司 | Sample image identification method, device and equipment and readable storage medium |
WO2022032471A1 (en) * | 2020-08-11 | 2022-02-17 | 香港中文大学(深圳) | Method and apparatus for training neural network model, and storage medium and device |
WO2022194049A1 (en) * | 2021-03-15 | 2022-09-22 | 华为技术有限公司 | Object processing method and apparatus |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105426826A (en) * | 2015-11-09 | 2016-03-23 | 张静 | Tag noise correction based crowd-sourced tagging data quality improvement method |
CN107292330A (en) * | 2017-05-02 | 2017-10-24 | 南京航空航天大学 | A kind of iterative label Noise Identification algorithm based on supervised learning and semi-supervised learning double-point information |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814883A (en) * | 2020-07-10 | 2020-10-23 | 重庆大学 | Label noise correction method based on heterogeneous integration |
WO2022032471A1 (en) * | 2020-08-11 | 2022-02-17 | 香港中文大学(深圳) | Method and apparatus for training neural network model, and storage medium and device |
CN112101328A (en) * | 2020-11-19 | 2020-12-18 | 四川新网银行股份有限公司 | Method for identifying and processing label noise in deep learning |
WO2022194049A1 (en) * | 2021-03-15 | 2022-09-22 | 华为技术有限公司 | Object processing method and apparatus |
CN113139628A (en) * | 2021-06-22 | 2021-07-20 | 腾讯科技(深圳)有限公司 | Sample image identification method, device and equipment and readable storage medium |
CN113139628B (en) * | 2021-06-22 | 2021-09-17 | 腾讯科技(深圳)有限公司 | Sample image identification method, device and equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110363228B (en) | 2022-09-06 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |