CN113516207B - Long-tail distribution image classification method with noise label - Google Patents

Long-tail distribution image classification method with noise label

Info

Publication number
CN113516207B
CN113516207B (application CN202111059448.7A)
Authority
CN
China
Prior art keywords
data
noise
sample
loss
relaxation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111059448.7A
Other languages
Chinese (zh)
Other versions
CN113516207A (en)
Inventor
程乐超 (Lechao Cheng)
茅一宁 (Yining Mao)
冯尊磊 (Zunlei Feng)
宋明黎 (Mingli Song)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202111059448.7A priority Critical patent/CN113516207B/en
Publication of CN113516207A publication Critical patent/CN113516207A/en
Application granted granted Critical
Publication of CN113516207B publication Critical patent/CN113516207B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/24 — Classification techniques
    • G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a long-tail distribution image classification method with noise labels, which learns through a sample-dependent slack margin loss, assisted by an anti-noise data augmentation strategy, and solves the problem of classifying images that simultaneously have long-tail characteristics and noisy labels. According to the noise characteristics of the data, a sample-dependent slack variable is introduced when computing the sample functional margin, relaxing the margin constraint, and the sample-dependent smooth slack loss is computed piecewise according to the sample margin. According to the long-tail characteristics of the data, a data augmentation strategy adjusted in stages is implemented, applying strong and weak augmentation to the samples respectively, and a slack-loss-based sample screening mechanism is provided in the formal training stage to screen out noisy data. The method is simple to implement and flexible, and significantly improves classification on long-tailed data, noisy data, and training data exhibiting both characteristics.

Description

Long-tail distribution image classification method with noise label
Technical Field
The invention relates to the field of image classification, and in particular to a method for classifying images under noisy labels and long-tail distributed data.
Background
In recent years, convolutional neural networks (CNNs) have been widely used in computer vision. With a fixed amount of training data, overfitting becomes increasingly prominent as the number of parameters grows, and accurately labeled data is increasingly required to maintain overall performance. However, obtaining large numbers of accurately labeled samples is often quite expensive. Non-expert crowd-sourcing or systematic tagging is a practical alternative, but it easily leads to mislabeled tags. Many benchmark datasets, such as ImageNet, CIFAR-10/-100, MNIST, and QuickDraw, contain 3% to 10% noisily labeled samples. Existing research on noisy labels has generally focused on separating correctly and incorrectly labeled samples while neglecting the distribution of the data. In the real world, data often exhibits a long-tail distribution: a few dominant categories account for most of the dataset, while the other categories have insufficient samples. Therefore, in current deep-neural-network image classification, classifying data that simultaneously has long-tail characteristics and noisy labels, so as to reduce the influence of noisy labels under a long-tail distribution, is very important in practical applications.
Disclosure of Invention
In order to overcome the defects of the prior art and reduce the influence of noisy labels under a long-tail distribution, the invention adopts the following technical scheme:
A long-tail distribution image classification method with noise labels comprises the following steps:
s1, according to the data noise characteristics, each sample image and the noise label thereof
Figure 979595DEST_PATH_IMAGE001
At sample intervals
Figure 472893DEST_PATH_IMAGE002
On the basis of (2), introducing a relaxation variable
Figure 835742DEST_PATH_IMAGE003
Forming sample relaxation intervals of noisy samples
Figure 340672DEST_PATH_IMAGE004
The sample interval is
Figure 560301DEST_PATH_IMAGE005
Class interval of
Figure 681841DEST_PATH_IMAGE006
Wherein
Figure 376609DEST_PATH_IMAGE007
Is shown as
Figure 1625DEST_PATH_IMAGE008
A sample
Figure 392155DEST_PATH_IMAGE009
Is marked with a label
Figure 469832DEST_PATH_IMAGE010
Is a category
Figure 440063DEST_PATH_IMAGE011
I.e. specimen
Figure 44219DEST_PATH_IMAGE009
Belong to the category
Figure 481017DEST_PATH_IMAGE011
And the process of, accordingly,
Figure 436203DEST_PATH_IMAGE012
indicates all belong to the category
Figure 351070DEST_PATH_IMAGE011
A set of sequence numbers of samples of (1);
the sample relaxation intervals were:
Figure 278574DEST_PATH_IMAGE013
wherein,
Figure 10907DEST_PATH_IMAGE014
indicating the sample image and its correct label,
Figure 63177DEST_PATH_IMAGE015
representing a prediction function for predicting to which class a sample image belongs,
Figure 906368DEST_PATH_IMAGE016
in order to be a sample space, the sample space,Nis the total number of samples and is,
Figure 360483DEST_PATH_IMAGE017
is composed of
Figure 532226DEST_PATH_IMAGE018
A set of tags for each of the categories,
Figure 71791DEST_PATH_IMAGE019
the representation of the real number field is performed,
Figure 718673DEST_PATH_IMAGE020
is shown and
Figure 27295DEST_PATH_IMAGE021
different noise labels
Figure 101430DEST_PATH_IMAGE022
And x corresponding thereto, the largest value among the values obtained by the prediction function,
Figure 190609DEST_PATH_IMAGE023
Figure 516548DEST_PATH_IMAGE024
representing an optimal interval; the traditional DNN classification network usually follows a linear conversion layer after a feature extractor, but the strategy is easy to generate the situation that the classifier falls into linear inseparability when fitting data with noise, so the invention proposes a relaxation variable
Figure 804310DEST_PATH_IMAGE003
Introducing slack variables with relaxed spacing constraints
Figure 924713DEST_PATH_IMAGE003
Sample relaxation interval of
Figure 829084DEST_PATH_IMAGE004
The tolerance of classification prediction results is increased;
according to the sample interval, the smooth relaxation Loss (Slack Loss) depending on the sample is calculated in a segmented mode
Figure 693135DEST_PATH_IMAGE025
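A minimal sketch of the margin computation in PyTorch, assuming a batch of logits produced by the prediction function $f$; `gamma_star` stands for the optimal margin $\gamma^*$, and the hinge-style surrogate in `slack_loss` is an assumption standing in for the patent's piecewise loss, whose exact formula appears only as an image in the original publication.

```python
import torch

def sample_margin(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Functional margin f(x)_y - max_{j != y} f(x)_j for each sample."""
    n = logits.size(0)
    true_score = logits[torch.arange(n), labels]
    # Mask out the labeled class, then take the best competing score.
    masked = logits.clone()
    masked[torch.arange(n), labels] = float("-inf")
    runner_up = masked.max(dim=1).values
    return true_score - runner_up

def slack_margin(logits, labels, eta: float, gamma_star: float):
    """Slack margin: functional margin plus eps_n ~ eta * U(0, gamma*)."""
    eps = eta * gamma_star * torch.rand(logits.size(0), device=logits.device)
    return sample_margin(logits, labels) + eps

def slack_loss(logits, labels, eta: float, gamma_star: float):
    # Hinge-style surrogate (assumption): penalize slack margins below gamma*.
    # The patent's actual piecewise smooth loss is not reproduced in the text.
    return torch.clamp(gamma_star - slack_margin(logits, labels, eta, gamma_star), min=0.0)
```

Because the slack term is resampled each time, a noisy sample occasionally receives enough slack to satisfy the margin constraint without being forced across the class boundary.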
S2, according to the long-tail characteristics of the data, a data augmentation strategy adjusted in stages is implemented: for each sample pair $(x_n, \tilde{y}_n)$ in the noisy dataset $\tilde{D}$, weak and strong data augmentation are applied to the sample image $x_n$ respectively to obtain corresponding weakly augmented and strongly augmented data, and training is divided into a warm-up stage and a formal stage. Considering the negative influence of strong data augmentation on datasets with a high noise rate, the slack losses of the training stage are computed separately on the weakly and strongly augmented data and summed, with the noise rate $\eta$ and $1-\eta$ as the weights. In the warm-up stage, the slack losses of the weakly and strongly augmented data are computed directly; in the formal training stage, a group of sample images with small slack loss is screened as clean data according to the slack loss of the warm-up stage, the remaining noisy data is screened out, and the slack loss is computed. Injecting strong data augmentation in the warm-up stage of training can improve performance when training on low-noise datasets, but is counterproductive as dataset noise increases; conversely, weak data augmentation in the warm-up stage greatly improves training on high-noise data. Based on this observation, the invention divides model training into two stages and adjusts the augmentation strategy in each stage, as in the schedule sketched below.
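A minimal driver for the two-stage schedule, assuming hypothetical `warmup_step` and `formal_step` helpers (both sketched later in this description); names and signatures are illustrative, not taken from the patent.

```python
def train(model, optimizer, loader, epochs, warmup_epochs, eta, gamma_star):
    """Two-stage anti-noise schedule: warm-up first, then formal training."""
    for epoch in range(epochs):
        for x_weak, x_strong, labels in loader:  # loader yields both augmented views
            if epoch < warmup_epochs:
                warmup_step(model, optimizer, x_weak, x_strong, labels, eta, gamma_star)
            else:
                formal_step(model, optimizer, x_weak, x_strong, labels, eta, gamma_star)
```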
Further, the slack loss in S1 is defined piecewise over the sample slack margin; the exact piecewise formula appears only as an image in the original publication and is not reproduced in this text extraction.
further onThe warm-up stage in S2, using weak enhancement data directly
Figure 193233DEST_PATH_IMAGE032
And strong enhancement data
Figure 992562DEST_PATH_IMAGE033
Calculating the relaxation loss as the noise rate
Figure 109423DEST_PATH_IMAGE029
And
Figure 742529DEST_PATH_IMAGE030
as weights, the overall loss is calculated:
Figure 843209DEST_PATH_IMAGE034
wherein,
Figure 852754DEST_PATH_IMAGE035
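A sketch of one warm-up update under this weighting, reusing the hypothetical `slack_loss` helper above; pairing $\eta$ with the weak branch is an assumption consistent with the stated motivation that strong augmentation is less reliable as the noise rate grows.

```python
def warmup_step(model, optimizer, x_weak, x_strong, labels, eta, gamma_star):
    """Warm-up update: L = eta * L_w + (1 - eta) * L_s over the whole batch."""
    loss_w = slack_loss(model(x_weak), labels, eta, gamma_star).mean()
    loss_s = slack_loss(model(x_strong), labels, eta, gamma_star).mean()
    loss = eta * loss_w + (1.0 - eta) * loss_s
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```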
further, the formal training phase in S2 includes the following steps:
s21, screening out the slack loss according to the slack loss in the preheating stage
Figure 824121DEST_PATH_IMAGE036
Figure 221604DEST_PATH_IMAGE037
As weak enhancement data
Figure 153788DEST_PATH_IMAGE032
And strong enhancement data
Figure 357236DEST_PATH_IMAGE033
Front of minimum medium slack loss
Figure 324055DEST_PATH_IMAGE030
A partial sample image;
s22, according toWeak enhanced data after screening
Figure 426528DEST_PATH_IMAGE036
From strong enhancement data
Figure 439483DEST_PATH_IMAGE033
Obtained by intermediate sampling
Figure 790830DEST_PATH_IMAGE038
According to the screened strong enhancement data
Figure 736789DEST_PATH_IMAGE037
From weak enhancement of data
Figure 148179DEST_PATH_IMAGE032
Obtained by intermediate sampling
Figure 179589DEST_PATH_IMAGE039
Screening out the remaining noise data;
s23, obtaining
Figure 334627DEST_PATH_IMAGE039
Figure 135093DEST_PATH_IMAGE038
As correct sample image, at the noise rate
Figure 45280DEST_PATH_IMAGE029
And
Figure 439352DEST_PATH_IMAGE030
as weight, calculating the overall loss, returning the loss, updating network parameters:
Figure 257135DEST_PATH_IMAGE040
wherein,
Figure 318632DEST_PATH_IMAGE041
further, in the S21, the
Figure 196458DEST_PATH_IMAGE036
Figure 812248DEST_PATH_IMAGE037
The following screens were used:
Figure 657492DEST_PATH_IMAGE042
Figure 432550DEST_PATH_IMAGE043
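A sketch of the S21 screening rule: keep the fraction $1-\eta$ of samples with the smallest per-sample slack loss. The function name is illustrative, and `per_sample_loss` is assumed to come from the warm-up-stage slack loss.

```python
import torch

def screen_small_loss(per_sample_loss: torch.Tensor, eta: float) -> torch.Tensor:
    """Indices of the (1 - eta) fraction of samples with the smallest loss,
    i.e. the samples treated as clean in S21."""
    keep = max(1, int((1.0 - eta) * per_sample_loss.numel()))
    return torch.topk(per_sample_loss, keep, largest=False).indices
```

S22's cross-sampling then evaluates the strong branch on the indices screened from the weak branch and vice versa, as composed in the formal-stage step sketched near the end of this description.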
further, in the above S1, an optimum interval is set
Figure 356644DEST_PATH_IMAGE044
Figure 115521DEST_PATH_IMAGE045
For training data points
Figure 150473DEST_PATH_IMAGE046
Sample interval
Figure 780038DEST_PATH_IMAGE047
Greater than the optimum interval
Figure 202929DEST_PATH_IMAGE048
Therefore, it needs to be pushed to the class boundary to make the data boundary more gradual; for the sample interval at
Figure 793310DEST_PATH_IMAGE049
Data points within the interval
Figure 756587DEST_PATH_IMAGE050
Figure 381603DEST_PATH_IMAGE051
In the opposite direction, so that the data point has a certain probability of turning into the other side of the class boundary;
Figure 37712DEST_PATH_IMAGE048
Figure 115390DEST_PATH_IMAGE052
indicating for a category
Figure 820041DEST_PATH_IMAGE008
And
Figure 427127DEST_PATH_IMAGE053
is not an exact formula but is stated to be inversely proportional to the number of samples corresponding to the class in view of the relationship between the two classes
Figure 863925DEST_PATH_IMAGE054
And
Figure 819111DEST_PATH_IMAGE055
is/are as follows
Figure 999557DEST_PATH_IMAGE056
To the power. Thereby setting the sample-dependent tolerance range.
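A sketch of such class-dependent optimal margins; the scale constant `c` is a free hyperparameter (an assumption — the patent fixes only the inverse proportionality to $n_j^{1/4}$).

```python
import numpy as np

def optimal_margins(class_counts, c: float = 1.0) -> np.ndarray:
    """Per-class optimal margin gamma*_j proportional to n_j^(-1/4)."""
    counts = np.asarray(class_counts, dtype=np.float64)
    return c / np.power(counts, 0.25)

# e.g. a 100x imbalance gives the rarest class a ~3.2x larger margin:
# optimal_margins([5000, 50]) -> array([0.1189..., 0.3761...])
```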
Further, for the slack variable $\epsilon_n$ in S1, the uniform distribution $U(0, \gamma^*)$ is multiplied by $\eta$, and the slack variable $\epsilon_n$ is drawn from the product, i.e. $\epsilon_n \sim \eta \cdot U(0, \gamma^*)$, where $\eta$ denotes the noise rate, i.e. the probability of a sample label being wrong.
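The same draw as a standalone one-line NumPy sketch:

```python
import numpy as np

def draw_slack(n_samples: int, eta: float, gamma_star: float) -> np.ndarray:
    """eps_n ~ eta * U(0, gamma*): uniform draws scaled by the noise rate."""
    return eta * np.random.uniform(0.0, gamma_star, size=n_samples)
```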
Further, for the setting of the long-tail distribution data, the total number of samples is $N$; the number of training samples of each class $j$ of the training data is $n_j$, satisfying $n_1 \geq n_2 \geq \cdots \geq n_C$. The ratio of the largest class size to the smallest class size is used as the imbalance factor $\mu$, i.e. $\mu = n_1 / n_C$.
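Per-class sample counts for an exponentially decaying long-tail split with a given imbalance factor $\mu$ can be generated as below; this is a common construction for long-tailed CIFAR, offered as an assumption since the patent does not spell out its exact protocol.

```python
import numpy as np

def longtail_counts(n_max: int, num_classes: int, mu: float) -> np.ndarray:
    """Class sizes decaying exponentially from n_max down to n_max / mu."""
    decay = np.power(1.0 / mu, np.arange(num_classes) / (num_classes - 1))
    return np.maximum(1, n_max * decay).astype(int)

# e.g. longtail_counts(5000, 10, 100) -> 5000, 2995, ..., 50 for CIFAR-10-LT
```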
Further, the sample image and its noise label $(x_n, \tilde{y}_n)$ in S1 are represented via a transition matrix $T$, which represents the noise label:

$$T_{ij} = P(\tilde{y} = j \mid y = i)$$

wherein $y_n$ denotes the class corresponding to the sample image $x_n$, $x_n$ denotes the $n$-th sample image, $T_{ij}$ denotes the probability that class $i$ is mislabeled as class $j$, and each row of $T$ sums to 1. For the setting of the noise data there are two cases: class-independent noise and class-dependent noise. Class-independent noise assumes that the mislabeled samples are randomly and uniformly distributed, while class-dependent noise focuses on human labeling errors caused by visual similarity. Both types of noise distribution can be represented by the transition matrix $T$.
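A sketch constructing the two transition matrices; for the class-dependent case, the mapping of each class to its visually similar target is dataset-specific, so it is passed in as an assumed dictionary, and classes without a target stay clean.

```python
import numpy as np

def symmetric_T(num_classes: int, eta: float) -> np.ndarray:
    """Class-independent noise: off-diagonal mass eta spread uniformly."""
    T = np.full((num_classes, num_classes), eta / (num_classes - 1))
    np.fill_diagonal(T, 1.0 - eta)
    return T

def asymmetric_T(num_classes: int, eta: float, target: dict) -> np.ndarray:
    """Class-dependent noise: class i flips only to its similar class target[i]."""
    T = np.eye(num_classes)
    for i, j in target.items():
        T[i, i] = 1.0 - eta
        T[i, j] = eta
    return T
```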
Further, the sample image and its noise label $(x_n, \tilde{y}_n)$ in S1 are sampled from the noisy dataset $\tilde{D} = \{(x_n, \tilde{y}_n)\}_{n=1}^{N}$; correspondingly, the sample image and its correct label $(x_n, y_n)$ are sampled from the clean dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, wherein $x_n$ denotes the $n$-th sample image, $y_n$ denotes the class corresponding to the sample image $x_n$, $N$ is the number of samples, and the pairs $(x_n, y_n)$ are drawn i.i.d. from the underlying data distribution $\mathcal{D}$.
The invention has the following advantages and beneficial effects:

Starting from the class-dependent margin, the method introduces a sample-dependent slack variable, relaxes the margin constraint, and increases the tolerance of the classification prediction, thereby bearing the risk of misclassification caused by noise or imbalanced distribution. Considering the negative influence of strong data augmentation on high-noise-rate datasets, the method computes the slack loss of the training stage separately on weakly and strongly augmented data; injecting strong data augmentation in the warm-up stage of training improves performance on low-noise datasets but is counterproductive as dataset noise increases, whereas weak data augmentation in the warm-up stage greatly improves training on high-noise data. The influence of noise labels under a long-tail distribution is thereby reduced.
Drawings
FIG. 1a is a graph of accuracy versus loss variation for noise sample learning on a CIFAR-10 data set.
FIG. 1b is a graph of accuracy versus loss variation for long tail distribution learning on a CIFAR-10 dataset.
FIG. 2a is a distribution diagram of class-independent (symmetric) noise with noise rate $\eta$.
FIG. 2b is a distribution diagram of class-dependent (asymmetric) noise with noise rate $\eta$.
Fig. 2c is a distribution diagram of class independent noise under a long tail distribution.
Fig. 2d is a distribution diagram of class-correlated noise under a long-tailed distribution.
FIG. 3 is a diagram of the sample-dependent tolerance in the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
Noise label learning has received much attention in recent years and has achieved impressive results. However, existing deep neural networks (DNNs) still have drawbacks in addressing noisy labels and long-tail learning. As shown in Fig. 1a, for a symmetric noise rate $\eta$, when a DNN is used to fit noise labels, the fluctuation of the validation accuracy reflects the noise capacity of the model. As shown in Fig. 1b, for a given imbalance factor $\mu$, applying a DNN to long-tail distribution learning exhibits similar behavior: the DNN fits the main classes first, then gradually the tail classes. From the above analysis it can be found that the contradiction between few-sample class learning and noise fitting confounds the predictions of the neural network, bringing new challenges to noise label learning under a long-tail distribution.
Deep neural networks (DNNs), during training with noisy data, tend to memorize common patterns first and then fit noise samples step by step. A similar process occurs in class-imbalanced learning, where the network tends to fit the main classes first and then gradually overfit the tail classes. In this regard, the invention starts from the class-dependent margin and introduces a sample-dependent slack variable (Slack Variable) for bearing the risk of misclassification caused by noise or imbalanced distribution. In addition, the invention provides an anti-noise data augmentation strategy.
1. Experimental setup and preparation
The invention mainly solves the problem of classifying data that simultaneously has long-tail characteristics and noisy labels in an image classification task. Define the input space as $\mathcal{X} = \{x_n\}_{n=1}^{N}$, where $x_n$ is the $n$-th input image and $N$ is the total number of samples; the label set of the $C$ categories is $[C] = \{1, \dots, C\}$; the underlying data distribution is $\mathcal{D}$. Training data pairs $(x_n, y_n)$ are sampled from $D = \{(x_n, y_n)\}_{n=1}^{N}$, where $N$ is the total number of samples and $y_n$ denotes the class corresponding to the input image $x_n$. The image classification task designed in the invention therefore aims, by training on $D$, to derive a prediction function $f: \mathcal{X} \to \mathbb{R}^C$, meaning that an input from $\mathcal{X}$ is passed through $f$ and the classification result is output, so that the number of correct predictions is maximized. In short, $f$ predicts to which class each input image $x_n$ belongs and outputs the prediction $y_n$; the optimization goal of $f$ is to maximize the number of correct prediction results; $\mathbb{R}$ denotes the real number field.
For the setting of the long-tail distribution data, let the total number of samples be $N$; the number of training samples of each class $j$ of the training data is defined as $n_j$, satisfying $n_1 \geq n_2 \geq \cdots \geq n_C$. The invention defines the ratio between the largest and smallest class sizes as the imbalance factor $\mu$, i.e. $\mu = n_1 / n_C$. As shown in Figs. 2c and 2d, the distribution of long-tail data generally follows an exponential decay.
For the setting of the noise data there are two cases: class-independent noise and class-dependent noise. Class-independent noise assumes that the mislabeled samples are randomly and uniformly distributed, while class-dependent noise focuses on human labeling errors caused by visual similarity. Both types of noise distribution can be represented by a transition matrix $T$, each element $T_{ij}$ of which represents the probability that class $i$ is mislabeled as class $j$. The corresponding correct samples and their labels $(x_n, y_n)$ are sampled from the clean dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, and the samples of the data and their noise labels $(x_n, \tilde{y}_n)$ are sampled from the noisy dataset $\tilde{D} = \{(x_n, \tilde{y}_n)\}_{n=1}^{N}$, where $N$ is the number of samples. $T_{ij}$ can be defined as:

$$T_{ij} = P(\tilde{y} = j \mid y = i)$$

As shown in Figs. 2a and 2b, class-independent noise, or symmetric noise, assumes that the mislabeled samples of a certain class are evenly distributed over the other classes, i.e. $T_{ij} = \frac{\eta}{C-1}$ for $j \neq i$ and $T_{ii} = 1 - \eta$, where $\eta$ denotes the noise rate, i.e. the probability of a sample label being wrong; class-dependent noise, or asymmetric noise, assumes that the mislabeled samples of a certain class $i$ are all mislabeled as another specific class $j$, i.e. $T_{ij} = \eta$ and $T_{ii} = 1 - \eta$. Figs. 2c and 2d show the setting of noise data under a long-tail distribution.
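Given $T$, a noisy label set can be synthesized by sampling each $\tilde{y}_n$ from the row of $T$ indexed by the clean label $y_n$; a minimal sketch:

```python
import numpy as np

def corrupt_labels(labels: np.ndarray, T: np.ndarray, seed: int = 0) -> np.ndarray:
    """Sample noisy labels: y_tilde_n ~ Categorical(T[y_n])."""
    rng = np.random.default_rng(seed)
    num_classes = T.shape[0]
    return np.array([rng.choice(num_classes, p=T[y]) for y in labels])
```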
2. Sample-dependent tolerance range setting

A conventional DNN classification network typically follows the feature extractor with a linear conversion layer; however, this strategy tends to make the classifier fall into linear inseparability when fitting noisy data. The invention therefore proposes a slack variable $\epsilon_n$ to relax the margin constraint and increase the tolerance of classification. The slack variable $\epsilon_n$ is empirically restricted by the corresponding optimal margin $\gamma^*$, i.e. $\epsilon_n \in [0, \gamma^*]$; for data with noisy samples, since each sample has probability $\eta$ (the noise rate) of being a wrong sample, the uniform distribution $U(0, \gamma^*)$ is here multiplied by the noise rate $\eta$, and the slack variable is drawn from the product, i.e. $\epsilon_n \sim \eta \cdot U(0, \gamma^*)$. On the basis of the sample margin $\gamma(x_n, \tilde{y}_n)$, the invention defines the slack margin as:

$$\tilde{\gamma}(x_n, \tilde{y}_n) = f(x_n)_{\tilde{y}_n} - \max_{j \neq \tilde{y}_n} f(x_n)_j + \epsilon_n$$

The slack margin increases the tolerance of the classification prediction. As shown in Fig. 3, an optimal margin $\gamma^*$ is set. A training data point whose functional margin $\gamma$ is greater than the optimal margin $\gamma^*$ needs to be pushed toward the class boundary to make the data boundary more gradual; for a data point whose functional margin lies within $[0, \gamma^*]$, the slack acts in the opposite direction, so that the data point has a certain probability of flipping to the other side of the class boundary. The optimal margins $\gamma^*_i$ and $\gamma^*_j$, corresponding to $\gamma^*$ in the theoretical calculation above, describe classes $i$ and $j$ in this embodiment; they are not given by an exact formula, but in view of the relationship between the two classes are specified to be inversely proportional to the $1/4$ power of the corresponding class sample counts $n_i$ and $n_j$.
3. Slack margin loss

For the slack margin $\tilde{\gamma}(x_n, \tilde{y}_n)$ of a particular noise sample, the sample-dependent slack loss is defined piecewise; the exact piecewise formula appears only as an image in the original publication and is not reproduced in this text extraction.
4. Weak and strong data augmentation strategies

The invention uses two data augmentation strategies, namely weak data augmentation and strong data augmentation. Weak augmentation is implemented as simple random flipping and cropping, while strong augmentation uses the AutoAugment implementation and adopts the augmentation policy automatically selected by a search algorithm on ImageNet.
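A sketch of the two pipelines, assuming torchvision ≥ 0.11 (where `AutoAugment` is available) and CIFAR-sized 32×32 inputs; the crop padding value follows common CIFAR practice and is an assumption.

```python
from torchvision import transforms
from torchvision.transforms import AutoAugment, AutoAugmentPolicy

# Weak augmentation: simple random flip and crop.
weak_aug = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
])

# Strong augmentation: AutoAugment with the policy searched on ImageNet.
strong_aug = transforms.Compose([
    AutoAugment(AutoAugmentPolicy.IMAGENET),
    transforms.ToTensor(),
])
```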
5. Anti-noise data augmentation strategy implemented in stages

Injecting strong data augmentation in the warm-up stage of training can improve performance when training on low-noise datasets, but is counterproductive as dataset noise increases. Conversely, weak data augmentation in the warm-up stage greatly improves training on high-noise data. Based on this observation, the invention divides model training into two stages and adjusts the augmentation strategy in each stage. In the warm-up stage, the weakly augmented data $\mathcal{A}_w(x_n)$ and the strongly augmented data $\mathcal{A}_s(x_n)$ are used directly to compute the loss, namely:

$$\mathcal{L}_{warm} = \eta \cdot \mathcal{L}_w + (1-\eta) \cdot \mathcal{L}_s$$

In the formal training stage, the samples $\hat{X}$, whose screened number is the fraction $1-\eta$ of the total number of samples, are taken as "correct samples" for computing the loss and updating parameters, and the remaining noise data is screened out; the loss is defined as:

$$\mathcal{L}_{train} = \eta \cdot \mathcal{L}_{slack}(\hat{X}_w) + (1-\eta) \cdot \mathcal{L}_{slack}(\hat{X}_s)$$

wherein $\mathcal{L}_w$ and $\mathcal{L}_s$ denote the slack losses on the weakly and strongly augmented data, respectively.
6. Training-stage data screening strategy

The invention screens the data in the formal training stage: from the training data with noise rate $\eta$, valid samples $\hat{X}$ amounting to the fraction $1-\eta$ of the total number of samples are selected as "correct samples" for computing the loss and updating parameters, and the remaining noise data is screened out. The screening process first separately defines $\tilde{X}_w$ and $\tilde{X}_s$ as:

$$\tilde{X}_w = \underset{S:\,|S| = (1-\eta)N}{\arg\min} \sum_{n \in S} \mathcal{L}_{slack}(\mathcal{A}_w(x_n), \tilde{y}_n), \qquad \tilde{X}_s = \underset{S:\,|S| = (1-\eta)N}{\arg\min} \sum_{n \in S} \mathcal{L}_{slack}(\mathcal{A}_s(x_n), \tilde{y}_n)$$

i.e. $\tilde{X}_w$ and $\tilde{X}_s$ respectively denote the fraction $1-\eta$ of the weakly augmented data $\mathcal{A}_w(X)$ and of the strongly augmented data $\mathcal{A}_s(X)$ with the smallest slack loss. Then, according to the screened weakly augmented data $\tilde{X}_w$, $\hat{X}_s$ is obtained by sampling from the strongly augmented data $\mathcal{A}_s(X)$; similarly, according to the screened strongly augmented data $\tilde{X}_s$, $\hat{X}_w$ is obtained by sampling from the weakly augmented data $\mathcal{A}_w(X)$. The remaining noise data is screened out.
Specifically, the sample-dependent slack margin loss learning method comprises the following steps:

Step 1: according to the noise characteristics of the data, for each sample and its noise label $(x_n, \tilde{y}_n)$, after computing the sample functional margin, a sample-dependent slack variable (Slack Variable) $\epsilon_n$ is introduced; by relaxing the margin constraint, the sample-dependent smooth slack loss (Slack Loss) $\mathcal{L}_{slack}$ is computed piecewise according to the sample margin.

The data samples and their noise labels $(x_n, \tilde{y}_n)$ are sampled from the noisy dataset $\tilde{D} = \{(x_n, \tilde{y}_n)\}_{n=1}^{N}$; the corresponding correct samples and their labels $(x_n, y_n)$ are sampled from the clean dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, wherein $N$ is the number of samples, $\eta$ is the noise rate denoting the probability of a sample label being wrong, the pairs $(x_n, y_n)$ are drawn i.i.d. from the underlying data distribution $\mathcal{D}$, $\mathcal{X}$ is the input space, and $[C]$ is the label set of the $C$ categories.

For the prediction function $f$, the sample margin $\gamma(x_n, \tilde{y}_n)$ is defined as:

$$\gamma(x_n, \tilde{y}_n) = f(x_n)_{\tilde{y}_n} - \max_{j \neq \tilde{y}_n} f(x_n)_j$$

Meanwhile, the margin of class $j$ is defined as $\gamma_j = \min_{i \in S_j} \gamma_i$, wherein $\gamma_i$ denotes the margin of the sample with index $i$, $x_i$, whose label $y_i$ is class $j$, i.e. sample $x_i$ belongs to class $j$; accordingly, $S_j$ denotes the set of indices of all samples belonging to class $j$.

For the slack margin $\tilde{\gamma}(x_n, \tilde{y}_n)$ of a particular noise sample, a slack variable $\epsilon_n$ is introduced on the basis of the sample margin $\gamma(x_n, \tilde{y}_n)$, and the slack margin is defined as:

$$\tilde{\gamma}(x_n, \tilde{y}_n) = \gamma(x_n, \tilde{y}_n) + \epsilon_n$$

The slack variable $\epsilon_n$ of a sample is empirically restricted by the corresponding optimal margin $\gamma^*$, i.e. $\epsilon_n \in [0, \gamma^*]$; for a noise rate of $\eta$, the uniform distribution $U(0, \gamma^*)$ is multiplied by the noise rate $\eta$ and the slack variable is drawn from the product, i.e. $\epsilon_n \sim \eta \cdot U(0, \gamma^*)$.

For the slack margin $\tilde{\gamma}(x_n, \tilde{y}_n)$ of a noise sample, the sample-dependent slack loss is defined piecewise; the exact formula appears only as an image in the original publication and is not reproduced in this text extraction.

Step 2: according to the long-tail characteristics of the data, a data augmentation strategy (Data Augmentation) adjusted in stages is implemented, applying strong and weak data augmentation to the samples respectively. In the warm-up stage, the slack loss is computed directly; in the formal training stage, a mechanism is provided to screen small-loss samples as clean data, screen out noise data, and compute the slack loss.

For each sample pair $(x_n, \tilde{y}_n)$ in the noisy dataset $\tilde{D}$, weak and strong data augmentation are applied to the input $x_n$ respectively, obtaining the corresponding weakly augmented data $\mathcal{A}_w(x_n)$ and strongly augmented data $\mathcal{A}_s(x_n)$. Considering the negative influence of strong data augmentation on high-noise-rate datasets, the invention computes and sums the training-stage slack losses separately on the weakly augmented data $\mathcal{A}_w(x_n)$ and the strongly augmented data $\mathcal{A}_s(x_n)$, with the noise rate $\eta$ and $1-\eta$ as weights; the loss is defined as:

$$\mathcal{L} = \eta \cdot \mathcal{L}_w + (1-\eta) \cdot \mathcal{L}_s$$

wherein $\mathcal{L}_w$ and $\mathcal{L}_s$ denote the slack losses on the weakly and strongly augmented data, respectively. Training is divided into a warm-up stage and a formal stage, and the losses and parameter updates are computed in the following ways:

2.1: in the warm-up stage, the weakly augmented data $\mathcal{A}_w(x_n)$ and the strongly augmented data $\mathcal{A}_s(x_n)$ are used directly to compute the loss, namely:

$$\mathcal{L}_{warm} = \eta \cdot \mathcal{L}_w + (1-\eta) \cdot \mathcal{L}_s$$

2.2: in the formal training stage, according to the slack loss of the samples, samples $\hat{X}$ whose screened number is the fraction $1-\eta$ of the total number of samples are taken as "correct samples" for computing the loss and updating parameters, the remaining noise data is screened out, and the loss is defined as:

$$\mathcal{L}_{train} = \eta \cdot \mathcal{L}_{slack}(\hat{X}_w) + (1-\eta) \cdot \mathcal{L}_{slack}(\hat{X}_s)$$

In the formal training stage, to screen the samples $\hat{X}$, $\tilde{X}_w$ and $\tilde{X}_s$ are first separately defined as the fraction $1-\eta$ of the weakly augmented data $\mathcal{A}_w(X)$ and of the strongly augmented data $\mathcal{A}_s(X)$ with the smallest slack loss; then, according to the screened weakly augmented data $\tilde{X}_w$, $\hat{X}_s$ is obtained by sampling from the strongly augmented data $\mathcal{A}_s(X)$; similarly, according to the screened strongly augmented data $\tilde{X}_s$, $\hat{X}_w$ is obtained by sampling from the weakly augmented data $\mathcal{A}_w(X)$, and the remaining noise data is screened out. Using the obtained $\hat{X}_w$ and $\hat{X}_s$, the loss is computed with the above formula, back-propagated, and the network parameters are updated, as composed in the sketch below.
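Putting the pieces together, one formal-stage update might look as follows, reusing the hypothetical `slack_loss` and `screen_small_loss` helpers sketched earlier; this is an illustrative composition under those assumptions, not the patent's reference implementation.

```python
import torch

def formal_step(model, optimizer, x_weak, x_strong, labels, eta, gamma_star):
    """One formal-stage update: per-sample losses, cross-screening, weighted sum."""
    loss_w = slack_loss(model(x_weak), labels, eta, gamma_star)    # per-sample
    loss_s = slack_loss(model(x_strong), labels, eta, gamma_star)  # per-sample
    idx_w = screen_small_loss(loss_w.detach(), eta)  # clean set from the weak branch
    idx_s = screen_small_loss(loss_s.detach(), eta)  # clean set from the strong branch
    # Cross-sampling: each branch trains on the other branch's clean indices.
    loss = eta * loss_w[idx_s].mean() + (1.0 - eta) * loss_s[idx_w].mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```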
As shown in Table 1, on the CIFAR-10 and CIFAR-100 datasets with noise, using ResNet34 as the common network framework and setting noise rates $\eta$ for class-independent noise and class-dependent noise respectively, comparisons were made against Bootstrap, Forward, GCE, SCE, and other methods. For class-independent noise, the slack loss proposed here outperforms all the other methods. For class-dependent noise, the slack loss is slightly better than the other methods at low noise rates, but its accuracy is not high at high noise rates. For this we give a reasonable explanation: the slack variable introduced by the invention adds a random perturbation to the sample label distribution, and this perturbation may have a negative effect because class-dependent noise is accurate within the non-corresponding classes; when the noise rate is small, the regularizing adjustment of the slack loss can balance this negative effect.

As shown in Table 2, on the CIFAR-10 and CIFAR-100 datasets with long-tail distributions, using ResNet34 as the common network framework and setting imbalance factors $\mu$, comparisons were made against Focal Loss, Mixup, CE-DRW, CE-DRS, LDAM-DRW, BBN, and other methods. It can be seen that when the data is extremely imbalanced, the classification accuracy of the invention is much higher than that of the other methods. When the imbalance is mild, the performance of the slack loss is somewhat inadequate, for reasons similar to the explanation of noise learning in the previous paragraph.
TABLE 1 (given as an image in the original publication): classification accuracy (%) of different methods on the noisy datasets CIFAR-10/100; the highest accuracy is marked in bold and the second-highest in italic bold. The Slack Loss method uses the slack loss (Slack Loss) of the invention as the loss function but without the data augmentation strategy; the Slack Loss+ method uses both the slack loss and the data augmentation strategy of the invention, i.e. the complete method given by the invention.

TABLE 2 (given as an image in the original publication): classification accuracy (%) of different methods on the long-tail datasets CIFAR-10/100; the highest accuracy is marked in bold and the second-highest in italic bold. The Slack Loss method uses the slack loss (Slack Loss) of the invention as the loss function.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A long-tail distribution image classification method with noise labels, characterized by comprising the following steps:

S1, according to the noise characteristics of the data, for the sample image and its noise label $(x_n, \tilde{y}_n)$, introducing a slack variable $\epsilon_n$ on the basis of the sample margin $\gamma(x_n, \tilde{y}_n)$ to form the sample slack margin $\tilde{\gamma}(x_n, \tilde{y}_n)$ of the noisy sample, the class margin being $\gamma_j = \min_{i \in S_j} \gamma_i$, wherein $\gamma_i$ denotes the margin of the $i$-th sample $x_i$ whose label $y_i$ is class $j$, and $S_j$ denotes the set of indices of all samples belonging to class $j$;

the sample slack margin being:

$$\tilde{\gamma}(x_n, \tilde{y}_n) = f(x_n)_{\tilde{y}_n} - \max_{j \neq \tilde{y}_n} f(x_n)_j + \epsilon_n$$

wherein $(x_n, y_n)$ denotes the sample image and its correct label, $f: \mathcal{X} \to \mathbb{R}^C$ denotes the prediction function used to predict to which class a sample image belongs, $\mathcal{X}$ is the sample space, $N$ is the total number of samples, $[C]$ is the label set of the $C$ categories, $\mathbb{R}$ denotes the real number field, $\max_{j \neq \tilde{y}_n} f(x_n)_j$ denotes the largest value obtained by the prediction function over labels $j$ different from the noise label $\tilde{y}_n$ for the corresponding $x$, $\epsilon_n \in [0, \gamma^*]$, and $\gamma^*$ denotes the optimal margin;

computing piecewise the sample-dependent slack loss $\mathcal{L}_{slack}$ according to the sample margin;

S2, according to the long-tail characteristics of the data, with a data augmentation strategy adjusted in stages, performing weak and strong data augmentation respectively on the sample image $x_n$ to obtain corresponding weakly augmented and strongly augmented data, dividing training into a warm-up stage and a formal stage, and directly computing the slack losses of the weakly and strongly augmented data in the warm-up stage; in the formal training stage, according to the slack loss of the warm-up stage, screening a group of sample images with small slack loss as clean data, screening out the remaining noisy data, and computing the slack loss.
2. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein the slack loss in S1 is defined piecewise over the sample slack margin (the exact formula is given as an image in the original publication and is not reproduced in this text extraction).
3. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein in the warm-up stage in S2, the weakly augmented data $\mathcal{A}_w(x_n)$ and the strongly augmented data $\mathcal{A}_s(x_n)$ are used directly to compute the slack loss, and the overall loss is computed with the noise rate $\eta$ and $1-\eta$ as weights:

$$\mathcal{L}_{warm} = \eta \cdot \mathcal{L}_w + (1-\eta) \cdot \mathcal{L}_s$$

wherein $\mathcal{L}_w$ and $\mathcal{L}_s$ denote the slack losses computed on the weakly and strongly augmented data, respectively.
4. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein the formal training stage in S2 comprises the following steps:

S21, according to the slack loss of the warm-up stage, screening out $\tilde{X}_w$ and $\tilde{X}_s$ as the fraction $1-\eta$ of sample images with the smallest slack loss in the weakly augmented data $\mathcal{A}_w(X)$ and the strongly augmented data $\mathcal{A}_s(X)$, respectively;

S22, according to the screened weakly augmented data $\tilde{X}_w$, obtaining $\hat{X}_s$ by sampling from the strongly augmented data $\mathcal{A}_s(X)$; according to the screened strongly augmented data $\tilde{X}_s$, obtaining $\hat{X}_w$ by sampling from the weakly augmented data $\mathcal{A}_w(X)$; screening out the remaining noise data;

S23, taking the obtained $\hat{X}_w$ and $\hat{X}_s$ as correct sample images and computing the overall loss with the noise rate $\eta$ and $1-\eta$ as weights:

$$\mathcal{L}_{train} = \eta \cdot \mathcal{L}_{slack}(\hat{X}_w) + (1-\eta) \cdot \mathcal{L}_{slack}(\hat{X}_s)$$

wherein $\mathcal{L}_{slack}(\cdot)$ denotes the slack loss averaged over the given screened set.
5. The method for classifying long-tail distribution images with noise labels according to claim 4, wherein in S21, $\tilde{X}_w$ and $\tilde{X}_s$ are screened as follows:

$$\tilde{X}_w = \underset{S:\,|S| = (1-\eta)N}{\arg\min} \sum_{n \in S} \mathcal{L}_{slack}(\mathcal{A}_w(x_n), \tilde{y}_n), \qquad \tilde{X}_s = \underset{S:\,|S| = (1-\eta)N}{\arg\min} \sum_{n \in S} \mathcal{L}_{slack}(\mathcal{A}_s(x_n), \tilde{y}_n)$$
6. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein in S1 an optimal margin $\gamma^*$ is set; a training data point whose sample margin $\gamma$ is greater than the optimal margin $\gamma^*$ is pushed toward the class boundary, making the data boundary more gradual; for a data point whose sample margin lies within the interval $[0, \gamma^*]$, the slack acts in the opposite direction, so that the data point has a certain probability of flipping to the other side of the class boundary; the optimal margins $\gamma^*_i$ and $\gamma^*_j$ for classes $i$ and $j$ are inversely proportional to the $1/4$ power of the corresponding class sample counts $n_i$ and $n_j$.
7. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein for the slack variable $\epsilon_n$ in S1, the uniform distribution $U(0, \gamma^*)$ is multiplied by $\eta$ and the slack variable $\epsilon_n$ is drawn from the product, i.e. $\epsilon_n \sim \eta \cdot U(0, \gamma^*)$, where $\eta$ denotes the noise rate, i.e. the probability of a sample label being wrong.
8. The method of claim 1, wherein the total number of samples is $N$, the number of training samples of each class $j$ of the training data is $n_j$, satisfying $n_1 \geq n_2 \geq \cdots \geq n_C$, and the ratio of the largest class size to the smallest class size is used as the imbalance factor $\mu$, i.e. $\mu = n_1 / n_C$.
9. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein the sample image and its noise label $(x_n, \tilde{y}_n)$ in S1 are represented via a transition matrix $T$, which represents the noise label:

$$T_{ij} = P(\tilde{y} = j \mid y = i)$$

wherein $y_n$ denotes the class corresponding to the sample image $x_n$, $x_n$ denotes the $n$-th sample image, $T_{ij}$ denotes the probability that class $i$ is mislabeled as class $j$, and each row of $T$ sums to 1.
10. The method for classifying long-tail distribution images with noise labels according to claim 1, wherein the sample image and its noise label $(x_n, \tilde{y}_n)$ in S1 are sampled from the noisy dataset $\tilde{D} = \{(x_n, \tilde{y}_n)\}_{n=1}^{N}$, and correspondingly the sample image and its correct label $(x_n, y_n)$ are sampled from the clean dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, wherein $x_n$ denotes the $n$-th sample image, $y_n$ denotes the class corresponding to the sample image $x_n$, $N$ is the number of samples, and the pairs $(x_n, y_n)$ are drawn i.i.d. from the underlying data distribution $\mathcal{D}$.
CN202111059448.7A 2021-09-10 2021-09-10 Long-tail distribution image classification method with noise label Active CN113516207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111059448.7A CN113516207B (en) 2021-09-10 2021-09-10 Long-tail distribution image classification method with noise label

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111059448.7A CN113516207B (en) 2021-09-10 2021-09-10 Long-tail distribution image classification method with noise label

Publications (2)

Publication Number Publication Date
CN113516207A CN113516207A (en) 2021-10-19
CN113516207B true CN113516207B (en) 2022-01-25

Family

ID=78063294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111059448.7A Active CN113516207B (en) 2021-09-10 2021-09-10 Long-tail distribution image classification method with noise label

Country Status (1)

Country Link
CN (1) CN113516207B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989905A (en) * 2021-11-16 2022-01-28 广东履安实业有限公司 Training of face recognition model, face recognition method and related device
CN113869463B (en) * 2021-12-02 2022-04-15 之江实验室 Long tail noise learning method based on cross enhancement matching
CN114519850A (en) * 2022-04-20 2022-05-20 宁波博登智能科技有限公司 Marking system and method for automatic target detection of two-dimensional image
CN114863193B (en) * 2022-07-07 2022-12-02 之江实验室 Long-tail learning image classification and training method and device based on mixed batch normalization

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945372B (en) * 2012-10-18 2015-06-24 浙江大学 Classifying method based on multi-label constraint support vector machine
CN109543693B (en) * 2018-11-28 2021-05-07 中国人民解放军国防科技大学 Weak labeling data noise reduction method based on regularization label propagation
CN111737552A (en) * 2020-06-04 2020-10-02 中国科学院自动化研究所 Method, device and equipment for extracting training information model and acquiring knowledge graph
CN111832627B (en) * 2020-06-19 2022-08-05 华中科技大学 Image classification model training method, classification method and system for suppressing label noise
CN112101328A (en) * 2020-11-19 2020-12-18 四川新网银行股份有限公司 Method for identifying and processing label noise in deep learning

Also Published As

Publication number Publication date
CN113516207A (en) 2021-10-19

Similar Documents

Publication Publication Date Title
CN113516207B (en) Long-tail distribution image classification method with noise label
Wang et al. Esrgan: Enhanced super-resolution generative adversarial networks
CN111738301B (en) Long-tail distribution image data identification method based on double-channel learning
CN106779064A (en) Deep neural network self-training method based on data characteristics
CN112101544A (en) Training method and device of neural network suitable for long-tail distributed data set
CN109857871A (en) A kind of customer relationship discovery method based on social networks magnanimity context data
CN104143081A (en) Smile recognition system and method based on mouth features
CN107564007B (en) Scene segmentation correction method and system fusing global information
CN116503676B (en) Picture classification method and system based on knowledge distillation small sample increment learning
CN104809706A (en) Single lens computational imaging method based on gentle image color change priori
CN105306296A (en) Data filter processing method based on LTE (Long Term Evolution) signaling
Al-Amaren et al. RHN: A residual holistic neural network for edge detection
CN113837959A (en) Image denoising model training method, image denoising method and image denoising system
CN113011337A (en) Chinese character library generation method and system based on deep meta learning
CN115810191A (en) Pathological cell classification method based on multi-attention fusion and high-precision segmentation network
Xue et al. Research on edge detection operator of a convolutional neural network
Fernandez-Fernandez et al. Quick, stat!: A statistical analysis of the quick, draw! dataset
CN114492631A (en) Spatial attention calculation method based on channel attention
CN116488974B (en) Light modulation identification method and system combined with attention mechanism
Li et al. Learning domain-aware detection head with prompt tuning
Zhou et al. Ec-darts: Inducing equalized and consistent optimization into darts
CN117421657A (en) Sampling and learning method and system for noisy labels based on oversampling strategy
CN117095217A (en) Multi-stage comparative knowledge distillation process
CN111008940A (en) Image enhancement method and device
CN115392344A (en) Long tail identification method for strong and weak double-branch network with difficult sample perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant