CN115564960B - Network image label denoising method combining sample selection and label correction
Network image label denoising method combining sample selection and label correction
- Publication number
- CN115564960B (application CN202211408454.3A)
- Authority
- CN
- China
- Prior art keywords
- sample
- samples
- reusable
- network
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a network image label denoising method combining sample selection and label correction, which comprises the following steps: S1, first selecting clean samples according to the cosine similarity between each sample and its category center; S2, dynamically selecting reusable samples from the remaining samples according to sample uncertainty and correcting them; S3, finally updating the network using the clean samples and the corrected reusable samples. According to the method, after the clean samples are selected according to the cosine similarity between the samples and the category centers, the reusable samples are dynamically selected from the remaining samples according to their uncertainty and are corrected, and finally the clean samples and the corrected reusable samples are used together to update the network, so that the utilization rate of the samples is improved and the fine-grained classification performance is improved at the same time.
Description
Technical Field
The invention relates to the technical field of network label denoising, in particular to a network image label denoising method combining sample selection and label correction.
Background
For the noise problem, besides improving the accuracy of sample selection by reducing the overlap rate between classes, another idea is to further reduce the influence of noisy labels on the neural network by combining noise sample selection with loss correction. Sample-selection-based methods select clean samples for subsequent training by some criterion; part of the noise samples discarded by such methods are in-distribution noise, that is, their true labels are still within the label set of the data set, and these samples are called reusable samples. Reusing this part of the samples can therefore effectively improve sample utilization, which is a problem that needs to be solved for fine-grained image classification, where data sets are scarce.
Disclosure of Invention
The invention aims to provide a network image label denoising method combining sample selection and label correction, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme: a network image label denoising method combining sample selection and label correction comprises the following steps:
s1, firstly, selecting a clean sample according to cosine similarity between the sample and a category center;
s2, dynamically selecting reusable samples from the remaining samples according to sample uncertainty and correcting them;
and S3, finally, updating the network by using the clean sample and the corrected reusable sample.
Further, in S1, the features of the picture are normalized in the Softmax layer, and the output process of the Softmax layer may be represented as w_i x_i = ||w_i|| · ||x_i|| · cos θ_i = cos θ_i (6.2);
after normalization, a hyper-parameter s is used to scale the cosine values, and the Softmax output under the L2 constraint after feature normalization is calculated as P_i = exp(s · cos θ_{y_i}) / Σ_j exp(s · cos θ_j).
Further, after normalization the features are distributed by angle on a hypersphere. The parameters w_j of the last fully-connected layer are the class centers generated by pre-training, and the output of the fully-connected layer is the cosine distance cos θ_j between the picture feature and each class center. The cosine similarity of each picture to its corresponding class center is recorded as H = {h_1, h_2, ..., h_N}, where h_i = cos θ_{y_i} is the cosine similarity between the i-th sample and its class center y_i. H is sorted, and in each batch of training the examples with the largest cosine similarity are sent into a peer network for the next round of training; the selection formula (6.5) keeps the (1 - τ)·|D| samples with the largest cosine similarity, where τ is the correctable drop rate, D is the sample set, and Dr denotes the reusable samples.
Further, after the clean sample set Dc is selected in S1, the remaining samples can be divided into two types, namely the reusable sample set Dr and the noise set Dn; the noise set Dn needs to be discarded in the subsequent training;
When the prediction uncertainty f(x_i) of a sample x_i satisfies the condition f(x_i) ≤ midf (6.6), the sample belongs to the reusable sample set Dr, where f(x_i) is the uncertainty of sample x_i and midf is the median of the uncertainties of the remaining samples (D_r ∪ D_n); the uncertainty of each sample is measured by cross entropy, f(x_i) = -log p_i.
For each sample x_i, the predictions of the last n rounds are recorded as Pre_i = {prec_1, prec_2, ..., prec_n} (6.8); according to Pre_i, the category j predicted the largest number of times and its count m are recorded, and p_i = m/n (6.9) is the probability of sample x_i being predicted as j.
The prediction uncertainty is smallest when the n predictions are all the same, in which case p_i = 1 and f(x_i) = 0; it is largest when the n predictions are all different, in which case p_i = 1/n and f(x_i) = -log(1/n). n is set to 10.
Further, in S3, during the first n training rounds the output of the Softmax layer is smoothed and the network is updated by back-propagating a label-smoothed cross-entropy loss (6.11) in which the one-hot target is mixed with a uniform distribution using a label smoothing factor α.
Further, after n rounds of training, the reusable samples Dr are selected using formula (6.6), their labels are corrected to the category j that was predicted the largest number of times over the n consecutive predictions, and the corrected labels are used to update the network with the cross-entropy loss (6.14).
Compared with the prior art, the invention has the beneficial effects that: according to the method, after the clean samples are selected according to the cosine similarity between the samples and the category centers, the reusable samples are dynamically selected from the rest samples according to the uncertainty of the samples and are corrected, and finally the clean samples and the corrected reusable samples are used together to update the network, so that the utilization rate of the samples is improved, and meanwhile, the fine-grained classification performance is improved.
Drawings
FIG. 1 is a schematic diagram of the first half of the CSSLC framework of the present invention;
FIG. 2 is a schematic diagram of the second half of the CSSLC framework of the present invention;
FIG. 3 is a flow diagram of the steps of the CSSLC method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, fig. 2 and fig. 3, the invention is a network image label denoising method combining sample selection and label correction (Combining Sample Selection with Loss Correction, CSSLC for short). Different from pure sample selection methods and pure loss correction methods, the method performs loss correction on a portion of reusable samples on the basis of sample selection, which can greatly improve the sample utilization rate and the image classification performance;
first, the sample set D is divided into three sets: the clean sample set Dc, the reusable sample set Dr and the noise set Dn. The sample set is D = {(x_i, y_i)}, where x_i is the i-th training sample and y_i is its label; for the reusable sample set Dr, y_i is not the true label of sample x_i, and the true label is recorded as y*_i. The next step distinguishes the clean sample set Dc, the reusable sample set Dr and the noise set Dn, performs loss correction on the reusable sample set Dr, and then feeds it into the network for training;
on the premise that clean samples have been chosen by sample selection, reusable samples are dynamically chosen again, from the noise samples that would otherwise be discarded, according to the uncertainty of each sample, and loss correction is carried out on them: the higher the uncertainty of a sample, the more likely it is a noise sample, and the lower the uncertainty, the more likely it is a reusable sample.
In this embodiment, the conventional sample selection method first calculates the loss of each sample and then selects samples according to the small-loss criterion; the present method instead selects samples according to the cosine similarity between each sample and its class center before calculating the loss, and then calculates the loss using only the selected samples.
Based on a simple observation, the network fits simple clean samples first, so the cosine similarity between a simple clean sample and its class center will be higher than that of a noise sample; therefore the clean samples are selected directly according to the cosine similarity between each sample and its class center.
The goal of Softmax is to maximize the probability of correct classification as much as possible, so it ignores pictures that are difficult to distinguish, i.e. low-quality pictures, and preferentially fits high-quality pictures. In order to increase the utilization of the pictures, the features of each picture are normalized at the Softmax layer so that hard examples gain more attention from the network, and the final output process of the Softmax layer can be expressed as w_i x_i = ||w_i|| · ||x_i|| · cos θ_i = cos θ_i (6.2).
After normalization, a hyper-parameter s is used to scale the cosine values, and the Softmax output under the L2 constraint after feature normalization is calculated as P_i = exp(s · cos θ_{y_i}) / Σ_j exp(s · cos θ_j).
Here x_i and y_i denote the i-th sample and its label. After normalization the features are distributed by angle on a hypersphere; the parameters w_j of the last fully-connected layer are the class centers generated by pre-training, and the output of the fully-connected layer is the cosine distance cos θ_j between the picture feature and each class center. The cosine similarity of each picture to its corresponding class center is recorded as H = {h_1, h_2, ..., h_N}, where h_i = cos θ_{y_i}.
h_i is the cosine distance between the i-th sample and its class center y_i. H is sorted, and in each batch of training the instances with the largest cosine similarity are sent into the peer network for the next round of training; the selection formula (6.5) keeps the (1 - τ)·|D| samples with the largest cosine similarity, where τ is the correctable drop rate, D is the sample set and Dr denotes the reusable samples. The selected pictures are fed into the peer network to update it.
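As a concrete illustration of this selection step, the following is a minimal PyTorch-style sketch of picking clean samples in a batch from L2-normalized features and the pre-trained class centers; the function name, tensor shapes and the way the drop rate is applied are illustrative assumptions, not the reference implementation of the patent.

```python
import torch
import torch.nn.functional as F

def select_clean_samples(features, weights, labels, drop_rate):
    """Select the (1 - drop_rate) fraction of a batch with the largest
    cosine similarity to their own class centers (illustrative sketch).

    features : (B, d) raw picture features from the backbone
    weights  : (C, d) last fully-connected layer, one row per class center
    labels   : (B,)   noisy labels y_i
    drop_rate: correctable drop rate tau
    """
    feats = F.normalize(features, dim=1)            # L2-normalize features
    centers = F.normalize(weights, dim=1)           # L2-normalize class centers
    cos_all = feats @ centers.t()                   # (B, C) cosine distances cos(theta_j)
    h = cos_all[torch.arange(len(labels)), labels]  # h_i: similarity to own center y_i

    num_keep = int((1.0 - drop_rate) * len(labels))
    keep_idx = torch.argsort(h, descending=True)[:num_keep]  # largest-similarity samples
    return keep_idx, cos_all

# Example usage (shapes assumed): a batch of 32 images, 200 fine-grained classes
# keep_idx, logits = select_clean_samples(feat, fc_weight, noisy_labels, drop_rate=0.3)
```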
In this embodiment, after the clean sample set Dc is selected, the remaining samples can be divided into two types. For the first type, the true label is still within the label set of the data set; through training, the network predicts the correct label of such a sample, and by correcting its label the network can still continue to learn from this set of samples, which is called the reusable sample set Dr. For the second type, the true label is not in the label set of the data set; this set is called the noise set Dn and needs to be discarded in the subsequent training.
When a reusable sample is fed into the network, the network will tend to give a definite prediction after training (one that is not consistent with the label given by the data set), whereas when a noise sample is fed into the network, the network will give an uncertain prediction; entropy is therefore used to measure the uncertainty of each sample, and the reusable samples are selected accordingly.
When the prediction uncertainty f(x_i) of a sample x_i satisfies the condition f(x_i) ≤ midf (6.6), the sample belongs to the reusable sample set Dr, where f(x_i) is the uncertainty of sample x_i and midf is the median of the uncertainties of the remaining samples (D_r ∪ D_n); cross entropy is used to measure the uncertainty of each sample, f(x_i) = -log p_i.
For each sample x_i, the predictions of the last n rounds are recorded as Pre_i = {prec_1, prec_2, ..., prec_n} (6.8); according to Pre_i, the category j predicted the largest number of times and its count m are recorded, and p_i = m/n (6.9) is the probability of sample x_i being predicted as j.
The prediction uncertainty is smallest when the n predictions are all the same, in which case p_i = 1 and f(x_i) = 0; it is largest when the n predictions are all different, in which case p_i = 1/n and f(x_i) = -log(1/n). n is set to 10.
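The uncertainty bookkeeping described above can be sketched as follows; storing the last n = 10 predictions per sample in a dictionary and using f(x_i) = -log p_i as the cross-entropy-style uncertainty are assumptions consistent with the endpoints given in the text (0 when all n predictions agree, -log(1/n) when they all differ), not necessarily the exact formulation of the patent.

```python
import math
from collections import Counter, defaultdict, deque

N_HISTORY = 10  # n: number of recent predictions kept per sample

# history[i] holds the last n predicted class indices for sample i (Pre_i)
history = defaultdict(lambda: deque(maxlen=N_HISTORY))

def record_prediction(sample_id, predicted_class):
    history[sample_id].append(predicted_class)

def uncertainty(sample_id):
    """f(x_i) = -log(p_i), with p_i = m / n the frequency of the
    most-often-predicted class j over the last n predictions."""
    preds = history[sample_id]
    j, m = Counter(preds).most_common(1)[0]
    p_i = m / len(preds)
    return -math.log(p_i), j

def select_reusable(remaining_ids):
    """Samples whose uncertainty is at or below the median of the
    remaining (non-clean) samples are treated as reusable (Dr)."""
    scored = [(sid, *uncertainty(sid)) for sid in remaining_ids]
    values = sorted(f for _, f, _ in scored)
    median = values[len(values) // 2]
    reusable = {sid: j for sid, f, j in scored if f <= median}  # corrected label = j
    return reusable
```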
In this embodiment, a BCNN network is used for training. During training, the clean samples Dc are selected first. Label smoothing helps the network learn from noisy data, so during the first n training rounds the output of the Softmax layer is smoothed and back-propagation is performed with a label-smoothed cross-entropy loss (6.11) in which the one-hot target is mixed with a uniform distribution using the label smoothing factor α.
After n rounds of training, the reusable samples Dr are selected using equation (6.6); their labels are corrected to the category j that was predicted the largest number of times over the n consecutive predictions, and the corrected labels are used together with the clean samples to update the network with the cross-entropy loss (6.14).
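The two loss terms can be sketched as below. Since the exact formulas (6.11) and (6.14) are not reproduced in this text, a standard label-smoothing cross-entropy and a plain cross-entropy against the corrected label j are used here as assumed stand-ins.

```python
import torch
import torch.nn.functional as F

def smoothed_ce_loss(logits, labels, alpha, num_classes):
    """Warm-up loss on clean samples: cross entropy against a label-smoothed
    target (assumed form of loss (6.11); alpha is the smoothing factor)."""
    log_probs = F.log_softmax(logits, dim=1)
    smooth = torch.full_like(log_probs, alpha / (num_classes - 1))
    smooth.scatter_(1, labels.unsqueeze(1), 1.0 - alpha)
    return -(smooth * log_probs).sum(dim=1).mean()

def corrected_ce_loss(logits, corrected_labels):
    """Loss on reusable samples Dr: standard cross entropy, but computed
    against the corrected label j (the class predicted most often over
    the last n rounds) instead of the noisy dataset label."""
    return F.cross_entropy(logits, corrected_labels)
```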
The algorithm flow of the invention is as follows (a Python sketch of this loop is given after the listing):
Input:
training set D
mini-batch training set Dm
clean sample set Dc
reusable sample set Dr
total number of training epochs Tmax
number of pre-training epochs Tk
number of iterations Nmax
Output: the updated network h
Randomly initialize the network parameters; set Dc = D, Dr = D
for t = 1, 2, ..., Tmax:
    if t ≤ Tk:
        select the clean samples Dc according to equation (6.5)
        update the network h according to equation (6.11)
    else:
        select the clean samples Dc according to formula (6.5)
        select the reusable samples Dr according to formula (6.6)
        update the network h according to equation (6.14)
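Putting the pieces together, the listing above can be read as the following high-level loop. This is a schematic sketch: the helper names (`select_clean_samples`, `record_prediction`, `select_reusable`, `smoothed_ce_loss`, `corrected_ce_loss`) refer to the illustrative functions sketched earlier, and the assumption that the model returns both features and logits is ours, not the patent's.

```python
import torch
import torch.nn.functional as F

def train_csslc(model, loader, optimizer, t_max, t_k, drop_rate, alpha, num_classes):
    """Schematic CSSLC loop: label-smoothed warm-up on clean samples,
    then clean samples plus corrected reusable samples (the peer-network
    exchange of selected samples is omitted for brevity)."""
    for epoch in range(1, t_max + 1):
        for sample_ids, images, labels in loader:
            feats, logits = model(images)                     # backbone features + fc outputs
            keep_idx, _ = select_clean_samples(
                feats, model.fc_weight, labels, drop_rate)    # clean set Dc, eq. (6.5)

            if epoch <= t_k:
                # Pre-training phase: label-smoothed loss on clean samples, eq. (6.11)
                loss = smoothed_ce_loss(logits[keep_idx], labels[keep_idx],
                                        alpha, num_classes)
            else:
                # Record current predictions and pick reusable samples, eq. (6.6)
                for sid, p in zip(sample_ids.tolist(), logits.argmax(dim=1).tolist()):
                    record_prediction(sid, p)
                kept = set(keep_idx.tolist())
                remaining = [s for k, s in enumerate(sample_ids.tolist()) if k not in kept]
                reusable = select_reusable(remaining)         # sample id -> corrected label j

                loss = F.cross_entropy(logits[keep_idx], labels[keep_idx])
                idx = [k for k, s in enumerate(sample_ids.tolist()) if s in reusable]
                if idx:
                    corrected = torch.tensor(
                        [reusable[sample_ids.tolist()[k]] for k in idx],
                        device=logits.device)
                    loss = loss + corrected_ce_loss(logits[idx], corrected)  # eq. (6.14)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```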
In general, the invention combines sample selection with loss correction and provides a new way of selecting reusable samples, thereby improving the sample utilization rate and the fine-grained classification performance.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalents may be substituted for some of their elements.
Claims (5)
1. A network image label denoising method combining sample selection and label correction is characterized by comprising the following steps:
s1, firstly, selecting a clean sample according to cosine similarity between the sample and a category center;
s2, dynamically selecting reusable samples from the remaining samples according to sample uncertainty and correcting them;
s3, finally, updating the network by using the clean sample and the corrected reusable sample;
in S1, the features of the picture are normalized in the Softmax layer, and the output process of the Softmax layer may be represented as:
w_i x_i = ||w_i|| · ||x_i|| · cos θ_i = cos θ_i (6.2)
after normalization, a hyper-parameter s is used to scale the cosine values, and the Softmax output under the L2 constraint after feature normalization is calculated as follows:
P_i = exp(s · cos θ_{y_i}) / Σ_j exp(s · cos θ_j)
where x_i and y_i represent the i-th sample and its label;
after normalization, the features are distributed by angle on a hypersphere, and the parameters w_j of the last fully-connected layer are the class centers generated by pre-training; the output of the fully-connected layer is the cosine distance cos θ_j between the picture feature and each class center; the cosine similarity of each picture with its corresponding class center is recorded as H = {h_1, h_2, ..., h_N}, where h_i = cos θ_{y_i};
h_i is the cosine distance between the i-th sample and its class center y_i; H is sorted and in each batch of training the instances with the largest cosine similarity are sent into the peer network for the next round of training; according to the selection formula, the (1 - τ)·|D| samples with the largest cosine similarity are kept,
where τ is the correctable drop rate, D is the sample set, and Dr is the reusable sample set.
2. The method for denoising network image labels combining sample selection and label correction as claimed in claim 1, wherein the clean sample set Dc is selected in S1, and the remaining samples can be divided into two types, namely the reusable sample set Dr and the noise set Dn; the noise set Dn needs to be discarded in the subsequent training;
when the prediction uncertainty f(x_i) of a sample x_i satisfies the following condition, the sample belongs to the reusable sample set Dr:
f(x_i) ≤ midf (6.6)
where f(x_i) is the uncertainty of sample x_i, and midf is the median of the uncertainties of the samples in (D_r ∪ D_n); cross entropy is used to measure the uncertainty of each sample, f(x_i) = -log p_i.
3. The method for denoising network image labels combining sample selection and label correction as claimed in claim 2, characterized in that for each sample x_i the predictions of the last 10 rounds are recorded as Pre_i, and the predictions are updated as training progresses:
Pre_i = {prec_1, prec_2, ..., prec_n} (6.8)
according to Pre_i, the category j predicted the largest number of times for sample x_i and its count m are recorded; p_i is the probability of sample x_i being predicted as j:
p_i = m/n (6.9)
The uncertainty is minimal when all n predictions are the same, in which case p_i = 1 and f(x_i) = 0; the uncertainty is greatest when the n predictions are all different, in which case p_i = 1/n and f(x_i) = -log(1/n); n is taken as 10.
4. The method for denoising network image labels combining sample selection and label correction as claimed in claim 3, wherein in S3, the output of the Softmax layer is smoothed during the first n training rounds, and back-propagation is performed with a label-smoothed cross-entropy loss in which the one-hot target is mixed with a uniform distribution,
where α is the label smoothing factor for the data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211408454.3A CN115564960B (en) | 2022-11-10 | 2022-11-10 | Network image label denoising method combining sample selection and label correction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115564960A CN115564960A (en) | 2023-01-03 |
CN115564960B (en) | 2023-03-03
Family
ID=84769821
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211408454.3A (CN115564960B, Active) | Network image label denoising method combining sample selection and label correction | 2022-11-10 | 2022-11-10
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115564960B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111414942A (en) * | 2020-03-06 | 2020-07-14 | 重庆邮电大学 | Remote sensing image classification method based on active learning and convolutional neural network |
CN113657449A (en) * | 2021-07-15 | 2021-11-16 | 北京工业大学 | Traditional Chinese medicine tongue picture greasy classification method containing noise labeling data |
CN113657561A (en) * | 2021-10-20 | 2021-11-16 | 之江实验室 | Semi-supervised night image classification method based on multi-task decoupling learning |
CN114169442A (en) * | 2021-12-08 | 2022-03-11 | 中国电子科技集团公司第五十四研究所 | Remote sensing image small sample scene classification method based on double prototype network |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114492574A (en) * | 2021-12-22 | 2022-05-13 | 中国矿业大学 | Pseudo label loss unsupervised countermeasure domain adaptive picture classification method based on Gaussian uniform mixing model |
CN114897049A (en) * | 2022-04-06 | 2022-08-12 | 济南融瓴科技发展有限公司 | Label noise monitoring method based on meta-learning |
CN115170813A (en) * | 2022-06-30 | 2022-10-11 | 南京理工大学 | Network supervision fine-grained image identification method based on partial label learning |
Also Published As
Publication number | Publication date |
---|---|
CN115564960A (en) | 2023-01-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||