CN116012569A - Multi-label image recognition method under noisy data based on deep learning

Info

Publication number: CN116012569A (application number CN202310299402.5A)
Authority: CN (China)
Prior art keywords: label, picture, feature, model, correction
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN116012569B (granted publication)
Inventors: 陈添水, 徐志华, 黄衍聪, 柯梓铭, 付晨博, 范耀洲, 杨志景
Original and current assignee: Guangdong University of Technology (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Guangdong University of Technology; priority to CN202310299402.5A; application granted and published as CN116012569B


Abstract

The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a dual-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model; and acquiring a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model, and carrying out image recognition on it according to the correction label. The method can carry out label correction on a multi-label noisy data set, saves manpower and material costs, and realizes efficient utilization of data resources; meanwhile, the prediction results are more robust; in addition, the invention sets upper and lower bounds on the predicted values of training pictures, so that noise can be weakened and overfitting to the noise can be avoided.

Description

Multi-label image recognition method under noisy data based on deep learning
Technical Field
The invention relates to the technical field of computer vision and multi-label image classification, in particular to a multi-label image recognition method under noisy data based on deep learning.
Background
With the continuous development of internet technology, artificial intelligence has matured, and deep learning has become one of its most active branches. Deep learning is popular because of its excellent performance, abundant frameworks, convenient interfaces and low barrier to entry. However, conventional deep learning algorithms require a large number of manually labeled samples as data sets; these data sets are typically large, often reaching tens or even hundreds of thousands of samples, and the label of each sample must be accurate. Thus, creating a quality data set suitable for training requires significant human and capital costs, which is a major impediment to the further development of deep learning. On the other hand, there is a large amount of data on the internet containing label noise, that is, the labels of part of the data are erroneous; such data can easily be obtained with a crawler. Traditional deep learning algorithms can only train on data whose labels are clean and correct; they cannot use multi-label noisy data, which leads to a waste of data resources.
Taking the recognition of orange pictures as an example, analysis shows that many pictures labeled "orange" on the web are mislabeled. For example, pictures of lemons, which are similar in shape and appearance to oranges, are labeled "orange"; this kind of mislabeling is called the first type of mislabeling. Alternatively, an object far removed from an orange, such as an orange-colored sunset, is labeled "orange"; this kind of mislabeling is called the second type of mislabeling. If data with such erroneous labels are used directly to train a traditional deep learning network, the network learns a lot of erroneous data, so the generalization of the model is poor and the model is difficult to deploy in practice. Faced with this, there are two approaches to improvement: first, relabel the pictures manually, which consumes great manpower and material resources; second, directly discard this part of the data set, which wastes data resources.
Therefore, how to conveniently train neural networks with noisy data sets is one of the problems that must be solved for the further development of deep learning, and it is also a trend of development in the big data age.
The prior art discloses a weakly supervised multi-label image classification method based on meta learning, which provides an image multi-label classification model based on label information enhancement: a neural network with an encoder-decoder architecture sequentially judges, in a sequence labeling manner, whether the labels in a label sequence are relevant, so as to obtain the relevant labels of the image. Aiming at the model overfitting caused by insufficient supervision information in a weakly supervised environment, a teacher-student network training method based on meta learning is also provided to further improve the accuracy of image annotation. However, this prior-art method only addresses the problem that missing labels prevent effective modeling; it cannot effectively correct images with missing or erroneous labels, and its labeling accuracy on data sets containing a large amount of noisy and erroneous labels is low.
Disclosure of Invention
The invention provides a multi-label image recognition method based on deep learning and under noisy data, which aims to overcome the defect that the correction effect of a data set containing multiple noisy labels in the prior art is poor, and can correct the labels of the multi-label noisy data set, save the cost of manpower and material resources and realize the efficient utilization of data resources.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a multi-label image recognition method based on deep learning under noisy data comprises the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
s4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
Preferably, in the step S1, the specific method for acquiring and preprocessing the multi-label noisy data set is as follows:

acquiring a multi-label noisy data set according to the K preset multi-label classification categories;

dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo label ỹ_i, and the training set is denoted X; dividing the training set into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, wherein D_1 ∪ D_2 = X, D_1 = {(x_i, ỹ_i)}, D_2 = {(x_i, ỹ_i)}, and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo label ỹ_i;

determining the length and width data of the pictures in each sub-training set and the pseudo labels ỹ_i, wherein the length of a picture is denoted H and the width of a picture is denoted W; this completes the preprocessing of the multi-label noisy data set.
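As a concrete illustration of the preprocessing in step S1, the split of the pseudo-labelled training set X into two equal-sized sub-training sets can be sketched as follows (a minimal sketch; the function name `split_training_set` and the list-of-dicts sample format are illustrative assumptions, not part of the patent):

```python
import random

def split_training_set(samples, seed=0):
    # Split the training set X into two sub-training sets D1 and D2
    # with the same number of pictures, as described in step S1.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Toy samples: each pairs a picture identifier with a binary pseudo
# label over K preset categories (1 = picture assumed in category k).
K = 4
X = [{"picture": f"img_{i}.jpg",
      "pseudo_label": [int(i % K == k) for k in range(K)]}
     for i in range(10)]
D1, D2 = split_training_set(X)
```

In a real pipeline the shuffle seed would vary per run; the deterministic seed here only makes the sketch reproducible.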
Preferably, the value of the pseudo label ỹ_i of each picture in each sub-training set is determined by the following specific method:

judging whether a picture in each sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo label of the i-th picture with respect to the multi-label classification category k is ỹ_i^k = 1; otherwise ỹ_i^k = 0.
Preferably, the dual-branch multi-label correction neural network model in step S2 is specifically:

the dual-branch multi-label correction neural network model comprises a first label correction sub-model M_1 and a second label correction sub-model M_2 arranged in parallel; the first label correction sub-model M_1 and the second label correction sub-model M_2 have the same structure but different model parameters;

the first label correction sub-model M_1 and the second label correction sub-model M_2 each comprise a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence.
Preferably, in step S3, the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model; the specific method comprises the following steps:

S3.1: inputting a picture x_i^1 in the first sub-training set D_1 and a picture x_i^2 in the second sub-training set D_2 jointly into the dual-branch multi-label correction neural network model, wherein x_i^1 ∈ D_1 and x_i^2 ∈ D_2;

S3.2: extracting features from the input pictures x_i^1 and x_i^2 with the feature extractors of the first label correction sub-model M_1 and the second label correction sub-model M_2 respectively, so as to obtain the first feature f_i^11 and the second feature f_i^12 (extracted by M_1 from x_i^1 and x_i^2) and the third feature f_i^21 and the fourth feature f_i^22 (extracted by M_2 from x_i^1 and x_i^2);
S3.3: inputting the first feature f_i^11 and the second feature f_i^12 jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature f_i^21 and the fourth feature f_i^22 jointly into the instance contrastive learning module of the second label correction sub-model M_2; performing the first contrastive learning on the first feature f_i^11 and the third feature f_i^21 of picture x_i^1, and performing the first contrastive learning on the second feature f_i^12 and the fourth feature f_i^22 of picture x_i^2; setting a first loss function L_1 with which the instance contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 perform parameter updating;
S3.4: inputting the first feature f_i^11 into the category prototype contrastive learning module of the first label correction sub-model M_1 and performing the second contrastive learning with the preset first category prototype features c^1; inputting the fourth feature f_i^22 into the category prototype contrastive learning module of the second label correction sub-model M_2 and performing the second contrastive learning with the preset second category prototype features c^2; setting a second loss function L_2 with which the category prototype contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 perform parameter updating;
S3.5: inputting the first feature f_i^11 into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1; inputting the fourth feature f_i^22 into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1 and performing label correction on the pseudo label ỹ_i^1 of picture x_i^1 to obtain the correction label ŷ_i^1 of picture x_i^1; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2 and performing label correction on the pseudo label ỹ_i^2 of picture x_i^2 to obtain the correction label ŷ_i^2 of picture x_i^2; setting a third loss function L_3 and respectively calculating the cross-entropy losses of the label correction modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 for parameter updating;
S3.7: setting the total loss function L according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and updating the parameters of the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model.
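The flow of steps S3.1–S3.7 can be summarized as a single training-step function (a schematic sketch; every function name and the loss stubs are illustrative placeholders, and the balance factors lambda1 = lambda2 = 1.0 are arbitrary here):

```python
def training_step(x1, x2, model, losses, lambda1=1.0, lambda2=1.0):
    # S3.2: extract features of both pictures with both branches
    f11, f12 = model["M1_extract"](x1), model["M1_extract"](x2)
    f21, f22 = model["M2_extract"](x1), model["M2_extract"](x2)
    # S3.3: instance contrastive loss between the two branches'
    # features of the same picture (first loss L1)
    l1 = losses["instance"](f11, f21) + losses["instance"](f12, f22)
    # S3.4: category prototype contrastive loss (second loss L2)
    l2 = losses["prototype"](f11) + losses["prototype"](f22)
    # S3.5 + S3.6: classification and label correction give the
    # cross-entropy loss (third loss L3)
    l3 = losses["bce"](f11) + losses["bce"](f22)
    # S3.7: total loss combining the three terms
    return l3 + lambda1 * l1 + lambda2 * l2

# Stub model and losses, just to exercise the control flow.
model = {"M1_extract": lambda x: x, "M2_extract": lambda x: x}
losses = {"instance": lambda a, b: 0.1,
          "prototype": lambda f: 0.2,
          "bce": lambda f: 0.3}
total = training_step([1.0], [2.0], model, losses)
```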
Preferably, the specific method of step S3.3 is as follows:

inputting the first feature f_i^11 and the second feature f_i^12 jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature f_i^21 and the fourth feature f_i^22 jointly into the instance contrastive learning module of the second label correction sub-model M_2;

for picture x_i^1, calculating the corresponding first feature vectors z_{i,j}^1 and second feature vectors z_{i,j}^2 according to the first feature f_i^11 and the third feature f_i^21, wherein C_1 is the number of pseudo labels of picture x_i^1 and z_{i,j} denotes the j-th of the C_1 category-specific feature vectors of picture x_i^1; the obtained first and second feature vectors are dimension-reduced and normalized;

constructing the first positive sample pairs (z_{i,j}^1, z_{i,j}^2) from the first and second feature vectors, constructing the first cyclic sequence R_1 over the pictures of the batch, and constructing the first negative sample pairs (z_{i,j}^1, z_{r,j}^2), r ∈ R_1, r ≠ i, according to the first cyclic sequence R_1; performing the first contrastive learning with the constructed first positive sample pairs and first negative sample pairs;

setting the first loss function L_1^(1) with which the instance contrastive learning module of the first label correction sub-model M_1 performs parameter updating, specifically:

L_1^(1) = -(1/C_1) Σ_{j=1..C_1} log [ exp(z_{i,j}^1 · z_{i,j}^2 / τ) / ( exp(z_{i,j}^1 · z_{i,j}^2 / τ) + Σ_{r ∈ R_1, r ≠ i} exp(z_{i,j}^1 · z_{r,j}^2 / τ) ) ]

wherein L_1^(1) is the first loss function value for picture x_i^1 in the instance contrastive learning module of M_1, K is the total number of categories required for multi-label classification, C_1 is the number of categories corresponding to picture x_i^1, τ is the temperature coefficient, and z_{i,j}^1 is the j-th dimension-reduced feature vector of picture x_i^1;

for picture x_i^2, calculating the corresponding third feature vectors z_{i,j}^3 and fourth feature vectors z_{i,j}^4 according to the second feature f_i^12 and the fourth feature f_i^22, wherein C_2 is the number of pseudo labels of picture x_i^2 and z_{i,j} denotes the j-th of the C_2 category-specific feature vectors of picture x_i^2; the obtained third and fourth feature vectors are dimension-reduced and normalized;

constructing the second positive sample pairs (z_{i,j}^3, z_{i,j}^4) from the third and fourth feature vectors, constructing the second cyclic sequence R_2 over the pictures of the batch, and constructing the second negative sample pairs (z_{i,j}^3, z_{r,j}^4), r ∈ R_2, r ≠ i, according to the second cyclic sequence R_2; performing the first contrastive learning with the constructed second positive sample pairs and second negative sample pairs;

setting the first loss function L_1^(2) with which the instance contrastive learning module of the second label correction sub-model M_2 performs parameter updating, specifically:

L_1^(2) = -(1/C_2) Σ_{j=1..C_2} log [ exp(z_{i,j}^3 · z_{i,j}^4 / τ) / ( exp(z_{i,j}^3 · z_{i,j}^4 / τ) + Σ_{r ∈ R_2, r ≠ i} exp(z_{i,j}^3 · z_{r,j}^4 / τ) ) ]

wherein L_1^(2) is the first loss function value for picture x_i^2 in the instance contrastive learning module of M_2, K is the total number of categories required for multi-label classification, C_2 is the number of categories corresponding to picture x_i^2, τ is the temperature coefficient, and z_{i,j}^3 is the j-th dimension-reduced feature vector of picture x_i^2.
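The instance contrastive loss described above follows the usual InfoNCE pattern: a positive pair is pulled together while negative pairs are pushed apart under a temperature coefficient τ. A minimal sketch (the exact form in the patent is only given as an image, so details such as whether the positive term appears in the denominator are assumptions):

```python
import math

def info_nce(anchor, positive, negatives, tau=0.07):
    # Contrastive loss for one anchor feature vector: maximize the
    # similarity to the positive partner relative to all negatives.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    pos = math.exp(dot(anchor, positive) / tau)
    neg = sum(math.exp(dot(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))
```

The loss is small when the anchor matches its positive pair and large when it instead matches a negative, which is the behavior the first contrastive learning relies on.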
Preferably, the specific method of step S3.4 is as follows:

inputting the first feature f_i^11 into the category prototype contrastive learning module of the first label correction sub-model M_1, performing the second contrastive learning between the first feature vector z_i^1 of picture x_i^1 and the preset first category prototype features c_k^1, and updating the first category prototype features with the momentum method:

c_k^1 ← m · c_k^1 + (1 − m) · z_i^1

wherein the left-hand c_k^1 is the updated first category prototype feature corresponding to the k-th category, the right-hand c_k^1 is the first category prototype feature corresponding to the k-th category before updating, and m is the preset momentum;

setting the second loss function L_2^(1) with which the category prototype contrastive learning module of the first label correction sub-model M_1 performs parameter updating, specifically:

L_2^(1) = -(1/C_1) Σ_{k: ỹ_i^k = 1} log [ exp(z_i^1 · c_k^1 / τ) / Σ_{k'=1..K} exp(z_i^1 · c_{k'}^1 / τ) ]

wherein L_2^(1) is the second loss function value for picture x_i^1 in the category prototype contrastive learning module of M_1;

inputting the fourth feature f_i^22 into the category prototype contrastive learning module of the second label correction sub-model M_2, performing the second contrastive learning between the feature vector z_i^4 of picture x_i^2 and the preset second category prototype features c_k^2, and updating the second category prototype features with the momentum method:

c_k^2 ← m · c_k^2 + (1 − m) · z_i^4

wherein the left-hand c_k^2 is the updated second category prototype feature corresponding to the k-th category and the right-hand c_k^2 is the second category prototype feature corresponding to the k-th category before updating;

setting the second loss function L_2^(2) with which the category prototype contrastive learning module of the second label correction sub-model M_2 performs parameter updating, specifically:

L_2^(2) = -(1/C_2) Σ_{k: ỹ_i^k = 1} log [ exp(z_i^4 · c_k^2 / τ) / Σ_{k'=1..K} exp(z_i^4 · c_{k'}^2 / τ) ]

wherein L_2^(2) is the second loss function value for picture x_i^2 in the category prototype contrastive learning module of M_2.
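The momentum update of a category prototype feature is an exponential moving average; a sketch (the function name is illustrative):

```python
def momentum_update(prototype, feature, m=0.9):
    # c_k <- m * c_k + (1 - m) * z: move the prototype of category k
    # slowly toward the current feature vector z, so that it tracks a
    # smoothed average of the features seen for that category.
    return [m * c + (1.0 - m) * z for c, z in zip(prototype, feature)]

c_k = [0.0, 0.0]
c_k = momentum_update(c_k, [1.0, 1.0], m=0.9)
```

A momentum m close to 1 makes the prototypes change slowly, which keeps the contrast targets stable across training steps.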
Preferably, the specific method of step S3.5 is as follows:

inputting the first feature f_i^11 into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1, specifically:

p_i^1 = σ(g(f_i^11))

wherein p_i^1 is the classification probability of picture x_i^1, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;

inputting the fourth feature f_i^22 into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2, specifically:

p_i^2 = σ(g(f_i^22))

wherein p_i^2 is the classification probability of picture x_i^2, σ is the sigmoid function, and g is the confidence score calculation function of the classifier.
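The classification probability p = σ(g(f)) applies a sigmoid to the classifier's per-category confidence scores, so each of the K categories receives an independent probability, as is standard for multi-label classification. A sketch with raw scores standing in for g(f):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classification_probabilities(scores):
    # Per-category probabilities p = sigmoid(g(f)); the scores play
    # the role of the classifier's confidence values g(f).
    return [sigmoid(s) for s in scores]

p = classification_probabilities([0.0, 4.0, -4.0])
```

Unlike softmax, the sigmoid does not force the probabilities to sum to 1, so several categories can simultaneously be predicted positive.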
Preferably, the specific method of step S3.6 is as follows:

inputting the classification probability p_i^1 of picture x_i^1 into the label correction module of the first label correction sub-model M_1, setting a first threshold T_1, a second threshold T_2, a third threshold T_3 and a fourth threshold T_4, and dynamically updating the four thresholds with the preset momentum m;

determining the value of the binary noise label n_i^1 according to the updated third threshold T_3, the fourth threshold T_4 and the classification probability p_i^1 of picture x_i^1;

obtaining the intermediate label ȳ_i^1 of picture x_i^1 according to the updated first threshold T_1 and second threshold T_2;

when the noise label n_i^1 = 1, replacing the pseudo label ỹ_i^1 of picture x_i^1 with the intermediate label ȳ_i^1 as the correction label ŷ_i^1 of picture x_i^1; when the noise label n_i^1 = 0, retaining the pseudo label ỹ_i^1 of picture x_i^1 as the correction label ŷ_i^1;

inputting the classification probability p_i^2 of picture x_i^2 into the label correction module of the second label correction sub-model M_2;

determining the value of the binary noise label n_i^2 according to the updated third threshold T_3, the fourth threshold T_4 and the classification probability p_i^2 of picture x_i^2;

obtaining the intermediate label ȳ_i^2 of picture x_i^2 according to the updated first threshold T_1 and second threshold T_2;

when the noise label n_i^2 = 1, replacing the pseudo label ỹ_i^2 of picture x_i^2 with the intermediate label ȳ_i^2 as the correction label ŷ_i^2 of picture x_i^2; when the noise label n_i^2 = 0, retaining the pseudo label ỹ_i^2 as the correction label ŷ_i^2;

the third loss function L_3 is:

L_3 = (1/N) Σ_{i=1..N} ℓ_BCE^(i)

wherein ℓ_BCE^(i) is the binary cross-entropy loss of the i-th picture.
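The threshold-based correction rule can be illustrated per category with a simplified two-threshold version (the patent uses four dynamically updated thresholds plus a binary noise flag; collapsing them into one upper and one lower bound is a deliberate simplification for illustration):

```python
def correct_label(prob, pseudo, t_low, t_high):
    # For one category: if the predicted probability exceeds the upper
    # threshold, the corrected label is forced to 1; if it falls below
    # the lower threshold, it is forced to 0; otherwise the original
    # pseudo label is kept (treated as clean).
    if prob >= t_high:
        return 1
    if prob <= t_low:
        return 0
    return pseudo
```

Bounding the correction from above and below in this way is what weakens the label noise while preventing the network from overfitting to it: confident predictions override the pseudo label, uncertain ones leave it untouched.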
Preferably, the total loss function L in step S3.7 is:

L = L_3 + λ_1 · L_1 + λ_2 · L_2

wherein L is the total loss function value, λ_1 is the balance factor of the first loss function L_1, and λ_2 is the balance factor of the second loss function L_2.
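The total loss is a plain weighted sum of the three terms; a one-line sketch (the default balance-factor values are placeholders, not from the patent):

```python
def total_loss(l1, l2, l3, lambda1=0.5, lambda2=0.5):
    # L = L3 + lambda1 * L1 + lambda2 * L2, where lambda1 and lambda2
    # are the balance factors of the two contrastive losses.
    return l3 + lambda1 * l1 + lambda2 * l2
```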
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:

The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a dual-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model; and acquiring a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model to obtain its correction label, and carrying out image recognition on it according to the correction label.

With the present invention, related pictures can be collected from the internet as data sets according to the user's specific application; the dual-branch network is trained and a model supporting multi-label picture classification is constructed, so that label correction and image recognition can be carried out on multi-label noisy data sets, manpower and material costs are saved, and efficient utilization of data resources is realized. The invention also provides a contrastive learning method with which the branch networks, while remaining different, can learn some common representations from each other; when classifying pictures, the predictions of the two models are averaged, making the result more robust. In addition, the invention sets upper and lower bounds on the predicted values of training pictures and changes the labels of pictures whose predicted values exceed the upper bound or fall below the lower bound, thereby weakening noise and avoiding overfitting to the noise.
Drawings
Fig. 1 is a flowchart of a multi-label image recognition method under noisy data based on deep learning according to embodiment 1.
Fig. 2 is a contrastive learning training flowchart of the dual-branch multi-label correction neural network model provided in embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, the embodiment provides a multi-label image recognition method under noisy data based on deep learning, which includes the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, firstly, a multi-label noisy data set is obtained and preprocessed; in this embodiment, the multi-label noisy data set is obtained from the internet. A double-branch multi-label correction neural network model is then established, and the preprocessed multi-label noisy data set is input into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model. Finally, a noisy picture to be corrected is obtained and corrected with the optimized double-branch multi-label correction neural network model to obtain its correction label, and image recognition is carried out on it according to the correction label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to specific application of a user, the dual-branch network is trained, a model supporting classification of the multi-label pictures is constructed, label correction can be carried out on the multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is achieved.
Example 2
The embodiment provides a multi-label image recognition method based on deep learning under noisy data, which comprises the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
s4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, firstly acquiring and preprocessing a multi-label noisy data set, acquiring the multi-label noisy data set according to preset K multi-label classification categories K, dividing the acquired multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, and each picture is marked with a pseudo label
Figure SMS_228
The training set is marked as X, and the specific method is as follows:
microsoft COCO and Pascal VOC are the two most widely used datasets in evaluating the MLR algorithm, where the Microsoft COCO dataset contains 80 categories and the Pascal VOC dataset contains 20 categories, in this embodiment, the 80 categories contained in the Microsoft COCO dataset are selected to construct the Web-COCO and Web-Pascal datasets, with one or more categories selected randomly as keywords, such as: "person" or "person, truck, bus";
Searching corresponding pictures from a search engine, wherein the pictures comprise google, hundred degrees and necessary pictures, and taking more than 500000 obtained noisy pictures as a multi-label noisy data set;
incomplete and duplicate pictures are then removed, the remaining 290000 noisy pictures are used to construct the Web-COCO data set, and the pictures containing at least one of the 20 Pascal VOC categories are further selected to construct the Web-Pascal data set;
the Web-COCO data set contains 290000 pictures, and each picture is assigned a pseudo label y_i according to its category keywords; 20000 pictures are randomly selected for manual annotation to give them a more accurate and complete description;
the Web-COCO data set has the following drawbacks. First, label noise: when data are retrieved from the web, label noise is inevitable. In the multi-label pictures of this embodiment, label noise arises when a picture contains information of many categories but the corresponding keywords do not cover these categories, which leads to wrong or missing labels. To better characterize the noisy pictures, the accuracy and recall of each class were calculated; the results show an average recall of 46.1% and an average accuracy of 64.6%, indicating severe label noise in the data set;
another drawback is semantic dispersion: a multi-label image contains multiple semantic objects spread across the image, so it is necessary to find the corresponding semantic regions to help recover missing labels, while examining the whole image also helps correct wrong positive labels;
a third drawback is category imbalance, which is common in the real world and even more severe in multi-label pictures retrieved from the web; for example, the most frequent category, "person", accounts for 15% of the pictures, while the 20 least frequent categories together account for only 5% of the total. To evaluate the WS-MLR task, Web-COCO is used as the training set and Microsoft COCO, which contains 40,504 fully manually annotated images, as the verification set;
the Web-Pascal data set comprises 236043 pictures using the 20 categories of the Pascal VOC data set; similarly to the Web-COCO data set, the Web-Pascal data set also suffers from label noise, semantic dispersion and category imbalance. Likewise, the 4952 manually annotated pictures in the Web-Pascal data set are used as the verification set and the other pictures as the training set;
the training set is then divided into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 = {(x_i^1, y_i^1)}, D2 = {(x_i^2, y_i^2)}, and (x_i, y_i) represents the i-th picture x_i and its corresponding pseudo label y_i;
the length and width data and the pseudo label y_i of each picture in each sub-training set are determined, the length of a picture being denoted H and its width W; this completes the preprocessing of the multi-label noisy data set;
the value of the pseudo label y_i of each picture in each sub-training set is determined as follows: whether the picture belongs to a preset multi-label classification category k is judged; if so, the value y_i^k of the pseudo label of the i-th picture with respect to category k is set to 1, and otherwise to 0;
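The keyword-to-pseudo-label rule above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the category list is a toy stand-in for the 80 Microsoft COCO categories, and `make_pseudo_label` is a hypothetical helper name.

```python
# Illustrative sketch of assigning a keyword-based pseudo label.
# CATEGORIES is a toy stand-in for the K classification categories.
CATEGORIES = ["person", "truck", "bus"]

def make_pseudo_label(search_keywords, categories=CATEGORIES):
    """Return a binary pseudo label: component k is 1 if the picture was
    retrieved under a keyword matching category k, otherwise 0."""
    keywords = {k.strip() for k in search_keywords.split(",")}
    return [1 if c in keywords else 0 for c in categories]
```

A picture retrieved under the keywords "person, bus" would thus receive the pseudo label [1, 0, 1] for this toy category list.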
as shown in fig. 2, a dual-branch multi-label correction neural network model is established, comprising a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 are identical in structure but have different model parameters;
the first label correction sub-model M1 or the second label correction sub-model M2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence;
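The dual-branch arrangement above can be sketched structurally. The callables here are placeholders, not real networks; the class and method names are assumptions for illustration only.

```python
# Structural sketch of the dual-branch model: two branches with identical
# structure but independent parameters, whose predictions are averaged.
class LabelCorrectionBranch:
    def __init__(self, feature_extractor, classifier):
        # in the described model the modules are connected in sequence:
        # feature extractor -> contrastive modules -> classifier -> correction
        self.feature_extractor = feature_extractor
        self.classifier = classifier

    def forward(self, picture):
        features = self.feature_extractor(picture)
        return self.classifier(features)

class DualBranchModel:
    def __init__(self, make_branch):
        self.m1 = make_branch()  # first label correction sub-model M1
        self.m2 = make_branch()  # second label correction sub-model M2

    def predict(self, picture):
        # average the two branches' predictions for a more robust result
        p1 = self.m1.forward(picture)
        p2 = self.m2.forward(picture)
        return [(a + b) / 2.0 for a, b in zip(p1, p2)]
```

Calling `make_branch` twice gives the two branches the same structure but separately initialized parameters, matching the description above.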
the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model; the specific method comprises the following steps:
S3.1: a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 are jointly input into the dual-branch multi-label correction neural network model, where the index i satisfies 1 ≤ i ≤ N/2;
S3.2: the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 are used to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: the first feature and the second feature are jointly input into the instance contrastive learning module of the first label correction sub-model M1, and the third feature and the fourth feature are jointly input into the instance contrastive learning module of the second label correction sub-model M2; first contrastive learning is performed between the first feature and the third feature of picture x_i^1, and between the second feature and the fourth feature of picture x_i^2; a first loss function is set, and the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2 are updated, specifically:
the first feature and the second feature are jointly input into the instance contrastive learning module of the first label correction sub-model M1, and the third feature and the fourth feature are jointly input into the instance contrastive learning module of the second label correction sub-model M2;
for picture x_i^1, the corresponding first feature vector and second feature vector are calculated from the first feature and the third feature, where C1 is the number of pseudo labels of picture x_i^1 and the j-th of the C1 feature vectors of picture x_i^1 is denoted v_j;
a first positive sample pair is constructed from the first feature vector and the second feature vector, and a first cyclic queue of length R1 is constructed, with R1 = 8192 in this embodiment; first negative sample pairs are constructed from the first cyclic queue, and first contrastive learning is performed using the constructed first positive sample pair and first negative sample pairs;
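The positive-pair and queue-based negative-pair bookkeeping above can be sketched as follows. This is a hedged sketch under assumptions: `PairBank` and `build_pairs` are illustrative names, and the queue length is 3 here instead of the embodiment's R1 = 8192.

```python
from collections import deque

# A positive pair couples the two feature vectors of the same picture;
# negatives pair the current vector with entries of a fixed-length cyclic
# queue of past vectors, whose oldest entries are dropped when full.
class PairBank:
    def __init__(self, maxlen=3):
        self.queue = deque(maxlen=maxlen)  # cyclic: oldest entries evicted

    def build_pairs(self, v1, v2):
        """v1, v2: the first and second feature vectors of one picture."""
        positive = (v1, v2)
        negatives = [(v1, q) for q in self.queue]
        self.queue.append(v2)  # enqueue for use as a future negative
        return positive, negatives
```

Because `deque(maxlen=...)` evicts the oldest entry on overflow, the queue behaves as the fixed-length cyclic sequence the description calls for.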
a first loss function is set and the parameters of the instance contrastive learning module of the first label correction sub-model M1 are updated; the first loss value for picture x_i^1 is computed in the instance contrastive learning module of the first label correction sub-model M1 from the total number of categories required for multi-label classification, the number of categories corresponding to picture x_i^1, a temperature coefficient, and the dimension-reduced feature vectors of picture x_i^1; in this embodiment, the dimension-reduced feature vectors have dimension 128 and the feature vectors before dimension reduction have dimension 2048;
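The first loss described above, with a positive pair, queue negatives and a temperature coefficient, suggests an InfoNCE-style contrastive loss; the exact formula appears only in the original figures, so the following is an assumption-labeled sketch, with `info_nce` a hypothetical helper name.

```python
import math

# InfoNCE-style contrastive loss sketch: -log of the softmax probability
# (with temperature tau) that the positive pair is selected over the
# negative pairs drawn from the cyclic queue.
def info_nce(pos_sim, neg_sims, tau=0.07):
    """pos_sim: similarity of the positive pair; neg_sims: similarities
    of the negative pairs."""
    numerator = math.exp(pos_sim / tau)
    denominator = numerator + sum(math.exp(s / tau) for s in neg_sims)
    return -math.log(numerator / denominator)
```

The loss decreases as the positive similarity grows relative to the negatives, which is the behavior the instance contrastive learning module relies on.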
for picture x_i^2, the corresponding third feature vector and fourth feature vector are calculated from the second feature and the fourth feature, where C2 is the number of pseudo labels of picture x_i^2 and the j-th of the C2 feature vectors of picture x_i^2 is denoted v_j;
a second positive sample pair is constructed from the third feature vector and the fourth feature vector, and a second cyclic queue of length R2 is constructed, with R2 = 8192 in this embodiment; second negative sample pairs are constructed from the second cyclic queue, and first contrastive learning is performed using the constructed second positive sample pair and second negative sample pairs;
a first loss function is set and the parameters of the instance contrastive learning module of the second label correction sub-model M2 are updated; the first loss value for picture x_i^2 is computed analogously in the instance contrastive learning module of the second label correction sub-model M2, from the total number of categories required for multi-label classification, the number of categories corresponding to picture x_i^2, and the dimension-reduced feature vectors of picture x_i^2;
S3.4: the first feature is input into the category prototype contrastive learning module of the first label correction sub-model M1 and second contrastive learning is performed with a preset first category prototype feature; the fourth feature is input into the category prototype contrastive learning module of the second label correction sub-model M2 and second contrastive learning is performed with a preset second category prototype feature; a second loss function is set, and the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2 are updated, specifically:
the first feature is input into the category prototype contrastive learning module of the first label correction sub-model M1, second contrastive learning is performed between the first feature vector of picture x_i^1 and the first category prototype feature, and the first category prototype feature is updated with the momentum method:
p_k' = m · p_k + (1 − m) · v_k,
where p_k' is the updated first category prototype feature corresponding to the k-th category, p_k is the first category prototype feature corresponding to the k-th category before the update, v_k is the current feature vector of category k, and m is a preset momentum;
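The momentum update of a category prototype can be sketched directly; `momentum_update` is an illustrative helper name, and the prototypes are plain lists here rather than tensors.

```python
# Momentum update of a category prototype feature:
# new prototype = m * old prototype + (1 - m) * current class feature.
def momentum_update(prototype, feature, m=0.9):
    return [m * p + (1.0 - m) * f for p, f in zip(prototype, feature)]
```

With a momentum close to 1 the prototype drifts slowly toward the current class feature, which keeps it stable across noisy mini-batches.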
a second loss function is set and the parameters of the category prototype contrastive learning module of the first label correction sub-model M1 are updated, where the second loss value is computed for picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M1;
the fourth feature is input into the category prototype contrastive learning module of the second label correction sub-model M2, second contrastive learning is performed between the second feature vector of picture x_i^2 and the second category prototype feature, and the second category prototype feature is updated with the momentum method in the same way, the updated second category prototype feature of each category being obtained from the second category prototype feature of that category before the update;
a second loss function is set and the parameters of the category prototype contrastive learning module of the second label correction sub-model M2 are updated, where the second loss value is computed for picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M2;
S3.5: the first feature is input into the classifier of the first label correction sub-model M1 to calculate and output the classification probability of picture x_i^1, and the fourth feature is input into the classifier of the second label correction sub-model M2 to calculate and output the classification probability of picture x_i^2, specifically:
the first feature is input into the classifier of the first label correction sub-model M1, and the classification probability of picture x_i^1 is calculated as p_i^1 = σ(s(·)), where p_i^1 is the classification probability of picture x_i^1, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier;
the fourth feature is input into the classifier of the second label correction sub-model M2, and the classification probability of picture x_i^2 is calculated as p_i^2 = σ(s(·)), where p_i^2 is the classification probability of picture x_i^2, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier;
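The classifier output described above (sigmoid over per-class confidence scores) can be sketched as follows; the score values themselves would come from the classifier head, which is not reproduced here.

```python
import math

# Per-class classification probability: the sigmoid function applied to
# each class's confidence score from the classifier.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classification_probability(scores):
    """scores: per-class confidence scores from the classifier head."""
    return [sigmoid(s) for s in scores]
```

Unlike a softmax, the sigmoid treats each class independently, which is what multi-label classification requires.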
S3.6: the classification probability of picture x_i^1 is input into the label correction module of the first label correction sub-model M1, and the pseudo label of picture x_i^1 is corrected to obtain the corrected label of picture x_i^1; the classification probability of picture x_i^2 is input into the label correction module of the second label correction sub-model M2, and the pseudo label of picture x_i^2 is corrected to obtain the corrected label of picture x_i^2; a third loss function is set, and the cross entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2 are calculated respectively to update the parameters, specifically:
the classification probability of picture x_i^1 is input into the label correction module of the first label correction sub-model M1, in which a first threshold, a second threshold, a third threshold and a fourth threshold are set; the four thresholds are dynamically updated with a preset momentum m;
according to the updated third threshold and fourth threshold and the classification probability of picture x_i^1, the value of a binary noise label is determined;
according to the updated first threshold and second threshold, an intermediate label of picture x_i^1 is obtained;
when the noise label indicates noise, the intermediate label of picture x_i^1 replaces the pseudo label of picture x_i^1 as the corrected label of picture x_i^1; when the noise label indicates no noise, the pseudo label of picture x_i^1 is retained as the corrected label of picture x_i^1; this is the specific correction process in the label correction module of the first label correction sub-model M1;
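The threshold-based correction rule can be sketched as follows. This is a hedged sketch: the threshold values are illustrative assumptions (the embodiment updates its thresholds dynamically with momentum), and `correct_labels` is a hypothetical helper name.

```python
# A class probability above the upper threshold or below the lower one is
# treated as confident enough to overwrite the pseudo label; otherwise
# the pseudo label is kept.
def correct_labels(pseudo, probs, upper=0.8, lower=0.2):
    corrected = []
    for y, p in zip(pseudo, probs):
        if p >= upper:
            corrected.append(1)  # confident positive prediction
        elif p <= lower:
            corrected.append(0)  # confident negative prediction
        else:
            corrected.append(y)  # uncertain: keep the pseudo label
    return corrected
```

Prescribing upper and lower bounds in this way is what weakens noise and prevents the model from overfitting to wrong pseudo labels.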
the classification probability of picture x_i^2 is input into the label correction module of the second label correction sub-model M2;
according to the updated third threshold and fourth threshold and the classification probability of picture x_i^2, the value of a binary noise label is determined;
according to the updated first threshold and second threshold, an intermediate label of picture x_i^2 is obtained;
when the noise label indicates noise, the intermediate label of picture x_i^2 replaces the pseudo label of picture x_i^2 as the corrected label of picture x_i^2; when the noise label indicates no noise, the pseudo label of picture x_i^2 is retained as the corrected label of picture x_i^2;
the third loss function accumulates, over the training pictures, the binary cross entropy loss of the i-th picture;
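The per-picture binary cross entropy that the third loss accumulates can be sketched as follows; `bce` and `third_loss` are illustrative names, and the clamping constant is an implementation assumption for numerical safety.

```python
import math

# Binary cross entropy between a (corrected) label and the predicted
# per-class probability, averaged over one picture's classes.
def bce(y, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def third_loss(labels, probs):
    """Mean binary cross entropy over one picture's labels."""
    return sum(bce(y, p) for y, p in zip(labels, probs)) / len(labels)
```

Training against the corrected labels rather than the raw pseudo labels is what lets this cross entropy term drive the classifier away from the noise.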
S3.7: according to the first loss function, the second loss function and the third loss function, a total loss function is set and the parameters of the dual-branch multi-label correction neural network model are updated to obtain the optimized dual-branch multi-label correction neural network model;
the total loss function combines the three losses, weighted by a balance factor of the first loss function and a balance factor of the second loss function; in this embodiment both balance factors are set to preset constants;
finally, a noisy picture to be corrected is obtained, the optimized dual-branch multi-label correction neural network model is used to correct it and obtain its corrected label, and image recognition is carried out on the noisy picture according to the corrected label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to the user's specific application, the dual-branch network is trained, and a model supporting classification of multi-label pictures is constructed; labels of the multi-label noisy data set can be corrected, saving manpower and material costs and making efficient use of data resources. The invention also provides a contrastive learning method by which the two branch networks, while remaining different, can learn common representations from each other, and the predictions of the two models are averaged when classifying pictures, making the result more robust. In addition, the invention prescribes upper and lower bounds on the predicted values of training pictures and changes the labels of pictures whose predicted values exceed or fall below the thresholds, thereby weakening noise and avoiding overfitting to it.
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
it is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is not necessary here nor is it exhaustive of all embodiments. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are desired to be protected by the following claims.

Claims (10)

1. A multi-label image recognition method based on deep learning under noisy data, characterized by comprising the following steps:
S1: acquiring a multi-label noisy data set and preprocessing it;
S2: establishing a dual-branch multi-label correction neural network model;
S3: inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model;
S4: obtaining a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model to obtain its corrected label, and carrying out image recognition on the noisy picture according to the corrected label.
2. The multi-label image recognition method based on deep learning under noisy data according to claim 1, wherein in step S1 the specific method for acquiring and preprocessing the multi-label noisy data set is as follows:
acquiring the multi-label noisy data set according to K preset multi-label classification categories k;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo label y_i, and the training set is denoted X; dividing the training set into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 = {(x_i^1, y_i^1)}, D2 = {(x_i^2, y_i^2)}, and (x_i, y_i) represents the i-th picture x_i and its corresponding pseudo label y_i;
determining the length and width data and the pseudo label y_i of each picture in each sub-training set, the length of a picture being denoted H and its width W, thereby completing the preprocessing of the multi-label noisy data set.
3. The multi-label image recognition method based on deep learning under noisy data according to claim 2, wherein the value of the pseudo label y_i of each picture in each sub-training set is determined as follows: judging whether the picture belongs to a preset multi-label classification category k; if so, the value y_i^k of the pseudo label of the i-th picture with respect to category k is set to 1, and otherwise to 0.
4. The multi-label image recognition method based on deep learning under noisy data according to claim 1, wherein the dual-branch multi-label correction neural network model in step S2 is specifically as follows:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 are identical in structure but have different model parameters;
the first label correction sub-model M1 or the second label correction sub-model M2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence.
5. The multi-label image recognition method based on deep learning under noisy data according to claim 4, wherein in step S3 the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model, the specific method comprising the following steps:
S3.1: jointly inputting a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 into the dual-branch multi-label correction neural network model, where the index i satisfies 1 ≤ i ≤ N/2;
S3.2: using the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: jointly inputting the first feature and the second feature into the instance contrastive learning module of the first label correction sub-model M1, and jointly inputting the third feature and the fourth feature into the instance contrastive learning module of the second label correction sub-model M2; performing first contrastive learning between the first feature and the third feature of picture x_i^1 and between the second feature and the fourth feature of picture x_i^2; setting a first loss function and updating the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.4: inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M1 and performing second contrastive learning with a preset first category prototype feature; inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M2 and performing second contrastive learning with a preset second category prototype feature; setting a second loss function and updating the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.5: inputting the first feature into the classifier of the first label correction sub-model M1 to calculate and output the classification probability of picture x_i^1; inputting the fourth feature into the classifier of the second label correction sub-model M2 to calculate and output the classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M1, correcting the pseudo label of picture x_i^1 to obtain its corrected label; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M2, correcting the pseudo label of picture x_i^2 to obtain its corrected label; setting a third loss function and calculating the cross entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2 respectively to update the parameters;
S3.7: according to the first loss function, the second loss function and the third loss function, setting a total loss function and updating the parameters of the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model.
6. The method for identifying multi-label image under noisy data based on deep learning according to claim 5, wherein the specific method in step S3.3 is as follows:
will first feature
Figure QLYQS_57
And second feature->
Figure QLYQS_58
Common input of first tag modifier sub-model M 1 Is to add the third feature +.>
Figure QLYQS_59
And fourth feature->
Figure QLYQS_60
Common input of a second tag modifier sub-model M 2 An instance comparison learning module of (a);
for pictures
Figure QLYQS_61
According to the first feature->
Figure QLYQS_62
And third feature->
Figure QLYQS_63
Calculating corresponding first eigenvector->
Figure QLYQS_64
And a second feature vector->
Figure QLYQS_65
The method specifically comprises the following steps:
Figure QLYQS_66
wherein ,C1 For pictures
Figure QLYQS_67
Is a pseudo tag number of (a); />
Figure QLYQS_68
Representing picture->
Figure QLYQS_69
C of (2) 1 A j-th feature vector; />
The obtained first feature vector
Figure QLYQS_70
Satisfy->
Figure QLYQS_71
Second feature vector->
Figure QLYQS_72
Satisfy->
Figure QLYQS_73
According to the first feature vector
Figure QLYQS_74
And a second feature vector->
Figure QLYQS_77
Constructing a first positive sample pair
Figure QLYQS_79
And constructs the first circulation sequence +.>
Figure QLYQS_75
Satisfy->
Figure QLYQS_78
,R 1 For the first cycle sequence- >
Figure QLYQS_80
According to the first cyclic sequence +.>
Figure QLYQS_81
Constructing a first negative sample pair
Figure QLYQS_76
Performing first contrast learning by using the constructed first positive sample pair and the first negative sample pair;
setting a first loss function
Figure QLYQS_82
Modifying the first label sub-model M 1 The example comparison learning module of (1) performs parameter updating, specifically:
Figure QLYQS_83
wherein ,
Figure QLYQS_85
modifying the submodel M for the first tag 1 In the example contrast learning module of (1), for pictures +.>
Figure QLYQS_87
Is a first loss function value,/>
Figure QLYQS_90
For picture->
Figure QLYQS_86
Total number of categories required for multi-tag classification, +.>
Figure QLYQS_89
For picture->
Figure QLYQS_92
Corresponding->
Figure QLYQS_93
Category (S),>
Figure QLYQS_84
is a temperature coefficient>
Figure QLYQS_88
For picture->
Figure QLYQS_91
Is>
Figure QLYQS_94
The 1 st feature vector after dimension reduction;
for pictures
Figure QLYQS_95
According to the second feature->
Figure QLYQS_96
And fourth feature->
Figure QLYQS_97
Calculating corresponding third eigenvector->
Figure QLYQS_98
And fourth feature vector->
Figure QLYQS_99
The method specifically comprises the following steps:
Figure QLYQS_100
wherein ,C2 For pictures
Figure QLYQS_101
Is a pseudo tag number of (a); />
Figure QLYQS_102
Representing picture->
Figure QLYQS_103
C of (2) 2 A j-th feature vector;
the obtained third feature vector
Figure QLYQS_104
Satisfy->
Figure QLYQS_105
Fourth feature vector->
Figure QLYQS_106
Satisfy the following requirements
Figure QLYQS_107
According to the third feature vector
Figure QLYQS_110
And fourth feature vector->
Figure QLYQS_112
Construction of a second positive sample pair->
Figure QLYQS_114
And constructing a second circulation sequence +.>
Figure QLYQS_109
Satisfy->
Figure QLYQS_111
,R 2 For the second cycle sequence->
Figure QLYQS_113
According to the second cyclic sequence +. >
Figure QLYQS_115
Construction of a second negative sample pair +.>
Figure QLYQS_108
Performing first contrast learning by using the constructed second positive sample pair and the second negative sample pair;
setting a first loss function [formula] and updating the parameters of the instance contrast learning module of the second label correction sub-model M2, specifically:

[formula]

wherein [formula] is, in the instance contrast learning module of the second label correction sub-model M2, the first loss function value for picture [formula]; [formula] is the total number of categories required for multi-label classification; [formula] is the [formula]-th category corresponding to picture [formula]; and [formula] is the 2nd dimension-reduced feature vector of picture [formula].
7. The multi-label image recognition method based on deep learning under noisy data according to claim 6, wherein step S3.4 specifically comprises:
inputting the first feature [formula] into the category prototype contrast learning module of the first label correction sub-model M1; performing second contrast learning between the first feature vector [formula] of picture [formula] and the first category prototype feature [formula], and updating the first category prototype feature [formula] by a momentum method:

[formula]

wherein [formula] is the updated first category prototype feature corresponding to the k-th category, [formula] is the first category prototype feature corresponding to the k-th category, and m is a preset momentum;

setting a second loss function [formula] and updating the parameters of the category prototype contrast learning module of the first label correction sub-model M1, specifically:

[formula]

wherein [formula] is, in the category prototype contrast learning module of the first label correction sub-model M1, the second loss function value for picture [formula];
inputting the fourth feature [formula] into the category prototype contrast learning module of the second label correction sub-model M2; performing second contrast learning between the second feature vector [formula] of picture [formula] and the second category prototype feature [formula], and updating the second category prototype feature [formula] by the momentum method:

[formula]

wherein [formula] is the updated second category prototype feature corresponding to the [formula]-th category, and [formula] is the second category prototype feature corresponding to the [formula]-th category;

setting a second loss function [formula] and updating the parameters of the category prototype contrast learning module of the second label correction sub-model M2, specifically:

[formula]

wherein [formula] is, in the category prototype contrast learning module of the second label correction sub-model M2, the second loss function value for picture [formula].
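The momentum update of a class prototype feature described in claim 7 can be sketched as follows. The claim's exact formula is an image, so the standard exponential-moving-average form c_k ← m·c_k + (1−m)·v is assumed here, with unit-normalization as a common design choice rather than a claimed step.

```python
import numpy as np

def update_prototype(prototype, feature, m=0.9):
    """Momentum (EMA) update of a class prototype feature.

    prototype: (D,) current prototype for class k
    feature:   (D,) feature vector of a picture assigned to class k
    m:         preset momentum in [0, 1)
    """
    new_proto = m * prototype + (1.0 - m) * feature
    # Re-normalize so prototypes stay on the unit sphere (assumption).
    return new_proto / np.linalg.norm(new_proto)
```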
8. The multi-label image recognition method based on deep learning under noisy data according to claim 7, wherein step S3.5 specifically comprises:

inputting the first feature [formula] into the classifier of the first label correction sub-model M1, and calculating the output classification probability of picture [formula], specifically:

[formula]

wherein [formula] is the classification probability of picture [formula], [formula] is the sigmoid function, and [formula] is the confidence score calculation function of the classifier;

inputting the fourth feature [formula] into the classifier of the second label correction sub-model M2, and calculating the output classification probability of picture [formula], specifically:

[formula]

wherein [formula] is the classification probability of picture [formula], [formula] is the sigmoid function, and [formula] is the confidence score calculation function of the classifier.
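Claim 8's per-class classification probability (a sigmoid over the classifier's confidence scores) can be sketched as below. The claim does not disclose the confidence score function, so a linear head `W @ feature + b` is assumed purely for illustration.

```python
import numpy as np

def classification_probability(feature, W, b):
    """Multi-label classification probability: an independent sigmoid
    per class over confidence scores (linear scores assumed here).

    feature: (D,) input feature vector
    W: (C, D) weight matrix, b: (C,) bias -- hypothetical parameters
    """
    scores = W @ feature + b                 # (C,) confidence scores
    return 1.0 / (1.0 + np.exp(-scores))     # sigmoid, one value per class
```

With all-zero scores each class probability is exactly 0.5, the usual decision boundary for multi-label outputs.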
9. The multi-label image recognition method based on deep learning under noisy data according to claim 8, wherein step S3.6 specifically comprises:

inputting the classification probability [formula] of picture [formula] into the label correction module of the first label correction sub-model M1, which sets a first threshold [formula], a second threshold [formula], a third threshold [formula], and a fourth threshold [formula], and dynamically updating the four thresholds with a preset momentum m;

determining the value of the binary noise label [formula] from the updated third threshold [formula] and fourth threshold [formula] together with the classification probability [formula] of picture [formula];

obtaining the intermediate label [formula] of picture [formula] from the updated first threshold [formula] and second threshold [formula];

when the noise label [formula], replacing the pseudo label [formula] of picture [formula] with the intermediate label [formula] of picture [formula] as the [formula] of picture [formula];

when the noise label [formula], retaining the pseudo label [formula] of picture [formula] as the [formula] of picture [formula];
inputting the classification probability [formula] of picture [formula] into the label correction module of the second label correction sub-model M2;

determining the value of the binary noise label [formula] from the updated third threshold [formula] and fourth threshold [formula] together with the classification probability [formula] of picture [formula];

obtaining the intermediate label [formula] of picture [formula] from the updated first threshold [formula] and second threshold [formula];

when the noise label [formula], replacing the pseudo label [formula] of picture [formula] with the intermediate label [formula] of picture [formula] as the [formula] of picture [formula];

when the noise label [formula], retaining the pseudo label [formula] of picture [formula] as the [formula] of picture [formula];

the third loss function [formula] is:

[formula]

wherein [formula] is the binary cross-entropy loss of the i-th picture.
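The threshold-based label correction and the binary cross-entropy (third) loss of claim 9 can be sketched for a single class as follows. The claim's decision rules are formula images, so one plausible reading is assumed: a binary noise flag is raised when the classifier's probability confidently contradicts the pseudo label (third/fourth thresholds), and the label is then replaced by an intermediate label derived from the probability (first/second thresholds). All parameter names are hypothetical.

```python
import numpy as np

def correct_label(prob, pseudo_label, thr1, thr2, thr3, thr4):
    """Hypothetical per-class label correction.

    thr3/thr4: bounds for flagging a noisy label
    thr1/thr2: bounds for deriving the intermediate label
    """
    # Noise flag: confident prediction contradicting the pseudo label.
    is_noise = (pseudo_label == 1 and prob < thr3) or \
               (pseudo_label == 0 and prob > thr4)
    # Intermediate label from the first/second thresholds.
    if prob > thr1:
        intermediate = 1
    elif prob < thr2:
        intermediate = 0
    else:
        intermediate = pseudo_label
    return intermediate if is_noise else pseudo_label

def bce_loss(prob, label, eps=1e-12):
    """Binary cross-entropy, the per-picture term of the third loss."""
    p = np.clip(prob, eps, 1.0 - eps)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))
```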
10. The multi-label image recognition method based on deep learning under noisy data according to claim 9, wherein the total loss function [formula] in step S3.7 is:

[formula]

wherein [formula] is the value of the total loss function, [formula] is the balance factor of the first loss function [formula], and [formula] is the balance factor of the second loss function [formula].
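The total objective of claim 10 combines the classification (third) loss with the two contrastive losses weighted by balance factors. Since the claim's formula is an image, a simple weighted sum is assumed here; the weight values are placeholders.

```python
def total_loss(l_cls, l_inst, l_proto, lam_inst=1.0, lam_proto=1.0):
    """Hypothetical total loss: third (classification) loss plus the
    first (instance-contrast) and second (prototype-contrast) losses,
    each scaled by its balance factor."""
    return l_cls + lam_inst * l_inst + lam_proto * l_proto
```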
CN202310299402.5A 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data Active CN116012569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310299402.5A CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data


Publications (2)

Publication Number Publication Date
CN116012569A true CN116012569A (en) 2023-04-25
CN116012569B CN116012569B (en) 2023-08-15

Family

ID=86032175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310299402.5A Active CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data

Country Status (1)

Country Link
CN (1) CN116012569B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416382A (en) * 2018-03-01 2018-08-17 南开大学 One kind is based on iteration sampling and a pair of of modified Web graph of multi-tag as training convolutional neural networks method
US20200356851A1 (en) * 2019-05-10 2020-11-12 Baidu Usa Llc Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning
US20210295091A1 (en) * 2020-03-19 2021-09-23 Salesforce.Com, Inc. Unsupervised representation learning with contrastive prototypes
CN113688949A (en) * 2021-10-25 2021-11-23 南京码极客科技有限公司 Network image data set denoising method based on dual-network joint label correction
US20220067506A1 (en) * 2020-08-28 2022-03-03 Salesforce.Com, Inc. Systems and methods for partially supervised learning with momentum prototypes
US20220156591A1 (en) * 2020-11-13 2022-05-19 Salesforce.Com, Inc. Systems and methods for semi-supervised learning with contrastive graph regularization
US20220188645A1 (en) * 2020-12-16 2022-06-16 Oracle International Corporation Using generative adversarial networks to construct realistic counterfactual explanations for machine learning models
CN114692732A (en) * 2022-03-11 2022-07-01 华南理工大学 Method, system, device and storage medium for updating online label
CN115147670A (en) * 2021-03-15 2022-10-04 华为技术有限公司 Object processing method and device
CN115331088A (en) * 2022-10-13 2022-11-11 南京航空航天大学 Robust learning method based on class labels with noise and imbalance
CN115496948A (en) * 2022-09-23 2022-12-20 广东工业大学 Network supervision fine-grained image identification method and system based on deep learning
CN115809697A (en) * 2022-12-26 2023-03-17 上海高德威智能交通系统有限公司 Data correction method and device and electronic equipment


Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
GUOQING ZHANG ET AL.: "Exploiting Multi-granularity Features for Unsupervised Domain Adaptation Person Re-identification", 《2022 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE BIG DATA AND INTELLIGENT SYSTEMS (HDIS)》, pages 223 - 227 *
TIANSHUI CHEN ET AL.: "Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels", 《ARXIV:2205.11131V1 [CS.CV]》, pages 1 - 13 *
XUDONG WANG ET AL.: "Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination", 《PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》, pages 12586 - 12595 *
ZHOU YUCONG ET AL.: "Complementary Learning: A Deep Neural Network Training Method for Image Applications with Noisy Annotations", Journal of Computer Research and Development, vol. 54, no. 12, pages 2649 - 2659 *
GONG CHEN ET AL.: "A Survey of Label-Noise Robust Learning Algorithms", Aero Weaponry, vol. 27, no. 3, pages 20 - 26 *
CAI YUJIA ET AL.: "A Survey of Elimination Algorithms for Instance-Dependent Label Noise", Intelligent Computer and Applications, vol. 12, no. 12, pages 30 - 35 *
CHEN HAO ET AL.: "Person Re-identification with Soft Pseudo Labels and Multi-Scale Feature Fusion", Laser & Optoelectronics Progress, vol. 59, no. 24, pages 1 - 8 *

Also Published As

Publication number Publication date
CN116012569B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN110188227B (en) Hash image retrieval method based on deep learning and low-rank matrix optimization
Zhang et al. Better and faster: knowledge transfer from multiple self-supervised learning tasks via graph distillation for video classification
CN110210468B (en) Character recognition method based on convolutional neural network feature fusion migration
CN112819065B (en) Unsupervised pedestrian sample mining method and unsupervised pedestrian sample mining system based on multi-clustering information
CN114912612A (en) Bird identification method and device, computer equipment and storage medium
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112364791B (en) Pedestrian re-identification method and system based on generation of confrontation network
CN113378706B (en) Drawing system for assisting children in observing plants and learning biological diversity
CN111666873A (en) Training method, recognition method and system based on multitask deep learning network
CN113673482B (en) Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution
CN113434688B (en) Data processing method and device for public opinion classification model training
CN111079847A (en) Remote sensing image automatic labeling method based on deep learning
CN113657267A (en) Semi-supervised pedestrian re-identification model, method and device
CN115331284A (en) Self-healing mechanism-based facial expression recognition method and system in real scene
CN117197904A (en) Training method of human face living body detection model, human face living body detection method and human face living body detection device
CN113010683A (en) Entity relationship identification method and system based on improved graph attention network
CN112183464A (en) Video pedestrian identification method based on deep neural network and graph convolution network
CN114782752A (en) Small sample image grouping classification method and device based on self-training
CN110175631A (en) A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix
CN116051924B (en) Divide-and-conquer defense method for image countermeasure sample
CN116012569B (en) Multi-label image recognition method based on deep learning and under noisy data
CN116775880A (en) Multi-label text classification method and system based on label semantics and transfer learning
CN112750128A (en) Image semantic segmentation method and device, terminal and readable storage medium
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN113592045B (en) Model adaptive text recognition method and system from printed form to handwritten form

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant