CN116012569B - Multi-label image recognition method based on deep learning and under noisy data - Google Patents
Multi-label image recognition method based on deep learning and under noisy data
- Publication number
- CN116012569B (Application CN202310299402.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a dual-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model; acquiring a noise-containing picture to be corrected, correcting it by using the optimized dual-branch multi-label correction neural network model, and carrying out image recognition on it according to the correction label. The method can carry out label correction on multi-label noisy data sets, saves manpower and material costs, and realizes efficient utilization of data resources; meanwhile, the prediction result is more robust. In addition, the invention sets upper and lower bounds on the predicted values of training pictures, so that noise is weakened and overfitting to the noise is avoided.
Description
Technical Field
The invention relates to the technical field of computer vision and multi-label image classification, and in particular to a multi-label image recognition method under noisy data based on deep learning.
Background
With the continuous development of internet technology, artificial intelligence has matured, and deep learning has become one of its hottest branches. Deep learning is popular because of its excellent performance, abundant frameworks, convenient tooling and low barrier to entry. However, conventional deep learning algorithms require a large number of manually labeled samples as data sets; these data sets are typically large, often reaching tens or even hundreds of thousands of samples, and the label of each sample must be accurate. Building a quality data set suitable for training therefore requires significant human and capital costs, which is a major impediment to the further development of deep learning. On the other hand, a large amount of data containing label noise exists on the internet, that is, the labels of part of the data are erroneous, and such data can be easily collected with a crawler. Conventional deep learning algorithms can only train on data whose labels are clean and correct; they cannot use multi-label noisy data, which wastes data resources.
Taking the recognition of orange pictures as an example: after analysis, many pictures labeled "orange" on the network are found to be mislabeled. For example, a picture of a lemon, which is similar in shape and appearance to an orange, is labeled "orange"; such a mislabel is called the first type of mislabel. Or an object far removed from an orange, such as an orange-colored sunset, is labeled "orange"; such a mislabel is called the second type of mislabel. If data with erroneous labels is used directly to train a conventional deep learning network, the network learns much erroneous data, the generalization of the model suffers, and the model becomes difficult to deploy in practice. Facing this problem, there are two approaches to improvement: first, relabel the pictures manually, which consumes great manpower and material resources; second, discard this part of the data set directly, which wastes data resources.
Therefore, how to conveniently train neural networks with noisy data sets is one of the problems to be solved in the further development of deep learning, and is also a trend of the big data age.
The prior art discloses a weakly supervised multi-label image classification method based on meta learning, which provides an image multi-label classification model based on label information enhancement: a neural network with an encoding-decoding architecture sequentially judges, in a sequence labeling manner, whether the labels in a label sequence are relevant, so as to obtain the relevant labels of the image. Aiming at the model overfitting caused by insufficient supervision information in a weakly supervised environment, a teacher-student network training method based on meta learning is also provided, which further improves the accuracy of image annotation. However, this prior art method only addresses the failure of effective modeling caused by missing tags; images without tags or with erroneous tags cannot be effectively corrected, and its labeling accuracy on data sets containing a large amount of noise and erroneous tags is low.
Disclosure of Invention
The invention provides a multi-label image recognition method under noisy data based on deep learning, which aims to overcome the poor correction effect of the prior art on data sets containing many noisy labels; it can correct the labels of a multi-label noisy data set, save manpower and material costs, and realize efficient utilization of data resources.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a multi-label image recognition method based on deep learning under noisy data comprises the following steps:
s1: the method comprises the steps of obtaining a multi-label noisy data set and preprocessing, wherein the specific method comprises the following steps:
acquiring a multi-label noisy data set according to preset K multi-label classification categories;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo tag ỹ_i, and the training set is marked as X; dividing the training set into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 ∪ D2 = X, D1 ∩ D2 = ∅, |D1| = |D2| = N/2, and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo tag ỹ_i;
Determining the length and width data and the pseudo tag ỹ_i of each picture in each sub-training set, wherein the length of a picture is denoted as H and its width as W; this finishes the preprocessing of the multi-label noisy data set;
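As a sketch of this preprocessing step, the pure-Python fragment below splits a pseudo-labelled training set into two equally sized sub-training sets D1 and D2. The names, the toy data and the shuffle-then-halve strategy are illustrative assumptions, not the patent's concrete implementation.

```python
import random

# Illustrative sketch of the S1 split: the training set X is a list of
# (picture_id, pseudo_tag) pairs standing in for (x_i, y_i). Names and the
# toy data are hypothetical; the patent does not prescribe a representation.
def split_training_set(X, seed=0):
    """Shuffle X and split it into two sub-training sets of equal size,
    so that D1 and D2 are disjoint and together cover all of X."""
    rng = random.Random(seed)
    shuffled = list(X)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Toy training set of N = 8 pictures with K = 4 binary pseudo tags each.
X = [(f"img_{i}", [1 if k == i % 4 else 0 for k in range(4)]) for i in range(8)]
D1, D2 = split_training_set(X)
```

Any split that yields two disjoint halves of equal size satisfies the constraints stated above; shuffling first simply keeps the two halves statistically similar.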
S2: the method comprises the steps of establishing a double-branch multi-label correction neural network model, specifically:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 have the same structure but different model parameters;
the first label correction sub-model M1 and the second label correction sub-model M2 each comprise a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module which are connected in sequence;
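To make the wiring concrete, here is a minimal structural sketch of one label correction sub-model as five sequentially connected callables. It is a hypothetical scaffold (real modules would be neural networks with learned parameters), not the patent's implementation; all names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class LabelCorrectionSubModel:
    # The five sequentially connected modules named in the patent; each is an
    # arbitrary callable here so that only the data flow is illustrated.
    feature_extractor: Callable
    instance_contrast: Callable
    prototype_contrast: Callable
    classifier: Callable
    label_corrector: Callable

    def forward(self, picture, pseudo_tag):
        feat = self.feature_extractor(picture)
        feat = self.instance_contrast(feat)
        feat = self.prototype_contrast(feat)
        prob = self.classifier(feat)
        return self.label_corrector(prob, pseudo_tag)

def build_dual_branch():
    """Two sub-models with identical structure but independent parameter
    objects, mirroring the parallel M1 / M2 arrangement."""
    def make():
        identity = lambda x: x  # placeholder for a real neural module
        return LabelCorrectionSubModel(identity, identity, identity, identity,
                                       lambda prob, tag: tag)
    return make(), make()

M1, M2 = build_dual_branch()
```

Calling `build_dual_branch()` twice over the same factory is what gives the two branches the same structure with separate parameters.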
s3: inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model, wherein the specific method comprises the following steps:
s3.1: inputting a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 together into the dual-branch multi-label correction neural network model, wherein i satisfies 1 ≤ i ≤ n and n is the number of pictures in the first sub-training set D1 or the second sub-training set D2;
s3.2: using the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 respectively to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature v1 and a second feature v2 (the features of the pictures x_i^1 and x_i^2 extracted by M1) as well as a third feature v3 and a fourth feature v4 (the features of the pictures x_i^1 and x_i^2 extracted by M2);
s3.3: inputting the first feature v1 and the second feature v2 together into the instance contrastive learning module of the first label correction sub-model M1, and inputting the third feature v3 and the fourth feature v4 together into the instance contrastive learning module of the second label correction sub-model M2; performing the first contrastive learning on the first feature v1 and the third feature v3 of the picture x_i^1, and performing the first contrastive learning on the second feature v2 and the fourth feature v4 of the picture x_i^2; setting a first loss function L1 and updating the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
s3.4: inputting the first feature v1 into the category prototype contrastive learning module of the first label correction sub-model M1 and performing second contrastive learning with a preset first category prototype feature P1; inputting the fourth feature v4 into the category prototype contrastive learning module of the second label correction sub-model M2 and performing second contrastive learning with a preset second category prototype feature P2; setting a second loss function L2 and updating the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
s3.5: inputting the first feature v1 into the classifier of the first label correction sub-model M1 and calculating the classification probability of the output picture x_i^1; inputting the fourth feature v4 into the classifier of the second label correction sub-model M2 and calculating the classification probability of the output picture x_i^2;
s3.6: inputting the classification probability of the picture x_i^1 into the label correction module of the first label correction sub-model M1 and performing label correction on the pseudo tag ỹ_i^1 of the picture x_i^1 to obtain the correction label of the picture x_i^1; inputting the classification probability of the picture x_i^2 into the label correction module of the second label correction sub-model M2 and performing label correction on the pseudo tag ỹ_i^2 of the picture x_i^2 to obtain the correction label of the picture x_i^2; setting a third loss function L3, respectively calculating the cross entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2, and performing parameter updating;
s3.7: setting a total loss function L according to the first loss function L1, the second loss function L2 and the third loss function L3, and performing parameter updating on the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
Preferably, the specific method for determining the value of the pseudo tag ỹ_i of each picture in each sub-training set is as follows:
judging whether a picture in each sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo tag of the i-th picture relative to the multi-label classification category k is ỹ_{i,k} = 1, otherwise ỹ_{i,k} = 0.
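The rule above can be stated in one line; the function below is an illustrative rendering of it (the names are assumptions).

```python
def pseudo_tag_value(picture_categories, k):
    """Value of the pseudo tag of a picture relative to multi-label
    classification category k: 1 if the picture belongs to category k,
    0 otherwise."""
    return 1 if k in picture_categories else 0

# A picture belonging to categories 0 and 3 out of K = 5 preset categories:
tag_vector = [pseudo_tag_value({0, 3}, k) for k in range(5)]
```

Applying the rule over all K categories, as in the comprehension, yields the binary pseudo-tag vector attached to each picture.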
Preferably, the specific method of step S3.3 is as follows:
inputting the first feature v1 and the second feature v2 together into the instance contrastive learning module of the first label correction sub-model M1, and inputting the third feature v3 and the fourth feature v4 together into the instance contrastive learning module of the second label correction sub-model M2;
for the picture x_i^1, calculating the corresponding first feature vector q1 and second feature vector q2 according to the first feature v1 and the third feature v3, wherein C1 is the number of pseudo tags of the picture x_i^1, and q1 and q2 each consist of C1 reduced-dimension feature vectors whose j-th entry corresponds to the j-th pseudo tag of the picture x_i^1;
constructing a first positive sample pair (q1, q2) according to the first feature vector q1 and the second feature vector q2, and constructing a first cyclic sequence S1 with sequence length R1; constructing first negative sample pairs according to the first cyclic sequence S1, and performing the first contrastive learning with the constructed first positive sample pair and first negative sample pairs;
setting a first loss function L1(x_i^1) and updating the parameters of the instance contrastive learning module of the first label correction sub-model M1, wherein L1(x_i^1) is the first loss function value of the picture x_i^1 in the instance contrastive learning module of the first label correction sub-model M1, K is the total number of categories required for multi-label classification, C1 is the number of categories corresponding to the picture x_i^1, τ is the temperature coefficient, and ỹ_{i,k}^1 is the value of the pseudo tag of the picture x_i^1 relative to the multi-label classification category k;
for the picture x_i^2, calculating the corresponding third feature vector q3 and fourth feature vector q4 according to the second feature v2 and the fourth feature v4, wherein C2 is the number of pseudo tags of the picture x_i^2, and q3 and q4 each consist of C2 reduced-dimension feature vectors whose j-th entry corresponds to the j-th pseudo tag of the picture x_i^2;
constructing a second positive sample pair (q3, q4) according to the third feature vector q3 and the fourth feature vector q4, and constructing a second cyclic sequence S2 with sequence length R2; constructing second negative sample pairs according to the second cyclic sequence S2, and performing the first contrastive learning with the constructed second positive sample pair and second negative sample pairs;
setting a first loss function L1(x_i^2) and updating the parameters of the instance contrastive learning module of the second label correction sub-model M2, wherein L1(x_i^2) is the first loss function value of the picture x_i^2 in the instance contrastive learning module of the second label correction sub-model M2, K is the total number of categories required for multi-label classification, C2 is the number of categories corresponding to the picture x_i^2, and ỹ_{i,k}^2 is the value of the pseudo tag of the picture x_i^2 relative to the multi-label classification category k.
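The exact first loss formula is defined in the patent's figures; as a reference point, losses of this kind (a positive pair pulled together, negative pairs pushed apart, scaled by a temperature coefficient τ) commonly take the InfoNCE form. The sketch below is a generic analogue under that assumption, not the patent's own formula.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss over one positive pair and a list of
    negatives: low when anchor and positive are similar relative to the
    negatives, with tau playing the role of the temperature coefficient."""
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))
```

A matching positive pair yields a much smaller loss than a mismatched one, which is the behaviour the first contrastive learning relies on to align the two branches' features of the same picture.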
Preferably, the specific method of step S3.4 is as follows:
Inputting the first feature v1 into the category prototype contrastive learning module of the first label correction sub-model M1, performing the second contrastive learning between the first feature vector q1 of the picture x_i^1 and the first category prototype feature P1_k, and updating the first category prototype feature by a momentum method, wherein P1_k' is the updated first category prototype feature corresponding to the k-th category, P1_k is the first category prototype feature corresponding to the k-th category, and m is a preset momentum;
setting a second loss function L2(x_i^1) and updating the parameters of the category prototype contrastive learning module of the first label correction sub-model M1, wherein L2(x_i^1) is the second loss function value of the picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M1;
inputting the fourth feature v4 into the category prototype contrastive learning module of the second label correction sub-model M2, performing the second contrastive learning between the fourth feature vector q4 of the picture x_i^2 and the second category prototype feature P2_k, and updating the second category prototype feature by the momentum method, wherein P2_k' is the updated second category prototype feature corresponding to the k-th category and P2_k is the second category prototype feature corresponding to the k-th category;
setting a second loss function L2(x_i^2) and updating the parameters of the category prototype contrastive learning module of the second label correction sub-model M2, wherein L2(x_i^2) is the second loss function value of the picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M2.
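The momentum update itself can be sketched as follows, assuming the common convention p ← m·p + (1 − m)·q, which the "preset momentum m" suggests; the exact form appears in the patent's figures.

```python
def momentum_update(prototype, feature, m=0.9):
    """Momentum-style category prototype update: the prototype drifts slowly
    toward the current feature vector, with m controlling the inertia.
    The convention p <- m*p + (1 - m)*q is an assumption."""
    return [m * p + (1.0 - m) * q for p, q in zip(prototype, feature)]
```

With m close to 1 the prototypes change slowly, which stabilises the second contrastive learning against individual noisy features.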
Preferably, the specific method of step S3.5 is as follows:
Inputting the first feature v1 into the classifier of the first label correction sub-model M1 and calculating the classification probability of the output picture x_i^1, specifically p(x_i^1) = σ(s(v1)), wherein p(x_i^1) is the classification probability of the picture x_i^1, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier;
inputting the fourth feature v4 into the classifier of the second label correction sub-model M2 and calculating the classification probability of the output picture x_i^2, specifically p(x_i^2) = σ(s(v4)), wherein p(x_i^2) is the classification probability of the picture x_i^2, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier.
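Since each category is scored independently in multi-label classification, the classifier output reduces to applying the sigmoid to each confidence score. A minimal sketch, with the confidence scores assumed given:

```python
import math

def sigmoid(z):
    """Logistic sigmoid, mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classification_probability(confidence_scores):
    """Per-category classification probability: each of the K confidence
    scores is squashed independently by the sigmoid, as is usual for
    multi-label (rather than softmax multi-class) classification."""
    return [sigmoid(s) for s in confidence_scores]

probs = classification_probability([0.0, 4.0, -4.0])
```

Using an independent sigmoid per category, instead of a softmax over all categories, is what lets several labels be active for the same picture.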
Preferably, the specific method of step S3.6 is as follows:
Inputting the classification probability p(x_i^1) of the picture x_i^1 into the label correction module of the first label correction sub-model M1, setting a first threshold t1, a second threshold t2, a third threshold t3 and a fourth threshold t4, and dynamically updating the four thresholds with a preset momentum m;
determining the value of the binary noise tag of the picture x_i^1 according to the updated third threshold t3, the updated fourth threshold t4 and the classification probability p(x_i^1);
obtaining the intermediate label of the picture x_i^1 according to the updated first threshold t1 and the updated second threshold t2;
when the noise tag indicates that the pseudo tag is noisy, replacing the pseudo tag ỹ_i^1 of the picture x_i^1 with its intermediate label as the correction label of the picture x_i^1;
when the noise tag indicates that the pseudo tag is clean, retaining the pseudo tag ỹ_i^1 of the picture x_i^1 as the correction label of the picture x_i^1;
inputting the classification probability p(x_i^2) of the picture x_i^2 into the label correction module of the second label correction sub-model M2;
determining the value of the binary noise tag of the picture x_i^2 according to the updated third threshold t3, the updated fourth threshold t4 and the classification probability p(x_i^2);
obtaining the intermediate label of the picture x_i^2 according to the updated first threshold t1 and the updated second threshold t2;
when the noise tag indicates that the pseudo tag is noisy, replacing the pseudo tag ỹ_i^2 of the picture x_i^2 with its intermediate label as the correction label of the picture x_i^2;
when the noise tag indicates that the pseudo tag is clean, retaining the pseudo tag ỹ_i^2 of the picture x_i^2 as the correction label of the picture x_i^2;
The third loss function L3 is the binary cross entropy loss, wherein ℓ_i is the binary cross entropy loss of the i-th picture and ỹ_{i,k} is the value of the pseudo tag of the i-th picture relative to the multi-label classification category k.
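A hedged sketch of the whole correction rule follows. The exact roles of the four thresholds are defined in the patent's figures, so the reading below — probabilities beyond the outer thresholds t3/t4 that contradict the pseudo tag flag it as noisy, and the inner thresholds t1/t2 binarise the intermediate label — is an illustrative assumption, as are the default threshold values.

```python
import math

def correct_labels(probs, pseudo_tags, t1=0.3, t2=0.7, t3=0.1, t4=0.9):
    """Assumed reading of the correction rule: a pseudo tag contradicted by a
    confident prediction (p > t4 while tagged 0, or p < t3 while tagged 1) is
    treated as noisy and replaced by the intermediate label obtained by
    thresholding p against t1/t2; otherwise the pseudo tag is retained."""
    corrected = []
    for p, y in zip(probs, pseudo_tags):
        noisy = (p > t4 and y == 0) or (p < t3 and y == 1)
        if noisy:
            intermediate = 1 if p >= t2 else (0 if p <= t1 else y)
            corrected.append(intermediate)
        else:
            corrected.append(y)
    return corrected

def bce_loss(probs, labels, eps=1e-12):
    """Binary cross entropy summed over the K categories of one picture."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
                for p, y in zip(probs, labels))
```

Bounding the correction by thresholds is what realises the stated effect: only predictions that clearly exceed or fall below the bounds may overturn a pseudo tag, which weakens noise without letting the model overfit it.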
Preferably, the total loss function L in step S3.7 is L = L3 + λ1·L1 + λ2·L2, wherein L is the total loss function value, λ1 is the balance factor of the first loss function L1, and λ2 is the balance factor of the second loss function L2.
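Reading "balance factor" as a scalar weight, the combination can be sketched as follows; the weight values are illustrative.

```python
def total_loss(l1, l2, l3, lam1=1.0, lam2=1.0):
    """Total loss L = L3 + lam1*L1 + lam2*L2: the label-correction cross
    entropy plus the two contrastive losses scaled by their balance factors
    (lam1, lam2 stand for the two balance factors; values are illustrative)."""
    return l3 + lam1 * l1 + lam2 * l2
```

Tuning the two balance factors trades off how strongly the contrastive objectives shape the features relative to the correction loss.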
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing; establishing a double-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for comparison learning training to obtain an optimized double-branch multi-label correction neural network model; acquiring a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, acquiring a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label;
according to the invention, related pictures can be collected from the internet as data sets according to the specific application of a user, the dual-branch network is trained, and a model supporting multi-label picture classification is constructed; label correction and image recognition can be carried out on multi-label noisy data sets, manpower and material costs are saved, and efficient utilization of data resources is realized. The invention also provides a contrastive learning method: while differences exist between the branch networks, they can learn some common representations from each other, and when classifying pictures the predictions of the two models are averaged, making the result more robust. In addition, the invention sets upper and lower bounds on the predicted values of training pictures and changes the label of a picture whose predicted value exceeds or falls below the threshold, thereby weakening noise and avoiding overfitting to the noise.
Drawings
Fig. 1 is a flowchart of a multi-label image recognition method under noisy data based on deep learning according to embodiment 1.
Fig. 2 is a contrastive learning training flowchart of the dual-branch multi-label correction neural network model provided in embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, the embodiment provides a multi-label image recognition method under noisy data based on deep learning, which includes the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, a multi-label noisy data set is first obtained and preprocessed; in this embodiment, the multi-label noisy data set is obtained from the internet. A dual-branch multi-label correction neural network model is then established, and the preprocessed multi-label noisy data set is input into it for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model. Finally, a noise-containing picture to be corrected is obtained and corrected by using the optimized dual-branch multi-label correction neural network model to obtain its correction label, and image recognition is carried out on it according to the correction label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to specific application of a user, the dual-branch network is trained, a model supporting classification of the multi-label pictures is constructed, label correction can be carried out on the multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is achieved.
Example 2
The embodiment provides a multi-label image recognition method based on deep learning under noisy data, which comprises the following steps:
s1: the method comprises the steps of obtaining a multi-label noisy data set and preprocessing, wherein the specific method comprises the following steps:
acquiring a multi-label noisy data set according to preset K multi-label classification categories;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo tag ỹ_i, and the training set is marked as X; dividing the training set into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 ∪ D2 = X, D1 ∩ D2 = ∅, |D1| = |D2| = N/2, and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo tag ỹ_i;
Determining the length and width data and the pseudo tag ỹ_i of each picture in each sub-training set, wherein the length of a picture is denoted as H and its width as W; this finishes the preprocessing of the multi-label noisy data set;
s2: the method comprises the steps of establishing a double-branch multi-label correction neural network model, specifically:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 have the same structure but different model parameters;
the first label correction sub-model M1 and the second label correction sub-model M2 each comprise a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module which are connected in sequence;
s3: as shown in fig. 2, the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model, and the specific method is as follows:
s3.1: inputting a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 together into the dual-branch multi-label correction neural network model, wherein i satisfies 1 ≤ i ≤ n and n is the number of pictures in the first sub-training set D1 or the second sub-training set D2;
s3.2: using the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 respectively to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature v1 and a second feature v2 (the features of the pictures x_i^1 and x_i^2 extracted by M1) as well as a third feature v3 and a fourth feature v4 (the features of the pictures x_i^1 and x_i^2 extracted by M2);
s3.3: inputting the first feature v1 and the second feature v2 together into the instance contrastive learning module of the first label correction sub-model M1, and inputting the third feature v3 and the fourth feature v4 together into the instance contrastive learning module of the second label correction sub-model M2; performing the first contrastive learning on the first feature v1 and the third feature v3 of the picture x_i^1, and performing the first contrastive learning on the second feature v2 and the fourth feature v4 of the picture x_i^2; setting a first loss function L1 and updating the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
s3.4: inputting the first feature v1 into the category prototype contrastive learning module of the first label correction sub-model M1 and performing second contrastive learning with a preset first category prototype feature P1; inputting the fourth feature v4 into the category prototype contrastive learning module of the second label correction sub-model M2 and performing second contrastive learning with a preset second category prototype feature P2; setting a second loss function L2 and updating the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
s3.5: inputting the first feature v1 into the classifier of the first label correction sub-model M1 and calculating the classification probability of the output picture x_i^1; inputting the fourth feature v4 into the classifier of the second label correction sub-model M2 and calculating the classification probability of the output picture x_i^2;
s3.6: picture is madeIs input into a first label modifier sub-model M 1 The label correction module of (2) for picturesPseudo tag of->Performing label correction to obtain picture->Is->The method comprises the steps of carrying out a first treatment on the surface of the Picture->Is input into a second label modifier sub-model M 2 The label correction module of (1) for picture->Pseudo tag of->Performing label correction to obtain a pictureIs->The method comprises the steps of carrying out a first treatment on the surface of the And sets a third loss function->Respectively calculating a first label correction sub-model M 1 And a second label correction sub-model M 2 The cross entropy loss of the label correction module of the (2) is used for carrying out parameter updating;
s3.7: according to a first loss functionSecond loss function->And a third loss function->Setting the total loss function->Parameter updating is carried out on the double-branch multi-label correction neural network model, and an optimized double-branch multi-label correction neural network model is obtained;
s4: acquiring a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, acquiring a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label;
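The training flow of steps S3.1 to S3.7 can be sketched as follows. This is a minimal, framework-free illustration: `branch1`/`branch2` are stand-in callables playing the role of the two feature extractors, cosine distance stands in for the real contrastive loss, and the prototype and cross-entropy terms are stubbed out, so all names here are assumptions rather than the patented implementation.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity; guards against zero vectors
    nu = math.sqrt(sum(x * x for x in u)) or 1.0
    nv = math.sqrt(sum(x * x for x in v)) or 1.0
    return 1.0 - sum(a * b for a, b in zip(u, v)) / (nu * nv)

def training_iteration(x1, x2, branch1, branch2, lam1=0.5, lam2=0.5):
    # S3.2: extract four features -- each branch sees both pictures
    f1, f2 = branch1(x1), branch1(x2)
    f3, f4 = branch2(x1), branch2(x2)
    # S3.3: first contrast learning between cross-branch views of the
    # same picture (cosine distance stands in for the real loss L1)
    l1 = cosine_distance(f1, f3) + cosine_distance(f2, f4)
    l2 = 0.0  # S3.4: class-prototype contrast loss L2, stubbed out here
    l3 = 0.0  # S3.6: cross-entropy L3 of the corrected labels, stubbed out
    # S3.7: total loss combining L3 with the weighted contrastive terms
    return l3 + lam1 * l1 + lam2 * l2
```

When the two branches produce identical features for a picture, the stand-in contrastive term vanishes; diverging branches increase it, which is the signal the real modules train on.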
The specific method for determining the value of the pseudo label ŷ_i of the pictures in each sub-training set is as follows:
judge whether a picture in the sub-training set belongs to the preset multi-label classification category k; if so, the value ŷ_i,k of the pseudo label of the i-th picture relative to the multi-label classification category k is 1, otherwise ŷ_i,k is 0;
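Under the binary reading of this rule (1 if the picture belongs to category k, else 0), the pseudo label is a multi-hot vector over the preset categories; a sketch, with the category list and keyword set purely illustrative:

```python
def pseudo_label(picture_keywords, categories):
    # y_hat[k] = 1 if the picture belongs to preset multi-label
    # category k, else 0 (binary multi-hot pseudo label)
    return [1 if c in picture_keywords else 0 for c in categories]

# hypothetical category list and retrieval keywords for one picture
categories = ["person", "truck", "bus", "dog"]
y_hat = pseudo_label({"person", "bus"}, categories)  # -> [1, 0, 1, 0]
```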
The specific method of step S3.3 is as follows:
input the first feature f_1 and the second feature f_2 jointly into the instance contrast learning module of the first label correction sub-model M_1, and input the third feature f_3 and the fourth feature f_4 jointly into the instance contrast learning module of the second label correction sub-model M_2;
for picture x_i^1, compute the corresponding first feature vector v_1 and second feature vector v_2 from the first feature f_1 and the third feature f_3, where C_1 is the number of pseudo labels of picture x_i^1 and the j-th of its C_1 feature vectors is denoted v_j; the obtained first feature vector v_1 and second feature vector v_2 each satisfy the dimensional constraint imposed by the dimension-reduction layer;
construct a first positive sample pair (v_1, v_2) from the first feature vector v_1 and the second feature vector v_2, and construct a first circular sequence Q_1 of sequence length R_1;
construct first negative sample pairs from the first circular sequence Q_1, and perform the first contrast learning with the constructed first positive sample pair and first negative sample pairs;
set the first loss function L_1 and update the parameters of the instance contrast learning module of the first label correction sub-model M_1, where L_1(x_i^1) is the first loss function value of picture x_i^1 in the instance contrast learning module of M_1, K is the total number of categories required for the multi-label classification of x_i^1, k is the corresponding category index, τ is the temperature coefficient, v_1 and v_2 are the two dimension-reduced feature vectors of x_i^1, and ŷ_i,k is the value of the pseudo label of x_i^1 relative to the multi-label classification category k;
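Contrast learning with one positive pair and queue-based negatives is in the spirit of an InfoNCE loss; the sketch below assumes L2-normalized vectors, dot-product similarity and a temperature τ, and uses a `deque` as the circular sequence of length R_1 — the exact loss of the specification may differ in detail.

```python
import math
from collections import deque

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def first_contrast_loss(anchor, positive, queue, tau=0.1):
    # InfoNCE-style loss: one positive pair (anchor, positive),
    # negatives drawn from the circular queue of past feature vectors
    a, p = l2_normalize(anchor), l2_normalize(positive)
    sim = lambda u, w: sum(x * y for x, y in zip(u, w))
    pos = math.exp(sim(a, p) / tau)
    neg = sum(math.exp(sim(a, l2_normalize(q)) / tau) for q in queue)
    return -math.log(pos / (pos + neg))

# circular sequence Q1 of length R1 (8192 in the embodiment); once the
# queue is full, the oldest entries are evicted automatically
R1 = 8192
queue = deque(maxlen=R1)
queue.append([0.0, 1.0])
loss = first_contrast_loss([1.0, 0.0], [1.0, 0.0], queue)
```

A well-aligned positive pair against an orthogonal negative yields a loss near zero; misaligned pairs drive it up.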
for picture x_i^2, compute the corresponding third feature vector v_3 and fourth feature vector v_4 from the second feature f_2 and the fourth feature f_4, where C_2 is the number of pseudo labels of picture x_i^2 and the j-th of its C_2 feature vectors is denoted v_j; the obtained third feature vector v_3 and fourth feature vector v_4 each satisfy the dimensional constraint imposed by the dimension-reduction layer;
construct a second positive sample pair (v_3, v_4) from the third feature vector v_3 and the fourth feature vector v_4, and construct a second circular sequence Q_2 of sequence length R_2;
construct second negative sample pairs from the second circular sequence Q_2, and perform the first contrast learning with the constructed second positive sample pair and second negative sample pairs;
set the first loss function L_1 and update the parameters of the instance contrast learning module of the second label correction sub-model M_2, where L_1(x_i^2) is the first loss function value of picture x_i^2 in the instance contrast learning module of M_2, K is the total number of categories required for the multi-label classification of x_i^2, k is the corresponding category index, v_3 and v_4 are the two dimension-reduced feature vectors of x_i^2, and ŷ_i,k is the value of the pseudo label of x_i^2 relative to the multi-label classification category k;
The specific method of step S3.4 is as follows:
input the first feature f_1 into the class prototype contrast learning module of the first label correction sub-model M_1, perform the second contrast learning between the first feature vector v_1 of picture x_i^1 and the first class prototype features, and update the first class prototype features by the momentum method as p′_k = m·p_k + (1 − m)·v_1, where p′_k is the first class prototype feature corresponding to the k-th category after the update, p_k is the first class prototype feature corresponding to the k-th category before the update, and m is the preset momentum;
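The momentum update described here — new prototype = m·(old prototype) + (1 − m)·(current feature vector) — can be sketched as below; the momentum value 0.9 is a placeholder, not the embodiment's preset.

```python
def update_prototype(p_k, v, m=0.9):
    # momentum update of the class-k prototype feature:
    # p_k <- m * p_k + (1 - m) * v, with preset momentum m
    return [m * pk + (1 - m) * vk for pk, vk in zip(p_k, v)]

p_new = update_prototype([1.0, 0.0], [0.0, 1.0], m=0.9)  # ~ [0.9, 0.1]
```

A large m keeps the prototype stable across batches, so a single noisy feature vector shifts it only slightly.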
set the second loss function L_2 and update the parameters of the class prototype contrast learning module of the first label correction sub-model M_1, where L_2(x_i^1) is the second loss function value of picture x_i^1 in the class prototype contrast learning module of M_1;
input the fourth feature f_4 into the class prototype contrast learning module of the second label correction sub-model M_2, perform the second contrast learning between the second feature vector of picture x_i^2 and the second class prototype features, and update the second class prototype features by the momentum method in the same way, where q′_k is the second class prototype feature corresponding to the k-th category after the update and q_k is the one before the update;
set the second loss function L_2 and update the parameters of the class prototype contrast learning module of the second label correction sub-model M_2, where L_2(x_i^2) is the second loss function value of picture x_i^2 in the class prototype contrast learning module of M_2;
The specific method of step S3.5 is as follows:
input the first feature f_1 into the classifier of the first label correction sub-model M_1 and compute the output classification probability of picture x_i^1 as P_1 = σ(g(f_1)), where P_1 is the classification probability of picture x_i^1, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;
input the fourth feature f_4 into the classifier of the second label correction sub-model M_2 and compute the output classification probability of picture x_i^2 as P_2 = σ(g(f_4)), where P_2 is the classification probability of picture x_i^2, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;
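Because the classification probability is the sigmoid of the classifier's confidence scores, each of the K classes gets an independent probability, which is what multi-label (as opposed to softmax multi-class) recognition needs. A minimal sketch, with the score values purely illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classification_probability(scores):
    # per-class probability sigma(g(f)): sigmoid of the classifier's
    # confidence score for each class, computed independently per class
    return [sigmoid(s) for s in scores]

probs = classification_probability([0.0, 4.0, -4.0])
```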
The specific method of step S3.6 is as follows:
input the classification probability P_1 of picture x_i^1 into the label correction module of the first label correction sub-model M_1, set a first threshold δ_1, a second threshold δ_2, a third threshold δ_3 and a fourth threshold δ_4, and dynamically update the four thresholds with the preset momentum m;
determine the value of the binary noise label b according to the updated third threshold δ_3, the fourth threshold δ_4 and the classification probability P_1 of picture x_i^1;
obtain the intermediate label of picture x_i^1 according to the updated first threshold δ_1 and second threshold δ_2;
when the noise label b indicates label noise, replace the pseudo label of picture x_i^1 with its intermediate label as the corrected label of x_i^1;
otherwise, keep the pseudo label of picture x_i^1 as the corrected label of x_i^1;
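A sketch of the per-class correction rule: the noise flag b is derived from the third and fourth thresholds, the intermediate label from the first and second, and the pseudo label is replaced only when b flags noise. The precise comparisons and the threshold values below are assumptions, since the specification defers them to the thresholds' definitions.

```python
def correct_label(prob, y_hat, t1=0.3, t2=0.7, t3=0.1, t4=0.9):
    # noise flag b: the classifier strongly disagrees with the pseudo label
    noisy = (prob >= t4 and y_hat == 0) or (prob <= t3 and y_hat == 1)
    if not noisy:
        return y_hat                 # b = 0: keep the pseudo label
    # b = 1: replace with the intermediate label derived from t1 / t2
    if prob >= t2:
        return 1
    if prob <= t1:
        return 0
    return y_hat
```

So a confidently predicted class missing from the pseudo label is added, and a confidently rejected class wrongly present is removed; ambiguous predictions leave the label untouched.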
The specific correction process in the label correction module of the second label correction sub-model M_2 is analogous:
input the classification probability P_2 of picture x_i^2 into the label correction module of the second label correction sub-model M_2;
determine the value of the binary noise label b according to the updated third threshold δ_3, the fourth threshold δ_4 and the classification probability P_2 of picture x_i^2;
obtain the intermediate label of picture x_i^2 according to the updated first threshold δ_1 and second threshold δ_2;
when the noise label b indicates label noise, replace the pseudo label of picture x_i^2 with its intermediate label as the corrected label of x_i^2;
otherwise, keep the pseudo label of picture x_i^2 as the corrected label of x_i^2;
The third loss function L_3 is the binary cross-entropy loss, where L_3(i) is the binary cross-entropy loss of the i-th picture and ŷ_i,k is the value of the pseudo label of the i-th picture relative to the multi-label classification category k;
the total loss function L_total in step S3.7 combines the three losses, where L_total is the total loss function value, λ_1 is the balance factor of the first loss function L_1, and λ_2 is the balance factor of the second loss function L_2.
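Under the reading that the total loss adds the cross-entropy term to the two weighted contrastive terms, L_total = L_3 + λ_1·L_1 + λ_2·L_2; a sketch, where the combination form and the balance-factor values are assumptions:

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    # third loss: binary cross-entropy for one class of one picture;
    # p is the predicted probability, y the (corrected) label in {0, 1}
    return -(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))

def total_loss(l1, l2, l3, lam1=0.5, lam2=0.5):
    # assumed combination L_total = L3 + lam1 * L1 + lam2 * L2
    return l3 + lam1 * l1 + lam2 * l2

l3 = binary_cross_entropy(0.9, 1)   # small: prediction agrees with label
loss = total_loss(2.0, 4.0, l3)
```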
In the specific implementation process, the multi-label noisy data set is first acquired and preprocessed: the data set is acquired according to the preset K multi-label classification categories and divided into a training set and a validation set, where the training set contains N pictures, each picture is given a pseudo label ŷ_i, and the training set is denoted X. The specific method is as follows:
Microsoft COCO and Pascal VOC are the two most widely used data sets for evaluating multi-label recognition (MLR) algorithms; the Microsoft COCO data set contains 80 categories and the Pascal VOC data set contains 20. In this embodiment, the 80 categories of the Microsoft COCO data set are used to construct the Web-COCO and Web-Pascal data sets, with one or more categories randomly selected as keywords, for example "person" or "person, truck, bus";
the corresponding pictures are retrieved from search engines, including Google, Baidu and Bing, and the more than 500,000 noisy pictures obtained are taken as the multi-label noisy data set;
incomplete and duplicate pictures are then removed; the remaining 290,000 noisy pictures are used to construct the Web-COCO data set, and the pictures containing at least one of the 20 Pascal VOC categories are further selected to construct the Web-Pascal data set;
the Web-COCO data set contains 290,000 pictures, and each picture is given a pseudo label ŷ_i according to its category keywords; 20,000 pictures are randomly selected for manual annotation, giving them a more accurate and detailed description;
the Web-COCO data set has the following drawbacks. The first is label noise: when data is retrieved from the web, label noise is inevitably introduced. In the multi-label pictures of this embodiment, label noise arises, for example, when a picture contains information of many categories but the corresponding keywords do not cover them, leading to erroneous negative labels. A better description of the noisy pictures is obtained by computing the precision and recall of each category; the results show an average recall of 46.1% and an average precision of 64.6%, indicating that severe label noise exists in the data set;
another drawback is semantic dispersion: a multi-label image contains multiple semantic objects spread across the image, so the corresponding semantic regions must be found to help recover missing labels, while examining the whole image also helps correct wrong positive labels;
a third drawback is class imbalance, which is common in the real world and even more severe when multi-label pictures are retrieved from the web; for example, the most frequent category, "person", accounts for about 15% of the pictures, while the 20 least frequent categories together account for only 5% of the total. For evaluating the WS-MLR task, Web-COCO is used as the training set and Microsoft COCO, which contains 40,504 fully manually annotated images, as the validation set;
the Web-Pascal data set contains 236,043 pictures and uses the 20 categories of the Pascal VOC data set; like the Web-COCO data set, it suffers from label noise, semantic dispersion and class imbalance. Likewise, the 4,952 manually annotated pictures in the Web-Pascal data set are used as the validation set and the remaining pictures as the training set;
the training set is divided into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, where D_1 ∪ D_2 = X, D_1 ∩ D_2 = ∅, and (x_i, ŷ_i) represents the i-th picture x_i and its corresponding pseudo label ŷ_i;
the length and width data and the pseudo label ŷ_i of the pictures in each sub-training set are determined, where the length of a picture is denoted H and its width W; this completes the preprocessing of the multi-label noisy data set;
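The division of the training set X into two equal, disjoint sub-training sets D_1 and D_2 can be sketched as follows (a random split is an assumption; the specification only fixes the equal sizes):

```python
import random

def split_training_set(X, seed=0):
    # divide training set X into two sub-training sets with the same
    # number of pictures: D1 ∪ D2 = X, D1 ∩ D2 = ∅ (even |X| assumed)
    items = list(X)
    random.Random(seed).shuffle(items)
    half = len(items) // 2
    return items[:half], items[half:]

d1, d2 = split_training_set(range(10))
```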
a dual-branch multi-label correction neural network model is then established;
the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrast learning training to obtain the optimized dual-branch multi-label correction neural network model; the specific method is as follows:
S3.1: a picture x_i^1 from the first sub-training set D_1 and a picture x_i^2 from the second sub-training set D_2 are input jointly into the dual-branch multi-label correction neural network model, where i satisfies 1 ≤ i ≤ n and n is the number of pictures in the first sub-training set D_1 or the second sub-training set D_2;
S3.2: the feature extractors of the first label correction sub-model M_1 and of the second label correction sub-model M_2 are used to extract features from the input pictures x_i^1 and x_i^2, obtaining the first feature f_1, the second feature f_2, the third feature f_3 and the fourth feature f_4;
S3.3: the first feature f_1 and the second feature f_2 are input jointly into the instance contrast learning module of the first label correction sub-model M_1, and the third feature f_3 and the fourth feature f_4 jointly into the instance contrast learning module of the second label correction sub-model M_2; the first contrast learning is performed between the first feature f_1 and the third feature f_3 of picture x_i^1 and between the second feature f_2 and the fourth feature f_4 of picture x_i^2; the first loss function L_1 is set and the parameters of the instance contrast learning modules of M_1 and M_2 are updated, specifically:
for picture x_i^1, the corresponding first feature vector v_1 and second feature vector v_2 are computed from the first feature f_1 and the third feature f_3, where C_1 is the number of pseudo labels of picture x_i^1 and the j-th of its C_1 feature vectors is denoted v_j; the obtained first feature vector v_1 and second feature vector v_2 each satisfy the dimensional constraint of the dimension-reduction layer;
a first positive sample pair (v_1, v_2) is constructed from the first feature vector v_1 and the second feature vector v_2, together with a first circular sequence Q_1 of sequence length R_1; in this embodiment, R_1 = 8192; first negative sample pairs are constructed from the first circular sequence Q_1, and the first contrast learning is performed with the constructed first positive sample pair and first negative sample pairs;
the first loss function L_1 is set and the parameters of the instance contrast learning module of the first label correction sub-model M_1 are updated, where L_1(x_i^1) is the first loss function value of picture x_i^1 in the instance contrast learning module of M_1, K is the total number of categories required for the multi-label classification of x_i^1, k is the corresponding category index, τ is the temperature coefficient, v_1 and v_2 are the two dimension-reduced feature vectors of x_i^1, and ŷ_i,k is the value of the pseudo label of x_i^1 relative to the multi-label classification category k; in this embodiment, the dimension-reduced feature vectors have dimension 128 and the extracted features have dimension 2048;
for picture x_i^2, the corresponding third feature vector v_3 and fourth feature vector v_4 are computed from the second feature f_2 and the fourth feature f_4, where C_2 is the number of pseudo labels of picture x_i^2 and the j-th of its C_2 feature vectors is denoted v_j; the obtained third feature vector v_3 and fourth feature vector v_4 each satisfy the dimensional constraint of the dimension-reduction layer;
a second positive sample pair (v_3, v_4) is constructed from the third feature vector v_3 and the fourth feature vector v_4, together with a second circular sequence Q_2 of sequence length R_2; in this embodiment, R_2 = 8192; second negative sample pairs are constructed from the second circular sequence Q_2, and the first contrast learning is performed with the constructed second positive sample pair and second negative sample pairs;
the first loss function L_1 is set and the parameters of the instance contrast learning module of the second label correction sub-model M_2 are updated, where L_1(x_i^2) is the first loss function value of picture x_i^2 in the instance contrast learning module of M_2, K is the total number of categories required for the multi-label classification of x_i^2, k is the corresponding category index, v_3 and v_4 are the two dimension-reduced feature vectors of x_i^2, and ŷ_i,k is the value of the pseudo label of x_i^2 relative to the multi-label classification category k;
S3.4: the first feature f_1 is input into the class prototype contrast learning module of the first label correction sub-model M_1 and the second contrast learning is performed against the preset first class prototype features; the fourth feature f_4 is input into the class prototype contrast learning module of the second label correction sub-model M_2 and the second contrast learning is performed against the preset second class prototype features; the second loss function L_2 is set and the parameters of the class prototype contrast learning modules of M_1 and M_2 are updated, specifically:
the first feature f_1 is input into the class prototype contrast learning module of M_1, the second contrast learning is performed between the first feature vector v_1 of picture x_i^1 and the first class prototype features, and the first class prototype features are updated by the momentum method as p′_k = m·p_k + (1 − m)·v_1, where p′_k is the first class prototype feature corresponding to the k-th category after the update, p_k is the one before the update, and m is the preset momentum;
the second loss function L_2 is set and the parameters of the class prototype contrast learning module of M_1 are updated, where L_2(x_i^1) is the second loss function value of picture x_i^1 in the class prototype contrast learning module of M_1;
the fourth feature f_4 is input into the class prototype contrast learning module of M_2, the second contrast learning is performed between the second feature vector of picture x_i^2 and the second class prototype features, and the second class prototype features are updated by the momentum method in the same way, where q′_k is the second class prototype feature corresponding to the k-th category after the update and q_k is the one before the update;
the second loss function L_2 is set and the parameters of the class prototype contrast learning module of M_2 are updated, where L_2(x_i^2) is the second loss function value of picture x_i^2 in the class prototype contrast learning module of M_2;
S3.5: the first feature f_1 is input into the classifier of the first label correction sub-model M_1 and the classification probability of picture x_i^1 is computed as P_1 = σ(g(f_1)), where P_1 is the classification probability of picture x_i^1, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;
the fourth feature f_4 is input into the classifier of the second label correction sub-model M_2 and the classification probability of picture x_i^2 is computed as P_2 = σ(g(f_4)), where P_2 is the classification probability of picture x_i^2, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;
S3.6: the classification probability P_1 of picture x_i^1 is input into the label correction module of the first label correction sub-model M_1, its pseudo label is corrected and the corrected label of x_i^1 is obtained; the classification probability P_2 of picture x_i^2 is input into the label correction module of the second label correction sub-model M_2, its pseudo label is corrected and the corrected label of x_i^2 is obtained; the third loss function L_3 is set and the cross-entropy losses of the label correction modules of M_1 and M_2 are computed respectively to update their parameters, specifically:
the classification probability P_1 of picture x_i^1 is input into the label correction module of M_1; a first threshold δ_1, a second threshold δ_2, a third threshold δ_3 and a fourth threshold δ_4 are set and dynamically updated with the preset momentum m, the third threshold δ_3 and the fourth threshold δ_4 being given preset initial values in this embodiment;
the value of the binary noise label b is determined according to the updated third threshold δ_3, the fourth threshold δ_4 and the classification probability P_1 of picture x_i^1;
the intermediate label of picture x_i^1 is obtained according to the updated first threshold δ_1 and second threshold δ_2;
when the noise label b indicates label noise, the pseudo label of picture x_i^1 is replaced with its intermediate label as the corrected label of x_i^1; otherwise the pseudo label of picture x_i^1 is kept as the corrected label of x_i^1;
the correction process in the label correction module of the second label correction sub-model M_2 is analogous: the classification probability P_2 of picture x_i^2 is input into the label correction module of M_2; the value of the binary noise label b is determined according to the updated third threshold δ_3, the fourth threshold δ_4 and the classification probability P_2 of picture x_i^2; the intermediate label of picture x_i^2 is obtained according to the updated first threshold δ_1 and second threshold δ_2; when the noise label b indicates label noise, the pseudo label of picture x_i^2 is replaced with its intermediate label as the corrected label of x_i^2; otherwise the pseudo label of picture x_i^2 is kept as the corrected label of x_i^2;
the third loss function L_3 is the binary cross-entropy loss, where L_3(i) is the binary cross-entropy loss of the i-th picture;
S3.7: the total loss function L_total is set according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and the parameters of the dual-branch multi-label correction neural network model are updated to obtain the optimized dual-branch multi-label correction neural network model;
the total loss function L_total combines the three losses, where L_total is the total loss function value, λ_1 is the balance factor of the first loss function L_1 and λ_2 is the balance factor of the second loss function L_2, λ_1 and λ_2 taking preset values in this embodiment;
Finally, a noisy picture to be corrected is acquired and corrected with the optimized dual-branch multi-label correction neural network model to obtain its corrected label, and image recognition is performed on the noisy picture according to the corrected label.
According to the method, related pictures can be collected from the Internet as data sets for a user's specific application, the dual-branch network can be trained, and a model supporting multi-label picture classification can be constructed; labels of multi-label noisy data sets can be corrected, saving labor and material costs and making efficient use of data resources. The invention further provides a contrast learning method in which the two branch networks, while remaining distinct, learn common representations from each other, and the predictions of the two models are averaged when classifying pictures, making the result more robust. In addition, the invention prescribes upper and lower bounds on the predicted values of training pictures and changes the labels of pictures whose predicted values exceed or fall below the thresholds, thereby weakening the noise and avoiding overfitting to it.
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention is intended to be covered by the protection of the following claims.
Claims (7)
1. The multi-label image recognition method based on deep learning under noisy data is characterized by comprising the following steps:
s1: the method comprises the steps of obtaining a multi-label noisy data set and preprocessing, wherein the specific method comprises the following steps:
acquiring a multi-label noisy data set according to preset K multi-label classification categories;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, and each picture is marked with a pseudo label The training set is marked as X; dividing the training set into two first sub-training sets D with the same number of pictures 1 And a second sub-training set D 2, wherein ,/>,,/>,/>Representing the i picture->And its corresponding pseudo tag->;
Determining length and width data and pseudo tags of pictures in each sub-training setWherein the length of the picture is denoted as H and the width of the picture is denoted as W; finishing preprocessing of the multi-label noisy data set;
s2: the method comprises the steps of establishing a double-branch multi-label correction neural network model, specifically:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M which is arranged in parallel 1 And a second label modifier model M 2 The method comprises the steps of carrying out a first treatment on the surface of the The first label modifier model M 1 And a second label modifier model M 2 The structure of the model is the same and the model parameters are different;
the first label modifier model M 1 Or a second label modifier model M 2 The system comprises a feature extractor, an example comparison learning module, a category prototype comparison learning module, a classifier and a label correction module which are connected in sequence;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for comparison learning training to obtain an optimized double-branch multi-label correction neural network model, wherein the specific method comprises the following steps of:
S3.1: will first sub training set D 1 In a picture ofAnd a second sub-training set D 2 Picture->Common input into a two-branch, multi-tag modified neural network model, wherein +.>Satisfy->,/>For the first sub-training set D 1 Or a second sub-training set D 2 The number of pictures in (a);
s3.2: modifying the submodel M by using the first label respectively 1 And a second label modifier model M 2 The feature extractor of (1) is used for inputting picturesAnd picture->Extracting features to obtain first features ∈>And second feature->And third feature->And fourth feature->;
S3.3: will first featureAnd second feature->Common input of first tag modifier sub-model M 1 Is to add the third feature +.>And fourth feature->Common input of a second tag modifier sub-model M 2 Is to picture +.>Is>And third feature->Performing first contrast learning, and performing +.>Second feature->And fourth feature->Performing first contrast learning, and setting a first loss function +.>Correction of the first label sub-model M 1 And a second label modifier model M 2 The instance comparison learning module of (a) performs parameter updating;
s3.4: will first featureInputting a first label modifier model M 1 Is compared with a preset first category prototype feature>Performing a second contrast learning to obtain a fourth characteristic +.>Inputting a second label modifier model M 2 Category prototype comparison learning module of (2) and a preset second category prototype feature +.>Performing a second contrast learning and setting a second loss function +.>Correction of the first label sub-model M 1 And a second label modifier model M 2 The category prototype comparison learning module of (1) performs parameter updating;
s3.5: will first featureInputting a first label modifier model M 1 In the classifier of (2) calculating the output picture +.>Classification probability of (c); fourth feature->Inputting a second label modifier model M 2 In the classifier of (2) calculating the output picture +.>Classification probability of (c);
s3.6: picture is madeIs input into a first label modifier sub-model M 1 The label correction module of (1) for picture->Pseudo tag of->Performing label correction to obtain picture->Is->The method comprises the steps of carrying out a first treatment on the surface of the Picture->Is input into a second label modifier sub-model M 2 The label correction module of (1) for picture->Pseudo tag of->Performing label correction to obtain picture->Is->The method comprises the steps of carrying out a first treatment on the surface of the And sets a third loss function->Respectively calculating a first label correction sub-model M 1 And a second label correction sub-model M 2 The cross entropy loss of the label correction module of the (2) is used for carrying out parameter updating;
s3.7: according to a first loss functionSecond loss function->And a third loss function->Setting a total loss functionParameter updating is carried out on the double-branch multi-label correction neural network model, and an optimized double-branch multi-label correction neural network model is obtained;
S4: obtaining a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model to obtain the corrected label of the noisy picture, and performing image recognition on the noisy picture according to the corrected label.
2. The deep-learning-based multi-label image recognition method under noisy data according to claim 1, wherein the values of the pseudo labels of the pictures in each sub-training set are determined as follows: judging whether a picture in each sub-training set belongs to a preset multi-label classification category k; if so, the pseudo label of the i-th picture with respect to multi-label classification category k takes the value 1, and otherwise it takes the value 0.
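The pseudo-label initialization described in claim 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name, the list-of-sets input format, and the `num_classes` parameter are assumptions.

```python
def init_pseudo_labels(picture_categories, num_classes):
    """For each picture, build a binary pseudo-label vector over the
    multi-label categories: entry k is 1 if the picture is annotated
    with category k, and 0 otherwise.

    picture_categories: list of sets of (possibly noisy) category ids,
    one set per picture.
    """
    labels = []
    for cats in picture_categories:
        labels.append([1 if k in cats else 0 for k in range(num_classes)])
    return labels
```

Under this sketch, a picture annotated with categories {0, 2} out of 4 categories receives the pseudo label [1, 0, 1, 0].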
3. The deep-learning-based multi-label image recognition method under noisy data according to claim 2, wherein the specific method of step S3.3 is as follows:
inputting the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M1, and inputting the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M2;
for each picture, computing the corresponding first feature vector and second feature vector from the first feature and the third feature, where C1 is the number of pseudo labels of the picture and j indexes the picture's C1 feature vectors; the obtained first feature vector and second feature vector each satisfy the corresponding constraint;
constructing a first positive sample pair from the first feature vector and the second feature vector, and constructing a first cyclic sequence, where R1 is the length of the first cyclic sequence;
constructing a first negative sample pair from the first cyclic sequence, and performing the first contrastive learning with the constructed first positive sample pair and first negative sample pair;
setting the first loss function and updating the parameters of the instance contrastive learning module of the first label correction sub-model M1; the first loss function value of a picture in this module is computed from the total number of categories required for multi-label classification, the corresponding category, a temperature coefficient, the 1st and 2nd dimension-reduced feature vectors of the picture for that category, and the value of the picture's pseudo label with respect to multi-label classification category k;
for each picture, computing the corresponding third feature vector and fourth feature vector from the second feature and the fourth feature, where C2 is the number of pseudo labels of the picture and j indexes the picture's C2 feature vectors; the obtained third feature vector and fourth feature vector each satisfy the corresponding constraint;
constructing a second positive sample pair from the third feature vector and the fourth feature vector, and constructing a second cyclic sequence, where R2 is the length of the second cyclic sequence;
constructing a second negative sample pair from the second cyclic sequence, and performing the first contrastive learning with the constructed second positive sample pair and second negative sample pair;
setting the first loss function and updating the parameters of the instance contrastive learning module of the second label correction sub-model M2; the first loss function value of a picture in this module is computed from the total number of categories required for multi-label classification, the corresponding category, a temperature coefficient, the 2nd and 1st dimension-reduced feature vectors of the picture for that category, and the value of the picture's pseudo label with respect to the corresponding multi-label classification category.
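The first loss described in claim 3 is a temperature-scaled contrastive loss over constructed positive and negative pairs. The exact formula is not recoverable from this text, so the following is a sketch of a generic InfoNCE-style loss under that assumption; the function name, the plain-list vector format, and the default temperature are illustrative.

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss for one anchor:
    -log( exp(sim(a,p)/tau) / (exp(sim(a,p)/tau) + sum_n exp(sim(a,n)/tau)) ),
    where sim is cosine similarity and tau is the temperature coefficient."""
    def l2_normalize(v):
        s = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / s for x in v]

    def sim(u, v):
        return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))

    pos = math.exp(sim(anchor, positive) / tau)
    den = pos + sum(math.exp(sim(anchor, n) / tau) for n in negatives)
    return -math.log(pos / den)
```

With a positive pair that matches the anchor and a strongly dissimilar negative, the loss approaches zero; pulling the positive away or the negatives closer increases it, which is the pressure the instance contrastive learning module applies to the two branches' features.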
4. The deep-learning-based multi-label image recognition method under noisy data according to claim 3, wherein the specific method of step S3.4 is as follows:
inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M1, performing the second contrastive learning between the first feature vector of the picture and the first category prototype feature, and updating the first category prototype feature by the momentum method, where the updated first category prototype feature of the k-th category is obtained from the current first category prototype feature of the k-th category using a preset momentum m;
setting the second loss function and updating the parameters of the category prototype contrastive learning module of the first label correction sub-model M1, where the second loss function value is computed for each picture in the category prototype contrastive learning module of the first label correction sub-model M1;
inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M2, performing the second contrastive learning between the second feature vector of the picture and the second category prototype feature, and updating the second category prototype feature by the momentum method, where the updated second category prototype feature of each category is obtained from the current second category prototype feature of that category;
setting the second loss function and updating the parameters of the category prototype contrastive learning module of the second label correction sub-model M2, where the second loss function value is computed for each picture in the category prototype contrastive learning module of the second label correction sub-model M2.
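Claim 4's momentum update of a category prototype can be sketched as follows. The convex-combination form `m * old + (1 - m) * feature` is an assumption (the patent's exact update formula is an image lost from this text); it is the usual shape of a momentum-method prototype update.

```python
def momentum_update(prototype, feature, m=0.9):
    """One momentum update of a class prototype toward a feature vector:
    p_new = m * p_old + (1 - m) * f  (convex-combination form assumed).
    A momentum m close to 1 makes the prototype drift slowly, smoothing
    out noise in individual features."""
    return [m * p + (1.0 - m) * f for p, f in zip(prototype, feature)]
```

For example, updating a zero prototype toward the feature [1.0, 1.0] with m = 0.9 moves it only a tenth of the way, to [0.1, 0.1].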
5. The deep-learning-based multi-label image recognition method under noisy data according to claim 4, wherein the specific method of step S3.5 is as follows:
inputting the first feature into the classifier of the first label correction sub-model M1 and computing the output classification probability of the picture, where the classification probability of the picture is obtained by applying the sigmoid function to the confidence score computed by the classifier;
inputting the fourth feature into the classifier of the second label correction sub-model M2 and computing the output classification probability of the picture, where the classification probability of the picture is likewise obtained by applying the sigmoid function to the confidence score computed by the classifier.
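Claim 5's sigmoid-over-confidence-scores step is standard for multi-label classifiers, since each category is scored independently rather than through a softmax. A minimal sketch (function name illustrative):

```python
import math

def classification_probability(scores):
    """Convert classifier confidence scores into per-category probabilities
    with the sigmoid function, one independent probability per category."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]
```

A score of 0 maps to probability 0.5, large positive scores approach 1, and large negative scores approach 0.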
6. The deep-learning-based multi-label image recognition method under noisy data according to claim 5, wherein the specific method of step S3.6 is as follows:
inputting the classification probability of the picture into the label correction module of the first label correction sub-model M1, setting a first threshold, a second threshold, a third threshold and a fourth threshold, and dynamically updating the four thresholds with a preset momentum m;
determining the value of the binary noise label according to the updated third threshold and fourth threshold together with the classification probability of the picture;
obtaining the intermediate label of the picture according to the updated first threshold and second threshold;
when the noise label indicates that the pseudo label is noisy, replacing the pseudo label of the picture with its intermediate label as the corrected label of the picture; when the noise label indicates that the pseudo label is clean, retaining the pseudo label of the picture as its corrected label;
inputting the classification probability of the picture into the label correction module of the second label correction sub-model M2;
determining the value of the binary noise label according to the updated third threshold and fourth threshold together with the classification probability of the picture; obtaining the intermediate label of the picture according to the updated first threshold and second threshold; when the noise label indicates that the pseudo label is noisy, replacing the pseudo label of the picture with its intermediate label as the corrected label of the picture; when the noise label indicates that the pseudo label is clean, retaining the pseudo label of the picture as its corrected label;
the third loss function is the binary cross-entropy loss, where the loss of the i-th picture is computed from the value of the pseudo label of the i-th picture with respect to multi-label classification category k.
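The threshold logic of claim 6 can be sketched for a single category as follows. The exact threshold semantics are not recoverable from this text, so the interpretation here is an assumption: the noise flag fires when the classifier strongly contradicts the pseudo label (probability above the third threshold while the pseudo label is 0, or below the fourth threshold while it is 1), and the intermediate label is set from the first and second thresholds. All names and default threshold values are illustrative.

```python
def correct_label(prob, pseudo, th1=0.7, th2=0.3, th3=0.9, th4=0.1):
    """One-category label correction under assumed threshold semantics.
    prob: classifier probability for this category; pseudo: 0/1 pseudo label.
    Returns the corrected 0/1 label."""
    # Binary noise flag: classifier strongly disagrees with the pseudo label.
    noisy = (prob > th3 and pseudo == 0) or (prob < th4 and pseudo == 1)
    # Intermediate label from the first/second thresholds.
    if prob > th1:
        intermediate = 1
    elif prob < th2:
        intermediate = 0
    else:
        intermediate = pseudo
    # Replace only when flagged as noisy; otherwise keep the pseudo label.
    return intermediate if noisy else pseudo
```

Under this sketch, a confident prediction that contradicts the pseudo label flips it, while an uncertain prediction leaves the pseudo label untouched.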
7. The deep-learning-based multi-label image recognition method under noisy data according to claim 6, wherein the total loss function in step S3.7 is the sum of the third loss function, the first loss function weighted by its balance factor, and the second loss function weighted by its balance factor.
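The total loss of claim 7 reduces to a one-line combination. The exact formula is an image lost from this text; the form below (unweighted third loss plus balance-factor-weighted first and second losses) follows from the balance factors the claim lists, and the parameter names are illustrative.

```python
def total_loss(l1, l2, l3, alpha=1.0, beta=1.0):
    """Assumed form of the total loss: the third (cross-entropy) loss plus
    the first and second contrastive losses scaled by their balance factors
    alpha and beta."""
    return l3 + alpha * l1 + beta * l2
```

The balance factors trade off how strongly the two contrastive objectives regularize the label-correction objective during training.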
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310299402.5A CN116012569B (en) | 2023-03-24 | 2023-03-24 | Multi-label image recognition method based on deep learning and under noisy data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116012569A CN116012569A (en) | 2023-04-25 |
CN116012569B true CN116012569B (en) | 2023-08-15 |
Family
ID=86032175
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310299402.5A Active CN116012569B (en) | 2023-03-24 | 2023-03-24 | Multi-label image recognition method based on deep learning and under noisy data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116012569B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416382A (en) * | 2018-03-01 | 2018-08-17 | Nankai University | Method for training convolutional neural networks on Web images based on iterative sampling and multi-label correction |
CN113688949A (en) * | 2021-10-25 | 2021-11-23 | 南京码极客科技有限公司 | Network image data set denoising method based on dual-network joint label correction |
CN114692732A (en) * | 2022-03-11 | 2022-07-01 | 华南理工大学 | Method, system, device and storage medium for updating online label |
CN115147670A (en) * | 2021-03-15 | 2022-10-04 | 华为技术有限公司 | Object processing method and device |
CN115331088A (en) * | 2022-10-13 | 2022-11-11 | 南京航空航天大学 | Robust learning method based on class labels with noise and imbalance |
CN115496948A (en) * | 2022-09-23 | 2022-12-20 | 广东工业大学 | Network supervision fine-grained image identification method and system based on deep learning |
CN115809697A (en) * | 2022-12-26 | 2023-03-17 | 上海高德威智能交通系统有限公司 | Data correction method and device and electronic equipment |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11748613B2 (en) * | 2019-05-10 | 2023-09-05 | Baidu Usa Llc | Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning |
US11263476B2 (en) * | 2020-03-19 | 2022-03-01 | Salesforce.Com, Inc. | Unsupervised representation learning with contrastive prototypes |
US20220067506A1 (en) * | 2020-08-28 | 2022-03-03 | Salesforce.Com, Inc. | Systems and methods for partially supervised learning with momentum prototypes |
US20220156591A1 (en) * | 2020-11-13 | 2022-05-19 | Salesforce.Com, Inc. | Systems and methods for semi-supervised learning with contrastive graph regularization |
US20220188645A1 (en) * | 2020-12-16 | 2022-06-16 | Oracle International Corporation | Using generative adversarial networks to construct realistic counterfactual explanations for machine learning models |
Non-Patent Citations (1)
Title |
---|
A survey of label-noise robust learning algorithms; Gong Chen et al.; Aero Weaponry; Vol. 27, No. 3; pp. 20-26 *
Also Published As
Publication number | Publication date |
---|---|
CN116012569A (en) | 2023-04-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Better and faster: knowledge transfer from multiple self-supervised learning tasks via graph distillation for video classification | |
CN109993100B (en) | Method for realizing facial expression recognition based on deep feature clustering | |
CN110210468B (en) | Character recognition method based on convolutional neural network feature fusion migration | |
CN113378706B (en) | Drawing system for assisting children in observing plants and learning biological diversity | |
CN112651940B (en) | Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network | |
CN113673482B (en) | Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution | |
CN113434688B (en) | Data processing method and device for public opinion classification model training | |
CN111079847A (en) | Remote sensing image automatic labeling method based on deep learning | |
CN112712127A (en) | Image emotion polarity classification method combined with graph convolution neural network | |
CN115331284A (en) | Self-healing mechanism-based facial expression recognition method and system in real scene | |
CN113657267A (en) | Semi-supervised pedestrian re-identification model, method and device | |
CN114548256A (en) | Small sample rare bird identification method based on comparative learning | |
CN112949929A (en) | Knowledge tracking method and system based on collaborative embedded enhanced topic representation | |
CN112183464A (en) | Video pedestrian identification method based on deep neural network and graph convolution network | |
CN116152554A (en) | Knowledge-guided small sample image recognition system | |
CN113010683A (en) | Entity relationship identification method and system based on improved graph attention network | |
CN114782752A (en) | Small sample image grouping classification method and device based on self-training | |
CN110175631A (en) | A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix | |
CN116051924B (en) | Divide-and-conquer defense method for image countermeasure sample | |
CN116012569B (en) | Multi-label image recognition method based on deep learning and under noisy data | |
CN113592045B (en) | Model adaptive text recognition method and system from printed form to handwritten form | |
CN114120367A (en) | Pedestrian re-identification method and system based on circle loss measurement under meta-learning framework | |
CN115100694A (en) | Fingerprint quick retrieval method based on self-supervision neural network | |
CN111695526B (en) | Network model generation method, pedestrian re-recognition method and device | |
CN114419529A (en) | Cross-modal pedestrian re-identification method and system based on distribution space alignment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||