CN116012569B - Multi-label image recognition method based on deep learning and under noisy data - Google Patents

Multi-label image recognition method based on deep learning and under noisy data

Info

Publication number
CN116012569B
Authority
CN
China
Prior art keywords
label
picture
feature
model
tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310299402.5A
Other languages
Chinese (zh)
Other versions
CN116012569A (en)
Inventor
陈添水
徐志华
黄衍聪
柯梓铭
付晨博
范耀洲
杨志景
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202310299402.5A priority Critical patent/CN116012569B/en
Publication of CN116012569A publication Critical patent/CN116012569A/en
Application granted granted Critical
Publication of CN116012569B publication Critical patent/CN116012569B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a double-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model; and acquiring a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected with the optimized double-branch multi-label correction neural network model, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label. The method can carry out label correction on a multi-label noisy data set, saves the cost of manpower and material resources, and realizes efficient utilization of data resources; at the same time, the prediction results are more robust; in addition, the invention prescribes upper and lower bounds for the predicted values of training pictures, so that noise can be weakened and overfitting to the noise avoided.

Description

Multi-label image recognition method based on deep learning and under noisy data
Technical Field
The invention relates to the technical field of computer vision and image multi-label classification, in particular to a multi-label image recognition method based on deep learning and under noisy data.
Background
With the continuous development of Internet technology, artificial intelligence technology has matured, and deep learning has become one of the hottest branches of artificial intelligence. Deep learning is popular because of its excellent performance, abundant frameworks, convenient invocation and low barrier to entry. However, conventional deep learning algorithms require a large number of manually labeled samples as data sets; such data sets typically have a large sample size, often up to tens or even hundreds of thousands of samples, and require that the label of each sample be accurate. Thus, creating a quality data set suitable for training requires significant human and capital costs, which is a major impediment to the further development of deep learning. On the other hand, there is a large amount of data containing label noise on the Internet, that is, the labels of part of the data are erroneous, and such data can easily be obtained with a crawler. Traditional deep learning algorithms can only be trained on data whose labels are clean and correct; they cannot use multi-label noisy data, which results in a waste of data resources.
Taking the recognition of orange pictures as an example, analysis shows that many pictures labeled "orange" on the web are mislabeled: for example, pictures of lemons, which are similar to oranges in shape and appearance, are labeled "orange"; this kind of mislabeling is called the first type of mislabeling. Alternatively, an object far removed from an orange, such as an orange-colored sunset, is labeled "orange"; this kind of mislabeling is called the second type of mislabeling. If data with such erroneous labels are used directly to train a traditional deep learning network, the network learns a great deal of erroneous data, so the generalization ability of the model is poor and the model is difficult to deploy in practice. Faced with this, there are two ways to improve the situation: first, relabel the pictures manually, which consumes great manpower and material resources; second, discard this part of the data set directly, which wastes data resources.
Therefore, how to conveniently train neural networks with noisy data sets is one of the problems to be solved in the further development of deep learning, and is also a development trend in the big data age.
The prior art discloses a weakly supervised image multi-label classification method based on meta learning, which provides an image multi-label classification model based on label information enhancement; it adopts a neural network with an encoding-decoding architecture and judges, in a sequence-labeling manner, whether each label in the label sequence is relevant, so as to obtain the relevant labels of the image. Aiming at model overfitting caused by insufficient supervision information in a weakly supervised environment, a teacher-student network training method based on meta learning is also provided, further improving the accuracy of image annotation. However, this prior-art method only addresses the problem that effective modeling cannot be achieved when labels are missing; it cannot effectively correct images with no labels or with erroneous labels, and its annotation accuracy on data sets containing a large amount of noisy and erroneous labels is low.
Disclosure of Invention
The invention provides a multi-label image recognition method based on deep learning and under noisy data, which aims to overcome the poor label-correction performance of the prior art on data sets containing many noisy labels; it can perform label correction on a multi-label noisy data set, saves the cost of manpower and material resources, and realizes efficient utilization of data resources.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a multi-label image recognition method based on deep learning under noisy data comprises the following steps:
s1: the method comprises the steps of obtaining a multi-label noisy data set and preprocessing, wherein the specific method comprises the following steps:
acquiring a multi-label noisy data set according to preset K multi-label classification categories;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is annotated with a pseudo label ỹ_i, and the training set is denoted X; dividing the training set into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, wherein X = D_1 ∪ D_2 and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo label ỹ_i;
determining the length and width data and the pseudo label ỹ_i of each picture in each sub-training set, wherein the length of a picture is denoted H and the width is denoted W; the preprocessing of the multi-label noisy data set is thus completed;
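To make step S1 concrete, the following is a minimal Python sketch of the preprocessing described above; it assumes the data set has already been downloaded as (picture, pseudo-label) pairs, and all function and variable names are illustrative rather than taken from the patent.

```python
import random

def preprocess_noisy_dataset(samples, val_ratio=0.1, seed=0):
    """Split a multi-label noisy data set into a training set, a verification
    set, and two equally sized sub-training sets D1 and D2 (step S1).

    `samples` is assumed to be a list of (picture_path, pseudo_label_vector)
    pairs, where the pseudo label vector has one entry per preset category.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_ratio)
    val_set = shuffled[:n_val]
    train_set = shuffled[n_val:]       # the training set X with N pictures
    half = len(train_set) // 2
    d1 = train_set[:half]              # first sub-training set D1
    d2 = train_set[half:2 * half]      # second sub-training set D2, |D1| == |D2|
    return d1, d2, val_set
```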
S2: the method comprises the steps of establishing a double-branch multi-label correction neural network model, specifically:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M_1 and a second label correction sub-model M_2 arranged in parallel; the first label correction sub-model M_1 and the second label correction sub-model M_2 have the same structure but different model parameters;
the first label correction sub-model M_1 or the second label correction sub-model M_2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module which are connected in sequence;
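The following is a hedged PyTorch sketch of one possible realization of the dual-branch structure of step S2. The ResNet-50 backbone, the 2048-dimensional feature, the 128-dimensional projection and all class and attribute names are assumptions (the two dimensions are chosen to be consistent with the embodiment described later), not a verbatim implementation of the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class LabelCorrectionSubModel(nn.Module):
    """One branch (M1 or M2): feature extractor, projection head used by the
    two contrastive learning modules, per-class prototype buffer, classifier."""
    def __init__(self, num_classes, feat_dim=2048, proj_dim=128):
        super().__init__()
        backbone = models.resnet50(weights=None)               # assumed backbone
        self.feature_extractor = nn.Sequential(*list(backbone.children())[:-1])
        self.projector = nn.Linear(feat_dim, proj_dim)         # dimension reduction
        self.classifier = nn.Linear(feat_dim, num_classes)     # confidence scores
        # one prototype feature per category, updated with momentum during training
        self.register_buffer("prototypes", torch.zeros(num_classes, proj_dim))

    def forward(self, x):
        f = self.feature_extractor(x).flatten(1)               # picture feature
        z = F.normalize(self.projector(f), dim=1)              # projected feature vector
        logits = self.classifier(f)
        return f, z, logits

class DualBranchModel(nn.Module):
    """Two sub-models with the same structure but independent parameters."""
    def __init__(self, num_classes):
        super().__init__()
        self.m1 = LabelCorrectionSubModel(num_classes)
        self.m2 = LabelCorrectionSubModel(num_classes)
```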
S3: inputting the preprocessed multi-label noisy data set into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model, wherein the specific method comprises the following steps:
S3.1: inputting a picture x_i^1 from the first sub-training set D_1 and a picture x_i^2 from the second sub-training set D_2 jointly into the double-branch multi-label correction neural network model, wherein i satisfies 1 ≤ i ≤ N/2, N/2 being the number of pictures in the first sub-training set D_1 or the second sub-training set D_2;
S3.2: extracting features from the input pictures x_i^1 and x_i^2 with the feature extractors of the first label correction sub-model M_1 and the second label correction sub-model M_2 respectively, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: inputting the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M_2; performing first contrastive learning on the first feature and the third feature of picture x_i^1, and performing first contrastive learning on the second feature and the fourth feature of picture x_i^2; setting a first loss function L_1 to update the parameters of the instance contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2;
S3.4: inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1 to perform second contrastive learning with preset first category prototype features, and inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2 to perform second contrastive learning with preset second category prototype features; setting a second loss function L_2 to update the parameters of the category prototype contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2;
S3.5: inputting the first feature into the classifier of the first label correction sub-model M_1 to calculate and output the classification probability of picture x_i^1, and inputting the fourth feature into the classifier of the second label correction sub-model M_2 to calculate and output the classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1 to correct the pseudo label of picture x_i^1 and obtain the correction label of picture x_i^1; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2 to correct the pseudo label of picture x_i^2 and obtain the correction label of picture x_i^2; setting a third loss function L_3 and respectively calculating the cross-entropy losses of the label correction modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 to update the parameters;
S3.7: setting a total loss function L according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and updating the parameters of the double-branch multi-label correction neural network model to obtain the optimized double-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
Preferably, the specific method for determining the value of the pseudo label ỹ_i of each picture in each sub-training set is as follows:
judging whether the picture in each sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo label of the i-th picture with respect to multi-label classification category k is 1, and otherwise it is 0;
Preferably, the specific method of step S3.3 is as follows:
inputting the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M_2;
for picture x_i^1, calculating the corresponding first feature vector and second feature vector from the first feature and the third feature respectively, wherein C_1 is the number of pseudo labels of picture x_i^1 and the j-th of the C_1 feature vectors of picture x_i^1 is used in the calculation;
constructing a first positive sample pair from the obtained first feature vector and second feature vector, and constructing a first cyclic sequence whose length is R_1;
constructing first negative sample pairs from the first cyclic sequence, and performing the first contrastive learning with the constructed first positive sample pair and first negative sample pairs;
setting the first loss function L_1 to update the parameters of the instance contrastive learning module of the first label correction sub-model M_1, wherein the loss is the first loss function value of picture x_i^1 in the instance contrastive learning module of the first label correction sub-model M_1, and its terms involve the total number K of categories required for multi-label classification of picture x_i^1, the categories corresponding to picture x_i^1, a temperature coefficient τ, the first and second dimension-reduced feature vectors of picture x_i^1, and the value of the pseudo label of picture x_i^1 with respect to multi-label category k;
for picture x_i^2, calculating the corresponding third feature vector and fourth feature vector from the second feature and the fourth feature respectively, wherein C_2 is the number of pseudo labels of picture x_i^2 and the j-th of the C_2 feature vectors of picture x_i^2 is used in the calculation;
constructing a second positive sample pair from the obtained third feature vector and fourth feature vector, and constructing a second cyclic sequence whose length is R_2;
constructing second negative sample pairs from the second cyclic sequence, and performing the first contrastive learning with the constructed second positive sample pair and second negative sample pairs;
setting the first loss function L_1 to update the parameters of the instance contrastive learning module of the second label correction sub-model M_2, wherein the loss is the first loss function value of picture x_i^2 in the instance contrastive learning module of the second label correction sub-model M_2, and its terms involve the total number of categories required for multi-label classification of picture x_i^2, the categories corresponding to picture x_i^2, the temperature coefficient τ, the third and fourth dimension-reduced feature vectors of picture x_i^2, and the value of the pseudo label of picture x_i^2 with respect to its multi-label category.
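As an illustration of the instance contrastive learning in step S3.3, here is a PyTorch sketch of an InfoNCE-style loss with a fixed-length cyclic queue of negatives. The patent's exact loss, which also involves the pseudo-label values and the per-category terms listed above, is given by its own formulas, so this should be read as a generic stand-in under stated assumptions.

```python
import torch
import torch.nn.functional as F

def instance_contrastive_loss(z_a, z_b, queue, temperature=0.5):
    """InfoNCE-style first contrastive loss: the two branches' projected
    feature vectors of the same picture form the positive pair, and the
    entries of the cyclic sequence (a fixed-length feature queue) act as
    negatives. The pseudo-label weighting of the patent's formula is omitted.

    z_a, z_b : (B, d) L2-normalized feature vectors from the two branches
    queue    : (R, d) L2-normalized negative features
    """
    pos = (z_a * z_b).sum(dim=1, keepdim=True) / temperature   # (B, 1)
    neg = (z_a @ queue.t()) / temperature                      # (B, R)
    logits = torch.cat([pos, neg], dim=1)
    target = torch.zeros(z_a.size(0), dtype=torch.long, device=z_a.device)
    return F.cross_entropy(logits, target)                     # positive sits at index 0

@torch.no_grad()
def update_queue(queue, new_feats, ptr):
    """Enqueue the newest features into the fixed-length cyclic sequence."""
    r, b = queue.size(0), new_feats.size(0)
    idx = (ptr + torch.arange(b, device=queue.device)) % r
    queue[idx] = new_feats
    return (ptr + b) % r
```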
Preferably, the specific method of step S3.4 is as follows:
inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1, performing the second contrastive learning between the feature vector of picture x_i^1 and the first category prototype features, and updating the first category prototype features by a momentum method,
wherein the updated first category prototype feature corresponding to the k-th category is obtained from the previous first category prototype feature corresponding to the k-th category using a preset momentum m;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the first label correction sub-model M_1, wherein the loss is the second loss function value of picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M_1;
inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2, performing the second contrastive learning between the feature vector of picture x_i^2 and the second category prototype features, and updating the second category prototype features by the momentum method,
wherein the updated second category prototype feature corresponding to the k-th category is obtained from the previous second category prototype feature corresponding to the k-th category;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the second label correction sub-model M_2, wherein the loss is the second loss function value of picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M_2.
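For step S3.4, the sketch below shows one way to maintain momentum-updated class prototypes and a prototype-based contrastive loss in PyTorch; the explicit update rule p_k ← m·p_k + (1−m)·z and the softmax-over-prototypes loss are assumptions consistent with, but not quoted from, the patent's formulas.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def momentum_update_prototypes(prototypes, z, pseudo_labels, m=0.9):
    """Momentum update of the category prototype features: for every category
    marked positive in a picture's pseudo label, move that prototype toward
    the picture's projected feature vector, assuming p_k <- m*p_k + (1-m)*z."""
    for i in range(z.size(0)):
        pos = pseudo_labels[i].nonzero(as_tuple=True)[0]
        prototypes[pos] = m * prototypes[pos] + (1 - m) * z[i]
    prototypes.copy_(F.normalize(prototypes, dim=1))

def prototype_contrastive_loss(z, prototypes, pseudo_labels, temperature=0.5):
    """Second contrastive loss: pull a picture's feature vector toward the
    prototypes of its pseudo-positive categories and away from the others."""
    logits = (z @ prototypes.t()) / temperature                 # (B, K)
    log_prob = F.log_softmax(logits, dim=1)
    pos_mask = pseudo_labels.float()
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return loss.mean()
```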
Preferably, the specific method of step S3.5 is as follows:
inputting the first feature into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1, specifically: the classification probability of picture x_i^1 is obtained by applying the sigmoid function to the confidence score computed by the classifier for the first feature;
inputting the fourth feature into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2, specifically: the classification probability of picture x_i^2 is obtained by applying the sigmoid function to the confidence score computed by the classifier for the fourth feature.
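Step S3.5 thus reduces to applying a sigmoid to the classifier's confidence scores; a one-line PyTorch sketch (names assumed):

```python
import torch

def classification_probability(classifier, feature):
    """Classification probability of a picture: sigmoid of the classifier's
    per-category confidence scores (step S3.5)."""
    return torch.sigmoid(classifier(feature))
```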
Preferably, the specific method of step S3.6 is as follows:
inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1; presetting a first threshold, a second threshold, a third threshold and a fourth threshold, and dynamically updating the four thresholds with a preset momentum m;
determining the value of the binary noise label of picture x_i^1 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^1;
obtaining the intermediate label of picture x_i^1 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^1 with the intermediate label of picture x_i^1 as the correction label of picture x_i^1;
otherwise, retaining the pseudo label of picture x_i^1 as the correction label of picture x_i^1;
inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2;
determining the value of the binary noise label of picture x_i^2 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^2;
obtaining the intermediate label of picture x_i^2 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^2 with the intermediate label of picture x_i^2 as the correction label of picture x_i^2;
otherwise, retaining the pseudo label of picture x_i^2 as the correction label of picture x_i^2;
the third loss function L_3 is defined in terms of the binary cross-entropy loss of each picture,
wherein the binary cross-entropy loss of the i-th picture is computed with the value of the pseudo label of the i-th picture with respect to multi-label classification category k.
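The label correction of step S3.6 can be pictured with the following PyTorch sketch. The precise semantics of the four thresholds and of the binary noise label are defined by the patent's formulas, so the comparison rules used here (high-confidence disagreement marks an entry as noisy, and the first/second thresholds binarize the prediction into the intermediate label) are explicitly assumptions.

```python
import torch
import torch.nn.functional as F

def correct_labels(probs, pseudo_labels, t1, t2, t3, t4):
    """Threshold-based label correction in the spirit of step S3.6.

    Assumed semantics: the third/fourth thresholds flag an entry of the pseudo
    label as noisy when the prediction confidently disagrees with it, and the
    first/second thresholds binarize the classification probability into the
    intermediate label that replaces the noisy entries."""
    noisy = ((probs > t4) & (pseudo_labels == 0)) | ((probs < t3) & (pseudo_labels == 1))
    intermediate = torch.where(
        probs > t1, torch.ones_like(pseudo_labels),
        torch.where(probs < t2, torch.zeros_like(pseudo_labels), pseudo_labels))
    return torch.where(noisy, intermediate, pseudo_labels)

def third_loss(logits, labels):
    """Binary cross-entropy of each picture against its label vector."""
    return F.binary_cross_entropy_with_logits(logits, labels.float())
```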
Preferably, the total loss function L in step S3.7 is formed from the first loss function L_1, the second loss function L_2 and the third loss function L_3,
wherein L is the total loss function value, and the first loss function L_1 and the second loss function L_2 are each weighted by a balance factor in the total loss.
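A minimal sketch of the total objective of step S3.7, assuming the usual weighted-sum form with the two balance factors (called lambda1 and lambda2 here):

```python
def total_loss(l1, l2, l3, lambda1=1.0, lambda2=1.0):
    """Total objective of step S3.7, assuming the weighted-sum form with the
    two balance factors (called lambda1 and lambda2 here)."""
    return l3 + lambda1 * l1 + lambda2 * l2
```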
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing; establishing a double-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for comparison learning training to obtain an optimized double-branch multi-label correction neural network model; acquiring a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, acquiring a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to specific application of a user, a dual-branch network is trained, a model supporting classification of multi-label pictures is constructed, label correction and image recognition can be carried out on the multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is realized; the invention also provides a contrast learning method, which can learn some common characterizations from each other while the difference exists in the branch networks, and average the prediction of the model when classifying the pictures, so that the result is more robust; in addition, the invention prescribes the upper and lower bounds according to the predicted value of the training picture, and changes the label of the picture with the predicted value exceeding or being lower than the threshold value, thereby achieving the effect of weakening noise and avoiding the overfitting to the noise.
Drawings
Fig. 1 is a flowchart of a multi-label image recognition method under noisy data based on deep learning according to embodiment 1.
Fig. 2 is a contrastive learning training flowchart of the dual-branch multi-label correction neural network model provided in embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, the embodiment provides a multi-label image recognition method under noisy data based on deep learning, which includes the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
S3: inputting the preprocessed multi-label noisy data set into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, a multi-label noisy data set is first acquired and preprocessed; in this embodiment, the multi-label noisy data set is obtained from the Internet; a double-branch multi-label correction neural network model is established; the preprocessed multi-label noisy data set is input into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model; finally, a noise-containing picture to be corrected is obtained, the noise-containing picture to be corrected is corrected with the optimized double-branch multi-label correction neural network model to obtain its correction label, and image recognition is carried out on the noise-containing picture to be corrected according to the correction label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to specific application of a user, the dual-branch network is trained, a model supporting classification of the multi-label pictures is constructed, label correction can be carried out on the multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is achieved.
Example 2
The embodiment provides a multi-label image recognition method based on deep learning under noisy data, which comprises the following steps:
s1: the method comprises the steps of obtaining a multi-label noisy data set and preprocessing, wherein the specific method comprises the following steps:
acquiring a multi-label noisy data set according to preset K multi-label classification categories;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is annotated with a pseudo label ỹ_i, and the training set is denoted X; dividing the training set into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, wherein X = D_1 ∪ D_2 and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo label ỹ_i;
determining the length and width data and the pseudo label ỹ_i of each picture in each sub-training set, wherein the length of a picture is denoted H and the width is denoted W; the preprocessing of the multi-label noisy data set is thus completed;
s2: the method comprises the steps of establishing a double-branch multi-label correction neural network model, specifically:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M_1 and a second label correction sub-model M_2 arranged in parallel; the first label correction sub-model M_1 and the second label correction sub-model M_2 have the same structure but different model parameters;
the first label correction sub-model M_1 or the second label correction sub-model M_2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module which are connected in sequence;
S3: as shown in fig. 2, the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model, and the specific method is as follows:
S3.1: inputting a picture x_i^1 from the first sub-training set D_1 and a picture x_i^2 from the second sub-training set D_2 jointly into the dual-branch multi-label correction neural network model, wherein i satisfies 1 ≤ i ≤ N/2, N/2 being the number of pictures in the first sub-training set D_1 or the second sub-training set D_2;
S3.2: extracting features from the input pictures x_i^1 and x_i^2 with the feature extractors of the first label correction sub-model M_1 and the second label correction sub-model M_2 respectively, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: inputting the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M_2; performing first contrastive learning on the first feature and the third feature of picture x_i^1, and performing first contrastive learning on the second feature and the fourth feature of picture x_i^2; setting a first loss function L_1 to update the parameters of the instance contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2;
S3.4: inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1 to perform second contrastive learning with preset first category prototype features, and inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2 to perform second contrastive learning with preset second category prototype features; setting a second loss function L_2 to update the parameters of the category prototype contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2;
S3.5: inputting the first feature into the classifier of the first label correction sub-model M_1 to calculate and output the classification probability of picture x_i^1, and inputting the fourth feature into the classifier of the second label correction sub-model M_2 to calculate and output the classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1 to correct the pseudo label of picture x_i^1 and obtain the correction label of picture x_i^1; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2 to correct the pseudo label of picture x_i^2 and obtain the correction label of picture x_i^2; setting a third loss function L_3 and respectively calculating the cross-entropy losses of the label correction modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 to update the parameters;
S3.7: setting a total loss function L according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and updating the parameters of the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected with the optimized dual-branch multi-label correction neural network model, obtaining the correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label;
the specific method for determining the value of the pseudo label ỹ_i of each picture in each sub-training set is as follows:
judging whether the picture in each sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo label of the i-th picture with respect to multi-label classification category k is 1, and otherwise it is 0;
The specific method of the step S3.3 is as follows:
inputting the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M_2;
for picture x_i^1, calculating the corresponding first feature vector and second feature vector from the first feature and the third feature respectively, wherein C_1 is the number of pseudo labels of picture x_i^1 and the j-th of the C_1 feature vectors of picture x_i^1 is used in the calculation;
constructing a first positive sample pair from the obtained first feature vector and second feature vector, and constructing a first cyclic sequence whose length is R_1;
constructing first negative sample pairs from the first cyclic sequence, and performing the first contrastive learning with the constructed first positive sample pair and first negative sample pairs;
setting the first loss function L_1 to update the parameters of the instance contrastive learning module of the first label correction sub-model M_1, wherein the loss is the first loss function value of picture x_i^1 in the instance contrastive learning module of the first label correction sub-model M_1, and its terms involve the total number K of categories required for multi-label classification of picture x_i^1, the categories corresponding to picture x_i^1, a temperature coefficient τ, the first and second dimension-reduced feature vectors of picture x_i^1, and the value of the pseudo label of picture x_i^1 with respect to multi-label category k;
for picture x_i^2, calculating the corresponding third feature vector and fourth feature vector from the second feature and the fourth feature respectively, wherein C_2 is the number of pseudo labels of picture x_i^2 and the j-th of the C_2 feature vectors of picture x_i^2 is used in the calculation;
constructing a second positive sample pair from the obtained third feature vector and fourth feature vector, and constructing a second cyclic sequence whose length is R_2;
constructing second negative sample pairs from the second cyclic sequence, and performing the first contrastive learning with the constructed second positive sample pair and second negative sample pairs;
setting the first loss function L_1 to update the parameters of the instance contrastive learning module of the second label correction sub-model M_2, wherein the loss is the first loss function value of picture x_i^2 in the instance contrastive learning module of the second label correction sub-model M_2, and its terms involve the total number of categories required for multi-label classification of picture x_i^2, the categories corresponding to picture x_i^2, the temperature coefficient τ, the third and fourth dimension-reduced feature vectors of picture x_i^2, and the value of the pseudo label of picture x_i^2 with respect to its multi-label category;
the specific method of the step S3.4 is as follows:
inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1, performing the second contrastive learning between the feature vector of picture x_i^1 and the first category prototype features, and updating the first category prototype features by a momentum method,
wherein the updated first category prototype feature corresponding to the k-th category is obtained from the previous first category prototype feature corresponding to the k-th category using a preset momentum m;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the first label correction sub-model M_1, wherein the loss is the second loss function value of picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M_1;
inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2, performing the second contrastive learning between the feature vector of picture x_i^2 and the second category prototype features, and updating the second category prototype features by the momentum method,
wherein the updated second category prototype feature corresponding to the k-th category is obtained from the previous second category prototype feature corresponding to the k-th category;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the second label correction sub-model M_2, wherein the loss is the second loss function value of picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M_2;
the specific method of the step S3.5 is as follows:
inputting the first feature into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1, specifically: the classification probability of picture x_i^1 is obtained by applying the sigmoid function to the confidence score computed by the classifier for the first feature;
inputting the fourth feature into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2, specifically: the classification probability of picture x_i^2 is obtained by applying the sigmoid function to the confidence score computed by the classifier for the fourth feature;
the specific method of the step S3.6 is as follows:
inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1; presetting a first threshold, a second threshold, a third threshold and a fourth threshold, and dynamically updating the four thresholds with a preset momentum m;
determining the value of the binary noise label of picture x_i^1 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^1;
obtaining the intermediate label of picture x_i^1 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^1 with the intermediate label of picture x_i^1 as the correction label of picture x_i^1;
otherwise, retaining the pseudo label of picture x_i^1 as the correction label of picture x_i^1;
the corresponding correction process in the label correction module of the second label correction sub-model M_2 is as follows:
inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2;
determining the value of the binary noise label of picture x_i^2 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^2;
obtaining the intermediate label of picture x_i^2 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^2 with the intermediate label of picture x_i^2 as the correction label of picture x_i^2;
otherwise, retaining the pseudo label of picture x_i^2 as the correction label of picture x_i^2;
the third loss function L_3 is defined in terms of the binary cross-entropy loss of each picture,
wherein the binary cross-entropy loss of the i-th picture is computed with the value of the pseudo label of the i-th picture with respect to multi-label classification category k;
the total loss function L in step S3.7 is formed from the first loss function L_1, the second loss function L_2 and the third loss function L_3,
wherein L is the total loss function value, and the first loss function L_1 and the second loss function L_2 are each weighted by a balance factor in the total loss.
In the specific implementation process, a multi-label noisy data set is first acquired and preprocessed: the multi-label noisy data set is acquired according to the preset K multi-label classification categories and divided into a training set and a verification set, wherein the training set comprises N pictures, each picture is annotated with a pseudo label ỹ_i, and the training set is denoted X; the specific method is as follows:
Microsoft COCO and Pascal VOC are the two most widely used data sets for evaluating multi-label recognition (MLR) algorithms; the Microsoft COCO data set contains 80 categories and the Pascal VOC data set contains 20 categories. In this embodiment, the 80 categories contained in the Microsoft COCO data set are used to construct the Web-COCO and Web-Pascal data sets, with one or more categories selected randomly as keywords, such as "person" or "person, truck, bus";
the corresponding pictures are retrieved from search engines, including Google, Baidu and Bing, and the more than 500,000 noisy pictures obtained are taken as the multi-label noisy data set;
then, incomplete and duplicate pictures are eliminated, the Web-COCO data set is constructed from the remaining 290,000 noisy pictures, and pictures containing at least one of the 20 Pascal VOC categories are further selected to construct the Web-Pascal data set;
the Web-COCO data set contains 290,000 pictures, and each picture is assigned a pseudo label ỹ_i according to its category keywords (a keyword-based assignment is sketched below); 20,000 pictures are randomly selected for manual annotation and given a more accurate and complete description;
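A hypothetical helper for the keyword-based pseudo-labeling mentioned above (the actual retrieval from Google/Baidu/Bing is outside the scope of this sketch, and the 0/1 encoding is an assumption):

```python
def pseudo_label_from_keywords(keywords, categories):
    """Build a K-dimensional 0/1 pseudo label vector for a downloaded picture
    from the search keywords it was retrieved with (hypothetical helper)."""
    keyword_set = {k.strip().lower() for k in keywords.split(",")}
    return [1 if c.lower() in keyword_set else 0 for c in categories]
```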
the Web-COCO data set has the following drawbacks: first, label noise exists, which is inevitably produced when data are retrieved from the web; in the multi-label pictures of this embodiment, label noise includes the following case: a picture contains information of many categories, but the corresponding keywords do not cover these categories, which leads to erroneous negative labels; a better characterization of the noisy pictures is obtained by calculating the accuracy and recall of each class, and the results show an average recall of 46.1% and an average accuracy of 64.6%, which indicates that there is severe label noise in the data set;
another drawback is semantic dispersion: a multi-label image contains multiple semantic objects spread across the image; therefore, it is necessary to locate the corresponding semantic regions to help find missing labels, while examining the whole image also helps correct wrong positive labels;
a third drawback is class imbalance; in the real world, class imbalance is common and becomes more serious when multi-label pictures are retrieved from the web; for example, the most numerous category, "person", accounts for 15% of the pictures, while the 20 least numerous categories together account for only 5% of the total; for evaluating the weakly supervised multi-label recognition (WS-MLR) task, Web-COCO is used as the training set and Microsoft COCO, which contains 40,504 fully manually annotated images, as the validation set;
the Web-Pascal data set comprises 236,043 pictures covering the 20 categories of the Pascal VOC data set; similar to the Web-COCO data set, it also suffers from label noise, semantic dispersion and class imbalance; likewise, the 4,952 manually annotated pictures in the Web-Pascal data set are used as the validation set, and the other pictures are used as the training set;
dividing the training set into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, wherein X = D_1 ∪ D_2 and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo label ỹ_i;
determining the length and width data and the pseudo label ỹ_i of each picture in each sub-training set, wherein the length of a picture is denoted H and the width is denoted W; the preprocessing of the multi-label noisy data set is thus completed;
establishing a double-branch multi-label correction neural network model;
inputting the preprocessed multi-label noisy data set into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model, wherein the specific method comprises the following steps:
S3.1: inputting a picture x_i^1 from the first sub-training set D_1 and a picture x_i^2 from the second sub-training set D_2 jointly into the double-branch multi-label correction neural network model, wherein i satisfies 1 ≤ i ≤ N/2, N/2 being the number of pictures in the first sub-training set D_1 or the second sub-training set D_2;
S3.2: extracting features from the input pictures x_i^1 and x_i^2 with the feature extractors of the first label correction sub-model M_1 and the second label correction sub-model M_2 respectively, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: will first featureAnd second feature->Common input of first tag modifier sub-model M 1 Is to add the third feature +.>And fourth feature->Common input of a second tag modifier sub-model M 2 Is to picture +.>Is>And third feature->Performing first contrast learning, and performing +.>Second feature->And fourth feature->Performing first contrast learning, and setting a first loss function +.>Correction of the first label sub-model M 1 And a second label modifier model M 2 The example comparison learning module of (1) performs parameter updating, specifically:
will first featureAnd second feature->Common input of first tag modifier sub-model M 1 Is to add the third feature +.>And fourth feature->Common input of a second tag modifier sub-model M 2 An instance comparison learning module of (a);
for picturesAccording to the first feature->And third feature->Calculate the corresponding firstFeature vectorAnd a second feature vector->The method specifically comprises the following steps:
wherein ,C1 For picturesIs a pseudo tag number of (a); />Representing picture->C of (2) 1 A j-th feature vector;
the obtained first feature vectorSatisfy->Second feature vector->Satisfy->
According to the first feature vectorAnd a second feature vector->Constructing a first positive sample pairAnd constructs the first circulation sequence +. >Satisfies the following conditions,R 1 For the first cycle sequence->In the present embodiment, R 1 =8192; according to the first cycle sequence->Construction of the first negative sample pair +.>Performing first contrast learning by using the constructed first positive sample pair and the first negative sample pair;
setting a first loss functionModifying the first label sub-model M 1 The example comparison learning module of (1) performs parameter updating, specifically:
wherein ,modifying the submodel M for the first tag 1 In the example contrast learning module of (1), for pictures +.>Is a first loss function value,/>For picture->Total number of categories required for multi-tag classification, +.>For picture->Corresponding->Category (S),>is a temperature coefficient>For picture->Is>The 1 st eigenvector after dimension reduction, < ->For picture->Is the first of (2)The 2 nd eigenvector after dimension reduction, < ->For picture->The value of the pseudo tag for the category k of the relative multi-tag class, in this embodiment ++>,/>Is 128%>Is 2048 in dimension; />
For picturesAccording to the second feature->And fourth feature->Calculating corresponding third eigenvector->And fourth feature vector->The method specifically comprises the following steps:
wherein ,C2 For picturesIs a pseudo tag number of (a); />Representing picture->C of (2) 2 A j-th feature vector;
the obtained third feature vector Satisfy->Fourth feature vector->Satisfy the following requirements
According to the third feature vectorAnd fourth feature vector->Constructing a second positive sample pairAnd constructing a second circulation sequence +.>Satisfy->,R 2 For the second cycle sequence->In the present embodiment, R 2 =8192; according to the second circulation sequence->Construction of a second negative sample pair +.>Performing first contrast learning by using the constructed second positive sample pair and the second negative sample pair;
setting a first loss functionModifying the second label sub-model M 2 The example comparison learning module of (1) performs parameter updating, specifically:
wherein ,modifying the submodel M for the second label 2 In the example contrast learning module of (1), for pictures +.>Is a first loss function value,/>For picture->Total number of categories required for multi-tag classification, +.>For picture->Corresponding->The number of categories of the product,for picture->Is>The 2 nd eigenvector after dimension reduction, < ->For picture->Is>The 1 st eigenvector after dimension reduction, < ->For picture->Relative multi-tag class->Is a pseudo tag value of (1);
S3.4: inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1 to perform second contrastive learning with the preset first category prototype features, and inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2 to perform second contrastive learning with the preset second category prototype features; setting the second loss function L_2 to update the parameters of the category prototype contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2, specifically:
inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M_1, performing the second contrastive learning between the feature vector of picture x_i^1 and the first category prototype features, and updating the first category prototype features by a momentum method,
wherein the updated first category prototype feature corresponding to the k-th category is obtained from the previous first category prototype feature corresponding to the k-th category using a preset momentum m;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the first label correction sub-model M_1, wherein the loss is the second loss function value of picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M_1;
inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M_2, performing the second contrastive learning between the feature vector of picture x_i^2 and the second category prototype features, and updating the second category prototype features by the momentum method,
wherein the updated second category prototype feature corresponding to the k-th category is obtained from the previous second category prototype feature corresponding to the k-th category;
setting the second loss function L_2 to update the parameters of the category prototype contrastive learning module of the second label correction sub-model M_2, wherein the loss is the second loss function value of picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M_2;
S3.5: inputting the first feature into the classifier of the first label correction sub-model M_1 to calculate and output the classification probability of picture x_i^1, and inputting the fourth feature into the classifier of the second label correction sub-model M_2 to calculate and output the classification probability of picture x_i^2, specifically:
inputting the first feature into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1, which is obtained by applying the sigmoid function to the confidence score computed by the classifier for the first feature;
inputting the fourth feature into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2, which is obtained by applying the sigmoid function to the confidence score computed by the classifier for the fourth feature;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1 to correct the pseudo label of picture x_i^1 and obtain the correction label of picture x_i^1; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2 to correct the pseudo label of picture x_i^2 and obtain the correction label of picture x_i^2; setting the third loss function L_3 and respectively calculating the cross-entropy losses of the label correction modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 to update the parameters, specifically:
inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1; presetting a first threshold, a second threshold, a third threshold and a fourth threshold, and dynamically updating the four thresholds with a preset momentum m, the third threshold and the fourth threshold being initialized to preset values in this embodiment;
determining the value of the binary noise label of picture x_i^1 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^1;
obtaining the intermediate label of picture x_i^1 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^1 with the intermediate label of picture x_i^1 as the correction label of picture x_i^1;
otherwise, retaining the pseudo label of picture x_i^1 as the correction label of picture x_i^1;
the corresponding correction process in the label correction module of the second label correction sub-model M_2 is as follows:
inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2;
determining the value of the binary noise label of picture x_i^2 according to the updated third threshold and fourth threshold and the classification probability of picture x_i^2;
obtaining the intermediate label of picture x_i^2 according to the updated first threshold and second threshold;
when the binary noise label indicates that the pseudo label is noisy, replacing the pseudo label of picture x_i^2 with the intermediate label of picture x_i^2 as the correction label of picture x_i^2;
otherwise, retaining the pseudo label of picture x_i^2 as the correction label of picture x_i^2;
the third loss function L_3 is defined in terms of the binary cross-entropy loss of each picture, wherein the loss of the i-th picture is its binary cross-entropy loss;
S3.7: setting the total loss function L according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and updating the parameters of the double-branch multi-label correction neural network model to obtain the optimized double-branch multi-label correction neural network model;
the total loss function L is formed from the three losses,
wherein L is the total loss function value, the first loss function L_1 and the second loss function L_2 are each weighted by a balance factor in the total loss, and the two balance factors take preset values in this example;
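Pulling the pieces together, the following hedged PyTorch sketch shows one possible training step over a pair of batches from D_1 and D_2, reusing the helper functions sketched earlier in this description. The assignment of the four features to the two branches, the use of corrected labels in the third loss, the queue-update choices, and the hyper-parameter names in the `cfg` dictionary are assumptions consistent with the description rather than the patent's exact procedure.

```python
import torch

def train_step(model, x1, y1, x2, y2, queues, optimizer, cfg):
    """One possible training step over batches (x1, y1) from D1 and (x2, y2)
    from D2, combining the helpers sketched earlier in this description."""
    _, z11, logit1 = model.m1(x1)   # first feature vector (M1 applied to x1)
    _, z12, _      = model.m1(x2)   # third feature vector (M1 applied to x2)
    _, z21, _      = model.m2(x1)   # second feature vector (M2 applied to x1)
    _, z22, logit2 = model.m2(x2)   # fourth feature vector (M2 applied to x2)

    # first contrastive learning across the two branches (step S3.3)
    l1 = instance_contrastive_loss(z11, z21, queues["m1"], cfg["tau"]) \
       + instance_contrastive_loss(z12, z22, queues["m2"], cfg["tau"])
    # second contrastive learning against the category prototypes (step S3.4)
    l2 = prototype_contrastive_loss(z11, model.m1.prototypes, y1, cfg["tau"]) \
       + prototype_contrastive_loss(z22, model.m2.prototypes, y2, cfg["tau"])

    # classification probabilities and label correction (steps S3.5-S3.6)
    p1, p2 = torch.sigmoid(logit1), torch.sigmoid(logit2)
    y1_corr = correct_labels(p1, y1, *cfg["thresholds"])
    y2_corr = correct_labels(p2, y2, *cfg["thresholds"])
    l3 = third_loss(logit1, y1_corr) + third_loss(logit2, y2_corr)

    # total objective and parameter update (step S3.7)
    loss = total_loss(l1, l2, l3, cfg["lambda1"], cfg["lambda2"])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # momentum updates of prototypes and cyclic queues (choices assumed)
    momentum_update_prototypes(model.m1.prototypes, z11.detach(), y1, cfg["m"])
    momentum_update_prototypes(model.m2.prototypes, z22.detach(), y2, cfg["m"])
    queues["ptr1"] = update_queue(queues["m1"], z21.detach(), queues["ptr1"])
    queues["ptr2"] = update_queue(queues["m2"], z12.detach(), queues["ptr2"])
    return loss.item()
```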
Finally, obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label;
According to the invention, relevant pictures can be collected from the Internet as data sets according to the specific application of a user, the dual-branch network is trained, and a model supporting multi-label picture classification is constructed, so that label correction can be carried out on multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is realized; the invention also provides a contrastive learning method in which the two branch networks, while remaining different, can learn common representations from each other, and the predictions of the two models are averaged when classifying pictures, making the result more robust; in addition, the invention prescribes upper and lower bounds for the predicted values of training pictures and changes the labels of pictures whose predicted values exceed or fall below the thresholds, thereby weakening noise and avoiding overfitting to the noise.
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
It is to be understood that the above examples of the present invention are provided by way of illustration only and are not intended to limit its embodiments. Other variations or modifications based on the above description will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments exhaustively here. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims of the invention.

Claims (7)

1. The multi-label image recognition method based on deep learning under noisy data is characterized by comprising the following steps:
S1: obtain a multi-label noisy data set and preprocess it, the specific method being:
acquire a multi-label noisy data set according to K preset multi-label classification categories;
divide the obtained multi-label noisy data set into a training set and a verification set, wherein the training set contains N pictures, each picture is annotated with a pseudo label, and the training set is denoted as X; divide the training set into a first sub-training set D1 and a second sub-training set D2 containing the same number of pictures, each element of a sub-training set consisting of the i-th picture together with its corresponding pseudo label;
determine the length and width of the pictures in each sub-training set and their pseudo labels, the length of a picture being denoted as H and the width as W, thereby completing the preprocessing of the multi-label noisy data set;
S2: establish a double-branch multi-label correction neural network model, specifically:
the double-branch multi-label correction neural network model comprises a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 have the same structure but different model parameters;
each of the first label correction sub-model M1 and the second label correction sub-model M2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence;
S3: input the preprocessed multi-label noisy data set into the double-branch multi-label correction neural network model for contrastive learning training to obtain the optimized double-branch multi-label correction neural network model, the specific method being:
S3.1: input a picture from the first sub-training set D1 and a picture from the second sub-training set D2 jointly into the double-branch multi-label correction neural network model, the picture index i running from 1 to n, where n is the number of pictures in the first sub-training set D1 or the second sub-training set D2;
S3.2: extract features from the two input pictures with the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 respectively, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: input the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M1, and input the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M2; perform first contrastive learning between the first feature and the third feature of one input picture and between the second feature and the fourth feature of the other input picture; set a first loss function and update the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.4: input the first feature into the category prototype contrastive learning module of the first label correction sub-model M1 and perform second contrastive learning with a preset first category prototype feature; input the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M2 and perform second contrastive learning with a preset second category prototype feature; set a second loss function and update the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.5: input the first feature into the classifier of the first label correction sub-model M1 and compute the classification probability of the corresponding picture; input the fourth feature into the classifier of the second label correction sub-model M2 and compute the classification probability of the corresponding picture;
S3.6: input the classification probability of the picture into the label correction module of the first label correction sub-model M1, correct its pseudo label and obtain its corrected label; input the classification probability of the picture into the label correction module of the second label correction sub-model M2, correct its pseudo label and obtain its corrected label; set a third loss function, compute the cross-entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2 respectively, and update the parameters;
S3.7: set a total loss function according to the first loss function, the second loss function and the third loss function, update the parameters of the double-branch multi-label correction neural network model, and obtain the optimized double-branch multi-label correction neural network model;
S4: obtain a noise-containing picture to be corrected, correct it with the optimized double-branch multi-label correction neural network model, obtain its corrected label, and perform image recognition on the picture according to the corrected label.
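As a rough illustration of the training flow in steps S3.1 to S3.7 of claim 1, the following PyTorch-style sketch shows one co-training iteration of the two branches; the module interface (extract, instance_contrast, prototype_contrast, classify, correct_labels), the tensor shapes and the way the losses are combined are assumptions for illustration only, not the patent's reference implementation.

    import torch
    import torch.nn.functional as F

    def train_step(batch1, batch2, model1, model2, protos1, protos2,
                   optimizer, lambda1=1.0, lambda2=1.0):
        """One hedged co-training step for the two label-correction sub-models."""
        x1, y1 = batch1                     # pictures and pseudo labels from D1
        x2, y2 = batch2                     # pictures and pseudo labels from D2

        # S3.2: each branch extracts features from both input pictures.
        f1, f2 = model1.extract(x1), model1.extract(x2)     # first / second feature
        f3, f4 = model2.extract(x1), model2.extract(x2)     # third / fourth feature

        # S3.3: instance-level contrastive loss across the two branches.
        loss1 = model1.instance_contrast(f1, f3, y1) + model2.instance_contrast(f2, f4, y2)

        # S3.4: contrast against the category prototype features of each branch.
        loss2 = model1.prototype_contrast(f1, protos1, y1) + model2.prototype_contrast(f4, protos2, y2)

        # S3.5 + S3.6: classify, correct the pseudo labels, then BCE on the corrected labels.
        p1 = torch.sigmoid(model1.classify(f1))
        p2 = torch.sigmoid(model2.classify(f4))
        loss3 = F.binary_cross_entropy(p1, model1.correct_labels(p1, y1).float()) \
              + F.binary_cross_entropy(p2, model2.correct_labels(p2, y2).float())

        # S3.7: total loss with balance factors, one backward pass over both branches.
        loss = loss3 + lambda1 * loss1 + lambda2 * loss2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()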
2. The method for identifying multi-label images under noisy data based on deep learning according to claim 1, wherein the specific method for determining the values of the pseudo labels of the pictures in each sub-training set is:
judge whether a picture in a sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo label of the i-th picture with respect to category k is 1, otherwise it is 0.
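A minimal sketch of this multi-hot pseudo-label encoding, assuming the category membership comes from whatever (possibly noisy) tags accompany each collected picture; the function name is illustrative.

    import numpy as np

    def make_pseudo_label(picture_categories, K):
        """Build the K-dimensional binary pseudo label of one picture.

        picture_categories: iterable of category indices (0..K-1) that the
        picture is claimed to belong to, e.g. noisy web tags.
        """
        y = np.zeros(K, dtype=np.int64)
        for k in picture_categories:
            y[k] = 1                    # picture belongs to category k
        return y

    # Example: a picture tagged with categories 0 and 3 out of K = 5 classes.
    # make_pseudo_label({0, 3}, K=5) -> array([1, 0, 0, 1, 0])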
3. The method for identifying multi-label images under noisy data based on deep learning according to claim 2, wherein the specific method of step S3.3 is:
input the first feature and the second feature jointly into the instance contrastive learning module of the first label correction sub-model M1, and input the third feature and the fourth feature jointly into the instance contrastive learning module of the second label correction sub-model M2;
for the picture from the first sub-training set, compute the corresponding first feature vector and second feature vector from its first feature and its third feature, wherein C1 is the number of pseudo labels of the picture and the computation yields one feature vector for each of the C1 labelled categories, the j-th such vector corresponding to the j-th labelled category;
the obtained first feature vector and second feature vector each satisfy the prescribed constraint;
construct a first positive sample pair from the first feature vector and the second feature vector, and construct a first cyclic sequence satisfying the prescribed condition, where R1 is the length of the first cyclic sequence;
construct first negative sample pairs from the first cyclic sequence, and perform the first contrastive learning using the constructed first positive sample pair and first negative sample pairs;
set the first loss function and update the parameters of the instance contrastive learning module of the first label correction sub-model M1, specifically:
wherein the quantities in the formula are: the first loss value computed for the picture in the instance contrastive learning module of the first label correction sub-model M1; the total number of categories required for multi-label classification; the category corresponding to the picture; a temperature coefficient; the dimension-reduced feature vectors of the picture; and the value of the picture's pseudo label with respect to multi-label category k;
for the picture from the second sub-training set, compute the corresponding third feature vector and fourth feature vector from its second feature and its fourth feature, wherein C2 is the number of pseudo labels of the picture and the computation yields one feature vector for each of the C2 labelled categories, the j-th such vector corresponding to the j-th labelled category;
the obtained third feature vector and fourth feature vector each satisfy the prescribed constraint;
construct a second positive sample pair from the third feature vector and the fourth feature vector, and construct a second cyclic sequence satisfying the prescribed condition, where R2 is the length of the second cyclic sequence;
construct second negative sample pairs from the second cyclic sequence, and perform the first contrastive learning using the constructed second positive sample pair and second negative sample pairs;
set the first loss function and update the parameters of the instance contrastive learning module of the second label correction sub-model M2, specifically:
wherein the quantities in the formula are: the first loss value computed for the picture in the instance contrastive learning module of the second label correction sub-model M2; the total number of categories required for multi-label classification; the category corresponding to the picture; the dimension-reduced feature vectors of the picture; and the value of the picture's pseudo label with respect to the corresponding multi-label category.
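The loss in claim 3 is given only symbolically above; the following is a hedged sketch of a per-category InfoNCE-style instance contrastive loss consistent with the positive-pair and negative-pair construction described there, where treating matching rows as positives, the temperature value and the function name are assumptions.

    import torch
    import torch.nn.functional as F

    def instance_contrastive_loss(z_a, z_b, temperature=0.1):
        """Hedged InfoNCE-style loss over per-category feature vectors.

        z_a : (C, d) dimension-reduced feature vectors of one picture from one branch
        z_b : (C, d) feature vectors of the same picture from the other branch
        The c-th rows of z_a and z_b form the positive pair; the remaining rows
        of z_b act as negatives (the claim orders them via a cyclic sequence).
        """
        z_a = F.normalize(z_a, dim=1)
        z_b = F.normalize(z_b, dim=1)
        logits = z_a @ z_b.t() / temperature          # (C, C) similarity matrix
        targets = torch.arange(z_a.size(0))           # the matching row is the positive
        return F.cross_entropy(logits, targets)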
4. The method for identifying multi-label images under noisy data based on deep learning according to claim 3, wherein the specific method of step S3.4 is:
input the first feature into the category prototype contrastive learning module of the first label correction sub-model M1, perform second contrastive learning between the feature vector of the picture and the first category prototype feature, and update the first category prototype feature by the momentum method;
wherein the updated first category prototype feature of the k-th category is obtained from the first category prototype feature of the k-th category before the update, with m being the preset momentum;
set the second loss function and update the parameters of the category prototype contrastive learning module of the first label correction sub-model M1, specifically:
wherein the corresponding term is the second loss value computed for the picture in the category prototype contrastive learning module of the first label correction sub-model M1;
input the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M2, perform second contrastive learning between the feature vector of the picture and the second category prototype feature, and update the second category prototype feature by the momentum method;
wherein the updated second category prototype feature of the corresponding category is obtained from the second category prototype feature of that category before the update;
set the second loss function and update the parameters of the category prototype contrastive learning module of the second label correction sub-model M2, specifically:
wherein the corresponding term is the second loss value computed for the picture in the category prototype contrastive learning module of the second label correction sub-model M2.
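A minimal sketch of the momentum update of the category prototype features and a prototype-level contrastive loss, assuming the prototypes are an exponential moving average of the features of the pictures labelled with each category; the loss form and the names are assumptions rather than the patent's exact formulas.

    import torch
    import torch.nn.functional as F

    def update_prototypes(prototypes, feature, labels, m=0.9):
        """EMA update of the per-category prototype features.

        prototypes : (K, d) current category prototype features
        feature    : (d,)  dimension-reduced feature of one picture
        labels     : (K,)  binary (pseudo) labels of the picture
        """
        with torch.no_grad():
            for k in torch.nonzero(labels, as_tuple=False).flatten():
                prototypes[k] = m * prototypes[k] + (1.0 - m) * feature
        return prototypes

    def prototype_contrastive_loss(feature, prototypes, labels, temperature=0.1):
        """Pull the picture's feature toward the prototypes of its labelled categories."""
        sims = F.normalize(feature, dim=0) @ F.normalize(prototypes, dim=1).t() / temperature
        log_prob = F.log_softmax(sims, dim=0)          # (K,) log-probabilities over prototypes
        pos = labels.bool()
        return -log_prob[pos].mean() if pos.any() else sims.sum() * 0.0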
5. The method for identifying multi-label images under noisy data based on deep learning according to claim 4, wherein the specific method of step S3.5 is:
input the first feature into the classifier of the first label correction sub-model M1 and compute the classification probability of the corresponding picture, specifically:
wherein the classification probability of the picture is obtained by applying the sigmoid function to the confidence score computed by the classifier;
input the fourth feature into the classifier of the second label correction sub-model M2 and compute the classification probability of the corresponding picture, specifically:
wherein the classification probability of the picture is likewise obtained by applying the sigmoid function to the confidence score computed by the classifier.
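A small sketch of this classification step, assuming the classifier is a single linear layer whose confidence scores are passed element-wise through the sigmoid function; the class name and layer choice are illustrative.

    import torch

    class MultiLabelClassifier(torch.nn.Module):
        """Linear confidence-score head followed by an element-wise sigmoid."""
        def __init__(self, feature_dim: int, num_classes: int):
            super().__init__()
            self.score = torch.nn.Linear(feature_dim, num_classes)

        def forward(self, feature: torch.Tensor) -> torch.Tensor:
            # One independent probability per category, as required for multi-label output.
            return torch.sigmoid(self.score(feature))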
6. The method for identifying multi-label images under noisy data based on deep learning according to claim 5, wherein the specific method of step S3.6 is:
input the classification probability of the picture into the label correction module of the first label correction sub-model M1; set a first threshold, a second threshold, a third threshold and a fourth threshold, and dynamically update the four thresholds with the preset momentum m;
determine the value of the binary noise label according to the updated third threshold, the updated fourth threshold and the classification probability of the picture;
obtain the intermediate label of the picture according to the updated first threshold and the updated second threshold;
when the noise label indicates that the pseudo label is noisy, replace the pseudo label of the picture with its intermediate label and take the result as the corrected label of the picture;
when the noise label indicates that the pseudo label is clean, retain the pseudo label of the picture as its corrected label;
input the classification probability of the picture into the label correction module of the second label correction sub-model M2;
determine the value of the binary noise label according to the updated third threshold, the updated fourth threshold and the classification probability of the picture;
obtain the intermediate label of the picture according to the updated first threshold and the updated second threshold;
when the noise label indicates that the pseudo label is noisy, replace the pseudo label of the picture with its intermediate label and take the result as the corrected label of the picture;
when the noise label indicates that the pseudo label is clean, retain the pseudo label of the picture as its corrected label.
The third loss function is defined as follows:
wherein one term is the binary cross-entropy loss of the i-th picture, and the other quantity is the value of the pseudo label of the i-th picture with respect to multi-label category k.
7. The method for identifying multi-label images under noisy data based on deep learning according to claim 6, wherein the total loss function in step S3.7 is:
wherein the quantities in the formula are the total loss value, the balance factor of the first loss function and the balance factor of the second loss function.
CN202310299402.5A 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data Active CN116012569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310299402.5A CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310299402.5A CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data

Publications (2)

Publication Number Publication Date
CN116012569A CN116012569A (en) 2023-04-25
CN116012569B true CN116012569B (en) 2023-08-15

Family

ID=86032175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310299402.5A Active CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data

Country Status (1)

Country Link
CN (1) CN116012569B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416382A (en) * 2018-03-01 2018-08-17 南开大学 One kind is based on iteration sampling and a pair of of modified Web graph of multi-tag as training convolutional neural networks method
CN113688949A (en) * 2021-10-25 2021-11-23 南京码极客科技有限公司 Network image data set denoising method based on dual-network joint label correction
CN114692732A (en) * 2022-03-11 2022-07-01 华南理工大学 Method, system, device and storage medium for updating online label
CN115147670A (en) * 2021-03-15 2022-10-04 华为技术有限公司 Object processing method and device
CN115331088A (en) * 2022-10-13 2022-11-11 南京航空航天大学 Robust learning method based on class labels with noise and imbalance
CN115496948A (en) * 2022-09-23 2022-12-20 广东工业大学 Network supervision fine-grained image identification method and system based on deep learning
CN115809697A (en) * 2022-12-26 2023-03-17 上海高德威智能交通系统有限公司 Data correction method and device and electronic equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11748613B2 (en) * 2019-05-10 2023-09-05 Baidu Usa Llc Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning
US11263476B2 (en) * 2020-03-19 2022-03-01 Salesforce.Com, Inc. Unsupervised representation learning with contrastive prototypes
US20220067506A1 (en) * 2020-08-28 2022-03-03 Salesforce.Com, Inc. Systems and methods for partially supervised learning with momentum prototypes
US20220156591A1 (en) * 2020-11-13 2022-05-19 Salesforce.Com, Inc. Systems and methods for semi-supervised learning with contrastive graph regularization
US20220188645A1 (en) * 2020-12-16 2022-06-16 Oracle International Corporation Using generative adversarial networks to construct realistic counterfactual explanations for machine learning models


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Label Noise Robust Learning Algorithms; Gong Chen et al.; Aero Weaponry (《航空兵器》), Vol. 27, No. 3, pp. 20-26 *

Also Published As

Publication number Publication date
CN116012569A (en) 2023-04-25

Similar Documents

Publication Publication Date Title
Zhang et al. Better and faster: knowledge transfer from multiple self-supervised learning tasks via graph distillation for video classification
CN109993100B (en) Method for realizing facial expression recognition based on deep feature clustering
CN110210468B (en) Character recognition method based on convolutional neural network feature fusion migration
CN113378706B (en) Drawing system for assisting children in observing plants and learning biological diversity
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN113673482B (en) Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution
CN113434688B (en) Data processing method and device for public opinion classification model training
CN111079847A (en) Remote sensing image automatic labeling method based on deep learning
CN112712127A (en) Image emotion polarity classification method combined with graph convolution neural network
CN115331284A (en) Self-healing mechanism-based facial expression recognition method and system in real scene
CN113657267A (en) Semi-supervised pedestrian re-identification model, method and device
CN114548256A (en) Small sample rare bird identification method based on comparative learning
CN112949929A (en) Knowledge tracking method and system based on collaborative embedded enhanced topic representation
CN112183464A (en) Video pedestrian identification method based on deep neural network and graph convolution network
CN116152554A (en) Knowledge-guided small sample image recognition system
CN113010683A (en) Entity relationship identification method and system based on improved graph attention network
CN114782752A (en) Small sample image grouping classification method and device based on self-training
CN110175631A (en) A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix
CN116051924B (en) Divide-and-conquer defense method for image countermeasure sample
CN116012569B (en) Multi-label image recognition method based on deep learning and under noisy data
CN113592045B (en) Model adaptive text recognition method and system from printed form to handwritten form
CN114120367A (en) Pedestrian re-identification method and system based on circle loss measurement under meta-learning framework
CN115100694A (en) Fingerprint quick retrieval method based on self-supervision neural network
CN111695526B (en) Network model generation method, pedestrian re-recognition method and device
CN114419529A (en) Cross-modal pedestrian re-identification method and system based on distribution space alignment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant