CN116012569A - Multi-label image recognition method under noisy data based on deep learning

Info

Publication number: CN116012569A (application number CN202310299402.5A)
Authority: CN (China)
Prior art keywords: label, picture, feature, model, correction
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN116012569B (granted publication)
Inventors: 陈添水, 徐志华, 黄衍聪, 柯梓铭, 付晨博, 范耀洲, 杨志景
Original and current assignee: Guangdong University of Technology (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Guangdong University of Technology; priority to CN202310299402.5A; application granted and published as CN116012569B


Abstract

The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a dual-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model; and acquiring a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model, and carrying out image recognition on it according to the correction label. The method can carry out label correction on a multi-label noisy data set, saves manpower and material costs, and realizes efficient utilization of data resources; meanwhile, the prediction results are more robust; in addition, the invention sets upper and lower bounds on the predicted values of training pictures, so that noise can be weakened and overfitting to the noise can be avoided.

Description

Multi-label image recognition method under noisy data based on deep learning
Technical Field
The invention relates to the technical field of computer vision and multi-label image classification, in particular to a multi-label image recognition method under noisy data based on deep learning.
Background
With the continuous development of internet technology, artificial intelligence has matured, and deep learning has become one of its most active branches. Deep learning is popular because of its excellent performance, abundant frameworks, convenient interfaces and low barrier to entry. However, conventional deep learning algorithms require a large number of manually labeled samples as data sets; these data sets are typically large, often reaching tens or even hundreds of thousands of samples, and the label of each sample must be accurate. Thus, creating a quality data set suitable for training requires significant human and capital costs, which is a major impediment to the further development of deep learning. On the other hand, there is a large amount of data on the internet containing label noise, that is, the labels of part of the data are erroneous; such data can easily be obtained with a crawler. Traditional deep learning algorithms can only train on data whose labels are clean and correct; they cannot use multi-label noisy data, which leads to a waste of data resources.
Taking the recognition of orange pictures as an example, analysis shows that many pictures labeled "orange" on the web are mislabeled. For example, pictures of lemons, which are similar in shape and appearance to oranges, are labeled "orange"; this kind of mislabeling is called the first type of mislabeling. Alternatively, an object far removed from an orange, such as an orange-colored sunset, is labeled "orange"; this kind of mislabeling is called the second type of mislabeling. If data with such erroneous labels are used directly to train a traditional deep learning network, the network learns a lot of erroneous data, so the generalization of the model is poor and the model is difficult to deploy in practice. Faced with this, there are two approaches to improvement: first, relabel the pictures manually, which consumes great manpower and material resources; second, directly discard this part of the data set, which wastes data resources.
Therefore, how to conveniently train neural networks with noisy data sets is one of the problems that must be solved for the further development of deep learning, and it is also a trend of development in the big data age.
The prior art discloses a weakly supervised multi-label image classification method based on meta learning, which provides an image multi-label classification model based on label information enhancement: a neural network with an encoder-decoder architecture sequentially judges, in a sequence labeling manner, whether the labels in a label sequence are relevant, so as to obtain the relevant labels of the image. Aiming at the model overfitting caused by insufficient supervision information in a weakly supervised environment, a teacher-student network training method based on meta learning is also provided to further improve the accuracy of image annotation. However, this prior-art method only addresses the problem that missing labels prevent effective modeling; it cannot effectively correct images with missing or erroneous labels, and its labeling accuracy on data sets containing a large amount of noisy and erroneous labels is low.
Disclosure of Invention
The invention provides a multi-label image recognition method based on deep learning and under noisy data, which aims to overcome the defect that the correction effect of a data set containing multiple noisy labels in the prior art is poor, and can correct the labels of the multi-label noisy data set, save the cost of manpower and material resources and realize the efficient utilization of data resources.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a multi-label image recognition method based on deep learning under noisy data comprises the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
s4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
Preferably, in the step S1, the specific method for acquiring and preprocessing the multi-label noisy data set is as follows:

acquiring a multi-label noisy data set according to the K preset multi-label classification categories;

dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo label ỹ_i, and the training set is denoted X; dividing the training set into a first sub-training set D_1 and a second sub-training set D_2 with the same number of pictures, wherein D_1 ∪ D_2 = X, D_1 = {(x_i, ỹ_i)}, D_2 = {(x_i, ỹ_i)}, and (x_i, ỹ_i) represents the i-th picture x_i and its corresponding pseudo label ỹ_i;

determining the length and width data of the pictures in each sub-training set and the pseudo labels ỹ_i, wherein the length of a picture is denoted H and the width of a picture is denoted W; this completes the preprocessing of the multi-label noisy data set.
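As a concrete illustration of the preprocessing in step S1, the split of the pseudo-labelled training set X into two equal-sized sub-training sets can be sketched as follows (a minimal sketch; the function name `split_training_set` and the list-of-dicts sample format are illustrative assumptions, not part of the patent):

```python
import random

def split_training_set(samples, seed=0):
    # Split the training set X into two sub-training sets D1 and D2
    # with the same number of pictures, as described in step S1.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Toy samples: each pairs a picture identifier with a binary pseudo
# label over K preset categories (1 = picture assumed in category k).
K = 4
X = [{"picture": f"img_{i}.jpg",
      "pseudo_label": [int(i % K == k) for k in range(K)]}
     for i in range(10)]
D1, D2 = split_training_set(X)
```

In a real pipeline the shuffle seed would vary per run; the deterministic seed here only makes the sketch reproducible.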
Preferably, the value of the pseudo label ỹ_i of each picture in each sub-training set is determined by the following specific method:

judging whether a picture in each sub-training set belongs to a preset multi-label classification category k; if so, the value of the pseudo label of the i-th picture with respect to the multi-label classification category k is ỹ_i^k = 1; otherwise ỹ_i^k = 0.
Preferably, the dual-branch multi-label correction neural network model in step S2 is specifically:

the dual-branch multi-label correction neural network model comprises a first label correction sub-model M_1 and a second label correction sub-model M_2 arranged in parallel; the first label correction sub-model M_1 and the second label correction sub-model M_2 have the same structure but different model parameters;

the first label correction sub-model M_1 and the second label correction sub-model M_2 each comprise a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence.
Preferably, in step S3, the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model; the specific method comprises the following steps:

S3.1: inputting a picture x_i^1 in the first sub-training set D_1 and a picture x_i^2 in the second sub-training set D_2 jointly into the dual-branch multi-label correction neural network model, wherein x_i^1 ∈ D_1 and x_i^2 ∈ D_2;

S3.2: extracting features from the input pictures x_i^1 and x_i^2 with the feature extractors of the first label correction sub-model M_1 and the second label correction sub-model M_2 respectively, so as to obtain the first feature f_i^11 and the second feature f_i^12 (extracted by M_1 from x_i^1 and x_i^2) and the third feature f_i^21 and the fourth feature f_i^22 (extracted by M_2 from x_i^1 and x_i^2);
S3.3: inputting the first feature f_i^11 and the second feature f_i^12 jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature f_i^21 and the fourth feature f_i^22 jointly into the instance contrastive learning module of the second label correction sub-model M_2; performing the first contrastive learning on the first feature f_i^11 and the third feature f_i^21 of picture x_i^1, and performing the first contrastive learning on the second feature f_i^12 and the fourth feature f_i^22 of picture x_i^2; setting a first loss function L_1 with which the instance contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 perform parameter updating;
S3.4: inputting the first feature f_i^11 into the category prototype contrastive learning module of the first label correction sub-model M_1 and performing the second contrastive learning with the preset first category prototype features c^1; inputting the fourth feature f_i^22 into the category prototype contrastive learning module of the second label correction sub-model M_2 and performing the second contrastive learning with the preset second category prototype features c^2; setting a second loss function L_2 with which the category prototype contrastive learning modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 perform parameter updating;
S3.5: inputting the first feature f_i^11 into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1; inputting the fourth feature f_i^22 into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M_1 and performing label correction on the pseudo label ỹ_i^1 of picture x_i^1 to obtain the correction label ŷ_i^1 of picture x_i^1; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M_2 and performing label correction on the pseudo label ỹ_i^2 of picture x_i^2 to obtain the correction label ŷ_i^2 of picture x_i^2; setting a third loss function L_3 and respectively calculating the cross-entropy losses of the label correction modules of the first label correction sub-model M_1 and the second label correction sub-model M_2 for parameter updating;
S3.7: setting the total loss function L according to the first loss function L_1, the second loss function L_2 and the third loss function L_3, and updating the parameters of the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model.
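The flow of steps S3.1–S3.7 can be summarized as a single training-step function (a schematic sketch; every function name and the loss stubs are illustrative placeholders, and the balance factors lambda1 = lambda2 = 1.0 are arbitrary here):

```python
def training_step(x1, x2, model, losses, lambda1=1.0, lambda2=1.0):
    # S3.2: extract features of both pictures with both branches
    f11, f12 = model["M1_extract"](x1), model["M1_extract"](x2)
    f21, f22 = model["M2_extract"](x1), model["M2_extract"](x2)
    # S3.3: instance contrastive loss between the two branches'
    # features of the same picture (first loss L1)
    l1 = losses["instance"](f11, f21) + losses["instance"](f12, f22)
    # S3.4: category prototype contrastive loss (second loss L2)
    l2 = losses["prototype"](f11) + losses["prototype"](f22)
    # S3.5 + S3.6: classification and label correction give the
    # cross-entropy loss (third loss L3)
    l3 = losses["bce"](f11) + losses["bce"](f22)
    # S3.7: total loss combining the three terms
    return l3 + lambda1 * l1 + lambda2 * l2

# Stub model and losses, just to exercise the control flow.
model = {"M1_extract": lambda x: x, "M2_extract": lambda x: x}
losses = {"instance": lambda a, b: 0.1,
          "prototype": lambda f: 0.2,
          "bce": lambda f: 0.3}
total = training_step([1.0], [2.0], model, losses)
```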
Preferably, the specific method of step S3.3 is as follows:

inputting the first feature f_i^11 and the second feature f_i^12 jointly into the instance contrastive learning module of the first label correction sub-model M_1, and inputting the third feature f_i^21 and the fourth feature f_i^22 jointly into the instance contrastive learning module of the second label correction sub-model M_2;

for picture x_i^1, calculating the corresponding first feature vectors z_{i,j}^1 and second feature vectors z_{i,j}^2 according to the first feature f_i^11 and the third feature f_i^21, wherein C_1 is the number of pseudo labels of picture x_i^1 and z_{i,j} denotes the j-th of the C_1 category-specific feature vectors of picture x_i^1; the obtained first and second feature vectors are dimension-reduced and normalized;

constructing the first positive sample pairs (z_{i,j}^1, z_{i,j}^2) from the first and second feature vectors, constructing the first cyclic sequence R_1 over the pictures of the batch, and constructing the first negative sample pairs (z_{i,j}^1, z_{r,j}^2), r ∈ R_1, r ≠ i, according to the first cyclic sequence R_1; performing the first contrastive learning with the constructed first positive sample pairs and first negative sample pairs;

setting the first loss function L_1^(1) with which the instance contrastive learning module of the first label correction sub-model M_1 performs parameter updating, specifically:

L_1^(1) = -(1/C_1) Σ_{j=1..C_1} log [ exp(z_{i,j}^1 · z_{i,j}^2 / τ) / ( exp(z_{i,j}^1 · z_{i,j}^2 / τ) + Σ_{r ∈ R_1, r ≠ i} exp(z_{i,j}^1 · z_{r,j}^2 / τ) ) ]

wherein L_1^(1) is the first loss function value for picture x_i^1 in the instance contrastive learning module of M_1, K is the total number of categories required for multi-label classification, C_1 is the number of categories corresponding to picture x_i^1, τ is the temperature coefficient, and z_{i,j}^1 is the j-th dimension-reduced feature vector of picture x_i^1;

for picture x_i^2, calculating the corresponding third feature vectors z_{i,j}^3 and fourth feature vectors z_{i,j}^4 according to the second feature f_i^12 and the fourth feature f_i^22, wherein C_2 is the number of pseudo labels of picture x_i^2 and z_{i,j} denotes the j-th of the C_2 category-specific feature vectors of picture x_i^2; the obtained third and fourth feature vectors are dimension-reduced and normalized;

constructing the second positive sample pairs (z_{i,j}^3, z_{i,j}^4) from the third and fourth feature vectors, constructing the second cyclic sequence R_2 over the pictures of the batch, and constructing the second negative sample pairs (z_{i,j}^3, z_{r,j}^4), r ∈ R_2, r ≠ i, according to the second cyclic sequence R_2; performing the first contrastive learning with the constructed second positive sample pairs and second negative sample pairs;

setting the first loss function L_1^(2) with which the instance contrastive learning module of the second label correction sub-model M_2 performs parameter updating, specifically:

L_1^(2) = -(1/C_2) Σ_{j=1..C_2} log [ exp(z_{i,j}^3 · z_{i,j}^4 / τ) / ( exp(z_{i,j}^3 · z_{i,j}^4 / τ) + Σ_{r ∈ R_2, r ≠ i} exp(z_{i,j}^3 · z_{r,j}^4 / τ) ) ]

wherein L_1^(2) is the first loss function value for picture x_i^2 in the instance contrastive learning module of M_2, K is the total number of categories required for multi-label classification, C_2 is the number of categories corresponding to picture x_i^2, τ is the temperature coefficient, and z_{i,j}^3 is the j-th dimension-reduced feature vector of picture x_i^2.
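The instance contrastive loss described above follows the usual InfoNCE pattern: a positive pair is pulled together while negative pairs are pushed apart under a temperature coefficient τ. A minimal sketch (the exact form in the patent is only given as an image, so details such as whether the positive term appears in the denominator are assumptions):

```python
import math

def info_nce(anchor, positive, negatives, tau=0.07):
    # Contrastive loss for one anchor feature vector: maximize the
    # similarity to the positive partner relative to all negatives.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    pos = math.exp(dot(anchor, positive) / tau)
    neg = sum(math.exp(dot(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))
```

The loss is small when the anchor matches its positive pair and large when it instead matches a negative, which is the behavior the first contrastive learning relies on.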
Preferably, the specific method of step S3.4 is as follows:

inputting the first feature f_i^11 into the category prototype contrastive learning module of the first label correction sub-model M_1, performing the second contrastive learning between the first feature vector z_i^1 of picture x_i^1 and the preset first category prototype features c_k^1, and updating the first category prototype features with the momentum method:

c_k^1 ← m · c_k^1 + (1 − m) · z_i^1

wherein the left-hand c_k^1 is the updated first category prototype feature corresponding to the k-th category, the right-hand c_k^1 is the first category prototype feature corresponding to the k-th category before updating, and m is the preset momentum;

setting the second loss function L_2^(1) with which the category prototype contrastive learning module of the first label correction sub-model M_1 performs parameter updating, specifically:

L_2^(1) = -(1/C_1) Σ_{k: ỹ_i^k = 1} log [ exp(z_i^1 · c_k^1 / τ) / Σ_{k'=1..K} exp(z_i^1 · c_{k'}^1 / τ) ]

wherein L_2^(1) is the second loss function value for picture x_i^1 in the category prototype contrastive learning module of M_1;

inputting the fourth feature f_i^22 into the category prototype contrastive learning module of the second label correction sub-model M_2, performing the second contrastive learning between the feature vector z_i^4 of picture x_i^2 and the preset second category prototype features c_k^2, and updating the second category prototype features with the momentum method:

c_k^2 ← m · c_k^2 + (1 − m) · z_i^4

wherein the left-hand c_k^2 is the updated second category prototype feature corresponding to the k-th category and the right-hand c_k^2 is the second category prototype feature corresponding to the k-th category before updating;

setting the second loss function L_2^(2) with which the category prototype contrastive learning module of the second label correction sub-model M_2 performs parameter updating, specifically:

L_2^(2) = -(1/C_2) Σ_{k: ỹ_i^k = 1} log [ exp(z_i^4 · c_k^2 / τ) / Σ_{k'=1..K} exp(z_i^4 · c_{k'}^2 / τ) ]

wherein L_2^(2) is the second loss function value for picture x_i^2 in the category prototype contrastive learning module of M_2.
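The momentum update of a category prototype feature is an exponential moving average; a sketch (the function name is illustrative):

```python
def momentum_update(prototype, feature, m=0.9):
    # c_k <- m * c_k + (1 - m) * z: move the prototype of category k
    # slowly toward the current feature vector z, so that it tracks a
    # smoothed average of the features seen for that category.
    return [m * c + (1.0 - m) * z for c, z in zip(prototype, feature)]

c_k = [0.0, 0.0]
c_k = momentum_update(c_k, [1.0, 1.0], m=0.9)
```

A momentum m close to 1 makes the prototypes change slowly, which keeps the contrast targets stable across training steps.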
Preferably, the specific method of step S3.5 is as follows:

inputting the first feature f_i^11 into the classifier of the first label correction sub-model M_1 and calculating the output classification probability of picture x_i^1, specifically:

p_i^1 = σ(g(f_i^11))

wherein p_i^1 is the classification probability of picture x_i^1, σ is the sigmoid function, and g is the confidence score calculation function of the classifier;

inputting the fourth feature f_i^22 into the classifier of the second label correction sub-model M_2 and calculating the output classification probability of picture x_i^2, specifically:

p_i^2 = σ(g(f_i^22))

wherein p_i^2 is the classification probability of picture x_i^2, σ is the sigmoid function, and g is the confidence score calculation function of the classifier.
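The classification probability p = σ(g(f)) applies a sigmoid to the classifier's per-category confidence scores, so each of the K categories receives an independent probability, as is standard for multi-label classification. A sketch with raw scores standing in for g(f):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classification_probabilities(scores):
    # Per-category probabilities p = sigmoid(g(f)); the scores play
    # the role of the classifier's confidence values g(f).
    return [sigmoid(s) for s in scores]

p = classification_probabilities([0.0, 4.0, -4.0])
```

Unlike softmax, the sigmoid does not force the probabilities to sum to 1, so several categories can simultaneously be predicted positive.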
Preferably, the specific method of step S3.6 is as follows:

inputting the classification probability p_i^1 of picture x_i^1 into the label correction module of the first label correction sub-model M_1, setting a first threshold T_1, a second threshold T_2, a third threshold T_3 and a fourth threshold T_4, and dynamically updating the four thresholds with the preset momentum m;

determining the value of the binary noise label n_i^1 according to the updated third threshold T_3, the fourth threshold T_4 and the classification probability p_i^1 of picture x_i^1;

obtaining the intermediate label ȳ_i^1 of picture x_i^1 according to the updated first threshold T_1 and second threshold T_2;

when the noise label n_i^1 = 1, replacing the pseudo label ỹ_i^1 of picture x_i^1 with the intermediate label ȳ_i^1 as the correction label ŷ_i^1 of picture x_i^1; when the noise label n_i^1 = 0, retaining the pseudo label ỹ_i^1 of picture x_i^1 as the correction label ŷ_i^1;

inputting the classification probability p_i^2 of picture x_i^2 into the label correction module of the second label correction sub-model M_2;

determining the value of the binary noise label n_i^2 according to the updated third threshold T_3, the fourth threshold T_4 and the classification probability p_i^2 of picture x_i^2;

obtaining the intermediate label ȳ_i^2 of picture x_i^2 according to the updated first threshold T_1 and second threshold T_2;

when the noise label n_i^2 = 1, replacing the pseudo label ỹ_i^2 of picture x_i^2 with the intermediate label ȳ_i^2 as the correction label ŷ_i^2 of picture x_i^2; when the noise label n_i^2 = 0, retaining the pseudo label ỹ_i^2 as the correction label ŷ_i^2;

the third loss function L_3 is:

L_3 = (1/N) Σ_{i=1..N} ℓ_BCE^(i)

wherein ℓ_BCE^(i) is the binary cross-entropy loss of the i-th picture.
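The threshold-based correction rule can be illustrated per category with a simplified two-threshold version (the patent uses four dynamically updated thresholds plus a binary noise flag; collapsing them into one upper and one lower bound is a deliberate simplification for illustration):

```python
def correct_label(prob, pseudo, t_low, t_high):
    # For one category: if the predicted probability exceeds the upper
    # threshold, the corrected label is forced to 1; if it falls below
    # the lower threshold, it is forced to 0; otherwise the original
    # pseudo label is kept (treated as clean).
    if prob >= t_high:
        return 1
    if prob <= t_low:
        return 0
    return pseudo
```

Bounding the correction from above and below in this way is what weakens the label noise while preventing the network from overfitting to it: confident predictions override the pseudo label, uncertain ones leave it untouched.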
Preferably, the total loss function L in step S3.7 is:

L = L_3 + λ_1 · L_1 + λ_2 · L_2

wherein L is the total loss function value, λ_1 is the balance factor of the first loss function L_1, and λ_2 is the balance factor of the second loss function L_2.
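The total loss is a plain weighted sum of the three terms; a one-line sketch (the default balance-factor values are placeholders, not from the patent):

```python
def total_loss(l1, l2, l3, lambda1=0.5, lambda2=0.5):
    # L = L3 + lambda1 * L1 + lambda2 * L2, where lambda1 and lambda2
    # are the balance factors of the two contrastive losses.
    return l3 + lambda1 * l1 + lambda2 * l2
```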
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:

The invention provides a multi-label image recognition method under noisy data based on deep learning, which comprises the steps of acquiring a multi-label noisy data set and preprocessing it; establishing a dual-branch multi-label correction neural network model; inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model; and acquiring a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model to obtain its correction label, and carrying out image recognition on it according to the correction label.

With the present invention, related pictures can be collected from the internet as data sets according to the user's specific application; the dual-branch network is trained and a model supporting multi-label picture classification is constructed, so that label correction and image recognition can be carried out on multi-label noisy data sets, manpower and material costs are saved, and efficient utilization of data resources is realized. The invention also provides a contrastive learning method with which the branch networks, while remaining different, can learn some common representations from each other; when classifying pictures, the predictions of the two models are averaged, making the result more robust. In addition, the invention sets upper and lower bounds on the predicted values of training pictures and changes the labels of pictures whose predicted values exceed the upper bound or fall below the lower bound, thereby weakening noise and avoiding overfitting to the noise.
Drawings
Fig. 1 is a flowchart of a multi-label image recognition method under noisy data based on deep learning according to embodiment 1.
Fig. 2 is a contrastive learning training flowchart of the dual-branch multi-label correction neural network model provided in embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, the embodiment provides a multi-label image recognition method under noisy data based on deep learning, which includes the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
S4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, firstly, a multi-label noisy data set is obtained and preprocessed; in this embodiment, the multi-label noisy data set is obtained from the internet. A double-branch multi-label correction neural network model is then established, and the preprocessed multi-label noisy data set is input into the double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model. Finally, a noisy picture to be corrected is obtained and corrected with the optimized double-branch multi-label correction neural network model to obtain its correction label, and image recognition is carried out on it according to the correction label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to specific application of a user, the dual-branch network is trained, a model supporting classification of the multi-label pictures is constructed, label correction can be carried out on the multi-label noisy data sets, the cost of manpower and material resources is saved, and efficient utilization of data resources is achieved.
Example 2
The embodiment provides a multi-label image recognition method based on deep learning under noisy data, which comprises the following steps:
s1: acquiring a multi-label noisy data set and preprocessing;
s2: establishing a double-branch multi-label correction neural network model;
s3: inputting the preprocessed multi-label noisy data set into a double-branch multi-label correction neural network model for contrastive learning training to obtain an optimized double-branch multi-label correction neural network model;
s4: obtaining a noise-containing picture to be corrected, correcting the noise-containing picture to be corrected by using the optimized double-branch multi-label correction neural network model, obtaining a correction label of the noise-containing picture to be corrected, and carrying out image recognition on the noise-containing picture to be corrected according to the correction label.
In the specific implementation process, firstly acquiring and preprocessing a multi-label noisy data set, acquiring the multi-label noisy data set according to preset K multi-label classification categories K, dividing the acquired multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, and each picture is marked with a pseudo label
Figure SMS_228
The training set is marked as X, and the specific method is as follows:
microsoft COCO and Pascal VOC are the two most widely used datasets in evaluating the MLR algorithm, where the Microsoft COCO dataset contains 80 categories and the Pascal VOC dataset contains 20 categories, in this embodiment, the 80 categories contained in the Microsoft COCO dataset are selected to construct the Web-COCO and Web-Pascal datasets, with one or more categories selected randomly as keywords, such as: "person" or "person, truck, bus";
Searching corresponding pictures from a search engine, wherein the pictures comprise google, hundred degrees and necessary pictures, and taking more than 500000 obtained noisy pictures as a multi-label noisy data set;
incomplete and duplicate pictures are then removed, the remaining 290000 noisy pictures are used to construct the Web-COCO data set, and the pictures containing at least one of the 20 Pascal VOC categories are further selected to construct the Web-Pascal data set;
the Web-COCO data set contains 290000 pictures, and each picture is assigned a pseudo label y_i according to its category keywords; 20000 pictures are randomly selected for manual annotation to give them a more accurate and complete description;
the Web-COCO data set has the following drawbacks. First, label noise: when data are retrieved from the web, label noise is inevitable. In the multi-label pictures of this embodiment, label noise arises when a picture contains information of many categories but the corresponding keywords do not cover these categories, which leads to wrong or missing labels. To better characterize the noisy pictures, the accuracy and recall of each class were calculated; the results show an average recall of 46.1% and an average accuracy of 64.6%, indicating severe label noise in the data set;
another drawback is semantic dispersion: a multi-label image contains multiple semantic objects spread across the image, so it is necessary to find the corresponding semantic regions to help recover missing labels, while examining the whole image also helps correct wrong positive labels;
a third drawback is category imbalance, which is common in the real world and even more severe in multi-label pictures retrieved from the web; for example, the most frequent category, "person", accounts for 15% of the pictures, while the 20 least frequent categories together account for only 5% of the total. To evaluate the WS-MLR task, Web-COCO is used as the training set and Microsoft COCO, which contains 40,504 fully manually annotated images, as the verification set;
the Web-Pascal data set comprises 236043 pictures using the 20 categories of the Pascal VOC data set; similarly to the Web-COCO data set, the Web-Pascal data set also suffers from label noise, semantic dispersion and category imbalance. Likewise, the 4952 manually annotated pictures in the Web-Pascal data set are used as the verification set and the other pictures as the training set;
the training set is then divided into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 = {(x_i^1, y_i^1)}, D2 = {(x_i^2, y_i^2)}, and (x_i, y_i) represents the i-th picture x_i and its corresponding pseudo label y_i;
the length and width data and the pseudo label y_i of each picture in each sub-training set are determined, the length of a picture being denoted H and its width W; this completes the preprocessing of the multi-label noisy data set;
the value of the pseudo label y_i of each picture in each sub-training set is determined as follows: whether the picture belongs to a preset multi-label classification category k is judged; if so, the value y_i^k of the pseudo label of the i-th picture with respect to category k is set to 1, and otherwise to 0;
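The keyword-to-pseudo-label rule above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the category list is a toy stand-in for the 80 Microsoft COCO categories, and `make_pseudo_label` is a hypothetical helper name.

```python
# Illustrative sketch of assigning a keyword-based pseudo label.
# CATEGORIES is a toy stand-in for the K classification categories.
CATEGORIES = ["person", "truck", "bus"]

def make_pseudo_label(search_keywords, categories=CATEGORIES):
    """Return a binary pseudo label: component k is 1 if the picture was
    retrieved under a keyword matching category k, otherwise 0."""
    keywords = {k.strip() for k in search_keywords.split(",")}
    return [1 if c in keywords else 0 for c in categories]
```

A picture retrieved under the keywords "person, bus" would thus receive the pseudo label [1, 0, 1] for this toy category list.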
as shown in fig. 2, a dual-branch multi-label correction neural network model is established, comprising a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 are identical in structure but have different model parameters;
the first label correction sub-model M1 or the second label correction sub-model M2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence;
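The dual-branch arrangement above can be sketched structurally. The callables here are placeholders, not real networks; the class and method names are assumptions for illustration only.

```python
# Structural sketch of the dual-branch model: two branches with identical
# structure but independent parameters, whose predictions are averaged.
class LabelCorrectionBranch:
    def __init__(self, feature_extractor, classifier):
        # in the described model the modules are connected in sequence:
        # feature extractor -> contrastive modules -> classifier -> correction
        self.feature_extractor = feature_extractor
        self.classifier = classifier

    def forward(self, picture):
        features = self.feature_extractor(picture)
        return self.classifier(features)

class DualBranchModel:
    def __init__(self, make_branch):
        self.m1 = make_branch()  # first label correction sub-model M1
        self.m2 = make_branch()  # second label correction sub-model M2

    def predict(self, picture):
        # average the two branches' predictions for a more robust result
        p1 = self.m1.forward(picture)
        p2 = self.m2.forward(picture)
        return [(a + b) / 2.0 for a, b in zip(p1, p2)]
```

Calling `make_branch` twice gives the two branches the same structure but separately initialized parameters, matching the description above.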
the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model; the specific method comprises the following steps:
S3.1: a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 are jointly input into the dual-branch multi-label correction neural network model, where the index i satisfies 1 ≤ i ≤ N/2;
S3.2: the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 are used to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: the first feature and the second feature are jointly input into the instance contrastive learning module of the first label correction sub-model M1, and the third feature and the fourth feature are jointly input into the instance contrastive learning module of the second label correction sub-model M2; first contrastive learning is performed between the first feature and the third feature of picture x_i^1, and between the second feature and the fourth feature of picture x_i^2; a first loss function is set, and the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2 are updated, specifically:
the first feature and the second feature are jointly input into the instance contrastive learning module of the first label correction sub-model M1, and the third feature and the fourth feature are jointly input into the instance contrastive learning module of the second label correction sub-model M2;
for picture x_i^1, the corresponding first feature vector and second feature vector are calculated from the first feature and the third feature, where C1 is the number of pseudo labels of picture x_i^1 and the j-th of the C1 feature vectors of picture x_i^1 is denoted v_j;
a first positive sample pair is constructed from the first feature vector and the second feature vector, and a first cyclic queue of length R1 is constructed, with R1 = 8192 in this embodiment; first negative sample pairs are constructed from the first cyclic queue, and first contrastive learning is performed using the constructed first positive sample pair and first negative sample pairs;
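The positive-pair and queue-based negative-pair bookkeeping above can be sketched as follows. This is a hedged sketch under assumptions: `PairBank` and `build_pairs` are illustrative names, and the queue length is 3 here instead of the embodiment's R1 = 8192.

```python
from collections import deque

# A positive pair couples the two feature vectors of the same picture;
# negatives pair the current vector with entries of a fixed-length cyclic
# queue of past vectors, whose oldest entries are dropped when full.
class PairBank:
    def __init__(self, maxlen=3):
        self.queue = deque(maxlen=maxlen)  # cyclic: oldest entries evicted

    def build_pairs(self, v1, v2):
        """v1, v2: the first and second feature vectors of one picture."""
        positive = (v1, v2)
        negatives = [(v1, q) for q in self.queue]
        self.queue.append(v2)  # enqueue for use as a future negative
        return positive, negatives
```

Because `deque(maxlen=...)` evicts the oldest entry on overflow, the queue behaves as the fixed-length cyclic sequence the description calls for.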
a first loss function is set and the parameters of the instance contrastive learning module of the first label correction sub-model M1 are updated; the first loss value for picture x_i^1 is computed in the instance contrastive learning module of the first label correction sub-model M1 from the total number of categories required for multi-label classification, the number of categories corresponding to picture x_i^1, a temperature coefficient, and the dimension-reduced feature vectors of picture x_i^1; in this embodiment, the dimension-reduced feature vectors have dimension 128 and the feature vectors before dimension reduction have dimension 2048;
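The first loss described above, with a positive pair, queue negatives and a temperature coefficient, suggests an InfoNCE-style contrastive loss; the exact formula appears only in the original figures, so the following is an assumption-labeled sketch, with `info_nce` a hypothetical helper name.

```python
import math

# InfoNCE-style contrastive loss sketch: -log of the softmax probability
# (with temperature tau) that the positive pair is selected over the
# negative pairs drawn from the cyclic queue.
def info_nce(pos_sim, neg_sims, tau=0.07):
    """pos_sim: similarity of the positive pair; neg_sims: similarities
    of the negative pairs."""
    numerator = math.exp(pos_sim / tau)
    denominator = numerator + sum(math.exp(s / tau) for s in neg_sims)
    return -math.log(numerator / denominator)
```

The loss decreases as the positive similarity grows relative to the negatives, which is the behavior the instance contrastive learning module relies on.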
for picture x_i^2, the corresponding third feature vector and fourth feature vector are calculated from the second feature and the fourth feature, where C2 is the number of pseudo labels of picture x_i^2 and the j-th of the C2 feature vectors of picture x_i^2 is denoted v_j;
a second positive sample pair is constructed from the third feature vector and the fourth feature vector, and a second cyclic queue of length R2 is constructed, with R2 = 8192 in this embodiment; second negative sample pairs are constructed from the second cyclic queue, and first contrastive learning is performed using the constructed second positive sample pair and second negative sample pairs;
a first loss function is set and the parameters of the instance contrastive learning module of the second label correction sub-model M2 are updated; the first loss value for picture x_i^2 is computed analogously in the instance contrastive learning module of the second label correction sub-model M2, from the total number of categories required for multi-label classification, the number of categories corresponding to picture x_i^2, and the dimension-reduced feature vectors of picture x_i^2;
S3.4: the first feature is input into the category prototype contrastive learning module of the first label correction sub-model M1 and second contrastive learning is performed with a preset first category prototype feature; the fourth feature is input into the category prototype contrastive learning module of the second label correction sub-model M2 and second contrastive learning is performed with a preset second category prototype feature; a second loss function is set, and the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2 are updated, specifically:
the first feature is input into the category prototype contrastive learning module of the first label correction sub-model M1, second contrastive learning is performed between the first feature vector of picture x_i^1 and the first category prototype feature, and the first category prototype feature is updated with the momentum method:
p_k' = m · p_k + (1 − m) · v_k,
where p_k' is the updated first category prototype feature corresponding to the k-th category, p_k is the first category prototype feature corresponding to the k-th category before the update, v_k is the current feature vector of category k, and m is a preset momentum;
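The momentum update of a category prototype can be sketched directly; `momentum_update` is an illustrative helper name, and the prototypes are plain lists here rather than tensors.

```python
# Momentum update of a category prototype feature:
# new prototype = m * old prototype + (1 - m) * current class feature.
def momentum_update(prototype, feature, m=0.9):
    return [m * p + (1.0 - m) * f for p, f in zip(prototype, feature)]
```

With a momentum close to 1 the prototype drifts slowly toward the current class feature, which keeps it stable across noisy mini-batches.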
a second loss function is set and the parameters of the category prototype contrastive learning module of the first label correction sub-model M1 are updated, where the second loss value is computed for picture x_i^1 in the category prototype contrastive learning module of the first label correction sub-model M1;
the fourth feature is input into the category prototype contrastive learning module of the second label correction sub-model M2, second contrastive learning is performed between the second feature vector of picture x_i^2 and the second category prototype feature, and the second category prototype feature is updated with the momentum method in the same way, the updated second category prototype feature of each category being obtained from the second category prototype feature of that category before the update;
a second loss function is set and the parameters of the category prototype contrastive learning module of the second label correction sub-model M2 are updated, where the second loss value is computed for picture x_i^2 in the category prototype contrastive learning module of the second label correction sub-model M2;
S3.5: the first feature is input into the classifier of the first label correction sub-model M1 to calculate and output the classification probability of picture x_i^1, and the fourth feature is input into the classifier of the second label correction sub-model M2 to calculate and output the classification probability of picture x_i^2, specifically:
the first feature is input into the classifier of the first label correction sub-model M1, and the classification probability of picture x_i^1 is calculated as p_i^1 = σ(s(·)), where p_i^1 is the classification probability of picture x_i^1, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier;
the fourth feature is input into the classifier of the second label correction sub-model M2, and the classification probability of picture x_i^2 is calculated as p_i^2 = σ(s(·)), where p_i^2 is the classification probability of picture x_i^2, σ is the sigmoid function and s(·) is the confidence score calculation function of the classifier;
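The classifier output described above (sigmoid over per-class confidence scores) can be sketched as follows; the score values themselves would come from the classifier head, which is not reproduced here.

```python
import math

# Per-class classification probability: the sigmoid function applied to
# each class's confidence score from the classifier.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classification_probability(scores):
    """scores: per-class confidence scores from the classifier head."""
    return [sigmoid(s) for s in scores]
```

Unlike a softmax, the sigmoid treats each class independently, which is what multi-label classification requires.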
S3.6: the classification probability of picture x_i^1 is input into the label correction module of the first label correction sub-model M1, and the pseudo label of picture x_i^1 is corrected to obtain the corrected label of picture x_i^1; the classification probability of picture x_i^2 is input into the label correction module of the second label correction sub-model M2, and the pseudo label of picture x_i^2 is corrected to obtain the corrected label of picture x_i^2; a third loss function is set, and the cross entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2 are calculated respectively to update the parameters, specifically:
the classification probability of picture x_i^1 is input into the label correction module of the first label correction sub-model M1, in which a first threshold, a second threshold, a third threshold and a fourth threshold are set; the four thresholds are dynamically updated with a preset momentum m;
according to the updated third threshold and fourth threshold and the classification probability of picture x_i^1, the value of a binary noise label is determined;
according to the updated first threshold and second threshold, an intermediate label of picture x_i^1 is obtained;
when the noise label indicates noise, the intermediate label of picture x_i^1 replaces the pseudo label of picture x_i^1 as the corrected label of picture x_i^1; when the noise label indicates no noise, the pseudo label of picture x_i^1 is retained as the corrected label of picture x_i^1; this is the specific correction process in the label correction module of the first label correction sub-model M1;
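The threshold-based correction rule can be sketched as follows. This is a hedged sketch: the threshold values are illustrative assumptions (the embodiment updates its thresholds dynamically with momentum), and `correct_labels` is a hypothetical helper name.

```python
# A class probability above the upper threshold or below the lower one is
# treated as confident enough to overwrite the pseudo label; otherwise
# the pseudo label is kept.
def correct_labels(pseudo, probs, upper=0.8, lower=0.2):
    corrected = []
    for y, p in zip(pseudo, probs):
        if p >= upper:
            corrected.append(1)  # confident positive prediction
        elif p <= lower:
            corrected.append(0)  # confident negative prediction
        else:
            corrected.append(y)  # uncertain: keep the pseudo label
    return corrected
```

Prescribing upper and lower bounds in this way is what weakens noise and prevents the model from overfitting to wrong pseudo labels.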
the classification probability of picture x_i^2 is input into the label correction module of the second label correction sub-model M2;
according to the updated third threshold and fourth threshold and the classification probability of picture x_i^2, the value of a binary noise label is determined;
according to the updated first threshold and second threshold, an intermediate label of picture x_i^2 is obtained;
when the noise label indicates noise, the intermediate label of picture x_i^2 replaces the pseudo label of picture x_i^2 as the corrected label of picture x_i^2; when the noise label indicates no noise, the pseudo label of picture x_i^2 is retained as the corrected label of picture x_i^2;
the third loss function accumulates, over the training pictures, the binary cross entropy loss of the i-th picture;
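The per-picture binary cross entropy that the third loss accumulates can be sketched as follows; `bce` and `third_loss` are illustrative names, and the clamping constant is an implementation assumption for numerical safety.

```python
import math

# Binary cross entropy between a (corrected) label and the predicted
# per-class probability, averaged over one picture's classes.
def bce(y, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def third_loss(labels, probs):
    """Mean binary cross entropy over one picture's labels."""
    return sum(bce(y, p) for y, p in zip(labels, probs)) / len(labels)
```

Training against the corrected labels rather than the raw pseudo labels is what lets this cross entropy term drive the classifier away from the noise.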
S3.7: according to the first loss function, the second loss function and the third loss function, a total loss function is set and the parameters of the dual-branch multi-label correction neural network model are updated to obtain the optimized dual-branch multi-label correction neural network model;
the total loss function combines the three losses, weighted by a balance factor of the first loss function and a balance factor of the second loss function; in this embodiment both balance factors are set to preset constants;
finally, a noisy picture to be corrected is obtained, the optimized dual-branch multi-label correction neural network model is used to correct it and obtain its corrected label, and image recognition is carried out on the noisy picture according to the corrected label;
according to the method and the device, related pictures can be collected from the Internet as data sets according to the user's specific application, the dual-branch network is trained, and a model supporting classification of multi-label pictures is constructed; labels of the multi-label noisy data set can be corrected, saving manpower and material costs and making efficient use of data resources. The invention also provides a contrastive learning method by which the two branch networks, while remaining different, can learn common representations from each other, and the predictions of the two models are averaged when classifying pictures, making the result more robust. In addition, the invention prescribes upper and lower bounds on the predicted values of training pictures and changes the labels of pictures whose predicted values exceed or fall below the thresholds, thereby weakening noise and avoiding overfitting to it.
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
it is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is not necessary here nor is it exhaustive of all embodiments. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are desired to be protected by the following claims.

Claims (10)

1. A multi-label image recognition method based on deep learning under noisy data, characterized by comprising the following steps:
S1: acquiring a multi-label noisy data set and preprocessing it;
S2: establishing a dual-branch multi-label correction neural network model;
S3: inputting the preprocessed multi-label noisy data set into the dual-branch multi-label correction neural network model for contrastive learning training to obtain an optimized dual-branch multi-label correction neural network model;
S4: obtaining a noisy picture to be corrected, correcting it with the optimized dual-branch multi-label correction neural network model to obtain its corrected label, and carrying out image recognition on the noisy picture according to the corrected label.
2. The multi-label image recognition method based on deep learning under noisy data according to claim 1, wherein in step S1 the specific method for acquiring and preprocessing the multi-label noisy data set is as follows:
acquiring the multi-label noisy data set according to K preset multi-label classification categories k;
dividing the obtained multi-label noisy data set into a training set and a verification set, wherein the training set comprises N pictures, each picture is marked with a pseudo label y_i, and the training set is denoted X; dividing the training set into a first sub-training set D1 and a second sub-training set D2 with the same number of pictures, wherein D1 = {(x_i^1, y_i^1)}, D2 = {(x_i^2, y_i^2)}, and (x_i, y_i) represents the i-th picture x_i and its corresponding pseudo label y_i;
determining the length and width data and the pseudo label y_i of each picture in each sub-training set, the length of a picture being denoted H and its width W, thereby completing the preprocessing of the multi-label noisy data set.
3. The multi-label image recognition method based on deep learning under noisy data according to claim 2, wherein the value of the pseudo label y_i of each picture in each sub-training set is determined as follows: judging whether the picture belongs to a preset multi-label classification category k; if so, the value y_i^k of the pseudo label of the i-th picture with respect to category k is set to 1, and otherwise to 0.
4. The multi-label image recognition method based on deep learning under noisy data according to claim 1, wherein the dual-branch multi-label correction neural network model in step S2 is specifically as follows:
the dual-branch multi-label correction neural network model comprises a first label correction sub-model M1 and a second label correction sub-model M2 arranged in parallel; the first label correction sub-model M1 and the second label correction sub-model M2 are identical in structure but have different model parameters;
the first label correction sub-model M1 or the second label correction sub-model M2 comprises a feature extractor, an instance contrastive learning module, a category prototype contrastive learning module, a classifier and a label correction module connected in sequence.
5. The multi-label image recognition method based on deep learning under noisy data according to claim 4, wherein in step S3 the preprocessed multi-label noisy data set is input into the dual-branch multi-label correction neural network model for contrastive learning training to obtain the optimized dual-branch multi-label correction neural network model, the specific method comprising the following steps:
S3.1: jointly inputting a picture x_i^1 in the first sub-training set D1 and a picture x_i^2 in the second sub-training set D2 into the dual-branch multi-label correction neural network model, where the index i satisfies 1 ≤ i ≤ N/2;
S3.2: using the feature extractors of the first label correction sub-model M1 and the second label correction sub-model M2 to extract features from the input pictures x_i^1 and x_i^2, obtaining a first feature, a second feature, a third feature and a fourth feature;
S3.3: jointly inputting the first feature and the second feature into the instance contrastive learning module of the first label correction sub-model M1, and jointly inputting the third feature and the fourth feature into the instance contrastive learning module of the second label correction sub-model M2; performing first contrastive learning between the first feature and the third feature of picture x_i^1 and between the second feature and the fourth feature of picture x_i^2; setting a first loss function and updating the parameters of the instance contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.4: inputting the first feature into the category prototype contrastive learning module of the first label correction sub-model M1 and performing second contrastive learning with a preset first category prototype feature; inputting the fourth feature into the category prototype contrastive learning module of the second label correction sub-model M2 and performing second contrastive learning with a preset second category prototype feature; setting a second loss function and updating the parameters of the category prototype contrastive learning modules of the first label correction sub-model M1 and the second label correction sub-model M2;
S3.5: inputting the first feature into the classifier of the first label correction sub-model M1 to calculate and output the classification probability of picture x_i^1; inputting the fourth feature into the classifier of the second label correction sub-model M2 to calculate and output the classification probability of picture x_i^2;
S3.6: inputting the classification probability of picture x_i^1 into the label correction module of the first label correction sub-model M1, correcting the pseudo label of picture x_i^1 to obtain its corrected label; inputting the classification probability of picture x_i^2 into the label correction module of the second label correction sub-model M2, correcting the pseudo label of picture x_i^2 to obtain its corrected label; setting a third loss function and calculating the cross entropy losses of the label correction modules of the first label correction sub-model M1 and the second label correction sub-model M2 respectively to update the parameters;
S3.7: according to the first loss function, the second loss function and the third loss function, setting a total loss function and updating the parameters of the dual-branch multi-label correction neural network model to obtain the optimized dual-branch multi-label correction neural network model.
6. The method for identifying multi-label image under noisy data based on deep learning according to claim 5, wherein the specific method in step S3.3 is as follows:
will first feature
Figure QLYQS_57
And second feature->
Figure QLYQS_58
Common input of first tag modifier sub-model M 1 Is to add the third feature +.>
Figure QLYQS_59
And fourth feature->
Figure QLYQS_60
Common input of a second tag modifier sub-model M 2 An instance comparison learning module of (a);
for pictures
Figure QLYQS_61
According to the first feature->
Figure QLYQS_62
And third feature->
Figure QLYQS_63
Calculating corresponding first eigenvector->
Figure QLYQS_64
And a second feature vector->
Figure QLYQS_65
The method specifically comprises the following steps:
Figure QLYQS_66
wherein ,C1 For pictures
Figure QLYQS_67
Is a pseudo tag number of (a); />
Figure QLYQS_68
Representing picture->
Figure QLYQS_69
C of (2) 1 A j-th feature vector; />
The obtained first feature vector
Figure QLYQS_70
Satisfy->
Figure QLYQS_71
Second feature vector->
Figure QLYQS_72
Satisfy->
Figure QLYQS_73
According to the first feature vector
Figure QLYQS_74
And a second feature vector->
Figure QLYQS_77
Constructing a first positive sample pair
Figure QLYQS_79
And constructs the first circulation sequence +.>
Figure QLYQS_75
Satisfy->
Figure QLYQS_78
,R 1 For the first cycle sequence- >
Figure QLYQS_80
According to the first cyclic sequence +.>
Figure QLYQS_81
Constructing a first negative sample pair
Figure QLYQS_76
Performing first contrast learning by using the constructed first positive sample pair and the first negative sample pair;
setting a first loss function
Figure QLYQS_82
Modifying the first label sub-model M 1 The example comparison learning module of (1) performs parameter updating, specifically:
Figure QLYQS_83
wherein ,
Figure QLYQS_85
modifying the submodel M for the first tag 1 In the example contrast learning module of (1), for pictures +.>
Figure QLYQS_87
Is a first loss function value,/>
Figure QLYQS_90
For picture->
Figure QLYQS_86
Total number of categories required for multi-tag classification, +.>
Figure QLYQS_89
For picture->
Figure QLYQS_92
Corresponding->
Figure QLYQS_93
Category (S),>
Figure QLYQS_84
is a temperature coefficient>
Figure QLYQS_88
For picture->
Figure QLYQS_91
Is>
Figure QLYQS_94
The 1 st feature vector after dimension reduction;
for pictures
Figure QLYQS_95
According to the second feature->
Figure QLYQS_96
And fourth feature->
Figure QLYQS_97
Calculating corresponding third eigenvector->
Figure QLYQS_98
And fourth feature vector->
Figure QLYQS_99
The method specifically comprises the following steps:
Figure QLYQS_100
wherein ,C2 For pictures
Figure QLYQS_101
Is a pseudo tag number of (a); />
Figure QLYQS_102
Representing picture->
Figure QLYQS_103
C of (2) 2 A j-th feature vector;
the obtained third feature vector
Figure QLYQS_104
Satisfy->
Figure QLYQS_105
Fourth feature vector->
Figure QLYQS_106
Satisfy the following requirements
Figure QLYQS_107
According to the third feature vector
Figure QLYQS_110
And fourth feature vector->
Figure QLYQS_112
Construction of a second positive sample pair->
Figure QLYQS_114
And constructing a second circulation sequence +.>
Figure QLYQS_109
Satisfy->
Figure QLYQS_111
,R 2 For the second cycle sequence->
Figure QLYQS_113
According to the second cyclic sequence +. >
Figure QLYQS_115
Construction of a second negative sample pair +.>
Figure QLYQS_108
Performing first contrast learning by using the constructed second positive sample pair and the second negative sample pair;
setting a first loss function [formula] and updating the parameters of the instance contrast learning module of the second label correction sub-model M2, specifically:

[formula]

wherein [formula] is, in the instance contrast learning module of the second label correction sub-model M2, the first loss function value for picture [formula]; [formula] is the total number of categories required for multi-label classification; [formula] is the [formula]-th category corresponding to picture [formula]; and [formula] is the 2nd dimension-reduced feature vector of picture [formula].
7. The multi-label image recognition method based on deep learning under noisy data according to claim 6, wherein step S3.4 specifically comprises:
inputting the first feature [formula] into the category prototype contrast learning module of the first label correction sub-model M1; performing second contrast learning between the first feature vector [formula] of picture [formula] and the first category prototype feature [formula], and updating the first category prototype feature [formula] by a momentum method:

[formula]

wherein [formula] is the updated first category prototype feature corresponding to the k-th category, [formula] is the first category prototype feature corresponding to the k-th category, and m is a preset momentum;

setting a second loss function [formula] and updating the parameters of the category prototype contrast learning module of the first label correction sub-model M1, specifically:

[formula]

wherein [formula] is, in the category prototype contrast learning module of the first label correction sub-model M1, the second loss function value for picture [formula];
inputting the fourth feature [formula] into the category prototype contrast learning module of the second label correction sub-model M2; performing second contrast learning between the second feature vector [formula] of picture [formula] and the second category prototype feature [formula], and updating the second category prototype feature [formula] by the momentum method:

[formula]

wherein [formula] is the updated second category prototype feature corresponding to the [formula]-th category, and [formula] is the second category prototype feature corresponding to the [formula]-th category;

setting a second loss function [formula] and updating the parameters of the category prototype contrast learning module of the second label correction sub-model M2, specifically:

[formula]

wherein [formula] is, in the category prototype contrast learning module of the second label correction sub-model M2, the second loss function value for picture [formula].
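The momentum update of a class prototype feature described in claim 7 can be sketched as follows. The claim's exact formula is an image, so the standard exponential-moving-average form c_k ← m·c_k + (1−m)·v is assumed here, with unit-normalization as a common design choice rather than a claimed step.

```python
import numpy as np

def update_prototype(prototype, feature, m=0.9):
    """Momentum (EMA) update of a class prototype feature.

    prototype: (D,) current prototype for class k
    feature:   (D,) feature vector of a picture assigned to class k
    m:         preset momentum in [0, 1)
    """
    new_proto = m * prototype + (1.0 - m) * feature
    # Re-normalize so prototypes stay on the unit sphere (assumption).
    return new_proto / np.linalg.norm(new_proto)
```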
8. The multi-label image recognition method based on deep learning under noisy data according to claim 7, wherein step S3.5 specifically comprises:

inputting the first feature [formula] into the classifier of the first label correction sub-model M1, and calculating the output classification probability of picture [formula], specifically:

[formula]

wherein [formula] is the classification probability of picture [formula], [formula] is the sigmoid function, and [formula] is the confidence score calculation function of the classifier;

inputting the fourth feature [formula] into the classifier of the second label correction sub-model M2, and calculating the output classification probability of picture [formula], specifically:

[formula]

wherein [formula] is the classification probability of picture [formula], [formula] is the sigmoid function, and [formula] is the confidence score calculation function of the classifier.
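Claim 8's per-class classification probability (a sigmoid over the classifier's confidence scores) can be sketched as below. The claim does not disclose the confidence score function, so a linear head `W @ feature + b` is assumed purely for illustration.

```python
import numpy as np

def classification_probability(feature, W, b):
    """Multi-label classification probability: an independent sigmoid
    per class over confidence scores (linear scores assumed here).

    feature: (D,) input feature vector
    W: (C, D) weight matrix, b: (C,) bias -- hypothetical parameters
    """
    scores = W @ feature + b                 # (C,) confidence scores
    return 1.0 / (1.0 + np.exp(-scores))     # sigmoid, one value per class
```

With all-zero scores each class probability is exactly 0.5, the usual decision boundary for multi-label outputs.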
9. The multi-label image recognition method based on deep learning under noisy data according to claim 8, wherein step S3.6 specifically comprises:

inputting the classification probability [formula] of picture [formula] into the label correction module of the first label correction sub-model M1, which sets a first threshold [formula], a second threshold [formula], a third threshold [formula], and a fourth threshold [formula], and dynamically updating the four thresholds with a preset momentum m;

determining the value of the binary noise label [formula] from the updated third threshold [formula] and fourth threshold [formula] together with the classification probability [formula] of picture [formula];

obtaining the intermediate label [formula] of picture [formula] from the updated first threshold [formula] and second threshold [formula];

when the noise label [formula], replacing the pseudo label [formula] of picture [formula] with the intermediate label [formula] of picture [formula] as the [formula] of picture [formula];

when the noise label [formula], retaining the pseudo label [formula] of picture [formula] as the [formula] of picture [formula];
inputting the classification probability [formula] of picture [formula] into the label correction module of the second label correction sub-model M2;

determining the value of the binary noise label [formula] from the updated third threshold [formula] and fourth threshold [formula] together with the classification probability [formula] of picture [formula];

obtaining the intermediate label [formula] of picture [formula] from the updated first threshold [formula] and second threshold [formula];

when the noise label [formula], replacing the pseudo label [formula] of picture [formula] with the intermediate label [formula] of picture [formula] as the [formula] of picture [formula];

when the noise label [formula], retaining the pseudo label [formula] of picture [formula] as the [formula] of picture [formula];

the third loss function [formula] is:

[formula]

wherein [formula] is the binary cross-entropy loss of the i-th picture.
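The threshold-based label correction and the binary cross-entropy (third) loss of claim 9 can be sketched for a single class as follows. The claim's decision rules are formula images, so one plausible reading is assumed: a binary noise flag is raised when the classifier's probability confidently contradicts the pseudo label (third/fourth thresholds), and the label is then replaced by an intermediate label derived from the probability (first/second thresholds). All parameter names are hypothetical.

```python
import numpy as np

def correct_label(prob, pseudo_label, thr1, thr2, thr3, thr4):
    """Hypothetical per-class label correction.

    thr3/thr4: bounds for flagging a noisy label
    thr1/thr2: bounds for deriving the intermediate label
    """
    # Noise flag: confident prediction contradicting the pseudo label.
    is_noise = (pseudo_label == 1 and prob < thr3) or \
               (pseudo_label == 0 and prob > thr4)
    # Intermediate label from the first/second thresholds.
    if prob > thr1:
        intermediate = 1
    elif prob < thr2:
        intermediate = 0
    else:
        intermediate = pseudo_label
    return intermediate if is_noise else pseudo_label

def bce_loss(prob, label, eps=1e-12):
    """Binary cross-entropy, the per-picture term of the third loss."""
    p = np.clip(prob, eps, 1.0 - eps)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))
```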
10. The multi-label image recognition method based on deep learning under noisy data according to claim 9, wherein the total loss function [formula] in step S3.7 is:

[formula]

wherein [formula] is the value of the total loss function, [formula] is the balance factor of the first loss function [formula], and [formula] is the balance factor of the second loss function [formula].
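The total objective of claim 10 combines the classification (third) loss with the two contrastive losses weighted by balance factors. Since the claim's formula is an image, a simple weighted sum is assumed here; the weight values are placeholders.

```python
def total_loss(l_cls, l_inst, l_proto, lam_inst=1.0, lam_proto=1.0):
    """Hypothetical total loss: third (classification) loss plus the
    first (instance-contrast) and second (prototype-contrast) losses,
    each scaled by its balance factor."""
    return l_cls + lam_inst * l_inst + lam_proto * l_proto
```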
CN202310299402.5A 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data Active CN116012569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310299402.5A CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data


Publications (2)

Publication Number Publication Date
CN116012569A true CN116012569A (en) 2023-04-25
CN116012569B CN116012569B (en) 2023-08-15

Family

ID=86032175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310299402.5A Active CN116012569B (en) 2023-03-24 2023-03-24 Multi-label image recognition method based on deep learning and under noisy data

Country Status (1)

Country Link
CN (1) CN116012569B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416382A (en) * 2018-03-01 2018-08-17 南开大学 One kind is based on iteration sampling and a pair of of modified Web graph of multi-tag as training convolutional neural networks method
US20200356851A1 (en) * 2019-05-10 2020-11-12 Baidu Usa Llc Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning
US20210295091A1 (en) * 2020-03-19 2021-09-23 Salesforce.Com, Inc. Unsupervised representation learning with contrastive prototypes
CN113688949A (en) * 2021-10-25 2021-11-23 南京码极客科技有限公司 Network image data set denoising method based on dual-network joint label correction
US20220067506A1 (en) * 2020-08-28 2022-03-03 Salesforce.Com, Inc. Systems and methods for partially supervised learning with momentum prototypes
US20220156591A1 (en) * 2020-11-13 2022-05-19 Salesforce.Com, Inc. Systems and methods for semi-supervised learning with contrastive graph regularization
US20220188645A1 (en) * 2020-12-16 2022-06-16 Oracle International Corporation Using generative adversarial networks to construct realistic counterfactual explanations for machine learning models
CN114692732A (en) * 2022-03-11 2022-07-01 华南理工大学 Method, system, device and storage medium for updating online label
CN115147670A (en) * 2021-03-15 2022-10-04 华为技术有限公司 Object processing method and device
CN115331088A (en) * 2022-10-13 2022-11-11 南京航空航天大学 Robust learning method based on class labels with noise and imbalance
CN115496948A (en) * 2022-09-23 2022-12-20 广东工业大学 Network supervision fine-grained image identification method and system based on deep learning
CN115809697A (en) * 2022-12-26 2023-03-17 上海高德威智能交通系统有限公司 Data correction method and device and electronic equipment


Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
GUOQING ZHANG ET AL.: "Exploiting Multi-granularity Features for Unsupervised Domain Adaptation Person Re-identification", 《2022 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE BIG DATA AND INTELLIGENT SYSTEMS (HDIS)》, pages 223 - 227 *
TIANSHUI CHEN ET AL.: "Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels", 《ARXIV:2205.11131V1 [CS.CV]》, pages 1 - 13 *
XUDONG WANG ET AL.: "Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination", 《PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》, pages 12586 - 12595 *
ZHOU YUCONG ET AL.: "Complementary Learning: A Deep Neural Network Training Method for Image Applications with Noisy Annotations", Journal of Computer Research and Development, vol. 54, no. 12, pages 2649 - 2659 *
GONG CHEN ET AL.: "A Survey of Label-Noise Robust Learning Algorithms", Aero Weaponry, vol. 27, no. 3, pages 20 - 26 *
CAI YUJIA ET AL.: "A Survey of Elimination Algorithms for Instance-Dependent Label Noise", Intelligent Computer and Applications, vol. 12, no. 12, pages 30 - 35 *
CHEN HAO ET AL.: "Person Re-identification with Soft Pseudo Labels and Multi-Scale Feature Fusion", Laser & Optoelectronics Progress, vol. 59, no. 24, pages 1 - 8 *

Also Published As

Publication number Publication date
CN116012569B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN110188227B (en) Hash image retrieval method based on deep learning and low-rank matrix optimization
Zhang et al. Better and faster: knowledge transfer from multiple self-supervised learning tasks via graph distillation for video classification
CN110210468B (en) Character recognition method based on convolutional neural network feature fusion migration
CN112819065B (en) Unsupervised pedestrian sample mining method and unsupervised pedestrian sample mining system based on multi-clustering information
CN114912612A (en) Bird identification method and device, computer equipment and storage medium
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112364791B (en) Pedestrian re-identification method and system based on generation of confrontation network
CN113378706B (en) Drawing system for assisting children in observing plants and learning biological diversity
CN111666873A (en) Training method, recognition method and system based on multitask deep learning network
CN113673482B (en) Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution
CN113434688B (en) Data processing method and device for public opinion classification model training
CN111079847A (en) Remote sensing image automatic labeling method based on deep learning
CN113657267A (en) Semi-supervised pedestrian re-identification model, method and device
CN115331284A (en) Self-healing mechanism-based facial expression recognition method and system in real scene
CN117197904A (en) Training method of human face living body detection model, human face living body detection method and human face living body detection device
CN113010683A (en) Entity relationship identification method and system based on improved graph attention network
CN112183464A (en) Video pedestrian identification method based on deep neural network and graph convolution network
CN114782752A (en) Small sample image grouping classification method and device based on self-training
CN110175631A (en) A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix
CN116051924B (en) Divide-and-conquer defense method for image countermeasure sample
CN116012569B (en) Multi-label image recognition method based on deep learning and under noisy data
CN116775880A (en) Multi-label text classification method and system based on label semantics and transfer learning
CN112750128A (en) Image semantic segmentation method and device, terminal and readable storage medium
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN113592045B (en) Model adaptive text recognition method and system from printed form to handwritten form

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant