CN113657561A - Semi-supervised night image classification method based on multi-task decoupling learning - Google Patents
- Publication number
- CN113657561A (application CN202111220897.5A)
- Authority
- CN
- China
- Prior art keywords
- learning
- supervised
- classification
- samples
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a semi-supervised nighttime image classification method based on multi-task decoupling learning. Labeled daytime samples and unlabeled nighttime samples are fed together into a feature extraction network. The feature vectors extracted from the daytime samples are passed to a classification network head and supervised with a cross-entropy loss. The feature vectors extracted from the nighttime samples are first passed to the classification network head to obtain pseudo-labels; positive and negative sample pairs are then constructed from the pseudo-labels and passed to a self-supervised network head, which is trained with an angle contrastive loss. After multi-task training is complete, the small number of labeled samples in the nighttime dataset are fed into the feature extraction network and the classification network head for iterative self-distillation learning, so that nighttime images can ultimately be classified effectively.
Description
Technical Field
The invention relates to multi-task learning in the field of computer vision recognition, and in particular to a semi-supervised nighttime image classification method based on multi-task decoupling learning.
Background Art
Domain transfer is a pressing problem in computer vision. In this problem, the source domain and the target domain share the same task, while their data are different but related. The core of this kind of learning is to handle the difference between the data distributions of the two domains. Current general-purpose image recognition algorithms are trained on supervised datasets and achieve high performance on images with a similar distribution. When they are transferred to images from another target domain, however, performance often drops sharply because of the distribution gap between the source and target domains. For example, when a network trained on a daytime dataset predicts nighttime images, recognition accuracy tends to fall substantially.
Many open-source daytime image classification datasets exist, such as PASCAL VOC, but labeled nighttime image classification datasets are scarce. The aim is therefore to train a network on a daytime image dataset and have it transfer effectively to nighttime image classification, thereby improving nighttime classification performance.
Self-supervised learning uses auxiliary tasks to mine supervision signals from large-scale unlabeled data and trains the network with these constructed signals, so that it learns representations that are valuable for downstream tasks. This approach has been shown to capture discriminative image features and is an effective solution for tasks that lack labeled data. Self-supervised learning on a large number of unlabeled nighttime images lets the network learn the feature distribution of nighttime images, which improves the accuracy of nighttime image classification.
Accordingly, the nighttime image classification task can be decoupled into a supervised classification task on daytime images and a self-supervised task on nighttime images, and the two tasks can be learned jointly. The model then both extracts discriminative features for each class and adapts to the data distribution of nighttime images. In multi-task learning, however, the tasks compete with one another; making the two tasks reinforce rather than constrain each other requires an effective loss function design.
Knowledge distillation has become a popular topic in recent years. It transfers knowledge by introducing soft targets from a teacher network as part of the loss that guides the training of a student network. In self-distillation, a network learns from itself: soft targets derived from the current network guide the training of its next generation. This usually improves robustness and reduces overfitting, so it is suitable for further improving model performance on nighttime images.
Summary of the Invention
To overcome the deficiencies of the prior art and improve nighttime image recognition performance, the invention adopts the following technical solution:
A semi-supervised nighttime image classification method based on multi-task decoupling learning, comprising the following steps:
S1: construct a labeled daytime image classification dataset D, and construct a nighttime image classification dataset A in which only some samples are labeled and the remaining samples have no class labels;
S2: input the labeled samples of the daytime image dataset together with the unlabeled samples of the nighttime image dataset into a feature extraction network, which outputs daytime image feature vectors and nighttime image feature vectors; the feature extraction network is a deep residual convolutional network;
S3: attach a multi-task learning network after the feature extraction network; this network consists of a supervised classification network head and a self-supervised network head;
S4: for the daytime image feature vectors, train the classification network head under supervision of a classification loss; for the nighttime image feature vectors, use the same classification network head to predict their classes as pseudo-labels, and construct nighttime positive and negative sample pairs from the pseudo-labels; the classification network head consists of a global average pooling layer and a fully connected layer;
S5: the self-supervised network head normalizes the nighttime positive and negative sample pairs using the weight parameters of the classification network head to obtain normalized feature vectors, and a contrastive loss guides the learning of the feature space so that positive samples are similar and negative samples are well separated;
S6: train under the joint supervision of the classification loss and the contrastive loss;
S7: input the labeled samples of the nighttime image dataset into the trained feature extraction network and classification network head, freeze the weights of the feature extraction network, and train the classification network head under loss supervision so that it adapts to the feature distribution of nighttime images; then enter a self-distillation learning stage with multiple iterative updates, in which the classification predictions from the previous round of loss-supervised training serve as soft targets and participate in supervision together with the true labels;
S8: in the inference stage, input the nighttime image to be tested into the trained feature extraction network and classification network head, and output the image classification result.
Further, in S4, the daytime image feature vectors are input into the classification network head, which outputs the daytime sample classes, and training is supervised by the cross-entropy loss
$$-\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i,$$
where $N$ is the total number of labeled samples in the daytime image dataset, $y_i$ is the true label of the $i$-th sample, and $\hat{y}_i$ is the predicted class probability of the $i$-th sample.
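The cross-entropy supervision described for S4 can be illustrated with a minimal Python sketch. It assumes integer class indices as labels and already-normalized probability vectors as predictions; the function name `cross_entropy` and the toy values are illustrative and do not come from the patent.

```python
import math

def cross_entropy(probs, labels):
    """Mean cross-entropy over N samples.

    probs[i]  : predicted class-probability vector of sample i (sums to 1)
    labels[i] : integer index of the true class of sample i
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)

# Toy batch: two samples, three classes (values are made up).
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
labels = [0, 1]
loss = cross_entropy(probs, labels)
```

With one-hot labels, only the probability assigned to the true class contributes to each term, which is why the loop reads `p[y]` directly.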
Further, in S4, the nighttime image feature vectors are input into the classification network head to obtain predicted pseudo-labels, and nighttime positive and negative sample pairs $\{k, k^{+}, k^{-}\}_m$ are constructed from the pseudo-labels, where $k^{+}$ is a positive sample of $k$ (it has the same label as $k$), $k^{-}$ is a negative sample of $k$ (it has a different label from $k$), and $m$ is the number of sample pairs.
Further, in S5, the positive and negative sample features are angle-normalized by a function $\Lambda(x, W, y)$, where $x$ is the input feature vector, $\|x\|$ is the norm of $x$, $y$ is the label to which $x$ belongs, and $W_y$ is the $y$-th row of the parameters of the fully connected layer in the classification network head. Angle normalization is applied to the feature vector of every sample in the pairs $\{k, k^{+}, k^{-}\}_m$, giving the normalized feature vectors $\{\Lambda_k, \Lambda_{k^{+}}, \Lambda_{k^{-}}\}_m$:
$\Lambda_k = \Lambda(k, W, y)$
$\Lambda_{k^{+}} = \Lambda(k^{+}, W, y)$
$\Lambda_{k^{-}} = \Lambda(k^{-}, W, y)$.
Further, in S5, a contrastive loss guides the learning of the feature space so that positive samples are similar and negative samples are well separated. In this loss, $y_k$, $y_{k^{+}}$ and $y_{k^{-}}$ are the true labels of the samples $k$, $k^{+}$ and $k^{-}$ in a sample pair, $\eta$ is a hyperparameter giving the minimum distance threshold between samples of different classes, and the loss is computed from a similarity function.
Further, the normalized feature vectors $\{\Lambda_k, \Lambda_{k^{+}}, \Lambda_{k^{-}}\}_m$ are compared using the cosine similarity of two vectors $A$ and $B$,
$$\frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2}\,\sqrt{\sum_i B_i^2}},$$
where $A_i$ and $B_i$ are the components of $A$ and $B$; the target similarity of a positive pair is 1 and that of a negative pair is -1.
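Cosine similarity between two feature vectors can be computed as in the following short Python sketch. The function name and the example vectors are illustrative; only the definition of cosine similarity itself is taken from the text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (equal length, non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])       # parallel vectors
opposite_direction = cosine_similarity([1.0, 0.0], [-3.0, 0.0])  # anti-parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])           # perpendicular vectors
```

The value depends only on the angle between the vectors, not on their lengths, which is why parallel vectors of different norms still score 1.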
Further, the total loss function of S6 combines the cross-entropy loss and the contrastive loss.
Training stops when the number of training epochs reaches a specified value.
Further, in S7, the labeled samples of the nighttime image dataset are input into the trained feature extraction network and classification network head, the weights of the feature extraction network are frozen, and the classification network head is supervised by the cross-entropy loss
$$-\frac{1}{N'}\sum_{i=1}^{N'} y_i \log \hat{y}_i,$$
where $N'$ is the total number of labeled samples in the nighttime image dataset, $y_i$ is the true label of the $i$-th sample, and $\hat{y}_i$ is the predicted class probability of the $i$-th sample.
Further, in S7, a self-distillation learning stage performs multiple iterative updates: the classification predictions from the previous round of loss-supervised training serve as soft targets and participate in supervision together with the true labels $y$.
In the resulting loss, $\lambda$ is the weight given to the soft-target loss. Self-distillation training is complete after multiple iterative updates.
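One way to combine a hard-label term with a soft-target term is sketched below in Python. The exact formula of the patent is not reproduced in this text, so the convex combination `(1 - lam) * hard + lam * soft`, the use of cross-entropy for the soft term, and all names are assumptions made for illustration only.

```python
import math

def soft_cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

def self_distillation_loss(pred, hard_label, soft_target, lam=0.5):
    """Assumed form: weighted sum of a true-label term and a soft-target term.

    pred        : current predicted class probabilities
    hard_label  : integer index of the true class
    soft_target : class probabilities predicted in the previous round
    lam         : weight of the soft-target term
    """
    one_hot = [1.0 if c == hard_label else 0.0 for c in range(len(pred))]
    hard_term = soft_cross_entropy(one_hot, pred)
    soft_term = soft_cross_entropy(soft_target, pred)
    return (1.0 - lam) * hard_term + lam * soft_term

# Toy values: two classes, uniform current prediction and uniform soft target.
loss = self_distillation_loss([0.5, 0.5], hard_label=0, soft_target=[0.5, 0.5], lam=0.5)
```

Setting `lam=0` reduces this sketch to plain cross-entropy on the true labels, and larger values make the previous round's predictions count for more.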
A semi-supervised nighttime image classification method based on multi-task decoupling learning, in which the image to be tested is input into the trained feature extraction network and classification network head, and the image classification result is output.
The advantages and beneficial effects of the invention are as follows:
The invention proposes, for the first time, combining multi-task learning with knowledge distillation for nighttime image classification. Unlabeled nighttime images are used for self-supervised learning, so that the network adaptively learns the feature distribution of nighttime images while it learns the class features of daytime images. Self-supervised learning with an angle-normalized loss function reduces the competition between the self-supervised loss and the supervised loss. Self-distillation on a small amount of labeled nighttime data prevents the network from overfitting to the target domain and losing its generalization ability, while still adapting the model further to nighttime data.
Brief Description of the Drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is a schematic diagram of the multi-task decoupling learning stage of the invention.
Fig. 3 is an example of a positive and negative sample pair in the invention.
Fig. 4 is a schematic diagram of the self-distillation learning stage of the invention.
Detailed Description
Specific embodiments of the invention are described in detail below with reference to the drawings. The embodiments described here only illustrate and explain the invention and do not limit it.
The invention combines self-supervised learning on nighttime data with supervised learning on daytime data to train a feature extraction network capable of domain adaptation. It then applies self-distillation learning to the image recognition network using a small number of labeled samples from the nighttime dataset, shifting the classification network head toward the nighttime data distribution and thereby improving nighttime image recognition performance.
As shown in Fig. 1 and Fig. 2, the semi-supervised nighttime image classification method based on multi-task decoupling learning of the invention comprises the following steps:
Step 1: construct a labeled daytime image classification dataset and a nighttime image classification dataset in which only a small number of nighttime samples are labeled. This embodiment uses the 12 classes of the open-source Exclusively Dark (ExDARK) dataset: bicycle, boat, bottle, bus, car, cat, chair, cup, dog, motorbike, people and table. For each of these 12 classes, 800 corresponding images are selected from the public COCO dataset to form the daytime image classification dataset D. The ExDARK dataset is divided into three parts: 400 images per class form the unlabeled nighttime image dataset A; 10 images per class form the small labeled nighttime image dataset T; the remaining images form the nighttime classification validation set V, which is used to evaluate the effectiveness of the algorithm.
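The three-way split of the nighttime data described in Step 1 can be sketched per class as follows. This is a minimal Python illustration: the function name, the random shuffling with a fixed seed, and the file names are assumptions, since the text does not say how images are drawn.

```python
import random

def split_night_class(image_ids, n_unlabeled=400, n_labeled=10, seed=0):
    """Split the images of one nighttime class into A (unlabeled), T (labeled), V (validation)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(image_ids)
    rng.shuffle(ids)
    unlabeled = ids[:n_unlabeled]                        # part of dataset A
    labeled = ids[n_unlabeled:n_unlabeled + n_labeled]   # part of dataset T
    validation = ids[n_unlabeled + n_labeled:]           # part of dataset V
    return unlabeled, labeled, validation

# Hypothetical class with 600 images; real per-class counts differ.
a_part, t_part, v_part = split_night_class([f"cat_{i:04d}.jpg" for i in range(600)])
```

Running this once per class and concatenating the parts yields A, T and V; the labels of the images placed in A are simply discarded.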
Step 2: input the labeled image samples of the daytime dataset D together with the unlabeled image samples of the nighttime dataset A into the feature extraction network, which outputs a feature vector for each sample. The feature extraction network is a deep residual convolutional network; this embodiment uses ResNet50 and takes a 2048-dimensional feature vector from the conv5_x layer. The network uses the same input size for all image samples, and random cropping and horizontal flipping are applied as image augmentation to increase sample diversity. Each input batch contains 32 daytime image samples and 32 nighttime image samples, and training runs in parallel on 8 GPUs.
Step 3: attach a multi-task decoupling learning network after the feature extraction network; this network consists of a supervised classification network head and a self-supervised network head.
Step 4: construct the classification network head, which consists of a global average pooling layer and a fully connected layer. This embodiment uses an average_pool layer and a fully connected layer of dimension [2048, 12], where 12 is the number of output classes.
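The structure of the classification head in Step 4 (global average pooling followed by a fully connected layer) can be illustrated on a toy scale. The Python sketch below uses 2 channels and 3 classes instead of 2048 and 12, stores one weight row per class, and adds a softmax to obtain probabilities; the helper names and all numbers are illustrative assumptions.

```python
import math

def global_average_pool(feature_map):
    """feature_map: list of C channels, each an H x W nested list; returns a C-vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def fully_connected(x, weights, bias):
    """weights holds one row per class, each of length len(x); returns class logits."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

feature_map = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]  # 2 channels, 2x2 each
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                      # 3 classes x 2 features
bias = [0.0, 0.0, 0.0]

pooled = global_average_pool(feature_map)
logits = fully_connected(pooled, weights, bias)
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

Storing the weights as one row per class makes the row for class y directly available, which is the quantity the self-supervised head later reads from the classification head.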
Step 4.1: input the feature vectors extracted from the daytime samples in Step 2 into the classification network head, take the class with the highest probability as the predicted class of each sample, and supervise training with the cross-entropy loss
$$-\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i,$$
where $N$ is the total number of samples, $y_i$ is the true label of the $i$-th sample, and $\hat{y}_i$ is the predicted class probability of the $i$-th sample.
Step 4.2: input the feature vectors extracted from the nighttime samples in Step 2 into the classification network head to obtain a pseudo-label for each sample, and construct positive and negative sample pairs $\{k, k^{+}, k^{-}\}_m$ from the pseudo-labels, where $k^{+}$ is a positive sample of $k$ (same label as $k$), $k^{-}$ is a negative sample of $k$ (different label from $k$), and $m$ is the number of sample pairs. The pairs are constructed as follows. Among the 32 nighttime sample vectors, a class C1 is first chosen at random and the samples of that class are randomly paired two by two, giving a set of positive pairs C1{…}. One sample is then randomly picked from the other classes and combined with the positive pairs in C1{…} to give several positive and negative sample pairs. Next, a class C2 is chosen from the remaining classes and the procedure is repeated until 16 positive and negative sample pairs are obtained. In the extreme case where fewer than 16 can be formed, that is, when all samples come from the same class, the self-supervised network receives no input for that batch. In most cases, therefore, m = 16. Fig. 3 shows an example of a positive and negative sample pair in this embodiment.
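The pair-construction procedure of Step 4.2 can be sketched in Python as follows. The sketch works on sample indices rather than feature vectors and draws one random negative for each positive pair, which is one reading of the description; the function name, the seed handling and the toy pseudo-labels are assumptions.

```python
import random
from collections import defaultdict

def build_triplets(pseudo_labels, max_triplets=16, seed=0):
    """Return (anchor, positive, negative) index triplets built from pseudo-labels."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(pseudo_labels):
        by_class[label].append(idx)
    if len(by_class) < 2:
        return []  # every sample has the same pseudo-label: no negatives exist
    classes = list(by_class)
    rng.shuffle(classes)  # visit classes in random order (C1, C2, ...)
    triplets = []
    for c in classes:
        members = by_class[c][:]
        rng.shuffle(members)
        others = [i for i, lab in enumerate(pseudo_labels) if lab != c]
        # Pair members two by two; an unpaired leftover sample is skipped.
        for j in range(0, len(members) - 1, 2):
            anchor, positive = members[j], members[j + 1]
            negative = rng.choice(others)
            triplets.append((anchor, positive, negative))
            if len(triplets) == max_triplets:
                return triplets
    return triplets

# Toy batch of 32 pseudo-labels spread evenly over 4 classes.
labels = [i % 4 for i in range(32)]
triplets = build_triplets(labels)
```

When a batch contains a single pseudo-label the function returns an empty list, matching the case in which the self-supervised head receives no input.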
Step 5: construct the self-supervised network head. The positive and negative sample pairs $\{k, k^{+}, k^{-}\}_m$ obtained in Step 4.2 and the weight parameters $W$ of the classification network head are input into the self-supervised network head. The sample features are first angle-normalized by a function $\Lambda(x, W, y)$, where $x$ is the input feature vector, $\|x\|$ is the norm of $x$, $y$ is the label to which $x$ belongs, and $W_y$ is the $y$-th row of the parameters of the fully connected layer in the classification network head. Angle normalization eases the competition between the auxiliary task and the main task in multi-task learning; that is, it reduces the negative effect of the self-supervised task on the supervised task.
Angle normalization is applied to the feature vector of every sample in the pairs $\{k, k^{+}, k^{-}\}_m$, giving the normalized feature vectors $\{\Lambda_k, \Lambda_{k^{+}}, \Lambda_{k^{-}}\}_m$:
$\Lambda_k = \Lambda(k, W, y)$
$\Lambda_{k^{+}} = \Lambda(k^{+}, W, y)$
$\Lambda_{k^{-}} = \Lambda(k^{-}, W, y)$
Step 5.1: compare the similarity of $\{\Lambda_k, \Lambda_{k^{+}}, \Lambda_{k^{-}}\}_m$ using the cosine similarity of two vectors $A$ and $B$,
$$\frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2}\,\sqrt{\sum_i B_i^2}},$$
where $A_i$ and $B_i$ are the components of $A$ and $B$; the similarity of a positive pair should be 1 and that of a negative pair should be -1.
Step 5.2: use a contrastive loss to guide the learning of the feature space so that positive samples are similar and negative samples are well separated. In this loss, $y_k$, $y_{k^{+}}$ and $y_{k^{-}}$ are the true labels of the samples $k$, $k^{+}$ and $k^{-}$ in a sample pair, and $\eta$ is a hyperparameter: the distance between samples of different classes should exceed this value.
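The role of a margin such as η can be illustrated with a standard triplet margin loss on cosine distance. This is a generic, well-known formulation substituted here for illustration; it is not the loss formula of the patent, which is not reproduced in this text. The function names and the margin value 0.5 are assumptions.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so identical directions give 0 and opposite directions give 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def triplet_margin_loss(anchor, positive, negative, margin):
    """Generic triplet loss: the negative should be at least `margin` farther than the positive."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# The negative points the opposite way, so the margin is already satisfied.
well_separated = triplet_margin_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], margin=0.5)
# The negative coincides with the anchor, so the full margin is paid as loss.
collapsed = triplet_margin_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], margin=0.5)
```

A loss of this kind is zero once different-class samples are farther apart than the margin, so it stops pushing pairs that are already well separated.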
Step 6: use the loss functions of Step 4.1 and Step 5.2 to jointly supervise the training of the feature extraction network and the multi-task decoupling learning network; the total loss function is formed from these two losses.
This embodiment uses the SGD optimizer with an initial learning rate of 0.01. When training reaches epoch 70, the learning rate is reduced to 0.001. Training stops after 100 epochs.
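The learning-rate schedule of this embodiment is a single step decay, which the following Python sketch expresses as a function of the epoch index. Counting epochs from 0 and switching exactly at index 70 are assumptions; the text only states the two rates, the switch at epoch 70 and the stop at 100 epochs.

```python
def learning_rate(epoch, base_lr=0.01, decay_epoch=70, decayed_lr=0.001):
    """Step schedule: base_lr before decay_epoch, decayed_lr from decay_epoch onward."""
    return base_lr if epoch < decay_epoch else decayed_lr

TOTAL_EPOCHS = 100
schedule = [learning_rate(epoch) for epoch in range(TOTAL_EPOCHS)]
```

In a training loop, the value for the current epoch would be assigned to the optimizer before the epoch starts.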
Step 7: input the small number of labeled samples of the nighttime dataset into the trained feature extraction network and classification network head, freeze the weights of the feature extraction network, and further train the classification network head under supervision of the cross-entropy loss
$$-\frac{1}{N'}\sum_{i=1}^{N'} y_i \log \hat{y}_i,$$
so that the classification network head adapts to the data distribution of nighttime image features. Here $N'$ is the total number of samples, $y_i$ is the true label of the $i$-th sample, and $\hat{y}_i$ is the predicted class probability of the $i$-th sample.
Step 7.1: as shown in Fig. 4, in the self-distillation learning stage the classification predictions of the previous round serve as soft targets and participate in supervision together with the true labels $y$. In the resulting loss function, $\lambda$ is the weight given to the soft-target loss; in this embodiment the model performs best with $\lambda = 0.5$. The network is trained by backpropagation on this loss with a learning rate of 0.005, and the network parameters are updated repeatedly by batch gradient descent.
Step 7.2: repeat Step 7.1. After 10 iterative updates, the difference between two consecutive losses of the model is less than 0.1, and training of the self-distillation network is complete.
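The repeated self-distillation rounds can be organized as a simple loop. The Python sketch below treats the "at most 10 rounds" and "loss change below 0.1" figures as a stopping rule, which is one interpretation of the text; `train_one_generation` is a hypothetical callable that runs one round and returns its loss, and the toy stand-in is made up for demonstration.

```python
def run_self_distillation(train_one_generation, max_generations=10, tol=0.1):
    """Run self-distillation rounds until the loss change drops below tol or the budget ends."""
    losses = []
    for generation in range(max_generations):
        # In a real run, this round would use the previous round's predictions as soft targets.
        loss = train_one_generation(generation)
        losses.append(loss)
        if len(losses) >= 2 and abs(losses[-2] - losses[-1]) < tol:
            break
    return losses

# Toy stand-in whose loss halves every round: 2.0, 1.0, 0.5, ...
history = run_self_distillation(lambda g: 2.0 * 0.5 ** g)
```

With the toy stand-in, the loop stops as soon as two consecutive losses differ by less than 0.1 rather than using all 10 rounds.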
Step 8: in the inference stage, input the nighttime image to be tested into the feature extraction network and the classification network head, and output the image classification result. In this example, both training and inference run on a GPU server with GeForce RTX 2080 Ti cards.
The invention decouples the nighttime image classification task into a supervised classification task on daytime images and a self-supervised task on nighttime images, trains a feature extraction network capable of domain adaptation through multi-task learning, and then applies self-distillation learning to the image recognition network with a small number of labeled nighttime samples, so that the representation learned by the classification network head shifts toward nighttime image features and nighttime recognition performance improves. On the validation dataset V used in this embodiment, the ResNet50-based baseline reaches a classification accuracy of 83.8%, while the algorithm of the invention reaches 89.2%, an improvement of 5.4 percentage points over the baseline.
The above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art will understand that the technical solutions described in those embodiments may still be modified, and that some or all of their technical features may be replaced by equivalents, without such modifications or replacements causing the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the invention.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111220897.5A CN113657561B (en) | 2021-10-20 | 2021-10-20 | A semi-supervised nighttime image classification method based on multi-task decoupling learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111220897.5A CN113657561B (en) | 2021-10-20 | 2021-10-20 | A semi-supervised nighttime image classification method based on multi-task decoupling learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113657561A (en) | 2021-11-16 |
CN113657561B (en) | 2022-03-18 |
Family
ID=78494703
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111220897.5A Active CN113657561B (en) | 2021-10-20 | 2021-10-20 | A semi-supervised nighttime image classification method based on multi-task decoupling learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113657561B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113869333A (en) * | 2021-11-29 | 2021-12-31 | 山东力聚机器人科技股份有限公司 | Image identification method and device based on semi-supervised relationship measurement network |
CN113918743A (en) * | 2021-12-15 | 2022-01-11 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | A model training method for image classification in long-tailed distribution scenarios |
CN114037876A (en) * | 2021-12-16 | 2022-02-11 | 马上消费金融股份有限公司 | Model optimization method and device |
CN114255371A (en) * | 2021-12-21 | 2022-03-29 | 中国石油大学(华东) | Small sample image classification method based on component supervision network |
CN114299355A (en) * | 2021-12-02 | 2022-04-08 | 云从科技集团股份有限公司 | Model training method, device and computer storage medium |
CN114565808A (en) * | 2022-04-27 | 2022-05-31 | 南京邮电大学 | Double-action contrast learning method for unsupervised visual representation |
CN114881937A (en) * | 2022-04-15 | 2022-08-09 | 北京医准智能科技有限公司 | Detection method and device for ultrasonic section and computer readable medium |
CN114898141A (en) * | 2022-04-02 | 2022-08-12 | 南京大学 | A multi-view semi-supervised image classification method based on contrastive loss |
CN115496955A (en) * | 2022-11-18 | 2022-12-20 | 之江实验室 | Image classification model training method, image classification method, device and medium |
CN115564960A (en) * | 2022-11-10 | 2023-01-03 | 南京码极客科技有限公司 | Network image label denoising method combining sample selection and label correction |
CN116484272A (en) * | 2023-03-30 | 2023-07-25 | 西安交通大学 | Fraud node detection method based on graph semi-supervised representation learning |
CN117058492A (en) * | 2023-10-13 | 2023-11-14 | 之江实验室 | Two-stage training disease identification method and system based on learning decoupling |
CN119223625A (en) * | 2024-09-19 | 2024-12-31 | 广州民航职业技术学院 | A method for fault diagnosis of aircraft engine bearings based on domain generalization |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110745A (en) * | 2019-03-29 | 2019-08-09 | 上海海事大学 | Semi-supervised X-ray image automatic annotation based on generative adversarial network |
US20200160177A1 (en) * | 2018-11-16 | 2020-05-21 | Royal Bank Of Canada | System and method for a convolutional neural network for multi-label classification with partial annotations |
CN112990371A (en) * | 2021-04-27 | 2021-06-18 | 之江实验室 | Unsupervised night image classification method based on feature amplification |
CN113378632A (en) * | 2021-04-28 | 2021-09-10 | 南京大学 | Unsupervised domain pedestrian re-identification algorithm based on pseudo label optimization |
- 2021-10-20: CN application CN202111220897.5A, patent CN113657561B, status Active
Non-Patent Citations (1)
Title |
---|
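Jia Peng: "Research on semi-supervised image classification based on an improved ladder network", China Master's Theses Full-text Database (Information Science and Technology) * |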
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113869333B (en) * | 2021-11-29 | 2022-03-25 | 山东力聚机器人科技股份有限公司 | Image identification method and device based on semi-supervised relationship measurement network |
CN113869333A (en) * | 2021-11-29 | 2021-12-31 | 山东力聚机器人科技股份有限公司 | Image identification method and device based on semi-supervised relationship measurement network |
CN114299355A (en) * | 2021-12-02 | 2022-04-08 | 云从科技集团股份有限公司 | Model training method, device and computer storage medium |
CN113918743A (en) * | 2021-12-15 | 2022-01-11 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | A model training method for image classification in long-tailed distribution scenarios |
CN114037876A (en) * | 2021-12-16 | 2022-02-11 | 马上消费金融股份有限公司 | Model optimization method and device |
CN114037876B (en) * | 2021-12-16 | 2024-08-13 | 马上消费金融股份有限公司 | Model optimization method and device |
CN114255371A (en) * | 2021-12-21 | 2022-03-29 | 中国石油大学(华东) | Small sample image classification method based on component supervision network |
CN114898141A (en) * | 2022-04-02 | 2022-08-12 | 南京大学 | A multi-view semi-supervised image classification method based on contrastive loss |
CN114898141B (en) * | 2022-04-02 | 2025-04-11 | 南京大学 | A multi-view semi-supervised image classification method based on contrastive loss |
CN114881937B (en) * | 2022-04-15 | 2022-12-09 | 北京医准智能科技有限公司 | Detection method and device for ultrasonic section and computer readable medium |
CN114881937A (en) * | 2022-04-15 | 2022-08-09 | 北京医准智能科技有限公司 | Detection method and device for ultrasonic section and computer readable medium |
CN114565808A (en) * | 2022-04-27 | 2022-05-31 | 南京邮电大学 | Double-action contrast learning method for unsupervised visual representation |
CN114565808B (en) * | 2022-04-27 | 2022-07-12 | 南京邮电大学 | Double-action contrast learning method for unsupervised visual representation |
CN115564960A (en) * | 2022-11-10 | 2023-01-03 | 南京码极客科技有限公司 | Network image label denoising method combining sample selection and label correction |
CN115564960B (en) * | 2022-11-10 | 2023-03-03 | 南京码极客科技有限公司 | Network image label denoising method combining sample selection and label correction |
CN115496955A (en) * | 2022-11-18 | 2022-12-20 | 之江实验室 | Image classification model training method, image classification method, device and medium |
CN115496955B (en) * | 2022-11-18 | 2023-03-24 | 之江实验室 | Image classification model training method, image classification method, device and medium |
CN116484272A (en) * | 2023-03-30 | 2023-07-25 | 西安交通大学 | Fraud node detection method based on graph semi-supervised representation learning |
CN116484272B (en) * | 2023-03-30 | 2025-04-11 | 西安交通大学 | A fraudulent node detection method based on graph semi-supervised representation learning |
CN117058492A (en) * | 2023-10-13 | 2023-11-14 | 之江实验室 | Two-stage training disease identification method and system based on learning decoupling |
CN117058492B (en) * | 2023-10-13 | 2024-02-27 | 之江实验室 | Two-stage training disease identification method and system based on learning decoupling |
CN119223625A (en) * | 2024-09-19 | 2024-12-31 | 广州民航职业技术学院 | A method for fault diagnosis of aircraft engine bearings based on domain generalization |
CN119223625B (en) * | 2024-09-19 | 2025-05-20 | 广州民航职业技术学院 | Aeroengine bearing fault diagnosis method based on domain generalization |
Also Published As
Publication number | Publication date |
---|---|
CN113657561B (en) | 2022-03-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113657561B (en) | A semi-supervised nighttime image classification method based on multi-task decoupling learning | |
US12217146B2 (en) | Generating dual sequence inferences using a neural network model | |
Zhao et al. | Intrusion detection using deep belief network and probabilistic neural network | |
Liu et al. | Incdet: In defense of elastic weight consolidation for incremental object detection | |
CN114842267B (en) | Image classification method and system based on label noise domain adaptation | |
CN112966114B (en) | Literature classification method and device based on symmetrical graph convolutional neural network | |
WO2022121289A1 (en) | Methods and systems for mining minority-class data samples for training neural network | |
Zheng et al. | Improving the generalization ability of deep neural networks for cross-domain visual recognition | |
CN113378632A (en) | Unsupervised domain pedestrian re-identification algorithm based on pseudo label optimization | |
CN110427846A (en) | It is a kind of using convolutional neural networks to the face identification method of uneven small sample | |
CN109753571B (en) | Scene map low-dimensional space embedding method based on secondary theme space projection | |
CN114139676A (en) | The training method of domain adaptive neural network | |
CN109255381B (en) | Image classification method based on second-order VLAD sparse adaptive depth network | |
CN114998659B (en) | Image data classification method using spiking neural network model trained online over time | |
CN115688024B (en) | Prediction method for network abnormal users based on user content and behavior characteristics | |
CN108520298A (en) | A Semantic Consistency Verification Method for Land and Air Conversation Based on Improved LSTM-RNN | |
CN115578248B (en) | Generalized enhanced image classification algorithm based on style guidance | |
CN110717402B (en) | Pedestrian re-identification method based on hierarchical optimization metric learning | |
US20230031512A1 (en) | Surrogate hierarchical machine-learning model to provide concept explanations for a machine-learning classifier | |
CN113191445A (en) | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm | |
CN114511737A (en) | Training method of image recognition domain generalization model | |
CN115221947A (en) | A Robust Multimodal Active Learning Approach Based on Pretrained Language Models | |
CN112651242A (en) | Text classification method based on internal and external attention mechanism and variable scale convolution | |
CN112836729A (en) | An image classification model construction method and image classification method | |
CN115984617A (en) | Method for improving long-tail recognition group fairness based on generative countermeasure network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||