CN117237703A - A zero-sample day and night domain adaptive training method and image classification method based on spectrum analysis

A zero-sample day and night domain adaptive training method and image classification method based on spectrum analysis

Info

Publication number
CN117237703A
Authority
CN
China
Prior art keywords
feature map
training
determining
spectrum
sample
Prior art date
Legal status
Pending
Application number
CN202311062407.2A
Other languages
Chinese (zh)
Inventor
武阿明 (Wu Aming)
周佳豪 (Zhou Jiahao)
张自会 (Zhang Zihui)
邓成 (Deng Cheng)
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority to CN202311062407.2A
Publication of CN117237703A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a zero-shot day-night domain adaptive training method based on spectrum analysis, which comprises the following steps: receiving a daytime image training set; obtaining an original training feature map from the daytime image training set; determining an enhanced feature map from the original training feature map; determining a supervised contrastive loss function from the original training feature map and the enhanced feature map; computing an output descriptor from the original training feature map; determining a cross-entropy loss function from the enhanced feature map and the output descriptor; and updating the network parameters according to the supervised contrastive loss function and the cross-entropy loss function to obtain an adaptive classification model. The invention also provides a zero-shot day-night domain adaptive image classification method based on spectrum analysis, which removes the influence of generalization-irrelevant factors while retaining generalization-relevant object features, thereby improving the adaptation performance of the model in the nighttime domain.

Description

A zero-shot day-night domain adaptive training method and image classification method based on spectrum analysis

Technical Field

The invention belongs to the technical field of deep image recognition, and specifically relates to a zero-shot day-night domain adaptive training method and an image classification method based on spectrum analysis.

Background Art

In practical applications, deep image recognition models are easily affected by illumination changes. For example, when a model trained on daytime scenes is applied to nighttime scenes, its performance usually drops markedly because of domain shift; zero-shot day-night domain adaptation therefore remains an important research area. The goal of domain adaptation is to train a model on a source-domain dataset so that it performs well on a different but similar target-domain dataset, saving expensive data collection and annotation costs. Existing deep domain adaptation methods fall mainly into two categories: discrepancy-optimization-based methods and adversarial-learning-based methods. The zero-shot day-night domain adaptation task aims to generalize a model trained only on daytime data to the nighttime domain. Compared with traditional domain adaptation, it is more challenging because no target-domain (nighttime scene) data can be accessed during training. Existing zero-shot day-night domain adaptation methods focus on extracting color-invariant representations to mitigate the low-light effects of night scenes, while ignoring the influence of other factors (e.g., texture and style) on generalization ability; the resulting poor generalization leads to poor adaptation performance in the nighttime domain.

Summary of the Invention

In order to solve the above problems in the prior art, the present invention provides a zero-shot day-night domain adaptive training method and image classification method based on spectrum analysis. The technical problem to be solved by the present invention is achieved through the following technical solutions:

A zero-shot day-night domain adaptive training method based on spectrum analysis comprises the following steps:

receiving a daytime image training set;

obtaining an original training feature map from the daytime image training set;

determining an enhanced feature map from the original training feature map;

determining a supervised contrastive loss function from the original training feature map and the enhanced feature map;

computing an output descriptor from the original training feature map;

determining a cross-entropy loss function from the enhanced feature map and the output descriptor;

training and updating the network parameters according to the supervised contrastive loss function and the cross-entropy loss function to obtain an adaptive classification model.

In one embodiment of the present invention, determining the enhanced feature map from the original training feature map includes:

determining feature-map spectral features from the original training feature map;

determining spectral high-frequency components from the feature-map spectral features;

determining a high-pass feature map from the spectral high-frequency components;

determining the enhanced feature map from the original training feature map and the high-pass feature map.

In one embodiment of the present invention, determining the feature-map spectral features from the original training feature map includes:

performing a Fourier transform on the original training feature map to obtain the feature-map spectral features.

In one embodiment of the present invention, determining the spectral high-frequency components from the feature-map spectral features includes:

removing the low-frequency components of the feature-map spectral features with a high-pass filter to obtain the spectral high-frequency components.

In one embodiment of the present invention, determining the high-pass feature map from the spectral high-frequency components includes:

performing an inverse Fourier transform on the spectral high-frequency components to obtain the high-pass feature map.

In one embodiment of the present invention, determining the enhanced feature map from the original training feature map and the high-pass feature map includes:

concatenating the original training feature map and the high-pass feature map along the batch dimension to obtain a concatenated feature map;

feeding the concatenated feature map into a residual network for transformation and outputting the enhanced feature map.

In one embodiment of the present invention, computing the output descriptor from the original training feature map includes:

determining a reduced-dimension feature map from the original training feature map;

determining a prototype set from the reduced-dimension feature map;

computing the output descriptor from the prototype set.

In one embodiment of the present invention, the supervised contrastive loss function is expressed as:

$$\mathcal{L}_{scl} = \sum_{i=1}^{2B} \frac{-1}{2N_{y_i} - 1} \sum_{j=1}^{2B} \mathbb{1}_{i \neq j}\,\mathbb{1}_{y_i = y_j} \log \frac{\exp(s_{i,j}/t)}{\sum_{k=1}^{2B} \mathbb{1}_{i \neq k} \exp(s_{i,k}/t)}$$

where $y_i$ and $y_j$ denote the labels of anchor sample $i$ and sample $j$ respectively, $B$ denotes the batch size, $N_{y_j}$ denotes the number of samples with label $y_j$ in the batch, $\mathbb{1}_{i \neq j}$ and $\mathbb{1}_{y_i = y_j}$ denote indicator functions, $s_{i,j} = v_i^T v_j / (\|v_i\| \|v_j\|)$ denotes the cosine similarity between anchor sample $i$ and sample $j$, $v_i$ and $v_j$ denote the high-level feature vectors of anchor sample $i$ and sample $j$ respectively, and $t$ denotes the temperature hyperparameter; the original training feature map serves as the anchor sample, the enhanced feature map serves as the positive sample, and feature maps of categories different from the current input serve as negative samples.

In one embodiment of the present invention, the high-pass filter is expressed as:

$$\mathcal{H}(u, v) = \begin{cases} 0, & |u| \le \alpha \ \text{and} \ |v| \le \beta \\ 1, & \text{otherwise} \end{cases}$$

where α and β denote the cut-off thresholds for removing the low-frequency components, and (u, v) denote the coordinates of the centered spectrum.

A second aspect of the embodiments of the present invention provides a zero-shot day-night domain adaptive image classification method based on spectrum analysis, comprising the following steps:

a pre-trained feature extractor performs feature extraction on the data set to be classified and outputs a feature map to be classified;

the feature map to be classified is input into a trained prototype compensation module to compute prototypes, and an output descriptor to be classified is computed from the prototypes;

the feature map to be classified and the output descriptor to be classified are concatenated, outputting a concatenated feature map to be classified;

the concatenated feature map to be classified is input into a trained classification network, which outputs the classification result;

the pre-trained feature extractor, the trained prototype compensation module, and the trained classification network constitute the adaptive classification model obtained by training with the training method provided in the first aspect of the embodiments of the present invention.

Beneficial effects of the present invention:

The present invention contrasts the original training feature map with the enhanced feature map to reduce the influence of low-frequency components (e.g., the texture, style, and color of the image), which facilitates the extraction of generalized representations. Prototypes computed from the low-level feature maps promote rich semantic information in the high-level features, which enhances the generalization ability of the model and improves its classification and detection performance in the nighttime domain even though nighttime images cannot be accessed during training.

The present invention will be described in further detail below with reference to the accompanying drawings and embodiments.

Brief Description of the Drawings

Figure 1 is a schematic flow chart of a zero-shot day-night domain adaptive training method based on spectrum analysis provided by an embodiment of the present invention;

Figure 2 is an overall network framework diagram of the zero-shot day-night domain adaptive training method based on spectrum analysis provided by an embodiment of the present invention;

Figure 3 is a schematic diagram of the knowledge filtering network of the zero-shot day-night domain adaptive training method based on spectrum analysis provided by an embodiment of the present invention;

Figure 4 is a visualization of nighttime-domain detection results of the image classification method of the present invention and the prior art;

Figure 5a is a schematic diagram of t-SNE feature visualization results of the image classification method of the present invention and the prior art on the CODaN dataset;

Figure 5b is a schematic diagram of t-SNE feature visualization results of the image classification method of the present invention and the prior art on the ShapeNet dataset;

Figure 6 is a schematic diagram of feature-map visualization results of the image classification method of the present invention and the prior art on the CODaN dataset.

Detailed Description of the Embodiments

The present invention is described in further detail below with reference to specific embodiments, but the implementation of the present invention is not limited thereto.

Embodiment 1

As shown in Figures 1, 2, and 3, this embodiment provides a zero-shot day-night domain adaptive training method based on spectrum analysis, comprising the following steps:

Step 10: receive the daytime image training set.

Step 20: obtain the original training feature map from the daytime image training set.

In this step, the input daytime image training set is passed through a pre-trained feature extractor, which outputs the original training feature map $F_b \in \mathbb{R}^{H \times W \times C}$, where $H$, $W$, and $C$ denote the height, width, and number of channels of the original training feature map, respectively. To reduce the influence of generalization-irrelevant factors and extract generalized representations, a frequency transform is subsequently applied to remove the low-frequency components of the extracted original training feature map.

Step 30: determine the enhanced feature map from the original training feature map. This step and step 40 are both performed by the contrastive filtering module.

Specifically, step 30 includes steps 31 to 34:

Step 31: determine the feature-map spectral features from the original training feature map. Specifically, in this step, a Fourier transform is applied to the original training feature map $F_b$ to obtain the feature-map spectral features:

$$\hat{F}_b^i(u, v) = \mathcal{F}(F_b^i), \quad i = 1, \ldots, C$$

where $F_b^i$ denotes the $i$-th channel of the original training feature map, $\mathcal{F}$ denotes the Fourier transform, and $(u, v)$ denote the coordinates of the centered spectrum.

Step 32: determine the spectral high-frequency components from the feature-map spectral features. Specifically, a high-pass filter removes the low-frequency components of the feature-map spectral features, leaving the spectral high-frequency components.

After the Fourier transform, the low-frequency components of the image carry domain-specific information, while the high-frequency components carry domain-invariant information. To reduce the influence of the low-frequency components, a high-pass filter is defined:

$$\mathcal{H}(u, v) = \begin{cases} 0, & |u| \le \alpha \ \text{and} \ |v| \le \beta \\ 1, & \text{otherwise} \end{cases}$$

where α and β denote the cut-off thresholds for removing the low-frequency components. The high-pass filter is then applied by element-wise multiplication with the feature-map spectral features, removing their low-frequency components and yielding the spectral high-frequency components:

$$F_h^i(u, v) = \mathcal{H}(u, v) \odot \hat{F}_b^i(u, v)$$

where ⊙ denotes the element-wise product and $F_h^i$ denotes the spectral high-frequency component output by the high-pass filter.

Step 33: determine the high-pass feature map from the spectral high-frequency components. Specifically, an inverse Fourier transform is applied to the spectral high-frequency components to obtain the high-pass feature map:

$$F_a^i = \mathcal{F}^{-1}\!\left(F_h^i(u, v)\right)$$

where $F_a$ is the filtered high-pass feature map.

After high-pass filtering, the output feature map contains less generalization-irrelevant background information than the original input feature map and richer object-related content, which strengthens the model's ability to extract robust features and improves its generalization performance.
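As an illustration of steps 31 to 33, the following minimal PyTorch sketch applies the frequency-domain filtering to a feature map. The ideal rectangular mask and the default cut-off fractions alpha and beta are assumptions for illustration; the patent specifies only that a high-pass filter with thresholds α and β removes the low-frequency components.

```python
import torch

def high_pass_filter_features(f_b: torch.Tensor, alpha: float = 0.25, beta: float = 0.25) -> torch.Tensor:
    """Steps 31-33 sketch: FFT -> remove low frequencies -> inverse FFT.

    f_b: original training feature map of shape (B, C, H, W).
    alpha, beta: assumed cut-off fractions; the patent leaves their values open.
    Returns the high-pass feature map F_a with the same shape as f_b.
    """
    _, _, H, W = f_b.shape
    # Step 31: per-channel 2-D Fourier transform, zero frequency shifted to the center.
    spec = torch.fft.fftshift(torch.fft.fft2(f_b), dim=(-2, -1))

    # Step 32: ideal high-pass mask H(u, v) zeroing a low-frequency rectangle at the center.
    u = torch.arange(H, device=f_b.device) - H // 2
    v = torch.arange(W, device=f_b.device) - W // 2
    uu, vv = torch.meshgrid(u, v, indexing="ij")
    keep = ~((uu.abs() <= alpha * H / 2) & (vv.abs() <= beta * W / 2))
    spec_high = spec * keep  # element-wise product with the binary filter

    # Step 33: inverse Fourier transform back to the spatial domain.
    f_a = torch.fft.ifft2(torch.fft.ifftshift(spec_high, dim=(-2, -1))).real
    return f_a
```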

Step 34: determine the enhanced feature map from the original training feature map and the high-pass feature map. This step includes steps 341 and 342:

Step 341: concatenate the original training feature map and the high-pass feature map along the batch dimension to obtain the concatenated feature map.

Step 342: feed the concatenated feature map into the residual network for transformation and output the enhanced feature map F:

$$F = \Phi([F_a, F_b]_B)$$

where the residual network Φ consists of four residual blocks used to transform the concatenated result, and $[\cdot, \cdot]_B$ denotes the concatenation operation along the batch-size dimension.
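Step 34 leaves the internals of Φ unspecified beyond its four residual blocks. The sketch below is one plausible instantiation; the channel width, the kernel sizes, and in particular the way the doubled batch is reduced back to one enhanced map per input are assumptions, not details fixed by the patent.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

class FilterFusion(nn.Module):
    """Step 34 sketch: batch-concatenate F_a and F_b, transform with Phi (four residual blocks)."""
    def __init__(self, channels: int):
        super().__init__()
        self.phi = nn.Sequential(*[ResBlock(channels) for _ in range(4)])

    def forward(self, f_a: torch.Tensor, f_b: torch.Tensor) -> torch.Tensor:
        x = torch.cat([f_a, f_b], dim=0)   # step 341: concatenation along the batch dimension
        x = self.phi(x)                    # step 342: residual-network transformation
        b = f_a.size(0)
        # The two halves are averaged back into one enhanced map per input; this reduction
        # is an assumption, since the patent does not specify how the doubled batch is merged.
        return 0.5 * (x[:b] + x[b:])
```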

Step 40: determine the supervised contrastive loss function from the original training feature map and the enhanced feature map.

To encourage the filtered features to contain more object-related information, supervised contrastive learning (Prannay Khosla et al., "Supervised contrastive learning", Advances in Neural Information Processing Systems 33 (2020), pp. 18661-18673) is adopted to further enhance the filtered features. Specifically, taking the initially extracted features of the current input as anchors, the corresponding filtered features as positive samples, and features of categories different from the current input as negative samples, the supervised contrastive loss function is:

$$\mathcal{L}_{scl} = \sum_{i=1}^{2B} \frac{-1}{2N_{y_i} - 1} \sum_{j=1}^{2B} \mathbb{1}_{i \neq j}\,\mathbb{1}_{y_i = y_j} \log \frac{\exp(s_{i,j}/t)}{\sum_{k=1}^{2B} \mathbb{1}_{i \neq k} \exp(s_{i,k}/t)}$$

where $y_i$ and $y_j$ denote the labels of anchor sample $i$ and sample $j$ respectively, $B$ denotes the batch size, and $N_{y_j}$ denotes the number of samples with label $y_j$ in the batch. $\mathbb{1}_{i \neq j}$ and $\mathbb{1}_{y_i = y_j}$ are indicator functions; for example, $\mathbb{1}_{i \neq j} = 1$ if $i \neq j$ and $0$ otherwise. $s_{i,j} = v_i^T v_j / (\|v_i\| \|v_j\|)$ denotes the cosine similarity between anchor sample $i$ and sample $j$, where $v_i$ and $v_j$ denote the high-level feature vectors of samples $i$ and $j$ respectively. $t$ is the temperature hyperparameter, set to 0.07. The original training feature map serves as the anchor sample, the enhanced feature map serves as the positive sample, and feature maps of categories different from the current input serve as negative samples.

The supervised contrastive loss narrows the semantic gap between the original features and the corresponding filtered features and promotes the filtered features to contain rich object-related information, helping to alleviate the effect of low illumination in night scenes.
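A minimal sketch of the supervised contrastive loss above, assuming the anchors are pooled original features and the positives the corresponding filtered features; normalizing by the number of positives per anchor is equivalent to the $2N_{y_i} - 1$ factor under this batching.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(v_orig: torch.Tensor, v_filt: torch.Tensor,
                                labels: torch.Tensor, t: float = 0.07) -> torch.Tensor:
    """Supervised contrastive loss over a batch.

    v_orig: (B, D) high-level vectors of the original feature maps (anchors).
    v_filt: (B, D) vectors of the filtered/enhanced feature maps (positives).
    labels: (B,) class labels; samples of other classes act as negatives.
    """
    v = torch.cat([v_orig, v_filt], dim=0)      # (2B, D)
    y = torch.cat([labels, labels], dim=0)      # (2B,)
    v = F.normalize(v, dim=1)                   # so dot products are cosine similarities
    sim = v @ v.t() / t                         # (2B, 2B) temperature-scaled similarities

    n = v.size(0)
    not_self = ~torch.eye(n, dtype=torch.bool, device=v.device)
    same_label = (y.unsqueeze(0) == y.unsqueeze(1)) & not_self

    # log-softmax over all non-self pairs for each anchor
    logits = sim.masked_fill(~not_self, float('-inf'))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    # average log-probability of the positives per anchor, then mean over anchors
    pos_count = same_label.sum(dim=1).clamp(min=1)
    loss = -(log_prob * same_label).sum(dim=1) / pos_count
    return loss.mean()
```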

Step 50: compute the output descriptor from the original training feature map. After the original training feature map is obtained, this step runs in parallel with steps 30 and 40.

After processing by the contrastive filtering module (steps 30 and 40), the extracted features contain rich object-related information, which improves the generalization ability of the model. However, the contrastive filtering module may also filter out some discriminative content that is crucial for recognizing the corresponding objects. To compensate, a prototype compensation module computes multiple prototypes from the low-level feature map to enhance the semantic information of the high-level features. Specifically, step 50 is performed by the prototype compensation module and includes steps 51 to 53:

Step 51: determine the reduced-dimension feature map from the original training feature map.

Step 52: determine the prototype set from the reduced-dimension feature map.

Step 53: compute the output descriptor from the prototype set.

Specifically, a max-pooling operation first reduces the dimensionality of the original training feature map $F_b$, yielding the reduced-dimension feature map $F_{bm}$. The reduced feature map $F_{bm}$ is then fed into the prototype compensation module to compute a set of prototypes (Aming Wu et al., "Universal-prototype enhancing for few-shot object detection", Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 9567-9576). These prototypes form the prototype set $P = \{p_1, \ldots, p_D\}$, where $p_i$ is a prototype and $D$ is the number of prototypes. Next, a descriptor $\hat{d}$ representing image-level information is computed from the prototype set $P$, using a residual operation that assigns the visual features to their corresponding prototypes together with convolution parameters $W_1$ and $W_2$.
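The exact descriptor formula is not recoverable from the text above; the sketch below is a hypothetical instantiation in the spirit of universal-prototype enhancement, with learnable prototypes, soft assignment, and residual aggregation. The names w1 and w2 stand in for the convolution parameters W_1 and W_2, and every other design choice here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeCompensation(nn.Module):
    """Hypothetical prototype-compensation sketch (steps 51-53).

    Computes D prototypes, softly assigns max-pooled features to them,
    and aggregates the assignment residuals into an image-level descriptor.
    """
    def __init__(self, channels: int, num_prototypes: int = 8):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, channels))
        self.w1 = nn.Conv2d(channels, channels, kernel_size=1)  # assumed conv params W1
        self.w2 = nn.Conv2d(channels, channels, kernel_size=1)  # assumed conv params W2

    def forward(self, f_b: torch.Tensor) -> torch.Tensor:
        # Step 51: reduce spatial dimensionality with max pooling.
        f_bm = F.max_pool2d(f_b, kernel_size=2)                 # (B, C, H/2, W/2)
        B, C, H, W = f_bm.shape
        x = self.w1(f_bm).flatten(2).transpose(1, 2)            # (B, HW, C)

        # Step 52: soft assignment of spatial features to the prototype set P.
        assign = F.softmax(x @ self.prototypes.t(), dim=-1)     # (B, HW, D)

        # Step 53: residual operation aggregating features around their prototypes.
        residual = assign.transpose(1, 2) @ x \
            - assign.sum(1).unsqueeze(-1) * self.prototypes     # (B, D, C)
        d = residual.mean(dim=1)                                # (B, C) image-level summary
        return self.w2(d.view(B, C, 1, 1)).flatten(1)           # output descriptor d_hat
```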

Here, the contrastive filtering module and the prototype compensation module together form the knowledge filtering network.

Step 60: determine the cross-entropy loss function from the enhanced feature map and the output descriptor.

In this step, specifically, the concatenation of $F$ and $\hat{d}$ is fed into the classification network to obtain the predicted probability, from which the cross-entropy loss function is computed:

$$\mathcal{L}_{ce} = -\sum_{n=1}^{N} y_n \log \hat{y}_n$$

where $\hat{y}$ denotes the predicted probability:

$$\hat{y} = \mathrm{Cls}\!\left([F, \Psi(\hat{d})]_C\right)$$

Here, Ψ consists of two 1×1 convolutional layers with ReLU activation functions, used to align dimensions, and operates on the reshaped form of $\hat{d}$; $W_3$ and $W_4$ denote the parameters of the fully connected layers; $[\cdot, \cdot]_C$ denotes the concatenation operation along the channel dimension. Through this concatenation, the descriptor $\hat{d}$, which carries low-level discriminative information, is fused into the feature $F$, strengthening the discriminative ability of $F$. $\mathrm{Cls}$ denotes the classification network, a fully connected layer, where $N$ is the number of categories in the daytime image training set. The prototype compensation module further improves the model's ability to extract domain-invariant features.
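A sketch of step 60 under the assumption that Ψ is applied to the descriptor after broadcasting it to the spatial size of F; the layer widths and the global pooling before the fully connected classifier are illustrative choices, not details fixed by the patent.

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Step 60 sketch: fuse descriptor d_hat into F and classify."""
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        # Psi: two 1x1 convolutions with ReLU, aligning the descriptor's dimensions.
        self.psi = nn.Sequential(
            nn.Conv2d(channels, channels, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1), nn.ReLU(inplace=True),
        )
        self.cls = nn.Linear(2 * channels, num_classes)  # fully connected classifier Cls

    def forward(self, f: torch.Tensor, d_hat: torch.Tensor) -> torch.Tensor:
        B, C, H, W = f.shape
        d_map = d_hat.view(B, C, 1, 1).expand(B, C, H, W)  # reshape and broadcast d_hat
        fused = torch.cat([f, self.psi(d_map)], dim=1)     # channel-dimension concatenation
        pooled = fused.mean(dim=(2, 3))                    # global average pooling
        return self.cls(pooled)                            # logits over the N daytime classes

# The cross-entropy loss is then the standard one on these logits:
# loss_ce = nn.functional.cross_entropy(logits, labels)
```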

Step 70: train according to the supervised contrastive loss function and the cross-entropy loss function to update the network parameters; when training is complete, the adaptive classification model is obtained.

Specifically, training is performed according to the supervised contrastive loss function and the cross-entropy loss function to update the parameters of the contrastive filtering module, the prototype compensation module, and the classification network; when training is complete, the trained contrastive filtering module, the trained prototype compensation module, and the trained classification network are obtained. The trained prototype compensation module and the trained classification network, together with the pre-trained feature extractor, constitute the adaptive classification model.

Specifically, the overall training objective (the overall loss function $\mathcal{L}$) consists of the cross-entropy loss function $\mathcal{L}_{ce}$ and the supervised contrastive loss function $\mathcal{L}_{scl}$.

The loss function of the model is:

$$\mathcal{L} = \mathcal{L}_{ce} + \lambda \mathcal{L}_{scl}$$

where λ denotes a hyperparameter; setting λ to 1 balances the two loss terms.
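Putting the modules together, one training iteration might look as follows; the wiring reuses the sketches above, and the pooling used to obtain the high-level vectors as well as the optimizer choice are assumptions.

```python
import torch.nn.functional as F

def train_step(images, labels, extractor, fusion, proto_comp, head, optimizer, lam=1.0):
    """One illustrative training iteration; lam = 1 balances the two loss terms."""
    f_b = extractor(images)                  # step 20: original training feature map
    f_a = high_pass_filter_features(f_b)     # steps 31-33: frequency-domain filtering
    f = fusion(f_a, f_b)                     # step 34: enhanced feature map F
    d_hat = proto_comp(f_b)                  # step 50: output descriptor (parallel branch)
    logits = head(f, d_hat)                  # step 60: predicted class scores

    # assumed pooling to obtain the high-level vectors used by the contrastive loss
    v_orig = f_b.mean(dim=(2, 3))
    v_filt = f.mean(dim=(2, 3))

    loss = F.cross_entropy(logits, labels) \
        + lam * supervised_contrastive_loss(v_orig, v_filt, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```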

The training method of this embodiment effectively improves the model's image classification and object detection performance in the nighttime domain. The contrastive filtering module removes the influence of irrelevant factors, reduces the influence of low-frequency components (e.g., style), and extracts domain-invariant features. The prototype compensation module helps the high-level features carry rich semantic information, improving the generalization ability of the model.

It should be noted that the contrastive filtering module and the prototype compensation module are both required during training. The network structure during training comprises the pre-trained feature extractor, the contrastive filtering module, the prototype compensation module, and the classification network. In actual inference, however, the contrastive filtering module is discarded and the prototype compensation module is retained, generalizing the model to the unknown target domain to classify images. That is, after training, the network with the trained contrastive filtering module removed serves as the adaptive classification model, which therefore comprises the pre-trained feature extractor, the trained prototype compensation module, and the trained classification network.

Embodiment 2

This embodiment provides a zero-shot day-night domain adaptive image classification method based on spectrum analysis, applied to the adaptive classification model trained in Embodiment 1. The adaptive classification model comprises the pre-trained feature extractor, the trained prototype compensation module, and the trained classification network.

Specifically, the zero-shot day-night domain adaptive image classification method based on spectrum analysis comprises the following steps:

Step 1: the pre-trained feature extractor performs feature extraction on the data set to be classified and outputs the feature map to be classified.

Step 2: the feature map to be classified is input into the trained prototype compensation module to compute prototypes, and the output descriptor to be classified is computed from the prototypes.

Step 3: the feature map to be classified and the output descriptor to be classified are concatenated, outputting the concatenated feature map to be classified.

Step 4: the concatenated feature map to be classified is input into the trained classification network for classification, and the classification result is output.
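For Embodiment 2, the inference path drops the contrastive filtering module and keeps the prototype compensation module, as the description above states; the sketch below reuses the hypothetical modules from Embodiment 1.

```python
import torch

@torch.no_grad()
def classify(images, extractor, proto_comp, head):
    """Embodiment 2 sketch: the contrastive filtering module is not used at inference."""
    f = extractor(images)        # step 1: feature map to be classified
    d_hat = proto_comp(f)        # step 2: prototypes -> output descriptor to be classified
    logits = head(f, d_hat)      # steps 3-4: concatenation and trained classification network
    return logits.argmax(dim=1)  # predicted category indices
```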

As shown in Figure 4, compared with the results of Faster R-CNN (first row), the image classification method of the present invention (second row) accurately detects objects in nighttime images.

Figure 5a shows the results of t-SNE feature visualization of the present invention and the prior-art CIConv on the CODaN dataset, and Figure 5b shows the corresponding results on the ShapeNet dataset, where colored points represent different object categories. The method of the present invention extracts more discriminative features, significantly improving classification performance.

As shown in Figure 6, compared with the feature maps from CIConv (second column), the features extracted by the method of the present invention (third column) contain more object-related information and less background information, which is conducive to improving generalization performance.

In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "plurality" means two or more, unless otherwise explicitly and specifically limited.

In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may join and combine the different embodiments or examples described in this specification.

The above content is a further detailed description of the present invention in combination with specific preferred embodiments, and it cannot be concluded that the specific implementation of the present invention is limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, several simple deductions or substitutions may be made without departing from the concept of the present invention, all of which shall be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A zero-shot day-night domain adaptive training method based on spectrum analysis, characterized by comprising the following steps:
receiving a daytime image training set;
obtaining an original training feature map from the daytime image training set;
determining an enhanced feature map from the original training feature map;
determining a supervised contrastive loss function from the original training feature map and the enhanced feature map;
computing an output descriptor from the original training feature map;
determining a cross-entropy loss function from the enhanced feature map and the output descriptor;
and training and updating the network parameters according to the supervised contrastive loss function and the cross-entropy loss function to obtain an adaptive classification model.
2. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 1, wherein said determining an enhanced feature map from the original training feature map comprises:
determining feature-map spectral features from the original training feature map;
determining spectral high-frequency components from the feature-map spectral features;
determining a high-pass feature map from the spectral high-frequency components;
and determining the enhanced feature map from the original training feature map and the high-pass feature map.
3. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 2, wherein said determining feature-map spectral features from the original training feature map comprises:
performing a Fourier transform on the original training feature map to obtain the feature-map spectral features.
4. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 2, wherein said determining spectral high-frequency components from the feature-map spectral features comprises:
removing the low-frequency components of the feature-map spectral features with a high-pass filter to obtain the spectral high-frequency components.
5. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 2, wherein said determining a high-pass feature map from the spectral high-frequency components comprises:
performing an inverse Fourier transform on the spectral high-frequency components to obtain the high-pass feature map.
6. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 5, wherein determining the enhanced feature map from the original training feature map and the high-pass feature map comprises:
concatenating the original training feature map and the high-pass feature map along the batch dimension to obtain a concatenated feature map;
and feeding the concatenated feature map into a residual network for transformation and outputting the enhanced feature map.
7. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 1, wherein said computing an output descriptor from the original training feature map comprises:
determining a reduced-dimension feature map from the original training feature map;
determining a prototype set from the reduced-dimension feature map;
and computing the output descriptor from the prototype set.
8. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 1, wherein the supervised contrastive loss function is expressed as:

$$\mathcal{L}_{scl} = \sum_{i=1}^{2B} \frac{-1}{2N_{y_i} - 1} \sum_{j=1}^{2B} \mathbb{1}_{i \neq j}\,\mathbb{1}_{y_i = y_j} \log \frac{\exp(s_{i,j}/t)}{\sum_{k=1}^{2B} \mathbb{1}_{i \neq k} \exp(s_{i,k}/t)}$$

wherein $y_i$ and $y_j$ denote the labels of anchor sample $i$ and sample $j$ respectively, $B$ denotes the batch size, $N_{y_j}$ denotes the number of samples with label $y_j$ in the batch, $\mathbb{1}_{i \neq j}$ and $\mathbb{1}_{y_i = y_j}$ denote indicator functions, $s_{i,j} = v_i^T v_j / (\|v_i\| \|v_j\|)$ denotes the cosine similarity between anchor sample $i$ and sample $j$, $v_i$ and $v_j$ denote the high-level feature vectors of anchor sample $i$ and sample $j$ respectively, and $t$ denotes the temperature hyperparameter; the original training feature map serves as the anchor sample, the enhanced feature map serves as the positive sample, and feature maps of categories different from the current input serve as negative samples.
9. The zero-shot day-night domain adaptive training method based on spectrum analysis according to claim 1, wherein the high-pass filter is expressed as:

$$\mathcal{H}(u, v) = \begin{cases} 0, & |u| \le \alpha \ \text{and} \ |v| \le \beta \\ 1, & \text{otherwise} \end{cases}$$

where α and β denote the cut-off thresholds for removing the low-frequency components, and (u, v) denote the coordinates of the centered spectrum.
10. A zero-shot day-night domain adaptive image classification method based on spectrum analysis, characterized by comprising the following steps:
a pre-trained feature extractor performs feature extraction on the data set to be classified and outputs a feature map to be classified;
the feature map to be classified is input into a trained prototype compensation module to compute prototypes, and an output descriptor to be classified is computed from the prototypes;
the feature map to be classified and the output descriptor to be classified are concatenated, outputting a concatenated feature map to be classified;
the concatenated feature map to be classified is input into a trained classification network, which outputs the classification result;
the pre-trained feature extractor, the trained prototype compensation module, and the trained classification network constitute the adaptive classification model obtained by training with the training method according to any one of claims 1-9.
Application CN202311062407.2A, filed 2023-08-22: A zero-sample day and night domain adaptive training method and image classification method based on spectrum analysis (pending, CN117237703A)

Priority Applications (1)

Application Number: CN202311062407.2A; Priority Date: 2023-08-22; Filing Date: 2023-08-22; Title: A zero-sample day and night domain adaptive training method and image classification method based on spectrum analysis


Publications (1)

Publication Number: CN117237703A; Publication Date: 2023-12-15

Family

ID=89083465

Family Applications (1)

Application Number: CN202311062407.2A; Title: A zero-sample day and night domain adaptive training method and image classification method based on spectrum analysis; Priority Date: 2023-08-22; Filing Date: 2023-08-22

Country Status (1)

CN: CN117237703A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination