CN117935172A

CN117935172A - Visible light infrared pedestrian re-identification method and system based on spectral information filtering

Info

Publication number: CN117935172A
Application number: CN202410325387.1A
Authority: CN
Inventors: 张国庆; 王准; 张家伟; 郑钰辉
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing University of Information Science and Technology
Priority date: 2024-03-21
Filing date: 2024-03-21
Publication date: 2024-04-26
Anticipated expiration: 2044-03-21
Also published as: CN117935172B

Abstract

The present invention discloses a visible light infrared pedestrian re-identification method and system based on spectral information filtering, the method comprising the following steps: (1) obtaining raw data, dividing it into a training set, a validation set and a test set and performing preprocessing; (2) randomly forming a cross-modal image pair with the obtained batch training samples; (3) building a three-branch pedestrian re-identification network based on PyTorch and setting training parameters; (4) dividing the training period into two stages, V-T and V-I. When in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition mode as a filtering condition, and retaining the spectral information most relevant to the infrared mode from the visible light mode; (5) calculating the cascade aggregation loss when in the V-I stage, updating the network weights, directly realizing modal alignment between the visible light and infrared modes, and extracting modal shared representations; using the validation set to verify the accuracy of the algorithm, and saving the network weights with the optimal accuracy.

Description

A visible light infrared pedestrian re-identification method and system based on spectral information filtering

技术领域Technical Field

本发明涉及交通环境技术领域，具体涉及一种基于光谱信息过滤的可见光红外行人重识别方法及系统。The present invention relates to the technical field of traffic environment, and in particular to a visible light infrared pedestrian re-identification method and system based on spectral information filtering.

背景技术Background technique

行人重识别旨在通过分析行人在不同场景或摄像头中的视觉特征，实现对同一行人在不同位置或时间点的准确识别和匹配。近年来，由于其在智能安防和智慧交通领域的广泛应用，这一技术受到了越来越多的关注。随着深度学习和神经网络架构的不断发展，行人重识别已经取得了令人瞩目的成绩。然而，当前大多数方法的设计主要聚焦于单模态行人重识别(可见光)，忽略了可见光图像在成像过程中容易受到光照条件的影响。在照明条件较差或夜间环境中，捕获到的可见光图像缺乏足够的视觉线索来准确辨别身份。因此，为了弥补这一限制，近期的研究逐渐将焦点转向可见光红外行人重识别，充分利用红外图像在低光照环境下的优势，为身份识别提供更可靠的解决方案。Person re-identification aims to accurately identify and match the same pedestrian at different locations or time points by analyzing the visual features of pedestrians in different scenes or cameras. In recent years, this technology has received increasing attention due to its wide application in the fields of intelligent security and smart transportation. With the continuous development of deep learning and neural network architecture, pedestrian re-identification has achieved remarkable results. However, the design of most current methods mainly focuses on single-modal pedestrian re-identification (visible light), ignoring that visible light images are easily affected by lighting conditions during the imaging process. In poor lighting conditions or night environments, the captured visible light images lack sufficient visual cues to accurately identify the identity. Therefore, in order to compensate for this limitation, recent research has gradually shifted its focus to visible light infrared pedestrian re-identification, making full use of the advantages of infrared images in low-light environments to provide a more reliable solution for identity recognition.

可见光红外行人重识别不仅需应对行人重识别任务固有的挑战（如视角改变、姿势变化、光照差异等），还必须克服两种不同传感器模态之间显著的差异。这种模态差异源于可见光和红外图像在采集过程中所使用的不同物理原理。可见光图像受到自然光照条件的直接影响，捕捉物体的表面颜色和纹理，而红外图像则基于目标物体的热辐射，更注重物体的温度分布。这两种不同的物理性质导致在行人外貌、纹理、亮度和热分布等方面存在明显的差异。现有的方法一般可分为两类：(1) 模态共享特征学习，旨在通过度量学习或模态特定信息解纠缠，挖掘跨模态行人图片的公共表征；(2) 模态补偿学习，其目的是在图像或特征层面生成缺失的模态属性，通过属性补全来减轻模态差异。然而，这两类方法试图直接处理巨大的模态差异，却忽略了较大的模态间隙不利于探索可见光和红外图像之间的光谱对应关系，难以学习到足够的判别语义。Visible-infrared person re-identification not only has to deal with the inherent challenges of the person re-identification task (such as viewpoint change, posture change, illumination difference, etc.), but also has to overcome the significant differences between the two different sensor modalities. This modality difference stems from the different physical principles used in the acquisition process of visible light and infrared images. Visible light images are directly affected by natural lighting conditions and capture the surface color and texture of objects, while infrared images are based on the thermal radiation of the target object and focus more on the temperature distribution of the object. These two different physical properties lead to obvious differences in pedestrian appearance, texture, brightness, and thermal distribution. Existing methods can generally be divided into two categories: (1) modality-sharing feature learning, which aims to mine the common representation of cross-modal pedestrian images through metric learning or modality-specific information disentanglement; (2) modality compensation learning, which aims to generate missing modality attributes at the image or feature level and alleviate the modality difference through attribute completion. However, these two types of methods attempt to directly deal with the huge modality difference, but ignore the fact that the large modality gap is not conducive to exploring the spectral correspondence between visible light and infrared images, and it is difficult to learn sufficient discriminative semantics.

发明内容Summary of the invention

发明目的：本发明的目的是提供一种基于光谱信息过滤的可见光红外行人重识别方法及系统对现有的可见光红外行人重识别算法进行改进优化，捕获模态间潜在的光谱对应关系，解决特征匹配准确率低的问题。Purpose of the invention: The purpose of the present invention is to provide a visible light infrared pedestrian re-identification method and system based on spectral information filtering to improve and optimize the existing visible light infrared pedestrian re-identification algorithm, capture the potential spectral correspondence between modalities, and solve the problem of low feature matching accuracy.

技术方案：本发明所述的一种基于光谱信息过滤的可见光红外行人重识别方法，包括以下步骤：Technical solution: The visible light infrared pedestrian re-identification method based on spectral information filtering described in the present invention comprises the following steps:

（1）获取原始数据，划分训练集、验证集和测试集并进行预处理；(1) Obtain the original data, divide it into training set, validation set and test set, and perform preprocessing;

（2）将步骤（1）处理过的批量训练样本随机组成跨模态图像对；(2) Randomly group the batch training samples processed in step (1) into cross-modal image pairs;

（3）基于PyTorch搭建三分支行人重识别网络并设置训练参数，原始的训练样本以及合成的过渡图片作为网络输入，提取行人表征；(3) Build a three-branch person re-identification network based on PyTorch and set training parameters. Use the original training samples and synthesized transition images as network input to extract pedestrian representations.

（4）将训练时期分为V-T和V-I两个阶段，当处于V-T阶段时，计算语义一致损失更新网络权重，将过渡模态作为过滤条件，从可见光模态中保留与红外模态最为相关的光谱信息；(4) The training period is divided into two stages: V-T and V-I. When in the V-T stage, the semantic consistency loss is calculated to update the network weights, and the transition mode is used as a filtering condition to retain the spectral information most relevant to the infrared mode from the visible light mode;

（5）当处于V-I阶段时计算级联聚合损失，更新网络权重，直接在可见光和红外模态间实现模态对齐，提取模态共享表示；在训练过程中，使用验证集验证算法的精度，保存最优精度的网络权重。(5) When in the V-I stage, the cascade aggregation loss is calculated, the network weights are updated, modal alignment is directly achieved between the visible light and infrared modalities, and the modal shared representation is extracted; during the training process, the verification set is used to verify the accuracy of the algorithm and save the network weights with the optimal accuracy.

进一步的，所述步骤（1）中，原始数据集为公开的SYSU-MM01或RegDB；预处理包括：训练样本裁剪缩放为288x144像素，并通过水平翻转，随机擦除增加样本的多样性，最后使用由ImageNet数据集计算得到通道均值和标准差进行标准化。Furthermore, in step (1), the original dataset is the publicly available SYSU-MM01 or RegDB; the preprocessing includes: the training samples are cropped and scaled to 288x144 pixels, and the diversity of the samples is increased by horizontal flipping and random erasing, and finally the channel mean and standard deviation calculated from the ImageNet dataset are used for standardization.

进一步的，所述步骤（2）具体如下：首先随机选择可见光图像红、绿、蓝颜色通道的其中一个并扩展成三通道；然后将处理过的可见光图像和原始的红外图像水平划分为多个部分，每个部分以相同的概率保留两模态的其中一个；最后沿着图像的高度维度拼接起来；每个图像对生成一张过渡图像；公式如下：Furthermore, the step (2) is specifically as follows: first, randomly select one of the red, green, and blue color channels of the visible light image and expand it into three channels; then, divide the processed visible light image and the original infrared image horizontally into multiple parts, and each part retains one of the two modes with the same probability; finally, splice them along the height dimension of the image; each image pair generates a transition image; the formula is as follows:

设取出图像对中的可见光图像随机选择红、绿、蓝三通道的其中一个并扩展成三通道，然后将变换后的可见光图像/>和原始的红外图像/>水平划分为/>个部分；则：Suppose we take out the visible light image from the image pair Randomly select one of the three channels of red, green and blue and expand it into three channels, then transform the visible light image and the original infrared image/> The horizontal division is part; then:

； ;

其中，、/>、/>分别代表可见光图片中的红色、绿色以及蓝色通道， />和/>为随机选择和扩展操作；in, 、/> 、/> Respectively represent the red, green and blue channels in the visible light image, /> and/> For random selection and expansion operations;

； ;

其中，和 />为图片水平条纹组成的数组，表示对数组元素的随机选择操作，得到新的数组；/>为图片高度维度的拼接操作。in, and/> An array of horizontal stripes of the image. Indicates a random selection operation on array elements to obtain a new array; /> It is the stitching operation of the image height dimension.

进一步的，所述步骤（3）具体如下：选择在ImageNet上预训练的ResNet50模型作为特征提取器，然后在SYSU-MM01或RegDB数据集上对网络进行微调；其中，ResNet50共有五个阶段，将第一个阶段复制三次分别作为可见光、过渡和红外模态的输入口，后四个阶段为模态共享，构成三分支行人重识别网络；将步骤（1）读取的可见光和红外图片以及步骤（2）合成的过渡图片输入到行人重识别网络中提取行人特征。Furthermore, the step (3) is specifically as follows: select the ResNet50 model pre-trained on ImageNet as the feature extractor, and then fine-tune the network on the SYSU-MM01 or RegDB dataset; wherein, ResNet50 has a total of five stages, the first stage is replicated three times as the input ports of the visible light, transition and infrared modes respectively, and the last four stages are modality-shared to form a three-branch pedestrian re-identification network; the visible light and infrared images read in step (1) and the transition images synthesized in step (2) are input into the pedestrian re-identification network to extract pedestrian features.

进一步的，所述步骤（4）具体如下：利用步骤（3）提取得到的行人特征以及步骤（1）读取的身份标签信息，计算由基础损失和语义一致损失组成的联合训练损失，通过梯度反向传播更新行人重识别网络；语义一致损失将过渡模态作为约束条件，从可见光模态中保留与红外模态最为相关的光谱信息；语义一致损失的计算过程如下：Furthermore, the step (4) is specifically as follows: using the pedestrian features extracted in step (3) and the identity tag information read in step (1), a joint training loss consisting of a basic loss and a semantic consistency loss is calculated, and the pedestrian re-identification network is updated by gradient back propagation; the semantic consistency loss takes the transition mode as a constraint condition and retains the spectral information most relevant to the infrared mode from the visible light mode; the calculation process of the semantic consistency loss is as follows:

； ;

其中，P代表当前批量batch中行人类别数量，为单个ID下的可见光/过渡图片数量，/>为全连接网络；/>为超参数权重控制强关联和弱关联样本对的知识传播速率；为L2正则化；/>和/>分别表示第i个身份下第j个可见光特征和第k个过渡特征。Among them, P represents the number of pedestrian categories in the current batch, is the number of visible light/transition images under a single ID, /> It is a fully connected network; /> The hyperparameter weight controls the knowledge propagation rate of strongly and weakly associated sample pairs; is L2 regularization; /> and/> They represent the jth visible light feature and the kth transition feature under the ith identity respectively.

进一步的，基础损失由身份损失即交叉熵损失和三元组损失构成；可见光模态身份损失的计算过程表示为：Furthermore, the basic loss consists of identity loss, i.e., cross entropy loss, and triple loss; the calculation process of the visible light modality identity loss is expressed as:

； ;

其中，N为可见光图片数量，C是ID总数，为第I张可见光图片提取得到的特征；为样本真实身份对应的分类器权重；红外和过渡模态的技术过程与此类似；/>为对应的分类器权重；Among them, N is the number of visible light images, C is the total number of IDs, The features extracted from the I-th visible light image; is the classifier weight corresponding to the true identity of the sample; the technical process of infrared and transition modes is similar; /> is the corresponding classifier weight;

三元组损失计算过程如下：The triplet loss calculation process is as follows:

； ;

其中，表示可见光和过渡特征组成的集合；/>与/> 为正样本对；与/>为负样本对；/>为欧式距离；/>；/>为边缘参数。in, Represents a collection of visible light and transition features; /> With/> is a positive sample pair; With/> is a negative sample pair; /> is the Euclidean distance; /> ; /> is the edge parameter.

进一步的，所述步骤（5）中，通过梯度反向传播更新行人重识别网络；级联聚合损失的计算过程如下：Furthermore, in step (5), the person re-identification network is updated by gradient back propagation; the calculation process of the cascade aggregation loss is as follows:

； ;

其中，和/>分别表示第i个身份的可见光和红外特征中心；/>和/>分别表示第i个身份的可见光和红外语义中心；P为当前batch中ID的总数；/>为当前batch中第i个ID中第j张红外图片提取得到的特征。in, and/> Respectively represent the visible light and infrared feature centers of the i-th identity; /> and/> Respectively represent the visible light and infrared semantic centers of the ith identity; P is the total number of IDs in the current batch; /> The features extracted from the jth infrared image in the i-th ID in the current batch.

本发明所述的一种基于光谱信息过滤的可见光红外行人重识别系统，包括：The visible light infrared pedestrian re-identification system based on spectral information filtering of the present invention comprises:

获取预处理模块：用于获取原始数据，划分训练集、验证集和测试集并进行预处理；Obtaining preprocessing module: used to obtain raw data, divide it into training set, validation set and test set, and perform preprocessing;

跨模态图像对模块：用于将获取预处理模块处理过的批量训练样本随机组成跨模态图像对；Cross-modal image pair module: used to randomly form cross-modal image pairs from batch training samples processed by the pre-processing module;

提取模块：用于基于PyTorch搭建三分支行人重识别网络并设置训练参数，原始的训练样本以及合成的过渡图片作为网络输入，提取行人表征；Extraction module: used to build a three-branch person re-identification network based on PyTorch and set training parameters. The original training samples and synthesized transition images are used as network input to extract pedestrian representations.

V-T阶段模块：用于将训练时期分为V-T和V-I两个阶段，当处于V-T阶段时，计算语义一致损失更新网络权重，将过渡模态作为过滤条件，从可见光模态中保留与红外模态最为相关的光谱信息；V-T stage module: used to divide the training period into two stages: V-T and V-I. When in the V-T stage, the semantic consistency loss is calculated to update the network weights, and the transition mode is used as a filtering condition to retain the spectral information most relevant to the infrared mode from the visible light mode;

V-I阶段模块：用于当处于V-I阶段时计算级联聚合损失，更新网络权重，直接在可见光和红外模态间实现模态对齐，提取模态共享表示；在训练过程中，使用验证集验证算法的精度，保存最优精度的网络权重。V-I stage module: used to calculate the cascade aggregation loss when in the V-I stage, update the network weights, directly achieve modal alignment between visible light and infrared modalities, and extract modal shared representations; during the training process, the verification set is used to verify the accuracy of the algorithm and save the network weights with the optimal accuracy.

本发明所述的一种电子设备，包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序，其特征在于，所述计算机程序被加载至处理器时实现任一项所述的一种基于光谱信息过滤的可见光红外行人重识别方法。An electronic device described in the present invention includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements any one of the visible light infrared pedestrian re-identification methods based on spectral information filtering.

本发明所述的一种存储介质，所述存储介质存储有计算机程序，其特征在于，所述计算机程序被处理器执行时实现任一项所述的一种基于光谱信息过滤的可见光红外行人重识别方法。A storage medium described in the present invention stores a computer program, characterized in that when the computer program is executed by a processor, any one of the visible light infrared pedestrian re-identification methods based on spectral information filtering is implemented.

有益效果：与现有技术相比，本发明具有如下显著优点：利用深度学习对行人身份进行识别和匹配，且具有较高的识别率，可以节省大量的时间成本和人力成本。此外，本发明没有添加额外的模型复杂度，仅使用全局特征就实现了较好的性能，部署时对硬件和计算速度的需求较小。Beneficial effects: Compared with the prior art, the present invention has the following significant advantages: it uses deep learning to identify and match pedestrian identities, and has a high recognition rate, which can save a lot of time and labor costs. In addition, the present invention does not add additional model complexity, and only uses global features to achieve good performance, and has low requirements on hardware and computing speed during deployment.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1为本发明的整体流程图；Fig. 1 is an overall flow chart of the present invention;

图2为本发明提供的基于光谱信息过滤的可见光红外行人重识别方法的网络架构图；FIG2 is a network architecture diagram of a visible light infrared pedestrian re-identification method based on spectral information filtering provided by the present invention;

图3为本发明过渡模态生成的示意图；FIG3 is a schematic diagram of transition mode generation according to the present invention;

图4为本发明两阶段训练损失的示意图；FIG4 is a schematic diagram of the two-stage training loss of the present invention;

图5为本发明模型训练的流程图。FIG5 is a flow chart of model training of the present invention.

具体实施方式Detailed ways

下面结合附图对本发明的技术方案作进一步说明。The technical solution of the present invention is further described below in conjunction with the accompanying drawings.

如图1-5所示，本发明实施例提供一种基于光谱信息过滤的可见光红外行人重识别方法，包括以下步骤：As shown in FIGS. 1-5 , an embodiment of the present invention provides a visible light infrared pedestrian re-identification method based on spectral information filtering, comprising the following steps:

（1）获取原始数据，划分训练集、验证集和测试集并进行预处理；其中，原始数据集为公开的SYSU-MM01或RegDB；预处理包括：训练样本裁剪缩放为288x144像素，并通过水平翻转，随机擦除增加样本的多样性，最后使用由ImageNet数据集计算得到通道均值和标准差进行标准化。(1) Obtain the original data, divide it into training set, validation set and test set and perform preprocessing. The original data set is the public SYSU-MM01 or RegDB. The preprocessing includes: cropping and scaling the training samples to 288x144 pixels, and increasing the diversity of samples by horizontal flipping and random erasing. Finally, the channel mean and standard deviation calculated from the ImageNet dataset are used for standardization.

（2）将步骤（1）处理过的批量训练样本随机组成跨模态图像对；具体如下：首先随机选择可见光图像红、绿、蓝颜色通道的其中一个并扩展成三通道；然后将处理过的可见光图像和原始的红外图像水平划分为多个部分，每个部分以相同的概率保留两模态的其中一个；最后沿着图像的高度维度拼接起来；每个图像对生成一张过渡图像；公式如下：(2) Randomly group the batch training samples processed in step (1) into cross-modal image pairs; specifically, first randomly select one of the red, green, and blue color channels of the visible light image and expand it into three channels; then divide the processed visible light image and the original infrared image horizontally into multiple parts, and each part retains one of the two modes with the same probability; finally, splice them along the height dimension of the image; each image pair generates a transition image; the formula is as follows:

； ;

（3）基于PyTorch搭建三分支行人重识别网络并设置训练参数，原始的训练样本以及合成的过渡图片作为网络输入，提取行人表征；具体如下：选择在ImageNet上预训练的ResNet50模型作为特征提取器，然后在SYSU-MM01或RegDB数据集上对网络进行微调；其中，ResNet50共有五个阶段，将第一个阶段复制三次分别作为可见光、过渡和红外模态的输入口，后四个阶段为模态共享，构成三分支行人重识别网络；将步骤（1）读取的可见光和红外图片以及步骤（2）合成的过渡图片输入到行人重识别网络中提取行人特征。(3) Based on PyTorch, a three-branch person re-identification network is built and training parameters are set. The original training samples and the synthesized transition images are used as network input to extract pedestrian representations. The details are as follows: Select the ResNet50 model pre-trained on ImageNet as the feature extractor, and then fine-tune the network on the SYSU-MM01 or RegDB dataset; Among them, ResNet50 has a total of five stages. The first stage is replicated three times as the input port of the visible light, transition and infrared modes respectively, and the last four stages are modality shared to form a three-branch pedestrian re-identification network; The visible light and infrared images read in step (1) and the transition images synthesized in step (2) are input into the pedestrian re-identification network to extract pedestrian features.

（4）将训练时期分为V-T和V-I两个阶段，当处于V-T阶段时，计算语义一致损失更新网络权重，将过渡模态作为过滤条件，从可见光模态中保留与红外模态最为相关的光谱信息；具体如下：利用步骤（3）提取得到的行人特征以及步骤（1）读取的身份标签信息，计算由基础损失和语义一致损失组成的联合训练损失，通过梯度反向传播更新行人重识别网络；语义一致损失将过渡模态作为约束条件，从可见光模态中保留与红外模态最为相关的光谱信息；语义一致损失的计算过程如下：(4) The training period is divided into two stages, V-T and V-I. When in the V-T stage, the semantic consistency loss is calculated to update the network weights, and the transition mode is used as a filtering condition to retain the spectral information most relevant to the infrared mode from the visible light mode; specifically, the pedestrian features extracted in step (3) and the identity tag information read in step (1) are used to calculate the joint training loss composed of the basic loss and the semantic consistency loss, and the pedestrian re-identification network is updated through gradient back propagation; the semantic consistency loss uses the transition mode as a constraint condition to retain the spectral information most relevant to the infrared mode from the visible light mode; the calculation process of the semantic consistency loss is as follows:

； ;

基础损失由身份损失即交叉熵损失和三元组损失构成；可见光模态身份损失的计算过程表示为：The basic loss consists of identity loss, i.e., cross entropy loss, and triple loss; the calculation process of the visible light modality identity loss is expressed as:

； ;

（5）当处于V-I阶段时计算级联聚合损失，更新网络权重，直接在可见光和红外模态间实现模态对齐，提取模态共享表示；在训练过程中，使用验证集验证算法的精度，保存最优精度的网络权重。其中，通过梯度反向传播更新行人重识别网络；级联聚合损失的计算过程如下：(5) When in the V-I stage, the cascade aggregation loss is calculated, the network weights are updated, the modal alignment is directly achieved between the visible light and infrared modalities, and the modal shared representation is extracted; during the training process, the accuracy of the algorithm is verified using the validation set, and the network weights with the best accuracy are saved. Among them, the pedestrian re-identification network is updated through gradient back propagation; the calculation process of the cascade aggregation loss is as follows:

； ;

本发明初始学习率设置为0.01，然后在第10个epoch后线性增加到0.1。之后，学习率在第20和第60个epoch分别降低0.1倍。一个批量包含64张图像，其中4张可见光图像和4张红外图像是从8个身份中随机选择的。总训练持续时间为110个epoch，其中前100个epoch用于V-T阶段的训练，后10个epoch用于V-I阶段。我们使用SGD 作为优化器，权重衰减设置为0.0005，动量设置为0.9。The initial learning rate of the present invention is set to 0.01, and then linearly increased to 0.1 after the 10th epoch. After that, the learning rate is reduced by 0.1 times at the 20th and 60th epochs respectively. A batch contains 64 images, of which 4 visible light images and 4 infrared images are randomly selected from 8 identities. The total training duration is 110 epochs, of which the first 100 epochs are used for training in the V-T stage and the last 10 epochs are used for training in the V-I stage. We use SGD as the optimizer, with weight decay set to 0.0005 and momentum set to 0.9.

在SYSU-MM01和RegDB这两个主流可见光红外行人重识别数据集上取得了卓越的性能，与其它方法的对比结果如表1所示：Excellent performance was achieved on two mainstream visible light infrared pedestrian re-identification datasets, SYSU-MM01 and RegDB. The comparison results with other methods are shown in Table 1:

表1：本方法与其它可见光红外行人重识别方法的性能对比Table 1: Performance comparison of this method with other visible light infrared pedestrian re-identification methods

本发明实施例提供一种基于光谱信息过滤的可见光红外行人重识别系统，包括：The embodiment of the present invention provides a visible light infrared pedestrian re-identification system based on spectral information filtering, comprising:

本发明实施例提供一种电子设备，包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序，其特征在于，所述计算机程序被加载至处理器时实现任一项所述的一种基于光谱信息过滤的可见光红外行人重识别方法。An embodiment of the present invention provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements any one of the visible light infrared pedestrian re-identification methods based on spectral information filtering.

本发明实施例提供一种存储介质，所述存储介质存储有计算机程序，其特征在于，所述计算机程序被处理器执行时实现任一项所述的一种基于光谱信息过滤的可见光红外行人重识别方法。An embodiment of the present invention provides a storage medium storing a computer program, wherein the computer program, when executed by a processor, implements any one of the visible light infrared pedestrian re-identification methods based on spectral information filtering.

Claims

1. The visible light infrared pedestrian re-identification method based on spectral information filtering is characterized by comprising the following steps of:

(1) Obtaining original data, dividing a training set, a verification set and a test set, and preprocessing;

(2) Randomly forming cross-modal image pairs from the batch training samples processed in the step (1);

(3) Setting up a three-branch pedestrian re-recognition network based on PyTorch, setting training parameters, taking an original training sample and a synthesized transition picture as network input, and extracting pedestrian characterization;

(4) Dividing the training period into two stages of V-T and V-I, calculating the semantic consistency loss updating network weight when the training period is in the V-T stage, taking a transition mode as a filtering condition, and reserving spectrum information most relevant to an infrared mode from a visible light mode;

(5) When the method is in the V-I stage, the cascade aggregation loss is calculated, the network weight is updated, the mode alignment is directly realized between the visible light and the infrared modes, and the mode sharing representation is extracted; in the training process, the accuracy of the algorithm is verified by using a verification set, and the network weight with optimal accuracy is saved.

2. The method for re-identifying visible infrared pedestrians based on spectral information filtering according to claim 1, wherein in the step (1), the original dataset is a published SYSU-MM01 or RegDB; the pretreatment comprises the following steps: training samples were cut and scaled to 288x144 pixels and randomly erased by horizontal flipping to increase sample diversity, and finally normalized using channel mean and standard deviation calculated from the ImageNet dataset.

3. The visible infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein the step (2) is specifically as follows: firstly, randomly selecting one of red, green and blue color channels of a visible light image and expanding the red, green and blue color channels into three channels; then horizontally dividing the processed visible light image and the original infrared image into a plurality of parts, wherein each part keeps one of two modes with the same probability; finally, splicing along the height dimension of the image; generating a transition image for each image pair; the formula is as follows:

Setting visible light image in extracted image pair Randomly selecting one of the three channels of red, green and blue, expanding the three channels into three channels, and then transforming the visible light image/>And original infrared image/>Level division into/>A plurality of sections; then:

；

wherein, 、/>、/>Respectively represent red, green and blue channels in the visible light picture,/>AndFor random selection and expansion operations;

；

wherein, And/>Is an array formed by horizontal stripes of the picture,Representing the random selection operation of the array elements to obtain a new array; /(I)And (5) splicing the picture in the height dimension.

4. The visible infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein the step (3) is specifically as follows: the ResNet model pre-trained on ImageNet was selected as the feature extractor, and then the network was trimmed on SYSU-MM01 or RegDB dataset; the ResNet stages are five in total, the first stage is duplicated for three times to be used as input ports of visible light, transition and infrared modes respectively, and the last four stages are mode sharing to form a three-branch pedestrian re-identification network; and (3) inputting the visible light and infrared pictures read in the step (1) and the transition pictures synthesized in the step (2) into a pedestrian re-identification network to extract pedestrian characteristics.

5. The visible infrared pedestrian re-identification method based on spectral information filtering of claim 1, wherein the step (4) is specifically as follows: calculating joint training loss consisting of basic loss and semantic consistency loss by utilizing the pedestrian characteristics extracted in the step (3) and the identity tag information read in the step (1), and updating a pedestrian re-recognition network through gradient back propagation; the semantic consistency loss takes a transition mode as a constraint condition, and retains spectrum information most relevant to an infrared mode from a visible light mode; the calculation process of the semantic consistency loss is as follows:

；

wherein P represents the number of pedestrian categories in the current batch, For the number of visible/transitional pictures under a single ID,Is a fully connected network; /(I)Controlling knowledge propagation rates of strong correlation and weak correlation sample pairs for the super-parameter weights; /(I)Regularization for L2; /(I)And/>Respectively representing the jth visible light characteristic and the kth transition characteristic under the ith identity.

6. The method for re-identifying visible infrared pedestrians based on spectral information filtering according to claim 5, wherein the basic loss consists of identity loss, i.e. cross entropy loss and triplet loss; the calculation process of the visible light modal identity loss is expressed as follows:

；

wherein N is the number of visible light pictures, C is the total number of IDs, Extracting the obtained characteristics of the I-th visible light picture; /(I)Classifier weights corresponding to the true identities of the samples; /(I)The classifier weights are corresponding;

The triplet loss calculation process is as follows:

；

wherein, Representing a set of visible light and transition features; /(I)And/>Is a positive sample pair; /(I)And (3) withIs a negative sample pair; /(I)Is the Euclidean distance; /(I)；/>Is an edge parameter.

7. The method for re-identifying visible infrared pedestrians based on spectral information filtering according to claim 1, wherein in the step (5), the pedestrian re-identification network is updated by gradient back propagation; the calculation process of the cascade aggregation loss is as follows:

；

wherein, And/>Visible and infrared feature centers respectively representing the ith identity; /(I)And/>Visible and infrared semantic centers respectively representing the ith identity; p is the total number of IDs in the current batch; /(I)And extracting the obtained characteristics for the j-th infrared picture in the i-th ID in the current batch.

8. A visible infrared pedestrian re-identification system based on spectral information filtering, comprising:

and (3) an acquisition pretreatment module: the method comprises the steps of obtaining original data, dividing a training set, a verification set and a test set, and preprocessing;

cross-modality image pair module: the system comprises a preprocessing module, a cross-modal image acquisition module and a cross-modal image acquisition module, wherein the preprocessing module is used for preprocessing the acquired training samples;

And an extraction module: the method is used for building a three-branch pedestrian re-recognition network based on PyTorch, setting training parameters, taking an original training sample and a synthesized transition picture as network input, and extracting pedestrian characterization;

V-T stage module: the training period is divided into two stages, namely a V-T stage and a V-I stage, when the training period is in the V-T stage, the semantic consistency loss is calculated to update the network weight, the transition mode is used as a filtering condition, and the spectrum information most relevant to the infrared mode is reserved from the visible light mode;

V-I stage module: the method is used for calculating cascading aggregation loss when the system is in a V-I stage, updating network weight, directly realizing modal alignment between visible light and infrared modes, and extracting modal sharing representation; in the training process, the accuracy of the algorithm is verified by using a verification set, and the network weight with optimal accuracy is saved.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program when loaded into the processor implements a method for visible infrared pedestrian re-identification based on spectral information filtering according to any one of claims 1-7.

10. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the visible infrared pedestrian re-recognition method based on spectral information filtering according to any one of claims 1-7.