CN115205655B - Infrared dark spot target detection system under dynamic background and detection method thereof - Google Patents
- Publication number: CN115205655B (application CN202211118428.7A)
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a system for detecting dim infrared point targets against a dynamic background and to its detection method.
Background
When an infrared detection system observes targets at long range (typically tens or even hundreds of kilometers), air-to-air or air-to-ground, atmospheric disturbance, optical scattering, and diffraction leave the focal plane with very little spectral irradiance from the target signal. The result is a low target signal-to-noise ratio and a small imaging footprint (a point or blob occupying only a few pixels in the whole scene) with no shape or texture information, so the target is easily submerged in background clutter and noise. Moreover, most detectors in practice are mounted on moving platforms, and the resulting dynamic background updates greatly increase the detection difficulty. How to detect dim infrared point targets effectively against a dynamic background has therefore become a research hotspot in the detection field worldwide.
Summary of the Invention
In view of the above problems, the purpose of the invention is to provide a system and method for detecting dim infrared point targets against a dynamic background. By combining traditional methods with deep learning in a jointly "knowledge and data" driven manner, effective target information can be extracted from real interference or clutter at any time of day.
Compared with the prior art, the invention has the following advantages:
(1) The invention provides a fully convolutional spatio-temporal fusion background suppression network that copes, in parallel, with targets that remain stationary for a long time and with declining target-background gray-level contrast.
(2) The invention provides an infrared point target detection network based on an improved Yolov5 that replaces traditional threshold segmentation. Instead of judging targets purely by their (background-suppressed) gray value, it exploits the strong feature extraction capability of a neural network to learn the shape, structure, texture, and gray-level attributes of the target, accurately picking targets out of noise, background residue, and distractors.
(3) The fully convolutional spatio-temporal fusion background suppression algorithm and the improved-Yolov5 infrared point target detection network are trained and run end to end. Because background suppression and detection are inherently correlated, their objective (LOSS) functions constrain and supervise each other, which avoids parameter overfitting and improves network robustness.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the logical framework of the method for detecting dim infrared point targets against a dynamic background according to an embodiment of the invention.
Fig. 2 is a schematic flow chart of the method according to an embodiment of the invention.
Fig. 3 is a schematic diagram of the backbone structure of the spatio-temporal fusion background suppression network used in the method.
Fig. 4 is a schematic diagram of the feature extraction network structure of the infrared point target detection network used in the method.
Fig. 5 is a schematic diagram of target detection results of the method.
Detailed Description
Hereinafter, embodiments of the invention are described with reference to the accompanying drawings. In the following description the same modules are denoted by the same reference numerals; modules sharing a reference numeral also share a name and function, so their detailed description is not repeated.
To make the object, technical solution, and advantages of the invention clearer, the invention is described in further detail below in conjunction with the drawings and specific embodiments. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it.
Fig. 1 shows the logical framework of the method for detecting dim infrared point targets against a dynamic background according to an embodiment of the invention.
Fig. 2 shows the flow chart of the method.
As shown in Figs. 1 and 2, the method comprises the following steps:
S1. Split the infrared image sequence captured under dynamic background changes into subsequences within which the background is stationary.
When the target leaves the field of view, the infrared image sensor slews to follow it so that the target stays inside the observation area at all times. At such moments the background or scene changes in stages (as distinct from slow updating), and the target may disappear briefly and reappear with a large displacement. The invention therefore computes the similarity between adjacent frames with a combination of perceptual hashing and difference hashing, uses it to judge whether the sensor has moved, and splits the dynamically changing sequence at the movement instants into subsequences whose background or scene is stationary within each stage. This improves the adaptability of the subsequent enhancement and reduces the difficulty of detection and tracking.
Step S1 comprises the following sub-steps:
S11. Compute the similarity between adjacent frames of the infrared image sequence to judge whether the infrared image sensor has moved; if it has, record the movement instant.
Step S11 comprises the following sub-steps:
S110. Preprocessing: set the following parameters: perceptual hash threshold pth, difference hash threshold dth, continuous change threshold sth (e.g. 5), and split threshold ssth (e.g. 100).
S111. Compute the perceptual hash and the difference hash of the current frame and the previous frame of the infrared image sequence FN(i, j), obtaining the perceptual hash difference p and the difference hash difference d between the two frames.
Here N is the length of the original infrared image sequence.
S112. When the perceptual hash difference p > pth and the difference hash difference d > dth, the scene of the current frame is judged to have changed, i.e., the infrared image sensor has moved, and the current frame number is stored in the scene change set B.
S113. Sort the scene change set B by time (frame number).
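The hash comparison in S110–S113 can be sketched as follows. This is a minimal numpy illustration, not the patented implementation: the 8×8 dHash grid, the 32×32 DCT-based pHash with an 8×8 low-frequency block, and the threshold values used below are common conventions assumed here.

```python
import numpy as np

def dhash(img, size=8):
    """Difference hash: compare each pixel with its right neighbor on a small grid."""
    h, w = img.shape
    rows = np.arange(size) * h // size          # naive nearest-neighbor resize
    cols = np.arange(size + 1) * w // (size + 1)
    small = img[np.ix_(rows, cols)].astype(float)
    return (small[:, 1:] > small[:, :-1]).flatten()

def phash(img, size=32, keep=8):
    """Perceptual hash: 2-D DCT, keep the low-frequency block, threshold at its median."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = img[np.ix_(rows, cols)].astype(float)
    k = np.arange(size)
    dct_mat = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * size))
    low = (dct_mat @ small @ dct_mat.T)[:keep, :keep]
    return (low > np.median(low)).flatten()

def scene_change_set(frames, pth=10, dth=10):
    """S111-S113: frame numbers where both hash differences exceed their thresholds."""
    B = []
    for n in range(1, len(frames)):
        p = np.count_nonzero(phash(frames[n]) != phash(frames[n - 1]))
        d = np.count_nonzero(dhash(frames[n]) != dhash(frames[n - 1]))
        if p > pth and d > dth:
            B.append(n)
    return sorted(B)
```

Requiring both hash differences to exceed their thresholds makes the sensor-movement test conservative: pHash reacts to coarse scene structure, dHash to local gradients, so isolated noise rarely trips both.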
S12. Remove the continuously changing frames produced while the infrared image sensor is moving, and determine the start and end frame of each subsequence.
If the difference between the current frame number and the previous frame number in the scene change set B is greater than the continuous change threshold sth, store the current frame number in the subsequence end set C, and sort C by time (frame number).
S13. When a subsequence would be shorter than ssth, do not split it off; keep it in the previous subsequence, which yields the subsequence split set D.
A subsequence shorter than ssth usually reflects poor similarity caused by illumination changes rather than real sensor movement.
S14. Split the infrared image sequence FN(i, j) according to the split set D into infrared subsequences FN1(i, j), FN2(i, j), FN3(i, j), ..., FNn(i, j).
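The boundary bookkeeping of S12–S14 can be sketched as below. This is an interpretation under stated assumptions: the patent does not spell out how the very first change frame or the trailing frames are handled, so treating frame 0 as the first subsequence start and total_frames as the final end is a choice made here.

```python
def split_subsequences(B, total_frames, sth=5, ssth=100):
    """S12-S14: derive subsequence boundaries from the sorted scene change set B.

    Change frames spaced at most sth apart are treated as one continuous sensor
    movement; the first change after a gap > sth starts a new subsequence
    (set C).  Boundaries that would create a subsequence shorter than ssth are
    merged into the previous subsequence (set D).
    """
    # S12: keep only the first change frame after each gap > sth
    C = []
    for i, frame in enumerate(B):
        if i == 0 or frame - B[i - 1] > sth:
            C.append(frame)
    # S13: drop boundaries that would yield a subsequence shorter than ssth
    D, prev = [], 0
    for frame in C:
        if frame - prev >= ssth:
            D.append(frame)
            prev = frame
    # S14: turn the split set into (start, end) frame ranges
    bounds = [0] + D + [total_frames]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
```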
S2. Suppress the background clutter in the subsequence images and enhance the targets in them with a fully convolutional spatio-temporal fusion background suppression network.
The basic idea of background prediction and subtraction is to subtract the predicted background from the original image, relying on the assumption that background pixels are spatially correlated while target pixels behave in the opposite way. The invention proposes a fully convolutional spatio-temporal fusion background suppression network whose main idea is to let a neural network exploit the temporal and spatial information of the infrared image sequence simultaneously, fully mining the morphological, gray-level, and motion characteristics of the target.
Step S2 comprises the following sub-steps:
S20. Split the subsequence into T segments of t frames each.
S21. Input each t-frame segment into the fully convolutional spatio-temporal fusion background suppression network as a single t-channel image for training, obtaining the background prediction model.
Long-range detection is prone to declining target-background gray-level contrast, targets submerged in the background, and targets too dim to be visible. To address this, the invention exploits the temporal information of the infrared image sequence: the i-th subsequence with stage-wise stationary background is split into T segments of t frames each, and each t-frame stack is fed to the network as a t-channel image for training, yielding the background prediction model. The model computes the background suppression component of every pixel of the input; ideally the final output image has the background and noise completely removed, with the real target enhanced to a higher signal-to-noise ratio.
S22. The background prediction model computes the background suppression component of every pixel of the subsequence images, yielding output images with background and noise removed and the targets enhanced.
Fig. 3 shows the backbone structure of the spatio-temporal fusion background suppression network.
As shown in Fig. 3, the network consists mainly of six convolutional layers. Because a point target occupies few pixels, no pooling or downsampling is used. All convolution kernels are 3×3, and the numbers of kernels in the successive layers are 8, 16, 32, 64, 128, and t (so that the output feature map matches the number of input channels). The feature maps of layer n and layer n-2 are fused directly, preserving both target detail and strong semantic information. The network is fully convolutional, so there is no restriction on the input or output image size.
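The size bookkeeping of this backbone can be checked with a small numpy sketch. It only mirrors the layer widths stated above; the choice of ReLU activation and the omission of the layer-n/layer-(n-2) fusion are simplifications assumed here for illustration.

```python
import numpy as np

def conv2d_same(x, weights):
    """'Same'-padded 2-D convolution: x is (H, W, Cin), weights is (3, 3, Cin, Cout)."""
    H, W, _ = x.shape
    kh, kw, _, Cout = weights.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((H, W, Cout))
    for i in range(kh):
        for j in range(kw):
            # each kernel tap contributes a shifted slice of the padded input
            out += np.einsum('hwc,co->hwo', xp[i:i + H, j:j + W, :], weights[i, j])
    return out

def backbone_shapes(H, W, t):
    """Trace the feature sizes through the six 3x3 conv layers (no pooling)."""
    widths = [8, 16, 32, 64, 128, t]
    rng = np.random.default_rng(0)
    x = rng.standard_normal((H, W, t))  # a t-frame segment stacked as t channels
    shapes = []
    for cout in widths:
        w = rng.standard_normal((3, 3, x.shape[2], cout)) * 0.1
        x = np.maximum(conv2d_same(x, w), 0)  # conv + ReLU (activation assumed)
        shapes.append(x.shape)
    return shapes
```

Because every layer uses "same" padding and no stride, the spatial size never changes, and the final layer restores t channels so the output can be subtracted frame-for-frame from the input stack.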
S3. Detect targets in the background-suppressed images with the infrared point target detection network based on an improved Yolov5.
Traditional detection methods based on background suppression apply a gray-level threshold after background subtraction and target enhancement to screen and segment the "true" targets. When the target signal-to-noise ratio is low, however, a high detection threshold risks losing dim targets, while a low threshold produces large numbers of false alarms; even adaptive threshold segmentation (determining the threshold adaptively from the gray-level mean and variance of the image) only mitigates this to a degree. The invention therefore replaces traditional threshold segmentation with a neural network: rather than judging targets purely by gray value, it exploits the strong feature extraction capability of a neural network to learn the shape, structure, texture, and gray-level attributes of the target and strives to separate targets from noise, background residue, and distractors.
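The adaptive threshold segmentation mentioned above (a threshold derived from the image's gray-level mean and variance) is commonly written as T = μ + k·σ. A minimal numpy sketch, with the coefficient k = 4 chosen purely for illustration:

```python
import numpy as np

def adaptive_threshold_segment(img, k=4.0):
    """Classic adaptive segmentation: flag pixels brighter than mean + k * std."""
    mu, sigma = img.mean(), img.std()
    return img > mu + k * sigma
```

This illustrates the dilemma described above: a large k suppresses false alarms but drops dim targets whose amplitude is only a few σ above the residue, while a small k floods the confirmation stage with candidates.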
Although the classic Yolov5 detection network performs well in both robustness and accuracy, it still faces a bottleneck on small infrared targets whose features are inconspicuous.
Fig. 4 shows the feature extraction network structure of the infrared point target detection network.
As shown in Fig. 4, the invention proposes an infrared point target detection network based on an improved Yolov5.
Step S3 comprises the following sub-steps:
S301. Build a dataset of infrared point and blob targets.
S302. Directly reduce the number of layers (convolutional and pooling layers), the number of residual structures, and the number of multi-scale receptive fields in the SPP of the Yolov5 feature extraction network; introduce a self-attention mechanism (Attention) by embedding a cascade of a channel attention module and a spatial attention module into the network (before each downsampling). This yields the improved Yolov5 infrared point target feature extraction network.
S303. Input the dataset images from step S301 into the improved Yolov5 feature extraction network to obtain feature maps at different scales.
S304. Perform classification and bounding box regression on the feature maps obtained in step S303 and compute the loss.
S305. After training of the improved Yolov5 infrared point target detection network is complete, run the test split of the dataset through it to detect infrared point targets, and evaluate the detection performance of the improved network.
Step S302 is described in detail as follows:
The invention redesigns the feature extraction part of the Yolov5 network (the structure is detailed in Fig. 4). Because point/blob targets occupy few pixels, their information hardly propagates to the upper layers of the network. Even with compensation and fusion strategies such as the feature pyramid (FPN), Yolov5 cannot effectively solve two problems: (1) the spatial hierarchy and structural information already lost during downsampling are hard to restore by simple upsampling; (2) coarse fusion does not necessarily improve the feature representation significantly. Taking the Yolov5 feature extraction network as a blueprint, the invention therefore directly reduces the number of layers (convolutional and pooling), the number of residual structures, and the number of multi-scale receptive fields in the SPP, and then introduces a self-attention mechanism, embedding a cascade of a channel attention module and a spatial attention module (before each downsampling). By modeling the correlations across feature channels and spatial positions, this regulates the "information flow" inside the network, strengthens the weight and representation of useful information, and generates strongly semantic feature maps that adequately characterize small targets, enabling high-quality detection.
Reducing the number of layers, residual structures, and multi-scale receptive fields in the SPP of the Yolov5 feature extraction network ensures that the information of point/blob targets, which occupy few pixels, reaches the upper layers of the network, avoiding information loss and preventing coarse fusion or skip connections from muddling the feature representation.
The self-attention mechanism introduced into the Yolov5 feature extraction network embeds a cascade of a channel attention module and a spatial attention module (before each downsampling). By modeling the correlations across feature channels and spatial positions, it regulates the "information flow" inside the network, strengthens the weight and representation of useful information, and generates strongly semantic feature maps that adequately characterize small targets.
Step S302 comprises the following sub-steps:
S30201. An image A of size H×W×1 (height, width, channels) is input and processed by a Focus layer (slicing, concatenation, convolution), outputting feature Q1 of size H×W×32;
S30202. Feature Q1 passes through a Conv layer (convolution, normalization, SiLU activation), outputting feature Q2 of size H/2×W/2×64;
S30203. Feature Q2 is processed by a Bottleneck (two convolutions), outputting feature Q3 of size H/2×W/2×64;
S30204. Feature Q3 is processed by the Attention module (described below), outputting feature Q4 of size H/2×W/2×64;
S30205. Feature Q4 passes through a Conv layer (convolution, normalization, SiLU activation), outputting feature Q5 of size H/4×W/4×128;
S30206. Feature Q5 passes through three BottleneckCSP layers (convolution, Bottleneck, Conv, concatenation, normalization, LeakyReLU activation), outputting feature Q6 of size H/4×W/4×128;
S30207. Feature Q6 is processed by the Attention module, outputting feature Q7 of size H/4×W/4×128;
S30208. Feature Q7 passes through a Conv layer (convolution, normalization, SiLU activation), outputting feature Q8 of size H/8×W/8×256;
S30209. Feature Q8 is processed by an SPP layer (Conv, parallel max pooling at different scales, concatenation), outputting feature Q9 of size H/8×W/8×256;
S30210. Feature Q9 passes through one BottleneckCSP layer, outputting feature Q10 of size H/8×W/8×256;
S30211. Feature Q10 is processed by the Attention module, outputting feature Q11 of size H/8×W/8×256;
S30212. Feature Q11 passes through one BottleneckCSP layer, outputting feature Q12 of size H/8×W/8×256, which serves as the final feature map for targets of roughly 30×30 pixels;
S30213. Feature Q12 is upsampled, concatenated with feature Q7, and processed by a Conv layer (convolution, normalization, SiLU activation), outputting feature Q13 of size H/4×W/4×128;
S30214. Feature Q13 passes through one BottleneckCSP layer, outputting feature Q14 of size H/4×W/4×128, which serves as the final feature map for targets of roughly 10×10 pixels;
S30215. Feature Q14 is upsampled, concatenated with feature Q4, and processed by a Conv layer (convolution, normalization, SiLU activation), outputting feature Q15 of size H/2×W/2×64;
S30216. Feature Q15 passes through one BottleneckCSP layer, outputting feature Q16 of size H/2×W/2×64, which serves as the final feature map for targets of roughly 5×5 pixels.
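The size flow S30201–S30216 can be summarized as a trace (sizes only, mirroring the steps above; Attention, Bottleneck, BottleneckCSP, and SPP are all shape-preserving here):

```python
def improved_yolov5_shapes(H, W):
    """Shape flow of the improved feature extraction network (sizes only)."""
    Q = {}
    Q[1] = (H, W, 32)                            # Focus
    Q[2] = Q[3] = Q[4] = (H // 2, W // 2, 64)    # Conv, Bottleneck, Attention
    Q[5] = Q[6] = Q[7] = (H // 4, W // 4, 128)   # Conv, 3x BottleneckCSP, Attention
    Q[8] = Q[9] = Q[10] = Q[11] = (H // 8, W // 8, 256)  # Conv, SPP, CSP, Attention
    Q[12] = (H // 8, W // 8, 256)                # head for ~30x30 px targets
    Q[13] = Q[14] = (H // 4, W // 4, 128)        # upsample + concat Q7 + Conv; head for ~10x10 px
    Q[15] = Q[16] = (H // 2, W // 2, 64)         # upsample + concat Q4 + Conv; head for ~5x5 px
    return Q
```

The trace makes the design choice explicit: the three detection heads sit at strides 8, 4, and 2, so even the finest head keeps a half-resolution grid for the smallest (≈5×5 pixel) targets rather than the stride-8 minimum of the stock Yolov5.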
The Attention module consists of a channel attention operation followed by a spatial attention operation.
The channel attention operation is performed first:
S3001: Given an input feature E of size H×W×C (height, width, number of channels), average pooling over the spatial dimensions yields feature E1 of size 1×1×C, and max pooling over the spatial dimensions yields feature E2 of size 1×1×C;
S3002: E1 passes through a fully connected layer and a ReLU activation to output feature E1-1 of size 1×1×C1, with C1 = H×W×C/16; E2 passes through the fully connected layer and ReLU activation to output feature E2-1 of size 1×1×C1;
S3003: E1-1 passes through a fully connected layer to output feature E1-2 of size 1×1×C, and E2-1 passes through a fully connected layer to output feature E2-2 of size 1×1×C;
S3004: E1-2 and E2-2 are summed and passed through a Sigmoid function to output feature E3 of size 1×1×C. E3 is combined with the original input feature E by a Hadamard product along the channel dimension, yielding feature map E4 of size H×W×C.
The spatial attention operation is then applied to feature map E4:
S3005: Given input feature E4 of size H×W×C (height, width, number of channels), average pooling over the channel dimension yields feature E5 of size H×W×1, and max pooling over the channel dimension yields feature E6 of size H×W×1;
S3006: E5 and E6 are concatenated and passed through a convolution layer and a Sigmoid function to output feature E7 of size H×W×1;
S3007: E7 and E4 are combined by a Hadamard product along the spatial dimensions to output the final feature map E8 of size H×W×C.
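Steps S3001–S3007 describe a CBAM-style attention block. The NumPy sketch below mirrors that data flow with random stand-in weights (the real layers are learned during training). Two assumptions are made for a dependency-free illustration: the channel reduction uses C1 = C/16 (the common CBAM form) rather than the H×W×C/16 stated above, and the learned convolution of S3006 is replaced by a simple average of the two pooled maps.

```python
import numpy as np

def channel_attention(E, r=16):
    """Channel attention (S3001-S3004), NumPy sketch.
    E: (H, W, C). The reduction ratio r=16 and the random weights
    are stand-ins; the real MLP weights are learned."""
    H, W, C = E.shape
    C1 = max(C // r, 1)
    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((C, C1)) * 0.1   # shared FC layer 1
    W2 = rng.standard_normal((C1, C)) * 0.1   # shared FC layer 2
    relu = lambda x: np.maximum(x, 0.0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    e1 = E.mean(axis=(0, 1))          # S3001: spatial average pooling -> (C,)
    e2 = E.max(axis=(0, 1))           # S3001: spatial max pooling -> (C,)
    e1_2 = relu(e1 @ W1) @ W2         # S3002-S3003 for the average branch
    e2_2 = relu(e2 @ W1) @ W2         # S3002-S3003 for the max branch
    e3 = sigmoid(e1_2 + e2_2)         # S3004: channel weights (C,)
    return E * e3                     # Hadamard product along channels -> E4

def spatial_attention(E4):
    """Spatial attention (S3005-S3007); the learned convolution of
    S3006 is replaced here by an average of the two pooled maps."""
    e5 = E4.mean(axis=2)              # S3005: average over channels
    e6 = E4.max(axis=2)               # S3005: max over channels
    m = (e5 + e6) / 2.0               # stand-in for the learned conv
    e7 = 1.0 / (1.0 + np.exp(-m))     # S3006: sigmoid map (H, W)
    return E4 * e7[..., None]         # S3007: Hadamard along space -> E8

E = np.random.default_rng(1).random((8, 8, 32))
E8 = spatial_attention(channel_attention(E))
print(E8.shape)  # (8, 8, 32)
```

Because both attention maps lie in (0, 1), the block can only rescale the input feature downward per element, which matches its role of re-weighting rather than generating features.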
S4. The targets are confirmed by a trajectory association matching algorithm; false alarms are removed from the detections to obtain the real targets.
After the processing of the above steps, the detection results still contain false alarms such as cloud edges, noise, and debris in addition to the real targets. Unlike false alarms, a real target has a distinctive, stable, and regular track: a distant target forms a stable track across consecutive frames, debris mainly follows a free-fall trend, while noise and cloud clutter are relatively random and do not form continuous motion trajectories. Based on the continuity and regularity of target motion, the present invention first predicts the trajectory, then discriminates targets using the positional relationship between candidate target points in adjacent frames: nearest-neighbor association is used to associate the candidate points of the current frame with the prediction results of the preceding frames and the confirmed detections of the preceding frames, so as to determine whether each candidate point is a real target.
Step S4 includes the following sub-steps:
S41. Predict the trajectory of each candidate target point in the current frame to obtain its predicted trajectory value in the next frame.
Step S41 includes the following sub-steps:
S411. Obtain the centroid position (x, y) of each candidate target point in the detection result of each frame.
S412. Estimate the candidate target points with a probabilistic data association algorithm to obtain the trajectory prediction value for the next frame.
S413. Compute the next-frame centroid position (x', y') of each candidate target point from the prediction value using an unscented Kalman particle filter.
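Steps S412–S413 rely on probabilistic data association and an unscented Kalman particle filter. As a deliberately simplified stand-in, the sketch below predicts the next-frame centroid with a plain constant-velocity Kalman filter; the state layout and the noise levels q and r are assumptions, not values from the source.

```python
import numpy as np

def predict_centroid(track, q=1e-2, r=1.0):
    """Predict the next-frame centroid (x', y') of one candidate point
    from its past centroids. A constant-velocity Kalman filter - a
    simplified stand-in for the unscented Kalman particle filter of
    step S413; q and r are assumed noise levels."""
    track = [np.asarray(p, float) for p in track]
    F = np.eye(4); F[0, 2] = F[1, 3] = 1.0          # state: (x, y, vx, vy)
    Hm = np.zeros((2, 4)); Hm[0, 0] = Hm[1, 1] = 1.0
    Q = q * np.eye(4); R = r * np.eye(2)
    # Initialise position and velocity from the first two centroids.
    x = np.concatenate([track[1], track[1] - track[0]])
    P = np.eye(4)
    for z in track[2:]:
        x = F @ x; P = F @ P @ F.T + Q              # time update
        S = Hm @ P @ Hm.T + R
        K = P @ Hm.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (z - Hm @ x)                    # measurement update
        P = (np.eye(4) - K @ Hm) @ P
    return (F @ x)[:2]                              # predicted (x', y')

# A point drifting ~2 px/frame in x; the prediction extrapolates it.
p = predict_centroid([(10.0, 5.0), (12.0, 5.1), (14.0, 4.9), (16.0, 5.0)])
print(p)  # approximately [18., 5.]
```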
S42. Using the nearest-neighbor association method, associate the candidate target points of the current frame with the trajectory predictions of historical frames and the confirmed detections of historical frames;
If the association succeeds, the candidate target point is a real target, and the detection result of each frame is obtained;
If the association fails, the candidate target point is a false alarm.
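The association test of step S42 can be sketched as a gated nearest-neighbor match: a candidate is kept if its distance to the closest predicted or previously confirmed position is within a gate radius. The gate value below is an assumption; the source does not specify it.

```python
import numpy as np

def nn_associate(candidates, references, gate=3.0):
    """Nearest-neighbour association (step S42), NumPy sketch.
    candidates: current-frame candidate centroids, shape (N, 2).
    references: predicted / previously confirmed positions, (M, 2).
    gate: association radius in pixels (an assumed value).
    Returns a boolean mask: True = associated (real target),
    False = association failed (false alarm)."""
    candidates = np.asarray(candidates, float)
    references = np.asarray(references, float)
    if len(references) == 0:
        return np.zeros(len(candidates), dtype=bool)
    # Pairwise Euclidean distances, then min over references.
    d = np.linalg.norm(candidates[:, None, :] - references[None, :, :], axis=2)
    return d.min(axis=1) <= gate

cands = [(10.2, 5.1), (40.0, 40.0)]
refs = [(10.0, 5.0), (25.0, 30.0)]
print(nn_associate(cands, refs))  # [ True False]
```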
The method for detecting dim infrared point targets against a dynamic background provided by the present invention further includes a preprocessing step S0: removing bad bright pixels from the infrared image sequence with an automatic detection algorithm.
A unit of the infrared camera that cannot sense light normally is called a dead pixel. Dead pixels are divided into bad dark pixels and bad bright pixels. A bad bright pixel is generally only one pixel in size; its brightness is not affected by the brightness of the surrounding pixels and remains essentially unchanged from frame to frame. The usual way of handling bad bright pixels is to know the infrared image sensor model, record and store the positions of the bad bright pixels in advance, and later remove them according to the recorded positions. This approach is limited in practical engineering applications: on the one hand, the data to be processed may come from an unknown infrared image sensor model; on the other hand, because of sensor aging or damage during installation and use, the number of bad bright pixels may grow over time.
In step S0 above, the bad bright pixels in the infrared image sequence can be removed in an offline mode or in an online mode.
The present invention provides a more robust automatic bad-bright-pixel detection algorithm, which has an offline mode and an online mode.
In the offline mode, bad-bright-pixel removal is performed before step S1;
In the online mode, bad-bright-pixel removal is performed between step S3 and step S4.
The offline mode is time-consuming and suits tasks without strict real-time requirements. The offline automatic bad-bright-pixel detection algorithm includes the following steps:
S01. Sort the pixels of each frame of the infrared image sequence FN(i,j) by gray value in descending order, and record the coordinates of the top P (P>100) pixels as the suspected bad-bright-pixel set.
S02. Randomly select M (M>1000) frames from the infrared image sequence FN(i,j), and take the intersection S(i,j) of the suspected bad-bright-pixel sets of the M frames as the bad-bright-pixel set;
If the target track in the infrared image sequence is known, the target positions are removed from the bad-bright-pixel set S(i,j) accordingly;
If no target-track prior is available, the bad-bright-pixel sets S(i,j) of several (>3) infrared image sequences FN(i,j) are further intersected, to avoid target positions (which are also bright) being mistaken for bad bright pixels.
S03. For each frame of the infrared image sequence FN(i,j), replace each bad bright pixel with the mean gray value of its neighborhood, and update the infrared image sequence FN(i,j) to f(i,j).
The neighborhood consists of the pixels immediately above, below, to the left, and to the right of the bad bright pixel.
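Steps S01–S03 can be sketched directly: collect the top-P brightest coordinates per frame, intersect over the sampled frames, and repair with the 4-neighbor mean. The values P=150 and the small sequence used below are illustrative only (the source requires P>100 and M>1000).

```python
import numpy as np

def detect_bad_pixels(frames, P=150, M=None, seed=0):
    """Offline bad-bright-pixel detection (steps S01-S02), NumPy sketch.
    frames: array (N, H, W). The coordinates of the P brightest pixels
    of each of M randomly chosen frames are intersected; a genuinely
    bad bright pixel is bright in every frame and so survives."""
    N, H, W = frames.shape
    rng = np.random.default_rng(seed)
    idx = rng.choice(N, size=min(M or N, N), replace=False)
    bad = None
    for k in idx:
        flat = frames[k].ravel().argsort()[::-1][:P]   # top-P gray values
        coords = {(int(i // W), int(i % W)) for i in flat}
        bad = coords if bad is None else bad & coords   # running intersection
    return bad

def repair(frame, bad):
    """Step S03: replace each bad pixel by the mean of its 4-neighbours."""
    out = frame.astype(float).copy()
    H, W = frame.shape
    for (i, j) in bad:
        nb = [frame[a, b] for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
              if 0 <= a < H and 0 <= b < W]
        out[i, j] = float(np.mean(nb))
    return out

rng = np.random.default_rng(1)
seq = rng.random((20, 32, 32)) * 0.3
seq[:, 5, 7] = 1.0                      # a fixed bright dead pixel
bad = detect_bad_pixels(seq, P=150)
print((5, 7) in bad)                    # True
```

Random bright clutter changes position from frame to frame, so only the fixed dead pixel survives the intersection; this is exactly why the source intersects many frames rather than thresholding a single one.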
In the online mode, the bad-bright-pixel removal module is embedded in the target confirmation module and distinguishes noise, bad bright pixels, and captured targets in the infrared image sequence. Based on the confirmation result of the first sub-sequence, bad bright pixels are removed from the other segments. Its accuracy is slightly lower than that of the offline version, but the added time cost is negligible. The online automatic bad-bright-pixel detection algorithm includes the following steps:
S011. Take M frames (M>1000) of the first sub-sequence, apply adaptive threshold segmentation, and take the intersection to obtain the bad-bright-pixel set S1(i,j).
That is, the threshold for each pixel is determined by the neighborhood window centered on it: a Gaussian-weighted convolution (plus a constant offset) serves as the threshold, and any pixel exceeding this threshold is judged a bad bright pixel; the resulting set is S1(i,j).
S022. Based on the target confirmation result of the first sub-sequence obtained by trajectory association matching, remove from the bad-bright-pixel set S1(i,j) the bad bright pixels that lie within a preset neighborhood radius of a target.
After trajectory association and target confirmation on the detection results of the first sub-sequence, the target position coordinates are obtained; the neighborhood radius is 2 or 5 pixels;
S033. Using the bad-bright-pixel set S1(i,j) as a reference, before trajectory association of the n-th sub-sequence (n≠1), check whether each sequence detection lies near a bad-bright-pixel neighborhood (with a neighborhood radius of 0.5 or 1 pixel); if so, the detection is considered a bad bright pixel and is removed from the detection results.
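The per-frame adaptive thresholding of step S011 can be sketched as follows. The source uses a Gaussian-weighted window; for a dependency-free sketch it is replaced here by a box window computed via an integral image, and the window size and constant offset are assumptions.

```python
import numpy as np

def adaptive_bright_mask(frame, win=5, c=0.1):
    """Adaptive threshold segmentation for one frame (step S011 detail).
    Each pixel's threshold is its local neighbourhood mean plus a
    constant c. win=5 and c=0.1 are assumed values; the source's
    Gaussian weighting is approximated by a uniform box window."""
    f = frame.astype(float)
    pad = win // 2
    fp = np.pad(f, pad, mode="edge")
    # Local window sums via a summed-area table (integral image).
    ii = np.pad(fp, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    s = ii[win:, win:] - ii[:-win, win:] - ii[win:, :-win] + ii[:-win, :-win]
    local_mean = s / (win * win)
    return f > local_mean + c            # True = candidate bad bright pixel

frame = np.full((16, 16), 0.2)
frame[4, 9] = 1.0                        # isolated bright pixel
mask = adaptive_bright_mask(frame)
print(mask[4, 9], int(mask.sum()))       # True 1
```

Intersecting such masks over M frames, as in S011, then discards transient detections and keeps only pixels that exceed their local threshold in every frame.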
Fig. 5 is a schematic diagram of the target detection results of the method for detecting dim infrared point targets against a dynamic background according to an embodiment of the present invention.
As shown in Fig. 5, the present invention achieves excellent detection performance on infrared images of low resolution, or whose feature and semantic information becomes indistinct after downsampling.
The present invention also provides a system for detecting dim infrared point targets against a dynamic background, including an image sub-sequence splitting module, a background suppression module, a target detection module, and a target confirmation module.
The image sub-sequence splitting module splits the infrared image sequence captured under dynamic background changes into sub-sequences;
The background suppression module suppresses background clutter in the sub-sequence images and enhances the targets in them through a fully convolutional spatio-temporal fusion background suppression network;
The target detection module detects targets in the background-suppressed images with an infrared point target detection network based on an improved Yolov5;
The target confirmation module confirms the targets with the trajectory association matching algorithm, removes false alarms from the detections, and obtains the real targets.
The image sub-sequence splitting module includes an infrared image sensor monitoring unit, a continuously-changing-frame removal unit, and a sub-sequence splitting unit.
The infrared image sensor monitoring unit computes the similarity between adjacent frames of the infrared image sequence to determine whether the infrared image sensor has moved; if the sensor has moved, the moment of movement is recorded;
The infrared image sensor monitoring unit includes a parameter setting sub-unit, a numerical calculation sub-unit, and a sensor movement moment recording sub-unit;
The parameter setting sub-unit sets the following parameters: the perceptual hash threshold pth, the difference hash threshold dth, the continuous change threshold sth, and the split threshold ssth;
The numerical calculation sub-unit computes the perceptual hash values and difference hash values of the current frame and the previous frame of the infrared image sequence FN(i,j), and from them the perceptual hash difference p and the difference hash difference d between the two frames;
where N is the length of the infrared image sequence;
The sensor movement moment recording sub-unit records, in the scene change set B, the frame numbers at which the infrared image sensor moved;
In the sensor movement moment recording sub-unit:
When the perceptual hash difference p > the perceptual hash threshold pth and the difference hash difference d > the difference hash threshold dth, it is determined that the infrared image sensor has moved; the current frame number is stored in the scene change set B, and B is sorted in chronological order.
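The hash comparison performed by the numerical calculation sub-unit can be sketched with a difference hash; the perceptual hash is computed analogously from a DCT of the downscaled frame and is omitted here. The hash size (64 bits) and the example threshold are assumptions, standing in for the thresholds pth and dth.

```python
import numpy as np

def dhash(img, size=8):
    """Difference hash of one frame, NumPy sketch. The image is reduced
    to (size x size+1) by block averaging, then each bit encodes
    whether a pixel is brighter than its right neighbour (64 bits)."""
    H, W = img.shape
    ys = np.linspace(0, H, size + 1).astype(int)
    xs = np.linspace(0, W, size + 2).astype(int)
    small = np.array([[img[ys[i]:ys[i+1], xs[j]:xs[j+1]].mean()
                       for j in range(size + 1)] for i in range(size)])
    return (small[:, 1:] > small[:, :-1]).ravel()

def hash_difference(a, b):
    """Hamming distance between the hashes of two frames; comparing it
    with a threshold (the role of dth) flags sensor movement."""
    return int(np.count_nonzero(dhash(a) != dhash(b)))

rng = np.random.default_rng(0)
f1 = rng.random((64, 64))
f2 = f1.copy()                      # identical frame -> distance 0
f3 = np.roll(f1, 16, axis=1)        # large shift -> large distance
print(hash_difference(f1, f2), hash_difference(f1, f3) > 5)  # 0 True
```

Because both hashes compare only coarse block averages, they are insensitive to target-sized changes but react strongly to whole-frame shifts, which is why the scheme detects sensor motion rather than moving targets.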
The continuously-changing-frame removal unit removes, according to the scene change set B, the continuously changing frames captured while the infrared image sensor was moving, and determines the start frame and end frame of each sub-sequence;
The sub-sequence splitting unit splits the infrared image sequence into sub-sequences;
In the sub-sequence splitting unit, when a sub-sequence is shorter than the split threshold ssth it is not split off, but is kept in the preceding sub-sequence, yielding the sub-sequence split set D;
The infrared image sequence FN(i,j) is split according to the sub-sequence split set D into sub-sequences FN1(i,j), FN2(i,j), FN3(i,j), ..., FNn(i,j).
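The splitting rule can be sketched as follows. For simplicity the sketch ignores the removal of frames captured during sensor motion and only applies the ssth merge rule; ssth=20 is an assumed value, since the source leaves the threshold open.

```python
def split_subsequences(n_frames, change_frames, ssth=20):
    """Split a sequence of n_frames at the recorded scene-change frames
    (set B), merging any segment shorter than ssth back into the
    previous one - a sketch of the sub-sequence splitting unit.
    Returns (start, end) frame ranges, end exclusive."""
    cuts = sorted(f for f in change_frames if 0 < f < n_frames)
    bounds = [0] + cuts + [n_frames]
    segments = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        if segments and end - start < ssth:
            segments[-1] = (segments[-1][0], end)   # keep in previous
        else:
            segments.append((start, end))
    return segments

# Changes at frames 100 and 110: the short 10-frame middle segment is
# merged back into the first sub-sequence.
print(split_subsequences(300, {100, 110}, ssth=20))  # [(0, 110), (110, 300)]
```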
The background suppression module includes a sub-sequence image division unit, a background prediction model construction unit, and a detection result output unit;
The sub-sequence image division unit splits each sub-sequence into T segments of t frames each;
The background prediction model construction unit feeds each t-frame segment of sub-sequence images into the fully convolutional spatio-temporal fusion background suppression network as a single t-channel image for training, obtaining the background prediction model;
The detection result output unit uses the background prediction model to compute the background suppression component of each pixel from the sub-sequence images, yielding output images with the background and noise removed and the targets enhanced.
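The data layout feeding the network can be sketched as follows: t consecutive frames are stacked as one t-channel input, and the predicted per-pixel background component is subtracted. The segment length t=5 and the toy "background model" (the temporal mean) are assumptions standing in for the trained network.

```python
import numpy as np

def to_t_channel_batches(frames, t=5):
    """Group a sub-sequence into segments of t consecutive frames and
    stack each as one t-channel image (the network input described
    above). t=5 is an assumed segment length; frames that do not fill
    a whole segment are dropped in this sketch."""
    N, H, W = frames.shape
    T = N // t
    return frames[:T * t].reshape(T, t, H, W)   # (T, t, H, W)

def suppress_background(segment, predict_bg):
    """Subtract the predicted per-pixel background component from the
    segment's last frame - a stand-in for the trained fully
    convolutional spatio-temporal fusion network."""
    return segment[-1] - predict_bg(segment)

frames = np.random.default_rng(0).random((23, 64, 64))
batches = to_t_channel_batches(frames, t=5)
print(batches.shape)  # (4, 5, 64, 64)

# Toy background model: the temporal mean of the segment.
residual = suppress_background(batches[0], lambda s: s.mean(axis=0))
print(residual.shape)  # (64, 64)
```

Stacking frames along the channel axis is what lets an otherwise purely spatial convolutional network see temporal context: a slowly varying background is consistent across the t channels, while a moving point target is not.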
The target confirmation module includes a candidate target point prediction unit and a real target determination unit;
The candidate target point prediction unit predicts the trajectory of each candidate target point in the current frame to obtain its predicted value in the next frame;
The candidate target point prediction unit includes a centroid position calculation sub-unit, a predicted value calculation sub-unit, and a next-frame centroid position prediction sub-unit;
The centroid position calculation sub-unit obtains the centroid position (x, y) of each candidate target point in the detection result of each frame;
The predicted value calculation sub-unit estimates the candidate target points with the probabilistic data association algorithm to obtain the trajectory prediction value for the next frame;
The next-frame centroid position prediction sub-unit computes the next-frame centroid position (x', y') of each candidate target point from the trajectory prediction value using the unscented Kalman particle filter, obtaining the prediction result.
The real target determination unit associates the candidate target points of the current frame with the trajectory predictions of historical frames and the confirmed detections of historical frames using the nearest-neighbor association method;
If the association fails, the candidate target point is a false alarm; if the association succeeds, the candidate target point is a real target; the detection result of each frame is obtained.
The system for detecting dim infrared point targets against a dynamic background provided by the present invention further includes a bad-bright-pixel removal module;
The bad-bright-pixel removal module removes bad bright pixels from the infrared image sequence with the automatic detection algorithm.
The bad-bright-pixel removal module includes an offline-mode bad-bright-pixel removal module, which specifically includes a suspected bad-bright-pixel set acquisition unit, a bad-bright-pixel set acquisition unit, and a bad-bright-pixel update unit;
The suspected bad-bright-pixel set acquisition unit sorts the pixels of each frame of the infrared image sequence FN(i,j) by gray value in descending order and records the coordinates of the top P (P>100) pixels as the suspected bad-bright-pixel set;
The bad-bright-pixel set acquisition unit randomly selects M (M>1000) frames of the infrared image sequence FN(i,j) and takes the intersection S(i,j) of the suspected bad-bright-pixel sets of the M frames as the bad-bright-pixel set.
If the target track in the infrared image sequence is known, the target positions are removed from the bad-bright-pixel set S(i,j) accordingly;
If no target-track prior is available, the bad-bright-pixel sets S(i,j) of several (>3) infrared image sequences FN(i,j) are further intersected, to avoid target positions (which are also bright) being mistaken for bad bright pixels;
The bad-bright-pixel update unit replaces each bad bright pixel in each frame of the infrared image sequence FN(i,j) with the mean gray value of its neighborhood and updates the infrared image sequence FN(i,j) to f(i,j).
The bad-bright-pixel removal module includes an online-mode bad-bright-pixel removal module, which specifically includes a threshold segmentation unit and a bad-bright-pixel removal unit;
The threshold segmentation unit takes M frames (M>1000) of the first sub-sequence, applies adaptive threshold segmentation, and takes the intersection to obtain the bad-bright-pixel set S1(i,j);
The bad-bright-pixel removal unit deletes the bad bright pixels from the infrared image sequence;
In the bad-bright-pixel removal unit:
Based on the target confirmation result of the first sub-sequence obtained by trajectory association matching, the bad bright pixels within the preset neighborhood radius of a target are deleted from the bad-bright-pixel set S1(i,j); then, using S1(i,j) as a reference, before trajectory association of the n-th sub-sequence, each sequence detection is checked for proximity to a bad-bright-pixel neighborhood; if a detection lies within the neighborhood, it is considered a bad bright pixel and is removed from the detection results, where n≠1.
Although the embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.
The specific embodiments of the present invention described above do not limit the scope of protection of the present invention. Any other corresponding changes and variations made in accordance with the technical concept of the present invention shall fall within the scope of protection of the claims of the present invention.
Claims (10)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211118428.7A (CN115205655B) | 2022-09-15 | 2022-09-15 | Infrared dark spot target detection system under dynamic background and detection method thereof |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN115205655A | 2022-10-18 |
| CN115205655B | 2022-12-09 |
Family ID: 83572049

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211118428.7A (CN115205655B, Active) | Infrared dark spot target detection system under dynamic background and detection method thereof | 2022-09-15 | 2022-09-15 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN115205655B |
Cited By (1)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN116665015B | 2023-06-26 | 2024-04-02 | 中国科学院长春光学精密机械与物理研究所 | A method for detecting weak and small targets in infrared sequence images based on YOLOv5 |

Families Citing this family (2)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| GB2632457A | 2023-08-09 | 2025-02-12 | Sony Interactive Entertainment Inc | Apparatus and method of video editing |
| CN117557789B | 2024-01-12 | 2024-04-09 | 国研软件股份有限公司 | Intelligent detection method and system for offshore targets |
Citations (6)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN106952286A | 2017-03-21 | 2017-07-14 | 中国人民解放军火箭军工程大学 | Object Segmentation Method Based on Motion Saliency Map and Optical Flow Vector Analysis in Dynamic Background |
| CN107392885A | 2017-06-08 | 2017-11-24 | 江苏科技大学 | A Method of Infrared Dim Small Target Detection Based on Visual Contrast Mechanism |
| CN108182690A | 2017-12-29 | 2018-06-19 | 中国人民解放军63861部队 | An infrared weak target detection method based on foreground-weighted local contrast |
| CN109003277A | 2017-06-07 | 2018-12-14 | 中国航空工业集团公司洛阳电光设备研究所 | A method and device for detecting infrared small targets in complex backgrounds |
| CN114972423A | 2022-05-17 | 2022-08-30 | 中国电子科技集团公司第十研究所 | A method and system for detecting moving targets in aerial video |
| CN114998566A | 2022-05-09 | 2022-09-02 | 中北大学 | Interpretable multi-scale infrared small and weak target detection network design method |

Family Cites Families (11)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN105894532A | 2015-07-27 | 2016-08-24 | 广东东软学院 | Sea surface monitoring image dim target detection method and device |
| CN108470350B | 2018-02-26 | 2021-08-24 | 阿博茨德(北京)科技有限公司 | Method and device for dividing lines in a line graph |
| CN109272509B | 2018-09-06 | 2021-10-29 | 郑州云海信息技术有限公司 | A continuous image target detection method, device, equipment and storage medium |
| CN110400294B | 2019-07-18 | 2023-02-07 | 湖南宏动光电有限公司 | Infrared target detection system and detection method |
| CN112418200B | 2021-01-25 | 2021-04-02 | 成都点泽智能科技有限公司 | Object detection method, device and server based on thermal imaging |
| CN113963421B | 2021-11-16 | 2023-04-07 | 南京工程学院 | Dynamic sequence unconstrained expression recognition method based on hybrid feature enhanced network |
| CN114648547B | 2022-03-09 | 2023-06-27 | 中国空气动力研究与发展中心计算空气动力研究所 | Weak and small target detection method and device for anti-UAV infrared detection systems |
| CN114882237B | 2022-04-11 | 2025-03-25 | 石河子大学 | An object detection method based on spatial attention and channel attention |
| CN114998736B | 2022-06-07 | 2024-08-13 | 中国人民解放军国防科技大学 | Infrared dim target detection method, device, computer equipment and storage medium |
| CN114998711A | 2022-06-21 | 2022-09-02 | 西安中科立德红外科技有限公司 | An aerial infrared weak and small target detection method, system and computer storage medium |
| CN115035378A | 2022-08-09 | 2022-09-09 | 中国空气动力研究与发展中心计算空气动力研究所 | Method and device for detecting infrared dim targets based on time-space domain feature fusion |
Also Published As

| Publication Number | Publication Date |
|---|---|
| CN115205655A | 2022-10-18 |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |