CN115797684A - Infrared small target detection method and system based on context information - Google Patents
- Publication number
- CN115797684A (application number CN202211461433.8A)
- Authority
- CN
- China
- Prior art keywords
- information
- feature
- small target
- target detection
- infrared
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention relates to a context-information-based method and system for detecting small targets in infrared images, and belongs to the technical field of computer vision. A small-target detection network is trained on an infrared small-target dataset using a classification loss function, a confidence loss function, and a position loss function. The trained network then extracts features from an infrared image. Finally, the extracted features are further fused, and infrared small-target detection is performed on the fused features to obtain the final detection result. The invention also proposes an infrared small-target detection system based on context information. The invention does not rely on additional infrared image denoising, enhancement, or other processing modules; training is end-to-end, and the implementation is simple, high-performing, and robust. The additional computational overhead is extremely low, which favors low-latency, high-speed infrared small-target detection, effectively raising the detection rate and lowering the missed-detection rate.
Description
Technical field
The invention relates to a method and device for detecting small targets in infrared images, in particular to an infrared small-target detection method and device based on context information, and belongs to the technical field of computer vision processing.
Background
Compared with visible-light images, infrared images are unaffected by extreme weather and environmental conditions, can be captured without external illumination, and offer strong detection capability at long range. From infrared surveillance systems to infrared guidance systems, infrared imaging has significant research and application value in both civilian and military fields. However, infrared images suffer from low resolution, blurred imaging, and a low signal-to-noise ratio, so small objects in them are easily drowned out by noise. Detecting small targets in infrared images effectively is therefore a challenging task that has attracted wide attention from the signal processing and computer vision communities.
Small-target detection is a technique for detecting small targets in images. It can identify the category and location of small targets under natural lighting conditions. Current mainstream methods are based on deep learning and deep convolutional neural networks and are widely used in surveillance and security, autonomous driving, remote-sensing satellites, and other fields. Under the COCO dataset definition, objects smaller than 32×32 pixels are in principle called small targets. Small targets occupy few pixels, and detection performance on them is far worse than on large targets. In complex scenes, for example when targets occlude one another, are occluded by the background, or are densely packed, small targets are affected more severely than large ones, which makes detection even harder.
Context information: objects usually appear together with a corresponding environment. Beyond the features of the object itself, there is a close relationship between the object and its surroundings; this information is the so-called feature context information. Because infrared images are heavily affected by noise and clutter, the small targets in them are easily disturbed. Using other target-related information in the image, combined with the features of the small target itself, can therefore improve detection results and reduce the probability of missing small targets.
Summary of the invention
The invention addresses defects and deficiencies in existing infrared small-target detection techniques: existing methods do not make full use of the local context around a target or the global context of the whole image, adapt poorly to feature variations across different categories of small targets, and fuse shallow and deep features improperly. To solve these technical problems, it proposes an infrared small-target detection method and system based on context information. The invention effectively improves infrared small-target detection performance and works well in practical applications.
To achieve the above purpose, the invention adopts the following technical solution.
An infrared small-target detection method based on context information comprises the following steps:
Step 1: Obtain and process an infrared small-target dataset.
Step 2: Train the small-target detection network using a classification loss function, a confidence loss function, and a position loss function.
Step 3: Use the trained small-target detection network to extract features from the infrared image and obtain the feature result. The method requires no preprocessing of the infrared image.
Step 4: Further fuse the extracted features and perform infrared small-target detection on the fused features to obtain the final detection result.
The invention further proposes an infrared small-target detection system based on context information, comprising an image processing module, a target information learning module, a feature extraction module, and a feature fusion and target detection module.
Beneficial effects
Compared with the prior art, the method and system of the invention have the following advantages:
1. The invention does not rely on additional infrared image denoising, enhancement, or other processing modules. Training is end-to-end, and the implementation is simple, high-performing, and robust.
2. The additional computational overhead is extremely low, which favors low-latency, high-speed infrared small-target detection. The detection rate for small targets rises, the missed-detection rate falls, and accuracy on targets of other scales also improves.
Description of drawings
Figure 1 is a flow chart of the method of the invention.
Figure 2 is a schematic diagram of the feature extraction used in the method.
Figure 3 shows the overall framework of the method and the internal details of feature fusion.
Figure 4 is a flow chart of the system of the invention.
Detailed description
To better explain the purpose and advantages of the invention, it is described in detail below with reference to the drawings.
As shown in Figure 1, an infrared small-target detection method based on context information comprises the following steps:
Step 1: Obtain an infrared small-target dataset and apply data augmentation.
A high-quality dataset is essential for a deep-learning-based infrared small-target detection method to perform well. However, existing infrared small-target datasets are few in number, small in scale, and contain an unfavorable proportion of small targets. The invention therefore first expands the infrared small-target dataset through data augmentation, which strengthens the robustness of the whole detection method.
Specifically, in an image containing infrared small targets, the small targets that do not overlap other targets are identified and then copied and pasted to random other positions in the image. A copied small target must not occlude other targets and must keep a distance from them.
Furthermore, other data augmentation operations (for example rotation and translation, scaling and cropping, or mosaic augmentation) can be applied on top of the copy-and-paste step; mosaic augmentation is preferred.
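The copy-and-paste augmentation described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the patent's implementation: the function names, the `(x1, y1, x2, y2)` box format, the 32-pixel size threshold (borrowed from the COCO definition mentioned in the background), the 4-pixel `margin`, and the retry count are all assumptions made for the example.

```python
import numpy as np

def boxes_overlap(a, b, margin=0):
    """True if box a overlaps box b after b is grown by `margin` pixels.
    Boxes are (x1, y1, x2, y2) with exclusive right and bottom edges."""
    return not (a[2] <= b[0] - margin or a[0] >= b[2] + margin or
                a[3] <= b[1] - margin or a[1] >= b[3] + margin)

def copy_paste_small_targets(image, boxes, rng, max_size=32, margin=4, tries=50):
    """Copy each isolated small target once to a random free location.
    All parameter defaults are illustrative assumptions."""
    image = image.copy()
    originals = [tuple(b) for b in boxes]
    boxes = list(originals)
    h, w = image.shape[:2]
    for idx, box in enumerate(originals):
        bw, bh = box[2] - box[0], box[3] - box[1]
        if bw >= max_size or bh >= max_size:
            continue  # not a small target
        others = originals[:idx] + originals[idx + 1:]
        if any(boxes_overlap(box, o) for o in others):
            continue  # only targets that do not overlap others are copied
        patch = image[box[1]:box[3], box[0]:box[2]].copy()
        for _ in range(tries):
            x = int(rng.integers(0, w - bw + 1))
            y = int(rng.integers(0, h - bh + 1))
            cand = (x, y, x + bw, y + bh)
            # the pasted copy must keep `margin` pixels away from every target
            if not any(boxes_overlap(cand, b, margin) for b in boxes):
                image[y:y + bh, x:x + bw] = patch
                boxes.append(cand)
                break
    return image, boxes
```

A call such as `copy_paste_small_targets(img, boxes, np.random.default_rng(0))` returns the augmented image together with the extended box list; further augmentations such as mosaic would be applied afterwards.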
Step 2: Train the small-target detection network using a classification loss function, a confidence loss function, and a position loss function.
Specifically, a total loss function L(x, x′) is defined, in which x and x′ denote the predicted value and the ground-truth value respectively; α_box, α_obj, and α_cls denote the weights of the three loss functions; and L_CIoU, L_obj, and L_cls denote the position loss, confidence loss, and classification loss of the detection task. k, s², and B denote the number of output feature maps, the number of grid cells, and the number of anchors (that is, positions) per grid cell. I_kij indicates whether the j-th anchor box of the i-th grid cell in the k-th output feature map is a positive sample: it is 1 for a positive sample and 0 for a negative sample. α_k balances the weights of output features at different scales.
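One form of the total loss that is consistent with the symbols defined above is sketched below. It is a hedged reconstruction, not the patent's exact expression: the upper limit K (the number of output feature maps), the placement of α_k as a per-scale factor, and the application of the indicator I_kij to the position and classification terms only are assumptions modelled on common YOLO-style losses.

```latex
% Assumed form; K, the placement of \alpha_k and of I_{kij} are not confirmed by the source.
L(x, x') = \sum_{k=1}^{K} \alpha_k \sum_{i=1}^{s^2} \sum_{j=1}^{B}
  \Big[ \alpha_{box}\, I_{kij}\, L_{CIoU}
      + \alpha_{obj}\, L_{obj}
      + \alpha_{cls}\, I_{kij}\, L_{cls} \Big]
```

Under this reading, the confidence term is accumulated over every anchor, while the position and classification terms contribute only for positive samples.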
Step 3: Use the trained small-target detection network to extract features from the infrared image and obtain the feature result. No preprocessing of the infrared image is required.
Conventional infrared small-target detection methods mainly extract features of the object itself and lack the ability to acquire and analyze context information. Small targets carry little texture information; without context information to supplement and enhance them, the complex background of a low-contrast infrared image easily swamps them, and the varied shapes of small targets add further difficulty to detection.
The method therefore first extracts features from the image. Because features of different shapes need different context information, dynamic context information extraction (Figure 2) establishes long-range dependencies among the pieces of information in the features. A position encoding added at the input supplies the positional information the features lack and mitigates the shortage of feature information for distant infrared targets. The input deep features are split into blocks, flattened into a sequence, given position information, and fed to a multi-head attention mechanism for weighted summation. A residual connection then refines the result and speeds convergence, followed by two fully connected layers and a second residual connection. Last comes a deformable convolution layer, which adds an offset term to the convolution so that object features are still expressed well when objects of different shapes and markedly different sizes appear at the same position.
Specifically, dynamic context information extraction proceeds as follows:
The input feature F has size C×H×W, where C is the number of channels, H the height, and W the width. Given a block size P, the C×H×W feature is divided into N blocks of size P×P×C.
The N blocks are linearly transformed into N feature vectors, and a flag vector x_p is added at the start of the sequence. The output F_0 of this block embedding is formed by concatenating (Concat[]) the flag vector with the N transformed blocks, where W_N is the weight parameter applied to the N-th block.
The embedded features F_0 still lack the relative position information between blocks. A position encoding E is therefore added to F_0 to obtain F_1, the result after position information has been added:
$$F_1 = E + F_0$$
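The block embedding and position encoding steps above can be illustrated with a short NumPy sketch. It assumes a single shared projection matrix `W` for all blocks, an embedding width `D`, and N = (H/P)·(W/P) blocks; these choices, and the function name `patch_embed`, are assumptions for illustration rather than details confirmed by the source.

```python
import numpy as np

def patch_embed(F, P, W, x_p, E):
    """Split F into P x P blocks, project them, prepend the flag vector,
    and add the position encoding.

    F:   (C, H, W) input feature
    W:   (P*P*C, D) projection shared by all blocks (assumption)
    x_p: (D,) flag vector
    E:   (N + 1, D) position encoding, with N = (H // P) * (W // P)
    """
    C, H, Wd = F.shape
    assert H % P == 0 and Wd % P == 0, "feature size must be divisible by P"
    # (C, H/P, P, W/P, P) -> (H/P, W/P, P, P, C) -> (N, P*P*C)
    blocks = (F.reshape(C, H // P, P, Wd // P, P)
                .transpose(1, 3, 2, 4, 0)
                .reshape(-1, P * P * C))
    # F0 = Concat[x_p; block_1 W; ...; block_N W]
    F0 = np.concatenate([x_p[None, :], blocks @ W], axis=0)
    F1 = E + F0  # add position information
    return F0, F1
```

For a feature of size 8×4×4 with P = 2, this yields N = 4 blocks, so F_0 and F_1 each have 5 rows (the flag vector plus four embedded blocks).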
After position information is embedded, F_1 is multiplied by three different parameter matrices and mapped to a query matrix, a key matrix, and a value matrix. The attention mechanism yields multiple attention results, which represent different context information in the image. These results are concatenated and normalized to give the final aggregated context information:
$$\mathrm{head}_i = \mathrm{Attention}(F_1 W_q,\; F_1 W_k,\; F_1 W_v)$$
$$F_M = \mathrm{Concat}[\mathrm{head}_1; \mathrm{head}_2; \ldots; \mathrm{head}_h]\, W_M$$
Here Attention() denotes the attention operation, which is computed from the query matrix Q, the key matrix K, and the value matrix V using a transpose (T), a scaling factor, and a Softmax operation. F_1 is the result after position information has been added; W_q, W_k, W_v, and W_M are learnable parameter matrices; head_i is the i-th of the multiple attention outputs; and F_M is the multi-head attention output feature. Concat denotes the concatenation operation.
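The multi-head attention step can be sketched in NumPy as below. The sketch assumes the standard scaled dot-product form Softmax(QKᵀ/√d_k)·V for Attention(), with d_k the per-head key width; the source names a scaling factor without giving its symbol, so √d_k is an assumption. The per-head weight layout `(h, D, d)` is likewise illustrative, and the normalization applied after concatenation is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention (assumed form): Softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k), axis=-1) @ V

def multi_head_attention(F1, Wq, Wk, Wv, WM):
    """F1: (N, D) tokens; Wq, Wk, Wv: (h, D, d) per-head projections;
    WM: (h * d, D) output projection. Returns F_M with shape (N, D)."""
    heads = [attention(F1 @ Wq[i], F1 @ Wk[i], F1 @ Wv[i])
             for i in range(Wq.shape[0])]
    # Concat[head_1; ...; head_h] W_M
    return np.concatenate(heads, axis=-1) @ WM
```

Each row of the Softmax output sums to 1, so every token's head output is a weighted average of the value vectors, which is the "weighted summation" the description refers to.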
The feed-forward network consists of two fully connected layers. After the residual connection and normalization, the multi-head attention output F_M is mapped to a high-dimensional space by the first fully connected layer and back to a low-dimensional space by the second, which further preserves useful information:
$$F_2 = F_M[0] + F_1$$
$$X = F_2 W_{fc1} W_{fc2} + F_1$$
where F_2 is the result after the residual connection, F_M[0] is the flag vector, X is the output, and W_fc1 and W_fc2 are the weights of the two fully connected layers.
After the context information has been processed, the output X passes through a deformable convolution, which dynamically adjusts the effective information and relates the different small targets to their context information.
In this operation, Y(p_0) denotes the deformable convolution output; X and Y are the input and output feature maps; p_0 is a position in the output feature; p_n denotes a neighboring position; and R denotes the range of real numbers. The function W() gives the weight at p_n. Δp_n is the offset, learned by a parallel convolution over the input features.
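The symbols defined above match the usual formulation of deformable convolution, which can be written as follows. This is a sketch that assumes the standard form, with Δp_n denoting the learned offset added to each sampling position; in that standard form R is the set of sampling positions of the convolution kernel.

```latex
% Assumed standard deformable convolution; \Delta p_n is the learned offset.
Y(p_0) = \sum_{p_n \in R} W(p_n) \cdot X(p_0 + p_n + \Delta p_n)
```

Because p_0 + p_n + Δp_n is generally fractional, implementations typically read X at that location by bilinear interpolation.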
Step 4: Further fuse the extracted features and perform infrared small-target detection on the fused features to obtain the final detection result.
Because of noise, the features of different small targets in infrared images vary greatly, which puts heavy demands on the model's feature fusion ability. Ordinary object detection models rely on plain upsampling, convolution, and feature concatenation. They do not analyze the positional features of objects along the spatial dimension or the semantic features along the channel dimension, or they fuse only the conspicuous features in the image and ignore the small-target information, so the final detection accuracy for small targets is low.
In this method, therefore, the features extracted from the image are fused, and target detection is performed on the fused features. During feature fusion, a multi-information fusion layer aggregates the channel and spatial information in the multiple features.
The aggregated features greatly improve the expression of the object's positional and semantic information. A new feature scale is added and fused during feature fusion to supplement the deep small-target features, which helps enrich the detailed features of small targets.
To add as much spatiotemporal information about objects as possible during feature fusion and retain more target features, the multi-information fusion layer performs an information fusion operation at each feature scale, as shown in Figure 3.
The multi-information fusion module (MFM) fuses information from different layers through multiple residual structures. Its structure, shown in Figure 3(c), contains three parts. The first is the IC layer, shown in Figure 3(b), which refines the feature information. Global pooling and max pooling are then applied at the channel level, the results pass through a fully connected layer with shared weights, are multiplied and added, and are normalized by a softmax function to obtain the extracted channel information. This is multiplied with the input to enhance the channel information.
After the channel information has been extracted and enhanced, global pooling and max pooling are applied at each position of the image. The two results are added, a 7×7 convolution can be used to superimpose the features, and a softmax function normalizes the result, which enhances the position information. Finally, a 1×1 convolution can further integrate the channel and spatial information.
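The channel and spatial enhancement steps can be sketched in NumPy as follows. Several details are assumptions made for the example: global pooling is taken to be average pooling, the shared fully connected layer is a single matrix `W_fc`, the two pooled branches are combined by addition, the spatial softmax runs over all H×W positions, and the IC layer and the final 1×1 convolution are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_enhance(X, W_fc):
    """X: (C, H, W); W_fc: (C, C) fully connected weights shared by both branches."""
    avg = X.mean(axis=(1, 2))          # global (average) pooling per channel
    mx = X.max(axis=(1, 2))            # max pooling per channel
    weights = softmax(avg @ W_fc + mx @ W_fc, axis=0)
    return X * weights[:, None, None]  # reweight the input channels

def spatial_enhance(X, kernel):
    """X: (C, H, W); kernel: (7, 7) convolution weights (cross-correlation form)."""
    pooled = X.mean(axis=0) + X.max(axis=0)   # per-position pooling, then add
    k = kernel.shape[0]
    padded = np.pad(pooled, k // 2)           # zero padding keeps the H x W size
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    conv = (windows * kernel).sum(axis=(-2, -1))
    weights = softmax(conv.ravel(), axis=0).reshape(conv.shape)
    return X * weights                         # reweight every position
```

Applying `spatial_enhance(channel_enhance(X, W_fc), kernel)` mirrors the order in the description: channel information is enhanced first, then position information.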
Deep features are rich in semantic information, but mostly the semantic information of the target. After several downsampling operations, the features related to small targets are easily masked by noise and hard to localize, whereas shallow features carry rich texture and position information for small targets. To use shallow features effectively to enhance the detail information and supplement the position information of small targets, an extra feature scale dedicated to small objects is added, along with an extra detection head that outputs detection results. The structures are named as in Figure 3(a): the outputs of the dynamic context information extraction module and the three subsequent multi-information fusion modules (MFM) are T5, T4, T3, and T2, whose sizes are 1/32, 1/16, 1/8, and 1/4 of the original image respectively. The features of the same size connected with T5, T4, and T3 are denoted R4, R3, and R2.
When the feature map has been processed to the T3 layer, the method continues to upsample the features and adds the T2 layer after upsampling; the T2 layer is also connected to the same-size features from the second layer of the backbone network. This improves the representation of small-target details and passes on shallow detail information. A small-target detection head is attached after the T2 layer to reduce the feature coupling between small targets and other targets on the same layer, lower the missed-detection rate for small targets, raise the chance of detecting them, and alleviate the poor accuracy caused by an overly large scale. To match the channels of the later network, an R2 layer is added after the T2 layer and connected with the T3 layer features of the same dimension.
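The effect of the extra T2 scale can be seen from simple size arithmetic. The helper below is a small illustration; the function name and the example input size are hypothetical, and only the four ratios 1/32, 1/16, 1/8, and 1/4 come from the description.

```python
def feature_map_sizes(height, width):
    """Return the (height, width) of T5, T4, T3 and T2 for a given input size,
    using the ratios 1/32, 1/16, 1/8 and 1/4 stated in the description."""
    strides = {"T5": 32, "T4": 16, "T3": 8, "T2": 4}
    return {name: (height // s, width // s) for name, s in strides.items()}

def cells_covered(target_size, stride):
    """Approximate number of feature-map cells spanned by a target along one axis."""
    return target_size / stride
```

For a hypothetical 640×512 input, T2 is 160×128 while T5 is only 20×16. A target 16 pixels wide spans about 4 cells on T2 but only half a cell on T5, which is why a dedicated detection head on T2 is better placed to localize small targets.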
The invention further proposes an end-to-end infrared small-target detection system based on context information. As shown in Figure 4, it comprises an infrared image processing module 10, a target information learning module 20, a feature extraction module 30, and a feature fusion and target detection module 40.
The infrared image processing module 10 processes the infrared image dataset used to train the small-target detection model. It increases the number of small targets, enriches the variety of scenes in the dataset, and strengthens the robustness of the model.
The small-target information learning module 20 guides the small-target detection model to learn robust image features. It trains the model on the infrared small-target dataset through information learning and outputs the trained small-target detection model.
The image feature extraction module 30 uses the dynamic context information extraction module to extract the information surrounding the target and the globally related information from the image features, and adapts to contour variations of different small targets. It extracts stable, clean small-target features from the infrared image to enable accurate infrared small-target detection.
The feature fusion and target detection module 40 fuses the extracted features. From the fused image features it identifies and extracts the category, position, size, and shape of the targets of interest, giving the final infrared small-target detection result.
The modules are connected as follows:
The output of the infrared image processing module 10 is connected to the input of the small-target information learning module 20.
The output of the small-target information learning module 20 is connected to the input of the image feature extraction module 30.
The output of the image feature extraction module 30 is connected to the input of the feature fusion and target detection module 40.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211461433.8A CN115797684A (en) | 2022-11-21 | 2022-11-21 | Infrared small target detection method and system based on context information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211461433.8A CN115797684A (en) | 2022-11-21 | 2022-11-21 | Infrared small target detection method and system based on context information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115797684A (en) | 2023-03-14 |
Family
ID=85439732
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211461433.8A Pending CN115797684A (en) | Infrared small target detection method and system based on context information | 2022-11-21 | 2022-11-21 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115797684A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116205967A (en) * | 2023-04-27 | 2023-06-02 | 中国科学院长春光学精密机械与物理研究所 | Medical image semantic segmentation method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110084210B (en) | SAR image multi-scale ship detection method based on attention pyramid network | |
CN110188705B (en) | Remote traffic sign detection and identification method suitable for vehicle-mounted system | |
CN113095152B (en) | Regression-based lane line detection method and system | |
CN113052210A (en) | Fast low-illumination target detection method based on convolutional neural network | |
CN112800838A (en) | Channel ship detection and identification method based on deep learning | |
CN109145747A (en) | A kind of water surface panoramic picture semantic segmentation method | |
CN115205264A (en) | A high-resolution remote sensing ship detection method based on improved YOLOv4 | |
Berton et al. | Adaptive-attentive geolocalization from few queries: A hybrid approach | |
CN111753677A (en) | Multi-angle remote sensing ship image target detection method based on feature pyramid structure | |
CN117422971A (en) | Bimodal target detection method and system based on cross-modal attention mechanism fusion | |
CN115082855A (en) | Pedestrian occlusion detection method based on improved YOLOX algorithm | |
CN116503709A (en) | Vehicle detection method based on improved YOLOv5 in haze weather | |
CN114519819A (en) | Remote sensing image target detection method based on global context awareness | |
CN116258940A (en) | A small target detection method with multi-scale features and adaptive weight | |
CN117351360A (en) | Remote sensing image road extraction method based on attention mechanism improvement | |
CN117115359A (en) | Multi-view power grid three-dimensional space data reconstruction method based on depth map fusion | |
CN115223009A (en) | Small target detection method and device based on improved YOLOv5 | |
Liangjun et al. | MSFA-YOLO: A multi-scale SAR ship detection algorithm based on fused attention | |
Wang et al. | Hierarchical kernel interaction network for remote sensing object counting | |
CN116935332A (en) | Fishing boat target detection and tracking method based on dynamic video | |
CN115797684A (en) | Infrared small target detection method and system based on context information | |
CN115147644A (en) | Image description model training and description method, system, device and storage medium | |
CN113763261B (en) | Real-time detection method for far small target under sea fog weather condition | |
CN115115973A (en) | Weak and small target detection method based on multiple receptive fields and depth characteristics | |
CN118887511A (en) | A YOLOv8-based infrared remote sensing image target detection method, electronic device, and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||