CN116310795A - A SAR aircraft detection method, system, device and storage medium - Google Patents
- Publication number
- CN116310795A (application CN202310089469.6A)
- Authority
- CN
- China
- Prior art keywords: aircraft, feature map, branch, feature, deformable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/10 — Scenes; Scene-specific elements; Terrestrial scenes
- G06N3/084 — Neural networks; Learning methods; Backpropagation, e.g. using gradient descent
- G06V10/764 — Image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning; using neural networks
- G06V2201/08 — Indexing scheme relating to image or video recognition or understanding; Detecting or categorising vehicles
Abstract
The invention discloses a SAR aircraft detection method, system, device and storage medium. The method includes: acquiring an input SAR image; and analyzing the input SAR image with an aircraft detection model to obtain an aircraft detection result. The aircraft detection model is trained on a SAR image dataset annotated with aircraft categories and target bounding boxes. The model comprises a classification branch and a regression branch; the classification branch contains a deformable region correlation module that performs weighted feature integration over a deformable convolution branch and a regular convolution branch. By building the classification branch around the deformable correlation module, the invention significantly improves feature-correlation capability, and by training on SAR images annotated with aircraft categories and target bounding boxes it fully exploits prior knowledge of SAR aircraft scattering characteristics. The invention improves aircraft detection performance in complex SAR images and achieves accurate aircraft detection and recognition.
Description
Technical Field
The present invention relates to the technical field of aircraft detection, and in particular to a SAR aircraft detection method, system, device and storage medium.
Background Art
Synthetic aperture radar (SAR) is an active microwave imaging sensor. Because it can image day and night and in all weather conditions, it plays an important role in automatic target recognition. Within automatic target recognition, aircraft detection is a particularly valuable application. In the civilian domain, for example, dynamic monitoring of aircraft movements supports effective airport management; in the military domain, fast and accurate aircraft detection is important for providing reconnaissance information. Accurately detecting aircraft in high-resolution SAR images is therefore a problem of significant value.
SAR transmits electromagnetic waves toward an object through its antenna, receives the reflected waves, and forms an image from the recorded echoes. Owing to this unique imaging mechanism, SAR images present complex appearances that are difficult to interpret. The earliest approach to object detection in SAR images is constant false alarm rate (CFAR) detection, which exploits the fact that target echoes are usually stronger than background echoes and declares a target wherever the image exhibits a strong-echo region. In complex scenes, however, such as airports containing many other facilities and buildings, CFAR detection performance degrades severely and targets cannot be identified accurately. Moreover, modern aircraft detection also requires recognizing the target's category, which CFAR cannot provide.
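One classical realization of the CFAR idea described above, cell-averaging CFAR, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the guard/training window sizes and the threshold factor are placeholder values, not parameters from any method discussed in this document.

```python
import numpy as np

def ca_cfar(img, guard=1, train=2, factor=3.0):
    """Cell-averaging CFAR: flag a pixel as a target when its intensity
    exceeds `factor` times the mean of the surrounding training cells
    (a ring that excludes the guard window around the cell under test)."""
    h, w = img.shape
    r = guard + train
    detections = np.zeros_like(img, dtype=bool)
    for i in range(r, h - r):
        for j in range(r, w - r):
            full = img[i - r:i + r + 1, j - r:j + r + 1]
            inner = img[i - guard:i + guard + 1, j - guard:j + guard + 1]
            # local clutter estimate from the training ring only
            noise = (full.sum() - inner.sum()) / (full.size - inner.size)
            detections[i, j] = img[i, j] > factor * noise
    return detections

# A bright point target on uniform clutter is flagged; the clutter is not.
scene = np.ones((11, 11))
scene[5, 5] = 100.0
hits = ca_cfar(scene)
```

This also makes the failure mode described above visible: in a scene where the "clutter" contains other strong reflectors (buildings, vehicles), the local noise estimate rises and real targets are masked, and a detection carries no category information.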
An aircraft carries classic reflector structures such as dihedrals, trihedrals and top-hats, which give rise to multiple scattering mechanisms under radar illumination: direct scattering, multiple scattering, diffraction and so on. Later aircraft recognition methods therefore exploited the aircraft's scattering signature in SAR images via template matching, in which feature extraction for matching is the key step. One method uses the Harris-Laplace corner detector to extract salient points with stable scattering behavior on the aircraft and describes them with salient-point vectors; another uses a Gaussian mixture model to extract scattering-structure features comprising the aircraft's strong scattering points and their spatial distribution. Because hand-crafted features are insufficient and matching a measured image against many candidate templates is inefficient, such methods still fall short in both accuracy and speed.
With the rapid development of SAR technology, more high-resolution, high-quality, expert-annotated SAR images have become available, providing favorable conditions for applying deep learning to automatic SAR target recognition. In recent years, deep learning methods based on convolutional neural networks (CNNs) have made remarkable progress in object detection, and many CNN detectors, such as YOLOX, Cascade R-CNN, CenterNet and RepPoints, outperform traditional methods on SAR targets. Most CNN detectors, however, were designed for optical imagery and cannot reach their full potential when applied directly to SAR aircraft detection without accounting for aircraft scattering characteristics. Those characteristics manifest in two ways. 1) Discreteness: because the various reflector structures are irregularly distributed over the airframe, the radar cross section (RCS) differs from part to part, so an aircraft appears in a SAR image as a collection of discrete scattering points. 2) Variability: because the airframe is structurally complex, its multiple scattering mechanisms make the imaging result change with incidence angle and sensor parameters, so even the same target can scatter very differently under different imaging conditions. Under the SAR imaging mechanism, SAR aircraft images thus exhibit characteristics completely different from the corresponding optical images, and existing CNN methods cannot extract aircraft features sufficiently through conventional convolution.
In view of this, how to exploit prior knowledge of SAR aircraft scattering characteristics to achieve high-precision aircraft detection is an urgent problem to be solved.
Summary of the Invention
In view of this, embodiments of the present invention provide a SAR aircraft detection method, system, device and storage medium that can effectively improve a network model's ability to detect and recognize aircraft.
In one aspect, an embodiment of the present invention provides a SAR aircraft detection method, comprising:
acquiring an input SAR image;
analyzing the input SAR image with an aircraft detection model to obtain an aircraft detection result;
wherein the aircraft detection result comprises at least one target bounding-box regression result and the corresponding aircraft category confidence score;
the aircraft detection model is trained on a SAR image dataset annotated with aircraft categories and target bounding boxes; the aircraft detection model comprises a classification branch and a regression branch, the classification branch containing a deformable region correlation module that performs weighted feature integration over a deformable convolution branch and a regular convolution branch.
Optionally, the method further comprises:
creating a deformable scattering feature correlation network model based on the regression branch and the classification branch with the deformable region correlation module;
wherein the deformable scattering feature correlation network model comprises a Swin Transformer backbone, a path aggregation feature pyramid network and a decoupled head; the decoupled head comprises the regression branch and the classification branch with the deformable region correlation module.
Optionally, creating the deformable scattering feature correlation network model based on the classification branch with the deformable region correlation module and the regression branch comprises:
constructing a YOLOX model, wherein the YOLOX model comprises a Darknet-53 backbone, a path aggregation feature pyramid network and an original decoupled head, the original decoupled head comprising an original classification branch and a regression branch;
creating the deformable scattering feature correlation network model by replacing the Darknet-53 backbone of the YOLOX model with a Swin Transformer backbone and replacing the original classification branch of the YOLOX model with the classification branch containing the deformable region correlation module;
wherein the Swin Transformer backbone comprises 24 Swin Transformer layers.
Optionally, the method further comprises:
determining training samples from the SAR image dataset annotated with aircraft categories and target bounding boxes, wherein a preset number of strong scattering regions are marked within the target bounding boxes;
performing classification and regression training of the deformable scattering feature correlation network model on the training samples under preset training parameters using stochastic gradient descent (SGD), and optimizing the network model parameters by backpropagating the loss value of the model loss function, to obtain the aircraft detection model.
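The SGD-with-backpropagation update described above reduces, for each parameter step, to moving the parameters against the gradient of the loss. A toy sketch of that loop follows; the quadratic loss, learning rate and step count are placeholders for illustration, not the patent's preset training parameters.

```python
import numpy as np

def sgd_train(params, grad_fn, lr=0.1, steps=200):
    """Repeatedly step the parameters against the loss gradient,
    as backpropagation-based SGD training does for network weights."""
    for _ in range(steps):
        params = params - lr * grad_fn(params)
    return params

# Minimise L(w) = ||w - target||^2, whose gradient is 2 * (w - target).
target = np.array([1.0, -2.0])
w = sgd_train(np.zeros(2), lambda w: 2.0 * (w - target))
```

In the real model the gradient is obtained by backpropagating the detection loss through the network rather than from a closed-form expression, but the update rule is the same.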
Optionally, analyzing the input SAR image with the aircraft detection model to obtain the aircraft detection result comprises:
partitioning and splicing the SAR image and applying a linear mapping to obtain a first feature map;
passing the first feature map through the Swin Transformer backbone, in which consecutive Swin Transformer layers alternate between regular window partitioning and shifted window partitioning, to obtain second feature maps of preset sizes;
performing semantic feature extraction, comprising upsampling, concatenation and convolution, on the second feature maps through the path aggregation feature pyramid network to obtain third feature maps of preset sizes;
predicting positions from the third feature maps through the regression branch to obtain target bounding-box regression results, wherein each regression result comprises the center point, width and height of a box;
performing weighted feature integration on the third feature maps through the classification branch with the deformable region correlation module, using the deformable convolution branch and the regular convolution branch, to obtain the category confidence scores of the preset aircraft categories corresponding to the target bounding-box regression results.
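The shape of the data flowing through the steps above can be sketched in NumPy. The patch size, embedding dimension, class count and the random projections standing in for learned layers are all illustrative assumptions, and the backbone/FPN stages are elided; only the patch-embedding step and the two-branch decoupled head are shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def patch_embed(img, patch=4, dim=96):
    """Partition the image into non-overlapping patches, flatten each,
    and linearly map it to a token: the 'first feature map'."""
    H, W = img.shape
    x = img.reshape(H // patch, patch, W // patch, patch)
    x = x.transpose(0, 2, 1, 3).reshape(H // patch, W // patch, patch * patch)
    return x @ rng.standard_normal((patch * patch, dim))

def decoupled_head(feat, num_classes=7):
    """Stand-ins for the two branches of the decoupled head: the regression
    branch predicts (cx, cy, w, h) per location, the classification branch
    predicts a per-class confidence score per location."""
    c = feat.shape[-1]
    reg = feat @ rng.standard_normal((c, 4))
    cls = feat @ rng.standard_normal((c, num_classes))
    return reg, cls

img = rng.standard_normal((640, 640))   # input SAR image
feat = patch_embed(img)                 # 160 x 160 x 96 token map
reg, cls = decoupled_head(feat)         # box geometry and class scores
```

In the full model the token map would pass through the Swin Transformer backbone and the path aggregation feature pyramid network before reaching the head, and the classification branch would be the deformable region correlation module rather than a plain linear map.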
Optionally, a Swin Transformer layer comprises a normalization function, a window-based multi-head self-attention mechanism and a multi-layer perceptron, and processes a feature map by:
normalizing the target feature map with the normalization function to obtain a feature map X2;
applying the window-based multi-head self-attention mechanism to X2, with linear mapping and concatenation along the channel dimension, to obtain a feature map X3;
adding the target feature map and X3 to obtain a feature map X4;
normalizing X4 with the normalization function to obtain a feature map X5;
passing X5 through the multi-layer perceptron, activated by the GELU nonlinearity, to obtain a feature map X6, wherein the multi-layer perceptron comprises two fully connected layers;
adding X4 and X6 to obtain a feature map X7.
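The X2...X7 data flow above can be sketched in NumPy. As a simplification, the attention here is single-head with fixed random projections (the real layer is multi-head with learned, per-layer weights), and the window size and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def window_attention(x, win=4):
    """Self-attention restricted to each non-overlapping win x win window."""
    H, W, C = x.shape
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
    y = np.empty_like(x)
    for i in range(0, H, win):
        for j in range(0, W, win):
            t = x[i:i + win, j:j + win].reshape(-1, C)
            q, k, v = t @ Wq, t @ Wk, t @ Wv
            a = softmax(q @ k.T / np.sqrt(C))
            y[i:i + win, j:j + win] = (a @ v).reshape(win, win, C)
    return y

def gelu(x):
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def swin_block(x, win=4, hidden=4):
    C = x.shape[-1]
    W1 = rng.standard_normal((C, hidden * C)) / np.sqrt(C)
    W2 = rng.standard_normal((hidden * C, C)) / np.sqrt(hidden * C)
    x2 = layer_norm(x)              # X2: normalization
    x3 = window_attention(x2, win)  # X3: window-based self-attention
    x4 = x + x3                     # X4: first residual addition
    x5 = layer_norm(x4)             # X5: normalization
    x6 = gelu(x5 @ W1) @ W2         # X6: two-layer MLP with GELU
    return x4 + x6                  # X7: second residual addition

out = swin_block(rng.standard_normal((8, 8, 16)))
```

The shifted-window variant alternated with this one in the backbone differs only in how the windows are partitioned before attention is applied.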
Optionally, performing weighted feature integration on the third feature maps through the classification branch with the deformable region correlation module, using the deformable convolution branch and the regular convolution branch, to obtain the category confidence scores of the preset aircraft categories corresponding to the target bounding-box regression results, comprises:
applying a first convolution to the third feature map through the deformable convolution branch to obtain a feature map X9;
applying a second convolution to X9 to obtain a sampling-point offset map, and applying a third convolution to X9 to obtain a score mask;
performing deformable convolution with X9, the sampling-point offset map and the score mask to obtain a feature map Y;
applying two consecutive convolutions to the third feature map through the regular convolution branch to obtain a feature map Z;
weighting and summing Y and Z according to preset learnable hyperparameters and integrating the feature-channel information to obtain the category confidence scores of the preset aircraft categories corresponding to the target bounding-box regression results.
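A minimal NumPy sketch of the two-branch integration above, for a single-channel feature map. This is a heavily simplified stand-in, not the module's implementation: the "deformable" sampling rounds offsets to the nearest pixel instead of bilinear interpolation, all filter weights are random placeholders for learned ones, and the weighting scalar alpha is fixed rather than learnable.

```python
import numpy as np

rng = np.random.default_rng(0)
TAPS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def conv3x3(x, w):
    """Plain 3x3 'same' convolution on a single-channel map (zero padding)."""
    H, W = x.shape
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for k, (dy, dx) in enumerate(TAPS):
        out += w[k] * p[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return out

def deform_conv3x3(x, w, offsets, mask):
    """Simplified modulated deformable conv: each of the 9 taps samples at a
    per-pixel learned (dy, dx) offset (rounded to the nearest pixel here)
    and is scaled by a per-tap score mask."""
    H, W = x.shape
    rows = np.arange(H)[:, None]
    cols = np.arange(W)[None, :]
    out = np.zeros_like(x)
    for k, (dy, dx) in enumerate(TAPS):
        yy = np.clip(rows + dy + np.rint(offsets[k, 0]).astype(int), 0, H - 1)
        xx = np.clip(cols + dx + np.rint(offsets[k, 1]).astype(int), 0, W - 1)
        out += w[k] * mask[k] * x[yy, xx]
    return out

def drcm(feat, alpha=0.5):
    rand = lambda: 0.1 * rng.standard_normal(9)
    # deformable branch: first conv -> X9; further convs on X9 yield the
    # sampling-point offset map and the (sigmoid) score mask; then Y
    x9 = conv3x3(feat, rand())
    offsets = np.stack([np.stack([conv3x3(x9, rand()), conv3x3(x9, rand())])
                        for _ in range(9)])
    mask = 1.0 / (1.0 + np.exp(-np.stack([conv3x3(x9, rand())
                                          for _ in range(9)])))
    Y = deform_conv3x3(x9, rand(), offsets, mask)
    # regular branch: two consecutive convolutions -> Z
    Z = conv3x3(conv3x3(feat, rand()), rand())
    # weighted integration via a (here fixed) learnable scalar
    return alpha * Y + (1.0 - alpha) * Z

out = drcm(rng.standard_normal((16, 16)))
```

The learned offsets are what let the deformable branch sample off the regular grid and correlate an aircraft's discrete strong-scattering regions, while the regular branch preserves the local grid context that the weighted sum then blends back in.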
In another aspect, an embodiment of the present invention provides a SAR aircraft detection system, comprising:
a first module for acquiring an input SAR image;
a second module for analyzing the input SAR image with an aircraft detection model to obtain an aircraft detection result;
wherein the aircraft detection result comprises at least one target bounding-box regression result and the corresponding aircraft category confidence score;
the aircraft detection model is trained on a SAR image dataset annotated with aircraft categories and target bounding boxes; the aircraft detection model comprises a classification branch and a regression branch, the classification branch containing a deformable region correlation module that performs weighted feature integration over a deformable convolution branch and a regular convolution branch.
In another aspect, an embodiment of the present invention provides a SAR aircraft detection device, comprising a processor and a memory;
the memory is configured to store a program;
the processor executes the program to implement the foregoing method.
In another aspect, an embodiment of the present invention provides a computer-readable storage medium storing a program which, when executed by a processor, implements the foregoing method.
An embodiment of the present invention further discloses a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the storage medium and execute them, causing the computer device to perform the foregoing method.
An embodiment of the present invention first acquires an input SAR image and analyzes it with an aircraft detection model to obtain an aircraft detection result, wherein the result comprises at least one target bounding-box regression result and the corresponding aircraft category confidence score. The aircraft detection model is trained on a SAR image dataset annotated with aircraft categories and target bounding boxes; it comprises a classification branch and a regression branch, the classification branch containing a deformable region correlation module that performs weighted feature integration over a deformable convolution branch and a regular convolution branch. By building the classification branch around the deformable correlation module, the invention significantly improves feature-correlation capability, and by training on SAR images annotated with aircraft categories and target bounding boxes it fully exploits prior knowledge of SAR aircraft scattering characteristics. The invention improves aircraft detection performance in complex SAR images and achieves accurate aircraft detection and recognition.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a SAR aircraft detection method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the YOLOX model;
Fig. 3 is a schematic structural diagram of the Swin Transformer provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a Swin Transformer layer provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the DRCM provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the DSFCN provided by an embodiment of the present invention;
Fig. 7 is a schematic flowchart of constructing the DSFCN model provided by an embodiment of the present invention.
Detailed Description of Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.
首先需要说明的是,现有关于SAR的飞机检测技术包括:First of all, it should be explained that the existing SAR aircraft detection technologies include:
1、Fu等人在其发表的论文“Aircraft Recognition in SAR Images Based onScattering Structure Feature and Template Matching.”(IEEE Journal of SelectedTopics in Applied Earth Observation and Remote Sensing,(2018)4206-4217.)提出了一种基于模板匹配的飞机识别方法,其分析了飞机的散射特性,并使用高斯混合模型来提取目标的散射结构特征。该方法在匹配阶段通过提出的样本决策优化算法提高候选模板选择的效率,并在检测阶段提出坐标平移Kullback-Leibler散度方法以实现检测的平移不变性。然而,此种改进的模板匹配方法仍然存在匹配精度较低,且需要根据特定数据的属性手工制作特征,难以适用于其他分辨率SAR图像的问题。1. Fu et al. proposed a paper "Aircraft Recognition in SAR Images Based on Scattering Structure Feature and Template Matching." (IEEE Journal of SelectedTopics in Applied Earth Observation and Remote Sensing, (2018) 4206-4217.) An aircraft recognition method based on template matching, which analyzes the scattering characteristics of the aircraft, and uses a Gaussian mixture model to extract the scattering structure features of the target. In the matching stage, the proposed sample decision optimization algorithm is used to improve the efficiency of candidate template selection, and a coordinate translation Kullback-Leibler divergence method is proposed in the detection stage to achieve the translation invariance of detection. However, this improved template matching method still has the problems of low matching accuracy and the need to manually craft features according to the attributes of specific data, making it difficult to apply to other resolution SAR images.
2、Zhao等人在其发表的论文“Pyramid Attention Dilated Network forAircraft Detection in SAR Images.”(IEEE GEOSCI ENCE AND REMOTE SENSINGLETTERS,(2021)662-666.)中提出了一种金字塔注意力拓展网络来检测飞机。该方法考虑了SAR飞机的离散性特征,利用多分支扩张卷积模块来增强飞机离散后向散射特征之间的关系,并使用卷积快注意力模块来细化冗余信息,突出飞机显著特征。虽然该方法考虑了飞机的离散性特征,但其用于提取特征的卷积模块采样灵活性仍然受限,仅能采样于规整的网格,不足以充分建模飞机重要离散区域之间的关联,进而限制了检测的性能。2. In their paper "Pyramid Attention Dilated Network for Aircraft Detection in SAR Images." (IEEE GEOSCI ENCE AND REMOTE SENSINGLETTERS, (2021) 662-666.), Zhao et al. proposed a pyramid attention expansion network to detect airplane. This method considers the discrete characteristics of SAR aircraft, uses the multi-branch dilated convolution module to enhance the relationship between the discrete backscattering features of the aircraft, and uses the convolutional fast attention module to refine redundant information and highlight the salient features of the aircraft . Although this method considers the discrete features of the aircraft, the sampling flexibility of the convolution module used to extract features is still limited, and it can only be sampled on a regular grid, which is not enough to fully model the relationship between important discrete areas of the aircraft , which limits the detection performance.
3、Guo等人在其发表的论文“Scattering Enhanced Attention Pyramid Networkfor Aircraft Detection in SAR Images.”(IEEE Transactions on Geoscience andRemote Sensing,(2021)7570-7587.)中提出了一种散射增强注意力金字塔网络来检测飞机。该方法通过Harris-Laplace角点检测器提取飞机强散射点,再通过基于密度的噪声空间聚类方法和高斯混合模型进行建模,最后通过Kullback-Leibler散度衡量已知目标与模板之间的相关性,若匹配则增强SAR图片散射信息再送入网络。虽然该方法考虑了使用先验知识对SAR飞机进行建模再辅助网络提取特征的思路,但其仅在图片预处理阶段对部分图片进行γ分布CFAR增强,无法充分发挥网络的强大建模能力。3. Guo et al. proposed a scattering enhanced attention pyramid network in their paper "Scattering Enhanced Attention Pyramid Network for Aircraft Detection in SAR Images." (IEEE Transactions on Geoscience and Remote Sensing, (2021) 7570-7587.) to inspect the aircraft. The method uses the Harris-Laplace corner detector to extract the strong scattering points of the aircraft, and then uses the density-based noise space clustering method and the Gaussian mixture model to model, and finally uses the Kullback-Leibler divergence to measure the distance between the known target and the template. Correlation, if it matches, the scattering information of the SAR picture will be enhanced and then sent to the network. Although this method considers the idea of using prior knowledge to model SAR aircraft and then assisting the network to extract features, it only performs γ-distributed CFAR enhancement on some images in the image preprocessing stage, which cannot give full play to the powerful modeling capabilities of the network.
However, the relevant prior art has the following drawbacks. 1. Existing template-matching detection methods have low matching accuracy and require features hand-crafted for the properties of specific data, so they do not generalize to SAR data of different resolutions. 2. Most existing CNN-based detection methods were originally designed for optical image detection tasks; when applied to SAR aircraft detection they still give insufficient consideration to the aircraft's scattering characteristics in SAR images, which limits detection performance. 3. Existing methods that use the aircraft's scattering characteristics in SAR images to assist network feature extraction treat hand-crafted feature extraction and network feature extraction as two independent processes without organic integration, which prevents significant improvement of network performance.
In view of this, in one aspect and with reference to FIG. 1, an embodiment of the present invention provides a SAR aircraft detection method, comprising:
S100. Acquire an input SAR image.
The input SAR image to be used for aircraft detection is acquired; 1 m resolution SAR images are typically used and are uniformly sized to 640×640 pixels.
S200. Analyze the input SAR image with an aircraft detection model to obtain aircraft detection results.
It should be noted that the aircraft detection results comprise at least one target bounding-box regression result and the corresponding aircraft's class confidence score. The aircraft detection model is generated by training on a SAR image dataset annotated with aircraft classes and target bounding boxes. The model comprises a classification branch and a regression branch; the classification branch contains a deformable region correlation module, which performs weighted feature integration through a deformable convolution branch and a regular convolution branch.
In some embodiments, the method further comprises: creating a deformable scattering feature correlation network model based on the classification branch with the deformable region correlation module and the regression branch. The deformable scattering feature correlation network model comprises a Swin Transformer backbone, a path aggregation feature pyramid network, and a decoupled head; the decoupled head comprises the classification branch with the deformable region correlation module and the regression branch. Creating the model comprises: constructing a YOLOX model, which comprises a Darknet-53 backbone, a path aggregation feature pyramid network, and an original decoupled head consisting of an original classification branch and a regression branch; replacing the Darknet-53 backbone of the YOLOX model with a Swin Transformer backbone; and replacing the original classification branch of the YOLOX model with the classification branch containing the deformable region correlation module, thereby obtaining the deformable scattering feature correlation network model. The Swin Transformer backbone comprises 24 Swin Transformer layers.
In some embodiments, the method further comprises: determining training samples from the SAR image dataset annotated with aircraft classes and target bounding boxes, wherein each target bounding box is marked with a preset number of strong scattering regions; and, based on the training samples and preset training parameters, training the deformable scattering feature correlation network model for classification and regression with stochastic gradient descent (SGD), optimizing the network model parameters through loss-value backpropagation in combination with the model loss function, to obtain the aircraft detection model.
In some specific embodiments, the SAR image dataset annotated with aircraft classes and target bounding boxes is obtained as follows:
Step (1): 1 m resolution SAR images of multiple airports around the world, collected by the Gaofen-3 satellite, are uniformly cropped to 640×640 pixels, yielding a dataset of 2204 SAR images containing 5429 aircraft of 7 classes. The class and target rectangular bounding box of each aircraft are annotated by experts. The classes, divided by aircraft model, are A220, A320-321, A330, ARJ21, Boeing 737-800, Boeing 787, and others; the target rectangular bounding box is the circumscribed rectangle of the aircraft.
Step (2): The SAR images are randomly split into a training set (the training samples) and a test set at a ratio of 3:1.
Step (3): Strong scattering regions (SSRs) are marked. For each aircraft in a SAR image, 25 SSRs are extracted within its target rectangular bounding box, as follows:
Step (31): Mean-filter the pixels inside the bounding box, with a filter radius of one tenth of the shorter side of the box.
Step (32): Select the pixel with the largest value in the filtered box as the center of the first circular SSR of the given radius.
Step (33): Exclude the pixels falling inside the first SSR and select, from the remaining pixels, the one with the largest value as the center of the second SSR.
Step (34): Repeat the above steps until 25 SSRs are obtained.
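The SSR extraction of steps (31) to (34) can be sketched in Python as follows; the function names, the naive box mean filter, and the circular exclusion test are illustrative assumptions, not code from the patent:

```python
import numpy as np

def box_mean_filter(img, radius):
    """Naive box mean filter; edge pixels average over their valid neighborhood."""
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            out[i, j] = img[i0:i1, j0:j1].mean()
    return out

def extract_ssrs(patch, n_ssr=25, radius=None):
    """Greedily pick n_ssr strongest, mutually exclusive circular SSR centers."""
    h, w = patch.shape
    if radius is None:
        radius = max(1, min(h, w) // 10)     # one tenth of the shorter side
    filtered = box_mean_filter(patch, radius)
    available = np.ones((h, w), dtype=bool)
    yy, xx = np.mgrid[0:h, 0:w]
    centers = []
    for _ in range(n_ssr):
        if not available.any():
            break
        masked = np.where(available, filtered, -np.inf)
        cy, cx = np.unravel_index(np.argmax(masked), masked.shape)
        centers.append((cy, cx))
        # exclude pixels falling inside the new circular SSR
        available &= (yy - cy) ** 2 + (xx - cx) ** 2 > radius ** 2
    return centers, radius
```

Each new center is forced outside all previously chosen circles, so the 25 regions never overlap at their centers.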
Specifically, analyzing the input SAR image with the aircraft detection model to obtain the aircraft detection results comprises: partitioning, concatenating, and linearly mapping the SAR image to obtain a first feature map; passing it through the Swin Transformer backbone, whose consecutive Swin Transformer layers alternate between regular window partitioning and shifted window partitioning, to obtain second feature maps of preset sizes; performing semantic feature extraction, which includes upsampling, concatenation, and convolution, on the second feature maps through the path aggregation feature pyramid network to obtain third feature maps of preset sizes; predicting positions on the third feature maps through the regression branch to obtain target bounding-box regression results, each comprising a box's center point, width, and height; and performing weighted feature integration on the third feature maps through the deformable convolution branch and regular convolution branch of the classification branch with the deformable region correlation module, to obtain the class confidence scores of the preset aircraft classes corresponding to the target bounding-box regression results. It should be noted that the target bounding-box regression results comprise the regression results of the several target boxes detected and recognized in the SAR image; the number of aircraft class confidence scores equals the number of target boxes, in one-to-one correspondence with the aircraft contained in each target box.
In some embodiments, a Swin Transformer layer comprises normalization functions, a window-based multi-head self-attention mechanism, and a multilayer perceptron. A Swin Transformer layer processes a feature map as follows: normalize the target feature map with a normalization function to obtain feature map X2; from X2, apply the window-based multi-head self-attention mechanism, using linear mappings and channel-wise concatenation, to obtain feature map X3; add the target feature map and X3 to obtain feature map X4; normalize X4 with a normalization function to obtain feature map X5; pass X5 through the multilayer perceptron, activated by the GELU nonlinearity, to obtain feature map X6, wherein the multilayer perceptron comprises two fully connected layers; and add X4 and X6 to obtain feature map X7.
In some embodiments, performing weighted feature integration on the third feature map through the deformable convolution branch and the regular convolution branch of the classification branch to obtain the class confidence scores comprises: in the deformable convolution branch, applying a first convolution to the third feature map to obtain feature map X9; applying a second convolution to X9 to obtain a sampling-point offset map; applying a third convolution to X9 to obtain a score mask; performing deformable convolution according to X9, the offset map, and the score mask to obtain feature map Y; in the regular convolution branch, applying two consecutive convolutions to the third feature map to obtain feature map Z; and, based on preset learnable hyperparameters, weighted-adding Y and Z and integrating the feature-channel information, to obtain the class confidence scores of the preset aircraft classes corresponding to the target bounding-box regression results.
In some specific embodiments, creating the deformable scattering feature correlation network (DSFCN) model comprises the following steps:
(1) As shown in FIG. 2, first construct the YOLOX model. The YOLOX model comprises a Darknet-53 backbone, a path aggregation feature pyramid network (PAFPN), and an original decoupled head; the original decoupled head comprises an original classification branch (Classification) and a regression branch (Regression). Aircraft detection with the YOLOX model proceeds as follows:
1.1. For an input 640×640 SAR image, basic semantic features are extracted by the Darknet-53 backbone, which outputs three feature maps of sizes 80×80, 40×40, and 20×20.
1.2. The three feature maps are fed into the path aggregation feature pyramid network (PAFPN) neck, which fully generates semantic features through upsampling, concatenation, convolution, and other operations, and outputs three feature maps of sizes 80×80, 40×40, and 20×20.
1.3. The three feature maps are fed into the decoupled head (the original decoupled head), which outputs the final predictions. The decoupled head comprises a 3×3 convolution layer and two branches stacked from 3×3 convolution layers: the classification branch and the regression branch. The regression branch outputs, for each position on a feature map, the predicted target bounding-box regression result, comprising the box's center point, width, and height, together with a confidence score for whether the box contains a target; the classification branch outputs a confidence score for each class that the box predicted by the regression branch may contain.
(2) The Swin Transformer backbone is introduced to replace the Darknet-53 backbone (the structure of the Swin Transformer backbone is shown in FIG. 3); it comprises 24 Swin Transformer layers (STLs) in total (the structure of an STL is shown in FIG. 4). The Swin Transformer backbone processes a SAR image as follows:
2.1. The input SAR image is evenly partitioned into 16 patches, which are concatenated and linearly mapped to obtain feature map X1.
2.2. Feature map X1 is normalized by a layer normalization (LN) function to obtain feature map X2.
2.3. Features are extracted with window-based multi-head self-attention (W-MSA). The feature map is first partitioned into non-overlapping local windows of 7×7 feature vectors, and every feature vector within a window is treated as a token. Linear mappings then produce the query matrix (Q), key matrix (K), and value matrix (V) of all tokens. The corresponding expressions are as follows:
Q = X2·PQ, K = X2·PK, V = X2·PV
where PQ, PK, and PV denote three learnable linear mapping matrices.
Split-head self-attention is then performed on the resulting matrices: the Q, K, and V matrices are each divided into n groups along the channel dimension, and self-attention is applied separately to each subgroup Qs, Ks, Vs. The corresponding expression is as follows:
Attention(Qs, Ks, Vs) = Softmax(Qs·Ks^T/√d + B)·Vs
where B denotes the learnable relative position encoding matrix and d denotes the channel dimension of the Q, K, V vectors. The results of the subgroups are then concatenated along the channel dimension to obtain the value of each feature vector of the output feature map X3.
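The W-MSA computation for the tokens of a single 7×7 window can be sketched in NumPy as follows; the shared (rather than per-head) relative position bias B and the identity-sized projections are simplifying assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_mhsa(X, P_q, P_k, P_v, B, n_heads):
    """Multi-head self-attention over one window's tokens.
    X: (T, C) tokens (T = 49 for a 7x7 window); P_*: (C, C) learnable
    projections; B: (T, T) relative position bias shared across heads."""
    T, C = X.shape
    d = C // n_heads
    Q, K, V = X @ P_q, X @ P_k, X @ P_v
    outs = []
    for s in range(n_heads):                      # split along the channel dim
        Qs, Ks, Vs = (M[:, s * d:(s + 1) * d] for M in (Q, K, V))
        A = softmax(Qs @ Ks.T / np.sqrt(d) + B)   # Attention(Qs, Ks, Vs)
        outs.append(A @ Vs)
    return np.concatenate(outs, axis=1)           # concat heads -> (T, C)
```

Each head attends only within the window, so the cost is quadratic in the 49 window tokens rather than in the whole feature map.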
2.4. Feature map X3 is added to the original input feature map X1 to obtain feature map X4.
2.5. Feature map X4 passes through an LN to obtain feature map X5.
2.6. Feature map X5 passes through a multilayer perceptron (MLP) having two fully connected layers with a GELU nonlinear activation in between, yielding feature map X6.
2.7. Feature map X6 is added to the feature map X4 obtained in step 2.4 to obtain feature map X7.
2.8. Steps (2.2) to (2.7) constitute one STL feature-extraction pass. The symbols X1 to X7 above merely denote the feature maps of the corresponding stages and do not restrict the feature-map data; in different STLs of the Swin Transformer backbone, the X2 to X7 of each stage differ, and the target feature map includes feature map X1 as well as the X7 output by the previous STL. STL window partitioning takes two forms: partitioning local windows starting directly from point (0,0) of the feature map, called regular window partitioning, and partitioning starting from point (3,3), called shifted window partitioning. The feature map repeats steps (2.2) to (2.7), with regular and shifted window partitioning alternating across consecutive STLs. At the 4th, 8th, 20th, and 24th STL, the feature map is evenly split into 4 parts and concatenated, producing a feature map with halved size and doubled dimension; finally, the three feature maps of sizes 80×80, 40×40, and 20×20 at the 8th, 20th, and 24th STL are output (the second feature maps of preset sizes).
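The "evenly split into 4 parts and concatenate" downsampling of step 2.8 can be sketched as stacking each 2×2 spatial neighborhood along the channel dimension; the subsequent linear layer that reduces the 4C concatenated channels to 2C (giving the "halved size, doubled dimension") is omitted from this sketch:

```python
import numpy as np

def patch_merge(x):
    """(H, W, C) -> (H/2, W/2, 4C): stack each 2x2 spatial neighborhood
    along the channel dimension, halving the feature-map size."""
    H, W, C = x.shape
    assert H % 2 == 0 and W % 2 == 0
    return np.concatenate(
        [x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]], axis=-1)
```

Applied at the 4th, 8th, 20th, and 24th STL, this is what turns the 80×80 maps into 40×40 and then 20×20 maps.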
(3) The deformable region correlation module (DRCM) is introduced to replace the classification branch of YOLOX (the structure of the DRCM is shown in FIG. 5). The DRCM is divided into a deformable convolution branch (top) and a regular convolution branch (bottom). The DRCM processes data as follows:
3.1. The feature map X8 produced by PAFPN feature extraction (the third feature map) undergoes a 3×3 convolution in the upper branch to obtain feature map X9.
3.2. X9 passes through two separate convolution layers, yielding a sampling-point offset map Δp of dimensions H×W×2 and a score mask Δm of dimensions H×W×1.
3.3. Δp and Δm are embedded into a 5×5 deformable convolution, and X9 passes through this convolution to obtain the output feature map Y. The expression is as follows:
Y(p) = Σk ωk·X9(p + pk + Δpk)·Δmk
where p denotes a position on the feature map, k denotes the k-th sampling point, ωk denotes the weight of the k-th offset sampling point, p + pk denotes the k-th sampling position of a regular 5×5 convolution, Δpk denotes the offset of the k-th sampling position of the regular 5×5 convolution, and Δmk denotes the score value of the k-th offset sampling point.
3.4. In the lower branch, two consecutive 3×3 convolutions are applied to feature map X8 to obtain feature map Z. Two learnable hyperparameters α and β with α+β=1 are set; Y and Z are weighted-added through α and β, and a 1×1 convolution integrates the feature-channel information, outputting the confidence scores for the 7 aircraft classes at each position of the feature map.
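The modulated deformable sampling of step 3.3 can be sketched for a single output position as follows. The bilinear interpolation is standard for deformable convolution; giving each of the 25 sampling points its own offset row is an illustrative assumption (the patent's Δp of dimensions H×W×2 suggests one shared offset per position, which corresponds to all rows of `offsets` being equal). The branch fusion of step 3.4 is then just α·Y + β·Z:

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly sample 2-D map x at fractional position (py, px)."""
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(py - yy)) * (1 - abs(px - xx)) * x[yy, xx]
    return val

def modulated_deform_point(x, weights, offsets, mask, p):
    """Y(p) = sum_k w_k * x(p + p_k + dp_k) * dm_k for a 5x5 deformable kernel.
    weights: (25,), offsets: (25, 2), mask: (25,), all taken at position p."""
    grid = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]  # regular p_k
    y = 0.0
    for k, (pi, pj) in enumerate(grid):
        py = p[0] + pi + offsets[k, 0]
        px = p[1] + pj + offsets[k, 1]
        y += weights[k] * bilinear(x, py, px) * mask[k]
    return y
```

With zero offsets and a mask of ones, the expression degenerates to an ordinary 5×5 correlation, which is a useful sanity check.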
In some specific embodiments, optimizing the network model parameters through loss-value backpropagation based on the loss function proceeds as follows:
4.1. In the classification branch, for the score vector at each position of the feature map, compute the cross-entropy classification loss with sigmoid activation:
lcls = -Σi=1..N [yi·log(Sigmoid(pi)) + (1-yi)·log(1-Sigmoid(pi))]
where N is the number of aircraft classes; yi indicates whether the target belongs to class i (1 if so, 0 otherwise); pi denotes the network's predicted confidence score for class i; and Sigmoid(x) = 1/(1+e^(-x)) computes the activation value for an input x.
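Step 4.1's loss for one position's score vector can be sketched directly from the formula; the small ε term is a numerical guard added for the sketch, not part of the formula:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_cls_loss(logits, labels):
    """l_cls = -sum_i [y_i log p_i + (1 - y_i) log(1 - p_i)], p_i = Sigmoid(logit_i)."""
    p = sigmoid(logits)
    eps = 1e-12  # numerical guard against log(0)
    return -np.sum(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))
```

Averaging this quantity over all positions then gives the total classification loss Lcls of step 4.2.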
4.2. All classification losses are averaged to obtain the total classification loss Lcls.
Lcls = mean(lcls)
4.3. For the offset sampling points produced at the feature-map positions responsible for predicting a target, compute the chamfer distance loss:
lssr = (1/ri)·[(1/25)·Σn minm ‖qn(i) - cm(i)‖ + (1/25)·Σm minn ‖qn(i) - cm(i)‖]
where qn(i) denotes the position of the n-th predicted offset sampling point in the i-th target bounding box, cm(i) denotes the center of the m-th labeled strong scattering region in the i-th target bounding box, and ri denotes the radius of the strong scattering regions labeled in the i-th target bounding box.
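A sketch of a symmetric chamfer distance between the predicted offset sampling points and the labeled SSR centers, normalized by the SSR radius; the symmetric form and the radius normalization are assumptions consistent with the symbols listed above:

```python
import numpy as np

def chamfer_ssr_loss(pred_pts, ssr_centers, r):
    """Symmetric chamfer distance between predicted offset sampling points
    (N, 2) and labeled SSR centers (M, 2), normalized by the SSR radius r."""
    d = np.linalg.norm(pred_pts[:, None, :] - ssr_centers[None, :, :], axis=-1)
    # nearest-center distance for each point, nearest-point distance for each center
    return (d.min(axis=1).mean() + d.min(axis=0).mean()) / r
```

The loss is zero exactly when every predicted point coincides with some SSR center and every center is hit, which is the supervision that guides the deformable offsets toward the strong scattering regions.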
4.4. All SSR prediction losses are averaged to obtain the total SSR prediction loss.
Lssr = mean(lssr)
4.5. In the regression branch, for the bounding box (bbox) regressed at the feature-map positions responsible for predicting a target, compute the Intersection over Union (IoU) regression loss:
lreg = 1 - Intersection(bboxpre, bboxgt)/Union(bboxpre, bboxgt)
where bboxpre is the bounding box predicted by the network model, bboxgt is the ground-truth bounding-box label of the target, Intersection is the function computing the intersection area, and Union is the function computing the union area.
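Step 4.5's regression loss can be sketched with boxes in the (center x, center y, width, height) form output by the regression branch; taking lreg = 1 - IoU is the common form of this loss (assumed here):

```python
def iou_loss(bbox_pre, bbox_gt):
    """l_reg = 1 - IoU; boxes given as (cx, cy, w, h)."""
    def to_corners(b):
        cx, cy, w, h = b
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    ax0, ay0, ax1, ay1 = to_corners(bbox_pre)
    bx0, by0, bx1, by1 = to_corners(bbox_gt)
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))   # intersection width
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))   # intersection height
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return 1.0 - inter / union
```

The loss is 0 for a perfect box and 1 for a disjoint one, so averaging it over positions gives the total regression loss Lreg of step 4.6.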
4.6. All regression losses are averaged to obtain the total regression loss Lreg.
Lreg = mean(lreg)
4.7. In the regression branch, compute the 1-norm loss on the score predicted at each feature-map position for whether a target exists:
lobj = |y - ŷ|
where y is the ground-truth value (1 when a target exists, 0 otherwise) and ŷ is the network model's predicted value.
4.8. The target existence losses at all positions are averaged to obtain the total target existence loss.
Lobj = mean(lobj)
4.9. Compute the total training loss.
L = αLcls + βLreg + γLobj + λLssr
where the weighting coefficients are α = 3, β = 3, γ = 1, and λ = 1.
Finally, the network model parameters are optimized through loss-value backpropagation.
In some specific embodiments, training the deformable scattering feature correlation network model for classification and regression with stochastic gradient descent (SGD) based on preset training parameters proceeds as follows:
5.1. SGD training is adopted, with 100 epochs and a batch size of 8; the optimizer's weight decay and momentum are set to 0.0005 and 0.9, respectively.
5.2. The learning rate is initialized to 0.002 and is gradually reduced to 0.0001 over the epochs with a cosine annealing schedule.
5.3. A batch of input SAR images first passes through the Swin Transformer backbone, then through the PAFPN neck to fully generate semantic features; the DRCM then outputs the classification results and the regression branch outputs the regression results, and the model parameters are adjusted based on these outputs.
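The schedule of steps 5.1 and 5.2 can be sketched as follows; annealing per epoch (rather than per iteration) is an assumption:

```python
import math

def cosine_annealed_lr(epoch, total_epochs=100, lr_max=0.002, lr_min=0.0001):
    """Cosine annealing from lr_max at epoch 0 down to lr_min at the last epoch."""
    t = epoch / (total_epochs - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))
```

The cosine curve keeps the learning rate near its maximum early in training and decays it smoothly toward the floor as training finishes.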
In some specific embodiments, the method further comprises a step of testing the deformable scattering feature correlation network (DSFCN) model, as shown in FIG. 6, which specifically comprises the following steps:
6.1. For the trained DSFCN model, compute Precision, Recall, and mean Average Precision (mAP) on the test dataset. The score of a predicted box is the product of the regression branch's target-presence confidence score and the highest class confidence score from the classification branch; when this value exceeds 0.5, a target is considered present, and its class is the class with the highest confidence in the classification branch. A predicted box is considered a successful match when its IoU with a ground-truth box exceeds 0.5 and the classes agree.
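The scoring and matching rule of step 6.1 can be sketched as follows; the greedy one-to-one matching order and the corner-form boxes are illustrative assumptions:

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def evaluate(preds, gts, score_thr=0.5, iou_thr=0.5):
    """preds: list of (box, cls, obj_conf, cls_conf); gts: list of (box, cls).
    A prediction counts as a detection when obj_conf * cls_conf > score_thr;
    it matches a ground truth when IoU > iou_thr and the classes agree."""
    kept = [(b, c) for b, c, po, pc in preds if po * pc > score_thr]
    matched, tp = set(), 0
    for b, c in kept:
        for gi, (gb, gc) in enumerate(gts):
            if gi not in matched and gc == c and iou_xyxy(b, gb) > iou_thr:
                matched.add(gi)
                tp += 1
                break
    precision = tp / len(kept) if kept else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall
```

mAP then follows from sweeping the score threshold per class and averaging the resulting precision-recall curves, which this sketch omits.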
6.2. Compute the number of model parameters.
6.3. Visualize the degree to which each image region activates the detection results with the feature visualization technique Score-CAM.
6.4. Visualize the offset sampling predictions of the deformable convolution in the DRCM.
As shown in FIG. 7, the overall construction of the DSFCN model proceeds as follows:
(1) Acquire the input SAR images; here, the SAR images denote the SAR image dataset annotated with aircraft classes and target bounding boxes.
(2) Construct the DSFCN model, comprising: constructing the YOLOX model; introducing the Swin Transformer; introducing the DRCM; and constructing the model loss function.
(3) Train the DSFCN model.
(4) Test the DSFCN model.
Specifically, in some embodiments, the effect of the present invention is verified through ablation experiments; the ablation results of steps 6.1 and 6.2 are shown in Table 1.
Table 1
The performance comparison with other existing algorithms is shown in Table 2.
Table 2
As the results in Tables 1 and 2 show, the two improved modules proposed by the present invention enable the network model to achieve higher scores on all metrics while reducing the number of parameters, significantly improving detection performance and outperforming other existing algorithms. The superior performance of the network model indicates that the Swin Transformer module, with its self-attention mechanism, can finely extract the scattering features of SAR images, and that the DRCM, with its flexible sampling capability, can correlate the aircraft's salient features and thereby adapt to the variability of aircraft in SAR images.
In summary, the purpose of the present invention is to exploit the discrete and variable scattering characteristics of aircraft in SAR images, improving the modules of the neural network model so that aircraft features are extracted more fully and the model's detection and recognition performance is raised. Convolution models the relationship between adjacent sampling points through template matching, and therefore carries a locality inductive bias: adjacent sampling points are assumed to be strongly correlated. The self-attention mechanism used by the Swin Transformer, by contrast, can be viewed as an adaptive filter whose weights are determined by the correlation between the query vector and key vector at each pair of points, making it better suited to capturing long-range dependencies among sampling points. Since an aircraft appears in a SAR image as a collection of discrete, sparsely distributed scattering points, with only weak correlation between its pixels, the present invention abandons the traditional convolutional backbone in favor of a Swin Transformer backbone, allowing the network to extract the aircraft's scattering features more fully. To overcome the limitation that ordinary convolution can only sample on a regular grid, the present invention embeds deformable convolution into the classification branch. Furthermore, since the strong scattering regions on an aircraft are the key to identifying it in the face of SAR aircraft variability, the present invention adds strong-scattering-region supervision to the network training process in the form of a loss function that guides the prediction of the offset sampling points. Compared with earlier approaches that extract features by hand and then enhance the image through preprocessing, the present invention exploits the network's powerful modeling capability and proposes the DRCM to adaptively correlate the strong scattering regions that carry the aircraft's salient features, organically combining hand-crafted and network-extracted features with stronger adaptability. Building on the advanced YOLOX detector, embodiments of the present invention adopt the Swin Transformer with its self-attention mechanism as the backbone for extracting features from the original high-resolution SAR image, and add to the detection head a Deformable Region Correlation Module (DRCM) capable of automatically correlating the salient regions of the aircraft, thereby constructing a new SAR aircraft detector named the Deformable Scattering Feature Correlation Network (DSFCN). Compared with prior solutions, the algorithm of the present invention has superior feature extraction and correlation capabilities, improving aircraft detection performance in complex SAR images and achieving more accurate aircraft detection and recognition than other methods.
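The passage states only that strong-scattering-region supervision enters training as a loss term guiding the predicted offset sampling points; the exact formulation is not given here. A minimal NumPy sketch of one plausible form is shown below: a one-sided Chamfer-style distance that pulls each predicted deformable sampling point toward its nearest annotated strong-scattering point. The function name and formulation are illustrative assumptions, not the patent's actual loss.

```python
import numpy as np

def scattering_offset_loss(pred_points, scatter_points):
    """One-sided Chamfer-style loss: mean distance from each predicted
    deformable sampling point to its nearest annotated strong-scattering
    point. Both arguments are (N, 2) arrays of (y, x) image coordinates."""
    # pairwise differences -> (N_pred, N_scatter, 2)
    diff = pred_points[:, None, :] - scatter_points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # (N_pred, N_scatter) distances
    # pull every predicted point toward its nearest scatterer
    return dist.min(axis=1).mean()

# toy example: two predicted sampling points, two annotated scatterers
pred = np.array([[1.0, 1.0], [4.0, 4.0]])
gt = np.array([[1.0, 1.0], [4.0, 5.0]])
loss = scattering_offset_loss(pred, gt)  # -> 0.5 (distances 0 and 1, averaged)
```

In a real training loop this term would be added, suitably weighted, to the detector's classification and regression losses, so that gradient descent nudges the deformable branch's offsets toward the annotated strong scattering regions.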
In another aspect, an embodiment of the present invention provides a SAR aircraft detection system, including: a first module, configured to acquire an input SAR image; and a second module, configured to analyze the input SAR image with an aircraft detection model to obtain an aircraft detection result; wherein the aircraft detection result includes at least one target bounding-box regression result and a category confidence score for the corresponding aircraft; the aircraft detection model is generated by training on a SAR image dataset annotated with aircraft categories and target bounding boxes; and the aircraft detection model includes a classification branch and a regression branch, the classification branch having a deformable region correlation module that performs weighted feature integration through a deformable convolution branch and a regular convolution branch.
The content of the method embodiments of the present invention applies to this system embodiment; the functions implemented by this system embodiment are the same as those of the method embodiments above, and the beneficial effects achieved are likewise the same.
Another aspect of the embodiments of the present invention further provides a SAR aircraft detection device, including a processor and a memory;
the memory is configured to store a program;
and the processor executes the program to implement the method described above.
The content of the method embodiments of the present invention applies to this device embodiment; the functions implemented by this device embodiment are the same as those of the method embodiments above, and the beneficial effects achieved are likewise the same.
Another aspect of the embodiments of the present invention further provides a computer-readable storage medium storing a program which, when executed by a processor, implements the method described above.
The content of the method embodiments of the present invention applies to this computer-readable storage medium embodiment; the functions implemented by this embodiment are the same as those of the method embodiments above, and the beneficial effects achieved are likewise the same.
An embodiment of the present invention further discloses a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device may read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the method described above.
In some alternative implementations, the functions/operations noted in the block diagrams may occur out of the order shown in the operational diagrams. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality/operations involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to give a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logical flow presented herein; alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless stated to the contrary, one or more of the described functions and/or features may be integrated into a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for understanding the present invention. Rather, given the attributes, functions, and internal relationships of the various functional modules of the devices disclosed herein, the actual implementation of the modules will be within the routine skill of an engineer. Accordingly, a person skilled in the art can, using ordinary techniques, implement the present invention as set forth in the claims without undue experimentation. It is also to be understood that the particular concepts disclosed are illustrative only and are not intended to limit the scope of the present invention, which is determined by the appended claims and the full scope of their equivalents.
If the described functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, the part that contributes to the prior art, or a part of the technical solution may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction-execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction-execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport the program for use by, or in connection with, an instruction-execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.
It should be understood that the various parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction-execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one of the following techniques known in the art, or a combination thereof: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic references to the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.
The foregoing is a detailed description of the preferred implementations of the present invention, but the present invention is not limited to the described embodiments. A person skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of the present invention, and such equivalent modifications or substitutions are all included within the scope defined by the claims of the present invention.
Claims (10)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310089469.6A | 2023-02-08 | 2023-02-08 | A SAR aircraft detection method, system, device and storage medium |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN116310795A | 2023-06-23 |

Family

ID=86833287

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310089469.6A | A SAR aircraft detection method, system, device and storage medium | 2023-02-08 | 2023-02-08 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN116310795A |
Cited By (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116894973A | 2023-07-06 | 2023-10-17 | 北京长木谷医疗科技股份有限公司 | Integrated learning-based intelligent self-labeling method and device for hip joint lesions |
| CN116894973B | 2023-07-06 | 2024-05-03 | 北京长木谷医疗科技股份有限公司 | Integrated learning-based intelligent self-labeling method and device for hip joint lesions |
| CN118397257A | 2024-06-28 | 2024-07-26 | 武汉卓目科技股份有限公司 | SAR image ship target detection method and device, electronic equipment and storage medium |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |