CN116310795A - SAR aircraft detection method, system, device and storage medium

SAR aircraft detection method, system, device and storage medium

Info

Publication number: CN116310795A
Authority: CN (China)
Prior art keywords: branch, aircraft, feature map, feature, deformable
Legal status: Pending
Application number: CN202310089469.6A
Other languages: Chinese (zh)
Inventors: 丛玉来 (Cong Yulai), 陈元嘉 (Chen Yuanjia), 张磊 (Zhang Lei)
Current Assignee: Sun Yat-sen University
Original Assignee: Sun Yat-sen University
Application filed by Sun Yat-sen University
Priority to CN202310089469.6A


Classifications

    • G06V20/10 — Image or video recognition or understanding: scenes; scene-specific elements; terrestrial scenes
    • G06N3/084 — Computing arrangements based on biological models: neural networks; learning methods; backpropagation, e.g. using gradient descent
    • G06V10/764 — Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
    • G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning: using neural networks
    • G06V2201/08 — Indexing scheme relating to image or video recognition or understanding: detecting or categorising vehicles


Abstract

The invention discloses a SAR aircraft detection method, system, device and storage medium, wherein the method comprises the following steps: acquiring an input SAR image; and analyzing the input SAR image with an aircraft detection model to obtain an aircraft detection result. The aircraft detection model is generated by training on a SAR image dataset annotated with aircraft categories and target frames. The aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs weighted feature integration through a deformable convolution branch and a conventional convolution branch. According to the invention, the classification branch is constructed around the deformable region association module, which significantly improves the feature association capability; model training on the SAR image dataset annotated with aircraft categories and target frames makes full use of prior knowledge of SAR aircraft scattering feature information. The invention improves aircraft detection performance in complex SAR images and realizes accurate aircraft detection and recognition.

Description

SAR aircraft detection method, system, device and storage medium
Technical Field
The invention relates to the technical field of aircraft detection, and in particular to a SAR aircraft detection method, system, device and storage medium.
Background
Synthetic aperture radar (SAR) is an active microwave imaging sensor that plays an important role in the field of automatic target recognition owing to its all-weather imaging and observation capability. Within automatic target recognition, aircraft detection is a highly valuable application. For example, in the civil field, dynamic monitoring of aircraft paths facilitates efficient airport management; in the military field, fast and accurate aircraft detection is of great importance for providing military reconnaissance information. Therefore, accurate detection of aircraft in high-resolution SAR images is a topic of significant value.
SAR transmits electromagnetic waves toward an object through an antenna, receives the reflected electromagnetic waves, and finally forms an image from the recorded echo information. Due to this unique imaging mechanism, SAR images present complex appearances that are difficult to interpret. The earliest method for detecting objects in SAR images is constant false alarm rate (CFAR) detection, which exploits the property that target echoes are often stronger than background echoes and extracts regions with strong echoes from the SAR image as targets. However, in complex scenes, such as airports containing a variety of other equipment or buildings, the detection performance of CFAR is greatly affected and targets cannot be accurately identified. Furthermore, aircraft detection also requires the ability to identify the target class, which CFAR cannot provide.
Aircraft have classical reflective structures, such as dihedral angles and top-hat structures, and under radar illumination various scattering mechanisms exist, such as direct scattering, multiple scattering and diffraction scattering. Therefore, in subsequent aircraft recognition methods, experts in the SAR field proposed exploiting the scattering feature information of the aircraft in the SAR image and detecting it with template matching, where feature extraction for matching is the key step. For example, some methods extract salient points with stable scattering features on the aircraft using Harris-Laplace corner detectors and describe the salient points as vectors. Other methods use Gaussian mixture models to extract scattering structure features comprising the strong scattering points of the aircraft and their corresponding distributions. However, such methods still fall short in accuracy and speed, owing to the inadequacy of manually extracted features and the inefficiency of matching the measured image against the various candidate templates.
With the rapid development of SAR technology, more high-resolution, high-quality SAR images with expert annotation have become available, which favors the application of deep learning methods to SAR automatic target recognition. In recent years, deep learning methods based on convolutional neural networks (CNNs) have made remarkable progress in target detection, and many CNN methods, such as YOLOX, Cascade R-CNN, CenterNet and RepPoints, exhibit performance superior to conventional methods in SAR target detection. However, most CNN methods are designed for optical target detection and, when introduced directly without considering aircraft scattering characteristics, cannot fully realize the detection performance attainable for SAR aircraft detection. The scattering characteristics of an aircraft are embodied in the following two points. 1) Discreteness: since the various reflecting structures are irregularly distributed over the aircraft, the radar cross sections (RCS) of the various parts of the aircraft differ, and an aircraft in a SAR image appears as a collection of discrete scattering points. 2) Variability: because of the complex structure of an aircraft and the existence of various scattering mechanisms, its imaging results vary with the incident angle and the sensor parameters, so that the scattering results of even the same target can vary greatly under different imaging conditions. Under the SAR imaging mechanism, a SAR aircraft image exhibits characteristics distinctly different from the corresponding optical image. Thus, existing CNN methods cannot adequately extract aircraft features through conventional convolution.
In view of this, how to utilize a priori knowledge of SAR aircraft scattering feature information to achieve high accuracy aircraft detection performance is a challenge.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a SAR aircraft detection method, system, device and storage medium, which can effectively improve the detection and recognition performance of the network model on aircraft.
In one aspect, an embodiment of the present invention provides a SAR aircraft detection method, including:
acquiring an input SAR image;
analyzing the input SAR image by using an airplane detection model to obtain an airplane detection result;
the aircraft detection result comprises at least one target frame regression result and a corresponding class confidence score of the aircraft;
the aircraft detection model is generated through SAR image data set training of the marked aircraft category and the target frame; the aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs feature weighted integration through a deformable convolution branch and a conventional convolution branch.
Optionally, the method further comprises:
creating a deformable scattering feature association network model based on the classification branch with the deformable region association module and the regression branch;

the deformable scattering feature association network model comprises a Swin Transformer backbone, a path aggregation feature pyramid network and a decoupled head; the decoupled head includes the classification branch with the deformable region association module and a regression branch.
Optionally, creating the deformable scattering feature association network model based on the classification branches and the regression branches with the deformable region association module comprises:
constructing a YOLOX model; the YOLOX model comprises a Darknet-53 backbone, a path aggregation feature pyramid network and an original decoupled head; the original decoupled head comprises an original classification branch and a regression branch;
creating the deformable scattering feature association network model by introducing a Swin Transformer backbone to replace the Darknet-53 backbone of the YOLOX model and by introducing a classification branch with a deformable region association module to replace the original classification branch of the YOLOX model;

wherein the Swin Transformer backbone comprises 24 Swin Transformer layers.
Optionally, the method further comprises:
determining a training sample according to the SAR image dataset marked with the airplane category and the target frame; the target frame is marked with a preset number of strong scattering areas;
and performing classification and regression training on the deformable scattering feature association network model using stochastic gradient descent (SGD) based on preset training parameters and the training samples, and optimizing the network model parameters by back-propagating loss values in combination with the model loss function, to obtain the aircraft detection model.
Optionally, analyzing the input SAR image by using an aircraft detection model to obtain an aircraft detection result, including:
segmentation, stitching and linear mapping are carried out on the SAR image, and a first feature map is obtained;
through the Swin Transformer backbone, alternately performing regular window partitioning and shifted window partitioning with successive Swin Transformer layers, and obtaining second feature maps of preset sizes;

performing semantic feature extraction on the second feature maps through the path aggregation feature pyramid network to obtain third feature maps of preset sizes; the semantic feature extraction comprises upsampling, splicing and convolution;

performing position prediction on the third feature maps through the regression branch to obtain target frame regression results; a target frame regression result comprises the center point, width and height of the frame;

and performing weighted feature integration on the third feature maps with the deformable convolution branch and the conventional convolution branch, through the classification branch with the deformable region association module, to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression results.
Optionally, the Swin Transformer layer includes a normalization function, a window-based multi-head self-attention mechanism and a multi-layer perceptron, and the step of processing a feature map by the Swin Transformer layer includes:

normalizing the target feature map with the normalization function to obtain a feature map X₂;

applying the window-based multi-head self-attention mechanism to the feature map X₂, and obtaining a feature map X₃ through linear mapping and splicing along the channel dimension;

adding the target feature map and the feature map X₃ to obtain a feature map X₄;

normalizing the feature map X₄ with the normalization function to obtain a feature map X₅;

passing the feature map X₅ through the multi-layer perceptron, which is activated with a GELU nonlinearity, to obtain a feature map X₆; wherein the multi-layer perceptron comprises two fully connected layers;

and adding the feature map X₄ and the feature map X₆ to obtain a feature map X₇.
Optionally, performing weighted feature integration on the third feature map with the deformable convolution branch and the conventional convolution branch, through the classification branch with the deformable region association module, to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression result, includes:

performing a first convolution on the third feature map through the deformable convolution branch to obtain a feature map X₉;

performing a second convolution on the feature map X₉ to obtain a sampling-point offset map; performing a third convolution on the feature map X₉ to obtain a score mask;

performing deformable convolution according to the feature map X₉, the sampling-point offset map and the score mask to obtain a feature map Y;

performing two successive convolutions on the third feature map through the conventional convolution branch to obtain a feature map Z;

and performing weighted addition of the feature map Y and the feature map Z based on preset learnable hyperparameters and integrating feature channel information to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression result.
In another aspect, an embodiment of the present invention provides a SAR aircraft detection system comprising:
a first module for acquiring an input SAR image;
the second module is used for analyzing the input SAR image by using the aircraft detection model to obtain an aircraft detection result;
the aircraft detection result comprises at least one target frame regression result and a corresponding class confidence score of the aircraft;
the aircraft detection model is generated through SAR image data set training of the marked aircraft category and the target frame; the aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs feature weighted integration through a deformable convolution branch and a conventional convolution branch.
In another aspect, an embodiment of the present invention provides a SAR aircraft detection apparatus comprising a processor and a memory;
The memory is used for storing programs;
the processor executes a program to implement the method as before.
In another aspect, embodiments of the present invention provide a computer-readable storage medium storing a program for execution by a processor to perform a method as previously described.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the foregoing method.
According to the invention, firstly, an input SAR image is acquired; the input SAR image is analyzed with an aircraft detection model to obtain an aircraft detection result; the aircraft detection result comprises at least one target frame regression result and the corresponding class confidence score of the aircraft; the aircraft detection model is generated by training on a SAR image dataset annotated with aircraft categories and target frames; the aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs weighted feature integration through a deformable convolution branch and a conventional convolution branch. The classification branch is constructed by introducing the deformable region association module, which significantly improves the feature association capability; model training is further performed with the SAR image dataset annotated with aircraft categories and target frames, making full use of prior knowledge of SAR aircraft scattering feature information. The invention improves aircraft detection performance in complex SAR images and realizes accurate aircraft detection and recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for detecting an SAR aircraft according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the YOLOX model;

FIG. 3 is a schematic structural diagram of a Swin Transformer backbone according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a Swin Transformer layer according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a DRCM according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a DSFCN according to an embodiment of the present invention;
fig. 7 is a schematic flow chart of constructing a DSFCN model according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that, in the prior art, aircraft detection techniques related to SAR include the following:
1. fu et al, in its published paper, "Aircraft Recognition in SAR Images Based on Scattering Structure Feature and Template matching" (IEEE Journal of Selected Topics in Applied Earth Observation and Remote Sensing, (2018) 4206-4217.) propose a template matching-based aircraft identification method that analyzes the scattering characteristics of an aircraft and uses a Gaussian mixture model to extract the scattering structure characteristics of the target. According to the method, the efficiency of candidate template selection is improved through a proposed sample decision optimization algorithm in a matching stage, and a coordinate translation Kullback-Leibler divergence method is proposed in a detection stage so as to realize translation invariance of detection. However, such an improved template matching method still has the problems that the matching accuracy is low, and features need to be manually made according to the attribute of specific data, so that the method is difficult to be applied to SAR images with other resolutions.
2. Zhao et al., in their published paper "Pyramid Attention Dilated Network for Aircraft Detection in SAR Images" (IEEE Geoscience and Remote Sensing Letters, (2021) 662-666), propose a pyramid attention dilated network to detect aircraft. The method considers the discrete characteristics of SAR aircraft, uses a multi-branch dilated convolution module to strengthen the relationship between the discrete backscattering features of the aircraft, and uses a convolutional block attention module to refine redundant information and highlight the salient features of the aircraft. Although the method considers the discrete characteristics of the aircraft, the flexibility of the convolution module in extracting features is still limited: it can only sample the aircraft on a regular grid, so the correlation between important discrete regions of the aircraft is insufficiently modeled, which further limits the detection performance.
3. Guo et al., in their published paper "Scattering Enhanced Attention Pyramid Network for Aircraft Detection in SAR Images" (IEEE Transactions on Geoscience and Remote Sensing, (2021) 7570-7587), propose a scattering-enhanced attention pyramid network to detect aircraft. The method extracts the strong scattering points of the aircraft with a Harris-Laplace corner detector, models them with a density-based spatial clustering method for data with noise and a Gaussian mixture model, and measures the correlation between a known target and a template through the Kullback-Leibler divergence; if they match, the scattering information of the SAR image is enhanced before being fed into the network. Although the method adopts the idea of modeling the SAR aircraft with prior knowledge and then assisting the network in extracting features, it only performs gamma-distribution CFAR enhancement on some images in the preprocessing stage and cannot fully exploit the strong modeling capability of the network.
However, the related art has the following disadvantages. 1. Existing template-matching-based detection methods have low matching accuracy and require features to be handcrafted according to the attributes of specific data, so they are unsuitable for SAR data of different resolutions. 2. Most existing CNN-based detection methods are designed for optical image detection tasks, and when introduced to SAR aircraft detection they exploit the scattering characteristics of aircraft in SAR images insufficiently, so the detection effect is limited. 3. In existing methods that account for the SAR image aircraft scattering characteristics to assist network feature extraction, manual feature extraction and network feature extraction are two independent processes that are not organically combined, which hinders a significant improvement of network performance.
In view of this, in one aspect, referring to fig. 1, an embodiment of the present invention provides a SAR aircraft detection method, including:
s100, acquiring an input SAR image;
An input SAR image to be subjected to aircraft detection is acquired, typically a 1 m resolution SAR image, uniformly sized to 640×640 pixels.
S200, analyzing the input SAR image by using an airplane detection model to obtain an airplane detection result;
It should be noted that the aircraft detection result includes at least one target frame regression result and the corresponding class confidence score of the aircraft; the aircraft detection model is generated by training on a SAR image dataset annotated with aircraft categories and target frames; the aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs weighted feature integration through a deformable convolution branch and a conventional convolution branch;
in some embodiments, further comprising: creating a deformable scattering feature associated network model based on the classification branch and the regression branch with the deformable region associated module; the deformable scattering feature association network model comprises a Swin transducer backbone, a path aggregation feature pyramid network and a decoupling head; the decoupling head includes a classification branch and a regression branch having a deformable region association module. Wherein creating a deformable scattering feature associated network model based on the classification branches and the regression branches with deformable region association modules comprises: constructing a YOLOX model; the YOLOX model comprises a Darknet-53 backbone, a path aggregation feature pyramid network and an original decoupling head; the original decoupling head comprises an original classification branch and a regression branch; creating a deformable scattering feature associated network model by introducing a Swin transducer backbone to replace the dark-53 backbone of the YOLOX model and by introducing a classification branch with a deformable region associated module to replace the original classification branch of the YOLOX model; wherein the Swin transducer backbone comprises 24 Swin transducer layers.
In some embodiments, the method further comprises: determining training samples according to the SAR image dataset annotated with aircraft categories and target frames, the target frames being marked with a preset number of strong scattering regions; and performing classification and regression training on the deformable scattering feature association network model using stochastic gradient descent (SGD) based on preset training parameters and the training samples, and optimizing the network model parameters by back-propagating loss values in combination with the model loss function, to obtain the aircraft detection model.
In some embodiments, the specific steps for acquiring the SAR image dataset annotated with aircraft categories and target frames are as follows:
step (1): 1m resolution SAR pictures of a plurality of airports around the world acquired from high-resolution satellite No. three are uniformly cut into 640 x 640 pixels, and 2204 SAR picture data sets with 7 types of 5429 aircraft are obtained. The class and the target rectangular frame (namely the target frame) of each airplane are marked by an expert, the airplane class is divided into A220, A320-321, A330, ARJ21, boeing737-800, boeing787 and others according to models, and the target rectangular frame is an circumscribed rectangle of the airplane.
Step (2): the SAR pictures are randomly divided into training sets (i.e., training samples) and test sets in a 3:1 ratio.
Step (3): marking a strong scattering region (Strong Scattering Region, SSR), extracting 25 SSRs in a target rectangular frame of each aircraft in the SAR image, and further comprising the steps of:
step (31): and carrying out mean value filtering on pixels in the frame, wherein the radius of the filter is one tenth of that of the shorter side of the frame.
Step (32): and selecting a point with the maximum value from pixel points in the filtered frame as a central point of the circular SSR with the first radius.
Step (33): pixels falling within the first SSR are excluded, and the point with the largest value is selected from the rest of pixels as the center point of the second SSR.
Step (34): repeating the steps to obtain 25 SSRs.
Specifically, analyzing the input SAR image with the aircraft detection model to obtain the aircraft detection result comprises the following steps: segmenting, splicing and linearly mapping the SAR image to obtain a first feature map; through the Swin Transformer backbone, alternately performing regular window partitioning and shifted window partitioning with successive Swin Transformer layers to obtain second feature maps of preset sizes; performing semantic feature extraction on the second feature maps through the path aggregation feature pyramid network to obtain third feature maps of preset sizes, the semantic feature extraction comprising upsampling, splicing and convolution; performing position prediction on the third feature maps through the regression branch to obtain target frame regression results, each comprising the center point, width and height of the frame; and performing weighted feature integration on the third feature maps with the deformable convolution branch and the conventional convolution branch, through the classification branch with the deformable region association module, to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression results. It should be noted that the target frame regression results comprise the regression results of the plurality of target frames detected and identified in the SAR image; the number of aircraft class confidence scores always equals the number of target frames, in one-to-one correspondence with the aircraft contained in each target frame.
In some embodiments, the Swin Transformer layer includes a normalization function, a window-based multi-head self-attention mechanism and a multi-layer perceptron, and the step of processing a feature map by the Swin Transformer layer includes: normalizing the target feature map with the normalization function to obtain a feature map X₂; applying the window-based multi-head self-attention mechanism to the feature map X₂, and obtaining a feature map X₃ through linear mapping and splicing along the channel dimension; adding the target feature map and the feature map X₃ to obtain a feature map X₄; normalizing the feature map X₄ with the normalization function to obtain a feature map X₅; passing the feature map X₅ through the multi-layer perceptron, which is activated with a GELU nonlinearity, to obtain a feature map X₆, wherein the multi-layer perceptron comprises two fully connected layers; and adding the feature map X₄ and the feature map X₆ to obtain a feature map X₇.
In some embodiments, performing weighted feature integration on the third feature map with the deformable convolution branch and the conventional convolution branch, through the classification branch with the deformable region association module, to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression result, includes: performing a first convolution on the third feature map through the deformable convolution branch to obtain a feature map X₉; performing a second convolution on the feature map X₉ to obtain a sampling-point offset map; performing a third convolution on the feature map X₉ to obtain a score mask; performing deformable convolution according to the feature map X₉, the sampling-point offset map and the score mask to obtain a feature map Y; performing two successive convolutions on the third feature map through the conventional convolution branch to obtain a feature map Z; and performing weighted addition of the feature map Y and the feature map Z based on preset learnable hyperparameters and integrating feature channel information to obtain the class confidence scores of the preset aircraft categories corresponding to the target frame regression result.
In some embodiments, creating a Deformable Scattering Feature Correlation Network (DSFCN) model includes the steps of:
(1) As shown in fig. 2, a YOLOX model is first constructed. The YOLOX model comprises a Darknet-53 backbone, a path aggregation feature pyramid network (PAFPN) and an original decoupled head. The original decoupled head includes an original classification branch (Classification) and a regression branch (Regression). The steps of the YOLOX model for aircraft detection specifically include:

1.1. For an input 640×640 SAR image, basic semantic features are extracted through the Darknet-53 backbone, and three feature maps of different sizes, 80×80, 40×40 and 20×20, are output.

1.2. The three feature maps are sent to the PAFPN neck, where semantic features are fully generated through operations such as upsampling, splicing and convolution, and three feature maps of different sizes, 80×80, 40×40 and 20×20, are output.

1.3. The three feature maps are sent to the decoupled head (the original decoupled head) to output the final prediction results. The decoupled head comprises a 3×3 convolution layer and two branches formed by stacking 3×3 convolution layers: the classification branch (Classification) and the regression branch (Regression). The regression branch outputs a target frame regression result predicted at each position on the feature map, comprising the center point, width and height of the frame, together with a confidence score for whether the frame contains a target; the classification branch outputs a confidence score for each class that may be contained in the frame predicted by the regression branch.
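As a minimal PyTorch sketch of such a decoupled head (channel widths and the SiLU activations are illustrative assumptions, not taken from the patent figures):

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    # Sketch of a YOLOX-style decoupled head: a stem convolution followed by
    # parallel stacked-conv branches for classification and regression.
    def __init__(self, in_ch=256, num_classes=7):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, in_ch, 3, padding=1)
        def branch():
            return nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            )
        self.cls_branch, self.reg_branch = branch(), branch()
        self.cls_pred = nn.Conv2d(in_ch, num_classes, 1)  # class scores
        self.reg_pred = nn.Conv2d(in_ch, 4, 1)            # cx, cy, w, h
        self.obj_pred = nn.Conv2d(in_ch, 1, 1)            # objectness score

    def forward(self, x):
        x = self.stem(x)
        reg_feat = self.reg_branch(x)
        return (self.cls_pred(self.cls_branch(x)),
                self.reg_pred(reg_feat), self.obj_pred(reg_feat))
```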
(2) A Swin Transformer backbone is introduced to replace the Darknet-53 backbone (the structure of the Swin Transformer backbone is shown in fig. 3), comprising a total of 24 Swin Transformer layers (STLs) (the structure of a Swin Transformer layer is shown in fig. 4). For the Swin Transformer backbone, the steps of SAR image processing are as follows:
2.1. The input SAR image is uniformly divided into 16 parts, which are spliced and linearly mapped to obtain a feature map X₁.

2.2. The feature map X₁ is normalized by a layer normalization (LN) function to obtain a feature map X₂.

2.3. Features are extracted using a window-based multi-head self-attention mechanism (W-MSA). The feature map is first divided into non-overlapping local windows of 7×7 feature vectors, and all feature vectors in each window are regarded as tokens. Then the query matrix (Q), key matrix (K) and value matrix (V) of all tokens are obtained by linear mapping:

$$Q = X_2 P_Q,\quad K = X_2 P_K,\quad V = X_2 P_V$$

where P_Q, P_K and P_V are three learnable linear mapping matrices.
Multi-head self-attention is then performed on the resulting matrices: the Q, K and V matrices are each split into n groups along the channel dimension, and self-attention is performed separately for each subgroup Q_s, K_s, V_s of the n groups:

$$\mathrm{Attention}(Q_s,K_s,V_s)=\mathrm{Softmax}\!\left(\frac{Q_sK_s^{T}}{\sqrt{d}}+B\right)V_s$$

where B is a learnable relative position encoding matrix, and d is the channel dimension of the Q, K and V vectors. The results of the subgroups are spliced along the channel dimension to obtain the values of the feature vectors of the output feature map X₃.
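As an illustration, a minimal PyTorch sketch of this windowed attention for a single 7×7 window follows; the tensor shapes, the head split and the function signature are assumptions consistent with the description above, not the patent's own code:

```python
import torch

def window_attention(x, p_q, p_k, p_v, bias, num_heads):
    # Sketch of W-MSA inside one window (step 2.3), assuming
    # x:    (num_tokens, c) tokens of a single 7x7 window,
    # bias: learnable relative position encoding B of shape
    #       (num_heads, num_tokens, num_tokens).
    n, c = x.shape
    d = c // num_heads
    # Linear mappings to query / key / value, then split into n heads.
    q = (x @ p_q).reshape(n, num_heads, d).transpose(0, 1)  # (h, n, d)
    k = (x @ p_k).reshape(n, num_heads, d).transpose(0, 1)
    v = (x @ p_v).reshape(n, num_heads, d).transpose(0, 1)
    # Scaled dot-product attention with the relative position bias B.
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5 + bias, dim=-1)
    out = attn @ v                                          # (h, n, d)
    # Splice the head outputs back along the channel dimension.
    return out.transpose(0, 1).reshape(n, c)
```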
2.4. The feature map X₃ and the original input feature map X₁ are added to obtain a feature map X₄.

2.5. The feature map X₄ is normalized by LN to obtain a feature map X₅.

2.6. The feature map X₅ is passed through a multi-layer perceptron (MLP) with two fully connected layers and an intermediate GELU nonlinear activation to obtain a feature map X₆.

2.7. The feature map X₆ and the feature map X₄ obtained in step 2.4 are added to obtain a feature map X₇.
2.8. Steps (2.2) to (2.7) constitute one feature extraction pass of an STL, where the symbols X₁ to X₇ above merely denote the feature maps of the corresponding stages and are not limiting; for different STLs in the Swin Transformer backbone, the X₂ to X₇ of each stage are distinguished accordingly, and the target feature map includes the feature map X₁ or the X₇ output by the preceding STL. STL window partitioning takes two forms: one divides local windows directly from the (0, 0) point of the feature map, called regular window partitioning; the other divides windows starting from the (3, 3) point, called shifted window partitioning. The feature map repeats steps (2.2) to (2.7), with regular window partitioning and shifted window partitioning alternating in successive STLs. At the 4th, 8th, 20th and 24th STLs, the feature map is uniformly divided into 4 parts and spliced to obtain a feature map with halved resolution and doubled channel dimension, and the feature maps of sizes 80×80, 40×40 and 20×20 (i.e., the second feature maps of preset sizes) are finally output at the 8th, 20th and 24th STLs.
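The 4-part division and splicing of step 2.8 (patch merging) can be sketched as follows; the learnable linear reduction producing the doubled channel width is an assumption borrowed from common Swin Transformer implementations:

```python
import torch
import torch.nn as nn

class PatchMerging(nn.Module):
    # Halves spatial resolution and doubles channels, as in step 2.8.
    def __init__(self, c):
        super().__init__()
        self.reduce = nn.Linear(4 * c, 2 * c)  # assumed linear reduction

    def forward(self, x):  # x: (B, H, W, C), with H and W even
        # Divide the map into 4 interleaved parts and splice along channels.
        parts = [x[:, 0::2, 0::2], x[:, 1::2, 0::2],
                 x[:, 0::2, 1::2], x[:, 1::2, 1::2]]
        merged = torch.cat(parts, dim=-1)      # (B, H/2, W/2, 4C)
        return self.reduce(merged)             # (B, H/2, W/2, 2C)
```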
(3) The deformable region association module (DRCM) is introduced to replace the classification branch of YOLOX (the structure of the DRCM is shown in fig. 5). The DRCM can be divided into a deformable convolution branch (upper) and a conventional convolution branch (lower). The data processing steps of the DRCM specifically include:

3.1. The feature map X₈ obtained after PAFPN feature extraction (i.e., the third feature map) is passed through a 3×3 convolution in the upper branch to obtain a feature map X₉.

3.2. X₉ is passed through two separate convolution layers to obtain a sampling-point offset map Δp of dimension H×W×2 and a score mask Δm of dimension H×W×1.

3.3. Δp and Δm are embedded into a 5×5 deformable convolution, and X₉ passes through this convolution to obtain the output feature map Y:

$$Y(p)=\sum_{k=1}^{K}\omega_k\cdot X_9\!\left(p+p_k+\Delta p_k\right)\cdot\Delta m_k$$

where p denotes a position on the feature map, k denotes the k-th sampling point, ω_k is the weight of the k-th offset sampling point, p+p_k is the k-th sampling position of the 5×5 regular convolution, Δp_k is the offset of the k-th sampling position, and Δm_k is the score value of the k-th offset sampling point.

3.4. In the lower branch, two successive 3×3 convolutions are applied to the feature map X₈ to obtain a feature map Z. Two learnable hyperparameters α and β are set, with α+β=1; Y and Z are added with weights α and β, feature channel information is integrated through a 1×1 convolution, and the confidence scores of every position on the feature map for the 7 aircraft classes are output.
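Putting steps 3.1 to 3.4 together, a hedged PyTorch sketch of the DRCM follows, built on torchvision's deformable convolution. Channel widths and activations are assumptions; note that torchvision expects an offset map with 2K channels (one (dx, dy) pair per sampling point of the 5×5 kernel), whereas step 3.2 states H×W×2, so the offset layer below follows the torchvision convention:

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DRCM(nn.Module):
    # Sketch of the deformable region association module (fig. 5).
    def __init__(self, c=256, num_classes=7, k=5):
        super().__init__()
        self.pre = nn.Conv2d(c, c, 3, padding=1)             # step 3.1
        self.offset = nn.Conv2d(c, 2 * k * k, 3, padding=1)  # step 3.2: delta-p
        self.mask = nn.Conv2d(c, k * k, 3, padding=1)        # step 3.2: delta-m
        self.weight = nn.Parameter(torch.randn(c, c, k, k) * 0.01)
        self.regular = nn.Sequential(                        # step 3.4, lower branch
            nn.Conv2d(c, c, 3, padding=1), nn.SiLU(),
            nn.Conv2d(c, c, 3, padding=1), nn.SiLU(),
        )
        # alpha + beta = 1 is enforced by parameterizing beta as 1 - alpha.
        self.alpha = nn.Parameter(torch.tensor(0.5))
        self.fuse = nn.Conv2d(c, num_classes, 1)             # 1x1 integration

    def forward(self, x8):
        x9 = self.pre(x8)
        dp, dm = self.offset(x9), torch.sigmoid(self.mask(x9))
        y = deform_conv2d(x9, dp, self.weight, padding=2, mask=dm)  # step 3.3
        z = self.regular(x8)
        return self.fuse(self.alpha * y + (1 - self.alpha) * z)
```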
In some embodiments, the specific steps for optimizing the network model parameters by back-propagating loss values based on the loss function are as follows:
4.1. In the classification branch, a cross-entropy classification loss with Sigmoid activation is calculated for the score vector of each position on the feature map:

$$l_{cls}=-\sum_{i=1}^{N}\Big[y_i\log\big(\mathrm{Sigmoid}(p_i)\big)+(1-y_i)\log\big(1-\mathrm{Sigmoid}(p_i)\big)\Big]$$

where N is the number of aircraft classes; y_i indicates whether the target is of the i-th class, being 1 if so and 0 otherwise; and p_i is the confidence score predicted by the network for the i-th class. Sigmoid computes an activation value for an input x:

$$\mathrm{Sigmoid}(x)=\frac{1}{1+e^{-x}}$$

4.2. All classification losses are averaged to obtain the total classification loss L_cls:

$$L_{cls}=\mathrm{mean}(l_{cls})$$
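In PyTorch terms, the sigmoid cross-entropy of steps 4.1-4.2 corresponds to binary cross-entropy with logits; a minimal sketch (tensor layout assumed):

```python
import torch
import torch.nn.functional as F

def classification_loss(cls_logits, cls_targets):
    # Steps 4.1-4.2: sigmoid cross-entropy per position, then averaged.
    # cls_logits:  (num_positions, N) raw class scores p_i
    # cls_targets: (num_positions, N) one-hot labels y_i
    l_cls = F.binary_cross_entropy_with_logits(
        cls_logits, cls_targets, reduction="none").sum(dim=-1)
    return l_cls.mean()  # L_cls = mean(l_cls)
```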
4.3. A chamfer distance loss is calculated for the offset sampling points predicted at the positions on the feature map responsible for targets, with the distances normalized by the annotated SSR radius:

$$l_{ssr}=\frac{1}{N_{p}}\sum_{n}\min_{m}\frac{\big\|p_{n}^{i}-c_{m}^{i}\big\|_{2}}{r_{i}}+\frac{1}{N_{c}}\sum_{m}\min_{n}\frac{\big\|p_{n}^{i}-c_{m}^{i}\big\|_{2}}{r_{i}}$$

where p_n^i denotes the position of the n-th predicted offset sampling point in the i-th target frame, c_m^i denotes the center point position of the m-th marked strong scattering region in the i-th target frame, and r_i denotes the radius of the strong scattering regions marked in the i-th target frame.

4.4. All SSR prediction losses are averaged to obtain the total SSR prediction loss:

$$L_{ssr}=\mathrm{mean}(l_{ssr})$$
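A minimal PyTorch sketch of this chamfer loss for a single target frame, under the same radius-normalization reading:

```python
import torch

def chamfer_ssr_loss(pred_points, ssr_centers, ssr_radius):
    # Sketch of step 4.3 for one target frame, assuming distances are
    # normalized by the annotated SSR radius r_i.
    # pred_points: (Np, 2) predicted offset sampling point positions
    # ssr_centers: (Nc, 2) annotated strong-scattering-region centers
    d = torch.cdist(pred_points[None], ssr_centers[None])[0] / ssr_radius
    # Symmetric chamfer distance: each prediction to its nearest SSR center,
    # plus each SSR center to its nearest prediction.
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
```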
4.5. In the regression branch, an intersection-over-union (IoU) regression loss is calculated for each frame (bbox) on the feature map responsible for regressing a predicted target position:

$$l_{reg}=1-\frac{\mathrm{Intersection}\big(bbox_{pre},bbox_{gt}\big)}{\mathrm{Union}\big(bbox_{pre},bbox_{gt}\big)}$$

where bbox_pre is the frame predicted by the network model, bbox_gt is the true frame label of the target, Intersection is a function computing the intersection, and Union is a function computing the union.

4.6. All regression losses are averaged to obtain the total regression loss L_reg:

$$L_{reg}=\mathrm{mean}(l_{reg})$$
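A minimal sketch of steps 4.5-4.6 for axis-aligned frames; the (x1, y1, x2, y2) corner encoding is an assumption, since the regression branch itself predicts center, width and height:

```python
import torch

def iou_loss(pred, gt):
    # Step 4.5: IoU regression loss for frames given as (x1, y1, x2, y2)
    # tensors of shape (N, 4).
    ix1 = torch.maximum(pred[:, 0], gt[:, 0])
    iy1 = torch.maximum(pred[:, 1], gt[:, 1])
    ix2 = torch.minimum(pred[:, 2], gt[:, 2])
    iy2 = torch.minimum(pred[:, 3], gt[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    union = area_p + area_g - inter
    # Step 4.6: average over all frames to obtain L_reg.
    return (1.0 - inter / union.clamp(min=1e-7)).mean()
```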
4.7. In the regression branch, a 1-norm loss is calculated for the score predicted at each position on the feature map for whether a target exists:

$$l_{obj}=\big\|y-\hat{y}\big\|_{1}$$

where y is the true value, being 1 when a target is present and 0 otherwise, and ŷ is the value predicted by the network model.

4.8. The target existence losses of all positions are averaged to obtain the total target existence loss:

$$L_{obj}=\mathrm{mean}(l_{obj})$$
4.9. The total training loss is calculated:

$$L=\alpha L_{cls}+\beta L_{reg}+\gamma L_{obj}+\lambda L_{ssr}$$

where the weighting coefficients are α=3, β=3, γ=1 and λ=1, with λ weighting the total SSR prediction loss L_ssr obtained in step 4.4.
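In code, step 4.9 reduces to a weighted sum; the inclusion of the SSR loss through the λ term follows the reading above:

```python
def total_loss(L_cls, L_reg, L_obj, L_ssr,
               alpha=3.0, beta=3.0, gamma=1.0, lam=1.0):
    # Step 4.9: weighted sum of the training losses, assuming the SSR
    # prediction loss of step 4.4 enters through the lambda term.
    return alpha * L_cls + beta * L_reg + gamma * L_obj + lam * L_ssr
```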
Finally, the network model parameters are optimized by back-propagating the loss value.
In some embodiments, based on the preset training parameters, the specific steps of classification and regression training of the deformable scattering feature association network model using stochastic gradient descent (SGD) are as follows:

5.1. Stochastic gradient descent (SGD) training is adopted, with 100 epochs, a batch size of 8, and the optimizer weight decay and momentum set to 0.0005 and 0.9, respectively.

5.2. The learning rate is initialized to 0.002 and decays stepwise to 0.0001 over the epochs using a cosine annealing strategy.

5.3. Given a batch of input SAR images, semantic features are first generated through the Swin Transformer backbone and then through the PAFPN neck; the classification results are then output through the DRCM and the regression results through the regression branch. The model parameters are adjusted based on the outputs.
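A minimal PyTorch sketch of the training setup in steps 5.1-5.3; `model` and `train_loader` are assumed to exist, with the model returning the total loss L of step 4.9:

```python
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

def train(model, train_loader, epochs=100):
    # Step 5.1: SGD with momentum 0.9 and weight decay 0.0005.
    optimizer = SGD(model.parameters(), lr=0.002,
                    momentum=0.9, weight_decay=0.0005)
    # Step 5.2: cosine annealing from 0.002 down to 0.0001 over the epochs.
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs, eta_min=0.0001)
    for _ in range(epochs):
        for images, targets in train_loader:   # batch size 8
            loss = model(images, targets)      # assumed to return loss L
            optimizer.zero_grad()
            loss.backward()                    # loss value back-propagation
            optimizer.step()
        scheduler.step()
```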
In some embodiments, the method further comprises a step of testing the Deformable Scattering Feature Correlation Network (DSFCN) model (the structure of the DSFCN is shown in fig. 6), specifically comprising the following steps:
6.1. The precision, recall and mean average precision (mAP) of the trained DSFCN model are calculated on the test dataset. The score of a predicted frame is the product of the regression branch's confidence score for whether a target is contained and the highest class confidence score of the classification branch; when this value is greater than 0.5, a target is regarded as present, with the class being the one with the highest confidence in the classification branch. A successful match is counted when the IoU between the predicted frame and the true frame is greater than 0.5 and the classes are consistent.
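A minimal sketch of the scoring rule in step 6.1; tensor shapes and names are assumptions:

```python
def predicted_detections(obj_conf, cls_conf, boxes, score_thr=0.5):
    # Step 6.1: a frame's score is the product of the objectness confidence
    # and its highest class confidence; frames scoring above 0.5 count as
    # detections, labeled with that highest-scoring class.
    # obj_conf: (N,), cls_conf: (N, num_classes), boxes: (N, 4)
    cls_scores, cls_ids = cls_conf.max(dim=-1)
    scores = obj_conf * cls_scores
    keep = scores > score_thr
    return boxes[keep], cls_ids[keep], scores[keep]
```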
6.2. The number of model parameters is calculated.

6.3. The degree to which each image region activates the detection result is visualized through the feature visualization technique Score-CAM.

6.4. The offset sampling point predictions of the deformable convolution in the DRCM are visualized.
As shown in fig. 7, the overall DSFCN model is constructed by the following steps:
(1) Acquiring input SAR images, where the SAR images here denote the SAR image dataset annotated with aircraft categories and target frames;

(2) Constructing the DSFCN model, comprising: constructing the YOLOX model; introducing the Swin Transformer backbone; introducing the DRCM; and constructing the model loss function;
(3) Training a DSFCN model;
(4) The DSFCN model is tested.
Specifically, in some embodiments, the effect of the invention is verified by ablation experiments; the results of the ablation experiments of steps 6.1 and 6.2 are shown in Table 1.
Table 1: ablation experiment results (reproduced as an image in the original document).
The results of the performance comparison with other existing algorithms are shown in Table 2.

Table 2: performance comparison with other existing algorithms (reproduced as an image in the original document).
From the results in Tables 1 and 2, the two improved modules provided by the invention enable the network model to achieve higher scores on all indices while reducing the number of parameters, significantly improving detection performance and outperforming other existing algorithms. The superior network model performance shows that the Swin Transformer module with the self-attention mechanism can finely extract the scattering features of the SAR image, and that the DRCM with flexible sampling capability can associate the salient features of the aircraft, thereby adapting to the variability of aircraft in SAR images.
In summary, the invention exploits the discreteness and variability of aircraft scattering characteristics in SAR images to improve modules in the neural network model, so as to fully extract aircraft features and improve the detection and recognition performance of the network model. Convolution models the relationship between adjacent sampling points based on the idea of template matching, and therefore carries an inductive bias of locality, i.e., adjacent sampling points are strongly correlated. The self-attention mechanism used by the Swin Transformer can be regarded as an adaptive filter whose weights are determined by the relevance between the query and key vectors of points, making it better suited to capturing long-range dependency information between points. The invention treats the aircraft as a set of discrete, sparsely distributed scattering points in the SAR image, between whose pixels the correlation is weak; it therefore abandons the backbone of the traditional convolutional architecture and adopts the Swin Transformer backbone, so that the network model can extract the scattering features of the aircraft more fully. To address the limitation that ordinary convolution can only sample on a regular grid, the invention embeds deformable convolution into the classification branch. Meanwhile, considering that the strong scattering regions on the aircraft are the key to recognition in the face of SAR aircraft variability, the invention adds strong scattering region supervision to the network model training process in the form of a loss function to guide the prediction of offset sampling points. Compared with traditional methods that manually extract features and preprocess enhanced images, the invention fully exploits the strong modeling capability of the network; the proposed DRCM adaptively associates the strong scattering regions carrying the salient features of the aircraft, realizing an organic combination of manually extracted features and network-extracted features with stronger adaptability. The embodiment of the invention constructs a new SAR aircraft detector, named the Deformable Scattering Feature Correlation Network (DSFCN), by adopting a Swin Transformer with a self-attention mechanism as the backbone for extracting features from original high-resolution SAR images and by adding a Deformable Region Correlation Module (DRCM), capable of automatically associating the salient regions of the aircraft, to the detection head, on the basis of the advanced YOLOX detector. Compared with the prior art, the proposed algorithm has excellent feature extraction and association capability, improves aircraft detection performance in complex SAR images, and realizes more accurate aircraft detection and recognition than other methods.
In another aspect, an embodiment of the present invention provides a SAR aircraft detection system comprising: a first module for acquiring an input SAR image; the second module is used for analyzing the input SAR image by using the aircraft detection model to obtain an aircraft detection result; the aircraft detection result comprises at least one target frame regression result and a corresponding class confidence score of the aircraft; the aircraft detection model is generated through SAR image data set training of the marked aircraft category and the target frame; the aircraft detection model comprises a classification branch and a regression branch, wherein the classification branch is provided with a deformable region association module, and the deformable region association module performs feature weighted integration through a deformable convolution branch and a conventional convolution branch.
The content of the method embodiment of the invention is suitable for the system embodiment, the specific function of the system embodiment is the same as that of the method embodiment, and the achieved beneficial effects are the same as those of the method.
Another aspect of the embodiment of the invention also provides a SAR aircraft detection device, which comprises a processor and a memory;
the memory is used for storing programs;
the processor executes a program to implement the method as before.
The content of the method embodiment of the invention is suitable for the device embodiment, the specific function of the device embodiment is the same as that of the method embodiment, and the achieved beneficial effects are the same as those of the method.
Another aspect of the embodiments of the present invention also provides a computer-readable storage medium storing a program that is executed by a processor to implement a method as described above.
The content of the method embodiment of the invention is applicable to the computer readable storage medium embodiment, the functions of the computer readable storage medium embodiment are the same as those of the method embodiment, and the achieved beneficial effects are the same as those of the method.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the foregoing method.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the medium and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection having one or more wires (an electronic device), a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program can be captured electronically, for instance by optically scanning the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution device. If implemented in hardware, as in another embodiment, they may be implemented using any one, or a combination, of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of this specification, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. A SAR aircraft detection method, comprising:
acquiring an input SAR image;
analyzing the input SAR image by using an aircraft detection model to obtain an aircraft detection result;
the aircraft detection result comprises at least one target frame regression result and a corresponding aircraft class confidence score;
the aircraft detection model is generated by training on an SAR image data set annotated with aircraft categories and target frames; the aircraft detection model comprises a classification branch and a regression branch, the classification branch is provided with a deformable region association module, and the deformable region association module performs feature-weighted integration through a deformable convolution branch and a regular convolution branch.
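For orientation only, the following minimal Python (PyTorch) sketch shows how the flow of claim 1 could be exercised: a trained aircraft detection model is applied to an input SAR image and returns target frame regression results with aircraft class confidence scores. The checkpoint name, input size, and one-row-per-detection output layout are illustrative assumptions, not details fixed by the claim.

import torch

# Hypothetical checkpoint name; the patent does not specify a file format.
model = torch.load("sar_aircraft_detector.pt", map_location="cpu")
model.eval()

sar_image = torch.randn(1, 1, 640, 640)  # stand-in for an acquired SAR image

with torch.no_grad():
    detections = model(sar_image)        # assumed shape: (N, 5)

# Assumed layout per detection: center x, center y, width, height, confidence.
for cx, cy, w, h, conf in detections.tolist():
    print(f"aircraft at ({cx:.1f}, {cy:.1f}), {w:.0f}x{h:.0f}, confidence {conf:.2f}")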
2. The SAR aircraft detection method of claim 1, further comprising:
creating a deformable scattering feature association network model based on the classification branch with the deformable region association module and the regression branch;
the deformable scattering feature association network model comprises a Swin Transformer backbone, a path aggregation feature pyramid network, and a decoupling head; the decoupling head includes the classification branch with the deformable region association module and the regression branch.
3. The SAR aircraft detection method of claim 2, wherein the creating a deformable scattering feature association network model based on the classification branch with the deformable region association module and the regression branch comprises:
constructing a YOLOX model, the YOLOX model comprising a Darknet-53 backbone, a path aggregation feature pyramid network, and an original decoupling head, the original decoupling head comprising an original classification branch and the regression branch;
creating the deformable scattering feature association network model by introducing the Swin Transformer backbone to replace the Darknet-53 backbone of the YOLOX model and by introducing the classification branch with the deformable region association module to replace the original classification branch of the YOLOX model;
wherein the Swin Transformer backbone comprises 24 Swin Transformer layers.
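As a structural aid, the PyTorch skeleton below mirrors the assembly described in claims 2 and 3: a YOLOX-style detector whose backbone slot is filled by a 24-layer Swin Transformer and whose decoupled head pairs the modified classification branch with the original regression branch. Each nn.Identity() is a placeholder for a real sub-network; none of the class or attribute names come from the patent.

import torch.nn as nn

class DeformableScatteringNet(nn.Module):
    """Skeleton of the deformable scattering feature association network;
    every nn.Identity() marks where a real sub-network would plug in."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Identity()    # 24-layer Swin Transformer (replaces the Darknet-53 backbone)
        self.neck = nn.Identity()        # path aggregation feature pyramid network (kept from YOLOX)
        self.cls_branch = nn.Identity()  # classification branch with deformable region association module
        self.reg_branch = nn.Identity()  # regression branch (kept from the original decoupling head)

    def forward(self, x):
        feats = self.neck(self.backbone(x))
        # decoupled head: classification and regression are computed separately
        return self.cls_branch(feats), self.reg_branch(feats)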
4. The SAR aircraft detection method of claim 2, further comprising:
determining training samples according to the SAR image data set annotated with aircraft categories and target frames, wherein the target frames are marked with a preset number of strong scattering regions;
performing classification and regression training on the deformable scattering feature association network model by stochastic gradient descent (SGD) with preset training parameters according to the training samples, and optimizing the network model parameters through loss-value backpropagation in combination with a model loss function, to obtain the aircraft detection model.
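A hedged sketch of what the training step of claim 4 could look like in PyTorch follows. The dataset interface, the detection_loss stub, and all hyperparameter values (epochs, learning rate, momentum, weight decay, batch size) are assumptions standing in for the patent's unspecified preset training parameters and model loss function.

import torch
from torch.utils.data import DataLoader

def detection_loss(cls_out, reg_out, targets):
    # Placeholder only: the patent combines classification and regression
    # loss terms whose exact formulation is not specified here.
    return cls_out.float().mean() + reg_out.float().mean()

def train(model, dataset, epochs=50, lr=0.01, momentum=0.9, weight_decay=5e-4):
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=momentum, weight_decay=weight_decay)
    model.train()
    for _ in range(epochs):
        for images, targets in loader:        # targets: annotated classes and target frames
            cls_out, reg_out = model(images)  # decoupled head outputs
            loss = detection_loss(cls_out, reg_out, targets)
            optimizer.zero_grad()
            loss.backward()                   # loss-value backpropagation
            optimizer.step()                  # SGD update of the network parameters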
5. The method according to claim 2, wherein analyzing the input SAR image using an aircraft detection model to obtain an aircraft detection result comprises:
performing segmentation, stitching, and linear mapping on the SAR image to obtain a first feature map;
through the Swin Transformer backbone, alternately performing regular window partitioning and shifted window partitioning with successive Swin Transformer layers to obtain a second feature map of a preset specification;
extracting semantic features from the second feature map through the path aggregation feature pyramid network to obtain a third feature map of the preset specification, wherein the semantic feature extraction includes upsampling, stitching, and convolution;
performing position prediction on the third feature map through the regression branch to obtain the target frame regression result, the target frame regression result comprising the center point, width, and height of the frame;
and performing feature-weighted integration on the third feature map with the deformable convolution branch and the regular convolution branch, through the classification branch with the deformable region association module, to obtain the aircraft class confidence score of the preset class corresponding to the target frame regression result.
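The first step of claim 5 (segmentation, stitching, and linear mapping) corresponds to the standard Swin-style patch embedding, which the following minimal PyTorch sketch illustrates under assumed sizes (4x4 patches, single-channel SAR input, 96 embedding channels); these numbers are illustrative, not taken from the patent.

import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Splits the SAR image into non-overlapping patches and linearly maps
    each patch to an embedding vector; a strided convolution performs the
    segmentation, stitching, and linear mapping in a single step."""
    def __init__(self, patch_size=4, in_ch=1, embed_dim=96):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                     # x: (B, 1, H, W) SAR image
        x = self.proj(x)                      # (B, C, H/4, W/4)
        return x.flatten(2).transpose(1, 2)   # (B, num_patches, C): first feature map

# Example: a 640x640 SAR image becomes 160x160 = 25600 patch tokens.
tokens = PatchEmbed()(torch.randn(1, 1, 640, 640))  # shape (1, 25600, 96)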
6. The SAR aircraft detection method of claim 5, wherein the Swin Transformer layer comprises a normalization function, a window-based multi-head self-attention mechanism, and a multi-layer perceptron, and wherein the Swin Transformer layer processes a target feature map by:
normalizing the target feature map with the normalization function to obtain a feature map X2;
applying the window-based multi-head self-attention mechanism to the feature map X2, with linear mapping and channel-dimension concatenation, to obtain a feature map X3;
adding the target feature map and the feature map X3 to obtain a feature map X4;
normalizing the feature map X4 with the normalization function to obtain a feature map X5;
activating the feature map X5 with a GELU nonlinear activation function in the multi-layer perceptron to obtain a feature map X6, wherein the multi-layer perceptron comprises two fully connected layers;
adding the feature map X4 and the feature map X6 to obtain a feature map X7.
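Read as a standard Swin block, the sequence X2 through X7 in claim 6 maps onto the PyTorch sketch below: normalize, window-based multi-head self-attention with linear re-projection, residual addition, normalize, two-layer MLP with GELU, residual addition. The shift handling that alternates between regular and shifted windows is omitted for brevity, and all dimensions are assumptions.

import torch
import torch.nn as nn

class WindowMSA(nn.Module):
    """Multi-head self-attention computed independently inside each
    non-overlapping window (shifted-window variant omitted)."""
    def __init__(self, dim, window_size, num_heads):
        super().__init__()
        self.ws = window_size
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)  # linear mapping after attention

    def forward(self, x, H, W):          # x: (B, H*W, C); H and W divisible by ws
        B, L, C = x.shape
        ws = self.ws
        x = x.view(B, H // ws, ws, W // ws, ws, C)
        windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        out, _ = self.attn(windows, windows, windows)
        out = self.proj(out)             # channel-dimension re-projection
        out = out.reshape(B, H // ws, W // ws, ws, ws, C)
        return out.permute(0, 1, 3, 2, 4, 5).reshape(B, L, C)

class SwinLayer(nn.Module):
    def __init__(self, dim=96, window_size=7, num_heads=3, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.wmsa = WindowMSA(dim, window_size, num_heads)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(            # multi-layer perceptron: two fully connected layers
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),                       # GELU nonlinear activation
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x1, H, W):             # x1: target feature map
        x2 = self.norm1(x1)                  # X2: normalized target feature map
        x3 = self.wmsa(x2, H, W)             # X3: window-based multi-head self-attention
        x4 = x1 + x3                         # X4: residual addition
        x5 = self.norm2(x4)                  # X5: second normalization
        x6 = self.mlp(x5)                    # X6: MLP with GELU activation
        return x4 + x6                       # X7: second residual addition

# Example: 56x56 tokens of width 96 pass through one layer with shape preserved.
y = SwinLayer()(torch.randn(2, 56 * 56, 96), 56, 56)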
7. The method according to claim 5, wherein performing feature-weighted integration on the third feature map with the deformable convolution branch and the regular convolution branch, through the classification branch with the deformable region association module, to obtain the aircraft class confidence score of the preset class corresponding to the target frame regression result, comprises:
performing a first convolution on the third feature map through the deformable convolution branch to obtain a feature map X9;
performing a second convolution on the feature map X9 to obtain a sampling-point offset map, and performing a third convolution on the feature map X9 to obtain a score mask;
performing a deformable convolution according to the feature map X9, the sampling-point offset map, and the score mask to obtain a feature map Y;
performing two successive convolutions on the third feature map through the regular convolution branch to obtain a feature map Z;
and performing weighted addition and feature-channel information integration on the feature map Y and the feature map Z based on a preset learnable hyperparameter, to obtain the aircraft class confidence score of the preset class corresponding to the target frame regression result.
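Assuming a DCNv2-style modulated deformable convolution (torchvision.ops.deform_conv2d accepts both the offset map and a multiplicative mask), the module of claim 7 could be sketched as below. The channel counts, kernel size, sigmoid on the score mask, and the single scalar fusion weight alpha are all assumptions; the patent fixes only the branch structure and the learnable weighting.

import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableRegionAssociation(nn.Module):
    def __init__(self, channels=256, k=3):
        super().__init__()
        self.pad = k // 2
        # Deformable convolution branch.
        self.conv1 = nn.Conv2d(channels, channels, k, padding=self.pad)         # first convolution -> X9
        self.offset_conv = nn.Conv2d(channels, 2 * k * k, k, padding=self.pad)  # second convolution -> offsets
        self.mask_conv = nn.Conv2d(channels, k * k, k, padding=self.pad)        # third convolution -> score mask
        self.deform_weight = nn.Parameter(torch.randn(channels, channels, k, k) * 0.01)
        # Regular convolution branch: two successive convolutions.
        self.regular = nn.Sequential(
            nn.Conv2d(channels, channels, k, padding=self.pad),
            nn.ReLU(),
            nn.Conv2d(channels, channels, k, padding=self.pad),
        )
        self.alpha = nn.Parameter(torch.tensor(0.5))   # preset learnable hyperparameter
        self.fuse = nn.Conv2d(channels, channels, 1)   # feature-channel information integration

    def forward(self, x):                              # x: third feature map
        x9 = self.conv1(x)
        offset = self.offset_conv(x9)                  # sampling-point offset map
        mask = torch.sigmoid(self.mask_conv(x9))       # score mask in (0, 1)
        y = deform_conv2d(x9, offset, self.deform_weight,
                          padding=self.pad, mask=mask) # feature map Y
        z = self.regular(x)                            # feature map Z
        return self.fuse(self.alpha * y + (1 - self.alpha) * z)

# Example: a 256-channel feature map keeps its spatial size.
out = DeformableRegionAssociation()(torch.randn(1, 256, 20, 20))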
8. A SAR aircraft detection system, comprising:
a first module for acquiring an input SAR image;
the second module is used for analyzing the input SAR image by using an aircraft detection model to obtain an aircraft detection result;
the aircraft detection result comprises at least one target frame regression result and a corresponding aircraft class confidence score;
the aircraft detection model is generated by training on an SAR image data set annotated with aircraft categories and target frames; the aircraft detection model comprises a classification branch and a regression branch, the classification branch is provided with a deformable region association module, and the deformable region association module performs feature-weighted integration through a deformable convolution branch and a regular convolution branch.
9. A SAR aircraft detection device, comprising a processor and a memory;
the memory is configured to store a program;
the processor executes the program to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a program which, when executed by a processor, implements the method of any one of claims 1 to 7.
CN202310089469.6A 2023-02-08 2023-02-08 SAR aircraft detection method, system, device and storage medium Pending CN116310795A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310089469.6A CN116310795A (en) 2023-02-08 2023-02-08 SAR aircraft detection method, system, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310089469.6A CN116310795A (en) 2023-02-08 2023-02-08 SAR aircraft detection method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN116310795A true CN116310795A (en) 2023-06-23

Family

ID=86833287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310089469.6A Pending CN116310795A (en) 2023-02-08 2023-02-08 SAR aircraft detection method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN116310795A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116894973A (en) * 2023-07-06 2023-10-17 北京长木谷医疗科技股份有限公司 Integrated learning-based intelligent self-labeling method and device for hip joint lesions
CN116894973B (en) * 2023-07-06 2024-05-03 北京长木谷医疗科技股份有限公司 Integrated learning-based intelligent self-labeling method and device for hip joint lesions
CN118397257A (en) * 2024-06-28 2024-07-26 武汉卓目科技股份有限公司 SAR image ship target detection method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110472627B (en) End-to-end SAR image recognition method, device and storage medium
CN112700429B (en) Airport pavement underground structure disease automatic detection method based on deep learning
CN107679250B (en) Multi-task layered image retrieval method based on deep self-coding convolutional neural network
Guo et al. Scattering enhanced attention pyramid network for aircraft detection in SAR images
Adla et al. Deep learning-based computer aided diagnosis model for skin cancer detection and classification
Singh et al. Automated ground-based cloud recognition
Liu et al. Hierarchical semantic model and scattering mechanism based PolSAR image classification
CN116310795A (en) SAR aircraft detection method, system, device and storage medium
Yin et al. An optimised multi-scale fusion method for airport detection in large-scale optical remote sensing images
Aghaei et al. Osdes_net: Oil spill detection based on efficient_shuffle network using synthetic aperture radar imagery
Kilic et al. An accurate car counting in aerial images based on convolutional neural networks
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
Kaur et al. A novel technique for content based image retrieval using color, texture and edge features
CN115661649A (en) Ship-borne microwave radar image oil spill detection method and system based on BP neural network
CN117830874B (en) Remote sensing target detection method under multi-scale fuzzy boundary condition
Sun et al. Marine ship instance segmentation by deep neural networks using a global and local attention (GALA) mechanism
Naiemi et al. Scene text detection using enhanced extremal region and convolutional neural network
CN117593514B (en) Image target detection method and system based on deep principal component analysis assistance
CN113239895A (en) SAR image change detection method of capsule network based on attention mechanism
Chen et al. Regional classification of urban land use based on fuzzy rough set in remote sensing images
CN112949634A (en) Bird nest detection method for railway contact network
CN115019107B (en) Sonar simulation image generation method, system and medium based on style migration
Sebastian et al. Significant full reference image segmentation evaluation: a survey in remote sensing field
Yawale et al. Design of a hybrid GWO CNN model for identification of synthetic images via transfer learning process
Liu et al. Target detection of hyperspectral image based on faster R-CNN with data set adjustment and parameter turning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination