CN115995041A - Attention mechanism-based SAR image multi-scale ship target detection method and device - Google Patents

Attention mechanism-based SAR image multi-scale ship target detection method and device

Info

Publication number
CN115995041A
Authority
CN
China
Prior art keywords
module
image
space
ship
scale ship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211728727.2A
Other languages
Chinese (zh)
Inventor
刘瑜
陈碧朗
姜智卓
王学谦
李刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Tsinghua University
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Shenzhen International Graduate School of Tsinghua University filed Critical Tsinghua University
Priority to CN202211728727.2A priority Critical patent/CN115995041A/en
Publication of CN115995041A publication Critical patent/CN115995041A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The invention provides an attention mechanism-based method and device for multi-scale ship target detection in SAR images, belonging to the field of image detection. The method comprises the following steps: acquiring an SAR image to be detected; and inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model. The multi-scale ship detection model comprises a feature extraction layer, a feature fusion layer and a prediction layer, whose core part includes a space and channel attention module; the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module processes the feature map output by the space attention module. Because the space and channel attention module is built from the space attention module and the channel attention module, the spatial information and semantic information of multi-scale ships and the context information for suppressing complex background interference can be extracted simultaneously in complex environments, reducing the missed-detection rate and false alarm rate and improving the detection accuracy of multi-scale ship targets.

Description

Attention mechanism-based SAR image multi-scale ship target detection method and device
Technical Field
The invention relates to the technical field of image detection, in particular to a method and a device for detecting SAR image multi-scale ship targets based on an attention mechanism.
Background
Synthetic aperture radar (SAR) remote sensing works day and night, in all weather, and can penetrate cloud and fog, so it has become an important means of monitoring marine targets. Existing ship target detection methods mainly fall into traditional methods and deep learning-based methods. The constant false alarm rate (CFAR) detection algorithm is currently the mainstream traditional SAR ship target detection method: it adaptively adjusts the detection threshold while keeping the false alarm rate constant, and detects multi-scale ship targets mainly through the contrast between the ship target and the sea clutter background. With the rapid development of computer vision, deep learning-based SAR ship target detection algorithms can extract richer detail and depth information from massive high-resolution SAR images, including spatial, semantic and context information, and learn historical information and patterns from the images, thereby improving the detection accuracy of multi-scale ship targets.
However, traditional ship target detection methods are mainly oriented toward single, simple scenes, cannot extract high-dimensional features, and are therefore not suitable for high-resolution remote sensing images of complex scenes. For complex sea-surface scenes, deep learning-based SAR multi-scale ship target detection algorithms are prone to serious missed detections and false alarms, and their detection accuracy is low.
Disclosure of Invention
The invention provides an attention mechanism-based method and device for SAR image multi-scale ship target detection, to overcome the defects of the prior art that traditional ship target detection methods are not suitable for high-resolution remote sensing images of complex scenes, and that deep learning-based SAR multi-scale ship target detection algorithms are prone to serious missed detections and false alarms and have low detection accuracy.
The invention provides a method for detecting SAR image multi-scale ship targets based on an attention mechanism, which comprises the following steps:
acquiring an SAR image to be detected;
inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing a characteristic diagram output by the space attention module.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the multi-scale ship detection model comprises a feature extraction layer, a feature fusion layer and a prediction layer;
the feature extraction layer comprises the space and channel attention module and is used for extracting multi-scale ship image features in the SAR image, weighting the ship image features based on the space and channel attention module, extracting context information for inhibiting complex background interference and obtaining the main features of the ship image;
the feature fusion layer comprises the space and channel attention module and is used for extracting features of the main features of the ship images according to the space and channel attention module, extracting space information and semantic information of the multi-scale ship and fusing the extracted image features;
and the prediction layer is used for determining the position and the category of the ship target according to the fused image characteristics.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the feature extraction layer further comprises a sampling layering module, a convolution activation module, a first feature extraction enhancement module, a second feature extraction enhancement module and a space pyramid pooling module;
The sampling layering module is arranged at the input end of the feature extraction layer and is used for performing interval sampling on the SAR image and splicing the plurality of image feature layers obtained by sampling;
the convolution activation module is used for carrying out convolution, batch normalization and activation function processing on the image feature layer;
the space and channel attention module is used for carrying out weighting processing on the image feature layer after convolution activation processing and extracting context information for inhibiting complex background interference;
the first feature extraction enhancement module is used for carrying out feature fusion processing on the weighted image feature layer and outputting ship image features with set scales;
the spatial pyramid pooling module is used for carrying out maximum pooling treatment on the weighted image characteristic layer;
the second feature extraction enhancement module is used for carrying out feature fusion processing on the pooled image feature layer and outputting ship image features with set scales.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the feature extraction layer comprises a sampling layering module, a convolution activating module, a space and channel attention module, a first feature extraction enhancing module, a convolution activating module, a space and channel attention module, a space pyramid pooling module and a second feature extraction enhancing module which are sequentially arranged.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the feature fusion layer further comprises an up-sampling module and a down-sampling module, wherein the input ends of the up-sampling module and the down-sampling module are each provided with the space and channel attention module, which is used for extracting the spatial information and semantic information of the multi-scale ship.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the prediction layer comprises a plurality of decoupling head modules, each decoupling head module comprises a classification output branch, a detection output branch and a regression output branch, the classification output branch is used for predicting image categories, the detection output branch is used for determining whether an image contains a target or not, and the regression output branch is used for predicting target coordinate information.
According to the SAR image multi-scale ship target detection method based on the attention mechanism, the loss function of the multi-scale ship detection model is determined based on position loss, confidence loss and classification loss.
The invention also provides a SAR image multi-scale ship target detection device based on the attention mechanism, which comprises:
The SAR image acquisition module is used for acquiring an SAR image to be detected;
the ship target detection module is used for inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing a characteristic diagram output by the space attention module.
The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the SAR image multiscale ship target detection method based on the attention mechanism when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for multi-scale ship target detection of SAR images based on an attention mechanism as described in any of the above.
According to the SAR image multi-scale ship target detection method and device based on the attention mechanism, the space and channel attention module is obtained by arranging the space attention module and the channel attention module in sequence, so that the spatial information and semantic information of multi-scale ships and the context information for suppressing complex background interference can be extracted simultaneously in complex environments, reducing the missed-detection rate and false alarm rate and improving the detection accuracy of multi-scale ship targets.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a SAR image multi-scale ship target detection method based on an attention mechanism;
fig. 2a is a schematic structural view of a spatial and channel attention module according to the present invention, fig. 2b is a schematic structural view of a spatial attention module, and fig. 2c is a schematic structural view of a channel attention module;
FIG. 3 is a schematic structural diagram of a multi-scale ship detection model provided by the invention;
FIG. 4 is a schematic diagram of the structure of a prediction layer provided by the present invention;
fig. 5a to 5d are schematic diagrams of target detection comparison effects provided by the present invention, in which fig. 5a and 5c are schematic diagrams of detection results of the YOLOX-L model on SSDD, and fig. 5b and 5d are schematic diagrams of detection results of the YOLO-MSD model provided by the present invention on SSDD;
Fig. 6 is a schematic structural diagram of the attention mechanism-based SAR image multi-scale ship target detection device provided by the invention;
fig. 7 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the embodiments of the present application, it should be noted that, directions or positional relationships indicated by terms such as "center", "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc., are based on those shown in the drawings, are merely for convenience in describing the embodiments of the present application and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the embodiments of the present application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the embodiments of the present application, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted", "connected", and "coupled" are to be construed broadly, and may mean, for example, fixedly connected, detachably connected, or integrally connected; mechanically or electrically connected; and directly connected or indirectly connected through an intermediate medium. The specific meaning of these terms in the embodiments of the present application will be understood by those of ordinary skill in the art in a specific context.
In the examples herein, a first feature "on" or "under" a second feature may be either the first and second features in direct contact, or the first and second features in indirect contact via an intermediary, unless expressly stated and defined otherwise. Moreover, a first feature being "above," "over" and "on" a second feature may be a first feature being directly above or obliquely above the second feature, or simply indicating that the first feature is level higher than the second feature. The first feature being "under", "below" and "beneath" the second feature may be the first feature being directly under or obliquely below the second feature, or simply indicating that the first feature is less level than the second feature.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Fig. 1 is a schematic flow chart of a method for detecting a target of a multi-scale ship in an SAR image based on an attention mechanism, which is provided by the invention, as shown in fig. 1, and comprises the following steps:
s110, acquiring SAR images to be detected;
s120, inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
The multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing the feature map output by the space attention module.
Optionally, the SAR image in step S110 is an SAR image acquired in a complex environment and contains ships of various sizes (large, medium and small). A complex environment refers to complex and changeable detection scenes such as inshore areas, the open sea, densely distributed harbors, and strong sea clutter conditions; because SAR images have diverse backgrounds and strong speckle noise, various background interferences such as land, islands and sea clutter are present.
Optionally, the space and channel attention module in step S120 is denoted SCAM (spatial and channel attention module). By combining attention in the spatial and channel dimensions, it allows the network to selectively focus on the salient parts of the target, so that the effective features of multi-scale ship targets are better extracted and interference from complex backgrounds is suppressed, helping the network focus on important features in complex environments.
Fig. 2a is a schematic structural view of a spatial and channel attention module according to the present invention, fig. 2b is a schematic structural view of a spatial attention module, and fig. 2c is a schematic structural view of a channel attention module.
As shown in fig. 2 a-2 c, the spatial and channel attention module includes spatial attention and channel attention mechanisms, learning the importance of the space and the importance of the channel, respectively. The specific calculation formula is as follows:
Fs = Ms(F1) ⊗ F1    (1)
F2 = Mc(Fs) ⊗ Fs    (2)
Ms(F1) = σ(f7×7([MaxPool(F1); AvgPool(F1)]))    (3)
Mc(Fs) = σ(MLP(MaxPool(Fs) + AvgPool(Fs)))    (4)
wherein F1 and F2 represent the input and output feature maps, respectively; Fs and Fc represent the feature maps obtained by spatial and channel attention, respectively; ⊗ and ⊕ represent element-wise multiplication and addition, respectively; Ms and Mc represent the spatial and channel attention mechanisms, respectively; Ms(F1) denotes the spatial attention applied to the feature map F1; MaxPool and AvgPool denote maximum pooling and average pooling, respectively; σ is the activation function; Mc(Fs) denotes the channel attention applied to the feature map Fs; and f7×7 denotes a 7×7 convolutional layer.
It can be understood that combining the space attention module and the channel attention module in this order yields the space and channel attention module, so that the spatial information and semantic information of multi-scale ships and the context information for suppressing complex background interference can be extracted simultaneously in complex environments, reducing the missed-detection rate and false alarm rate and improving the detection accuracy of multi-scale ship targets.
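For illustration, the following is a minimal PyTorch sketch of how equations (1) to (4) could be realized; the use of PyTorch, the sigmoid activation and the reduction ratio of the MLP are assumptions for illustration only and are not limiting.

```python
# Hedged sketch of the spatial-then-channel attention module (SCAM), equations (1)-(4).
# The reduction ratio and the sigmoid activation are illustrative assumptions.
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Ms(F1) = sigma(f7x7([MaxPool(F1); AvgPool(F1)])), pooling along the channel axis."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x):
        max_map, _ = torch.max(x, dim=1, keepdim=True)   # channel-wise max pooling
        avg_map = torch.mean(x, dim=1, keepdim=True)     # channel-wise average pooling
        attn = torch.sigmoid(self.conv(torch.cat([max_map, avg_map], dim=1)))
        return x * attn                                  # Fs = Ms(F1) * F1


class ChannelAttention(nn.Module):
    """Mc(Fs) = sigma(MLP(MaxPool(Fs) + AvgPool(Fs))) over the spatial dimensions."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        max_pool = torch.amax(x, dim=(2, 3))             # global max pooling -> (B, C)
        avg_pool = torch.mean(x, dim=(2, 3))             # global average pooling -> (B, C)
        attn = torch.sigmoid(self.mlp(max_pool + avg_pool)).view(b, c, 1, 1)
        return x * attn                                  # F2 = Mc(Fs) * Fs


class SCAM(nn.Module):
    """Spatial attention first, then channel attention on its output."""

    def __init__(self, channels):
        super().__init__()
        self.spatial = SpatialAttention()
        self.channel = ChannelAttention(channels)

    def forward(self, x):
        return self.channel(self.spatial(x))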
FIG. 3 is a schematic structural diagram of a multi-scale ship detection model provided by the invention, and as shown in FIG. 3, the multi-scale ship detection model comprises a feature extraction layer, a feature fusion layer and a prediction layer as an alternative embodiment on the basis of the above embodiment;
The core part of the feature extraction layer comprises the space and channel attention module and is used for extracting multi-scale ship image features in the SAR image, weighting the ship image features based on the space and channel attention module, extracting context information for inhibiting complex background interference and obtaining the main features of the ship image;
the core part of the feature fusion layer comprises the space and channel attention module, and is used for extracting the main features of the ship image according to the space and channel attention module, extracting the space information and semantic information of the multi-scale ship, and fusing the extracted image features;
and the prediction layer is used for determining the position and the category of the ship target according to the fused image characteristics.
Optionally, the multi-scale ship detection model provided by the invention is a multi-scale ship target detection model obtained by improving the YOLOX-L model, is defined as YOLO-MSD (Multiscale Ship Detection), and can improve the detection precision of the multi-scale ship target.
It can be understood that, in the feature extraction stage, the method and device weight the features using the space and channel attention module to focus on the main features of ship targets at different scales, extract richer context information, and suppress interference from complex backgrounds. In the feature fusion stage, the space and channel attention module is embedded; by aggregating a large amount of spatial and channel information from the multi-level feature maps, salient features at specific scales are highlighted, the multi-scale feature maps are adaptively refined, and the spatial information and semantic information of multi-scale ships are extracted, reducing the missed-detection rate and false alarm rate and improving the detection accuracy of multi-scale ship targets.
On the basis of the above embodiment, as an optional embodiment, the feature extraction layer further includes a sampling layering module, a convolution activation module, a first feature extraction enhancement module, a second feature extraction enhancement module, and a spatial pyramid pooling module.
Optionally, the YOLO-MSD in the invention takes the CSPDarknet as the basis of the backbone network, and adds the SCAM module to extract the characteristics of the ship image. Among them, CSPDarknet is a backbone network used in YOLO neural networks.
The sampling layering module is the Focus module in fig. 3, the convolution activation module is the CBS (Convolution, Batch Normalization and SiLU) module in fig. 3, the first feature extraction enhancement module is the CSP1 (BottleneckCSP1) module in fig. 3, the second feature extraction enhancement module is the CSP2 (BottleneckCSP2) module in fig. 3, and the spatial pyramid pooling module is the SPP (Spatial Pyramid Pooling) module in fig. 3.
The sampling layering module is arranged at the input end of the feature extraction layer, i.e. directly after the ship image input layer, and is used for performing interval sampling on the SAR image and splicing the plurality of sampled image feature layers.
Optionally, the Focus module samples at intervals and then concatenates: one value is taken for every other pixel of the ship image, converting it into four independent feature layers. The four feature layers are then stacked, which transfers information from the width and height planes of the ship image into the channel dimension and thereby reduces information loss.
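A minimal sketch of this interval sampling, assuming PyTorch tensors of shape (B, C, H, W); the slicing order used for stacking is an illustrative assumption.

```python
# Hedged sketch of the Focus-style interval sampling: every other pixel is taken to form
# four sub-feature maps, which are stacked along the channel dimension.
import torch


def focus_slice(x: torch.Tensor) -> torch.Tensor:
    """x: (B, C, H, W) -> (B, 4C, H/2, W/2) by sampling every other pixel."""
    top_left = x[..., ::2, ::2]
    top_right = x[..., ::2, 1::2]
    bottom_left = x[..., 1::2, ::2]
    bottom_right = x[..., 1::2, 1::2]
    return torch.cat([top_left, bottom_left, top_right, bottom_right], dim=1)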
The convolution activation module is used for carrying out convolution, batch normalization and activation function processing on the image feature layer.
Optionally, the CBS module is the most basic convolution block in the feature extraction layer, consisting of one convolution layer, one BN (batch normalization) layer and a SiLU activation function. The convolution kernel is 3×3 or 1×1 and performs convolution calculation and channel integration; the BN layer normalizes the output and prevents overfitting; SiLU combines the Sigmoid and ReLU functions, is unbounded above, bounded below, smooth and non-monotonic, outperforms ReLU in deep models, and can be regarded as a smooth ReLU activation function.
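A short PyTorch sketch of such a CBS block; the kernel size and stride are left as parameters and the padding rule is an assumption for illustration.

```python
# Hedged sketch of the CBS block (Convolution + BatchNorm + SiLU).
import torch.nn as nn


class CBS(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                              padding=kernel_size // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)   # normalizes activations, curbs overfitting
        self.act = nn.SiLU(inplace=True)         # SiLU(x) = x * sigmoid(x)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))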
The space and channel attention module is used for carrying out weighting processing on the image feature layer after convolution processing and extracting context information for inhibiting complex background interference.
The first feature extraction enhancement module is used for carrying out feature fusion processing on the weighted image feature layer and outputting ship image features with set scales.
Optionally, one branch of the CSP1 module is a single CBS module, and the other branch is composed of a CBS module and ResUnits; each ResUnit consists of two CBS modules and a residual edge.
The spatial pyramid pooling module is used for carrying out maximum pooling treatment on the image characteristic layer after the weighting treatment.
Optionally, the SPP module is located at the bottom of the feature extraction layer; maximum pooling is performed using four pooling kernels of different sizes, namely 13×13, 9×9, 5×5 and 1×1 (where 1×1 represents no processing), and the four ship image feature layers are then stacked and combined.
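A short sketch of this SPP block, assuming stride-1 max pooling with padding so that the spatial size is preserved before stacking; the 1×1 branch is treated as the identity.

```python
# Hedged sketch of the SPP block: parallel max pooling with 13x13, 9x9 and 5x5 kernels
# plus the unpooled input, stacked along the channel dimension.
import torch
import torch.nn as nn


class SPP(nn.Module):
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        # the 1x1 branch is the identity ("no processing"), so x itself is kept
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)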
The second feature extraction enhancement module is used for carrying out feature fusion processing on the pooled image feature layer and outputting ship image features with set scales.
Optionally, one branch of the CSP2 module is a single CBS module and the other branch consists of two CBS modules. The CSP1 and CSP2 modules enhance the feature extraction capability of the network and maintain model accuracy while remaining lightweight.
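For illustration, a sketch of the CSP1 and CSP2 blocks as described above, reusing the CBS block from the earlier sketch; the number of residual units and the half-channel split are illustrative assumptions.

```python
# Hedged sketch of the CSP1 and CSP2 blocks; assumes the CBS class defined earlier.
import torch
import torch.nn as nn


class ResUnit(nn.Module):
    """Two CBS blocks plus a residual edge, as described for CSP1."""

    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(CBS(channels, channels, 1), CBS(channels, channels, 3))

    def forward(self, x):
        return x + self.block(x)


class CSP1(nn.Module):
    """One branch is a single CBS; the other is a CBS followed by residual units."""

    def __init__(self, in_channels, out_channels, n=1):
        super().__init__()
        hidden = out_channels // 2
        self.branch1 = CBS(in_channels, hidden, 1)
        self.branch2 = nn.Sequential(CBS(in_channels, hidden, 1),
                                     *[ResUnit(hidden) for _ in range(n)])
        self.fuse = CBS(2 * hidden, out_channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.branch1(x), self.branch2(x)], dim=1))


class CSP2(nn.Module):
    """One branch is a single CBS; the other is two CBS blocks (no residual edge)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        hidden = out_channels // 2
        self.branch1 = CBS(in_channels, hidden, 1)
        self.branch2 = nn.Sequential(CBS(in_channels, hidden, 1), CBS(hidden, hidden, 3))
        self.fuse = CBS(2 * hidden, out_channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.branch1(x), self.branch2(x)], dim=1))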
Optionally, the feature extraction layer includes the sampling layering module, the convolution activation module, the space and channel attention module, the first feature extraction enhancement module, the convolution activation module, the space and channel attention module, the space pyramid pooling module, and the second feature extraction enhancement module that are sequentially set.
It can be understood that the invention provides a construction scheme for the feature extraction layer: in the feature extraction stage, redundant information increases as the network deepens, so the features are weighted using SCAM, allowing the main features of ship targets at different scales to be focused on quickly, richer context information to be extracted, and interference from complex backgrounds to be suppressed.
On the basis of the above embodiment, as an optional embodiment, the feature fusion layer further includes an up-sampling module and a down-sampling module, where input ends of the up-sampling module and the down-sampling module are respectively provided with the space and channel attention module, and the space and the channel attention module are used for extracting space information and semantic information of the multi-scale ship.
Optionally, the length and width of ship targets vary over a wide range, so the spatial information of ship targets on the feature maps differs greatly. In order to obtain ship target information at different scales, the feature fusion layer comprises a concatenation-and-CSP2 module, an up-sampling module, a SCAM module, a concatenation-and-CSP2 module, a CBS module, a down-sampling module and a concatenation-and-CSP2 module, and fuses the feature information of ship images at different scales extracted by the network.
The feature layer is first reduced by a factor of two using a CBS module. A SCAM module is added before up-sampling; after two rounds of up-sampling the feature layers are stacked, and a CSP2 module is used for feature extraction after stacking. A SCAM module is likewise added before down-sampling; after two rounds of down-sampling the feature layers are stacked, and a CSP2 module is used for feature extraction after stacking.
It can be appreciated that the invention provides a structure of a feature fusion layer, the feature fusion layer not only carries out up-sampling on three ship feature layers in a main network to realize feature fusion, but also continues to carry out down-sampling on ship features to realize feature fusion, so that ship features with different scales can be fused more comprehensively, the image features of a multi-scale ship target can be extracted, a multi-scale feature map can be adaptively refined, and the spatial information and semantic information of the multi-scale ship can be extracted. And SCAM is embedded in the characteristic extraction and fusion stage, so that missing detection and false alarm are reduced, and the detection precision of the multi-scale ship target is improved.
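A minimal sketch of the two fusion steps described above (SCAM before up-sampling and SCAM before down-sampling), reusing the SCAM, CBS and CSP2 sketches given earlier; the channel counts, the nearest-neighbour up-sampling mode and the stride-2 CBS down-sampling are assumptions for illustration.

```python
# Hedged sketch of one top-down and one bottom-up fusion step in the neck.
# Assumes the SCAM, CBS and CSP2 classes defined in the earlier sketches.
import torch
import torch.nn as nn


class FuseUp(nn.Module):
    """Top-down fusion: SCAM -> 2x up-sampling -> concatenation -> CSP2."""

    def __init__(self, deep_channels, shallow_channels, out_channels):
        super().__init__()
        self.scam = SCAM(deep_channels)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.csp = CSP2(deep_channels + shallow_channels, out_channels)

    def forward(self, deep, shallow):
        fused = torch.cat([self.up(self.scam(deep)), shallow], dim=1)
        return self.csp(fused)


class FuseDown(nn.Module):
    """Bottom-up fusion: SCAM -> stride-2 CBS down-sampling -> concatenation -> CSP2."""

    def __init__(self, shallow_channels, deep_channels, out_channels):
        super().__init__()
        self.scam = SCAM(shallow_channels)
        self.down = CBS(shallow_channels, shallow_channels, kernel_size=3, stride=2)
        self.csp = CSP2(shallow_channels + deep_channels, out_channels)

    def forward(self, shallow, deep):
        fused = torch.cat([self.down(self.scam(shallow)), deep], dim=1)
        return self.csp(fused)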
Fig. 4 is a schematic structural diagram of a prediction layer provided in the present invention, as shown in fig. 4, as an alternative embodiment of the foregoing embodiment, the prediction layer includes a plurality of decoupling head modules, each of the decoupling head modules includes a classification output branch, a detection output branch, and a regression output branch, where the classification output branch is used for predicting an image class, the detection output branch is used for determining whether an image includes a target, and the regression output branch is used for predicting target coordinate information.
The prediction layer is used for determining the position and the category of the ship target in the feature map, and is equivalent to a classifier and a regressor of the multi-scale ship target detection model.
Optionally, in YOLO-MSD, the decoupling head module first halves the number of feature channels using a 1×1 convolution layer, and then adds two parallel branches that extract ship features using 3×3 convolutions. In the classification output branch, a 1×1 convolution is used to predict the target box class; the detection output branch determines whether a feature point contains an object; and the regression output branch predicts the coordinate information of the target box. Finally, the three branches are stacked to determine the feature information of the ship image. Decoupling heads 2 and 3 determine their feature information in a similar way.
It can be appreciated that, compared with anchor-based head networks, YOLO-MSD reduces the number of predictions per position from 3 to 1, greatly improving the convergence rate of the model. Based on the YOLOX-L model, YOLO-MSD directly predicts the two offsets from the top-left corner of the grid cell as well as the height and width of the prediction box. At the same time, YOLO-MSD designates the center location of each object as a positive sample within a predefined scale range, which reduces the parameters and complexity of the head network and results in a faster, better-performing model.
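A minimal sketch of one decoupled head as described above, reusing the earlier CBS sketch; the number of classes and the hidden channel width are illustrative assumptions.

```python
# Hedged sketch of a decoupled head: a 1x1 CBS halves the channels, then two parallel
# 3x3 branches feed the classification, objectness and regression outputs.
# Assumes the CBS class defined in the earlier sketch; num_classes=1 (ship) is assumed.
import torch
import torch.nn as nn


class DecoupledHead(nn.Module):
    def __init__(self, in_channels, num_classes=1):
        super().__init__()
        hidden = in_channels // 2
        self.stem = CBS(in_channels, hidden, kernel_size=1)
        self.cls_branch = nn.Sequential(CBS(hidden, hidden, 3), CBS(hidden, hidden, 3))
        self.reg_branch = nn.Sequential(CBS(hidden, hidden, 3), CBS(hidden, hidden, 3))
        self.cls_out = nn.Conv2d(hidden, num_classes, 1)  # class prediction
        self.obj_out = nn.Conv2d(hidden, 1, 1)            # objectness (target / no target)
        self.reg_out = nn.Conv2d(hidden, 4, 1)            # box offsets, width and height

    def forward(self, x):
        x = self.stem(x)
        cls_feat, reg_feat = self.cls_branch(x), self.reg_branch(x)
        return torch.cat([self.reg_out(reg_feat), self.obj_out(reg_feat),
                          self.cls_out(cls_feat)], dim=1)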
On the basis of the above embodiment, as an alternative embodiment, the loss function of the multi-scale ship detection model is determined based on a position loss, a confidence loss and a classification loss.
The loss function is used to evaluate how different the model's predicted and actual values are.
During prediction, YOLO-MSD first divides the image into S × S grid cells, with A denoting the ground-truth region and B denoting the predicted region. If the center of a ship target falls in a grid cell, the bounding boxes of that cell predict the ship target. A confidence threshold is used to reduce the redundancy of bounding boxes during prediction. The confidence score of a bounding box is calculated by equation (5).
Confidence = P(i,j) × IoU(pred, truth)    (5)
IoU(pred, truth) = area(A ∩ B) / area(A ∪ B)    (6)
where IoU(pred, truth) represents the intersection-over-union of the predicted box and the ground-truth box, and P(i,j) corresponds to the jth bounding box of the ith grid cell: when the ship target lies in the jth bounding box of the ith grid cell, P(i,j) is 1, otherwise it is 0.
In YOLO-MSD, the loss function can be represented by equation (7).
LOSS = loss_reg + loss_obj + loss_cls    (7)
where loss_reg, loss_obj and loss_cls denote the position loss, the confidence loss and the classification loss, respectively.
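A hedged sketch of how the total loss of equation (7) could be assembled; the specific choice of an IoU-based regression term and binary cross-entropy terms follows common YOLOX practice and is an assumption rather than a quotation from the patent.

```python
# Hedged sketch of the loss composition in equation (7).
import torch
import torch.nn.functional as F


def iou_loss(pred_boxes, gt_boxes, eps=1e-7):
    """1 - IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    lt = torch.max(pred_boxes[:, :2], gt_boxes[:, :2])
    rb = torch.min(pred_boxes[:, 2:], gt_boxes[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=1)
    area_p = (pred_boxes[:, 2:] - pred_boxes[:, :2]).clamp(min=0).prod(dim=1)
    area_g = (gt_boxes[:, 2:] - gt_boxes[:, :2]).clamp(min=0).prod(dim=1)
    iou = inter / (area_p + area_g - inter + eps)
    return (1.0 - iou).mean()


def total_loss(pred_boxes, gt_boxes, obj_logits, obj_targets, cls_logits, cls_targets):
    loss_reg = iou_loss(pred_boxes, gt_boxes)                                # position loss
    loss_obj = F.binary_cross_entropy_with_logits(obj_logits, obj_targets)   # confidence loss
    loss_cls = F.binary_cross_entropy_with_logits(cls_logits, cls_targets)   # classification loss
    return loss_reg + loss_obj + loss_cls                                    # equation (7)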
The advantages of the present invention are described in detail below based on the hardware and software environment of the experiment, the dataset of the experiment, the experimental setup, the experimental evaluation index, and the experimental results on the official SAR Ship Detection Dataset (SSDD).
(1) Experimental environment
All experiments were run using the same environment, and detailed information of the environment configuration is shown in table 1.
Table 1. Experimental environment configuration (table not reproduced in this text).
(2) Experimental data set
In the experiments, the proposed method is evaluated on an SAR dataset. The official SAR ship detection dataset SSDD comprises 1160 SAR images of 500 × 500 pixels with resolutions ranging from 1 to 15 meters; the dataset covers different scenes and contains 2456 ship targets of different sizes. This SAR-expert-annotated ship dataset can be used to develop multi-scale ship target detectors, and Table 2 gives detailed information about SSDD. To eliminate the impact of different dataset partitions on the experimental results, the experiments use the same partitioning as the original dataset: the SSDD training set contains 928 images and the test set contains 232 images.
Table 2. SSDD dataset information (table not reproduced in this text).
(3) Experimental setup
All experiments use the same parameter settings: 300 training epochs, batch size 64, learning rate 0.01, momentum 0.937 and weight decay 0.0005, with stochastic gradient descent (SGD) as the optimization algorithm. The input size for SSDD is 640 × 640. In the experiments, model performance is improved using the Mosaic and MixUp data enhancement methods, where Mosaic enriches the backgrounds of multi-scale ship targets and MixUp increases the number of multi-scale ship targets in the dataset. All experiments are built on MMDetection, and parameters not mentioned use the MMDetection defaults.
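A short sketch of the optimizer configuration quoted above; the placeholder module stands in for the detector and is purely illustrative.

```python
# Hedged sketch of the SGD settings used in the experiments (lr=0.01, momentum=0.937,
# weight decay=0.0005); `model` is a placeholder, not the actual YOLO-MSD network.
import torch

model = torch.nn.Conv2d(3, 16, 3)  # placeholder module standing in for the detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.937, weight_decay=0.0005)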
(4) Experimental evaluation index
The COCO evaluation indices are used to evaluate the YOLO-MSD multi-scale ship target detection network in all experiments. The specific meanings of the indices AP-0.5, AP-0.75, AP-s, AP-m and AP-l are shown in Table 3; AP denotes average precision, and IoU (Intersection over Union) denotes the overlap ratio between the predicted box and the ground-truth box.
Table 3. Evaluation index meanings (table not reproduced in this text).
Precision is the ratio of correctly identified targets to all identified targets, calculated by equation (8), where TP denotes the number of positive samples predicted as positive and FP denotes the number of negative samples predicted as positive.
Precision = TP / (TP + FP)    (8)
Recall is the ratio of the number of correctly identified positive samples to the number of all positive samples in the test set, calculated by equation (9), where FN denotes the number of positive samples predicted as negative.
Recall = TP / (TP + FN)    (9)
Average precision (AP) is the most important performance indicator in the target detection task. It represents the area under the precision-recall (P-R) curve of the ship class, calculated by equation (10).
AP = ∫ P(R) dR    (10)
TP, FP, FN and TN are shown in Table 4.
Table 4. Confusion matrix (table not reproduced in this text).
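A small sketch of the computations in equations (8) to (10); approximating AP by the trapezoidal area under the P-R curve is one common convention and an assumption here.

```python
# Hedged sketch of precision, recall and average precision, equations (8)-(10).
import numpy as np


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0          # equation (8)


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0          # equation (9)


def average_precision(recalls: np.ndarray, precisions: np.ndarray) -> float:
    """Area under the P-R curve, equation (10); inputs sorted by increasing recall."""
    return float(np.trapz(precisions, recalls))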
(5) Experimental results
The SSDD dataset is tested according to the experimental setup and evaluation criteria above. With YOLOX-L as the baseline, SCAMs are embedded simultaneously in the feature extraction and feature fusion modules, and the SAR multi-scale ship target detection performance of YOLO-MSD is compared with that of recent methods. The experimental results are shown in Table 5 and demonstrate that the proposed method performs well on all six evaluation indicators, with a clear improvement in the detection accuracy of multi-scale ship targets, reaching 98.6% on AP-0.5.
Table 5. Experimental results (table not reproduced in this text).
Fig. 5a to 5d are schematic diagrams of the target detection comparison provided by the present invention, in which fig. 5a and 5c show the detection results of the YOLOX-L model on SSDD, and fig. 5b and 5d show the detection results of the YOLO-MSD model provided by the present invention on SSDD. The comparison shows that the YOLOX-L model misses some detections, while the YOLO-MSD model provided by the present invention can detect the missed ships. In summary, the present invention greatly reduces the missed-detection rate and false alarm rate in complex environments and improves the detection accuracy of multi-scale ship targets.
The attention mechanism-based SAR image multi-scale ship target detection device provided by the invention is described below, and the attention mechanism-based SAR image multi-scale ship target detection device described below and the attention mechanism-based SAR image multi-scale ship target detection method described above can be correspondingly referred to each other.
Fig. 6 is a schematic structural diagram of the attention mechanism-based SAR image multi-scale ship target detection device provided by the present invention, and as shown in fig. 6, the present invention also provides an attention mechanism-based SAR image multi-scale ship target detection device, including:
The SAR image acquisition module 610 is configured to acquire a SAR image to be detected;
the ship target detection module 620 is configured to input the SAR image into a multi-scale ship detection model, so as to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing the feature map output by the space attention module.
As an embodiment, the multi-scale ship detection model comprises a feature extraction layer, a feature fusion layer and a prediction layer;
the core part of the feature extraction layer comprises the space and channel attention module and is used for extracting multi-scale ship image features in the SAR image, weighting the ship image features based on the space and channel attention module, extracting context information for inhibiting complex background interference and obtaining the main features of the ship image;
the core part of the feature fusion layer comprises the space and channel attention module, and is used for extracting the main features of the ship image according to the space and channel attention module, extracting the space information and semantic information of the multi-scale ship, and fusing the extracted image features;
And the prediction layer is used for determining the position and the category of the ship target according to the fused image characteristics.
As one embodiment, the feature extraction layer further comprises a sampling layering module, a convolution activating module, a first feature extraction enhancing module, a second feature extraction enhancing module and a spatial pyramid pooling module;
the sampling layering module is arranged at the input end of the feature extraction layer and is used for performing interval sampling on the SAR image and splicing the plurality of image feature layers obtained by sampling;
the convolution activation module is used for carrying out convolution, batch normalization and activation function processing on the image feature layer;
the space and channel attention module is used for carrying out weighting processing on the image feature layer after convolution processing and extracting context information for inhibiting complex background interference;
the first feature extraction enhancement module is used for carrying out feature fusion processing on the weighted image feature layer and outputting ship image features with set scales;
the spatial pyramid pooling module is used for carrying out maximum pooling treatment on the weighted image characteristic layer;
the second feature extraction enhancement module is used for carrying out feature fusion processing on the pooled image feature layer and outputting ship image features with set scales.
As an embodiment, the feature extraction layer includes the sampling layering module, the convolution activation module, the spatial and channel attention module, the first feature extraction enhancement module, the convolution activation module, the spatial and channel attention module, the spatial pyramid pooling module, and the second feature extraction enhancement module that are sequentially arranged.
As an embodiment, the feature fusion layer further includes an up-sampling module and a down-sampling module, where the input ends of the up-sampling module and the down-sampling module are both provided with the space and channel attention module, and the space and semantic information are used for extracting space information and semantic information of the multi-scale ship.
As one embodiment, the prediction layer includes a plurality of decoupling head modules, each of which includes a classification output branch for predicting an image class, a detection output branch for determining whether an object is included in the image, and a regression output branch for predicting object coordinate information.
As one embodiment, the loss function of the multi-scale ship detection model is determined based on location loss, confidence loss, and classification loss.
Fig. 7 illustrates a physical schematic diagram of an electronic device, as shown in fig. 7, which may include: processor 710, communication interface (Communications Interface) 720, memory 730, and communication bus 740, wherein processor 710, communication interface 720, memory 730 communicate with each other via communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform an attention mechanism based SAR image multi-scale ship target detection method comprising:
acquiring an SAR image to be detected;
inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing the feature map output by the space attention module.
Further, the logic instructions in the memory 730 described above may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of performing a method for detecting a multi-scale ship target of a SAR image based on an attention mechanism, the method comprising:
Acquiring an SAR image to be detected;
inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing the feature map output by the space attention module.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform a method for attention-based SAR image multi-scale ship target detection, the method comprising:
acquiring an SAR image to be detected;
inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing the feature map output by the space attention module.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. The SAR image multi-scale ship target detection method based on the attention mechanism is characterized by comprising the following steps of:
acquiring an SAR image to be detected;
inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing a characteristic diagram output by the space attention module.
2. The attention mechanism-based SAR image multi-scale ship target detection method of claim 1, wherein the multi-scale ship detection model comprises a feature extraction layer, a feature fusion layer and a prediction layer;
The feature extraction layer is used for extracting multi-scale ship image features in the SAR image, weighting the ship image features based on the space and the channel attention module, extracting context information for inhibiting complex background interference, and obtaining main features of the ship image;
the feature fusion layer is used for extracting features of the main features of the ship image according to the space and channel attention module, extracting space information and semantic information of the multi-scale ship, and fusing the extracted image features;
and the prediction layer is used for determining the position and the category of the ship target according to the fused image characteristics.
3. The attention mechanism-based SAR image multi-scale ship target detection method according to claim 2, wherein the feature extraction layer further comprises a sampling layering module, a convolution activation module, a first feature extraction enhancement module, a second feature extraction enhancement module, and a spatial pyramid pooling module;
the sampling layering module is arranged at the input end of the feature extraction layer and is used for performing interval sampling on the SAR image and splicing the plurality of image feature layers obtained by sampling;
The convolution activation module is used for carrying out convolution, batch normalization and activation function processing on the image feature layer;
the space and channel attention module is used for carrying out weighting processing on the image feature layer after convolution activation processing and extracting context information for inhibiting complex background interference;
the first feature extraction enhancement module is used for carrying out feature fusion processing on the weighted image feature layer and outputting ship image features with set scales;
the spatial pyramid pooling module is used for carrying out maximum pooling treatment on the weighted image characteristic layer;
the second feature extraction enhancement module is used for carrying out feature fusion processing on the pooled image feature layer and outputting ship image features with set scales.
4. The attention mechanism-based SAR image multi-scale ship target detection method according to claim 3, wherein the feature extraction layer comprises the sampling layering module, the convolution activation module, the spatial and channel attention module, the first feature extraction enhancement module, the convolution activation module, the spatial and channel attention module, the spatial pyramid pooling module, and the second feature extraction enhancement module, which are sequentially arranged.
5. The attention mechanism-based SAR image multi-scale ship target detection method as set forth in claim 2 wherein the feature fusion layer further comprises an up-sampling module and a down-sampling module, wherein the input ends of the up-sampling module and the down-sampling module are respectively provided with the space and channel attention module for extracting space information and semantic information of the multi-scale ship.
6. The attention mechanism based SAR image multi-scale ship target detection method according to claim 2, wherein the prediction layer comprises a plurality of decoupling head modules, each of the decoupling head modules comprising a classification output branch for predicting an image class, a detection output branch for determining whether a target is contained in the image, and a regression output branch for predicting target coordinate information.
7. The attention mechanism based SAR image multi-scale ship target detection method of claim 1, wherein the loss function of the multi-scale ship detection model is determined based on position loss, confidence loss, and classification loss.
8. SAR image multiscale naval vessel target detection device based on attention mechanism, characterized by comprising:
The SAR image acquisition module is used for acquiring an SAR image to be detected;
the ship target detection module is used for inputting the SAR image into a multi-scale ship detection model to obtain a target detection result output by the multi-scale ship detection model;
the multi-scale ship detection model comprises a space and channel attention module, wherein the space and channel attention module comprises a space attention module and a channel attention module, and the channel attention module is used for processing a characteristic diagram output by the space attention module.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the attention-based SAR image multi-scale ship target detection method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the attention-based SAR image multi-scale ship target detection method according to any one of claims 1 to 7.
CN202211728727.2A 2022-12-30 2022-12-30 Attention mechanism-based SAR image multi-scale ship target detection method and device Pending CN115995041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211728727.2A CN115995041A (en) 2022-12-30 2022-12-30 Attention mechanism-based SAR image multi-scale ship target detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211728727.2A CN115995041A (en) 2022-12-30 2022-12-30 Attention mechanism-based SAR image multi-scale ship target detection method and device

Publications (1)

Publication Number Publication Date
CN115995041A true CN115995041A (en) 2023-04-21

Family

ID=85989963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211728727.2A Pending CN115995041A (en) 2022-12-30 2022-12-30 Attention mechanism-based SAR image multi-scale ship target detection method and device

Country Status (1)

Country Link
CN (1) CN115995041A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116503737A (en) * 2023-05-10 2023-07-28 中国人民解放军61646部队 Ship detection method and device based on space optical image
CN116503737B (en) * 2023-05-10 2024-01-09 中国人民解放军61646部队 Ship detection method and device based on space optical image
CN116883862A (en) * 2023-07-19 2023-10-13 北京理工大学 Multi-scale target detection method and device for optical remote sensing image
CN116883862B (en) * 2023-07-19 2024-02-23 北京理工大学 Multi-scale target detection method and device for optical remote sensing image

Similar Documents

Publication Publication Date Title
CN109583369B (en) Target identification method and device based on target area segmentation network
CN115995041A (en) Attention mechanism-based SAR image multi-scale ship target detection method and device
CN110310264A (en) A kind of large scale object detection method, device based on DCNN
CN108052940A (en) SAR remote sensing images waterborne target detection methods based on deep learning
CN112288008B (en) Mosaic multispectral image disguised target detection method based on deep learning
CN107818326A (en) A kind of ship detection method and system based on scene multidimensional characteristic
CN112766221B (en) Ship direction and position multitasking-based SAR image ship target detection method
CN111079739B (en) Multi-scale attention feature detection method
CN110647802A (en) Remote sensing image ship target detection method based on deep learning
CN111914924A (en) Rapid ship target detection method, storage medium and computing device
CN113408340B (en) Dual-polarization SAR small ship detection method based on enhanced feature pyramid
CN111666854A (en) High-resolution SAR image vehicle target detection method fusing statistical significance
CN112348758B (en) Optical remote sensing image data enhancement method and target identification method
CN116563726A (en) Remote sensing image ship target detection method based on convolutional neural network
CN115841629A (en) SAR image ship detection method based on convolutional neural network
CN116755090A (en) SAR ship detection method based on novel pyramid structure and mixed pooling channel attention mechanism
CN116168240A (en) Arbitrary-direction dense ship target detection method based on attention enhancement
CN113569720B (en) Ship detection method, system and device
CN114565824A (en) Single-stage rotating ship detection method based on full convolution network
CN113610178A (en) Inland ship target detection method and device based on video monitoring image
CN116912675B (en) Underwater target detection method and system based on feature migration
CN113436125A (en) Side-scan sonar simulation image generation method, device and equipment based on style migration
CN116844055A (en) Lightweight SAR ship detection method and system
Huang et al. A deep learning approach to detecting ships from high-resolution aerial remote sensing images
CN115410102A (en) SAR image airplane target detection method based on combined attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination