CN115272842A - SAR image ship instance segmentation method based on global semantic boundary attention network


Info

Publication number: CN115272842A
Application number: CN202210472909.1A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 张晓玲, 柯潇, 张天文, 师君, 韦顺军
Assignee (current and original): University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority to CN202210472909.1A
Publication of CN115272842A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods


Abstract

The invention discloses a SAR ship instance segmentation method based on a global semantic boundary attention network, which solves the problem of the limited positioning capability of target boxes in the prior art. The method is based on deep learning theory and mainly comprises a global context information modeling module and a boundary attention prediction module. The global context information modeling module builds long-range dependencies by repeatedly enhancing the semantic information of the features, thereby effectively reducing background interference. The boundary attention prediction module predicts the boundary information of the target twice, thereby improving the positioning capability of the target box. The average precision (AP) of the proposed method is superior to that of existing deep-learning-based SAR ship instance segmentation methods. The method overcomes the limited target-box positioning capability of the prior art and improves the instance segmentation accuracy of ships in SAR images.

Description

SAR image ship instance segmentation method based on global semantic boundary attention network
Technical Field
The invention belongs to the technical field of synthetic aperture radar (SAR) image interpretation, and relates to a SAR image ship instance segmentation method based on a global semantic boundary attention network.
Background
Synthetic aperture radar (SAR) is a powerful sensor. By measuring the radar scattering characteristics of targets, it provides high-resolution observation images, is unaffected by light and weather, and is widely used in the surveying, transportation, ocean and remote sensing communities. Ship monitoring supports disaster relief, traffic control and fishery monitoring, and is a hotspot of current research. Compared with optical, infrared and hyperspectral sensors, SAR adapts better to the changing marine climate and is more suitable for ship monitoring. Therefore, ship surveillance using SAR is receiving increasing attention.
Traditional methods usually rely on expert experience to hand-craft features, which is time-consuming and labor-intensive and limits wider adoption. In recent years, deep-learning-based detection, classification and recognition methods have developed rapidly and have been applied extensively to pedestrian detection, face recognition, image classification, speech translation, and so on. Introducing deep learning into SAR ship instance segmentation therefore holds great application potential, and more and more scholars have studied deep-learning-based SAR ship instance segmentation methods. For example, Su et al. applied a convolutional-neural-network-based model to instance segmentation of remote sensing images, but did not consider the characteristics of SAR ships, which limits further accuracy gains. Gao et al. proposed an anchor-free instance segmentation network, but the model cannot handle complex scenes and cases. Another group proposed a collaborative-attention-based SAR ship instance segmentation method, but it still misses many small ships and inshore ships. Overall, most existing SAR ship instance segmentation methods have limited target-box positioning capability, so segmentation accuracy still needs improvement.
To solve this problem, a SAR image ship instance segmentation method based on a global semantic boundary attention network is proposed, which raises instance segmentation accuracy by improving the positioning capability of the target box. The method mainly comprises two modules for improving target-box positioning. The first is the global context information modeling module, formed by serially connecting a content-aware feature recombination sub-network, a multi-receptive-field feature extraction sub-network and a global feature self-attention sub-network. It models the long-range dependencies of the ship's surroundings through a larger receptive field, effectively reducing background interference and extracting more discriminative regional features. The second is the boundary attention prediction module, formed by serially connecting a boundary attention feature extraction sub-network, a boundary coarse positioning sub-network, a boundary fine positioning sub-network and a boundary-guided classification re-scoring sub-network. Unlike a traditional boundary regression module, it does not predict the bounding box from center-point and size outputs but from the outputs of the four boundaries. Experimental results on the HRSID dataset show that the proposed method outperforms other deep-learning-based instance segmentation methods.
Disclosure of Invention
The invention belongs to the technical field of synthetic aperture radar (SAR) image interpretation, and discloses a SAR ship instance segmentation method based on a global semantic boundary attention network, which solves the problem of limited target-box positioning capability in the prior art. The method is based on deep learning theory and mainly comprises a global context information modeling module and a boundary attention prediction module. The global context information modeling module builds long-range dependencies by repeatedly enhancing the semantic information of the features, effectively reducing background interference. The boundary attention prediction module predicts the boundary information of the target twice, improving the positioning capability of the target box. Experiments show that, on the HRSID dataset, the average precision (AP) of the proposed method is 57.3%, while the highest AP among existing deep-learning-based SAR ship instance segmentation methods is 55.4%. The proposed method therefore improves ship instance segmentation accuracy.
For the convenience of describing the present invention, the following terms are first defined:
definition 1: traditional HRSID data set acquisition method
The HRSID dataset (High-Resolution SAR Images Dataset) is a commonly used SAR image ship instance segmentation dataset. It is derived from 136 panoramic SAR images with resolutions ranging from 1 m to 5 m. Each panoramic image is cut by a sliding-window mechanism with a 25% overlap ratio into slices of 800 × 800 pixels, yielding 5604 slices in total, which contain 16951 ships. 65 percent of the slices form the training set and the remaining 35 percent form the test set. The HRSID acquisition method is detailed in "Wei S, Zeng X, Qu Q, et al. HRSID: A High-Resolution SAR Images Dataset for Ship Detection and Instance Segmentation [J]. IEEE Access, 2020, 8."
Definition 2: traditional residual backbone network construction method
The residual backbone network is a commonly used backbone network; this convolutional neural network, proposed by four researchers from Microsoft Research, won the image classification and object recognition tasks of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Compared with conventional backbones, the residual backbone adds multiple residual connections, which reduce the probability of vanishing and exploding gradients and allow faster optimization, so it can stack many more convolutional layers. The 101-layer residual network is one of the most common variants. Specifically, it first extracts features with a 7 × 7 convolutional layer with 64 kernels and stride 2, then downsamples by a factor of two with 3 × 3 max pooling, producing the first-stage feature map. The outputs of the second, third, fourth and fifth stages are then produced by stacks of residual modules; the stages differ in the number of residual modules and in the number of convolution kernels within those modules. Thanks to the skip connections in the residual modules, the network avoids vanishing gradients, exploding gradients and degradation while stacking deeply, optimizes quickly, and extracts abstract features with stronger discriminative power. The classical residual network construction method is detailed in "K. He et al., Deep Residual Learning for Image Recognition, IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770-778."
Definition 3: traditional regional recommendation network construction method
The regional recommendation network (region proposal network) was proposed in Faster R-CNN. To address the time-consuming region proposal algorithm in the Fast R-CNN pipeline, Faster R-CNN replaces selective search with a region proposal network, and fuses the region proposal network and Fast R-CNN into one network by introducing the concept of a shared convolutional feature map, realizing fast object detection. In addition, by presetting anchor boxes of different sizes and aspect ratios, the region proposal network strengthens the multi-scale detection capability of the detector to some extent, improving detection accuracy. Specifically, the region proposal network takes a whole picture as input and outputs the positions and confidences of a series of region proposal boxes, where the confidence represents the probability that a proposal is foreground. It comprises two sub-networks. The first is the backbone shared between the region proposal network and Fast R-CNN; this mechanism, also called the shared convolutional feature map, ultimately aims to save computing resources for object detection. The second consists of an intermediate layer, a classification layer and a regression layer: the intermediate layer is essentially a fully connected layer, while the classification and regression layers are essentially convolutional layers of size 3 × 3. Note that the classification and regression layers are two parallel modules whose inputs are both the output of the intermediate layer. The second sub-network operates with a sliding-window mechanism similar to a convolution: its input at each position is a local region of the feature map. In the forward propagation stage it slides over the feature map and, for each position, computes the classification and regression information of several anchor boxes. The classic construction method is detailed in "Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149."
Definition 4: traditional region-of-interest feature extraction module construction method
The region-of-interest feature extraction module was first proposed in the Fast R-CNN paper; it extracts fixed-size local features from the feature map according to the coordinates of a region of interest. The module proposed in Fast R-CNN is RoI Pooling, whose basic idea is to obtain fixed-size local features through two quantization steps and a max-pooling operation. The two quantizations, however, often cause precision loss, so the extracted local features become misaligned with the coordinates of the region of interest. The Mask R-CNN paper therefore proposed RoI Align, whose main idea is to abandon quantization and compute feature values at the required coordinates by bilinear interpolation, keeping the extracted local features aligned with the region-of-interest coordinates. RoI Align has since become the mainstream implementation of the region-of-interest feature extraction module. Its detailed construction is given in "He K, Gkioxari G, Dollár P, et al. Mask R-CNN [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017."
Definition 5: traditional convolutional layer construction method
The convolutional layer is a basic module in deep neural networks; its basic function is to extract abstract features from the input data, facilitating the subsequent classification, regression and other task networks. A convolutional layer usually contains several convolution kernels; a kernel is a node that separately weights the values within a small rectangular region of an input feature map or picture and sums them as an output. Each kernel requires several manually specified parameters. One is the length and width of the node matrix it processes, which is the kernel size; another is the depth of the kernel, which equals the depth of the unit node matrix it produces. During convolution, each kernel slides over the input, the inner product between the kernel and the corresponding input region is computed and passed through a nonlinear function, and the results over all positions form a two-dimensional feature map. Each kernel generates one two-dimensional feature map, and the maps of all kernels are stacked into a three-dimensional feature map. In general, kernels are 3 × 3 or 5 × 5, the kernel depth is determined by the number of feature channels of the previous layer, and the number of kernels is chosen by the designer. A classic treatment appears in the survey of deep-convolutional-neural-network-based object detection in Optics and Precision Engineering, 2020, 28(05): 1152-1164.
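To make the sliding-window computation concrete, the following is a minimal sketch of a convolutional layer in PyTorch (an assumed framework; the patent does not prescribe one). The channel counts and input size are illustrative.

```python
import torch
import torch.nn as nn

# 3x3 convolution: the kernel depth follows the 64 input channels, and the
# 128 kernels each produce one 2-D feature map, stacked into the output.
conv = nn.Sequential(
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),  # the nonlinear function applied to each inner product
)

x = torch.randn(1, 64, 56, 56)   # (batch, channels, height, width)
y = conv(x)                      # -> (1, 128, 56, 56)
```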
Definition 6: traditional pixel recombination construction method
Pixel recombination (pixel shuffle) was first proposed for the image super-resolution task and was subsequently applied to image classification and detection tasks. It is an upsampling method that effectively enlarges a reduced feature map and can serve as a substitute for deconvolution or nearest-neighbor interpolation. The classic construction method is detailed at https://blog.csdn.net/djfjkj52/article/details/123829282.
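As a hedged illustration of the shape arithmetic behind pixel recombination, the sketch below uses PyTorch's built-in PixelShuffle (an assumed dependency); the channel count is illustrative.

```python
import torch
import torch.nn as nn

# Pixel recombination as a 2x upsampler: r**2 = 4 channels are traded
# for a 2x larger spatial resolution.
up = nn.PixelShuffle(upscale_factor=2)
x = torch.randn(1, 256, 25, 25)
y = up(x)
print(y.shape)  # torch.Size([1, 64, 50, 50])
```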
Definition 7: traditional dilated convolution layer construction method
The dilated (hole) convolution layer is similar to the standard convolutional layer, adding only a dilation-rate parameter that enlarges the sampling region of the convolution and thereby the receptive field. By enlarging the receptive field, the dilated convolution layer can extract global information to some extent, enhance the semantic information of the output feature map, and help the network distinguish targets from background interference. The classic construction method is detailed at https://blog.csdn.net/qq_30241709/article/details/88080367.
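A one-line sketch of the dilation-rate parameter, again assuming PyTorch: a 3 × 3 kernel with dilation 2 samples a 5 × 5 neighborhood at no extra parameter cost.

```python
import torch
import torch.nn as nn

# dilation=2 spreads the 3x3 taps over a 5x5 window; padding=2 keeps the size.
dilated = nn.Conv2d(256, 256, kernel_size=3, dilation=2, padding=2)
print(dilated(torch.randn(1, 256, 32, 32)).shape)  # torch.Size([1, 256, 32, 32])
```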
Definition 8: traditional concatenation operation
Concatenation is an important operation in network structure design; it combines features, fusing the features extracted by several convolutional feature extraction branches or the information of output layers, thereby strengthening the feature extraction capability of the network. The method is detailed at https://blog.csdn.net/alxe_master/article/details/80506051.
Definition 9: traditional global feature self-attention construction method
The global feature self-attention module extracts non-local features of the input. Its basic idea is that, for every input pixel, similarity weights to all other pixels are computed, and the corresponding pixels are then weighted and summed by these similarities to give the output for that pixel. Compared with a convolutional layer, the receptive field of the global feature self-attention module is larger and not restricted to a local window, so it can extract global information and enhance the semantic information of the feature map. The classical construction method is detailed in "Wang X, Girshick R, Gupta A, et al. Non-local Neural Networks."
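The sketch below renders the pairwise-similarity idea in the style of the cited non-local networks; it is a simplified, assumption-laden rendering (PyTorch assumed; the channel reduction and residual connection are illustrative choices), not the patent's exact sub-network.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    # For each pixel, similarity weights to all other pixels are computed
    # (softmax over q @ k) and used to weight-sum the value projections.
    def __init__(self, channels: int):
        super().__init__()
        inter = channels // 2
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.g = nn.Conv2d(channels, inter, 1)
        self.out = nn.Conv2d(inter, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.phi(x).flatten(2)                     # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c')
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw) similarities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection

block = NonLocalBlock(256)
print(block(torch.randn(1, 256, 16, 16)).shape)  # torch.Size([1, 256, 16, 16])
```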
Definition 10: traditional full-connection layer construction method
The fully connected layer is a neural network structure for further feature extraction. Unlike a convolutional layer, the numbers of its input and output nodes must be preset, and its parameter count and computation far exceed those of a convolutional layer, so fully connected layers usually appear in only certain parts of a network. A classical treatment is "Haoren Wang, Haotian Shi, Ke Lin, Chengjin Qin, Liqun Zhao, Yixiang Huang, Chengliang Liu. A high-precision arrhythmia classification method based on dual fully connected neural network [J]. Biomedical Signal Processing and Control, 2020, 58."
Definition 11: traditional convolution attention module construction method
The convolution attention module mainly comprises three parts: pooling, convolution and activation functions. Specifically, for an input feature F, channel-wise max pooling and channel-wise average pooling each produce a two-dimensional map; the two maps are concatenated and passed through a convolution and an activation function to obtain the spatial attention map M_S. The spatial attention module is expressed as

M_S(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)]))

where f^{7×7} denotes a convolution with kernel size 7 × 7 and σ denotes the sigmoid activation function. Note that, to extract features over multiple receptive fields, this method uses two spatial attention modules in parallel, with convolution kernel sizes of 7 × 7 and 3 × 3 respectively. The classical construction method is "Woo S., Park J., Lee J.Y., Kweon I.S. CBAM: Convolutional Block Attention Module. 2018."
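The formula above translates almost line for line into code. The following is a sketch (PyTorch assumed) of one spatial attention module with a configurable kernel size, so the two parallel 7 × 7 and 3 × 3 instances described in the text can both be built from it.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    # M_S(F) = sigmoid(conv([AvgPool(F); MaxPool(F)])) over the channel axis.
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        avg = f.mean(dim=1, keepdim=True)       # channel-wise average pooling
        mx, _ = f.max(dim=1, keepdim=True)      # channel-wise max pooling
        m = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return f * m                            # reweight the input feature

sa7, sa3 = SpatialAttention(7), SpatialAttention(3)   # the two parallel branches
```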
Definition 12: traditional boundary coarse positioning sub-network construction method
The boundary coarse positioning sub-network takes boundary features as input and outputs a coarse positioning of the corresponding boundary. Specifically, it divides the target space into several discrete intervals and, for a given boundary feature, outputs only the confidence of the interval to which the corresponding boundary belongs (i.e., s_x-right, s_x-left, s_y-right and s_y-left), without giving a more precise boundary regression value. Here s_x-right denotes the confidence of the vertical boundary toward the right, s_x-left the confidence of the vertical boundary toward the left, s_y-right the confidence of the horizontal boundary toward the right, and s_y-left the confidence of the horizontal boundary toward the left. The classic construction method is detailed in "Wang J, Zhang W, Cao Y, et al. Side-Aware Boundary Localization for More Precise Object Detection [J]. 2019."
Definition 13: construction method of traditional boundary fine positioning sub-network
The boundary fine positioning sub-network corrects the boundary position again on the basis of the coarse positioning. The process resembles traditional bounding-box classification regression: the prediction box from the coarse positioning serves as prior knowledge, and the coordinate and size offsets between the target box and the prediction box are output, yielding a more accurate boundary prediction. The classic construction method is detailed in "Wang J, Zhang W, Cao Y, et al. Side-Aware Boundary Localization for More Precise Object Detection [J]. 2019."
Definition 14: traditional mask subnetwork construction method
The mask sub-network is taken from Mask R-CNN. It takes the result of the boundary prediction network as input and outputs a pixel-level binary classification of the target region, distinguishing target from background at the pixel level so as to extract the edge information of the target. The classical construction method is detailed in "He K, Gkioxari G, Dollár P, et al. Mask R-CNN [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017."
Definition 15: classical Adam algorithm
The classical Adam algorithm is an extension of stochastic gradient descent that has recently seen wide use in deep learning for computer vision and natural language processing. Unlike classical stochastic gradient descent, which maintains a single learning rate for all weight updates that does not change during training, Adam maintains a learning rate for each network weight and adjusts it individually as learning progresses, computing adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. The algorithm is detailed in "Kingma D., Ba J. Adam: A Method for Stochastic Optimization. 2014, arXiv:1412.6980."
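A minimal usage sketch of Adam's per-parameter adaptive updates, assuming the PyTorch optimizer API; the tiny stand-in model and hyper-parameters are illustrative only, not the patent's training configuration.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                     # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

loss = model(torch.randn(4, 10)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()   # update with bias-corrected first/second moment estimates
```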
Definition 16: conventional forward propagation method
Forward propagation is the most basic method in deep learning: it performs forward inference on the input according to the parameters and connections of the network, producing the network's output. The method is detailed at https://www.jianshu.com/p/f30c8daebebb.
Definition 17: conventional non-maxima suppression method
Non-maximum suppression (NMS) is an algorithm used in the object detection field to remove redundant detection boxes. In the forward propagation results of a classical detection network, the same target often corresponds to several detection boxes, so an algorithm is needed to select the best, highest-scoring box among them. Non-maximum suppression performs a local maximum search, discarding boxes whose overlap rate with the current maximum exceeds a threshold. The method is detailed at https://www.cnblogs.com/makefile/p/nms.html.
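A short sketch of redundant-box removal, assuming torchvision's NMS operator; the boxes, scores and the 0.5 threshold (matching step 6.1 below) are illustrative.

```python
import torch
from torchvision.ops import nms

boxes = torch.tensor([[10., 10., 60., 60.],     # (x1, y1, x2, y2)
                      [12., 12., 62., 62.],     # heavily overlaps the first box
                      [100., 100., 150., 150.]])
scores = torch.tensor([0.9, 0.8, 0.7])

keep = nms(boxes, scores, iou_threshold=0.5)
print(keep)  # tensor([0, 2]): the lower-scored overlapping box is discarded
```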
Definition 18: traditional recall rate and accuracy rate calculation method
Recall R is the proportion of all positive samples that are correctly predicted:

R = TP / (TP + FN)

Precision P is the proportion of results predicted as positive that are correct:

P = TP / (TP + FP)

where TP (true positive) denotes a positive sample predicted positive by the model, FN (false negative) a positive sample predicted negative, and FP (false positive) a negative sample predicted positive. The precision-recall curve P(R) is the function with R as the independent variable and P as the dependent variable. The computation of these quantities is described in "Li Hang. Statistical Learning Methods [M]. Beijing: Tsinghua University Press, 2012."
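The definitions above reduce to two one-line functions; this plain-Python sketch is included only to pin down the TP/FN/FP bookkeeping.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of all positive samples that were predicted positive."""
    return tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp)

print(recall(tp=80, fn=20), precision(tp=80, fp=10))  # 0.8 0.888...
```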
The invention provides a SAR ship instance segmentation method based on a global semantic boundary attention network, which comprises the following steps:
step 1, initializing a data set
Obtain the HRSID dataset according to the traditional HRSID dataset acquisition method in definition 1; denote the training set in the HRSID dataset as D_train and the test set as D_test.
Step 2, building a forward propagation network
Step 2.1, building ResNet-101 backbone network
Construct a residual network with 101 layers with the classical residual backbone network construction method in definition 2, denoted Res-101.
Step 2.2, building a regional recommendation network
Construct a regional recommendation network with the classical regional recommendation network construction method in definition 3, taking the ResNet-101 backbone network Res-101 obtained in step 2.1 as a sub-network within it; the constructed regional recommendation network is denoted RPN_0.
Step 2.3, building a feature extraction module
Construct a feature extraction module with the traditional region-of-interest feature extraction module construction method in definition 4; the constructed feature extraction module is denoted FExtract.
Step 2.4, building a global context information modeling module
First, two convolutional layers, denoted conv1 and conv2, are built with the traditional convolutional layer construction method in definition 5, and a pixel recombination module, denoted pixelshuffle, is built with the traditional pixel recombination construction method in definition 6. According to the expression

softmax(z_i) = e^{z_i} / Σ_{j=1}^{C} e^{z_j}

define a Softmax layer, denoted softmax0, where z_i is the feature value of the ith node of the input feature map and C is the number of channels of the input feature map. Connect conv1, conv2, pixelshuffle and softmax0 in series, and denote the result kplayer. According to the expression

F'_l = Σ_n Σ_m W_{l'}(n, m) · F_{(i+n, j+m)}

construct the feature recombination layer, denoted czlayer, where (i, j) are the coordinates of position l, F_{(i+n, j+m)} is the feature vector at (i+n, j+m) in F, and W_{l'}(n, m) is the weight of the recombination kernel W_{l'} at (n, m). Combining kplayer and czlayer completes the construction of the content-aware feature recombination sub-network, which is denoted card.

Then, dilated convolution layers with dilation rates of 2, 3, 4 and 5, denoted d1, d2, d3 and d4 respectively, are built with the traditional dilated convolution layer construction method in definition 7. A concatenation module, denoted concate, is built with the traditional concatenation operation in definition 8, and a convolutional layer, denoted conv3, with the traditional convolutional layer construction method in definition 5. Connect d1, d2, d3 and d4 in parallel and then in series with concate and conv3 in sequence; this completes the multi-receptive-field feature extraction sub-network, denoted mrblock.

Finally, a global feature self-attention sub-network, denoted sablock, is built with the traditional global feature self-attention construction method in definition 9.

Connect the content-aware feature recombination sub-network card, the multi-receptive-field feature extraction sub-network mrblock and the global feature self-attention sub-network sablock in series, in that order, to obtain the global context information modeling module, denoted GCB.
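As a hedged structural sketch of step 2.4 (PyTorch assumed; channel counts are guesses, and the recombination and self-attention sub-networks are abstracted behind arbitrary callables), the multi-receptive-field block and the serial GCB chaining could look as follows.

```python
import torch
import torch.nn as nn

class MRBlock(nn.Module):
    # Four parallel dilated 3x3 convolutions (rates 2,3,4,5) -> concat -> conv3.
    def __init__(self, c: int = 256):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c, c, 3, dilation=r, padding=r) for r in (2, 3, 4, 5)])
        self.fuse = nn.Conv2d(4 * c, c, 3, padding=1)   # plays the role of conv3

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class GCB(nn.Module):
    # Serial chain: content-aware recombination -> multi-receptive-field -> self-attention.
    def __init__(self, card: nn.Module, mrblock: nn.Module, sablock: nn.Module):
        super().__init__()
        self.card, self.mrblock, self.sablock = card, mrblock, sablock

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.sablock(self.mrblock(self.card(x)))
```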
Step 2.5, setting up boundary attention prediction module
Build three fully connected layers, denoted fc1, fc2 and fc3, with the traditional fully connected layer construction method in definition 10. Connecting fc1, fc2 and fc3 in series establishes the classification branch, denoted CLBranch; the classification result output by CLBranch is denoted s.

Build a convolution attention module, denoted CBAM, with the traditional convolution attention module construction method of definition 11, and four convolutional layers, denoted conv4, conv5, conv6 and conv7, with the traditional convolutional layer construction method of definition 5. By the same expression as softmax0, define two further Softmax layers, denoted softmax1 and softmax2, where z_i is the feature value of the ith node of the input feature map and C is the number of channels. Connect conv4, softmax1 and conv5 in series, denoted branchx, and conv6, softmax2 and conv7 in series, denoted branchy. Connecting branchx and branchy in parallel and appending them to the CBAM module completes the boundary attention feature extraction sub-network, denoted baff.

Build a boundary coarse positioning sub-network with the traditional boundary coarse positioning sub-network construction method of definition 12, denoted bbcl; its four outputs are denoted s_x-right, s_x-left, s_y-right and s_y-left.

Build a boundary fine positioning sub-network with the traditional boundary fine positioning sub-network construction method of definition 13, denoted brfl.
Taking the classification result s output by CLBranch and the outputs s_x-right, s_x-left, s_y-right and s_y-left of bbcl as input, compute

s* = s · (s_x-left + s_x-right + s_y-left + s_y-right) / 4

which completes the construction of the boundary-guided classification re-scoring sub-network, denoted cbcr.
Connect baff, bbcl, brfl and cbcr in series, in that order, to complete the boundary attention prediction module, denoted BABP.
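Since the exact re-scoring formula is rendered as an equation image in the original filing, the following plain-Python sketch shows only the plausible averaging form reconstructed above (an assumption in the spirit of the cited side-aware boundary localization work).

```python
def cbcr(s: float, sx_left: float, sx_right: float,
         sy_left: float, sy_right: float) -> float:
    """Modulate the class score s by the mean of the four boundary confidences."""
    return s * (sx_left + sx_right + sy_left + sy_right) / 4.0

print(cbcr(0.9, 0.8, 0.7, 0.9, 0.6))  # 0.675
```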
Step 2.6, building a mask subnetwork
Build a mask sub-network with the traditional mask sub-network construction method of definition 14; the constructed mask sub-network is denoted MASK.
Step 2.7, building an instance segmentation cascade network
Connect the feature extraction module FExtract from step 2.3, the global context information modeling module GCB from step 2.4, the boundary attention prediction module BABP from step 2.5 and the mask sub-network MASK from step 2.6 in series, in that order, to obtain the first instance segmentation network, denoted SEG1.
Repeat steps 2.3, 2.4, 2.5 and 2.6 and connect the resulting modules and sub-networks in series in the same order to obtain the second instance segmentation network, denoted SEG2.
Repeat steps 2.3, 2.4, 2.5 and 2.6 again and connect the resulting modules and sub-networks in series to obtain the third instance segmentation network, denoted SEG3.
Connect SEG1, SEG2 and SEG3 in series, in that order; this completes the instance segmentation cascade network, denoted CASEG_0.
Step 3, training the regional recommendation network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 3.1, forward propagation is carried out on the regional recommendation network
Take the training set D_train obtained in step 1 as the input of the regional recommendation network RPN_0. According to the forward propagation method in definition 16, feed D_train into RPN_0 for computation, and record the output of RPN_0 as Result0.
Step 3.2, sampling the forward propagation result
Taking Result0 obtained in step 3.1 and the training set D_train as input, compute the IoU value of each recommendation box in Result0 according to

IoU = Area(B ∩ B_gt) / Area(B ∪ B_gt)

where B is a recommendation box and B_gt is its matched ground-truth box. Take the outputs in Result0 with IoU greater than 0.5 as positive samples, denoted Result0p, and those with IoU less than 0.5 as negative samples, denoted Result0n. Count the total number of samples in Result0n, denoted M. Manually input the number of required negative samples, denoted N, and the number of intervals into which the IoU range is equally divided, denoted n_b; the number of samples in the ith IoU interval is M_i. Set the random sampling probability of the ith interval to

p_i = N / (n_b · M_i)

and randomly sample each IoU interval; record the sampling results over all negative-sample IoU intervals as Result0ns.

Count the number of samples in the positive sample set Result0p, denoted P. Set the random sampling probability to N/P and randomly sample Result0p; record the positive sampling result as Result0ps.
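The interval-wise negative sampling of step 3.2 can be sketched as below (plain Python; the function name, the (iou, sample) pair representation and the [0, 0.5) negative IoU range are assumptions made for illustration).

```python
import random

def sample_negatives(negatives, n_required: int, n_bins: int):
    """negatives: list of (iou, sample) pairs with IoU < 0.5.
    Each IoU interval i is sampled with probability N / (n_b * M_i)."""
    bins = [[] for _ in range(n_bins)]
    width = 0.5 / n_bins
    for iou, sample in negatives:
        bins[min(int(iou / width), n_bins - 1)].append(sample)
    picked = []
    for bucket in bins:
        if not bucket:
            continue
        p = min(1.0, n_required / (n_bins * len(bucket)))   # p_i = N / (n_b * M_i)
        picked.extend(s for s in bucket if random.random() < p)
    return picked
```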
Step 3.3, training and optimizing the regional recommendation network
Taking the positive sampling result Result0ps and the negative sampling result Result0ns obtained in step 3.2 as input, train and optimize the regional recommendation network according to the classical Adam algorithm in definition 15, obtaining the trained and optimized regional recommendation network RPN_1.
Step 4, training the instance segmentation cascade network
Step 4.1, forward propagation through the instance segmentation cascade network
Take the training set D_train obtained in step 1 as the input of the instance segmentation cascade network CASEG_0. According to the traditional forward propagation method in definition 16, feed D_train into CASEG_0 for computation, and record the output of CASEG_0 as Result1.
Step 4.2, training and optimizing the instance segmentation cascade network
Taking the output Result1 of the instance segmentation cascade network CASEG_0 obtained in step 4.1 as input, train and optimize the instance segmentation cascade network according to the classical Adam algorithm in definition 15, obtaining the trained and optimized instance segmentation cascade network CASEG_1.
Step 5, alternate training is carried out
Determine whether the epoch set in step 3 equals 12. If not, let epoch = epoch + 1, RPN_0 = RPN_1 and CASEG_0 = CASEG_1, repeat steps 3.1, 3.2, 3.3, 4.1 and 4.2 in sequence, and then return to step 5 to judge the epoch again. If the epoch equals 12, record the trained regional recommendation network RPN_1 together with the trained instance segmentation cascade network CASEG_1 as the network GCBAN, and proceed to step 6.
Step 6, evaluation method
Step 6.1, forward propagation
Using the network GCBAN obtained in step 5 and the test set D_test obtained in step 1 as input, obtain the detection result, denoted R, with the traditional forward propagation method of definition 16.

Taking the detection result R as input, remove the redundant boxes in R with the traditional non-maximum suppression method in definition 17, as follows:

Step (1): mark the highest-scoring box in the detection result R as BS.

Step (2): compute the overlap rate

IoU = Area(B ∩ BS) / Area(B ∪ BS)

between every remaining box B in R and BS, and discard the boxes with IoU > 0.5.

Step (3): select the highest-scoring box among the remaining boxes as the new BS, and repeat the computing-and-discarding process of step (2) until no box can be discarded; the boxes that remain constitute the final detection result, denoted R_F.
Step 6.2, index calculation
Using the detection result R_F obtained in step 6.1 as input, compute the precision P, the recall R and the precision-recall curve P(R) of the network with the traditional recall and precision calculation method in definition 18; then, using the formula

AP = ∫_0^1 P(R) dR

compute the SAR ship instance segmentation accuracy indices AP, AP_50, AP_75, AP_S, AP_M and AP_L.
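The integral AP = ∫ P(R) dR is evaluated numerically in practice; the sketch below (NumPy assumed) uses simple trapezoidal integration, whereas COCO-style evaluation would interpolate the curve at fixed recall points.

```python
import numpy as np

def average_precision(recalls, precisions) -> float:
    """Area under the precision-recall curve P(R)."""
    r = np.asarray(recalls)
    p = np.asarray(precisions)
    order = np.argsort(r)
    return float(np.trapz(p[order], r[order]))

print(average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.5]))  # 0.775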
The innovation of this deep-learning-based SAR ship instance segmentation method is the introduction of a global context information modeling module and a boundary attention prediction module, which together address the limited target-box positioning capability of existing deep-learning-based SAR ship instance segmentation methods. With the proposed method, the SAR image ship instance segmentation AP is 57.3%, exceeding the second-best method by 1.9 percentage points; AP_50 is 88.6%, exceeding it by 2.8 points; AP_75 is 68.9%, exceeding it by 2.0 points; AP_S is 57%, exceeding it by 2.1 points; AP_M is 64.3%, exceeding it by 0.8 points; and AP_L is 25.9%, exceeding it by 6.2 points. In conclusion, the method achieves better target-box positioning and excellent SAR ship instance segmentation accuracy.
The method thus overcomes the limited target-box positioning capability of the prior art and improves the instance segmentation accuracy of ships in SAR images.
Drawings
FIG. 1 is a schematic flow chart of the SAR image ship instance segmentation method based on a global semantic boundary attention network,
wherein 1 denotes the boundary attention prediction module and 2 denotes the global context information modeling module;
FIG. 2 shows the instance segmentation accuracy indices of the SAR image ship instance segmentation method based on the global semantic boundary attention network.
Detailed Description
The present invention is further described in detail with reference to fig. 1 and 2.
Step 1, initializing a data set
Obtain the HRSID dataset according to the HRSID dataset acquisition method in definition 1; denote the training set in the HRSID dataset as D_train and the test set as D_test.
Step 2, building a forward propagation network
Step 2.1, building ResNet-101 backbone network
As shown in fig. 1, a classical residual backbone network construction method in definition 2 is adopted to construct a residual network with 101 network layers, which is denoted as Res-101.
Step 2.2, building a regional recommendation network
As shown in fig. 1, a regional recommendation network is constructed with the classical regional recommendation network construction method in definition 3; the ResNet-101 backbone network Res-101 obtained in step 2.1 serves as a sub-network within it, and the constructed regional recommendation network is denoted RPN_0.
Step 2.3, building a feature extraction module
As shown in fig. 1, a feature extraction module is constructed by using the region-of-interest feature extraction module construction method in definition 4, and the constructed feature extraction module is recorded as FExtract.
Step 2.4, building a global context information modeling module
First, two convolutional layers, denoted conv1 and conv2, are built with the convolutional layer construction method in definition 5, and a pixel recombination module, denoted pixelshuffle, is built with the pixel recombination construction method in definition 6. According to the expression

softmax(z_i) = e^{z_i} / Σ_{j=1}^{C} e^{z_j}

define a Softmax layer, denoted softmax0, where z_i is the feature value of the ith node of the input feature map and C is the number of channels of the input feature map. Connect conv1, conv2, pixelshuffle and softmax0 in series, and denote the result kplayer. According to the expression

F'_l = Σ_n Σ_m W_{l'}(n, m) · F_{(i+n, j+m)}

construct the feature recombination layer, denoted czlayer, where (i, j) are the coordinates of position l, F_{(i+n, j+m)} is the feature vector at (i+n, j+m) in F, and W_{l'}(n, m) is the weight of the recombination kernel W_{l'} at (n, m). Combining kplayer and czlayer completes the construction of the content-aware feature recombination sub-network, which is denoted card.

Then, dilated convolution layers with dilation rates of 2, 3, 4 and 5, denoted d1, d2, d3 and d4 respectively, are built with the dilated convolution layer construction method in definition 7. A concatenation module, denoted concate, is built with the concatenation operation in definition 8, and a convolutional layer, denoted conv3, with the convolutional layer construction method in definition 5. Connect d1, d2, d3 and d4 in parallel and then in series with concate and conv3 in sequence; this completes the multi-receptive-field feature extraction sub-network, denoted mrblock.

Finally, a global feature self-attention sub-network, denoted sablock, is built with the global feature self-attention construction method in definition 9.

As shown in fig. 1, the content-aware feature recombination sub-network card, the multi-receptive-field feature extraction sub-network mrblock and the global feature self-attention sub-network sablock are connected in series, in that order, to obtain the global context information modeling module, denoted GCB.
Step 2.5, building a boundary attention prediction module
Build three fully connected layers, denoted fc1, fc2 and fc3, with the fully connected layer construction method of definition 10. Connecting fc1, fc2 and fc3 in series establishes the classification branch, denoted CLBranch; the classification result output by CLBranch is denoted s.

Build a convolution attention module, denoted CBAM, with the convolution attention module construction method of definition 11, and four convolutional layers, denoted conv4, conv5, conv6 and conv7, with the convolutional layer construction method of definition 5. By the same expression as softmax0, define two further Softmax layers, denoted softmax1 and softmax2, where z_i is the feature value of the ith node of the input feature map and C is the number of channels. Connect conv4, softmax1 and conv5 in series, denoted branchx, and conv6, softmax2 and conv7 in series, denoted branchy. Connecting branchx and branchy in parallel and appending them to the CBAM module completes the boundary attention feature extraction sub-network, denoted baff.

Build a boundary coarse positioning sub-network with the boundary coarse positioning sub-network construction method of definition 12, denoted bbcl; its four outputs are denoted s_x-right, s_x-left, s_y-right and s_y-left.

Build a boundary fine positioning sub-network with the boundary fine positioning sub-network construction method of definition 13, denoted brfl.
Taking the classification result s output by CLBranch and the outputs s_x-right, s_x-left, s_y-right and s_y-left of bbcl as input, compute

s* = s · (s_x-left + s_x-right + s_y-left + s_y-right) / 4

which completes the construction of the boundary-guided classification re-scoring sub-network, denoted cbcr.
As shown in fig. 1, baff, bbcl, brfl and cbcr are connected in series, in that order, completing the boundary attention prediction module, denoted BABP.
Step 2.6, building a mask subnetwork
As shown in fig. 1, a MASK subnetwork is constructed according to the MASK subnetwork construction method defined by definition 14, and the constructed MASK subnetwork is denoted as MASK.
Step 2.7, building an instance segmentation cascade network
Connect the feature extraction module FExtract from step 2.3, the global context information modeling module GCB from step 2.4, the boundary attention prediction module BABP from step 2.5 and the mask sub-network MASK from step 2.6 in series, in that order, to obtain the first instance segmentation network, denoted SEG1.
Repeat steps 2.3, 2.4, 2.5 and 2.6 and connect the resulting modules and sub-networks in series in the same order to obtain the second instance segmentation network, denoted SEG2.
Repeat steps 2.3, 2.4, 2.5 and 2.6 again and connect the resulting modules and sub-networks in series to obtain the third instance segmentation network, denoted SEG3.
Connect SEG1, SEG2 and SEG3 in series, in that order; this completes the instance segmentation cascade network, denoted CASEG_0.
Step 3, training the regional recommendation network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 3.1, forward propagation is carried out on the regional recommendation network
Take the training set D_train obtained in step 1 as the input of the regional recommendation network RPN_0. According to the forward propagation method in definition 16, feed D_train into RPN_0 for computation, and record the output of RPN_0 as Result0.
Step 3.2, sampling the forward propagation result
Taking Result0 obtained in step 3.1 and the training set D_train as input, compute the IoU value of each recommendation box in Result0 according to

IoU = Area(B ∩ B_gt) / Area(B ∪ B_gt)

where B is a recommendation box and B_gt is its matched ground-truth box. Take the outputs in Result0 with IoU greater than 0.5 as positive samples, denoted Result0p, and those with IoU less than 0.5 as negative samples, denoted Result0n. Count the total number of samples in Result0n, denoted M. Manually input the number of required negative samples, denoted N, and the number of intervals into which the IoU range is equally divided, denoted n_b; the number of samples in the ith IoU interval is M_i. Set the random sampling probability of the ith interval to

p_i = N / (n_b · M_i)

and randomly sample each IoU interval; record the sampling results over all negative-sample IoU intervals as Result0ns.

Count the number of samples in the positive sample set Result0p, denoted P. Set the random sampling probability to N/P and randomly sample Result0p; record the positive sampling result as Result0ps.
Step 3.3, training and optimizing the regional recommendation network
Taking the positive sampling result Result0ps and the negative sampling result Result0ns obtained in step 3.2 as input, train and optimize the regional recommendation network according to the classical Adam algorithm in definition 15, obtaining the trained and optimized regional recommendation network RPN_1.
Step 4, training the instance segmentation cascade network
Step 4.1, forward propagation through the instance segmentation cascade network
Take the training set D_train obtained in step 1 as the input of the instance segmentation cascade network CASEG_0. According to the forward propagation method in definition 16, feed D_train into CASEG_0 for computation, and record the output of CASEG_0 as Result1.
Step 4.2, training and optimizing the instance segmentation cascade network
Taking the output Result1 of the instance segmentation cascade network CASEG_0 obtained in step 4.1 as input, train and optimize the instance segmentation cascade network according to the classical Adam algorithm in definition 15, obtaining the trained and optimized instance segmentation cascade network CASEG_1.
Step 5, alternate training is carried out
Determine whether the epoch set in step 3 equals 12. If not, let epoch = epoch + 1, RPN_0 = RPN_1 and CASEG_0 = CASEG_1, repeat steps 3.1, 3.2, 3.3, 4.1 and 4.2 in sequence, and then return to step 5 to judge the epoch again. If the epoch equals 12, record the trained regional recommendation network RPN_1 together with the trained instance segmentation cascade network CASEG_1 as the network GCBAN, and proceed to step 6.
Step 6, evaluation method
Step 6.1, forward propagation
Using the network GCBAN obtained in step 5 and the test set D_test obtained in step 1 as input, obtain the detection result, denoted R, with the traditional forward propagation method of definition 16.

Taking the detection result R as input, remove the redundant boxes in R with the traditional non-maximum suppression method in definition 17, as follows:

Step (1): mark the highest-scoring box in the detection result R as BS.

Step (2): compute the overlap rate

IoU = Area(B ∩ BS) / Area(B ∪ BS)

between every remaining box B in R and BS, and discard the boxes with IoU > 0.5.

Step (3): select the highest-scoring box among the remaining boxes as the new BS, and repeat the computing-and-discarding process of step (2) until no box can be discarded; the boxes that remain constitute the final detection result, denoted R_F.
Step 6.2, index calculation
Using the detection result R_F obtained in step 6.1 as input, compute the precision P, the recall R and the precision-recall curve P(R) of the network with the traditional recall and precision calculation method in definition 18; then, using the formula

AP = ∫_0^1 P(R) dR

calculate the SAR ship instance segmentation average precision AP.

Claims (1)

1. A SAR ship instance segmentation method based on a global semantic boundary attention network is characterized by comprising the following steps:
step 1, initializing a data set
Obtaining the HRSID data set according to the traditional HRSID data set acquisition method, recording the test set in the HRSID data set as Dtest and the training set as Dtrain;
Step 2, building a forward propagation network
Step 2.1, building ResNet-101 backbone network
constructing a residual network with 101 network layers by adopting the classical residual backbone network construction method, recorded as Res-101;
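By way of non-limiting illustration, a 101-layer residual backbone can be instantiated from a standard library; the sketch below uses torchvision and strips the classification head, which is an assumption about how Res-101 is used here rather than the claimed construction.

```python
import torch
import torchvision

# Hedged sketch: 101-layer residual backbone with the classifier removed.
resnet = torchvision.models.resnet101(weights=None)
res101_backbone = torch.nn.Sequential(*list(resnet.children())[:-2])

feat = res101_backbone(torch.randn(1, 3, 800, 800))
print(feat.shape)  # torch.Size([1, 2048, 25, 25])
```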
step 2.2, building a regional recommendation network
Constructing a regional recommendation network by adopting a classical regional recommendation network construction method, taking the ResNet-101 backbone network Res-101 obtained in the step 2.1 as a sub-network in the regional recommendation network, and marking the constructed regional recommendation network as RPN0
Step 2.3, building a feature extraction module
constructing a feature extraction module by adopting the traditional region-of-interest feature extraction module construction method, recording the constructed feature extraction module as FExtract;
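Region-of-interest feature extraction of this kind is commonly realised with RoIAlign. The torchvision-based sketch below is illustrative only; the spatial_scale and the 7x7 output size are assumptions.

```python
import torch
from torchvision.ops import roi_align

feats = torch.randn(1, 256, 50, 50)                     # one feature map (assumed)
boxes = torch.tensor([[0, 10.0, 10.0, 200.0, 200.0]])   # (batch_idx, x1, y1, x2, y2)
roi_feats = roi_align(feats, boxes, output_size=(7, 7),
                      spatial_scale=1 / 16, sampling_ratio=2)
print(roi_feats.shape)  # torch.Size([1, 256, 7, 7])
```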
step 2.4, building a global context information modeling module
Firstly, constructing two convolution layers which are respectively marked as conv1 and conv2 by adopting a traditional convolution layer construction method, and then constructing a pixel recombination module which is marked as pixelshuffle by adopting a traditional pixel recombination construction method;
according to the expression

\( \mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}} \)

defining a Softmax layer, recorded as softmax0, wherein \(z_i\) denotes the i-th channel of the input feature map and C denotes the number of channels of the input feature map; connecting conv1, conv2, pixelshuffle and softmax0 in series, recorded as kplayer;
according to the expression

\( F'_{l'} = \sum_{n=-r}^{r} \sum_{m=-r}^{r} W_{l'}(n,m) \cdot F_{(i+n,\,j+m)} \)

constructing a feature reconstruction layer, recorded as czlayer; wherein \(F'_{l'}\) denotes the reconstructed feature vector at the target position l', l = (i, j) denotes the corresponding source position, \(F_{(i+n,\,j+m)}\) denotes the feature vector at (i + n, j + m) in F, and \(W_{l'}(n,m)\) denotes the weight at (n, m) in the reassembly kernel \(W_{l'}\); combining the kplayer and the czlayer together to complete the construction of the content-aware feature reassembly sub-network, recording the constructed content-aware feature reassembly sub-network as card;
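The feature reconstruction expression can then be sketched with unfold, gathering each k x k source neighbourhood and weighting it by the predicted kernel; the upscale factor of 2 matches the assumption in the previous sketch. Chaining the two sketches, kernel prediction followed by reassembly, mirrors the card sub-network's content-aware behaviour in outline.

```python
import torch
import torch.nn.functional as F

def feature_reassembly(features, kernels, k=5, scale=2):
    """Sketch of czlayer: weighted sum of each source k*k neighbourhood
    using predicted kernels of shape (B, k*k, scale*H, scale*W)."""
    B, C, H, W = features.shape
    patches = F.unfold(features, kernel_size=k, padding=k // 2)  # (B, C*k*k, H*W)
    patches = patches.view(B, C * k * k, H, W)
    # each target position (i', j') reads the neighbourhood of (i'//s, j'//s)
    patches = F.interpolate(patches, scale_factor=scale, mode="nearest")
    patches = patches.view(B, C, k * k, scale * H, scale * W)
    weights = kernels.unsqueeze(1)                               # (B, 1, k*k, sH, sW)
    return (patches * weights).sum(dim=2)                        # (B, C, sH, sW)
```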
then, constructing dilated convolution layers with dilation rates of 2, 3, 4 and 5 respectively by adopting the traditional dilated convolution layer construction method, recorded as d1, d2, d3 and d4 respectively; constructing a concatenation module by adopting the traditional concatenation operation construction method, recorded as concatee; constructing a convolution layer by adopting the traditional convolution layer construction method, recorded as conv3; connecting d1, d2, d3 and d4 in parallel and then connecting them in series with concatee and conv3 in sequence, namely completing the construction of the multi-receptive-field feature extraction sub-network, recording the constructed multi-receptive-field feature extraction sub-network as mrblock;
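A minimal sketch of the multi-receptive-field sub-network follows: four parallel 3x3 dilated convolutions with dilation rates 2, 3, 4 and 5, concatenation, and a fusing 1x1 convolution; the channel width of 256 and the 1x1 fusion kernel are assumptions.

```python
import torch
import torch.nn as nn

class MRBlock(nn.Module):
    """Sketch of mrblock: parallel dilated convolutions d1-d4,
    concatenation (concatee), and a fusion convolution conv3."""
    def __init__(self, c=256):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(c, c, kernel_size=3, padding=d, dilation=d)  # d1..d4
            for d in (2, 3, 4, 5)
        ])
        self.conv3 = nn.Conv2d(4 * c, c, kernel_size=1)            # fuse after concat

    def forward(self, x):
        return self.conv3(torch.cat([b(x) for b in self.branches], dim=1))

out = MRBlock()(torch.randn(1, 256, 32, 32))
print(out.shape)  # torch.Size([1, 256, 32, 32])
```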
finally, a traditional global feature self-attention construction method is adopted to construct a global feature self-attention sub-network, and the constructed global feature self-attention sub-network is marked as sablock;
connecting the content-aware feature reassembly sub-network card, the multi-receptive-field feature extraction sub-network mrblock and the global feature self-attention sub-network sablock in series in sequence, obtaining the global context information modeling module, recorded as GCB;
step 2.5, building a boundary attention prediction module
constructing three fully connected layers by adopting the traditional fully connected layer construction method, recorded as fc1, fc2 and fc3 respectively; connecting fc1, fc2 and fc3 in series to complete the construction of the classification branch, recorded as CLBranch, and recording the classification result output by CLBranch as s;
building a convolution attention module by adopting a traditional convolution attention module construction method, and recording the convolution attention module as CBAM; building four convolutional layers by adopting a traditional convolutional layer building method, wherein the convolutional layers are respectively marked as conv4, conv5, conv6 and conv7;
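For orientation, a standard minimal rendering of a convolutional block attention module (channel attention followed by spatial attention) is sketched below; the reduction ratio and 7x7 kernel are conventional defaults, not the patent's disclosed configuration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM sketch: channel attention from a shared MLP over
    avg/max-pooled descriptors, then spatial attention from a conv
    over channel-wise avg/max maps."""
    def __init__(self, c=256, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(c, c // reduction, 1), nn.ReLU(),
            nn.Conv2d(c // reduction, c, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True))
                           + self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        sa = torch.sigmoid(self.spatial(
            torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa
```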
according to the expression

\( \mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}} \)

two Softmax layers are defined, recorded as softmax1 and softmax2 respectively, wherein \(z_i\) denotes the i-th channel of the input feature map and C denotes the number of channels of the input feature map; connecting conv4, softmax1 and conv5 in series in sequence, recorded as branchx, and connecting conv6, softmax2 and conv7 in series in sequence, recorded as branchy; connecting branchx and branchy in parallel after the CBAM module, namely completing the construction of the boundary attention feature fusion sub-network, recorded as baff;
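The two parallel branches after CBAM can then be sketched as follows; the channel-wise softmax matches the expression above, while the summation of branchx and branchy is an assumed fusion rule, not a disclosed one.

```python
import torch
import torch.nn as nn

class BAFF(nn.Module):
    """Sketch of baff: CBAM followed by two parallel conv-softmax-conv
    branches whose outputs are summed (fusion rule assumed)."""
    def __init__(self, c=256):
        super().__init__()
        self.cbam = CBAM(c)  # CBAM sketch defined above
        self.branchx = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.Softmax(dim=1),
            nn.Conv2d(c, c, 3, padding=1))
        self.branchy = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.Softmax(dim=1),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):
        x = self.cbam(x)
        return self.branchx(x) + self.branchy(x)
```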
establishing a boundary coarse localization sub-network by adopting the traditional boundary coarse localization sub-network construction method, recording the established boundary coarse localization sub-network as bbcl, and recording the four outputs of bbcl as Sx-right, Sx-left, Sy-right and Sy-left respectively;
constructing a boundary fine localization sub-network by adopting the traditional boundary fine localization sub-network construction method, recording the constructed boundary fine localization sub-network as brfl;
taking the classification result s output by CLBranch in step 2.5 and the outputs Sx-right, Sx-left, Sy-right and Sy-left of bbcl as input, according to the formula

\( s' = s \times \frac{S_{x\text{-}left} + S_{x\text{-}right} + S_{y\text{-}left} + S_{y\text{-}right}}{4} \)

calculating the re-scored classification confidence s', thereby completing the construction of the boundary-guided classification re-scoring sub-network, recorded as cbcr;
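Under the averaging reconstruction above (itself an assumption), the re-scoring of cbcr reduces to a one-line function:

```python
def rescore(s, sx_left, sx_right, sy_left, sy_right):
    """Sketch of cbcr: modulate the classification score s by the mean
    of the four boundary confidences (averaging rule is an assumption)."""
    return s * (sx_left + sx_right + sy_left + sy_right) / 4.0
```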
serially connecting baff, bbcl, brfl and cbcr in sequence to complete the construction of a boundary attention prediction module, and marking as BABP;
step 2.6, building a mask subnetwork
Constructing a MASK sub-network according to a traditional MASK sub-network construction method, and marking the constructed MASK sub-network as MASK;
step 2.7, building an instance segmentation cascade network
connecting the feature extraction module FExtract obtained in step 2.3, the global context information modeling module GCB obtained in step 2.4, the boundary attention prediction module BABP obtained in step 2.5 and the mask sub-network MASK obtained in step 2.6 in series in sequence to obtain the first instance segmentation network, recorded as SEG1;
repeating step 2.3, step 2.4, step 2.5 and step 2.6, and connecting the modules or sub-networks obtained in each step in series in sequence to obtain the second instance segmentation network, recorded as SEG2;
repeating step 2.3, step 2.4, step 2.5 and step 2.6, and connecting the modules or sub-networks obtained in each step in series in sequence to obtain the third instance segmentation network, recorded as SEG3;
connecting the first instance segmentation network SEG1, the second instance segmentation network SEG2 and the third instance segmentation network SEG3 in series in sequence, namely completing the instance segmentation cascade network, recorded as CASEG0;
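The three-stage cascade can be sketched as iterative refinement, each stage consuming the boxes produced by the previous one; the stage interface here is hypothetical.

```python
import torch.nn as nn

class CascadeInstanceSeg(nn.Module):
    """Sketch of CASEG0: three stages (SEG1-SEG3) applied in sequence,
    each refining the boxes produced by the previous stage."""
    def __init__(self, stages):
        super().__init__()
        self.stages = nn.ModuleList(stages)  # [SEG1, SEG2, SEG3]

    def forward(self, feats, boxes):
        masks = None
        for stage in self.stages:            # each: FExtract -> GCB -> BABP -> MASK
            boxes, masks = stage(feats, boxes)
        return boxes, masks
```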
Step 3, training the regional recommendation network
Setting an iteration parameter epoch, and initializing an epoch value to be 1;
step 3.1, forward propagation is carried out on the regional recommendation network
taking the training set Dtrain obtained in step 1 as the input of the regional recommendation network RPN0; according to the forward propagation method, sending the training set Dtrain into the regional recommendation network RPN0 for operation, and recording the output of the network RPN0 as Result0;
step 3.2, sampling the forward propagation result
taking the Result0 obtained in step 3.1 and the training set Dtrain as input, according to the formula

\( \mathrm{IoU} = \frac{\mathrm{area}(B \cap GT)}{\mathrm{area}(B \cup GT)} \)

calculating the IoU value of each recommended box B in Result0 against its ground-truth box GT; taking the outputs in Result0 with IoU greater than 0.5 as positive samples, recorded as Result0p; taking the outputs in Result0 with IoU less than 0.5 as negative samples, recorded as Result0n; counting the total number of samples in the negative sample set Result0n as M; manually inputting the number of required negative samples, recorded as N; manually inputting the number of intervals into which the IoU range is equally divided, recorded as nb, and recording the number of samples in the i-th IoU interval as Mi; setting the random sampling probability of the i-th interval as

\( p_i = \frac{N}{n_b \cdot M_i} \)
Randomly sampling each IOU interval, and recording the sampling results of all the IOU intervals of the negative samples as Result0ns;
counting the number of samples in the positive sample Result0P, and recording as P; setting a random sampling probability of
Figure RE-RE-FDA0003863488410000033
Randomly sampling Result0p, and recording a positive sample sampling Result as Result0ps;
step 3.3, training and optimizing the regional recommendation network
taking the positive sample sampling result Result0ps and the negative sample sampling result Result0ns obtained in step 3.2 as input, and training and optimizing the regional recommendation network according to the classical Adam algorithm; obtaining the trained and optimized regional recommendation network RPN1;
Step 4, training the instance segmentation cascade network
Step 4.1, forward propagation through the instance segmentation cascade network
taking the training set Dtrain obtained in step 1 as the input of the instance segmentation cascade network CASEG0; according to the traditional forward propagation method, sending the training set Dtrain into the instance segmentation cascade network CASEG0 for operation, and recording the output of CASEG0 as Result1;
step 4.2, training and optimizing the instance segmentation cascade network
taking the output Result1 of the instance segmentation cascade network CASEG0 obtained in step 4.1 as input, and training and optimizing the instance segmentation cascade network according to the classical Adam algorithm; obtaining the trained and optimized instance segmentation cascade network CASEG1;
Step 5, alternate training is carried out
judging whether the epoch set in step 3 is equal to 12; if epoch is not equal to 12, letting epoch = epoch + 1, RPN0 = RPN1, CASEG0 = CASEG1, repeating step 3.1, step 3.2, step 3.3, step 4.1 and step 4.2 in sequence, and then returning to step 5 to judge epoch again; if epoch is equal to 12, recording the trained regional recommendation network RPN1 and the trained instance segmentation cascade network CASEG1 as the network GCBAN, and then performing step 6;
step 6, evaluation method
Step 6.1, forward propagation
using the network GCBAN obtained in step 5 and the test set Dtest obtained in step 1 as input, adopting the traditional forward propagation method to obtain the detection result, recorded as R;
taking the detection result R as input, and removing redundant boxes in R by adopting the traditional non-maximum suppression method, specifically comprising the following steps:
step (1): recording the box with the highest score in the detection result R as BS;
step (2): for each remaining box B of the detection result R, adopting the calculation formula

\( \mathrm{IoU} = \frac{\mathrm{area}(BS \cap B)}{\mathrm{area}(BS \cup B)} \)

to calculate the overlap ratio IoU between BS and each box, and discarding the boxes with IoU > 0.5;
step (3): selecting the box with the highest score from the remaining boxes as the new BS;
repeating the IoU calculation and discarding process of step (2) until no box can be discarded, and recording the last remaining boxes as the final detection result RF;
Step 6.2, calculating the index
using the detection result RF obtained in step 6.1 as input, solving the precision P, the recall R and the precision-recall curve P(R) of the network by adopting the traditional recall and precision calculation method; using the formula

\( \mathrm{AP} = \int_0^1 P(R)\,\mathrm{d}R \)

calculating the balance-learning-based SAR ship instance segmentation precision indexes AP, AP50, AP75, APS, APM and APL.
CN202210472909.1A 2022-04-29 2022-04-29 SAR image ship instance segmentation method based on global semantic boundary attention network Pending CN115272842A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210472909.1A CN115272842A (en) 2022-04-29 2022-04-29 SAR image ship instance segmentation method based on global semantic boundary attention network


Publications (1)

Publication Number Publication Date
CN115272842A true CN115272842A (en) 2022-11-01

Family

ID=83760373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210472909.1A Pending CN115272842A (en) 2022-04-29 2022-04-29 SAR image ship instance segmentation method based on global semantic boundary attention network

Country Status (1)

Country Link
CN (1) CN115272842A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402999A (en) * 2023-06-05 2023-07-07 电子科技大学 SAR (synthetic aperture radar) instance segmentation method combining quantum random number and deep learning
CN116402999B (en) * 2023-06-05 2023-09-15 电子科技大学 SAR (synthetic aperture radar) instance segmentation method combining quantum random number and deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination