CN113989672A - SAR image ship detection method based on balance learning

Info

Publication number
CN113989672A
Authority
CN
China
Prior art keywords
network
data
result
scene
picture
Prior art date
Legal status
Granted
Application number
CN202111268008.2A
Other languages
Chinese (zh)
Other versions
CN113989672B (en)
Inventor
张晓玲
柯潇
张天文
师君
韦顺军
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202111268008.2A
Publication of CN113989672A
Application granted
Publication of CN113989672B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions, with fixed number of clusters, e.g. K-means clustering
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods


Abstract

The invention discloses a SAR image ship detection method based on balance learning. Built on deep learning theory, the method mainly comprises a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification regression network. The balanced scene learning mechanism solves the image-sample scene imbalance by augmenting the inshore samples; the balanced interval sampling mechanism solves the positive-negative sample imbalance by dividing the IoU range into several intervals and sampling an equal number of samples from each interval; the balanced feature pyramid network solves the ship scale-feature imbalance by extracting features with stronger multi-scale detection capability through feature enhancement; the balanced classification regression network solves the classification-regression task imbalance by designing two different sub-networks for the classification and regression tasks. The method has the advantages of overcoming the imbalance problems of the prior art and improving the detection precision of ships in SAR images.

Description

SAR image ship detection method based on balance learning
Technical Field
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation, and relates to a SAR image ship detection method based on balance learning.
Background
Synthetic Aperture Radar (SAR) is an advanced active microwave sensor for high-resolution earth observation and remains a leading technology in the field of ocean monitoring. It is widely applied in military and civil fields such as marine traffic control, disaster relief, and fishery management. Although optical and hyperspectral satellites provide some monitoring services, SAR, with its all-day, all-weather working capability, is better suited to the climatically changeable ocean. SAR is therefore an indispensable remote sensing tool for maritime situational awareness.
Ships are the most important participants in the ocean. Owing to their great value in sunken-ship rescue, marine traffic control, fishery management, and other applications, they have received increasing attention from scholars. Research on marine vessel surveillance has developed vigorously since the launch of the first SAR satellite, the United States' Seasat-1. Moreover, the data volume generated by the various SAR sensors now in service is large, and intelligent detection of marine targets is urgently needed. SAR ship detection has therefore become a research hotspot at the frontier of high-resolution earth observation. See "Wang Zaiyong, Chong Hao, Tian Jin. Research on rapid detection of ship targets in SAR images [J]. Ship Electronic Engineering, 2016, 36(09):27-30+88."
In recent years, with the rapid rise of deep learning (DL), many scholars in the SAR community have begun to research DL-based detection methods. Compared with traditional feature-based methods, DL-based methods offer the outstanding advantages of simplicity, full automation (i.e., no complex preliminary stages such as land-sea segmentation, coastline detection, and speckle correction), high speed, and high accuracy. Although its underlying principles are not yet fully understood, DL can liberate productivity and greatly improve work efficiency, enabling a qualitative leap in the intelligent interpretation of SAR images. See "Du Lan, Wang Zhaocheng, Wang Yan, Wei Di, Li Lu. Survey of research progress on target detection and discrimination of single-channel SAR images in complex scenes [J]. Journal of Radars, 2020, 9(01):34-54."
However, existing deep-learning-based SAR ship detectors suffer from several imbalance problems that potentially prevent further accuracy improvements. Specifically: 1) Image-sample scene imbalance, i.e., the numbers of inshore and offshore image samples are unbalanced; inshore ships have far fewer samples than offshore ships. 2) Positive-negative sample imbalance, i.e., the number of positive samples (ships) is unbalanced against the number of negative samples (background); negative samples far outnumber positive ones. 3) Ship scale-feature imbalance, i.e., multi-scale ship features are unbalanced; owing to different spatial resolutions and ship classes, ship sizes vary widely. 4) Classification-regression task imbalance, i.e., the difficulties of ship classification and of ship position regression are unbalanced, the latter being much harder than the former.
Therefore, to solve the above imbalance problems, a SAR image ship detection method based on balance learning is proposed. The method comprises four mechanisms for solving the imbalance problems: a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification regression network. Experimental results on the SSDD data set show that the proposed method is superior to other deep-learning-based detection methods.
Disclosure of Invention
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation and discloses a ship detection method based on balance learning, which solves the problems of image-sample scene imbalance, positive-negative sample imbalance, ship scale-feature imbalance, and classification-regression task imbalance in the prior art. The method is based on deep learning theory and mainly comprises a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification regression network. The balanced scene learning mechanism solves the image-sample scene imbalance by augmenting the inshore ship samples; the balanced interval sampling mechanism solves the positive-negative sample imbalance by dividing the IoU range into several intervals and sampling an equal number of samples from each interval; the balanced feature pyramid network solves the ship scale-feature imbalance by extracting features with stronger multi-scale detection capability through feature enhancement; the balanced classification regression network solves the classification-regression task imbalance by designing two different sub-networks for the classification and regression tasks. Experiments show that on the SSDD data set the detection precision of the SAR image ship detection method based on balance learning is 95.25%, against 92.27% for the existing deep-learning-based SAR ship detection methods, so the method improves ship detection precision.
For the convenience of describing the present invention, the following terms are first defined:
definition 1: SSDD data set acquisition method
The SSDD data set is a SAR Ship Detection data set (full English name: SAR Ship Detection Dataset); SSDD was the first open SAR ship detection data set. It contains 1160 SAR images from the Sentinel-1, RadarSat-2, and TerraSAR-X sensors, with a resolution of 500 × 500 pixels. SSDD contains 2551 ships in total; the smallest occupies 28 pixels² and the largest 62878 pixels² (pixels² is the product of the width and the height in pixels). In SSDD, the images whose index suffix is 1 or 9 (232 samples) are chosen as the test set and the rest as the training set (928 samples). The acquisition of the SSDD data set is detailed in "Li Jianwei, Qu Changwen, Peng Shujuan, Deng Bing, et al. Ship target detection in SAR images based on convolutional neural networks [J]. Systems Engineering and Electronics, 2018, 40(09):1953-."
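As an illustration of the split convention above, a minimal Python sketch follows; the directory layout and the file-naming pattern (names such as "000123.jpg" whose last index digit decides the subset) are assumptions, not part of the original SSDD release.

    import os

    def split_ssdd(image_dir):
        # Images whose index ends in 1 or 9 form the test set, the rest train.
        train, test = [], []
        for name in sorted(os.listdir(image_dir)):
            stem, ext = os.path.splitext(name)
            if ext.lower() not in {".jpg", ".png"}:
                continue
            (test if stem[-1] in {"1", "9"} else train).append(name)
        return train, test

    train, test = split_ssdd("SSDD/JPEGImages")  # hypothetical directory
    print(len(train), "training /", len(test), "test images")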
Definition 2: classic GAN network construction method
The classic generative adversarial network (GAN) is a deep learning model and one of the most promising methods for unsupervised learning on complex distributions in recent years. The model produces good output through the mutual game learning of its two modules: the generative model and the discriminative model. The original GAN theory does not require G and D to be neural networks, only functions able to fit generation and discrimination respectively; in practice, deep neural networks are generally used as G and D. A well-trained GAN network can achieve rapid scene feature extraction. The classic GAN network construction method is detailed in "I. J. Goodfellow et al. Generative adversarial networks. International Conference on Neural Information Processing Systems, pp. 2672-2680, 2014."
Definition 3: classic K-means clustering algorithm
The classic K-means clustering algorithm is an iterative cluster-analysis algorithm, often used for unsupervised classification tasks. Its steps are: pre-specify that the data will be divided into K groups, randomly select K objects as the initial cluster centers, then compute the distance between each object and each cluster center and assign each object to the nearest center. A cluster center together with the objects assigned to it represents a cluster. After every assignment pass, the center of each cluster is recalculated from the objects currently in it. This process repeats until some termination condition is met. The classic K-means clustering algorithm is detailed in "Li Ting. Research on improved K-means clustering algorithm [D]. Anhui University, 2015."
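A minimal NumPy sketch of the two-cluster K-means procedure just described (random centroid initialization, nearest-centroid assignment, centroid update, convergence test); the Euclidean metric and the empty-cluster guard are standard choices, not taken from the cited reference.

    import numpy as np

    def kmeans_two_clusters(features, max_iter=1000, eps=1e-6, seed=0):
        # Cluster feature vectors into two groups with plain K-means.
        rng = np.random.default_rng(seed)
        centroids = features[rng.choice(len(features), size=2, replace=False)]
        labels = np.zeros(len(features), dtype=int)
        for _ in range(max_iter):
            # Assign each feature to the nearest centroid (Euclidean distance).
            d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            new_centroids = centroids.copy()
            for k in (0, 1):
                members = features[labels == k]
                if len(members):  # keep the old centroid if a cluster empties
                    new_centroids[k] = members.mean(axis=0)
            # Stop when the total centroid movement falls below the tolerance.
            if np.linalg.norm(new_centroids - centroids) < eps:
                break
            centroids = new_centroids
        return labels, centroids

    pts = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 5.0])
    labels, _ = kmeans_two_clusters(pts)
    print(np.bincount(labels))  # roughly [20 20]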
Definition 4: classical Adam algorithm
The classic Adam algorithm is an extension of stochastic gradient descent and has recently been widely adopted in deep learning applications in computer vision and natural language processing. Adam differs from classic stochastic gradient descent: stochastic gradient descent maintains a single learning rate for all weight updates, and the rate does not change during training, whereas Adam maintains a learning rate for each network weight and adapts it individually as learning progresses, computing adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. The classic Adam algorithm is detailed in "Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980."
Definition 5: classical forward propagation method
The forward propagation method is the most basic method in deep learning: it performs forward inference on the input according to the parameters and connections of the network to obtain the network output. The forward propagation method is detailed in "https://www.jianshu.com/p/f30c8daebebeb".
Definition 6: classic residual error network construction method
The residual network is a convolutional neural network proposed by four scholars from Microsoft Research; it won the image classification and object detection tasks of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The residual network is easy to optimize and can improve accuracy by adding considerable depth. Its internal residual blocks use skip connections, which alleviate the vanishing-gradient problem caused by increasing depth in deep neural networks. The classic residual network construction method is detailed in "K. He et al. Deep Residual Learning for Image Recognition. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770-778."
Definition 7: conventional convolution kernel operation
A convolution kernel is a node that weights and then sums the values within a small rectangular region of the input feature map or picture as its output. Each convolution kernel requires several manually specified parameters. One kind is the length and width of the node matrix processed by the kernel, which is also the size of the kernel; the other is the depth of the unit node matrix obtained after processing, which is also the depth of the kernel. During the convolution operation, each kernel slides over the input data, the inner product of the kernel with the corresponding positions of the input is computed and passed through a nonlinear function, and the results at all positions form a two-dimensional feature map. Each convolution kernel produces one two-dimensional feature map, and the feature maps produced by multiple kernels are stacked to form a three-dimensional feature map. The traditional convolution kernel operation is detailed in "Fan Li, Zhao Hongwei, Zhao Yu, Hu Huangshui, Wang Xin. Survey of object detection research based on deep convolutional neural networks [J]. Optics and Precision Engineering, 2020, 28(05):1152-1164."
Definition 8: conventional cascading operation
Cascading is an important operation in network structure design. It is used to combine features: fusing the features extracted by multiple convolutional feature-extraction branches, or fusing the information of output layers, thereby enhancing the feature extraction capability of the network. The cascade method is detailed in "https://blog.csdn.net/alxe_map/article/details/80506051".
Definition 9: conventional upsampling operations
Upsampling is an operation that enlarges a picture or feature map, usually by interpolation: new elements are inserted between pixels on the basis of the original image pixels using a suitable interpolation algorithm. Among mainstream interpolation algorithms, nearest-neighbour interpolation is simple and easy to implement and was commonly used early on, but it produces noticeable jagged edges and mosaics in the new image. Bilinear interpolation has a smoothing effect and effectively overcomes the defects of the nearest-neighbour method, but it degrades the high-frequency parts of the image and blurs details. At higher magnification factors, higher-order interpolation, such as bicubic and cubic-spline interpolation, works better than low-order interpolation: the interpolated gray values continue the continuity of the gray-level variation of the original image, so the gray levels of the enlarged image vary naturally and smoothly. However, some pixels in an image show abrupt gray-value changes with respect to their neighbours, i.e., gray-level discontinuities; these pixels are the edge pixels that describe object contours or textures. The classic upsampling operation is detailed in "https://blog.csdn.net/weixin_43960370/article/details/106049708".
Definition 10: conventional pooling operations
Pooling is a very common operation in CNNs. The pooling layer mimics the human visual system by reducing the dimensionality of the data; pooling is also commonly called subsampling or downsampling. When building a convolutional neural network, a pooling layer is often used after a convolutional layer to reduce the feature dimension of the convolutional output, which effectively reduces the number of network parameters and prevents overfitting. Classic pooling is detailed in "https://www.zhihu.com/question/303215483/answer/615115629".
Definition 11: Traditional region proposal network construction method
The region proposal network (RPN) is a sub-network in Faster R-CNN for extracting the regions of a picture where targets may exist. It is a fully convolutional network that takes the convolutional feature map output by the backbone network as input and outputs a target confidence score for each candidate box. The traditional region proposal network construction method is detailed in "Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149."
Definition 12: conventional full link layer approach
The fully connected layer is a component of convolutional neural networks. Its input and output sizes are fixed, and each node is connected to all nodes of the previous layer, integrating the features extracted earlier. The fully connected layer method is detailed in "Haoren Wang, Haotian Shi, Ke Lin, Chengjin Qin, Liqun Zhao, Yixiang Huang, Chengliang Liu. A high-precision arrhythmia classification method based on dual fully connected neural network [J]. Biomedical Signal Processing and Control, 2020, 58."
Definition 13: conventional non-maxima suppression method
The non-maximum suppression method is an algorithm used in object detection to remove redundant detection boxes. In the forward-propagation results of a typical detection network, the same target often corresponds to multiple detection boxes, so an algorithm is needed to select, from the multiple boxes of one target, the box with the best quality and highest score. Non-maximum suppression performs a local-maximum search by computing an overlap (IoU) threshold. Non-maximum suppression methods are detailed in "https://www.cnblogs.com/makefile/p/nms."
Definition 14: traditional recall ratio and accuracy calculation method
Recall R is the proportion of all positive samples that are correctly predicted:

R = TP / (TP + FN)

Precision P is the proportion of results predicted as positive that are correct:

P = TP / (TP + FP)

where TP (true positive) denotes a positive sample predicted as positive by the model, FN (false negative) denotes a positive sample predicted as negative, and FP (false positive) denotes a negative sample predicted as positive. The precision-recall curve P(R) is the function with R as the independent variable and P as the dependent variable. The method for computing these quantities is given in "Li Hang. Statistical Learning Methods [M]. Beijing: Tsinghua University Press, 2012."
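A small sketch of the precision and recall computation above, assuming detections have already been matched against ground truth so that TP, FP, and FN counts are available.

    def precision_recall(tp, fp, fn):
        # P = TP / (TP + FP); R = TP / (TP + FN).
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        return p, r

    # Example: 90 correct detections, 10 false alarms, 5 missed ships.
    print(precision_recall(90, 10, 5))  # (0.9, 0.947...)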
The invention discloses a ship detection method based on balance learning, which comprises the following steps:
step 1, initializing SSDD data set
Randomly shuffle the order of the SAR images in the SSDD data set to obtain a new SSDD data set.
Step 2, carrying out scene augmentation by utilizing a balanced scene learning mechanism
Step 2.1, extracting SSDD data set characteristics by using GAN network
Build the generative adversarial network GAN_0 using the classic GAN network construction method in definition 2. Taking the new SSDD data obtained in step 1 as input, train and optimize GAN_0 with the classic Adam algorithm in definition 4 to obtain the trained and optimized generative adversarial network, denoted GAN.

Then, taking the new SSDD data obtained in step 1 as input again, forward-propagate it through the trained and optimized GAN according to the conventional forward propagation method in definition 5, obtaining the network output vectors M = {M_1, M_2, …, M_i, …, M_1160}, where M_i is the output vector of the i-th picture in the new SSDD data.

Define the output vectors M as the scene features of all pictures in the new SSDD data set, and M_i as the scene feature of the i-th picture in the new SSDD data set.
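The patent does not state which layer of the trained GAN supplies the scene feature M_i. The following PyTorch sketch assumes, purely for illustration, that the flattened activations of the discriminator's last hidden layer are used; the architecture and the 512-dimensional feature size are likewise assumptions.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        # Tiny DCGAN-style discriminator; the flattened last hidden layer is
        # taken as the scene feature M_i (an assumption, the patent does not
        # name the exact layer).
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),  # -> 32*4*4 = 512-dim
            )
            self.classifier = nn.Linear(512, 1)  # real/fake head for GAN training

        def forward(self, x):
            return self.classifier(self.features(x))

    @torch.no_grad()
    def scene_features(disc, images):
        # Forward-propagate SAR images and return the vectors M = {M_i}.
        disc.eval()
        return disc.features(images)

    disc = Discriminator()  # in practice trained first with the Adam algorithm
    M = scene_features(disc, torch.randn(8, 1, 500, 500))  # 8 dummy SAR chips
    print(M.shape)  # torch.Size([8, 512])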
Step 2.2, clustering scenes
Taking the set M of scene features of all pictures in the new SSDD data obtained in step 2.1 as input, cluster the pictures in the new SSDD data set by their scene features with the traditional K-means clustering algorithm in definition 3, as follows:
step 2.3, initializing parameters
For the centroid parameters of the traditional K-means clustering algorithm in definition 3, randomly initialize the two centroids of the first iteration, denoted c_1^(1) and c_2^(1).

Define the current iteration number as t, t = 1, 2, …, and let I be the maximum number of iterations of the K-means algorithm, initialized to I = 1000. Denote the centroid parameters of the t-th iteration c_1^(t) and c_2^(t).

Initialize an iteration convergence tolerance ε as one of the convergence conditions of the algorithm.
Step 2.4, carrying out iterative operation
First, compute the distance from the scene feature M_i of the i-th picture to the first centroid of iteration 1,

d_i,1^(1) = ||M_i - c_1^(1)||,

and the distance to the second centroid,

d_i,2^(1) = ||M_i - c_2^(1)||.

Compare d_i,1^(1) and d_i,2^(1). If d_i,1^(1) > d_i,2^(1), the scene feature M_i of the i-th picture is assigned to the second class in iteration 1; otherwise it is assigned to the first class.

Define: after iteration 1, the set of all scene features in the first class is S_1^(1) and the set of all scene features in the second class is S_2^(1).
Then let t = 2 and repeat the following until convergence:

1) Set the centroid c_1^(t) of step t to the arithmetic mean of the set S_1^(t-1), and the centroid c_2^(t) to the arithmetic mean of the set S_2^(t-1).

2) Compute the distance from the scene feature M_i of the i-th picture to the first centroid of iteration t,

d_i,1^(t) = ||M_i - c_1^(t)||,

and the distance to the second centroid,

d_i,2^(t) = ||M_i - c_2^(t)||.

3) Compare d_i,1^(t) and d_i,2^(t). If d_i,1^(t) > d_i,2^(t), assign M_i to the second class in iteration t; otherwise assign it to the first class. Define: after iteration t, the set of all scene features in the first class is S_1^(t) and the set in the second class is S_2^(t). Output the clustering result, denoted CLASS.

4) Compute the change of the centroid parameters between this iteration and the previous one,

σ = ||c_1^(t) - c_1^(t-1)|| + ||c_2^(t) - c_2^(t-1)||.

If σ < ε or t ≥ I, output the clustering result CLASS; otherwise set t = t + 1 and return to step 1) to continue iterating.
Step 2.5, carrying out scene amplification
According to the clustering result CLASS obtained in step 2.4, divide all pictures in the new SSDD data into two classes: the first class, inshore scene pictures, denoted Data_1, and the second class, offshore scene pictures, denoted Data_2. Define the number of pictures in Data_1 as N_1 and the number in Data_2 as N_2.

If N_2 > N_1, randomly select N_2 - N_1 pictures from the inshore scene pictures Data_1 based on a Gaussian distribution and apply the traditional mirror operation to them, obtaining the N_2 - N_1 mirrored pictures, denoted Data_1extra. Then merge Data_1extra with the inshore scene pictures Data_1 and output the new picture set, denoted Data_1new. Define Data_2new = Data_2.

If N_2 <= N_1, randomly select N_1 - N_2 pictures from the offshore scene pictures Data_2 based on a Gaussian distribution and apply the traditional mirror operation, obtaining the N_1 - N_2 mirrored pictures, denoted Data_2extra. Then merge Data_2extra with the offshore scene pictures Data_2 and output the new picture set, denoted Data_2new. Define Data_1new = Data_1.

Define the new picture set Data_new = {Data_1new, Data_2new}.

Divide Data_new into two parts at a ratio of 7:3 to obtain a training set and a test set, denoted Train and Test respectively.
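A sketch of the scene amplification rule of step 2.5: mirror randomly chosen pictures of the minority scene class until the two classes have equal size. Horizontal flipping and uniform selection with replacement are simplifying assumptions; the patent draws the pictures based on a Gaussian distribution.

    import numpy as np

    def balance_scenes(inshore, offshore, seed=0):
        # inshore / offshore: lists of SAR chips as HxW numpy arrays.
        rng = np.random.default_rng(seed)
        small, large = sorted([inshore, offshore], key=len)
        n_extra = len(large) - len(small)
        # The patent draws pictures based on a Gaussian distribution; uniform
        # sampling with replacement is used here for brevity (an assumption).
        idx = rng.choice(len(small), size=n_extra, replace=True)
        extra = [np.fliplr(small[i]) for i in idx]  # mirrored copies
        return small + extra, large  # minority class now matches the majority

    inshore = [np.zeros((8, 8))] * 3    # toy stand-ins for inshore chips
    offshore = [np.ones((8, 8))] * 10
    a, b = balance_scenes(inshore, offshore)
    print(len(a), len(b))  # 10 10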
Step 3, building a forward propagation network
Step 3.1, building a balanced feature pyramid network
Using the classic residual network construction method in definition 6, build a residual network with 50 layers, denoted Res-50. Denote the feature maps of different sizes produced by the last layer of each stage of Res-50, from the largest feature-map size to the smallest, as F_1, F_2, F_3, F_4, F_5.

Record F_5 directly as P_5.

Using the conventional convolution kernel operation in definition 7, apply a 1 × 1 convolution to F_4 for feature extraction and denote the result E_4. Using the conventional upsampling operation in definition 9, upsample P_5 to the feature size of F_4 and denote the result U_5. Using the conventional cascade operation in definition 8, superpose E_4 and U_5 and denote the result P_4.

Likewise, apply a 1 × 1 convolution to F_3 to obtain E_3; upsample P_4 to the feature size of F_3 to obtain U_4; superpose E_3 and U_4 to obtain P_3.

Apply a 1 × 1 convolution to F_2 to obtain E_2; upsample P_3 to the feature size of F_2 to obtain U_3; superpose E_2 and U_3 to obtain P_2.

Apply a 1 × 1 convolution to F_1 to obtain E_1; upsample P_2 to the feature size of F_1 to obtain U_2; superpose E_1 and U_2 to obtain P_1.

Using the conventional upsampling operation in definition 9, upsample P_5 to the feature size of P_3 and denote the result H_5; upsample P_4 to the feature size of P_3 and denote the result H_4.

Record P_3 directly as H_3.

Using the conventional pooling operation in definition 10, max-pool P_2 down to the feature size of P_3 and denote the result H_2; max-pool P_1 down to the feature size of P_3 and denote the result H_1.
For H_1, H_2, H_3, H_4, H_5, compute the integrated feature map I by the formula

I(i, j) = (1/5) · Σ_{k=1}^{5} H_k(i, j),

where k is the index of H and (i, j) is the spatial sampling position on the feature map.

Taking the feature map I as input, compute the feature map O by the formula

O_i = (1/C(I)) · Σ_j f(I_i, I_j) g(I_j),

where I_i is the feature at the i-th position of feature map I; O_i is the feature at the i-th position of feature map O; C(I) = Σ_j f(I_i, I_j) is a normalization factor; and f(I_i, I_j) is the function computing the similarity between I_i and I_j, expressed as

f(I_i, I_j) = exp(θ(I_i)^T φ(I_j)),

where θ(I_i) = W_θ I_i, φ(I_j) = W_φ I_j, and g(I_j) = W_g I_j; W_θ, W_φ, and W_g are matrices learned by the 1 × 1 convolution operation in definition 7.
After all the network operations of step 3.1 are completed, the balanced feature pyramid network is obtained, denoted Backbone.
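A condensed PyTorch sketch of the balancing stage of step 3.1: the five pyramid levels are resized to the size of P3, averaged into the integrated map I, and refined with an embedded-Gaussian non-local block to give O. The channel count and spatial sizes are illustrative, and nearest-neighbour resizing stands in for the separate upsampling and max-pooling operations of the text.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NonLocalRefine(nn.Module):
        # Embedded-Gaussian non-local block:
        # O_i = (1/C(I)) * sum_j f(I_i, I_j) g(I_j), with f = exp(theta^T phi).
        def __init__(self, c):
            super().__init__()
            self.theta = nn.Conv2d(c, c, 1)  # W_theta
            self.phi = nn.Conv2d(c, c, 1)    # W_phi
            self.g = nn.Conv2d(c, c, 1)      # W_g

        def forward(self, x):
            b, c, h, w = x.shape
            t = self.theta(x).flatten(2).transpose(1, 2)  # B x HW x C
            p = self.phi(x).flatten(2)                    # B x C x HW
            g = self.g(x).flatten(2).transpose(1, 2)      # B x HW x C
            attn = torch.softmax(t @ p, dim=-1)           # softmax = exp()/C(I)
            return (attn @ g).transpose(1, 2).reshape(b, c, h, w)

    def balanced_pyramid(levels):
        # levels: [P1, ..., P5]; resize all to P3's size, average into I, refine.
        size = levels[2].shape[-2:]
        resized = [l if l.shape[-2:] == size
                   else F.interpolate(l, size=size, mode="nearest")
                   for l in levels]
        integrated = torch.stack(resized).mean(0)  # I = (1/5) sum_k H_k
        return NonLocalRefine(integrated.shape[1])(integrated)

    levels = [torch.randn(1, 256, s, s) for s in (64, 32, 16, 8, 4)]  # toy P1..P5
    print(balanced_pyramid(levels).shape)  # torch.Size([1, 256, 16, 16])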
Step 3.2, building a regional recommendation network
Using the traditional region proposal network construction method in definition 11, take the Backbone obtained in step 3.1 as the feature extraction layer and build a region proposal network, denoted RPN_0.
Step 3.3, building a balance classification regression network
Using the traditional fully connected layer method in definition 12, build fully connected layers FC1 and FC2, with the output of FC1 as the input of FC2; FC1 and FC2 together form the classification head, denoted Clhead.

Using the traditional convolution kernel method in definition 7, build four convolutional layers, Conv1, Conv2, Conv3, and Conv4; meanwhile, build a pooling layer using the traditional pooling operation in definition 10, denoted Pooling. Take the output of Conv1 as the input of Conv2, the output of Conv2 as the input of Conv3, the output of Conv3 as the input of Conv4, and the output of Conv4 as the input of Pooling. Conv1, Conv2, Conv3, Conv4, and Pooling form the regression head, denoted Rehead. The classification head Clhead and the regression head Rehead take the same feature map as input and, together with the Backbone, form the balanced classification regression network, denoted BCRN_0.
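A PyTorch sketch of the two heads of step 3.3: a two-layer fully connected classification head and a four-convolution regression head ending in pooling, both reading the same RoI feature map. The channel width, the 7 × 7 RoI size, the hidden width 1024, and the final linear layer producing four box offsets are assumptions where the text is silent.

    import torch
    import torch.nn as nn

    class ClHead(nn.Module):
        # Classification head: FC1 -> FC2 -> class scores (ship vs background).
        def __init__(self, c=256, roi=7, num_classes=2):
            super().__init__()
            self.fc1 = nn.Linear(c * roi * roi, 1024)
            self.fc2 = nn.Linear(1024, num_classes)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x.flatten(1))))

    class ReHead(nn.Module):
        # Regression head: Conv1..Conv4 -> Pooling -> 4 box offsets.
        def __init__(self, c=256):
            super().__init__()
            self.convs = nn.Sequential(*[
                nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU())
                for _ in range(4)])
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.out = nn.Linear(c, 4)

        def forward(self, x):
            return self.out(self.pool(self.convs(x)).flatten(1))

    roi_feats = torch.randn(16, 256, 7, 7)  # 16 RoI features from the Backbone
    print(ClHead()(roi_feats).shape, ReHead()(roi_feats).shape)  # [16,2] [16,4]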
Step 4, training the region proposal network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 4.1, forward propagation through the region proposal network
Take the training set Train of the augmented data set Data_new obtained in step 2 as the input of the region proposal network RPN_0. Using the conventional forward propagation method in definition 5, feed the training set Train into RPN_0 for computation and record the output of RPN_0 as Result0.
Step 4.2, carrying out balance interval sampling on the forward propagation result
Taking the Result0 obtained in step 4.1 and the training set Train as input, compute the IoU value of each proposal box in Result0 by the formula

IoU = area(B ∩ B_gt) / area(B ∪ B_gt),

where B is a proposal box and B_gt the matched ground-truth box. Take the proposals in Result0 with IoU greater than 0.5 as positive samples, denoted Result0p, and the proposals with IoU less than 0.5 as negative samples, denoted Result0n. Count the total number of samples in the negative samples Result0n as M. Manually input the number of required negative samples, denoted N, and the number of equal intervals into which the IoU range is divided, denoted n_b; let M_i be the number of samples in the i-th IoU interval. Set the random sampling probability of the i-th interval to

p_i = N / (n_b · M_i),

randomly sample each IoU interval accordingly, and record the sampling results over all IoU intervals of the negative samples as Result0ns.

Count the number of samples in the positive samples Result0p as P. Set the random sampling probability to

p = N / P,

randomly sample Result0p, and record the positive sampling result as Result0ps.
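A NumPy sketch of the balanced interval sampling of step 4.2: negatives are binned into n_b equal IoU intervals and each bin is sampled with probability p_i = N/(n_b · M_i), so that roughly N/n_b negatives are drawn from every interval.

    import numpy as np

    def balanced_interval_sampling(ious, n_neg, n_bins=3, seed=0):
        # ious: IoU of every negative proposal (all below 0.5).
        rng = np.random.default_rng(seed)
        edges = np.linspace(0.0, 0.5, n_bins + 1)
        picked = []
        for b in range(n_bins):
            in_bin = np.flatnonzero((ious >= edges[b]) & (ious < edges[b + 1]))
            if len(in_bin) == 0:
                continue
            # p_i = N / (n_b * M_i): about N/n_b samples expected per interval.
            p_i = min(1.0, n_neg / (n_bins * len(in_bin)))
            picked.extend(in_bin[rng.random(len(in_bin)) < p_i])
        return np.asarray(picked)

    ious = np.random.default_rng(1).beta(1, 8, size=2000) * 0.5  # skewed negatives
    idx = balanced_interval_sampling(ious, n_neg=256)
    print(len(idx), "negatives drawn across the IoU intervals")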
Step 4.3, training and optimizing the region proposal network
Taking the positive samples Result0ps and the negative samples Result0ns obtained in step 4.2 as input, train and optimize the region proposal network with the classic Adam algorithm in definition 4, obtaining the trained and optimized region proposal network RPN1.
Step 5, training the balance classification regression network
Step 5.1, forward propagation is carried out on the balance classification regression network
Take the training set Train of the augmented data set Data_new obtained in step 2 as the input of the balanced classification regression network BCRN_0. Using the traditional forward propagation method in definition 5, feed the training set Train into BCRN_0 for computation and record the output of BCRN_0 as Result1.
Step 5.2, training and optimizing the balance classification regression network
Taking the output Result1 of the balanced classification regression network BCRN_0 obtained in step 5.1 as input, train and optimize the balanced classification regression network with the classic Adam algorithm in definition 4, obtaining the trained and optimized network BCRN1.
Step 6, alternate training is carried out
Determine whether the epoch set in step 4 equals 12. If epoch ≠ 12, let epoch = epoch + 1, RPN_0 = RPN1, BCRN_0 = BCRN1, repeat steps 4.1, 4.2, 4.3, 5.1, and 5.2 in order, and then return to step 6 to judge epoch again. If epoch = 12, denote the trained region proposal network RPN1 together with the trained balanced classification regression network BCRN1 as the network BL-Net, and go to step 7.
Step 7, evaluation method
Step 7.1, Forward propagation
Taking the network BL-Net obtained in step 6 and the test set Test obtained in step 2.5 as input, obtain the detection result, denoted R, with the traditional forward propagation method in definition 5.
Taking the detection result R as input, remove the redundant boxes in R with the conventional non-maximum suppression method in definition 13. The specific steps are:

Step (1): mark the box with the highest score in the detection result R as BS.

Step (2): compute the overlap

IoU = area(BS ∩ B) / area(BS ∪ B)

between BS and every remaining box B in R, and discard the boxes with IoU > 0.5.

Step (3): select the box with the highest score among the remaining boxes as the new BS.

Step (4): repeat the IoU computation and discarding of step (2) until no box can be discarded. The remaining boxes are the final detection result, denoted R_F.
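A NumPy sketch of the non-maximum suppression loop of step 7.1: repeatedly keep the highest-scoring box BS and discard the remaining boxes whose IoU with BS exceeds 0.5.

    import numpy as np

    def iou(box, boxes):
        # IoU between one box and an array of boxes, format [x1, y1, x2, y2].
        x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
        x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
        return inter / (area(box) + area(boxes) - inter)

    def nms(boxes, scores, thresh=0.5):
        order = scores.argsort()[::-1]  # indices sorted by descending score
        keep = []
        while order.size:
            best = order[0]             # highest-scoring remaining box (BS)
            keep.append(int(best))
            rest = order[1:]
            order = rest[iou(boxes[best], boxes[rest]) <= thresh]  # drop IoU > 0.5
        return keep

    boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
    scores = np.array([0.9, 0.8, 0.7])
    print(nms(boxes, scores))  # [0, 2]: the second box overlaps the first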
Step 7.2, calculating indexes
Taking the detection result R_F obtained in step 7.1 as input, compute the precision P, recall R, and precision-recall curve P(R) of the network with the traditional recall and precision calculation method in definition 14.

Then compute the average detection precision of the balanced-learning SAR ship detector by the formula

mAP = ∫₀¹ P(R) dR.
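A sketch of the average precision computation of step 7.2: mAP is the area under the precision-recall curve, approximated here by summing precision over recall increments after ranking detections by confidence.

    import numpy as np

    def average_precision(scores, is_tp, n_gt):
        # AP = area under P(R), accumulated over detections ranked by score.
        order = np.argsort(scores)[::-1]
        flags = np.asarray(is_tp, dtype=float)[order]
        tp = np.cumsum(flags)
        fp = np.cumsum(1.0 - flags)
        recall = tp / n_gt
        precision = tp / (tp + fp)
        # Sum precision over the recall increments (rectangle rule).
        return float(np.sum(np.diff(recall, prepend=0.0) * precision))

    scores = [0.9, 0.85, 0.6, 0.5]
    is_tp = [1, 1, 0, 1]      # third detection is a false alarm
    print(round(average_precision(scores, is_tp, n_gt=3), 3))  # 0.917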
The innovation of the invention is the introduction of four balance learning methods, namely the balanced scene learning mechanism, the balanced interval sampling mechanism, the balanced feature pyramid network, and the balanced classification regression network, which solve the four imbalance problems of the existing deep-learning-based SAR ship detection methods: image-sample scene imbalance, positive-negative sample imbalance, ship scale-feature imbalance, and classification-regression task imbalance. The overall SAR image ship detection mAP of the method is 95.25%, exceeding the second-best SAR image ship detector by 3 percentage points; its inshore ship detection mAP is 84.79%, exceeding the second-best detector by 10 percentage points; its offshore ship detection mAP is 99.62%, exceeding the second-best detector by 0.5 percentage points.
The method has the advantages of overcoming the imbalance problems of the prior art and improving the detection precision of ships in SAR images.
Drawings
Fig. 1 is a schematic flow chart of a SAR image ship detection method based on balance learning in the present invention.
Fig. 2 is a schematic diagram of the balanced classification regression network in the SAR image ship detection method based on balance learning in the present invention.
Fig. 3 shows the detection accuracy of the SAR image ship detection method based on balance learning in the present invention.
Detailed Description
The invention is described in further detail below with reference to fig. 1,2 and 3.
Step 1, initializing a data set
And adjusting the SAR image sequence in the SSDD data set by adopting a random method to obtain a new SSDD data set.
Step 2, carrying out scene augmentation by utilizing a balanced scene learning mechanism
Step 2.1, extracting SSDD data set characteristics by using GAN network
As shown in fig. 1, according to the classic GAN network construction method in definition 2, a countermeasure network GAN is constructed and generated0. Training and optimizing to generate an antagonistic network GAN according to a classical Adam algorithm in definition 4 by taking the new SSDD data obtained in the step 1 as input0And generating the countermeasure network after training and optimization, and recording as GAN.
Then, taking the new SSDD data obtained in step 1 as input again, according to the conventional forward propagation method in definition 5, inputting the new SSDD data obtained in step 1 into the trained and optimized generative countermeasure network GAN, and obtaining an output vector M of the network, which is { M1, M2, … Mi, … M1160}, where Mi is an output vector of the ith picture in the new SSDD data.
And defining an output vector M as the scene characteristics of all pictures in the new SSDD data set, and defining Mi as the scene characteristics of the ith picture in the new SSDD data set.
Step 2.2, clustering scenes
Taking the set M of scene features of all pictures in the new SSDD data obtained in the step 2.1 as input, adopting a traditional K-means clustering algorithm in definition 3, and clustering the pictures in the new SSDD data set by means of the scene features M:
step 2.3, initializing parameters
For the centroid parameter in the traditional K-means clustering algorithm in definition 3, randomly initializing the centroid parameter of the K-means clustering algorithm in the first iteration step, and recording the centroid parameter as the centroid parameter
Figure BDA0003327601480000131
Defining the current iteration number as t, t as 1,2, …, and I as the maximum iteration number of the K-means clustering algorithm, and initializing I as 1000. Defining the centroid parameter of the t-th iteration as
Figure BDA0003327601480000132
And initializing an iteration convergence error epsilon as one of iteration convergence conditions of the algorithm.
Step 2.4, carrying out iterative operation
Firstly, using a formula
Figure BDA0003327601480000133
Calculating scene characteristics M of ith pictureiTo the first centroid in the 1 st iteration
Figure BDA0003327601480000134
Is marked as
Figure BDA0003327601480000135
Using a formula
Figure BDA0003327601480000136
Calculating scene characteristics M of ith pictureiTo the second centroid in the 1 st iteration
Figure BDA0003327601480000137
Is marked as
Figure BDA0003327601480000138
Comparison
Figure BDA0003327601480000139
And
Figure BDA00033276014800001310
if, if
Figure BDA00033276014800001311
Then the scene characteristics M of the ith picture in the 1 st iteration are definediBelong to the second category, otherwise define the scene characteristics M of the ith picture in the 1 st iterationiBelonging to the first category.
Defining all scenes of the first class after iteration step 1The set of features is
Figure BDA00033276014800001312
The set of all scene features of the second class is
Figure BDA0003327601480000141
Then let t be 2, perform the following until convergence:
1) let the centroid parameter of the t step
Figure BDA0003327601480000142
Is a set
Figure BDA0003327601480000143
The arithmetic mean of (1), let the centroid parameter of the t step
Figure BDA0003327601480000144
Is a set
Figure BDA0003327601480000145
Is calculated as the arithmetic mean of (1).
2) Using a formula
Figure BDA0003327601480000146
Calculating scene characteristics M of ith pictureiTo the first centroid in the t-th iteration
Figure BDA0003327601480000147
Is marked as
Figure BDA0003327601480000148
By using
Figure BDA0003327601480000149
Scene feature M of ith pictureiTo the second centroid in the t-th iteration
Figure BDA00033276014800001410
Is marked as
Figure BDA00033276014800001411
3) Comparison
Figure BDA00033276014800001412
And
Figure BDA00033276014800001413
if it is
Figure BDA00033276014800001414
Then define the scene characteristics M of the ith picture in the t iterationiBelongs to the second category, otherwise defines the scene characteristics M of the ith picture in the t iterationiBelonging to the first category. Defining all scene characteristics of the first class as
Figure BDA00033276014800001415
All scene features of the second class are
Figure BDA00033276014800001416
And outputting a clustering result, and marking as CLASS.
4) Calculating the variation of the centroid parameter between the iteration and the last iteration, and recording as sigma, wherein the expression is
Figure BDA00033276014800001417
If σ is<Epsilon or t<And I, outputting a clustering result CLASS, otherwise, t is t +1, and then returning to the step 1) to continue iteration.
Step 2.5, carrying out scene amplification
Dividing all pictures in the new SSDD Data into two types according to the CLASS obtained from the step 2.4 and all pictures in the new SSDD Data, wherein the first type is a landing scene picture and is marked as Data1The second type is offshore scene picture marked as Data2. Defining Data1Number of pictures of N1,Data2Number of pictures of N2
If N is present2>N1Then from the first class as the landing scene picture Data1In the method, N is randomly selected based on Gaussian distribution2-N1Carrying out mirror image operation on a picture to obtain N after the mirror image operation2-N1Opening a picture, recording as Data1extra. Then N after the mirroring operation2-N1Picture Data1extraAnd the first type is the land-backing scene picture Data1Merging and outputting a new picture set which is recorded as Data1new. Defining Data2new=Data2
If N is present2<=N1From the second class, Data is an offshore scene picture2In the method, N is randomly selected based on Gaussian distribution1-N2Carrying out mirror image operation on a picture to obtain N after the mirror image operation1-N2Opening a picture, recording as Data2extra. Then N after the mirroring operation1-N2Picture Data2extraAnd the first type is the land-backing scene picture Data2Merging and outputting a new picture set which is recorded as Data2new. Defining Data1new=Data1
Defining a new set of pictures Datanew={Data1new,Data2new}。
Will DatanewAnd dividing the training set into two parts according to a 7:3 ratio to obtain a training set and a Test set, wherein the training set is marked as Train, and the Test set is marked as Test.
Step 3, building a forward propagation network
Step 3.1, building a balanced feature pyramid network
As shown in fig. 1, a classical residual network construction method in definition 6 is adopted to construct a residual network with 50 network layers, which is recorded as Res-50, and feature maps generated by the last layer of network with different sizes in the residual network Res-50 are respectively recorded as F from large to small according to the feature map size1,F2,F3,F4,F5
F is to be5Is otherwise denoted as P5
Following the convolution sum operation in definition 7, F4Feature extraction by 1 × 1 convolution sum, feature extractionThe extracted result is marked as E4(ii) a P is upsampled by the upsampling operation as in definition 95Feature size of (D) and (F)4If the result of the sampling operation is consistent, the result is recorded as U5(ii) a According to the cascade operation in definition 8, E4And U5Overlapping, and recording the overlapping result as P4
Following the convolution sum operation in definition 7, F3Performing feature extraction by using 1 × 1 convolution sum, and recording the feature extraction result as E3(ii) a P is upsampled by the upsampling operation as in definition 94Feature size of (D) and (F)3If the result of the sampling operation is consistent, the result is recorded as U4(ii) a According to the cascade operation in definition 8, E3And U4Overlapping, and recording the overlapping result as P3
Following the convolution sum operation in definition 7, F2Performing feature extraction by using 1 × 1 convolution sum, and recording the feature extraction result as E2(ii) a P is upsampled by the upsampling operation as in definition 93Feature size of (D) and (F)2If the result of the sampling operation is consistent, the result is recorded as U3(ii) a According to the cascade operation in definition 8, E2And U3Overlapping, and recording the overlapping result as P2
Following the convolution sum operation in definition 7, F1Performing feature extraction by using 1 × 1 convolution sum, and recording the feature extraction result as E1(ii) a P is upsampled by the upsampling operation as in definition 92Feature size of (D) and (F)2If the result of the sampling operation is consistent, the result is recorded as U2(ii) a According to the cascade operation in definition 8, E1And U2Overlapping, and recording the overlapping result as P1
P is upsampled by the upsampling operation as in definition 95Feature size and P of3When the result of the sampling operation is consistent, the result is recorded as H5
P is upsampled by the upsampling operation as in definition 94Feature size and P of3When the result of the sampling operation is consistent, the result is recorded as H4
Will P3Is otherwise denoted as H5
P is pooled by maximum pooling as per pooling operation in definition 102Feature size and P of3When the result of the sampling operation is consistent, the result is recorded as H2
P is pooled by maximum pooling as per pooling operation in definition 101Feature size and P of3When the result of the sampling operation is consistent, the result is recorded as H1
H is to be1,H2,H3,H4,H5According to the formula
Figure BDA0003327601480000161
A feature map I is computed, where k represents the index of H and (I, j) represents the spatial sample position of the feature map.
Taking the characteristic diagram I as an input according to a formula
Figure BDA0003327601480000162
And calculating to obtain a characteristic diagram O. Wherein, IiA feature representing the ith position on the feature map I; o isiA feature representing the ith position on the feature map O;
Figure BDA0003327601480000163
represents a normalization factor; f (I)i,Ij) Is used to calculate IiAnd IjThe function of similarity between the two is expressed as
Figure BDA0003327601480000164
Wherein, theta (I)i)=WθIi,φ(Ij)=WφIJ,WθAnd WφIs a matrix learned by the 1 × 1 convolution operation in definition 7; g (I)j)=WgIj,WgIs a matrix learned by the 1 × 1 convolution operation in definition 7.
And (4) taking all the network operations in the step 3.1 as a balanced feature pyramid network, and marking as a backhaul.
Step 3.2, building a regional recommendation network
According to the regional recommended network construction method in the definition 11, the backhaul obtained in the step 3.1 is used as a feature extraction layer to construct a regional recommended network, and the regional recommended network is marked as RPN0
Step 3.3, building a balance classification regression network
As shown in fig. 2, the balanced classification regression network is divided into two parts, namely a classification head lead and a regression head Rhead, and full connection layers FC1 and FC2 are constructed according to the conventional full connection layer method in definition 12, the output of FC1 is used as the input of FC2, and FC1 and FC2 are used as classification heads and are marked as cluads; constructing four convolutional layers, Conv1, Conv2, Conv3, and Conv4, respectively, according to the convolutional kernel method in definition 7; at the same time, the Pooling layer is constructed according to the Pooling operation in definition 10, denoted Pooling. The output of Conv1 was taken as the input of Conv2, the output of Conv2 as the input of Conv3, the output of Conv3 as the input of Conv4, and the output of Conv4 as the input of Pooling. Conv1, Conv2, Conv3, Conv4 and Pooling were used as regression heads and labeled Rehead. The Classification head Clhead and the regression head Rehead have the same characteristic diagram input, and together with the backhaul, the Classification head Clhead and the regression head Rehead form a balanced classification regression network which is marked as BCRN0
Step 4, training area recommendation network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 4.1, forward propagation is carried out on the regional recommendation network
Taking the training set Train of the amplified data set Datanew obtained in the step 2 as a regional recommended network (RPN)0According to the forward propagation method in definition 5, the training set Train is sent to the regional recommended network RPN0Computing and recording network RPN0As Result 0.
Step 4.2, carrying out balance interval sampling on the forward propagation result
Taking the output Result0 obtained in step 4.1 and the training set Train as input, the IoU value of each recommendation box in Result0 is calculated according to the formula

$$IoU=\frac{area(B\cap B_{gt})}{area(B\cup B_{gt})}$$

where B is a recommendation box and B_gt is the matching ground-truth box. Outputs in Result0 with IoU greater than 0.5 are taken as positive samples, denoted Result0p; outputs in Result0 with IoU less than 0.5 are taken as negative samples, denoted Result0n. The total number of samples in the negative sample set Result0n is counted as M. The number of required negative samples, denoted N, is input manually; the number of intervals into which the IoU range is divided equally, denoted n_b, is also input manually, and the number of samples in the i-th IoU interval is M_i. The random sampling probability of the i-th interval is set as

$$p_i=\frac{N}{n_b}\cdot\frac{1}{M_i}$$

Each IoU interval is randomly sampled, and the sampling results over all IoU intervals of the negative samples are recorded as Result0ns.
The number of samples in the positive sample set Result0p is counted as P. The random sampling probability is set as

$$p_{pos}=\frac{N}{P}$$

Result0p is randomly sampled, and the positive sample sampling result is recorded as Result0ps.
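A NumPy sketch of the balanced interval sampling of step 4.2, instantiating the per-interval probability N/(n_b·M_i) reconstructed above; balanced_interval_sample is a hypothetical helper name, and the handling of empty intervals is an assumption.

```python
import numpy as np

def balanced_interval_sample(ious, N, n_b=3, seed=0):
    """Sample roughly N negatives (IoU < 0.5) evenly across n_b IoU intervals."""
    rng = np.random.default_rng(seed)
    neg = np.flatnonzero(ious < 0.5)          # indices of negative samples (Result0n)
    edges = np.linspace(0.0, 0.5, n_b + 1)    # n_b equal IoU intervals
    picked = []
    for k in range(n_b):
        in_bin = neg[(ious[neg] >= edges[k]) & (ious[neg] < edges[k + 1])]
        M_k = len(in_bin)
        if M_k == 0:
            continue                          # empty interval: nothing to sample
        p = min(1.0, N / (n_b * M_k))         # random sampling probability of this interval
        picked.append(in_bin[rng.random(M_k) < p])
    return np.concatenate(picked) if picked else np.array([], dtype=int)
```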
Step 4.3, training and optimizing the regional recommendation network
Taking the positive sample sampling result Result0ps and the negative sample sampling result Result0ns obtained in step 4.2 as input, the regional recommendation network is trained and optimized according to the classic Adam algorithm in definition 4, yielding the trained and optimized regional recommendation network RPN1.
Step 5, training the balance classification regression network
Step 5.1, forward propagation is carried out on the balance classification regression network
The training set Train of the amplified data set Datanew obtained in step 2 is taken as the input of the balanced classification regression network BCRN0. According to the forward propagation method in definition 5, the training set Train is sent into BCRN0 for computation, and the output of BCRN0 is recorded as Result1.
Step 5.2, training and optimizing the balance classification regression network
Taking the output Result1 of the balanced classification regression network BCRN0 obtained in step 5.1 as input, the balanced classification regression network is trained and optimized according to the classic Adam algorithm in definition 4, yielding the trained and optimized balanced classification regression network BCRN1.
Step 6, alternate training is carried out
Whether the epoch set in step 4 equals 12 is determined. If epoch is not equal to 12, let epoch = epoch + 1, RPN0 = RPN1, BCRN0 = BCRN1, repeat step 4.1, step 4.2, step 4.3, step 5.1, and step 5.2 in sequence, and then return to step 6 to judge epoch again; if epoch equals 12, denote the trained regional recommendation network RPN1 together with the trained balanced classification regression network BCRN1 as the network BL-Net, and then go to step 7.
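The alternation of steps 4 through 6 reduces to the control flow below. Here train_rpn and train_bcrn are hypothetical callables standing in for the forward propagation, balanced sampling, and Adam optimization of steps 4 and 5, so this is a schematic of the loop structure only.

```python
def alternate_training(train_rpn, train_bcrn, rpn, bcrn, train_set, epochs=12):
    """Alternately update the two sub-networks for a fixed number of epochs."""
    for epoch in range(1, epochs + 1):
        rpn = train_rpn(rpn, train_set)     # steps 4.1-4.3: RPN0 -> RPN1
        bcrn = train_bcrn(bcrn, train_set)  # steps 5.1-5.2: BCRN0 -> BCRN1
    return rpn, bcrn                        # after epoch 12 the pair is denoted BL-Net
```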
Step 7, evaluation method
Step 7.1, Forward propagation
Taking the network BL-Net obtained in step 6 and the test set Test obtained in step 2.5 as input, the detection result is obtained by the traditional forward propagation method in definition 5 and is denoted R.
Taking the detection result R as input, redundant boxes in R are removed by the conventional non-maximum suppression method in definition 13, as follows:

in the step (1), the box with the highest score in the detection result R is first marked as BS;

in the step (2), using the calculation formula

$$IoU=\frac{area(B\cap BS)}{area(B\cup BS)}$$

the overlap rate IoU between BS and each remaining box B in the detection result R is calculated, and boxes with IoU > 0.5 are discarded;

in the step (3), the box with the highest score among the remaining boxes is selected as the new BS;

the IoU calculation and discarding process of the step (2) is repeated until no box can be discarded; the finally remaining boxes are the final detection result, denoted RF.
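The suppression loop of step 7.1 can be written directly. A NumPy sketch assuming the usual (x1, y1, x2, y2) box convention, which the text above does not specify:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes with IoU > 0.5, repeat."""
    order = np.argsort(scores)[::-1]          # boxes sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # intersection of the best box with every remaining box
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_b = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_b + area_r - inter)   # IoU = overlap area / union area
        order = rest[iou <= iou_thresh]           # discard boxes with IoU > threshold
    return keep
```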
Step 7.2, calculating indexes
As shown in FIG. 3, taking the detection result RF obtained in step 7.1 as input, the precision P, the recall R, and the precision-recall curve P(R) of the network are calculated by the traditional recall and precision calculation method in definition 14; using the formula

$$mAP=\int_{0}^{1}P(R)\,dR$$

the average detection precision mAP of the balance-learning-based SAR ship detection is calculated.
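Once the P(R) curve is available, the integral above can be approximated numerically. A NumPy sketch; padding the curve endpoints at R = 0 and R = 1 is an assumption rather than part of definition 14.

```python
import numpy as np

def average_precision(precision, recall):
    """Approximate mAP = integral_0^1 P(R) dR as the area under the P-R curve."""
    order = np.argsort(recall)                           # sort points by recall
    r = np.concatenate(([0.0], recall[order], [1.0]))    # assumed endpoint padding
    p = np.concatenate(([1.0], precision[order], [0.0]))
    return float(np.trapz(p, r))                         # trapezoidal integration
```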

Claims (1)

1. A ship detection method based on balance learning is characterized by comprising the following steps:
step 1, initializing SSDD data set
Adjusting the order of the SAR images in the SSDD data set by a random method, i.e., randomly shuffling them, to obtain a new SSDD data set;
step 2, carrying out scene augmentation by utilizing a balanced scene learning mechanism
Step 2.1, extracting SSDD data set characteristics by using GAN network
Adopting the classic GAN network construction method, a generative adversarial network GAN0 is built; taking the new SSDD data obtained in step 1 as input, GAN0 is trained and optimized with the classic Adam algorithm, yielding a trained and optimized generative adversarial network, denoted GAN;
then, taking the new SSDD data obtained in step 1 as input again, the new SSDD data are fed into the trained and optimized GAN by the conventional forward propagation method to obtain the output vectors M = {M1, M2, …, Mi, …, M1160} of the network, where Mi is the output vector of the i-th picture in the new SSDD data;
the output vector set M is defined as the scene features of all pictures in the new SSDD data set, and Mi is defined as the scene feature of the i-th picture in the new SSDD data set;
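The claim leaves the GAN architecture to the cited classic construction method; the sketch below shows one plausible reading in PyTorch, in which the scene feature Mi is read off an intermediate layer of the trained discriminator. The toy Discriminator layout is entirely illustrative.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Toy GAN discriminator; stands in for whichever trained network yields Mi."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> 64-dimensional vector
        )
        self.score = nn.Linear(64, 1)                # real/fake score used in GAN training

    def forward(self, x):
        return self.score(self.features(x))

@torch.no_grad()
def scene_features(disc, images):
    """One reading of step 2.1: Mi is an intermediate activation of the trained GAN."""
    return disc.features(images)                     # (number of pictures, 64)
```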
step 2.2, clustering scenes
Taking the set M of scene features of all pictures in the new SSDD data obtained in step 2.1 as input, the pictures in the new SSDD data set are clustered by means of the scene features M using the traditional K-means clustering algorithm:
step 2.3, initializing parameters
For the centroid parameters of the traditional K-means clustering algorithm, the centroids of the first iteration are randomly initialized and recorded as $\theta_1^{(1)}$ and $\theta_2^{(1)}$; the current iteration number is defined as t, t = 1, 2, …, I, where I is the maximum number of iterations of the K-means clustering algorithm, initialized as I = 1000; the centroid parameters of the t-th iteration are defined as $\theta_1^{(t)}$ and $\theta_2^{(t)}$; an iterative convergence error ε is initialized as one of the iterative convergence conditions of the algorithm;
step 2.4, carrying out iterative operation
First, the distance from the scene feature Mi of the i-th picture to the first centroid in the 1st iteration is calculated by the formula

$$d_{i,1}^{(1)}=\left\lVert M_i-\theta_1^{(1)}\right\rVert_2$$

and the distance to the second centroid in the 1st iteration is calculated by the formula

$$d_{i,2}^{(1)}=\left\lVert M_i-\theta_2^{(1)}\right\rVert_2$$

Comparing $d_{i,1}^{(1)}$ and $d_{i,2}^{(1)}$: if $d_{i,1}^{(1)}>d_{i,2}^{(1)}$, the scene feature Mi of the i-th picture in the 1st iteration is defined to belong to the second class; otherwise it is defined to belong to the first class;
defining: after the 1st iteration, the set of all scene features of the first class is $\Psi_1^{(1)}$ and the set of all scene features of the second class is $\Psi_2^{(1)}$;
Then let t = 2 and perform the following until convergence:
1) let the centroid parameter $\theta_1^{(t)}$ of step t be the arithmetic mean of the set $\Psi_1^{(t-1)}$, and let the centroid parameter $\theta_2^{(t)}$ of step t be the arithmetic mean of the set $\Psi_2^{(t-1)}$;
2) calculate the distance from the scene feature Mi of the i-th picture to the first centroid in the t-th iteration by the formula

$$d_{i,1}^{(t)}=\left\lVert M_i-\theta_1^{(t)}\right\rVert_2$$

and the distance to the second centroid in the t-th iteration by the formula

$$d_{i,2}^{(t)}=\left\lVert M_i-\theta_2^{(t)}\right\rVert_2$$

3) compare $d_{i,1}^{(t)}$ and $d_{i,2}^{(t)}$: if $d_{i,1}^{(t)}>d_{i,2}^{(t)}$, the scene feature Mi of the i-th picture in the t-th iteration is defined to belong to the second class; otherwise it is defined to belong to the first class; after the t-th iteration, the set of all scene features of the first class is $\Psi_1^{(t)}$ and the set of all scene features of the second class is $\Psi_2^{(t)}$; the clustering result is output and recorded as CLASS;
4) calculate the change of the centroid parameters between this iteration and the previous iteration, recorded as σ, with the expression

$$\sigma=\left\lVert\theta_1^{(t)}-\theta_1^{(t-1)}\right\rVert_2+\left\lVert\theta_2^{(t)}-\theta_2^{(t-1)}\right\rVert_2$$

if σ < ε or t ≥ I, output the clustering result CLASS; otherwise let t = t + 1 and return to step 1) to continue the iteration;
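A compact NumPy sketch of the two-class K-means of steps 2.3 and 2.4, using the Euclidean distances and the centroid-shift stopping rule written out above; kmeans_two_classes is a hypothetical name.

```python
import numpy as np

def kmeans_two_classes(M, max_iter=1000, eps=1e-4, seed=0):
    """K = 2 K-means: random initial centroids, nearest-centroid assignment,
    centroid update, stop when the centroid change sigma < eps or t reaches max_iter."""
    rng = np.random.default_rng(seed)
    theta = M[rng.choice(len(M), size=2, replace=False)]   # random initial centroids
    for t in range(1, max_iter + 1):
        d = np.linalg.norm(M[:, None, :] - theta[None, :, :], axis=2)  # d_{i,k}
        labels = d.argmin(axis=1)          # class 0 if d_{i,1} <= d_{i,2}, else class 1
        new_theta = np.stack([M[labels == k].mean(axis=0) if np.any(labels == k)
                              else theta[k] for k in range(2)])
        sigma = np.linalg.norm(new_theta - theta)          # centroid change sigma
        theta = new_theta
        if sigma < eps:
            break
    return labels, theta
```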
step 2.5, carrying out scene amplification
According to the clustering result CLASS obtained in step 2.4, all pictures in the new SSDD data are divided into two classes: the first class consists of inshore scene pictures, denoted Data1, and the second class consists of offshore scene pictures, denoted Data2; defining: the number of pictures in Data1 is N1 and the number of pictures in Data2 is N2;
if N2 > N1, N2 − N1 pictures are randomly selected, based on a Gaussian distribution, from the inshore scene pictures Data1 and subjected to the traditional mirror operation, giving the N2 − N1 mirrored pictures Data1extra; the mirrored pictures Data1extra are then merged with the inshore scene pictures Data1, and the new picture set is output and recorded as Data1new; define Data2new = Data2;
if N2 <= N1, N1 − N2 pictures are randomly selected, based on a Gaussian distribution, from the offshore scene pictures Data2 and subjected to the traditional mirror operation, giving the N1 − N2 mirrored pictures Data2extra; the mirrored pictures Data2extra are then merged with the offshore scene pictures Data2, and the new picture set is output and recorded as Data2new; define Data1new = Data1;
a new picture set Datanew = {Data1new, Data2new} is defined;
Datanew is divided into two parts in a 7:3 ratio to obtain a training set and a test set, recorded as Train and Test respectively;
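The class balancing by mirroring in step 2.5 can be sketched as follows, assuming NumPy image arrays; uniform random selection stands in here for the Gaussian-based selection named in the claim.

```python
import numpy as np

def balance_by_mirroring(data_a, data_b, seed=0):
    """Mirror randomly chosen pictures of the smaller scene class until both
    classes have equal size, then return the amplified pair of picture sets."""
    rng = np.random.default_rng(seed)
    small, large = (data_a, data_b) if len(data_a) < len(data_b) else (data_b, data_a)
    idx = rng.choice(len(small), size=len(large) - len(small), replace=True)
    extra = [np.fliplr(small[i]) for i in idx]   # traditional mirror operation
    return list(small) + extra, list(large)      # amplified small class, unchanged large class
```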
step 3, building a forward propagation network
Step 3.1, building a balanced feature pyramid network
Constructing a residual network with 50 layers by the classic residual network construction method, denoted Res-50; the feature maps of different sizes generated by the last layers of the residual network Res-50 are denoted F1, F2, F3, F4, F5 from large to small according to feature map size;
F5 is also denoted P5;
using the conventional convolution kernel operation, feature extraction is performed on F4 with a 1 × 1 convolution kernel, and the result is denoted E4;
using the conventional upsampling operation, P5 is upsampled so that its feature size is consistent with F4, and the result is denoted U5;
using the conventional cascade operation, E4 and U5 are superimposed, and the result is denoted P4;
using the conventional convolution kernel operation, feature extraction is performed on F3 with a 1 × 1 convolution kernel, and the result is denoted E3;
using the conventional upsampling operation, P4 is upsampled so that its feature size is consistent with F3, and the result is denoted U4;
using the conventional cascade operation, E3 and U4 are superimposed, and the result is denoted P3;
using the conventional convolution kernel operation, feature extraction is performed on F2 with a 1 × 1 convolution kernel, and the result is denoted E2;
using the conventional upsampling operation, P3 is upsampled so that its feature size is consistent with F2, and the result is denoted U3;
using the conventional cascade operation, E2 and U3 are superimposed, and the result is denoted P2;
using the conventional convolution kernel operation, feature extraction is performed on F1 with a 1 × 1 convolution kernel, and the result is denoted E1;
using the conventional upsampling operation, P2 is upsampled so that its feature size is consistent with F1, and the result is denoted U2;
using the cascade operation, E1 and U2 are superimposed, and the result is denoted P1;
using the conventional upsampling operation, P5 is upsampled so that its feature size is consistent with P3, and the result is denoted H5;
using the conventional upsampling operation, P4 is upsampled so that its feature size is consistent with P3, and the result is denoted H4;
P3 is also denoted H3;
using the conventional pooling operation, max pooling is applied to P2 so that its feature size is consistent with P3, and the result is denoted H2;
using the conventional pooling operation, max pooling is applied to P1 so that its feature size is consistent with P3, and the result is denoted H1;
for H1, H2, H3, H4, H5, the feature map I is calculated by the formula

$$I(i,j)=\frac{1}{5}\sum_{k=1}^{5}H_k(i,j)$$

where k denotes the subscript of H and (i, j) denotes the spatial sampling position on the feature map;
taking the feature map I as input, the feature map O is calculated by the formula

$$O_i=\frac{1}{C(I)}\sum_{j}f(I_i,I_j)\,g(I_j)$$

where $I_i$ denotes the feature at the i-th position on the feature map I and $O_i$ denotes the feature at the i-th position on the feature map O; $C(I)=\sum_{j}f(I_i,I_j)$ denotes a normalization factor; $f(I_i,I_j)$ is used to calculate the similarity between $I_i$ and $I_j$ and is expressed in the embedded-Gaussian form

$$f(I_i,I_j)=e^{\theta(I_i)^{\mathsf{T}}\phi(I_j)}$$

where $\theta(I_i)=W_{\theta}I_i$, $\phi(I_j)=W_{\phi}I_j$, and $W_{\theta}$ and $W_{\phi}$ are matrices learned by a 1 × 1 convolution operation; $g(I_j)=W_g I_j$, where $W_g$ is a matrix learned by a 1 × 1 convolution operation;
after all the network operations in step 3.1 are completed, the balanced feature pyramid network is obtained and denoted Backbone;
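A PyTorch sketch of the rescale-and-average integration of step 3.1 that produces the feature map I before the non-local refinement; the interpolation mode is an illustrative assumption, and all five inputs are assumed to share a channel count.

```python
import torch.nn.functional as F

def integrate_pyramid(P1, P2, P3, P4, P5):
    """Rescale P1..P5 to the size of P3 (upsampling coarser levels, max pooling
    finer ones) and average them: I(i, j) = (1/5) * sum_k H_k(i, j)."""
    size = P3.shape[-2:]
    H5 = F.interpolate(P5, size=size, mode='nearest')  # upsample P5 -> H5
    H4 = F.interpolate(P4, size=size, mode='nearest')  # upsample P4 -> H4
    H3 = P3                                            # P3 is H3
    H2 = F.adaptive_max_pool2d(P2, size)               # max-pool P2 -> H2
    H1 = F.adaptive_max_pool2d(P1, size)               # max-pool P1 -> H1
    return (H1 + H2 + H3 + H4 + H5) / 5.0
```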
step 3.2, building a regional recommendation network
Constructing a regional recommendation network by the traditional regional recommendation network construction method, with the Backbone obtained in step 3.1 as the feature extraction layer, denoted RPN0;
Step 3.3, building a balance classification regression network
Constructing full link layers FC1 and FC2 by adopting a traditional full link layer method, taking the output of FC1 as the input of FC2, taking FC1 and FC2 as classification heads and marking as Clhead;
constructing four convolutional layers by adopting a traditional convolutional kernel method, wherein the four convolutional layers are Conv1, Conv2, Conv3 and Conv 4; at the same time, traditional pooling operations are used to build a pooling layer, denoted asPooling; the output of Conv1 is taken as the input of Conv2, the output of Conv2 is taken as the input of Conv3, the output of Conv3 is taken as the input of Conv4, and the output of Conv4 is taken as the input of Pooling; conv1, Conv2, Conv3, Conv4 and Pooling are used as regression heads and are marked as Rehead; the Classification head Clhead and the regression head Rehead have the same characteristic diagram input, and together with the backhaul, the Classification head Clhead and the regression head Rehead form a balanced classification regression network which is marked as BCRN0
Step 4, training area recommendation network
Setting an iteration parameter epoch, and initializing an epoch value to be 1;
step 4.1, forward propagation is carried out on the regional recommendation network
taking the training set Train of the amplified data set Datanew obtained in step 2 as the input of the regional recommendation network RPN0; using the traditional forward propagation method, the training set Train is sent into RPN0 for computation, and the output of RPN0 is recorded as Result0;
step 4.2, carrying out balance interval sampling on the forward propagation result
taking the output Result0 obtained in step 4.1 and the training set Train as input, the IoU value of each recommendation box in Result0 is calculated by the formula

$$IoU=\frac{area(B\cap B_{gt})}{area(B\cup B_{gt})}$$

where B is a recommendation box and B_gt is the matching ground-truth box; outputs in Result0 with IoU greater than 0.5 are taken as positive samples, denoted Result0p; outputs in Result0 with IoU less than 0.5 are taken as negative samples, denoted Result0n; the total number of samples in the negative sample set Result0n is counted as M; the number of required negative samples, denoted N, is input manually; the number of intervals into which the IoU range is divided equally, denoted n_b, is input manually, and the number of samples in the i-th IoU interval is M_i; the random sampling probability of the i-th interval is set as

$$p_i=\frac{N}{n_b}\cdot\frac{1}{M_i}$$

each IoU interval is randomly sampled, and the sampling results over all IoU intervals of the negative samples are recorded as Result0ns;
the number of samples in the positive sample set Result0p is counted as P; the random sampling probability is set as

$$p_{pos}=\frac{N}{P}$$

Result0p is randomly sampled, and the positive sample sampling result is recorded as Result0ps;
step 4.3, training and optimizing the regional recommendation network
taking the positive sample sampling result Result0ps and the negative sample sampling result Result0ns obtained in step 4.2 as input, the regional recommendation network is trained and optimized with the classic Adam algorithm, yielding the trained and optimized regional recommendation network RPN1;
step 5, training the balance classification regression network
Step 5.1, forward propagation is carried out on the balance classification regression network
taking the training set Train of the amplified data set Datanew obtained in step 2 as the input of the balanced classification regression network BCRN0; using the traditional forward propagation method, the training set Train is sent into BCRN0 for computation, and the output of BCRN0 is recorded as Result1;
step 5.2, training and optimizing the balance classification regression network
taking the output Result1 of the balanced classification regression network BCRN0 obtained in step 5.1 as input, the balanced classification regression network is trained and optimized with the classic Adam algorithm, yielding the trained and optimized balanced classification regression network BCRN1;
step 6, alternate training is carried out
judging whether the epoch set in step 4 equals 12; if epoch is not equal to 12, let epoch = epoch + 1, RPN0 = RPN1, BCRN0 = BCRN1, repeat step 4.1, step 4.2, step 4.3, step 5.1, and step 5.2 in sequence, and then return to step 6 to judge epoch again; if epoch equals 12, denote the trained regional recommendation network RPN1 together with the trained balanced classification regression network BCRN1 as the network BL-Net, and then go to step 7.
Step 7, evaluation method
Step 7.1, Forward propagation
taking the network BL-Net obtained in step 6 and the test set Test obtained in step 2.5 as input, the detection result is obtained by the traditional forward propagation method and is denoted R;
taking the detection result R as input, redundant boxes in R are removed by the traditional non-maximum suppression method, as follows:
in the step (1), the box with the highest score in the detection result R is first marked as BS;
in the step (2), using the calculation formula

$$IoU=\frac{area(B\cap BS)}{area(B\cup BS)}$$

the overlap rate IoU between BS and each remaining box B in the detection result R is calculated, and boxes with IoU > 0.5 are discarded;
in the step (3), the box with the highest score among the remaining boxes is selected as the new BS;
the IoU calculation and discarding process of the step (2) is repeated until no box can be discarded; the finally remaining boxes are the final detection result, denoted RF;
Step 7.2, calculating indexes
taking the detection result RF obtained in step 7.1 as input, the precision P, the recall R, and the precision-recall curve P(R) of the network are calculated by the traditional recall and precision calculation method;
using the formula

$$mAP=\int_{0}^{1}P(R)\,dR$$

the average detection precision mAP of the balance-learning-based SAR ship detection is obtained.
CN202111268008.2A 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning Active CN113989672B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111268008.2A CN113989672B (en) 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning

Publications (2)

Publication Number Publication Date
CN113989672A true CN113989672A (en) 2022-01-28
CN113989672B CN113989672B (en) 2023-10-17

Family

ID=79744053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111268008.2A Active CN113989672B (en) 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning

Country Status (1)

Country Link
CN (1) CN113989672B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972739A (en) * 2022-06-21 2022-08-30 天津大学 Image target detection method based on target centroid relationship
CN114998759A (en) * 2022-05-27 2022-09-02 电子科技大学 High-precision SAR ship detection method based on visual transform

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200278465A1 (en) * 2017-09-12 2020-09-03 Schlumberger Technology Corporation Seismic image data interpretation system
CN110490158A (en) * 2019-08-23 2019-11-22 安徽大学 A kind of robust human face alignment schemes based on multistage model
CN110826428A (en) * 2019-10-22 2020-02-21 电子科技大学 Ship detection method in high-speed SAR image
CN112285712A (en) * 2020-10-15 2021-01-29 电子科技大学 Method for improving detection precision of ship on shore in SAR image
CN113378813A (en) * 2021-05-28 2021-09-10 陕西大智慧医疗科技股份有限公司 Modeling and target detection method and device based on attention balance feature pyramid

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TIANWEN ZHANG 等: "Balance Scene Learning Mechanism for Offshore and Inshore Ship Detection in SAR Images" *
TIANWEN ZHANG 等: "Balanced Feature Pyramid Network for Ship Detection in Synthetic Aperture Radar Images" *
ZHANG Tianwen et al.: "A false-alarm suppression method for ship detection in large-scene SAR images" *

Also Published As

Publication number Publication date
CN113989672B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN110135267B (en) Large-scene SAR image fine target detection method
CN111797717B (en) High-speed high-precision SAR image ship detection method
CN110232350B (en) Real-time water surface multi-moving-object detection and tracking method based on online learning
CN112285712B (en) Method for improving detection precision of coasting ship in SAR image
CN110826428A (en) Ship detection method in high-speed SAR image
WO2022028031A1 (en) Contour shape recognition method
CN114119582B (en) Synthetic aperture radar image target detection method
CN113989672A (en) SAR image ship detection method based on balance learning
CN107274416A (en) High spectrum image conspicuousness object detection method based on spectrum gradient and hierarchical structure
CN105894490A (en) Fuzzy integration multiple classifier integration-based uterine neck cell image identification method and device
CN113850189B (en) Embedded twin network real-time tracking method applied to maneuvering platform
CN113705331B (en) SAR ship detection method based on quaternary feature pyramid network
CN112766340B (en) Depth capsule network image classification method and system based on self-adaptive spatial mode
CN115272670A (en) SAR image ship instance segmentation method based on mask attention interaction
CN115272842A (en) SAR image ship instance segmentation method based on global semantic boundary attention network
CN113298129A (en) Polarized SAR image classification method based on superpixel and graph convolution network
Xi et al. Semi-supervised graph prototypical networks for hyperspectral image classification
CN113344103A (en) Hyperspectral remote sensing image ground object classification method based on hypergraph convolution neural network
CN112508066A (en) Hyperspectral image classification method based on residual error full convolution segmentation network
CN116933141B (en) Multispectral laser radar point cloud classification method based on multicore graph learning
CN109558880A (en) A kind of whole profile testing method with Local Feature Fusion of view-based access control model
CN114898464B (en) Lightweight accurate finger language intelligent algorithm identification method based on machine vision
CN113902975B (en) Scene perception data enhancement method for SAR ship detection
Xu et al. Infrared image semantic segmentation based on improved deeplab and residual network
CN113534146A (en) Radar video image target automatic detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant