CN113989672A - A Balanced Learning-Based Vessel Detection Method in SAR Images - Google Patents

Info

Publication number: CN113989672A (application CN202111268008.2A; granted as CN113989672B)
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Granted; Active
Inventors: 张晓玲, 柯潇, 张天文, 师君, 韦顺军
Assignee (current and original): University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China

Classifications

    • G06F18/23213 — Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06N3/045 — Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08 — Neural networks; learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a balanced-learning-based ship detection method for SAR images. Based on deep learning theory, it mainly comprises a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification-regression network. The balanced scene learning mechanism solves the problem of unbalanced image-sample scenes by augmenting the inshore samples; the balanced interval sampling mechanism solves the problem of unbalanced positive and negative samples by dividing the IoU range into several intervals and sampling equally from each interval; the balanced feature pyramid network extracts features with stronger multi-scale detection capability through a feature enhancement method, solving the problem of unbalanced ship scale features; the balanced classification-regression network solves the problem of the unbalanced classification and regression tasks by designing two different sub-networks for the classification and regression tasks. The method overcomes the imbalance problems of the prior art and improves the accuracy of ship detection in SAR images.

Description

SAR image ship detection method based on balance learning
Technical Field
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation and relates to a balanced-learning-based SAR image ship detection method.
Background
Synthetic Aperture Radar (SAR) is an advanced active microwave sensor for high-resolution earth observation and remains a leading technology in ocean monitoring. It is widely applied in military and civil fields such as marine traffic control, disaster relief, and fishery management. Although optical and hyperspectral satellites provide some monitoring services, SAR, with its all-day, all-weather working capability, is better suited to the ocean's changeable weather. SAR is therefore an indispensable remote sensing tool for maritime situational awareness.
Ships are the most important participants in the ocean. Because of their great value in sunken-ship rescue, marine traffic control, fishery management, and related areas, they have attracted growing attention from scholars. Research on marine vessel surveillance has developed vigorously since the launch of the first SAR satellite, Seasat-1, by the United States. Moreover, the volume of data generated by the various SAR sensors now in operation is large, and intelligent detection of marine targets is urgently needed. Ship detection in SAR imagery has therefore become a research hotspot in high-resolution earth observation. For details, see "Wang Zaiyong, Chong Hao, Tian Jin. Research on fast ship target detection in SAR images [J]. Electronic Ship Engineering, 2016, 36(09):27-30+88."
In recent years, with the rapid rise of deep learning (DL), many scholars in the SAR community have begun to research DL-based detection methods. Compared with traditional feature-based methods, DL-based methods have the outstanding advantages of simplicity, full automation (i.e., no complex preliminary stages such as land-sea segmentation, coastline detection, and speckle correction), high speed, and high accuracy. Although their underlying principles are not yet fully understood, they can liberate productivity and greatly improve working efficiency, enabling a qualitative leap in the intelligent interpretation of SAR images. See "Du Lan, Wang Zhaocheng, Wang Yan, Wei Di, Li Lu. Survey of research progress on target detection and discrimination of single-channel SAR images in complex scenes [J]. Journal of Radars, 2020, 9(01):34-54."
However, existing deep-learning-based SAR ship detectors suffer from several imbalance problems that hinder further accuracy improvements. Specifically: 1) The image-sample scenes are unbalanced, i.e., the numbers of inshore and offshore image samples differ; simply stated, inshore vessels have far fewer samples than offshore vessels. 2) The positive and negative samples are unbalanced, i.e., the number of positive samples (ships) and negative samples (background) differ; there are far more negative samples than positive ones. 3) The ship scale features are unbalanced, i.e., the multi-scale ship features are unbalanced: because of differing spatial resolutions and ship classes, ship sizes vary widely. 4) The classification and regression tasks are unbalanced, i.e., the difficulty of ship classification and of ship-position regression differ, the latter being much harder than the former.
Therefore, to solve these imbalance problems, a balanced-learning-based SAR image ship detection method is proposed. The method comprises four mechanisms for overcoming the imbalances: a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification-regression network. Experimental results on the SSDD dataset show that the proposed method outperforms other deep-learning-based detection methods.
Disclosure of Invention
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation and discloses a balanced-learning-based ship detection method that addresses the problems of unbalanced image-sample scenes, unbalanced positive and negative samples, unbalanced ship scale features, and unbalanced classification and regression tasks in the prior art. The method is based on deep learning theory and mainly comprises a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network, and a balanced classification-regression network. The balanced scene learning mechanism solves the problem of unbalanced sample scenes by augmenting the inshore ship samples; the balanced interval sampling mechanism solves the problem of unbalanced positive and negative samples by dividing the IoU range into several intervals and sampling equally from each interval; the balanced feature pyramid network extracts features with stronger multi-scale detection capability through a feature enhancement method, solving the problem of unbalanced ship scale features; the balanced classification-regression network solves the problem of the unbalanced classification and regression tasks by designing two different sub-networks for the classification and regression tasks. Experiments show that on the SSDD dataset the detection accuracy of the proposed balanced-learning-based SAR image ship detection method is 95.25%, compared with 92.27% for an existing deep-learning-based SAR ship detection method; the proposed method thus improves ship detection accuracy.
For the convenience of describing the present invention, the following terms are first defined:
definition 1: SSDD data set acquisition method
The SSDD dataset is a SAR ship detection dataset whose full English name is SAR Ship Detection Dataset; SSDD was the first publicly released SAR ship detection dataset. It contains 1160 SAR images from the Sentinel-1, RadarSat-2, and TerraSAR-X sensors, each about 500 × 500 pixels. SSDD contains 2551 ships; the smallest occupies 28 pixels² and the largest 62878 pixels² (pixels² denotes the product of width and height in pixels). In SSDD, the images whose file-number suffix is 1 or 9 (232 samples) are chosen as the test set and the rest (928 samples) as the training set. For the acquisition of the SSDD dataset, see "Li Jianwei, Qu Changwen, Peng Shujuan, Deng Bing, et al. Ship target detection in SAR images based on convolutional neural networks [J]. Systems Engineering and Electronics, 2018, 40(09):1953-."
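The suffix-based train/test split described above can be sketched as follows; the zero-padded filename pattern and the helper name are hypothetical, assumed only for illustration:

```python
import re

def split_ssdd(filenames):
    """Split SSDD image names into a test set (file-number suffix 1 or 9)
    and a training set (everything else), as the SSDD convention above.

    Hypothetical helper: names are assumed to look like '000123.jpg', and
    the last digit of the numeric stem decides the split.
    """
    test, train = [], []
    for name in filenames:
        stem = re.sub(r"\.\w+$", "", name)  # drop the extension
        (test if stem[-1] in ("1", "9") else train).append(name)
    return train, test

# Toy stand-in for the 1160 SSDD filenames
names = [f"{i:06d}.jpg" for i in range(1, 21)]
train, test = split_ssdd(names)
```

On the real dataset this rule yields the 232/928 test/train partition quoted above.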
Definition 2: classic GAN network construction method
The classical generative adversarial network (GAN) is a deep learning model and one of the most promising methods in recent years for unsupervised learning on complex distributions. The model produces reasonably good output through the mutual game learning of its two modules: the generative model and the discriminative model. In the original GAN theory, G and D are not required to be neural networks, only functions that can fit the corresponding generation and discrimination; in practice, deep neural networks are generally used as G and D. A well-trained GAN can extract scene features quickly. The classical GAN construction method is described in "I. J. Goodfellow et al., Generative adversarial nets, International Conference on Neural Information Processing Systems, pp. 2672-2680, 2014."
Definition 3: classic K-means clustering algorithm
The classical K-means clustering algorithm is an iterative clustering analysis algorithm, often used for unsupervised classification. It divides the data into K groups: K objects are randomly selected as the initial cluster centers; the distance between each object and each cluster center is then calculated, and each object is assigned to the nearest center. A cluster center together with the objects assigned to it represents a cluster. After each assignment, the cluster center of each cluster is recomputed from the objects currently in the cluster. This process repeats until some termination condition is met. For details, see "Li Ting. Research on an improved K-means clustering algorithm [D]. Anhui University, 2015."
Definition 4: classical Adam algorithm
The classical Adam algorithm is an extension of stochastic gradient descent and has recently been widely used in deep learning applications in computer vision and natural language processing. Adam differs from classical stochastic gradient descent: stochastic gradient descent maintains a single learning rate for all weight updates, and this rate does not change during training, whereas Adam maintains a learning rate for each network weight and adapts it individually as learning progresses, computing adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. The classical Adam algorithm is detailed in "Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980."
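As a hedged illustration of the update rule Adam uses (a minimal NumPy sketch of one step, not the patent's training code), applied to minimizing f(x) = x²:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient's first
    and second moments, with bias correction (Kingma & Ba, 2014)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2, gradient 2x, starting from x = 1.0
x, m, v = np.array(1.0), 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.01)
```

The per-parameter scaling by the second-moment estimate is what distinguishes this from plain stochastic gradient descent.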
Definition 5: classical forward propagation method
The forward propagation method is the most basic method in deep learning: given the parameters and connection pattern of a network, it performs forward inference on the input to obtain the network's output. The forward propagation method is detailed in "https://www.jianshu.com/p/f30c8daebebeb".
Definition 6: classic residual error network construction method
The residual network is a convolutional neural network proposed by four scholars from Microsoft Research; it won the image classification and object detection tasks of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The residual network is easy to optimize and can improve accuracy by adding considerable depth. Its internal residual blocks use skip connections, which alleviate the vanishing-gradient problem caused by increasing depth in deep neural networks. The classical residual network construction method is described in detail in "K. He et al., Deep Residual Learning for Image Recognition, IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770-778."
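A minimal sketch of the skip connection that defines a residual block, with a plain linear map standing in for the convolutional branch (an illustrative simplification, not Res-50 itself):

```python
import numpy as np

def residual_block(x, weight):
    """Minimal residual unit: y = ReLU(F(x) + x). Here F is just a linear
    map standing in for the conv branch; the point is the identity
    shortcut that lets gradients bypass F."""
    fx = x @ weight                 # stand-in for the convolutional branch
    return np.maximum(fx + x, 0.0)  # add the shortcut, then ReLU

x = np.array([[1.0, -2.0]])
w = np.zeros((2, 2))  # zero branch => the block reduces to ReLU(x)
y = residual_block(x, w)
```

With a zero branch the block passes the input straight through the ReLU, which is why residual stacks remain easy to optimize even when very deep.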
Definition 7: conventional convolution kernel operation
A convolution kernel is a node that weights and then sums the values within a small rectangular region of an input feature map or picture, producing one output value. Each convolution kernel requires several manually specified parameters. One kind is the length and width of the node matrix processed by the kernel: the size of this node matrix is the size of the convolution kernel. The other kind is the depth of the unit node matrix produced by the processing: this depth is the depth of the convolution kernel. During the convolution operation, each kernel slides over the input data; at each position, the inner product of the whole kernel and the corresponding region of the input is computed and then passed through a nonlinear function to obtain the final result; the results at all positions form a two-dimensional feature map. Each convolution kernel produces one two-dimensional feature map, and the feature maps produced by multiple kernels are stacked to form a three-dimensional feature map. The traditional convolution kernel operation is detailed in "Fan Li, Zhao Hongwei, Zhao Yu, Hu Huangshui, Wang Xin. Survey of object detection research based on deep convolutional neural networks [J]. Optics and Precision Engineering, 2020, 28(05):1152-1164."
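The sliding inner-product operation described above can be sketched in NumPy; this is a generic valid-mode 2-D convolution (cross-correlation, as used in CNNs), not the patent's implementation:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation: slide the kernel over the input
    and take the inner product at each position."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3)) / 9.0  # 3x3 mean filter as an example kernel
out = conv2d(img, k)
```

A 4 × 4 input with a 3 × 3 kernel yields a 2 × 2 output, illustrating how valid-mode convolution shrinks the spatial size.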
Definition 8: conventional cascading operation
The cascade (concatenation) is an important operation in network structure design: it combines features, fusing the features extracted by several convolutional feature-extraction branches or fusing the information of output layers, thereby enhancing the feature extraction capability of the network. The cascade method is detailed in "https://blog.csdn.net/alxe_map/arrow/detail/80506051".
Definition 9: conventional upsampling operations
Upsampling is an operation that enlarges a picture or feature map, usually by interpolation: new elements are inserted between the pixels of the original image using a suitable interpolation algorithm. Among mainstream interpolation algorithms, nearest-neighbour interpolation is simple and easy to implement and was common in the early days, but it produces obvious jagged edges and mosaics in the new image. Bilinear interpolation has a smoothing effect and effectively overcomes the defects of the nearest-neighbour method, but it degrades the high-frequency parts of the image, blurring details. At higher magnification factors, higher-order interpolation, such as bicubic and cubic spline interpolation, works better than low-order interpolation: these algorithms let the interpolated gray values continue the continuity of the original image's gray-scale variation, so the gray scale of the enlarged image changes naturally and smoothly. In an image, however, the gray value changes abruptly between some pixels and their neighbours, i.e., there are gray-scale discontinuities; these pixels with abrupt gray-value changes are the edge pixels that describe object contours or texture. The classical upsampling operation is detailed in "https://blog.csdn.net/weixin_43960370/article/detail/106049708".
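As a minimal illustration of the simplest interpolation mentioned above, nearest-neighbour upsampling can be written as a pixel repeat in NumPy (a generic sketch, not the method's exact resizing code):

```python
import numpy as np

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling: repeat each pixel 'factor' times along
    both spatial axes."""
    return np.repeat(np.repeat(feat, factor, axis=0), factor, axis=1)

f = np.array([[1, 2],
              [3, 4]])
up = upsample_nearest(f, 2)
```

Each source pixel becomes a 2 × 2 block, which is exactly what produces the jagged edges the text describes.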
Definition 10: conventional pooling operations
The pooling operation (Pooling) is very common in CNNs. The pooling layer reduces the dimensionality of the data by imitating the human visual system; pooling is also commonly called subsampling or downsampling. When building a convolutional neural network, a pooling operation is often used after a convolutional layer to reduce the feature dimension of the convolutional layer's output, which effectively reduces network parameters and helps prevent overfitting. Classical pooling is described in detail at "https://www.zhihu.com/question/303215483/answer/615115629".
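A minimal NumPy sketch of the non-overlapping max-pooling variant used later in step 3.1 (generic, not the patent's code):

```python
import numpy as np

def max_pool(feat, k=2):
    """Non-overlapping k x k max pooling (stride = k): each output value is
    the maximum of one k x k block, halving each spatial dimension for k=2."""
    h, w = feat.shape[0] // k, feat.shape[1] // k
    return feat[:h * k, :w * k].reshape(h, k, w, k).max(axis=(1, 3))

f = np.arange(16).reshape(4, 4)
p = max_pool(f)
```

The reshape-then-reduce trick avoids explicit loops; only the block maxima survive, which is the downsampling the text describes.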
Definition 11: traditional regional recommendation network construction method
The region proposal network (RPN) is a sub-network in Faster R-CNN that extracts the regions of a picture where targets may exist. It is a fully convolutional network that takes as input the convolutional feature map output by the backbone network and outputs a target confidence score for each candidate box. The traditional region proposal network construction method is described in detail in "Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149."
Definition 12: conventional full link layer approach
The fully connected layer is a part of a convolutional neural network. Its input and output sizes are fixed, and each of its nodes is connected to all nodes of the previous layer; it is used to integrate the extracted features. The fully connected layer method is described in detail in "Haoren Wang, Haotian Shi, Ke Lin, Chengjin Qin, Liqun Zhao, Yixiang Huang, Chengliang Liu. A high-precision arrhythmia classification method based on dual fully connected neural network [J]. Biomedical Signal Processing and Control, 2020, 58."
Definition 13: conventional non-maxima suppression method
The non-maximum suppression (NMS) method is an algorithm used in the field of object detection to remove redundant detection boxes. In the forward-propagation result of a classical detection network, the same target often corresponds to several detection boxes, so an algorithm is needed to select, among the boxes for one target, the box of best quality and highest score. Non-maximum suppression performs a local maximum search using an overlap-rate (IoU) threshold. Non-maximum suppression methods are detailed at "https://www.cnblogs.com/makefile/p/nms."
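Greedy NMS as described above can be sketched as follows; the [x1, y1, x2, y2] box format and the 0.5 IoU threshold are illustrative assumptions:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box, drop
    every remaining box whose IoU with it exceeds the threshold, repeat."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]       # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]  # survivors only
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
keep = nms(boxes, scores)
```

The second box heavily overlaps the first and is suppressed; the distant third box survives.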
Definition 14: traditional recall ratio and accuracy calculation method
The recall R is the proportion of all positive samples that are correctly predicted:

R = TP / (TP + FN)

The precision P is the proportion of the results predicted as positive that are correct:

P = TP / (TP + FP)

where TP (true positive) denotes a positive sample predicted as positive by the model, FN (false negative) denotes a positive sample predicted as negative by the model, and FP (false positive) denotes a negative sample predicted as positive by the model. The traditional precision-recall curve P(R) is the function with R as the independent variable and P as the dependent variable. For the calculation of these quantities, see "Li Hang. Statistical Learning Methods [M]. Beijing: Tsinghua University Press, 2012."
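The two measures can be computed directly from the TP/FP/FN counts; the example numbers below are hypothetical:

```python
def precision_recall(tp, fp, fn):
    """Precision P = TP / (TP + FP); recall R = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# e.g. 90 correctly detected ships, 10 false alarms, 30 missed ships
p, r = precision_recall(tp=90, fp=10, fn=30)
```

Sweeping the detector's score threshold changes TP/FP/FN and traces out the P(R) curve mentioned above.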
The invention discloses a ship detection method based on balance learning, which comprises the following steps:
step 1, initializing SSDD data set
Randomly shuffle the order of the SAR images in the SSDD dataset to obtain a new SSDD dataset.
Step 2, carrying out scene augmentation by utilizing a balanced scene learning mechanism
Step 2.1, extracting SSDD data set characteristics by using GAN network
Using the classical GAN network construction method of Definition 2, build a generative adversarial network GAN0. Taking the new SSDD data obtained in step 1 as input, train and optimize GAN0 with the classical Adam algorithm of Definition 4; the trained and optimized generative adversarial network is denoted GAN.
Then, taking the new SSDD data obtained in step 1 as input again, feed it into the trained and optimized generative adversarial network GAN according to the classical forward propagation method of Definition 5, obtaining the network's output vectors M = {M1, M2, …, Mi, …, M1160}, where Mi is the output vector of the i-th picture in the new SSDD data.
Define the set of output vectors M as the scene features of all pictures in the new SSDD dataset, and Mi as the scene feature of the i-th picture in the new SSDD dataset.
Step 2.2, clustering scenes
Taking the set M of scene features of all pictures in the new SSDD data obtained in the step 2.1 as input, adopting a traditional K-means clustering algorithm in definition 3, and clustering the pictures in the new SSDD data set by means of the scene features M:
step 2.3, initializing parameters
For the centroid parameters of the traditional K-means clustering algorithm in Definition 3, randomly initialize the two centroids of the first iteration, denoted C_1^(1) and C_2^(1). Define the current iteration number as t, t = 1, 2, …, I, where I is the maximum number of iterations of the K-means algorithm, initialized as I = 1000. Denote the centroid parameters of the t-th iteration as C_1^(t) and C_2^(t). Initialize an iteration convergence error ε as one of the algorithm's convergence conditions.
Step 2.4, carrying out iterative operation
First, using the formula

d_{i,1}^(1) = || M_i − C_1^(1) ||_2

calculate the distance from the scene feature M_i of the i-th picture to the first centroid C_1^(1) of the 1st iteration, denoted d_{i,1}^(1). Using the formula

d_{i,2}^(1) = || M_i − C_2^(1) ||_2

calculate the distance from M_i to the second centroid C_2^(1) of the 1st iteration, denoted d_{i,2}^(1).

Compare d_{i,1}^(1) and d_{i,2}^(1): if d_{i,1}^(1) > d_{i,2}^(1), then the scene feature M_i of the i-th picture belongs to the second class in the 1st iteration; otherwise it belongs to the first class.

Define: after the 1st iteration, the set of all scene features of the first class is S_1^(1), and the set of all scene features of the second class is S_2^(1).
Then let t be 2, perform the following until convergence:
1) Let the centroid parameter C_1^(t) of step t be the arithmetic mean of the set S_1^(t−1), and the centroid parameter C_2^(t) be the arithmetic mean of the set S_2^(t−1).

2) Using the formula

d_{i,1}^(t) = || M_i − C_1^(t) ||_2

calculate the distance from the scene feature M_i of the i-th picture to the first centroid C_1^(t) of the t-th iteration, denoted d_{i,1}^(t). Using the formula

d_{i,2}^(t) = || M_i − C_2^(t) ||_2

calculate the distance from M_i to the second centroid C_2^(t) of the t-th iteration, denoted d_{i,2}^(t).

3) Compare d_{i,1}^(t) and d_{i,2}^(t): if d_{i,1}^(t) > d_{i,2}^(t), then M_i belongs to the second class in the t-th iteration; otherwise it belongs to the first class. Define: after the t-th iteration, the set of all scene features of the first class is S_1^(t), and the set of all scene features of the second class is S_2^(t). Output the clustering result, denoted CLASS.
4) Calculate the change of the centroid parameters between this iteration and the previous one, denoted σ, whose expression is

σ = || C_1^(t) − C_1^(t−1) ||_2 + || C_2^(t) − C_2^(t−1) ||_2

If σ < ε or t ≥ I, output the clustering result CLASS; otherwise let t = t + 1 and return to step 1) to continue iterating.
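The two-class K-means iteration of steps 2.3-2.4 can be sketched as follows; for reproducibility this sketch initializes the centroids deterministically rather than randomly, and the toy feature blobs are hypothetical:

```python
import numpy as np

def kmeans_2(features, max_iter=1000, eps=1e-6):
    """Two-class K-means following steps 2.3-2.4: assign each scene feature
    M_i to the nearer centroid (Euclidean distance), recompute each centroid
    as the arithmetic mean of its class, and stop when the centroid shift
    falls below eps or max_iter is reached.

    The patent text initializes the centroids randomly; this sketch uses the
    first and last samples so the result is reproducible."""
    c = features[[0, -1]].astype(float)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(max_iter):
        d = np.linalg.norm(features[:, None, :] - c[None, :, :], axis=2)
        labels = d.argmin(axis=1)                       # nearest centroid
        new_c = np.array([features[labels == k].mean(axis=0) for k in (0, 1)])
        if np.linalg.norm(new_c - c) < eps:             # centroid shift sigma
            break
        c = new_c
    return labels, c

# Two well-separated blobs stand in for the two scene classes
feats = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels, centroids = kmeans_2(feats)
```

With separated blobs the assignment stabilizes after one pass, and the labels split cleanly into the two scene classes.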
Step 2.5, carrying out scene amplification
According to the clustering result CLASS obtained in step 2.4, divide all pictures in the new SSDD data into two classes: the first class, inshore scene pictures, denoted Data1, and the second class, offshore scene pictures, denoted Data2. Define: the number of pictures in Data1 is N1, and the number of pictures in Data2 is N2.
If N2 > N1, randomly select, based on a Gaussian distribution, N2 − N1 pictures from the first class (the inshore scene pictures Data1) and apply the traditional mirror operation to them, obtaining N2 − N1 mirrored pictures, denoted Data1extra. Then merge the N2 − N1 mirrored pictures Data1extra with the inshore scene pictures Data1 and output the new picture set, denoted Data1new. Define Data2new = Data2.
If N2 ≤ N1, randomly select, based on a Gaussian distribution, N1 − N2 pictures from the second class (the offshore scene pictures Data2) and apply the traditional mirror operation to them, obtaining N1 − N2 mirrored pictures, denoted Data2extra. Then merge the N1 − N2 mirrored pictures Data2extra with the offshore scene pictures Data2 and output the new picture set, denoted Data2new. Define Data1new = Data1.
Defining a new set of pictures Datanew={Data1new,Data2new}。
Divide Datanew into two parts at a 7:3 ratio to obtain a training set, denoted Train, and a test set, denoted Test.
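The scene-balancing idea of step 2.5 can be sketched as follows; as a simplification, this sketch draws the pictures to mirror uniformly at random rather than from a Gaussian distribution, and represents images as nested lists of pixel rows:

```python
import random

def balance_scenes(inshore, offshore, seed=0):
    """Balanced scene learning, as in step 2.5: mirror randomly chosen
    images of the minority scene class until both classes have the same
    number of samples. Uniform random choice stands in for the Gaussian
    draw described in the text."""
    rng = random.Random(seed)
    small, large = sorted([inshore, offshore], key=len)
    extra = [[row[::-1] for row in rng.choice(small)]  # horizontal mirror
             for _ in range(len(large) - len(small))]
    return small + extra, large

inshore = [[[1, 2], [3, 4]]] * 3   # 3 inshore images (2x2 pixels each)
offshore = [[[0, 0], [0, 0]]] * 5  # 5 offshore images
balanced_small, balanced_large = balance_scenes(inshore, offshore)
```

After augmentation both scene classes contain five images, and each added image is a left-right mirror of an original minority-class image.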
Step 3, building a forward propagation network
Step 3.1, building a balanced feature pyramid network
Using the classical residual network construction method of Definition 6, build a residual network with 50 layers, denoted Res-50. According to feature-map size, from largest to smallest, denote the feature maps of different sizes produced by the last layer of each stage of Res-50 as F1, F2, F3, F4, F5.
F5 is also denoted P5.
Using the traditional convolution kernel operation of Definition 7, perform feature extraction on F4 with a 1 × 1 convolution kernel; denote the result E4.
Using the traditional upsampling operation of Definition 9, upsample P5 so that its feature size matches that of F4; denote the result U5.
Using the traditional cascade operation of Definition 8, superpose E4 and U5; denote the result P4.
Using the traditional convolution kernel operation of Definition 7, perform feature extraction on F3 with a 1 × 1 convolution kernel; denote the result E3.
Using the traditional upsampling operation of Definition 9, upsample P4 so that its feature size matches that of F3; denote the result U4.
Using the traditional cascade operation of Definition 8, superpose E3 and U4; denote the result P3.
Using the traditional convolution kernel operation of Definition 7, perform feature extraction on F2 with a 1 × 1 convolution kernel; denote the result E2.
Using the traditional upsampling operation of Definition 9, upsample P3 so that its feature size matches that of F2; denote the result U3.
Using the traditional cascade operation of Definition 8, superpose E2 and U3; denote the result P2.
Using the traditional convolution kernel operation of Definition 7, perform feature extraction on F1 with a 1 × 1 convolution kernel; denote the result E1.
Using the traditional upsampling operation of Definition 9, upsample P2 so that its feature size matches that of F1; denote the result U2.
Using the cascade operation of Definition 8, superpose E1 and U2; denote the result P1.
Using the conventional upsampling operation in Definition 9, upsample P5 so that its feature size matches that of P3, and denote the result H5.
Using the conventional upsampling operation in Definition 9, upsample P4 so that its feature size matches that of P3, and denote the result H4.
P3 is also denoted H3.
Using the conventional pooling operation in Definition 10, max-pool P2 so that its feature size matches that of P3, and denote the result H2.
Using the conventional pooling operation in Definition 10, max-pool P1 so that its feature size matches that of P3, and denote the result H1.
From H1, H2, H3, H4, H5, the feature map I is computed by the formula

I(i, j) = (1/5) · Σ_{k=1}^{5} H_k(i, j)

where k is the index of H and (i, j) is the spatial sampling position on the feature map.
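The integration step can be illustrated with a small NumPy sketch (assuming, as in the text, that H1 through H5 have already been resampled to P3's resolution; all names are illustrative):

```python
import numpy as np

def integrate_levels(features):
    """Average multi-level feature maps {H_k} into a single balanced map I,
    following I(i, j) = (1/L) * sum_k H_k(i, j).  All maps are assumed to
    share one spatial size (here H3's) before averaging."""
    stacked = np.stack(features, axis=0)   # (L, C, H, W)
    return stacked.mean(axis=0)            # balanced feature map I

# five toy levels already resized to a 16x16 resolution with 4 channels
H = [np.full((4, 16, 16), float(k)) for k in range(1, 6)]
I_map = integrate_levels(H)   # every element is (1+2+3+4+5)/5 = 3.0
```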
Taking the feature map I as input, the feature map O is computed by the formula

O_i = (1 / C(I)) · Σ_j f(I_i, I_j) · g(I_j)

where I_i is the feature at the i-th position of the feature map I, O_i is the feature at the i-th position of the feature map O, and C(I) = Σ_j f(I_i, I_j) is the normalization factor. f(I_i, I_j) computes the similarity between I_i and I_j and is expressed as

f(I_i, I_j) = e^{θ(I_i)^T · φ(I_j)}

where θ(I_i) = W_θ·I_i and φ(I_j) = W_φ·I_j; W_θ and W_φ are matrices learned by the 1×1 convolution operation in Definition 7; g(I_j) = W_g·I_j, where W_g is a matrix learned by the 1×1 convolution operation in Definition 7.
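This refinement has the form of an embedded-Gaussian non-local operation; a minimal NumPy sketch, with the W matrices standing in for the learned 1×1 convolutions and the feature map flattened to N spatial positions (all names illustrative):

```python
import numpy as np

def nonlocal_refine(feat, W_theta, W_phi, W_g):
    """Embedded-Gaussian non-local refinement on a flattened map.
    feat: (N, C) features at N spatial positions; each W: (C, C)."""
    theta = feat @ W_theta.T               # theta(I_i) = W_theta I_i
    phi = feat @ W_phi.T                   # phi(I_j)   = W_phi   I_j
    g = feat @ W_g.T                       # g(I_j)     = W_g     I_j
    f = np.exp(theta @ phi.T)              # f(I_i, I_j) = e^{theta^T phi}
    C = f.sum(axis=1, keepdims=True)       # normalization factor C(I)
    return (f / C) @ g                     # O_i = (1/C(I)) sum_j f * g(I_j)

rng = np.random.default_rng(0)
feat = rng.normal(size=(9, 4)) * 0.1       # a 3x3 map with 4 channels, flattened
W_theta, W_phi, W_g = (rng.normal(size=(4, 4)) * 0.1 for _ in range(3))
O = nonlocal_refine(feat, W_theta, W_phi, W_g)
```

Each output position is a similarity-weighted mixture of all positions, which is what lets the refined map relate distant ship features.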
After all the network operations in step 3.1 are completed, the balanced feature pyramid network is obtained, denoted Backbone.
Step 3.2, building a region proposal network
Adopting the traditional region proposal network construction method in Definition 11, take the Backbone obtained in step 3.1 as the feature extraction layer and construct a region proposal network, denoted RPN0.
Step 3.3, building a balanced classification regression network
Construct fully connected layers FC1 and FC2 using the traditional fully connected layer method in Definition 12, take the output of FC1 as the input of FC2, and use FC1 and FC2 together as the classification head, denoted Clhead.
Construct four convolutional layers, Conv1, Conv2, Conv3 and Conv4, using the traditional convolution kernel method in Definition 7; at the same time, construct a pooling layer using the conventional pooling operation in Definition 10, denoted Pooling. Take the output of Conv1 as the input of Conv2, the output of Conv2 as the input of Conv3, the output of Conv3 as the input of Conv4, and the output of Conv4 as the input of Pooling. Use Conv1, Conv2, Conv3, Conv4 and Pooling together as the regression head, denoted Rehead. The classification head Clhead and the regression head Rehead take the same feature map as input and, together with the Backbone, form the balanced classification regression network, denoted BCRN0.
Step 4, training the region proposal network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 4.1, forward propagation through the region proposal network
Take the training set Train of the augmented data set Datanew obtained in step 2 as the input of the region proposal network RPN0, feed Train into RPN0 using the conventional forward propagation method in Definition 5, and record the output of RPN0 as Result0.
Step 4.2, carrying out balance interval sampling on the forward propagation result
Take Result0 obtained in step 4.1 and the training set Train as input, and compute the IOU value of each proposal box in Result0 by the formula

IOU = area(B ∩ G) / area(B ∪ G)

where B is a proposal box and G is its ground-truth box. Take the outputs in Result0 whose IOU is greater than 0.5 as positive samples, denoted Result0p, and the outputs whose IOU is less than 0.5 as negative samples, denoted Result0n. Count the total number of samples in the negative samples Result0n, denoted M. Manually input the number of required negative samples, denoted N, and the number of intervals into which the IOU range is evenly divided, denoted nb; the number of samples in the i-th IOU interval is Mi. Set the random sampling probability of the i-th interval to

pi = N / (nb · Mi)

randomly sample each IOU interval accordingly, and record the union of the sampling results over all IOU intervals of the negative samples as Result0ns.
Count the number of samples in the positive samples Result0p, denoted P. Set the random sampling probability to

p = N / P

randomly sample Result0p, and record the positive sampling result as Result0ps.
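The negative-sample side of this step can be sketched as interval-balanced sampling, assuming each sample in interval i is kept with probability p_i = N/(n_b·M_i) (a hypothetical illustration on uniform toy IoUs, not the patent's data):

```python
import numpy as np

def balanced_interval_sample(ious, N, n_b, rng):
    """Sample negatives evenly across n_b equal IoU bins in [0, 0.5):
    each sample in bin i is kept with probability p_i = N / (n_b * M_i),
    so plentiful easy (near-zero-IoU) negatives no longer drown out the
    scarce hard ones."""
    ious = np.asarray(ious)
    edges = np.linspace(0.0, 0.5, n_b + 1)
    picked = []
    for i in range(n_b):
        in_bin = np.where((ious >= edges[i]) & (ious < edges[i + 1]))[0]
        M_i = len(in_bin)
        if M_i == 0:
            continue
        p_i = min(1.0, N / (n_b * M_i))     # per-sample keep probability
        picked.extend(in_bin[rng.random(M_i) < p_i].tolist())
    return picked

rng = np.random.default_rng(0)
negative_ious = rng.uniform(0.0, 0.5, size=1000)   # toy IoUs of 1000 negatives
chosen = balanced_interval_sample(negative_ious, N=100, n_b=5, rng=rng)
```

Each bin contributes roughly N/n_b samples in expectation, regardless of how unevenly the negatives are distributed over the IoU range.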
Step 4.3, training and optimizing the region proposal network
Take the positive sampling result Result0ps and the negative sampling result Result0ns obtained in step 4.2 as input, and train and optimize the region proposal network using the classical Adam algorithm in Definition 4, obtaining the trained and optimized region proposal network RPN1.
Step 5, training the balance classification regression network
Step 5.1, forward propagation through the balanced classification regression network
Take the training set Train of the augmented data set Datanew obtained in step 2 as the input of the balanced classification regression network BCRN0, feed Train into BCRN0 using the traditional forward propagation method in Definition 5, and record the output of BCRN0 as Result1.
Step 5.2, training and optimizing the balanced classification regression network
Take the output Result1 of the balanced classification regression network BCRN0 obtained in step 5.1 as input, and train and optimize the balanced classification regression network using the classical Adam algorithm in Definition 4, obtaining the trained and optimized balanced classification regression network BCRN1.
Step 6, alternate training is carried out
Determine whether the epoch set in step 4 equals 12. If epoch is not equal to 12, let epoch = epoch + 1, RPN0 = RPN1, BCRN0 = BCRN1, repeat step 4.1, step 4.2, step 4.3, step 5.1 and step 5.2 in order, and then return to step 6 to judge epoch again. If epoch equals 12, denote the trained region proposal network RPN1 together with the trained balanced classification regression network BCRN1 as the network BL-Net, and go to step 7.
Step 7, evaluation method
Step 7.1, Forward propagation
Take the network BL-Net obtained in step 6 and the test set Test obtained in step 2.5 as input, and obtain the detection result, denoted R, using the traditional forward propagation method in Definition 5.
Take the detection result R as input and remove redundant boxes in R using the conventional non-maximum suppression method in Definition 13, as follows:
Step (1): mark the box with the highest score in R as BS.
Step (2): compute the overlap ratio IoU between BS and each remaining box by the formula

IoU = area(B ∩ BS) / area(B ∪ BS)

and discard the boxes with IoU > 0.5.
Step (3): among the remaining boxes, select the box with the highest score as the new BS.
Step (4): repeat the IoU computation and discarding of step (2) until no box can be discarded; the remaining boxes are the final detection result, denoted RF.
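The suppression loop above can be sketched as a minimal greedy implementation with toy boxes in (x1, y1, x2, y2) form (names and data are illustrative):

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box BS, drop every box whose
    IoU with BS exceeds thresh, and repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)   # boxes 0 and 1 overlap heavily; box 1 is suppressed
```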
Step 7.2, calculating indexes
Take the detection result RF obtained in step 7.1 as input, and compute the network's precision P, recall R and precision-recall curve P(R) using the traditional recall and precision calculation method in Definition 14.
Using the formula

mAP = ∫_0^1 P(R) dR

calculate the average detection precision mAP of the balanced-learning-based SAR ship detection.
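The integral of the precision-recall curve can be approximated numerically, for example with the trapezoidal rule over sampled (R, P(R)) points (a sketch on toy values; starting the curve at P = 1 for R = 0 is a common convention assumed here, not taken from the text):

```python
def average_precision(recalls, precisions):
    """Trapezoidal approximation of AP = integral of P(R) dR over [0, 1],
    given (recall, precision) points sorted by increasing recall."""
    ap, prev_r, prev_p = 0.0, 0.0, 1.0    # assumed start point (R, P) = (0, 1)
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * (p + prev_p) / 2.0   # trapezoid on [prev_r, r]
        prev_r, prev_p = r, p
    return ap

# toy curve: precision falls to 0.8 at R = 0.5 and 0.6 at R = 1.0
ap = average_precision([0.5, 1.0], [0.8, 0.6])   # 0.45 + 0.35 = 0.8
```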
The innovation of the invention is the introduction of four balanced learning methods, namely a balanced scene learning mechanism, a balanced interval sampling mechanism, a balanced feature pyramid network and a balanced classification regression network, which solve four imbalance problems in existing deep-learning-based SAR ship detection methods: image sample scene imbalance, positive/negative sample imbalance, ship scale feature imbalance and classification/regression task imbalance. With this method, the SAR image ship detection mAP is 95.25%, exceeding the suboptimal SAR image ship detector by 3 percentage points; the inshore ship detection mAP is 84.79%, exceeding the suboptimal detector by 10 percentage points; the offshore ship detection mAP is 99.62%, exceeding the suboptimal detector by 0.5 percentage points.
The method has the advantage of overcoming the imbalance problems in the prior art and improving ship detection accuracy in SAR images.
Drawings
Fig. 1 is a schematic flow chart of a SAR image ship detection method based on balance learning in the present invention.
Fig. 2 is a schematic diagram of a balance classification regression network in the SAR image ship detection method for balance learning in the present invention.
Fig. 3 shows the detection accuracy of the SAR image ship detection method based on balance learning in the present invention.
Detailed Description
The invention is described in further detail below with reference to fig. 1,2 and 3.
Step 1, initializing a data set
The order of the SAR images in the SSDD data set is adjusted by a random method to obtain a new SSDD data set.
Step 2, carrying out scene augmentation by utilizing a balanced scene learning mechanism
Step 2.1, extracting SSDD data set characteristics by using GAN network
As shown in fig. 1, according to the classic GAN network construction method in Definition 2, a generative adversarial network GAN0 is constructed. Taking the new SSDD data obtained in step 1 as input, GAN0 is trained and optimized according to the classical Adam algorithm in Definition 4, and the trained and optimized generative adversarial network is denoted GAN.
Then, taking the new SSDD data obtained in step 1 as input again, according to the conventional forward propagation method in Definition 5, the new SSDD data are fed into the trained and optimized generative adversarial network GAN, obtaining the network's output vector M = {M1, M2, … Mi, … M1160}, where Mi is the output vector of the i-th picture in the new SSDD data.
The output vector M is defined as the scene features of all pictures in the new SSDD data set, and Mi as the scene feature of the i-th picture in the new SSDD data set.
Step 2.2, clustering scenes
Taking the set M of scene features of all pictures in the new SSDD data obtained in the step 2.1 as input, adopting a traditional K-means clustering algorithm in definition 3, and clustering the pictures in the new SSDD data set by means of the scene features M:
step 2.3, initializing parameters
For the centroid parameters of the traditional K-means clustering algorithm in Definition 3, randomly initialize the centroid parameters of the first iteration, denoted μ1^(1) and μ2^(1). Define the current iteration number as t, t = 1, 2, …, I, where I is the maximum number of iterations of the K-means clustering algorithm, initialized to I = 1000. Define the centroid parameters of the t-th iteration as μ1^(t) and μ2^(t). Initialize the iteration convergence error ε as one of the iterative convergence conditions of the algorithm.
Step 2.4, carrying out iterative operation
First, by the formula

d_i1^(1) = ||Mi − μ1^(1)||

compute the distance from the scene feature Mi of the i-th picture to the first centroid μ1^(1) in the 1st iteration, denoted d_i1^(1). By the formula

d_i2^(1) = ||Mi − μ2^(1)||

compute the distance from the scene feature Mi of the i-th picture to the second centroid μ2^(1) in the 1st iteration, denoted d_i2^(1).
Compare d_i1^(1) and d_i2^(1): if d_i1^(1) > d_i2^(1), the scene feature Mi of the i-th picture belongs to the second class in the 1st iteration; otherwise it belongs to the first class.
After the 1st iteration, the set of all scene features of the first class is denoted S1^(1), and the set of all scene features of the second class is denoted S2^(1).
Then let t be 2, perform the following until convergence:
1) Let the centroid parameter μ1^(t) of the t-th step be the arithmetic mean of the set S1^(t−1), and let the centroid parameter μ2^(t) of the t-th step be the arithmetic mean of the set S2^(t−1).
2) By the formula

d_i1^(t) = ||Mi − μ1^(t)||

compute the distance from the scene feature Mi of the i-th picture to the first centroid μ1^(t) in the t-th iteration, denoted d_i1^(t). By the formula

d_i2^(t) = ||Mi − μ2^(t)||

compute the distance from the scene feature Mi of the i-th picture to the second centroid μ2^(t) in the t-th iteration, denoted d_i2^(t).
3) Compare d_i1^(t) and d_i2^(t): if d_i1^(t) > d_i2^(t), the scene feature Mi of the i-th picture belongs to the second class in the t-th iteration; otherwise it belongs to the first class. The set of all scene features of the first class is denoted S1^(t), and the set of all scene features of the second class is denoted S2^(t). Output the clustering result, denoted CLASS.
4) Compute the change of the centroid parameters between this iteration and the previous one, denoted σ, with the expression

σ = ||μ1^(t) − μ1^(t−1)|| + ||μ2^(t) − μ2^(t−1)||

If σ < ε or t ≥ I, output the clustering result CLASS; otherwise let t = t + 1 and return to step 1) to continue the iteration.
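The two-centroid clustering loop of steps 2.3 and 2.4 can be sketched as follows (a minimal NumPy version on toy scene features; initialization from data points and the empty-cluster guard are implementation assumptions, not from the text):

```python
import numpy as np

def kmeans_two_class(M, I=1000, eps=1e-6, seed=0):
    """Two-centroid K-means over scene features M of shape (n, d):
    nearest-centroid assignment, arithmetic-mean centroid update, stop
    when the centroid change sigma falls below eps or after I steps."""
    rng = np.random.default_rng(seed)
    mu = M[rng.choice(len(M), size=2, replace=False)].astype(float)
    labels = np.zeros(len(M), dtype=int)
    for _ in range(I):
        d = np.linalg.norm(M[:, None, :] - mu[None, :, :], axis=2)  # (n, 2)
        labels = d.argmin(axis=1)            # 0 -> first class, 1 -> second
        new_mu = np.array([M[labels == k].mean(axis=0) if np.any(labels == k)
                           else mu[k] for k in (0, 1)])
        sigma = np.linalg.norm(new_mu - mu)  # change of the centroids
        mu = new_mu
        if sigma < eps:                      # convergence condition
            break
    return labels, mu

# two well-separated toy "scene feature" blobs of 20 and 30 pictures
rng = np.random.default_rng(1)
M = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (30, 3))])
labels, mu = kmeans_two_class(M)
```

On separated blobs like these, the assignment stabilizes after a few iterations and each blob ends up in one cluster.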
Step 2.5, carrying out scene amplification
According to the clustering result CLASS obtained in step 2.4, divide all pictures in the new SSDD data into two classes: the first class, the inshore scene pictures, denoted Data1, and the second class, the offshore scene pictures, denoted Data2. Define the number of pictures in Data1 as N1 and the number of pictures in Data2 as N2.
If N2 > N1, then from the first class, the inshore scene pictures Data1, randomly select N2−N1 pictures based on a Gaussian distribution and apply the mirror operation to them, obtaining N2−N1 mirrored pictures, denoted Data1extra. Then merge the N2−N1 mirrored pictures Data1extra with the inshore scene pictures Data1 and output a new picture set, denoted Data1new. Define Data2new = Data2.
If N2 ≤ N1, then from the second class, the offshore scene pictures Data2, randomly select N1−N2 pictures based on a Gaussian distribution and apply the mirror operation to them, obtaining N1−N2 mirrored pictures, denoted Data2extra. Then merge the N1−N2 mirrored pictures Data2extra with the offshore scene pictures Data2 and output a new picture set, denoted Data2new. Define Data1new = Data1.
Define a new picture set Datanew = {Data1new, Data2new}.
Divide Datanew into two parts in a 7:3 ratio to obtain a training set and a test set; the training set is denoted Train and the test set is denoted Test.
Step 3, building a forward propagation network
Step 3.1, building a balanced feature pyramid network
As shown in fig. 1, the classical residual network construction method in Definition 6 is adopted to construct a residual network with 50 layers, denoted Res-50. The feature maps of different sizes generated by the last network layers of Res-50 are denoted, from large to small by feature-map size, as F1, F2, F3, F4, F5.
F5 is also denoted P5.
Following the convolution operation in Definition 7, perform feature extraction on F4 with a 1×1 convolution kernel, denoting the result E4; following the upsampling operation in Definition 9, upsample P5 so that its feature size matches that of F4, denoting the result U5; following the concatenation operation in Definition 8, superimpose E4 and U5, denoting the result P4.
Following the convolution operation in Definition 7, perform feature extraction on F3 with a 1×1 convolution kernel, denoting the result E3; following the upsampling operation in Definition 9, upsample P4 so that its feature size matches that of F3, denoting the result U4; following the concatenation operation in Definition 8, superimpose E3 and U4, denoting the result P3.
Following the convolution operation in Definition 7, perform feature extraction on F2 with a 1×1 convolution kernel, denoting the result E2; following the upsampling operation in Definition 9, upsample P3 so that its feature size matches that of F2, denoting the result U3; following the concatenation operation in Definition 8, superimpose E2 and U3, denoting the result P2.
Following the convolution operation in Definition 7, perform feature extraction on F1 with a 1×1 convolution kernel, denoting the result E1; following the upsampling operation in Definition 9, upsample P2 so that its feature size matches that of F1, denoting the result U2; following the concatenation operation in Definition 8, superimpose E1 and U2, denoting the result P1.
Following the upsampling operation in Definition 9, upsample P5 so that its feature size matches that of P3, denoting the result H5.
Following the upsampling operation in Definition 9, upsample P4 so that its feature size matches that of P3, denoting the result H4.
P3 is also denoted H3.
Following the pooling operation in Definition 10, max-pool P2 so that its feature size matches that of P3, denoting the result H2.
Following the pooling operation in Definition 10, max-pool P1 so that its feature size matches that of P3, denoting the result H1.
From H1, H2, H3, H4, H5, the feature map I is computed according to the formula

I(i, j) = (1/5) · Σ_{k=1}^{5} H_k(i, j)

where k is the index of H and (i, j) is the spatial sampling position on the feature map.
Taking the feature map I as input, according to the formula

O_i = (1 / C(I)) · Σ_j f(I_i, I_j) · g(I_j)

the feature map O is computed, where I_i is the feature at the i-th position of the feature map I, O_i is the feature at the i-th position of the feature map O, and C(I) = Σ_j f(I_i, I_j) is the normalization factor. f(I_i, I_j) computes the similarity between I_i and I_j and is expressed as

f(I_i, I_j) = e^{θ(I_i)^T · φ(I_j)}

where θ(I_i) = W_θ·I_i and φ(I_j) = W_φ·I_j; W_θ and W_φ are matrices learned by the 1×1 convolution operation in Definition 7; g(I_j) = W_g·I_j, where W_g is a matrix learned by the 1×1 convolution operation in Definition 7.
All the network operations in step 3.1 together constitute the balanced feature pyramid network, denoted Backbone.
Step 3.2, building a region proposal network
According to the region proposal network construction method in Definition 11, the Backbone obtained in step 3.1 is taken as the feature extraction layer to construct a region proposal network, denoted RPN0.
Step 3.3, building a balanced classification regression network
As shown in fig. 2, the balanced classification regression network is divided into two parts, a classification head Clhead and a regression head Rehead. Fully connected layers FC1 and FC2 are constructed according to the conventional fully connected layer method in Definition 12; the output of FC1 is taken as the input of FC2, and FC1 and FC2 together serve as the classification head, denoted Clhead. Four convolutional layers, Conv1, Conv2, Conv3 and Conv4, are constructed according to the convolution kernel method in Definition 7; at the same time, a pooling layer is constructed according to the pooling operation in Definition 10, denoted Pooling. The output of Conv1 is taken as the input of Conv2, the output of Conv2 as the input of Conv3, the output of Conv3 as the input of Conv4, and the output of Conv4 as the input of Pooling. Conv1, Conv2, Conv3, Conv4 and Pooling together serve as the regression head, denoted Rehead. The classification head Clhead and the regression head Rehead take the same feature map as input and, together with the Backbone, form the balanced classification regression network, denoted BCRN0.
Step 4, training the region proposal network
An iteration parameter epoch is set, and an initial epoch value is 1.
Step 4.1, forward propagation through the region proposal network
The training set Train of the augmented data set Datanew obtained in step 2 is taken as the input of the region proposal network RPN0; according to the forward propagation method in Definition 5, Train is fed into RPN0, and the output of RPN0 is recorded as Result0.
Step 4.2, carrying out balance interval sampling on the forward propagation result
Taking Result0 obtained in step 4.1 and the training set Train as input, the IOU value of each proposal box in Result0 is computed according to the formula

IOU = area(B ∩ G) / area(B ∪ G)

where B is a proposal box and G is its ground-truth box. The outputs in Result0 whose IOU is greater than 0.5 are taken as positive samples, denoted Result0p, and the outputs whose IOU is less than 0.5 as negative samples, denoted Result0n. The total number of samples in the negative samples Result0n is counted, denoted M. The number of required negative samples is manually input, denoted N, together with the number of intervals into which the IOU range is evenly divided, denoted nb; the number of samples in the i-th IOU interval is Mi. The random sampling probability of the i-th interval is set to

pi = N / (nb · Mi)

each IOU interval is randomly sampled accordingly, and the union of the sampling results over all IOU intervals of the negative samples is recorded as Result0ns.
The number of samples in the positive samples Result0p is counted, denoted P. The random sampling probability is set to

p = N / P

Result0p is randomly sampled, and the positive sampling result is recorded as Result0ps.
Step 4.3, training and optimizing the region proposal network
The positive sampling result Result0ps and the negative sampling result Result0ns obtained in step 4.2 are taken as input, and the region proposal network is trained and optimized according to the classical Adam algorithm in Definition 4, obtaining the trained and optimized region proposal network RPN1.
Step 5, training the balance classification regression network
Step 5.1, forward propagation through the balanced classification regression network
The training set Train of the augmented data set Datanew obtained in step 2 is taken as the input of the balanced classification regression network BCRN0; according to the forward propagation method in Definition 5, Train is fed into BCRN0, and the output of BCRN0 is recorded as Result1.
Step 5.2, training and optimizing the balanced classification regression network
The output Result1 of the balanced classification regression network BCRN0 obtained in step 5.1 is taken as input, and the balanced classification regression network is trained and optimized according to the classical Adam algorithm in Definition 4, obtaining the trained and optimized balanced classification regression network BCRN1.
Step 6, alternate training is carried out
Determine whether the epoch set in step 4 equals 12. If epoch is not equal to 12, let epoch = epoch + 1, RPN0 = RPN1, BCRN0 = BCRN1, repeat step 4.1, step 4.2, step 4.3, step 5.1 and step 5.2 in order, and then return to step 6 to judge epoch again. If epoch equals 12, denote the trained region proposal network RPN1 together with the trained balanced classification regression network BCRN1 as the network BL-Net, and go to step 7.
Step 7, evaluation method
Step 7.1, Forward propagation
Take the network BL-Net obtained in step 6 and the test set Test obtained in step 2.5 as input, and obtain the detection result, denoted R, using the traditional forward propagation method in Definition 5.
Take the detection result R as input and remove redundant boxes in R using the conventional non-maximum suppression method in Definition 13, as follows:
Step (1): mark the box with the highest score in R as BS.
Step (2): compute the overlap ratio IoU between BS and each remaining box by the formula

IoU = area(B ∩ BS) / area(B ∪ BS)

and discard the boxes with IoU > 0.5.
Step (3): among the remaining boxes, select the box with the highest score as the new BS.
Step (4): repeat the IoU computation and discarding of step (2) until no box can be discarded; the remaining boxes are the final detection result, denoted RF.
Step 7.2, calculating indexes
As shown in fig. 3, the detection result RF obtained in step 7.1 is taken as input, and the network's precision P, recall R and precision-recall curve P(R) are computed using the traditional recall and precision calculation method in Definition 14. Using the formula

mAP = ∫_0^1 P(R) dR

the average detection precision mAP of the balanced-learning-based SAR ship detection is calculated.

Claims (1)

1.一种基于平衡学习的船只检测方法,其特征是它包括如下步骤:1. a ship detection method based on balanced learning is characterized in that it comprises the following steps: 步骤1、初始化SSDD数据集Step 1. Initialize the SSDD dataset 采用随机的方法调整SSDD数据集中的SAR图像次序,得到新的SSDD数据集;A random method was used to adjust the order of SAR images in the SSDD dataset to obtain a new SSDD dataset; 步骤2、利用平衡场景学习机制进行场景扩增Step 2. Use the balanced scene learning mechanism for scene augmentation 步骤2.1、利用GAN网络提取SSDD数据集特征Step 2.1. Use GAN network to extract SSDD dataset features 采用经典的GAN网络构建方法,搭建生成对抗网络GAN0;以步骤1中获取得到的新的SSDD数据作为输入,采用经典的Adam算法,训练和优化生成对抗网络GAN0,得到训练和优化之后的生成对抗网络,记为GAN;Using the classic GAN network construction method, build the generative adversarial network GAN 0 ; take the new SSDD data obtained in step 1 as input, use the classic Adam algorithm to train and optimize the generative adversarial network GAN 0 , get the training and optimization Generative adversarial network, denoted as GAN; 然后再次以步骤1中获取得到的新的SSDD数据作为输入,采用传统的前向传播方法,将步骤1中获取得到的新的SSDD数据输入到训练和优化之后的生成对抗网络GAN中,得到网络的输出向量M={M1,M2,…Mi,…M1160},其中,Mi是新的SSDD数据中第i张图片的输出向量;Then again take the new SSDD data obtained in step 1 as input, adopt the traditional forward propagation method, input the new SSDD data obtained in step 1 into the generative adversarial network GAN after training and optimization, and get the network The output vector M={M1,M2,...Mi,...M1160}, where Mi is the output vector of the ith picture in the new SSDD data; 定义输出向量M是新的SSDD数据集中所有图片的场景特征,定义Mi为新的SSDD数据集中第i张图片的场景特征;Define the output vector M as the scene feature of all pictures in the new SSDD dataset, and define Mi as the scene feature of the ith picture in the new SSDD dataset; 步骤2.2、进行场景聚类Step 2.2, perform scene clustering 以步骤2.1中得到的新的SSDD数据中所有图片的场景特征的集合M作为输入,采用传统的的K-means聚类算法,借助场景特征M对新的SSDD数据集中的图片进行聚类操作:Taking the set M of scene features of all pictures in the new SSDD data obtained in step 2.1 as input, the traditional K-means clustering algorithm is used, and the pictures in the new SSDD data 
set are clustered with the help of scene feature M: 步骤2.3、初始化参数Step 2.3, initialization parameters 对于传统的K-means聚类算法中的质心参数,随机初始化第一步迭代中K-means聚类算法的质心参数,记为
Figure FDA0003327601470000011
For the centroid parameter in the traditional K-means clustering algorithm, randomly initialize the centroid parameter of the K-means clustering algorithm in the first iteration, denoted as
Figure FDA0003327601470000011
定义当前迭代次数为t,t=1,2,…,I,I为K-means聚类算法最大迭代次数,初始化I=1000;定义第t步迭代的质心参数为
Figure FDA0003327601470000012
初始化迭代收敛误差ε,作为算法迭代收敛条件之一;
Define the current number of iterations as t, t=1,2,...,I, where I is the maximum number of iterations of the K-means clustering algorithm, initialize I=1000; define the centroid parameter of the t-th iteration as
Figure FDA0003327601470000012
Initialize the iterative convergence error ε, as one of the algorithm iterative convergence conditions;
Step 2.4. Iteration
First compute the distance from the scene feature Mi of the i-th image to the first centroid of the first iteration, $d_{i,1}^{(1)} = \lVert M_i - C_1^{(1)} \rVert_2$.
Likewise compute the distance from Mi to the second centroid of the first iteration, $d_{i,2}^{(1)} = \lVert M_i - C_2^{(1)} \rVert_2$.
Compare $d_{i,1}^{(1)}$ and $d_{i,2}^{(1)}$: if $d_{i,1}^{(1)} > d_{i,2}^{(1)}$, the scene feature Mi of the i-th image belongs to the second class in the first iteration; otherwise it belongs to the first class.
After the first iteration, denote the set of all scene features of the first class as $S_1^{(1)}$ and the set of all scene features of the second class as $S_2^{(1)}$.
Then set t = 2 and repeat the following until convergence:
1) Set the centroid $C_1^{(t)}$ of step t to the arithmetic mean of the set $S_1^{(t-1)}$, and the centroid $C_2^{(t)}$ to the arithmetic mean of the set $S_2^{(t-1)}$.
2) Compute the distance from Mi to the first centroid of the t-th iteration, $d_{i,1}^{(t)} = \lVert M_i - C_1^{(t)} \rVert_2$, and the distance to the second centroid, $d_{i,2}^{(t)} = \lVert M_i - C_2^{(t)} \rVert_2$.
3) Compare $d_{i,1}^{(t)}$ and $d_{i,2}^{(t)}$: if $d_{i,1}^{(t)} > d_{i,2}^{(t)}$, Mi belongs to the second class in the t-th iteration; otherwise it belongs to the first class. After the t-th iteration, denote the set of all first-class scene features as $S_1^{(t)}$ and the set of all second-class scene features as $S_2^{(t)}$.
4) Compute the centroid change between this iteration and the previous one, $\sigma = \lVert C_1^{(t)} - C_1^{(t-1)} \rVert_2 + \lVert C_2^{(t)} - C_2^{(t-1)} \rVert_2$. If σ < ε or t ≥ I, output the clustering result, denoted CLASS; otherwise set t = t + 1 and return to step 1) to continue iterating.
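The two-centroid clustering loop of steps 2.3 and 2.4 can be sketched in plain numpy. This is a minimal illustration, not the patented implementation: the function name, the seeded generator, the data-drawn centroid initialization, and the empty-cluster guard are our own choices.

```python
import numpy as np

def kmeans_two_class(features, max_iter=1000, eps=1e-6, seed=0):
    """Cluster scene-feature vectors M into two classes via the
    two-centroid K-means iteration of steps 2.3-2.4."""
    rng = np.random.default_rng(seed)
    # Step 2.3: randomly initialise the two centroids (here: from the data).
    c = features[rng.choice(len(features), size=2, replace=False)].astype(float)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(max_iter):
        # Step 2.4: assign each M_i to its nearest centroid (L2 distance).
        d = np.linalg.norm(features[:, None, :] - c[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # 1) recompute each centroid as the arithmetic mean of its class.
        new_c = np.stack([
            features[labels == k].mean(axis=0) if np.any(labels == k) else c[k]
            for k in (0, 1)
        ])
        # 4) convergence test on the centroid shift sigma.
        if np.linalg.norm(new_c - c) < eps:
            c = new_c
            break
        c = new_c
    return labels, c
```

On two well-separated groups of feature vectors, the loop converges to the expected inshore/offshore split in a handful of iterations regardless of which two points seed the centroids.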
Step 2.5. Scene augmentation
According to the clustering result CLASS from step 2.4, divide all images of the new SSDD dataset into two classes: inshore scene images, denoted Data1, and offshore scene images, denoted Data2. Let N1 be the number of images in Data1 and N2 the number of images in Data2.
If N2 > N1, randomly select N2 − N1 images from the inshore class Data1 based on a Gaussian distribution and apply the traditional mirroring operation, obtaining N2 − N1 mirrored images, denoted Data1extra; merge Data1extra with the inshore class Data1 into a new image set, denoted Data1new, and define Data2new = Data2.
If N2 ≤ N1, randomly select N1 − N2 images from the offshore class Data2 based on a Gaussian distribution and apply the traditional mirroring operation, obtaining N1 − N2 mirrored images, denoted Data2extra; merge Data2extra with the offshore class Data2 into a new image set, denoted Data2new, and define Data1new = Data1.
Define the new image set Datanew = {Data1new, Data2new}.
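The class-balancing mirror augmentation of step 2.5 can be sketched as follows. The claim does not specify how the Gaussian-based selection maps onto image indices; drawing indices from a clipped Gaussian centred on the middle of the smaller class is one plausible reading, and the function name and seeding are illustrative.

```python
import numpy as np

def balance_scenes(data1, data2, seed=0):
    """Equalise the inshore (data1) and offshore (data2) counts by mirroring
    randomly chosen images of the smaller class (images are HxW arrays)."""
    rng = np.random.default_rng(seed)
    small, large = (data1, data2) if len(data1) < len(data2) else (data2, data1)
    k = len(large) - len(small)          # number of extra images needed
    n = len(small)
    # Gaussian index draw (assumption): centred on the list, clipped to range.
    idx = np.clip(np.round(rng.normal(n / 2, n / 4, size=k)), 0, n - 1).astype(int)
    mirrored = [np.fliplr(small[i]) for i in idx]   # traditional mirror operation
    small_new = list(small) + mirrored
    # Return (Data1new, Data2new) in the original order.
    return (small_new, list(large)) if len(data1) < len(data2) else (list(large), small_new)
```

After the call, both returned lists have the length of the larger input class, with the deficit filled by horizontally mirrored copies.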
Divide Datanew into two parts at a ratio of 7:3 to obtain a training set, denoted Train, and a test set, denoted Test.

Step 3. Build the forward propagation network

Step 3.1. Build the balanced feature pyramid network
Build a residual network with 50 layers using the classic residual network construction method, denoted Res-50. Denote the feature maps generated by the last layer of each scale of Res-50, from largest to smallest feature-map size, as F1, F2, F3, F4, F5.
Denote F5 as P5.
Apply a traditional 1×1 convolution to F4 for feature extraction; denote the result E4. Upsample P5 by the traditional upsampling operation to the feature-map size of F4; denote the result U5. Superimpose E4 and U5 by the traditional concatenation operation; denote the result P4.
Apply a 1×1 convolution to F3; denote the result E3. Upsample P4 to the size of F3; denote the result U4. Superimpose E3 and U4; denote the result P3.
Apply a 1×1 convolution to F2; denote the result E2. Upsample P3 to the size of F2; denote the result U3. Superimpose E2 and U3; denote the result P2.
Apply a 1×1 convolution to F1; denote the result E1. Upsample P2 to the size of F1; denote the result U2. Superimpose E1 and U2; denote the result P1.
Upsample P5 to the size of P3; denote the result H5. Upsample P4 to the size of P3; denote the result H4. Denote P3 as H3. Max-pool P2 by the traditional pooling operation to the size of P3; denote the result H2. Max-pool P1 to the size of P3; denote the result H1.
For H1, H2, H3, H4, H5, compute the integrated feature map I by $I_{(i,j)} = \frac{1}{5}\sum_{k=1}^{5} H_{k,(i,j)}$, where k is the subscript of H and (i, j) is the spatial sampling position of the feature map.
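The integration step above, once H1…H5 have been resampled to a common size, is an element-wise arithmetic mean over the five levels. A minimal numpy sketch (the function name is ours):

```python
import numpy as np

def integrate_levels(H):
    """Balanced feature integration: given H1..H5 resampled to one size,
    the integrated map satisfies I[(i,j)] = (1/5) * sum_k H_k[(i,j)]."""
    H = np.stack(H, axis=0)   # (5, C, h, w)
    return H.mean(axis=0)     # element-wise mean over the 5 pyramid levels
```

Averaging rather than summing keeps the integrated map on the same scale as each input level, so it can be redistributed back to the pyramid without rescaling.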
With the feature map I as input, compute the refined feature map O by $O_i = \frac{1}{\mathcal{C}(I)} \sum_{\forall j} f(I_i, I_j)\, g(I_j)$, where Ii is the feature at the i-th position of I, Oi is the feature at the i-th position of O, and $\mathcal{C}(I) = \sum_{\forall j} f(I_i, I_j)$ is the normalization factor. f(Ii, Ij) is the function used to compute the similarity between Ii and Ij, with the specific expression $f(I_i, I_j) = e^{\theta(I_i)^{\mathrm{T}} \phi(I_j)}$, where θ(Ii) = WθIi and φ(Ij) = WφIj, and Wθ and Wφ are matrices learned through 1×1 convolution; g(Ij) = WgIj, where Wg is a matrix learned through 1×1 convolution.
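A dense-matrix sketch of this non-local refinement, with the spatial positions of I flattened to rows. In the network the W matrices come from learned 1×1 convolutions; here they are plain arrays, and the function name is ours.

```python
import numpy as np

def nonlocal_refine(I, W_theta, W_phi, W_g):
    """Embedded-Gaussian non-local block: O_i is the normalised
    similarity-weighted sum of g(I_j) over all positions j.
    I: (N, C) array of N positions with C channels; W_*: (C, C) matrices."""
    theta = I @ W_theta.T                   # theta(I_i) = W_theta I_i
    phi = I @ W_phi.T                       # phi(I_j)   = W_phi I_j
    g = I @ W_g.T                           # g(I_j)     = W_g I_j
    f = np.exp(theta @ phi.T)               # f(I_i, I_j) = exp(theta^T phi)
    f = f / f.sum(axis=1, keepdims=True)    # divide by C(I) = sum_j f(I_i, I_j)
    return f @ g                            # O_i = (1/C(I)) sum_j f * g(I_j)
```

With zero Wθ and Wφ the similarities are uniform, so each output row reduces to the mean of g over all positions, which is a quick sanity check on the normalization.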
After all network operations of step 3.1 are completed, the balanced feature pyramid network is obtained, denoted Backbone.

Step 3.2. Build the region proposal network
With the Backbone from step 3.1 as the feature extraction layer, build a region proposal network by the traditional construction method, denoted RPN0.

Step 3.3. Build the balanced classification and regression network
Build fully connected layers FC1 and FC2 by the traditional fully-connected-layer method, feed the output of FC1 into FC2, and take FC1 and FC2 together as the classification head, denoted Clhead.
Build four convolutional layers Conv1, Conv2, Conv3 and Conv4 by the traditional convolution-kernel method, and build a pooling layer by the traditional pooling operation, denoted Pooling. Feed the output of Conv1 into Conv2, the output of Conv2 into Conv3, the output of Conv3 into Conv4, and the output of Conv4 into Pooling; take Conv1, Conv2, Conv3, Conv4 and Pooling together as the regression head, denoted Rehead. The classification head Clhead and the regression head Rehead take the same feature map as input and, together with Backbone, constitute the balanced classification and regression network, denoted BCRN0.

Step 4. Train the region proposal network
Set an iteration parameter epoch and initialize it to 1.

Step 4.1. Forward propagation through the region proposal network
With the training set Train of the augmented dataset Datanew from step 2 as input, run the traditional forward propagation through the region proposal network RPN0 and denote its output Result0.

Step 4.2. Balanced interval sampling of the forward propagation result
With Result0 from step 4.1 and the training set Train as input, compute the IoU value of each proposal box in Result0 by $\mathrm{IoU} = \frac{|B \cap B_{gt}|}{|B \cup B_{gt}|}$, where B is the proposal box and Bgt the ground-truth box. Take the outputs of Result0 with IoU greater than 0.5 as positive samples, denoted Result0p, and the outputs with IoU less than 0.5 as negative samples, denoted Result0n. Count the total number of samples in Result0n as M; let the required number of negative samples, input manually, be N; let the number of equal IoU intervals, input manually, be nb; and let the number of samples in the i-th IoU interval be Mi. Set the random sampling probability of the i-th interval to $p_i = \frac{N}{n_b} \cdot \frac{1}{M_i}$, randomly sample each IoU interval, and denote the sampling results over all IoU intervals of the negative samples as Result0ns.
Count the number of samples in the positive set Result0p, denoted P. Set the random sampling probability to $p_{\mathrm{pos}} = N / P$, randomly sample Result0p, and denote the positive sampling result Result0ps.
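The IoU-balanced negative sampling of step 4.2 can be sketched as follows: the negative IoU range [0, 0.5) is split into nb equal bins and roughly N/nb candidates are kept per bin, so hard negatives with higher IoU are not swamped by easy background boxes. The function name and seeding are illustrative.

```python
import numpy as np

def balanced_negative_sampling(ious, N, n_b, seed=0):
    """Keep each negative candidate in a bin of size M_i with probability
    p_i = N / (n_b * M_i), over n_b equal IoU bins spanning [0, 0.5)."""
    rng = np.random.default_rng(seed)
    # Bin index of each candidate; clamp to the last bin for safety.
    bins = np.minimum((ious / (0.5 / n_b)).astype(int), n_b - 1)
    keep = []
    for b in range(n_b):
        idx = np.flatnonzero(bins == b)
        M_b = len(idx)
        if M_b == 0:
            continue                       # empty interval: nothing to draw
        p = min(1.0, N / (n_b * M_b))      # per-sample keep probability
        keep.extend(i for i in idx if rng.random() < p)
    return keep
```

With uniformly distributed negative IoUs the expected number of kept samples is close to N, while sparse high-IoU bins are sampled at proportionally higher rates.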
Step 4.3. Train and optimize the region proposal network
With the positive sampling result Result0ps and the negative sampling result Result0ns from step 4.2 as input, train and optimize the region proposal network with the classic Adam algorithm, obtaining the trained and optimized region proposal network RPN1.

Step 5. Train the balanced classification and regression network

Step 5.1. Forward propagation through the balanced classification and regression network
With the training set Train of the augmented dataset Datanew from step 2 as input, run the traditional forward propagation through the balanced classification and regression network BCRN0 and denote its output Result1.

Step 5.2. Train and optimize the balanced classification and regression network
With the output Result1 from step 5.1 as input, train and optimize the network with the classic Adam algorithm, obtaining the trained and optimized network BCRN1.

Step 6. Alternate training
Check whether the epoch set in step 4 equals 12. If epoch does not equal 12, set epoch = epoch + 1, RPN0 = RPN1, BCRN0 = BCRN1, repeat steps 4.1, 4.2, 4.3, 5.1 and 5.2 in turn, and return to step 6 to check epoch again. If epoch equals 12, denote the trained region proposal network RPN1 together with the trained balanced classification and regression network BCRN1 as the network BL-Net, and proceed to step 7.

Step 7. Evaluation

Step 7.1. Forward propagation
With the network BL-Net from step 6 and the test set Test from step 2.5 as input, run the traditional forward propagation to obtain the detection result, denoted R.
With the detection result R as input, remove the redundant boxes in R by the traditional non-maximum suppression method, as follows:
Step (1): take the box with the highest score in the detection result R, denoted BS.
Step (2): compute the overlap ratio $\mathrm{IoU} = \frac{|B \cap B_S|}{|B \cup B_S|}$ of every remaining box B in R against BS, and discard the boxes with IoU > 0.5.
Step (3): select the box with the highest score, BS, from the remaining boxes.
Repeat the IoU computation and discarding of step (2) until no box can be discarded; the remaining boxes are the final detection result, denoted RF.

Step 7.2. Compute the metrics
With the detection result RF from step 7.1 as input, compute the precision P, the recall R and the precision-recall curve P(R) of the network by the traditional precision and recall calculation method.
Compute the mean average precision of balanced-learning-based SAR ship detection by $\mathrm{mAP} = \int_0^1 P(R)\,\mathrm{d}R$.
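The greedy non-maximum suppression of step 7.1 can be sketched in a few lines. This is a minimal illustration of the traditional method the claim invokes; the function names and box format (x1, y1, x2, y2) are our own conventions.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Steps (1)-(3): repeatedly keep the highest-scoring box BS and
    discard every remaining box whose IoU with BS exceeds thresh."""
    order = list(np.argsort(scores)[::-1])   # indices by descending score
    keep = []
    while order:
        best = order.pop(0)                  # step (1)/(3): current BS
        keep.append(best)
        # Step (2): drop boxes overlapping BS by more than thresh.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

For two heavily overlapping boxes and one disjoint box, only the higher-scoring member of the overlapping pair and the disjoint box survive.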
CN202111268008.2A 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning Active CN113989672B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111268008.2A CN113989672B (en) 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning


Publications (2)

Publication Number Publication Date
CN113989672A true CN113989672A (en) 2022-01-28
CN113989672B CN113989672B (en) 2023-10-17

Family

ID=79744053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111268008.2A Active CN113989672B (en) 2021-10-29 2021-10-29 SAR image ship detection method based on balance learning

Country Status (1)

Country Link
CN (1) CN113989672B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114973016A (en) * 2022-05-31 2022-08-30 西安邮电大学 Dual-polarization radar ship classification method based on grouped bilinear convolutional neural network
CN114972739A (en) * 2022-06-21 2022-08-30 天津大学 An Image Object Detection Method Based on Object Centroid Relationship
CN114998759A (en) * 2022-05-27 2022-09-02 电子科技大学 High-precision SAR ship detection method based on visual transform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490158A (en) * 2019-08-23 2019-11-22 安徽大学 A kind of robust human face alignment schemes based on multistage model
CN110826428A (en) * 2019-10-22 2020-02-21 电子科技大学 A high-speed method for ship detection in SAR images
US20200278465A1 (en) * 2017-09-12 2020-09-03 Schlumberger Technology Corporation Seismic image data interpretation system
CN112285712A (en) * 2020-10-15 2021-01-29 电子科技大学 Method for improving detection precision of ship on shore in SAR image
CN113378813A (en) * 2021-05-28 2021-09-10 陕西大智慧医疗科技股份有限公司 Modeling and target detection method and device based on attention balance feature pyramid


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TIANWEN ZHANG 等: "Balance Scene Learning Mechanism for Offshore and Inshore Ship Detection in SAR Images" *
TIANWEN ZHANG 等: "Balanced Feature Pyramid Network for Ship Detection in Synthetic Aperture Radar Images" *
ZHANG Tianwen et al.: "A false-alarm suppression method for ship detection in large-scene SAR images" *


Also Published As

Publication number Publication date
CN113989672B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN111797717B (en) High-speed high-precision SAR image ship detection method
CN112285712B (en) A method to improve the detection accuracy of docked ships in SAR images
US20230169623A1 (en) Synthetic aperture radar (sar) image target detection method
CN113989672A (en) A Balanced Learning-Based Vessel Detection Method in SAR Images
CN110555446A (en) Remote sensing image scene classification method based on multi-scale depth feature fusion and transfer learning
CN111898621B (en) A Contour Shape Recognition Method
CN113361485B (en) A hyperspectral image classification method based on spectral spatial attention fusion and deformable convolutional residual network
CN108830296A (en) A kind of improved high score Remote Image Classification based on deep learning
CN108038445A (en) A kind of SAR automatic target recognition methods based on various visual angles deep learning frame
CN108257154B (en) Polarimetric SAR image change detection method based on regional information and CNN
CN105975931A (en) Convolutional neural network face recognition method based on multi-scale pooling
CN112802054B (en) Mixed Gaussian model foreground detection method based on fusion image segmentation
CN113850189B (en) Embedded twin network real-time tracking method applied to maneuvering platform
CN112163599A (en) Image classification method based on multi-scale and multi-level fusion
CN112766340B (en) Depth capsule network image classification method and system based on self-adaptive spatial mode
Xi et al. Semi-supervised graph prototypical networks for hyperspectral image classification
CN113705331A (en) SAR ship detection method based on quaternary characteristic pyramid network
CN113449612A (en) Three-dimensional target point cloud identification method based on sub-flow sparse convolution
CN106503743A (en) A kind of quantity is more and the point self-adapted clustering method of the high image local feature of dimension
CN115272670A (en) SAR image ship instance segmentation method based on mask attention interaction
Mei et al. Cascade residual capsule network for hyperspectral image classification
CN115272842A (en) SAR image ship instance segmentation method based on global semantic boundary attention network
CN107766858A (en) A kind of method that ship detecting is carried out using diameter radar image
WO2025020476A1 (en) Point cloud feature recognition and labeling algorithm based on improved pointnet++
CN114998759A (en) High-precision SAR ship detection method based on visual transform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant