CN112668440B - SAR ship target detection method based on balanced-sample regression loss
Abstract
The invention discloses a SAR ship target detection method based on balanced-sample regression loss, which mainly solves the problem that the imbalance between easy and hard samples in existing deep learning methods leaves the trained network model with low ship target detection performance. The implementation scheme is as follows: 1) acquire ship data and divide it into training data and test data; 2) select the Faster-RCNN network as the training network model; 3) improve the original loss function of the training network to form a new total loss function; 4) feed the training data into the network selected in step 2) and train the network with the new total loss function to obtain the finally trained network model; 5) feed the test data into the trained network model to obtain the ship target detection results. The method extracts the depth features of ship targets better, improves ship target detection performance, and can be used for ship target detection.
Description
Technical Field
The invention belongs to the technical field of radar and mainly relates to a SAR image ship target detection method, which can be used for subsequent ship target recognition and classification.
Background
Synthetic aperture radar (SAR) is an active imaging sensor with all-weather, all-day, high-resolution data acquisition capability, and it features multiple frequency bands, multiple polarizations, variable viewing angles, and penetrability. SAR is now widely applied in military reconnaissance, geological survey, topographic mapping and cartography, disaster prediction, marine applications, scientific research, and other fields, and has broad research and application prospects. Automatic target recognition (ATR) of SAR images is one of their important applications. A basic SAR image ATR system generally comprises three stages: target detection, target discrimination, and target recognition. The performance of target detection directly affects the performance and efficiency of the discrimination and recognition stages, so research on SAR target detection is very important.
The traditional SAR image target detection algorithm is mainly the constant false alarm rate (CFAR) method, which determines a detection threshold from a pre-established clutter statistical model; such models suffer from limited application scenarios and low detection efficiency. The currently popular deep learning methods can learn the features of the data automatically through training, ensure both accuracy and detection speed when data are sufficient, and adapt better to different scenes, so studying SAR image ship target detection with deep learning methods has attracted wide attention.
Jianwei Li et al. organized and released the SSDD ship dataset in the IEEE paper "Ship Detection in SAR Images Based on an Improved Faster R-CNN" and proposed a detection method using an improved Faster-RCNN network.
Yuanyuan Wang et al. improved the RetinaNet detection network in the article "Automatic Ship Detection Based on RetinaNet Using Multi-Resolution Gaofen-3 Imagery" published in Remote Sensing and applied it to Gaofen-3 SAR image ship data, obtaining better detection performance. That method uses the focal loss classification loss function to address the imbalance between easy and hard samples in classification; however, it does not consider the imbalance in regression, and the detection results still need further improvement.
Disclosure of Invention
The invention aims to provide a SAR ship target detection method based on balanced-sample regression loss, so as to solve the problem that the imbalance between easy and hard samples in existing deep learning methods leaves the trained network model with poor ship target detection results, and to improve ship target detection performance.
The technical scheme of the invention is as follows: first, the SSDD data is divided into training data and test data; then a Faster-RCNN deep neural network model is trained on the training data, the network using an improved regression loss function; after the model converges, the trained neural network is applied to the test data to obtain the final ship detection results. The implementation steps include the following:
(1) Acquire the SSDD ship dataset and divide it at a ratio of 8:2 into training data Φ_x and test data Φ_c;
(2) Select a training network model Ω formed by sequentially connecting the shared base module VGG, the region proposal module RPN, and the region refinement module Fast-RCNN;
(3) Construct an improved loss function;
(3a) Improve the smooth L1 function in the regression loss of the RPN module of the network to an improved function with hyperparameter a, where j denotes the jth training sample of the RPN module, t_n is the position information predicted by the RPN module for the training sample, t*_n with n ∈ {x, y, w, h} is the position information of the target box corresponding to the training sample, a is a hyperparameter, and a = 2;
(3b) Likewise improve the smooth L1 function in the regression loss of the Fast-RCNN module, where p denotes the pth training sample of the Fast-RCNN module, e_m is the position information predicted by the Fast-RCNN module for the training sample, and e*_m with m ∈ {x, y, w, h} is the position information of the target box corresponding to the training sample;
(3c) From the improved regression loss functions of (3a) and (3b), obtain the total loss function J_s of the improved training network, expressed as:

J_s = J_s1 + λ_rpn·J_s2 + J_s3 + λ_fast·J_s4

where J_s1, λ_rpn and N_s1 are the classification loss function, loss balance constant and number of training samples of the RPN module, J_s3, λ_fast and N_s2 are the classification loss function, loss balance constant and number of training samples of the Fast-RCNN module, J_s2 and J_s4 are the improved regression loss functions of the RPN module and the Fast-RCNN module, y*_j is the label indicating whether the jth sample of the RPN module is a positive sample, and y*_p is the label indicating whether the pth sample of the Fast-RCNN module is a positive sample;
(4) Input the training data Φ_x into the constructed training network model Ω, and train the network model Ω with the total loss function J_s until the loss function converges, obtaining the trained network model Ω';
(5) Input the ship test data Φ_c into the finally trained network model Ω' to obtain the ship detection results.
Compared with the prior art, the invention has the following advantages:
the method aims at the problem that regression loss is difficult and samples are unbalanced when the existing deep learning target detection method Faster-RCNN is applied to SAR ship target detection, improves a regression loss function, can better extract the depth characteristics of a ship target, and improves the ship target detection performance.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a diagram of a training network model architecture used in the present invention;
FIG. 3 is a graph of the gradient of the regression loss function designed in the present invention.
Detailed Description
The embodiments and effects of the present invention are described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, the implementation steps of the invention are as follows:
Step 1, acquire the SSDD ship dataset. The SSDD ship dataset was captured by several radar satellites and organized and annotated by experts; it is divided at a ratio of 8:2 into training data Φ_x and test data Φ_c.
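As a minimal illustration of this 8:2 division, the following Python sketch splits a list of image ids. The image count of 1160 and the use of seeded random shuffling are assumptions for illustration, not details taken from the patent:

```python
import random

def split_ssdd(image_ids, train_ratio=0.8, seed=0):
    """Split SSDD image ids into training and test sets at a ratio of 8:2."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)      # fixed seed so the split is reproducible
    n_train = int(len(ids) * train_ratio)
    return ids[:n_train], ids[n_train:]   # (phi_x, phi_c)

# Hypothetical example: assume SSDD has 1160 images with integer ids.
phi_x, phi_c = split_ssdd(range(1160))
print(len(phi_x), len(phi_c))             # 928 232
```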
Step 2, select the training network model Ω.
Existing target detection networks include the YOLO series, the SSD network, the Faster-RCNN network, and others; this example selects the classic Faster-RCNN network as the training network model Ω for detection.
Referring to FIG. 2, the training network model Ω selected in this example is formed by sequentially connecting the shared base module VGG, the region proposal module RPN, and the region refinement module Fast-RCNN. The structure of each module is as follows:
2.1 VGG base module:
The module comprises 5 convolution blocks and 4 average pooling layers, connected in order: first convolution block V1 → first average pooling layer P1 → second convolution block V2 → second average pooling layer P2 → third convolution block V3 → third average pooling layer P3 → fourth convolution block V4 → fourth average pooling layer P4 → fifth convolution block V5. The parameter settings and relations of the layers are as follows:
2.1a) Convolution block V1 is formed by cascading two identical sub-blocks, each consisting of a two-layer structure in which the first layer is a convolution layer and the second layer is a ReLU activation function layer, i denoting the ith sub-block, i = 1, 2, where:

the first-layer convolution layer has a convolution kernel K_1 with window size 3 × 3, sliding stride S_1 = 1 and SAME padding, and convolves the input to output 64 feature maps as the input of the second-layer activation function layer;

the second-layer ReLU activation function layer nonlinearly maps the output of the previous convolution layer by

ReLU(x) = max(0, x)

where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
2.1b) The average pooling layer P1 down-samples the input; its down-sampling kernel U_1 has window size 2 × 2 and sliding stride V_1 = 2, and it outputs 64 feature maps Y_1 as the input of convolution block V2.
2.1c) Convolution block V2 is formed by cascading two identical sub-blocks, each consisting of a convolution layer followed by a ReLU activation function layer, i denoting the ith sub-block, i = 1, 2, where:

the first-layer convolution layer has a convolution kernel K_2 with window size 3 × 3, sliding stride S_2 = 1 and SAME padding, and convolves the input to output 128 feature maps as the input of the second-layer activation function layer;

the second-layer ReLU activation function layer nonlinearly maps the output of the previous convolution layer by ReLU(x) = max(0, x), where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
2.1d) The average pooling layer P2 down-samples the input; its down-sampling kernel U_2 has window size 2 × 2 and sliding stride V_2 = 2, and it outputs 128 feature maps Y_2 as the input of convolution block V3.
2.1e) Convolution block V3 is formed by cascading three identical sub-blocks, each consisting of a convolution layer followed by a ReLU activation function layer, i denoting the ith sub-block, i = 1, 2, 3, where:

the first-layer convolution layer has a convolution kernel K_3 with window size 3 × 3, sliding stride S_3 = 1 and SAME padding, and convolves the input to output 256 feature maps as the input of the second-layer activation function layer;

the second-layer ReLU activation function layer nonlinearly maps the output of the previous convolution layer by ReLU(x) = max(0, x), where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
2.1f) The average pooling layer P3 down-samples the input; its down-sampling kernel U_3 has window size 2 × 2 and sliding stride V_3 = 2, and it outputs 256 feature maps Y_3 as the input of convolution block V4.
2.1g) Convolution block V4 is formed by cascading three identical sub-blocks, each consisting of a convolution layer followed by a ReLU activation function layer, i denoting the ith sub-block, i = 1, 2, 3, where:

the first-layer convolution layer has a convolution kernel K_4 with window size 3 × 3, sliding stride S_4 = 1 and SAME padding, and convolves the input to output 512 feature maps as the input of the second-layer activation function layer;

the second-layer ReLU activation function layer nonlinearly maps the output of the previous convolution layer by ReLU(x) = max(0, x), where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
2.1h) The average pooling layer P4 down-samples the input; its down-sampling kernel U_4 has window size 2 × 2 and sliding stride V_4 = 2, and it outputs 512 feature maps Y_4 as the input of convolution block V5.
2.1i) Convolution block V5 is formed by cascading three identical sub-blocks, each consisting of a convolution layer followed by a ReLU activation function layer, i denoting the ith sub-block, i = 1, 2, 3, where:

the first-layer convolution layer has a convolution kernel K_5 with window size 3 × 3, sliding stride S_5 = 1 and SAME padding, and convolves the input to output 512 feature maps as the input of the second-layer activation function layer;

the second-layer ReLU activation function layer nonlinearly maps the output of the previous convolution layer by ReLU(x) = max(0, x), where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
The output of the ReLU activation function layer of the last convolution block is the shared feature F extracted by the VGG base module.
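A non-authoritative PyTorch sketch of the base module as described in 2.1a) to 2.1i). Note that, per the text, average pooling is used where the standard VGG-16 uses max pooling; the input size of 512 × 512 is an assumed example:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, n_convs):
    """n_convs cascaded (3x3 conv, stride 1, SAME padding) + ReLU sub-blocks."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch,
                             kernel_size=3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class VGGBase(nn.Module):
    """V1-P1-V2-P2-V3-P3-V4-P4-V5; the output is the shared feature F (512 maps)."""
    def __init__(self):
        super().__init__()
        cfg = [(3, 64, 2), (64, 128, 2), (128, 256, 3),
               (256, 512, 3), (512, 512, 3)]
        blocks = []
        for k, (cin, cout, n) in enumerate(cfg):
            blocks.append(conv_block(cin, cout, n))
            if k < 4:                               # only 4 average pooling layers
                blocks.append(nn.AvgPool2d(kernel_size=2, stride=2))
        self.features = nn.Sequential(*blocks)

    def forward(self, x):
        return self.features(x)

F = VGGBase()(torch.randn(1, 3, 512, 512))
print(F.shape)   # torch.Size([1, 512, 32, 32]): four 2x poolings give a /16 stride
```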
2.2 RPN module:
The module consists in turn of a shared convolution layer C_1, an activation function layer C_2, and two parallel branches: a classification branch C_3 and a regression convolution layer C_4. The parameter settings and relations of each layer are as follows:
The shared convolution layer C_1 has a convolution kernel K_6 with window size 3 × 3, sliding stride S_6 = 1 and SAME padding, and convolves the input to output 512 feature maps Y_6 as the input of the second-layer activation function layer C_2.

The second-layer ReLU activation function layer C_2 nonlinearly maps the output of the C_1 layer by ReLU(x) = max(0, x), where x is the input and ReLU(x) is the output; the input and output dimensions of the layer are the same.
The classification branch consists of a classification convolution layer C_31 and a Softmax classifier layer C_32, with the following parameter settings and relations: the classification convolution layer C_31 has a convolution kernel K_7 with window size 1 × 1, sliding stride S_7 = 1 and SAME padding, and convolves the input to output 18 feature maps Y_7; the Softmax classifier layer C_32 feeds the 18 feature maps Y_7 obtained by C_31 into a two-class Softmax classifier to obtain the classification probability vector p.
The regression convolution layer C_4 has a convolution kernel K_8 with window size 1 × 1, sliding stride S_8 = 1 and SAME padding, and convolves the input to output 36 feature maps b, which represent the 4 offsets t_x, t_y, t_w, t_h predicted by the RPN module for each of the 9 preset boxes.
The positions of the initial detection boxes are determined from the output b of the regression convolution layer C_4; the detection boxes are then screened with the classification probability vector p, and the boxes whose scores meet the threshold are judged to be candidate boxes and serve as the input of the next refinement module, the Fast-RCNN module.
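The RPN module of 2.2) can be sketched as follows (a PyTorch reconstruction from the layer descriptions above; the grouping of the 18 score channels into 2 classes × 9 anchors is an assumed convention, and proposal screening is omitted):

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Shared 3x3 conv (C1) + ReLU (C2), then parallel 1x1 heads:
    18 = 9 anchors x 2 classes (C31), 36 = 9 anchors x 4 offsets (C4)."""
    def __init__(self, in_ch=512):
        super().__init__()
        self.shared = nn.Conv2d(in_ch, 512, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.cls = nn.Conv2d(512, 18, kernel_size=1)
        self.reg = nn.Conv2d(512, 36, kernel_size=1)

    def forward(self, F):
        h = self.relu(self.shared(F))
        scores = self.cls(h)
        # Softmax over the 2 classes per anchor plays the role of C32.
        p = torch.softmax(scores.view(scores.shape[0], 2, -1), dim=1)
        b = self.reg(h)          # predicted offsets t_x, t_y, t_w, t_h per anchor
        return p, b

p, b = RPNHead()(torch.randn(1, 512, 32, 32))
print(p.shape, b.shape)   # torch.Size([1, 2, 9216]) torch.Size([1, 36, 32, 32])
```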
2.3 Fast-RCNN Module:
The module comprises, in order, a ROI pooling layer, a first fully-connected layer F1, a second fully-connected layer F2, and two parallel branches: a classification branch F3 and a regression fully-connected layer F4. The parameter settings and relations of each layer are as follows:
The ROI pooling layer R down-samples input feature maps of different sizes to the same size: it divides each input feature map into 7 × 7 blocks and selects the maximum value of each block as the output of that feature block, outputting 512 feature maps Y_8 of total size 512 × 7 × 7 as the input of the first fully-connected layer F1.
The first fully-connected layer F1 has 4096 neurons; it nonlinearly maps the features output by the ROI pooling layer and outputs a 4096-dimensional column vector.

The second fully-connected layer F2 has 4096 neurons; it nonlinearly maps the column vector output by the previous fully-connected layer F1 and outputs a 4096-dimensional column vector.
The classification branch consists in turn of a classification fully-connected layer F31 and a Softmax classifier layer F32, with the following parameter settings and relations: the classification fully-connected layer F31 has 2 neurons; it nonlinearly maps the column vector output by the previous fully-connected layer F2 and outputs a 2-dimensional column vector as the input of the Softmax classifier layer F32. The Softmax classifier layer F32 feeds the 2-dimensional column vector obtained by F31 into a two-class Softmax classifier to obtain the classification probability vector T, and the candidate boxes are classified according to the probability values.
The regression fully-connected layer F4 has 4 neurons; it nonlinearly maps the column vector output by the previous fully-connected layer F2 and outputs a 4-dimensional regression column vector C representing the regression offsets e_x, e_y, e_w, e_h of the candidate box.
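A PyTorch sketch of this module follows. It is our reconstruction, not the patent's code: `AdaptiveMaxPool2d` stands in for ROI pooling applied to an already-cropped region feature, which is a simplification of the actual ROI pooling operation:

```python
import torch
import torch.nn as nn

class FastRCNNHead(nn.Module):
    """ROI pooling to 7x7, two 4096-d FC layers, then parallel class/box heads."""
    def __init__(self):
        super().__init__()
        self.roi_pool = nn.AdaptiveMaxPool2d((7, 7))  # stand-in for ROI pooling
        self.fc1 = nn.Linear(512 * 7 * 7, 4096)       # F1
        self.fc2 = nn.Linear(4096, 4096)              # F2
        self.cls = nn.Linear(4096, 2)                 # F31 (ship / background)
        self.reg = nn.Linear(4096, 4)                 # F4  (e_x, e_y, e_w, e_h)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, region_feats):
        x = self.roi_pool(region_feats).flatten(1)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        T = torch.softmax(self.cls(x), dim=1)         # probability vector T (F32)
        C = self.reg(x)                               # regression column vector C
        return T, C

T, C = FastRCNNHead()(torch.randn(8, 512, 14, 14))    # 8 hypothetical candidates
print(T.shape, C.shape)                               # (8, 2) (8, 4)
```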
Step 3, construct the improved loss function.
The regression loss in the original loss function for training the network model Ω uses the smooth L1 function, which suffers from the imbalance between easy and hard samples in the regression loss during network training and harms the detection performance of the network; the regression loss function therefore needs to be improved and a new total loss function constructed.
The new total loss function is the sum of the loss function of the RPN module and the loss function of the Fast-RCNN module, and it is constructed as follows:
3.1) Construct the RPN module loss function:
3.1.1) Preset box information and target box information: the RPN module configures nine preset boxes for each feature point of the shared feature F extracted by the VGG base module. The nine boxes are obtained from three aspect ratios and three scales, with width : height ∈ {1:1, 1:2, 2:1}. (x_a, y_a) are the center coordinates of a preset box, w_a is the width of the box, and h_a is its height; (x*, y*) are the center coordinates of a target box, w* is the width of the target box, and h* is its height;
3.1.2) Calculate the intersection-over-union (IOU) value of each preset box with all target boxes from the preset box information and target box information:

IOU = (A ∩ B) / (A ∪ B)

where A is a preset box, B is a target box, A ∩ B is the area of the intersection of the preset box and the target box, and A ∪ B is the area of their union;
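A minimal sketch of this IOU computation (the corner-coordinate box format is an assumption for illustration):

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # area of A ∩ B
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                      # area of A ∪ B
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.1429
```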
3.1.3) Set the IOU thresholds and divide the sample classes of the preset boxes:

Set the IOU lower threshold v_1 to 0.3 and the upper threshold v_2 to 0.7.

If the IOU of a preset box with some target box is greater than or equal to the upper threshold v_2, or if the IOU of the preset box with some target box is the largest among all preset boxes for that target box, the preset box is a positive sample and that target box is assigned to it;

If the IOU of a preset box with all target boxes is less than or equal to the lower threshold v_1, and its IOU with no target box is the largest among all preset boxes for that target box, the preset box is a negative sample;

If the IOU of a preset box with all target boxes is greater than the lower threshold v_1 but less than the upper threshold v_2, and its IOU with no target box is the largest among all preset boxes for that target box, the preset box is an ignored sample and does not participate in training;
3.1.4) Calculate the target offsets from the preset box information and the corresponding target box information: t*_x = (x* − x_a)/w_a, t*_y = (y* − y_a)/h_a, t*_w = log(w*/w_a), t*_h = log(h*/h_a); a short code sketch of 3.1.3) and 3.1.4) together is given below;
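A sketch combining the sample division of 3.1.3) with the target-offset calculation of 3.1.4) (NumPy; thresholds v_1 = 0.3 and v_2 = 0.7 as in the text, the (center x, center y, width, height) box format is assumed):

```python
import numpy as np

def label_anchors(ious, v1=0.3, v2=0.7):
    """ious: (num_anchors, num_targets) IOU matrix. Returns a label per anchor
    (1 positive / 0 negative / -1 ignored) and the index of its best target box."""
    labels = -np.ones(ious.shape[0], dtype=int)   # ignored by default
    best_t = ious.argmax(axis=1)                  # best target for each anchor
    best_iou = ious.max(axis=1)
    labels[best_iou <= v1] = 0                    # negative samples
    labels[best_iou >= v2] = 1                    # positive samples
    labels[ious.argmax(axis=0)] = 1               # best anchor per target: positive
    return labels, best_t

def encode_offsets(anchor, target):
    """Target offsets t*_x, t*_y, t*_w, t*_h from (cx, cy, w, h) boxes."""
    xa, ya, wa, ha = anchor
    xs, ys, ws, hs = target
    return np.array([(xs - xa) / wa, (ys - ya) / ha,
                     np.log(ws / wa), np.log(hs / ha)])
```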
3.1.5) Original loss function of the RPN module:

The RPN module yields the classification probability vector p and the regression feature map b, where p gives the probabilities that a preset box is target or background and b is the offset information t_x, t_y, t_w, t_h predicted by the network for each preset box. The original loss function J'_rpn of the RPN module is obtained as:

J'_rpn = J_s1 + λ_rpn·J_s2

where J_s1 is the cross-entropy classification loss function, reconstructed here from the definitions of its terms as

J_s1 = −(1/N_s1) Σ_j Σ_k y^j_k log(p^j_k)

in which N_s1 is the total number of training samples of the RPN module (when training with a batch gradient descent algorithm, N_s1 is taken as the batch size of 256), y^j_k is the label of the jth sample for the kth class, and p^j_k is the probability that the RPN network predicts the jth sample as the kth class. J_s2 is the regression loss function,

J_s2 = (1/N_s1) Σ_j y*_j Σ_{n ∈ {x,y,w,h}} smooth_L1(t_n − t*_n)

where y*_j is the label indicating whether the jth sample is a positive sample: when the label is 1, the preset box is a positive sample and the regression loss is calculated; when the label is 0, the preset box is a negative sample and the regression loss is 0. t_n is the position information of the preset box and t*_n, n ∈ {x, y, w, h}, is the position information of the target box; λ_rpn is a balance constant. The smooth L1 function is:

smooth_L1(x) = 0.5x²,      |x| < 1
               |x| − 0.5,  otherwise
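The smooth L1 function and its clipped gradient, which motivates the improvement in 3.1.6), can be checked with a short PyTorch sketch (values are illustrative only):

```python
import torch

def smooth_l1(x):
    """smooth_L1(x) = 0.5 x^2 if |x| < 1, |x| - 0.5 otherwise."""
    ax = x.abs()
    return torch.where(ax < 1, 0.5 * ax ** 2, ax - 0.5)

# The gradient is x for |x| < 1 and sign(x) otherwise, so the gradient of
# large-error samples is clipped to magnitude 1.
x = torch.tensor([0.2, 0.9, 3.0], requires_grad=True)
smooth_l1(x).sum().backward()
print(x.grad)   # tensor([0.2000, 0.9000, 1.0000])
```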
3.1.6) Improve the original loss function:

Referring to FIG. 3, statistics of the loss of the regression part of the RPN module show that the loss gradient produced by easy samples is clearly larger than that produced by hard samples. To solve the imbalance between easy and hard samples in ship detection, the smooth L1 function is improved so as to increase the gradient of the hard-sample loss: smooth_L1 in J_s2 is replaced by an improved function controlled by a hyperparameter a, with a = 2, and the improved RPN module loss J_rpn is obtained by substituting this improved function into J'_rpn.
3.2) Construct the Fast-RCNN module loss function.
3.2.1) The regression feature map b predicted by the RPN represents the offsets t_x, t_y, t_w, t_h of the candidate boxes; the position information x, y, w, h of each candidate box is calculated from:

t_x = (x − x_a)/w_a
t_y = (y − y_a)/h_a
t_w = log(w/w_a)
t_h = log(h/h_a)

where (x, y) are the center coordinates of the candidate box, w is the width of the candidate box, and h is the height of the box;
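Inverting these equations recovers the candidate box from the predicted offsets and the preset box, as in the sketch below (NumPy; the example numbers are illustrative only):

```python
import numpy as np

def decode_candidates(t, anchors):
    """Invert the equations above: recover (x, y, w, h) of candidate boxes from
    predicted offsets t = (t_x, t_y, t_w, t_h) and anchors (x_a, y_a, w_a, h_a)."""
    tx, ty, tw, th = t.T
    xa, ya, wa, ha = anchors.T
    x = tx * wa + xa
    y = ty * ha + ya
    w = wa * np.exp(tw)
    h = ha * np.exp(th)
    return np.stack([x, y, w, h], axis=1)

t = np.array([[0.1, -0.2, 0.0, 0.3]])
anchors = np.array([[50.0, 60.0, 32.0, 32.0]])
print(decode_candidates(t, anchors))   # [[53.2 53.6 32.  ~43.2]]
```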
3.2.2) Set the IOU threshold and divide the sample classes of the candidate boxes:

Set the IOU threshold v_3 to 0.5.

Calculate the IOU of each candidate box with the target boxes from the position coordinates of the candidate box; if the IOU of a candidate box with all target boxes is less than the threshold v_3, the candidate box is a negative sample;

If the IOU of a candidate box with some target box is greater than the threshold v_3, the candidate box is a positive sample and that target box is assigned to the candidate box;
3.2.4) Original loss function of the Fast-RCNN module:

The Fast-RCNN module performs further position regression and classification on the first 2000 candidate boxes judged to have the highest target probability according to the classification probability vector p obtained by the RPN module, yielding the classification probability vector T and the regression column vector C, where T gives the probabilities that a candidate box is target or background and C is the predicted offsets e_x, e_y, e_w, e_h of the candidate box. The loss function J'_fast of the Fast-RCNN module is obtained as:

J'_fast = J_s3 + λ_fast·J_s4

where J_s3 is the cross-entropy classification loss function, N_s2 is the total number of training samples of the Fast-RCNN module (when training with a batch gradient descent algorithm, N_s2 is taken as the batch size of 128), u^p_k is the label of the pth sample for the kth class, and q^p_k is the probability that the Fast-RCNN network predicts the pth sample as the kth class. J_s4 is the regression loss function; y*_p is the label indicating whether the pth sample is a positive sample: when the label is 1, the candidate box is a positive sample and the regression loss is calculated; when the label is 0, the candidate box is a negative sample and the regression loss is 0. λ_fast is a balance constant.
3.2.5) Improve the original loss function:

Similarly, to solve the imbalance between easy and hard samples in ship detection, the smooth L1 function in J_s4 is replaced by the same improved function with hyperparameter a, a = 2, and the improved Fast-RCNN module loss J_fast is obtained by substituting this improved function into J'_fast.
3.3) From the improved regression loss function J_rpn of 3.1) and the improved regression loss function J_fast of 3.2), obtain the total loss function of the improved training network:

J_s = J_rpn + J_fast
Step 4, train the network model Ω with the loss function J_s constructed in step 3 to obtain the trained network model Ω'.
4.1) Input the training data Φ_x into the training network model Ω for training, one picture at a time, and calculate the value of the network loss function J_s according to the labels of the input picture;
4.2) Calculate the gradient of the network loss function from the loss function J_s of step 4.1); through the improved loss function, the gradients produced by easy and hard samples on the regression loss are balanced;
4.3) According to the gradient that the loss function computed in 4.2) produces on the network, continually update the weights in the direction that reduces the loss function with a stochastic gradient descent algorithm, and propagate the output-layer error backward with the back-propagation algorithm to update the parameters of each layer of the network model Ω;
4.4) Execute steps 4.1) to 4.3) in a loop until the loss function converges, obtaining the trained network model Ω'.
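A schematic of steps 4.1) to 4.4) in PyTorch. The names `model`, `train_loader`, and `total_loss`, as well as the learning rate, momentum, and epoch count, are hypothetical placeholders, not values taken from the patent:

```python
import torch

def train(model, train_loader, total_loss, num_epochs=12):
    """Step 4: one picture at a time, SGD + backprop until the loss converges."""
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for epoch in range(num_epochs):
        for image, labels in train_loader:     # batch size 1: one picture at a time
            opt.zero_grad()
            outputs = model(image)
            J_s = total_loss(outputs, labels)  # J_s = J_s1 + λ_rpn·J_s2 + J_s3 + λ_fast·J_s4
            J_s.backward()                     # propagate the output-layer error backward
            opt.step()                         # update weights along the descent direction
    return model                               # trained model Ω'
```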
Step 5, input the ship test data Φ_c into the finally trained network model Ω' to obtain the ship detection results.
The effects of the present invention can be further illustrated by the following experimental data:
first, experimental conditions
1) Experimental data
The experiments use the SSDD dataset organized and released by the Naval Aviation University of the Chinese People's Liberation Army. The data contain multi-scale ship targets under various imaging conditions, such as different resolutions, sea states, and sensor types; the diversity of the dataset samples gives the trained detector better robustness. The imaging conditions of the dataset are shown in Table 1.
Table 1. Ship data imaging conditions
In Table 1, RADARSAT-2 is the RADARSAT-2 satellite launched by Canada, TERRASAR-X is the TERRASAR-X satellite launched by Germany, and SENTINEL-1 is the SENTINEL-1 satellite launched by the European Union; HH is the satellite polarization mode with horizontal transmit and horizontal receive, VV with vertical transmit and vertical receive, HV with horizontal transmit and vertical receive, and VH with vertical transmit and horizontal receive.
2) Evaluation criteria

The experiment is repeated five times, and the mean of the average accuracy and the mean of the detection rate over the five detection runs are taken to evaluate the experimental results.
Second, the contents of the experiment
The experimental data were processed with the method of the present invention and two existing methods, and the performance comparison results are shown in Table 2.
TABLE 2. Comparison of performance parameters of the method of the invention and the prior art

Comparison method | Average accuracy | Detection rate
---|---|---
Smooth L1 | 93.59% | 94.38%
Balanced L1 | 93.83% | 93.99%
The invention | 94.88% | 95.68%
In Table 2, Smooth L1 denotes detecting the ship data with the Faster-RCNN network using the smooth L1 regression loss function, and Balanced L1 denotes detecting the ship data with the Faster-RCNN network using the Balanced L1 regression loss function proposed in the article "Libra R-CNN: Towards Balanced Learning for Object Detection".

As can be seen from Table 2, the method of the present invention achieves better detection results than the existing methods, because the loss function designed here better solves the imbalance between easy and hard samples and lets the network learn the characteristics of all kinds of samples more accurately.
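For reference, the Balanced L1 loss used as the second comparison method can be sketched from its definition in the Libra R-CNN paper. The functional form and the defaults α = 0.5, γ = 1.5 follow our reading of that paper, not the patent:

```python
import math
import torch

def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss from Libra R-CNN: promotes the gradients of inliers.
    b is fixed by the gradient-continuity constraint alpha * ln(b + 1) = gamma;
    C makes the loss itself continuous at |x| = 1."""
    b = math.exp(gamma / alpha) - 1.0
    ax = x.abs()
    inlier = (alpha / b) * (b * ax + 1) * torch.log(b * ax + 1) - alpha * ax
    C = (alpha / b) * (b + 1) * math.log(b + 1) - alpha - gamma
    outlier = gamma * ax + C
    return torch.where(ax < 1, inlier, outlier)

print(balanced_l1(torch.tensor([0.5, 2.0])))   # ≈ tensor([0.4006, 2.5787])
```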
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (7)
1. A SAR ship target detection method based on balanced-sample regression loss, characterized by comprising the following steps:
(1) Acquire the SSDD ship dataset and divide it at a ratio of 8:2 into training data Φ_x and test data Φ_c;
(2) Select a training network model Ω formed by sequentially connecting the shared base module VGG, the region proposal module RPN, and the region refinement module Fast-RCNN;
(3) Construct an improved loss function;
(3a) Improve the smooth L1 function in the regression loss of the RPN module of the network to an improved function with hyperparameter a, where j denotes the jth training sample of the RPN module, t_n is the position information predicted by the RPN module for the training sample, t*_n with n ∈ {x, y, w, h} is the position information of the target box corresponding to the training sample, a is a hyperparameter, and a = 2;

(3b) Likewise improve the smooth L1 function in the regression loss of the Fast-RCNN module, where p denotes the pth training sample of the Fast-RCNN module, e_m is the position information predicted by the Fast-RCNN module for the training sample, and e*_m with m ∈ {x, y, w, h} is the position information of the target box corresponding to the training sample;
(3c) From the improved regression loss functions of (3a) and (3b), obtain the total loss function J_s of the improved training network, expressed as follows:

J_s = J_s1 + λ_rpn·J_s2 + J_s3 + λ_fast·J_s4

where J_s1, λ_rpn and N_s1 are the classification loss function, loss balance constant and number of training samples of the RPN module, J_s3, λ_fast and N_s2 are the classification loss function, loss balance constant and number of training samples of the Fast-RCNN module, J_s2 and J_s4 are the improved regression loss functions of the RPN module and the Fast-RCNN module, y*_j is the label indicating whether the jth sample of the RPN module is a positive sample, and y*_p is the label indicating whether the pth sample of the Fast-RCNN module is a positive sample;
(4) Input the training data Φ_x into the training network model Ω, and train the network model Ω with the total loss function J_s of the network until the loss function converges, obtaining the trained network model Ω';
(5) Input the ship test data Φ_c into the finally trained network model Ω' to obtain the ship detection results.
2. The method of claim 1, wherein the shared base module VGG of the network model in (2) comprises 5 convolution blocks and 4 average pooling layers, in order: first convolution block V1 → first average pooling layer P1 → second convolution block V2 → second average pooling layer P2 → third convolution block V3 → third average pooling layer P3 → fourth convolution block V4 → fourth average pooling layer P4 → fifth convolution block V5, with the layer parameters set and related as follows:

the 4 pooling layers have the same structure and down-sample the input; the window size of the down-sampling kernel is 2 × 2 and the sliding stride is 2, and the number of output feature maps equals that of the input and serves as the input of the next convolution block;

the first convolution block V1 and the second convolution block V2 have the same structure and are formed by cascading two identical sub-blocks, each sub-block consisting of a two-layer structure of a convolution layer L_i1 and a ReLU activation function layer L_i2, where i denotes the ith sub-block, i = 1, 2;

the third convolution block V3, the fourth convolution block V4 and the fifth convolution block V5 have the same structure and are formed by cascading three identical sub-blocks, each sub-block consisting of a two-layer structure in which the first layer is a convolution layer T_j1 and the second layer is a ReLU activation function layer T_j2, where j denotes the jth sub-block, j = 1, 2, 3.
3. The method of claim 1, wherein the region proposal module RPN of the network model in (2) consists in turn of a shared convolution layer C_1, an activation function layer C_2, and two parallel branches, a classification branch C_3 and a regression convolution layer C_4; the classification branch C_3 consists in turn of a classification convolution layer C_31 and a Softmax classifier layer C_32 and is used to obtain the classification probability vector p; the regression convolution layer C_4 convolves the input to obtain 36 position-prediction feature maps b.
4. The method according to claim 1, wherein the region refinement module Fast-RCNN of the network model in (2) consists in sequence of a ROI pooling layer, a fully-connected layer F1, a fully-connected layer F2, and two parallel branches, a classification branch F3 and a regression fully-connected layer F4; the classification branch F3 consists in turn of a fully-connected layer F31 and a Softmax classifier layer F32 and is used to obtain the classification probability vector T; the regression fully-connected layer F4 outputs a 4-dimensional regression column vector C representing the regression offsets e_x, e_y, e_w, e_h of the candidate box.
5. The method according to claim 1, wherein the classification loss function J_s1 of the RPN module in (3c) is expressed as follows:

J_s1 = −(1/N_s1) Σ_j Σ_k y^j_k log(p^j_k)

where N_s1 is the total number of training samples of the RPN module; when training with a batch gradient descent algorithm, N_s1 is taken as the batch size of 256; y^j_k is the label of the jth sample for the kth class, and p^j_k is the probability that the RPN network predicts the jth sample as the kth class.
6. The method according to claim 1, wherein the classification loss function J_s3 of the Fast-RCNN module in (3c) is expressed as follows:

J_s3 = −(1/N_s2) Σ_p Σ_k u^p_k log(q^p_k)

where N_s2 is the total number of training samples of the Fast-RCNN module; when training with a batch gradient descent algorithm, N_s2 is taken as the batch size of 128; u^p_k is the label of the pth sample for the kth class, and q^p_k is the probability that the Fast-RCNN module predicts the pth sample as the kth class.
7. The method of claim 1, wherein the training network model Ω in (4) is trained as follows:

4a) feed the training data into the network model Ω for training, one picture at a time, and calculate the value of the network loss function J_s according to the labels of the input picture;

4b) calculate the gradient of the network loss function, the improved loss function balancing the gradients produced by easy and hard samples on the regression;

4c) according to the gradient that the loss function computed in 4b) produces on the network, continually update the weights in the direction that reduces the loss function with a stochastic gradient descent algorithm, and propagate the output-layer error backward with the back-propagation algorithm to update the parameters of each layer of the network model Ω;

4d) loop over 4a) to 4c) until the loss function J_s converges, obtaining the trained network model Ω'.
Non-Patent Citations (4)

Yuanyuan Wang et al., "Automatic Ship Detection Based on RetinaNet Using Multi-Resolution Gaofen-3 Imagery," Remote Sensing, March 2019.

Buyu Li et al., "Gradient Harmonized Single-Stage Detector," Computer Vision and Pattern Recognition, November 2018.

Jianwei Li et al., "Ship detection in SAR images based on an improved faster R-CNN," 2017 SAR in Big Data Era: Models, Methods and Applications, November 2017.

Li Hang et al., "A campus vehicle detection method based on improved Faster RCNN," Journal of Shenyang Normal University (Natural Science Edition), No. 01, February 2020.