CN113902975A - Scene perception data enhancement method for SAR ship detection


Info

Publication number
CN113902975A
CN113902975A (application CN202111170725.1A; granted as CN113902975B)
Authority
CN
China
Legal status: Granted
Application number
CN202111170725.1A
Other languages
Chinese (zh)
Other versions
CN113902975B (en)
Inventors
张晓玲 (Zhang Xiaoling)
杨振宇 (Yang Zhenyu)
张天文 (Zhang Tianwen)
师君 (Shi Jun)
韦顺军 (Wei Shunjun)
Current Assignee: University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority: CN202111170725.1A
Publication of CN113902975A; application granted; publication of CN113902975B
Legal status: Active


Classifications

    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/2431: Pattern recognition; classification techniques, multiple classes
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/048: Neural networks; activation functions
    • G06N3/08: Neural networks; learning methods


Abstract

The invention discloses a scene perception data enhancement method for SAR ship detection. The method first improves the classical convolutional neural network VGG-11 to make it better suited to SAR images, then uses this network to classify the images in the training set into inshore training samples and offshore training samples; scene amplification is then used to balance the numbers of inshore and offshore training samples; finally, a classical detection network is trained with the processed data set, performs the detection task, and the detection results are evaluated. Compared with the prior-art Faster R-CNN ship detection network, the Faster R-CNN ship detection network using this method improves the overall detection accuracy by 1.95% and the inshore ship detection accuracy by 6.61%, realizing improved inshore ship detection in SAR images.

Description

Scene perception data enhancement method for SAR ship detection
Technical Field
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation, and relates to a scene perception data enhancement method for SAR ship detection.
Background
Synthetic Aperture Radar (SAR) is a high-resolution, active microwave imaging radar that works in all weather, day and night. Unlike an optical sensor, the electromagnetic waves it transmits can penetrate cloud, fog, vegetation and other complex occlusions and are unaffected by the illumination of the detection area, so SAR is widely used in both civilian and military fields. See the literature "Ou Shining. Application research of synthetic aperture radar in ship target positioning and imaging technology [J]. Ship Science and Technology, 2019, 41(02):152-".
In recent years, ship detection in SAR images has also become a research hotspot, because it enables convenient marine traffic management, ship oil-spill monitoring, ship disaster rescue, and the like. Ships in SAR images are important high-value targets; particularly in the field of national defense and the military, their detection helps protect national maritime rights and interests and offers an effective means of resolving maritime disputes. Moreover, SAR operation is unaffected by daylight and weather conditions, making it especially suitable for the unpredictable ocean environment and compensating for the shortcomings of optical sensors. See the literature "marfan, bau. Application of synthetic aperture radar in high-resolution monitoring and mapping of ship targets [J]. Ship Science and Technology, 2018, 40(22):157-".
Many SAR image ship detection algorithms have been proposed so far. The most common and effective ones are the various CFAR-based detection algorithms, which use a sea clutter model established in advance, search the image with a sliding window, and decide whether a ship is present according to a detection threshold provided by the sea clutter model; common sea clutter models are based on the Gaussian, Rayleigh and K distributions. However, because the sea-surface background is affected by the surrounding environment and weather, the background clutter distribution model is difficult to fit to the real clutter distribution, so CFAR is hard to apply in more complex scenes. See the document "Yang Zhi, Song Hui, Du Yang, Zhang Qing, Mengming. Rice-CFAR based SAR image ship detection [J]. Journal of Hefei University of Technology (Natural Science Edition), 2015, 38(04):463-467".
With the development of artificial intelligence, deep learning has been applied to SAR image ship detection. Deep-learning-based methods mainly use a deep convolutional neural network to automatically extract ship features, fit the mathematical distribution of the data through training, and obtain the coordinates of ships in the SAR image by regression at inference time; their accuracy is higher than that of the various CFAR-based detection algorithms. Several target detectors from the computer vision field, such as Fast R-CNN, YOLO and RetinaNet, have been successfully applied to SAR image ship detection. However, the detection accuracy for inshore ships is clearly lower than that for offshore ships, because inshore areas have stronger backscattering characteristics.
Although CNN-based SAR ship detectors have better detection performance than traditional methods, the detection accuracy for inshore ships remains hard to improve because of the imbalance of sample scenes. To balance the numbers of inshore and offshore samples, a Balanced Scene Learning Mechanism (BSLM) for inshore and offshore ship detection in SAR images was proposed. Based on unsupervised learning, the method uses a generative adversarial network (GAN) to extract scene features of SAR images; with these features, it performs binary scene clustering (inshore/offshore) by k-means; finally, it augments the inshore samples by copying, rotation transforms, or adding noise to balance them with the offshore samples, thereby eliminating the scene-learning bias, obtaining a balanced learning representation capability, and improving learning efficiency and detection accuracy. See the document "T. Zhang et al., 'Balance Scene Learning Mechanism for Offshore and Inshore Ship Detection in SAR Images,' IEEE Geoscience and Remote Sensing Letters, doi:10.1109/LGRS.2020.3033988".
Therefore, to solve the problem of insufficient inshore ship detection accuracy in traditional SAR ship detection, the invention proposes a scene perception data enhancement method for SAR ship detection.
Disclosure of Invention
The invention belongs to the technical field of Synthetic Aperture Radar (SAR) image interpretation and discloses a scene perception data enhancement method for SAR ship detection. The method is based on deep learning theory and mainly involves a convolutional neural network, scene amplification, and the classical detection network Faster R-CNN. It improves the classical convolutional neural network VGG-11 to make it better suited to SAR images, then uses this network to perform binary classification of the images in the training set, dividing them into inshore training samples and offshore training samples; scene amplification is then used to balance the numbers of inshore and offshore training samples; finally, the classical detection network is trained with the processed data set, performs the detection task, and the detection results are evaluated. The Faster R-CNN ship detection network using this method improves the overall detection accuracy by 1.95% and the inshore ship detection accuracy by 6.61% over the prior-art Faster R-CNN ship detection network, realizing improved inshore ship detection in SAR images.
For the convenience of describing the present invention, the following terms are first defined:
definition 1: SSDD data set acquisition method
The SSDD data set, whose full English name is SAR Ship Detection Dataset, is the first openly released data set for ship detection in SAR images. SSDD data come mainly from the RadarSat-2, TerraSAR-X and Sentinel-1 sensors and include the four polarization modes HH, HV, VV and VH. The observed scenes of the SSDD data set are mainly sea areas and inshore areas; there are 1160 images of about 500 × 500 pixels containing 2551 ships in total, 2.20 ships per image on average, and the ships differ in scale, distribution position and resolution, so the ship targets are diverse. The method for acquiring the SSDD data set is given in the document "Li Jianwei, Qu Changwen, Peng Shujuan, Deng Bing. Ship target detection in SAR images based on convolutional neural network [J]. Systems Engineering and Electronics, 2018, 40(09):1953-1959".
Definition 2: classical convolutional neural network
A classical convolutional neural network usually consists of an input layer, hidden layers and an output layer. The input layer can process multidimensional data; in the computer vision field it is generally assumed to receive three-dimensional input, i.e. the two-dimensional pixels of the plane plus the RGB channels. In image detection and recognition, the output layer outputs classification labels and the corresponding bounding-box coordinates, typically using a logistic function or a normalized exponential function. The hidden layers comprise convolutional layers, nonlinear activation functions, pooling layers and fully-connected layers: a convolutional layer abstracts high-dimensional features from small rectangular regions of the input features; the pooling layer shrinks the feature matrix, thereby reducing the number of parameters in the subsequent network; the fully-connected layer is equivalent to the hidden layer of a traditional feed-forward neural network and takes the previously abstracted high-dimensional features as input for the classification and detection tasks. The classical convolutional neural network method is described in the literature "Huvogen, Lilinyan, Shangxinluo, Shenmilitary, Dyyonghe. Overview of object detection algorithms based on convolutional neural networks [J]. Journal of Suzhou University of Science and Technology (Natural Science Edition), 2020, 37(02):1-10+25".
Definition 3: standard full connection layer method
The fully-connected layer is part of a convolutional neural network; its input and output sizes are fixed, and each node is connected to all nodes of the previous layer so as to integrate the extracted features. The fully-connected layer method is described in detail in "Haoren Wang, Haotian Shi, Ke Lin, Chengjin Qin, Liqun Zhao, Yixiang Huang, Chengliang Liu."
Definition 4: convolution kernel
A convolution kernel is a node that weights and then sums the values within a small rectangular region of the input feature map or picture and outputs the result. Each convolution kernel requires several manually specified parameters. One kind of parameter is the length and width of the node matrix processed by the kernel, which is also the size of the convolution kernel; the other kind is the depth of the unit node matrix obtained by the processing, which is also the depth of the convolution kernel. During the convolution operation, each kernel slides over the input data, the inner product of the whole kernel with the corresponding position of the input is computed and passed through a nonlinear function to obtain the final result, and the results of all positions form a two-dimensional feature map. Each convolution kernel generates one two-dimensional feature map, and the feature maps generated by multiple kernels are stacked into a three-dimensional feature map. The convolution kernel method is detailed in "Fan Li, Zhao Hongwei, Zhao Haoyu, Hu Huangshui, Wang Zheng. A survey of object detection based on deep convolutional neural networks [J]. Optics and Precision Engineering, 2020, 28(05):1152-1164".
Definition 5: conventional IoU intersection ratio method
The IoU score is a standard performance metric for the object-segmentation problem. Given a set of images, the IoU measure gives the similarity between the predicted region and the ground-truth region of the objects present in the images, and is defined by the formula

$$\mathrm{IoU} = \frac{I(x)}{U(x)}$$

where I(x) and U(x) represent the intersection and union of the "predicted bounding box" and the "ground-truth bounding box", respectively. The conventional IoU intersection-over-union method is described in the literature "Rahman M A, Wang Y. Optimizing Intersection-Over-Union in Deep Neural Networks for Image Segmentation [M]//Advances in Visual Computing. Springer International Publishing, 2016:234-244".
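For illustration only, the IoU of two axis-aligned boxes can be computed as in the following Python sketch; the (x1, y1, x2, y2) box format and the helper name iou are assumptions, not part of the patent.

```python
def iou(box_a, box_b):
    """IoU = I(x) / U(x) for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle I(x)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union U(x) = area_a + area_b - intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```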
Definition 6: standard ReLU function activation method
The standard ReLU function, the Rectified Linear Unit (also called the modified linear unit), is an activation function commonly used in artificial neural networks, generally referring to the nonlinear function represented by the ramp function and its variants. Its expression is

$$f(x) = \max(0, x)$$

The function is constant 0 on the negative half-axis and monotonically increasing and differentiable on the positive half-axis, which increases sparsity in the neural network. The standard ReLU function activation method is detailed at "https://www.cnblogs.com/makefile/p/activation-function.html".
Definition 7: standard batch normalization method
The standard Batch Normalization (BN) method unifies scattered data so that the network learns the regularities in the data more easily. BN is usually treated as a layer added before the activation function to narrow the variation range of the input x and reduce overfitting to some extent. Standard batch normalization methods are detailed at "https://www.cnblogs.com/shine-lee/p/11989612.html".
Definition 8: standard maximum pooling method
The standard Max Pooling method takes the point with the largest value in the local receptive field; its main roles are to reduce model size, speed up computation, and improve the robustness of the extracted features. Standard maximum pooling methods are detailed at "https://blog.csdn.net/weixin_43336281/article/details/102149468".
Definition 9: standard softmax method
The standard softmax method is the generalization of the logistic regression model to the multi-class problem. Its expression is

$$S_i = \frac{e^{V_i}}{\sum_{j=1}^{C} e^{V_j}}$$

where V_i is the output of the classifier's preceding stage for class i, i is the class index, C is the total number of classes, and S_i is the ratio of the exponential of the current element to the sum of the exponentials of all elements. The softmax output characterizes the relative probabilities of the different classes. The standard softmax method is detailed at "https://blog.csdn.net/qq_32642107/article/details/97270994".
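A minimal, numerically stable Python sketch of this formula follows; the function name and the example scores are hypothetical.

```python
import numpy as np

def softmax(v):
    """S_i = exp(V_i) / sum_j exp(V_j); subtracting max(v) avoids overflow."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

# e.g. three class scores -> relative class probabilities summing to 1
print(softmax(np.array([2.0, 1.0, 0.1])))
```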
Definition 10: standard VGG-11 networks
The standard VGG-11 network is a VGG network with 11 weight layers. It is the feature-extraction part of a network, can combine different modules, contains several convolutional layers and pooling layers, and automatically extracts useful feature information through training. See the document "Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition [J]. Computer Science, 2014".
Definition 11: classical stochastic gradient descent algorithm
The classical Stochastic Gradient Descent (SGD) algorithm is an optimization algorithm that minimizes the loss function constructed from the model to find the optimal parameters. It computes the loss function and its gradient, and updates the parameters, for each data sample, so computation is fast. The classical stochastic gradient descent algorithm is detailed at "https://blog.csdn.net/qq_38150441/article/details/80533891".
Definition 12: recall ratio and accuracy calculation method
The recall R is the proportion of all positive samples that are correctly predicted, expressed as

$$R = \frac{TP}{TP + FN}$$

The precision P is the proportion of the results predicted as positive that are correct, expressed as

$$P = \frac{TP}{TP + FP}$$

where TP (true positive) denotes a positive sample predicted as positive by the model; FN (false negative) denotes a positive sample incorrectly predicted as negative; and FP (false positive) denotes a negative sample incorrectly predicted as positive. The precision-recall curve P(R) is a function with R as the independent variable and P as the dependent variable. The method for computing these quantities is given in the literature "Li Hang. Statistical Learning Methods [M]. Beijing: Tsinghua University Press, 2012".
Definition 13: standard mAP index precision evaluation method
The mAP is the mean Average Precision. In the field of target detection, mAP is used to measure the accuracy of a detection model. Its calculation formula is

$$\mathrm{mAP} = \int_0^1 P(R)\,dR$$

where P is the precision and R is the recall. The standard mAP index accuracy evaluation method is detailed at "https://www.cnblogs.com/zongfa/p/9783972.html".
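The patent specifies only the integral form; one common way to approximate it is the all-point interpolation used for PASCAL-VOC-style mAP, sketched below under that assumption (for a single "ship" class, AP and mAP coincide).

```python
import numpy as np

def average_precision(recalls, precisions):
    """Approximate mAP = integral_0^1 P(R) dR as the area under an
    interpolated precision-recall curve (all-point interpolation)."""
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    # make precision non-increasing from right to left (interpolation step)
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # sum rectangle areas at the points where recall changes
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```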
Definition 14: prior art fast R-CNN
The prior-art Faster R-CNN is a target detection network. It consists of two modules: the first is a region proposal network that proposes locations where targets may appear, and the second is a Fast R-CNN network that classifies the targets and performs bounding-box regression. The method for building the prior-art Faster R-CNN network is described in detail in "Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149".
Definition 15: classical data enhancement method
The classical data enhancement method generates new training samples by adding random perturbations to the original data while keeping the class labels unchanged, thereby producing more training samples. Data enhancement strengthens the generalization of the network and improves its various metrics. Common data enhancement operations include flipping, rotation, scaling, cropping, and so on. The classical data enhancement method is detailed at "https://blog.csdn.net/u010801994/article/details/81914716".
Definition 16: standard forward propagation method
The standard forward propagation method is the most basic method in deep learning; it performs forward inference on the input according to the parameters and connections in the network to obtain the network output. Standard forward propagation methods are detailed at "https://www.jianshu.com/p/f30c8daebebebb".
Definition 17: standard non-maximum suppression method
The standard non-maximum suppression method is an algorithm used in target detection to remove redundant detection boxes. In the forward-propagation results of a classical detection network, the same target often corresponds to several detection boxes, so an algorithm is needed to select, among the boxes of one target, the box with the best quality and the highest score. Non-maximum suppression performs a local maximum search by thresholding the overlap ratio. Standard non-maximum suppression methods are detailed at "https://www.cnblogs.com/makefile/p/nms.html".
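A greedy NMS sketch matching this description is given below; it reuses the hypothetical iou helper from Definition 5, and the 0.5 threshold is the one used later in step 6.4.

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box (the "BS")
        keep.append(best)
        # discard every remaining box that overlaps BS too much
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```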
Definition 18: standard image mirroring method
The standard image mirroring method divides into horizontal mirroring and vertical mirroring. Horizontal mirroring exchanges the left and right halves of the image about its vertical central axis; vertical mirroring exchanges the upper and lower halves about its horizontal central axis. The standard image mirroring method is detailed at "https://blog.csdn.net/qq_30708445/article/details/87881362".
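For a grayscale SAR image stored as a NumPy array, both mirror operations reduce to slice reversals, as in this sketch; note that for detection data the bounding-box coordinates must be mirrored with the same transform (e.g. x' = W - x for horizontal mirroring). The helper names are assumptions.

```python
import numpy as np

def horizontal_mirror(img: np.ndarray) -> np.ndarray:
    """Swap the left and right halves about the vertical central axis."""
    return img[:, ::-1].copy()

def vertical_mirror(img: np.ndarray) -> np.ndarray:
    """Swap the top and bottom halves about the horizontal central axis."""
    return img[::-1, :].copy()
```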
Definition 19: standard data set merging method
The standard data set merging method combines different data sources, including merging and renaming the pictures and labels, after which further data processing and analysis are performed. Standard data set merging methods are detailed at "https://zhuanlan.".
The invention provides a scene perception data enhancement method for SAR ship detection; the whole process is shown in Fig. 1, and the method comprises the following steps:
step 1, preparing a data set
Obtain the SSDD data set according to the acquisition method in definition 1; select the images whose file-name indices end in 1 or 9 as the test set, denoted Test, and take the remaining images as the training set, denoted Train; label the SAR images in the training set Train as inshore scenes or offshore scenes to obtain a new training set, denoted new_Train.
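A possible sketch of this split in Python (the directory layout and function name are assumptions; SSDD images are indexed numerically, and those whose index ends in 1 or 9 go to the test set):

```python
import os
import shutil

def split_ssdd(image_dir, train_dir, test_dir):
    """Copy images whose numeric file-name stem ends in 1 or 9 to the test
    set and all remaining images to the training set."""
    for name in sorted(os.listdir(image_dir)):
        stem = os.path.splitext(name)[0]
        target = test_dir if stem[-1] in ("1", "9") else train_dir
        shutil.copy(os.path.join(image_dir, name), os.path.join(target, name))
```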
Step 2, establishing a scene classification network
According to the classical convolutional neural network method in definition 2, define an input layer, denoted L1, which takes SAR images of size 224 × 224 × 1 as input;
taking the input layer L1 as input, construct convolutional layer C1 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 64, stride 1;
activate convolutional layer C1 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C1_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C1_act to obtain a 224 × 224 × 64-dimensional vector, denoted L2;
taking the 224 × 224 × 64-dimensional vector L2 as input, perform 2 × 2 maximum pooling on L2 with the standard maximum pooling method in definition 8 to obtain a 112 × 112 × 64-dimensional vector, denoted L3;
taking the 112 × 112 × 64-dimensional vector L3 as input, construct convolutional layer C2 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 128, stride 1;
activate convolutional layer C2 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C2_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C2_act to obtain a 112 × 112 × 128-dimensional vector, denoted L4;
taking the 112 × 112 × 128-dimensional vector L4 as input, perform 2 × 2 maximum pooling on L4 with the standard maximum pooling method in definition 8 to obtain a 56 × 56 × 128-dimensional vector, denoted L5;
taking the 56 × 56 × 128-dimensional vector L5 as input, construct convolutional layer C3 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 256, stride 1;
activate convolutional layer C3 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C3_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C3_act to obtain a 56 × 56 × 256-dimensional vector, denoted L6;
taking the 56 × 56 × 256-dimensional vector L6 as input, construct convolutional layer C4 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 256, stride 1;
activate convolutional layer C4 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C4_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C4_act to obtain a 56 × 56 × 256-dimensional vector, denoted L7;
taking the 56 × 56 × 256-dimensional vector L7 as input, perform 2 × 2 maximum pooling on L7 with the standard maximum pooling method in definition 8 to obtain a 28 × 28 × 256-dimensional vector, denoted L8;
taking the 28 × 28 × 256-dimensional vector L8 as input, construct convolutional layer C5 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 512, stride 1;
activate convolutional layer C5 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C5_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C5_act to obtain a 28 × 28 × 512-dimensional vector, denoted L9;
taking the 28 × 28 × 512-dimensional vector L9 as input, construct convolutional layer C6 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 512, stride 1;
activate convolutional layer C6 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C6_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C6_act to obtain a 28 × 28 × 512-dimensional vector, denoted L10;
taking the 28 × 28 × 512-dimensional vector L10 as input, perform 2 × 2 maximum pooling on L10 with the standard maximum pooling method in definition 8 to obtain a 14 × 14 × 512-dimensional vector, denoted L11;
taking the 14 × 14 × 512-dimensional vector L11 as input, construct convolutional layer C7 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 512, stride 1;
activate convolutional layer C7 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C7_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C7_act to obtain a 14 × 14 × 512-dimensional vector, denoted L12;
taking the 14 × 14 × 512-dimensional vector L12 as input, construct convolutional layer C8 according to the classical convolutional neural network method in definition 2, with convolution kernel parameters: size 3 × 3 × 512, stride 1;
activate convolutional layer C8 with the standard ReLU function activation method in definition 6 to obtain the activated convolutional layer C8_act;
apply the standard batch normalization method in definition 7 to the activated convolutional layer C8_act to obtain a 14 × 14 × 512-dimensional vector, denoted L13;
taking the 14 × 14 × 512-dimensional vector L13 as input, perform 2 × 2 maximum pooling on L13 with the standard maximum pooling method in definition 8 to obtain a 7 × 7 × 512-dimensional vector, denoted L14;
taking the 7 × 7 × 512-dimensional vector L14 as input, construct a fully-connected layer of size 1 × 1 × 4096 with the standard fully-connected layer method in definition 3, denoted FC1;
taking FC1 as input, construct a fully-connected layer of size 1 × 1 × 4096 with the standard fully-connected layer method in definition 3, denoted FC2;
taking FC2 as input, construct a fully-connected layer of size 1 × 1 × N_class with the standard fully-connected layer method in definition 3, where N_class is the number of scene categories; this layer is denoted FC-N_class.
At this point, the scene classification network is constructed and denoted Modified-VGG_pre.
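A compact PyTorch sketch of this Modified-VGG_pre architecture is given below for illustration. The patent specifies only kernel sizes, strides and feature-map dimensions; the padding of 1 (needed to reproduce the stated 224→112→56→28→14→7 sizes) and the ReLUs after FC1 and FC2 follow standard VGG practice and are assumptions.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # 3x3 conv (stride 1) -> ReLU -> BatchNorm, the order used in step 2
    return [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch)]

class ModifiedVGG(nn.Module):
    """VGG-11-style scene classifier for 224 x 224 x 1 SAR images."""
    def __init__(self, n_class=2):
        super().__init__()
        layers = []
        layers += conv_block(1, 64) + [nn.MaxPool2d(2)]     # C1, L2-L3
        layers += conv_block(64, 128) + [nn.MaxPool2d(2)]   # C2, L4-L5
        layers += conv_block(128, 256)                      # C3, L6
        layers += conv_block(256, 256) + [nn.MaxPool2d(2)]  # C4, L7-L8
        layers += conv_block(256, 512)                      # C5, L9
        layers += conv_block(512, 512) + [nn.MaxPool2d(2)]  # C6, L10-L11
        layers += conv_block(512, 512)                      # C7, L12
        layers += conv_block(512, 512) + [nn.MaxPool2d(2)]  # C8, L13-L14
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7 * 7 * 512, 4096), nn.ReLU(inplace=True),  # FC1
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),         # FC2
            nn.Linear(4096, n_class),                             # FC-N_class
        )

    def forward(self, x):
        # raw class scores; softmax (definition 9) is applied by the loss
        return self.classifier(self.features(x))
```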
Step 3, training scene classification network
Taking the new_Train obtained in step 1 as input, train and optimize the scene classification network Modified-VGG_pre established in step 2 with the classical stochastic gradient descent algorithm in definition 11, obtaining the trained and optimized scene classification network, denoted Modified-VGG.
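A minimal training sketch under the same assumptions (the learning rate, momentum and the new_train_loader DataLoader are hypothetical; the patent does not give these values):

```python
import torch

model = ModifiedVGG(n_class=2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()  # applies softmax internally

model.train()
for images, labels in new_train_loader:  # hypothetical loader over new_Train
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()   # stochastic gradient descent step (definition 11)
    optimizer.step()
```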
Step 4, carrying out scene classification
Taking the training set Train as input, classify all pictures in Train into two classes with the scene classification network Modified-VGG obtained in step 3: the first class is the inshore scene, denoted Data1, and the second class is the offshore scene, denoted Data2.
Step 5, carrying out scene amplification
Use the classification results Data1 and Data2 obtained in step 4, and define the number of pictures in Data1 as M1 and the number of pictures in Data2 as M2.
If M1 < M2, randomly select M2 - M1 pictures in the inshore scene set Data1 and mirror them with the standard image mirroring method in definition 18, obtaining M2 - M1 mirrored pictures, denoted extra_Data1. Then merge extra_Data1 with the inshore scene set Data1 using the standard data set merging method in definition 19 to obtain a new inshore scene data set, denoted new_Data1. Define new_Data2 = Data2.
If M1 > M2, randomly select M1 - M2 pictures in the offshore scene set Data2 and mirror them with the standard image mirroring method in definition 18, obtaining M1 - M2 mirrored pictures, denoted extra_Data2. Then merge extra_Data2 with the offshore scene set Data2 using the standard data set merging method in definition 19 to obtain a new offshore scene data set, denoted new_Data2. Define new_Data1 = Data1.
Define the new data set new_Data = {new_Data1, new_Data2}.
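A sketch of this balancing step, assuming the two classes are held as Python lists of image arrays and reusing the hypothetical horizontal_mirror helper from Definition 18 (as in the patent, the class gap is assumed not to exceed the size of the smaller class):

```python
import random

def scene_augment(data1, data2):
    """Balance inshore (data1) and offshore (data2) sample lists by mirroring
    randomly chosen pictures of the smaller class; returns (new_Data1, new_Data2)."""
    if len(data1) < len(data2):                      # M1 < M2
        picked = random.sample(data1, len(data2) - len(data1))
        return data1 + [horizontal_mirror(p) for p in picked], data2
    if len(data1) > len(data2):                      # M1 > M2
        picked = random.sample(data2, len(data1) - len(data2))
        return data1, data2 + [horizontal_mirror(p) for p in picked]
    return data1, data2                              # already balanced
```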
Step 6, carrying out experimental verification on a classical model
Step 6.1, data enhancement
Taking the new data set new_Data obtained in step 5 as input, apply the classical data enhancement method in definition 15 to new_Data to obtain the data-enhanced SAR image detection training set, denoted DetTrain.
Step 6.2, network establishment
Adopting a classical Faster R-CNN method in definition 14 to establish an untrained Faster R-CNN network;
step 6.3, training the network
Initializing the image batch processing size of the untrained network obtained in the step 6.2, and recording as Batchsize;
initializing the learning rate of an untrained network, and recording the learning rate as eta;
initializing the weight attenuation rate and momentum of untrained network training parameters, and recording the weight attenuation rate and momentum as DC and MM respectively;
initializing random parameters of the untrained Faster R-CNN network obtained in the step 6.2, and recording the initialized parameters as W;
Train the untrained Faster R-CNN network on the training set DetTrain from step 6.1 with the classical stochastic gradient descent algorithm in definition 11, obtaining the loss value of the network, denoted loss.
When the loss value of the network falls below the desired loss value, stop training to obtain the new network parameters, denoted new_W.
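One possible realization of this training step uses the Faster R-CNN implementation shipped with torchvision; the ResNet-50-FPN backbone, the det_train_loader and all hyperparameter values below are assumptions, since the patent names only Batchsize, η, DC, MM and W without fixing them.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

ETA, DC, MM = 0.005, 0.0005, 0.9        # learning rate, weight decay, momentum
detector = fasterrcnn_resnet50_fpn(num_classes=2)  # background + ship
optimizer = torch.optim.SGD(detector.parameters(), lr=ETA,
                            weight_decay=DC, momentum=MM)

detector.train()
for images, targets in det_train_loader:   # hypothetical loader over DetTrain
    loss_dict = detector(images, targets)  # per-component detection losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # stop once loss falls below the desired value to obtain new_W
```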
Step 6.4, evaluation of detection result
Taking the new network parameters new_W obtained in step 6.3 and the test set Test obtained in step 1 as input, run the Faster R-CNN-based ship detection network with the standard forward propagation method in definition 16 to obtain the detection result, denoted Result.
Taking the detection result Result of the Faster R-CNN-based ship detection network as input, remove the redundant boxes in the detection result with the standard non-maximum suppression method in definition 17 and keep the highest-scoring detection boxes; the specific steps are as follows:
(1) firstly, marking a box with the highest score in a detection Result as a BS;
(2) then calculate the IoU between each remaining box in the detection result Result and BS using the IoU intersection-over-union method in definition 5, discard the boxes with IoU > 0.5, and denote the boxes remaining in Result as RB;
(3) select the next highest-scoring box BS from RB;
(4) repeat the IoU calculation and discarding process of step (2) until no more boxes can be discarded; the boxes that finally remain are the final detection result, denoted RR.
Taking the detection result RR of the Faster R-CNN network obtained in the previous step as input, compute the precision P, the recall R, and the precision-recall curve P(R) of the Faster R-CNN detection using the recall and precision calculation method in definition 12; and compute the average precision mAP of the Faster R-CNN network using the standard mAP index accuracy evaluation method in definition 13.
The innovation of the invention is to construct a scene classification model with a convolutional neural network for data enhancement, thereby improving the inshore ship detection accuracy in SAR images. The method classifies the training set into inshore and offshore samples and balances their numbers, so that the ship detection model detects inshore ships better: compared with the prior-art Faster R-CNN ship detection network, the Faster R-CNN ship detection network using this method improves the overall detection accuracy by 1.95% and the inshore ship detection accuracy by 6.61%.
The advantage of the method is that it improves the inshore ship detection accuracy in SAR images, overcoming the insufficient inshore detection accuracy of the prior art, while also improving the overall detection accuracy to some extent.
Drawings
Fig. 1 is a schematic flow diagram of a scene awareness data enhancement method for SAR ship detection in the present invention.
Fig. 2 is a schematic diagram of a scene classification network structure of the scene awareness data enhancement method for SAR ship detection in the present invention.
Fig. 3 shows the detection accuracy of the scene awareness data enhancement method for SAR ship detection according to the present invention.
Detailed Description
The specific embodiment carries out steps 1 to 6.4 exactly as described in the disclosure above, from preparing the data set through evaluating the detection results; the steps are not repeated here.

Claims (1)

1. A scene perception data enhancement method for SAR ship detection is characterized by comprising the following steps:
step 1, preparing a data set
obtaining the SSDD data set by the SSDD acquisition method; selecting the images whose file-name indices end in 1 or 9 as the test set, denoted Test, and taking the remaining images as the training set, denoted Train; labeling the SAR images in the training set Train as inshore scenes or offshore scenes to obtain a new training set, denoted new_Train;
step 2, establishing a scene classification network
Defining an input layer by adopting a classical convolutional neural network method, recording the input layer as L1, and inputting an SAR image with the size of 224 multiplied by 1;
taking an input layer L1 as input, constructing a convolutional layer C1 by adopting a classical convolutional neural network method, and setting convolutional kernel parameters: the size is set to 3 × 3 × 64, and the step size is set to 1;
activating the convolutional layer C1 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C1act
Activated convolutional layer C1 using standard batch normalization methodactCarrying out batch normalization processing to obtain a 224 multiplied by 64 dimensional vector which is marked as L2;
taking a vector L2 with dimensions of 224 multiplied by 64 as input, performing maximum pooling on L2 with the size of 2 multiplied by 2 by adopting a standard maximum pooling method to obtain a vector with dimensions of 112 multiplied by 64, and recording the vector as L3;
constructing a convolutional layer C2 by taking a vector L3 with dimensions of 112 multiplied by 64 as input according to a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 128, the step size is set to 1;
activating the convolutional layer C2 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C2act
Activated convolutional layer C2 using standard batch normalization methodactCarrying out batch normalization processing to obtain a 112 multiplied by 128 dimensional vector which is marked as L4;
taking a 112 × 112 × 128-dimensional vector L4 as an input, performing maximum pooling on L4 by adopting a standard maximum pooling method, and obtaining a 56 × 56 × 128-dimensional vector which is marked as L5;
taking the 56 × 56 × 128-dimensional vector L5 as input, constructing a convolutional layer C3 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 256, and the step size is set to 1;
activating the convolutional layer C3 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C3_act;
carrying out batch normalization processing on the activated convolutional layer C3_act by adopting a standard batch normalization method to obtain a 56 × 56 × 256-dimensional vector which is marked as L6;
taking the 56 × 56 × 256-dimensional vector L6 as input, constructing a convolutional layer C4 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 256, and the step size is set to 1;
activating the convolutional layer C4 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C4_act;
carrying out batch normalization processing on the activated convolutional layer C4_act by adopting a standard batch normalization method to obtain a 56 × 56 × 256-dimensional vector which is marked as L7;
taking the 56 × 56 × 256-dimensional vector L7 as input, performing maximum pooling with the size of 2 × 2 on L7 by adopting a standard maximum pooling method to obtain a 28 × 28 × 256-dimensional vector which is marked as L8;
taking the 28 × 28 × 256-dimensional vector L8 as input, constructing a convolutional layer C5 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 512, and the step size is set to 1;
activating the convolutional layer C5 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C5_act;
carrying out batch normalization processing on the activated convolutional layer C5_act by adopting a standard batch normalization method to obtain a 28 × 28 × 512-dimensional vector which is marked as L9;
taking the 28 × 28 × 512-dimensional vector L9 as input, constructing a convolutional layer C6 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 512, and the step size is set to 1;
activating the convolutional layer C6 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C6_act;
carrying out batch normalization processing on the activated convolutional layer C6_act by adopting a standard batch normalization method to obtain a 28 × 28 × 512-dimensional vector which is marked as L10;
taking the 28 × 28 × 512-dimensional vector L10 as input, performing maximum pooling with the size of 2 × 2 on L10 by adopting a standard maximum pooling method to obtain a 14 × 14 × 512-dimensional vector which is marked as L11;
taking the 14 × 14 × 512-dimensional vector L11 as input, constructing a convolutional layer C7 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 512, and the step size is set to 1;
activating the convolutional layer C7 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C7_act;
carrying out batch normalization processing on the activated convolutional layer C7_act by adopting a standard batch normalization method to obtain a 14 × 14 × 512-dimensional vector which is marked as L12;
taking the 14 × 14 × 512-dimensional vector L12 as input, constructing a convolutional layer C8 by adopting a classical convolutional neural network method, and setting convolution kernel parameters: the size is set to 3 × 3 × 512, and the step size is set to 1;
activating the convolutional layer C8 by adopting a standard ReLU function activation method to obtain an activated convolutional layer C8_act;
carrying out batch normalization processing on the activated convolutional layer C8_act by adopting a standard batch normalization method to obtain a 14 × 14 × 512-dimensional vector which is marked as L13;
taking the 14 × 14 × 512-dimensional vector L13 as input, performing maximum pooling with the size of 2 × 2 on L13 by adopting a standard maximum pooling method to obtain a 7 × 7 × 512-dimensional vector which is marked as L14;
taking the 7 × 7 × 512-dimensional vector L14 as input, constructing a fully connected layer with the size of 1 × 1 × 4096 by adopting a standard fully connected layer method, which is marked as FC1;
taking FC1 as input, constructing a fully connected layer with the size of 1 × 1 × 4096 by adopting a standard fully connected layer method, which is marked as FC2;
taking FC2 as input, constructing a fully connected layer with the size of 1 × 1 × N_class by adopting a standard fully connected layer method, where N_class is the number of scene categories, which is marked as FC-N_class;
at this point, the construction of the scene classification network is completed, and the untrained network is marked as Modified-VGG_pre (an illustrative sketch of this layer stack follows);
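A compact PyTorch sketch of the layer stack L1 through FC-N_class, given as an illustration rather than the patented network itself; kernel sizes, channel widths, and pooling follow the claim text, while padding of 1 (so the stated spatial sizes hold) and the ReLUs between the fully connected layers are assumptions:

```python
import torch.nn as nn

def conv_bn_relu(c_in, c_out):
    """3x3 convolution, stride 1 (padding 1 assumed), ReLU, then batch
    normalization, matching the conv -> activation -> BN order above."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
        nn.BatchNorm2d(c_out),
    )

class ModifiedVGG(nn.Module):
    def __init__(self, n_class=2):
        super().__init__()
        self.features = nn.Sequential(
            conv_bn_relu(1, 64),    nn.MaxPool2d(2),   # L1-L3: 224 -> 112
            conv_bn_relu(64, 128),  nn.MaxPool2d(2),   # L4-L5: 112 -> 56
            conv_bn_relu(128, 256), conv_bn_relu(256, 256),
            nn.MaxPool2d(2),                           # L6-L8: 56 -> 28
            conv_bn_relu(256, 512), conv_bn_relu(512, 512),
            nn.MaxPool2d(2),                           # L9-L11: 28 -> 14
            conv_bn_relu(512, 512), conv_bn_relu(512, 512),
            nn.MaxPool2d(2),                           # L12-L14: 14 -> 7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7 * 7 * 512, 4096), nn.ReLU(inplace=True),  # FC1
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),         # FC2
            nn.Linear(4096, n_class),                             # FC-N_class
        )

    def forward(self, x):           # x: (batch, 1, 224, 224)
        return self.classifier(self.features(x))
```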
Step 3, training scene classification network
Taking the new training set new_Train obtained in step 1 as input, training and optimizing the scene classification network Modified-VGG_pre established in step 2 by adopting the classical stochastic gradient descent algorithm to obtain a trained and optimized scene classification network, which is marked as Modified-VGG (a minimal training sketch follows);
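A minimal training sketch under common assumptions: cross-entropy loss over the two scene classes and a PyTorch DataLoader named loader yielding (image, label) batches; the epoch count, learning rate, and momentum are placeholders rather than values from the patent:

```python
import torch

def train_scene_classifier(model, loader, epochs=30, lr=1e-3):
    """Optimize Modified-VGG_pre with classical stochastic gradient descent."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for epoch in range(epochs):
        for images, labels in loader:    # labels: 0 = inshore, 1 = offshore
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model                         # the trained Modified-VGG
```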
step 4, carrying out scene classification
Taking the training set Train as input, classifying all pictures in Train into two types through the scene classification network Modified-VGG obtained in step 3, wherein the first type is the inshore scene, marked as Data1, and the second type is the offshore scene, marked as Data2;
step 5, carrying out scene amplification
according to the classification results Data1 and Data2 obtained in step 4, defining the number of pictures in Data1 as M1 and the number of pictures in Data2 as M2;
if M1 < M2, randomly selecting M2 - M1 pictures from the first-type inshore scene Data1 and carrying out the mirror image operation on them by adopting a standard image mirroring method to obtain M2 - M1 mirrored pictures, marked as extra_Data1; then merging the mirrored pictures extra_Data1 with the first-type inshore scene Data1 by adopting a standard data set combination method to obtain a new inshore scene data set, marked as new_Data1; and defining new_Data2 = Data2;
if M1 > M2, randomly selecting M1 - M2 pictures from the second-type offshore scene Data2 and carrying out the mirror image operation on them by adopting a standard image mirroring method to obtain M1 - M2 mirrored pictures, marked as extra_Data2; then merging the mirrored pictures extra_Data2 with the second-type offshore scene Data2 by adopting a standard data set combination method to obtain a new offshore scene data set, marked as new_Data2; and defining new_Data1 = Data1;
defining a new data set new_Data = {new_Data1, new_Data2} (a balancing sketch follows);
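The balancing rule of this step can be sketched as follows, with the mirror operation taken as a horizontal flip and the data sets represented as lists of NumPy arrays; both choices are assumptions for illustration:

```python
import random
import numpy as np

def balance_scenes(data1, data2, seed=0):
    """Equalize the inshore (data1) and offshore (data2) picture counts by
    mirroring randomly chosen pictures from the smaller class."""
    rng = random.Random(seed)
    small, large = (data1, data2) if len(data1) < len(data2) else (data2, data1)
    extra = [np.fliplr(img) for img in rng.sample(small, len(large) - len(small))]
    new_small = small + extra                # merge the mirrored pictures in
    if len(data1) < len(data2):
        return new_small, data2              # new_Data1, new_Data2 = Data2
    return data1, new_small                  # new_Data1 = Data1, new_Data2
```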
step 6, carrying out experimental verification on a classical model
Step 6.1, data enhancement
Taking the new data set new_Data obtained in step 5 as input, and performing data enhancement on new_Data by adopting a classical data enhancement method to obtain the data-enhanced SAR image detection training set, which is marked as DetTrain;
step 6.2, network establishment
Adopting a classic Faster R-CNN method to establish an untrained Faster R-CNN network;
step 6.3, training the network
Initializing the image batch size of the untrained network obtained in step 6.2, marked as BatchSize;
initializing the learning rate of the untrained network, marked as eta;
initializing the weight decay rate and the momentum of the untrained network's training parameters, marked as DC and MM respectively;
initializing the random parameters of the untrained Faster R-CNN network obtained in step 6.2, with the initialized parameters marked as W;
training the untrained Faster R-CNN network with the training set DetTrain of step 6.1 by adopting the classical stochastic gradient descent algorithm to obtain the loss value of the network, marked as loss;
when the loss value loss of the network is smaller than the ideal loss value, stopping training to obtain the new network parameters new_W (an initialization and stopping sketch follows);
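A hedged sketch of this initialization and loss-threshold stopping rule, with torchvision's reference Faster R-CNN standing in for the classic Faster R-CNN method of step 6.2; BatchSize, eta, DC, MM, and the ideal loss value below are placeholders, not values from the patent:

```python
import torch
import torchvision

# Placeholder hyperparameters standing in for BatchSize, eta, DC, and MM
BATCH_SIZE, ETA, DC, MM = 8, 5e-3, 5e-4, 0.9
IDEAL_LOSS = 0.1                   # stop once the loss falls below this value

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=2)
optimizer = torch.optim.SGD(model.parameters(), lr=ETA,
                            momentum=MM, weight_decay=DC)

def train_until_converged(loader):
    """Classical SGD training; returns the parameters new_W once the
    summed detection loss drops below IDEAL_LOSS."""
    model.train()
    while True:
        for images, targets in loader:   # DetTrain batches of size BATCH_SIZE
            losses = model(images, targets)        # dict of partial losses
            loss = sum(losses.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if loss.item() < IDEAL_LOSS:
                return model.state_dict()          # new network parameters
```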
step 6.4, evaluation of detection result
Taking the new network parameters new_W obtained in step 6.3 and the test set Test obtained in step 1 as input, performing forward propagation through the Faster R-CNN-based ship detection network by adopting a standard forward propagation method to obtain the detection result, marked as Result;
taking the detection result Result obtained by the Faster R-CNN-based ship detection network as input, removing the redundant boxes in the detection result Result by adopting a standard non-maximum suppression method to obtain the highest-scoring detection boxes, with the specific steps as follows:
(1) firstly, marking the box with the highest score in the detection result Result as BS;
(2) then, calculating the IoU of each remaining box in the detection result Result with BS by adopting the traditional IoU intersection-over-union calculation method, and, after discarding the boxes with IoU > 0.5, marking the boxes remaining in Result as RB;
(3) continuing to select the box BS with the highest score from RB;
repeating the IoU calculation and discarding process of step (2) until no box can be discarded, and marking the finally remaining boxes, which constitute the final detection result, as RR;
taking the detection result RR of the Faster R-CNN network obtained in the previous step as input, calculating the precision P, the recall R, and the precision-recall curve P(R) of the Faster R-CNN detection by adopting the recall and precision calculation method; and calculating the average precision mAP of the Faster R-CNN network by adopting the standard mAP index precision evaluation method.
CN202111170725.1A 2021-10-08 2021-10-08 Scene perception data enhancement method for SAR ship detection Active CN113902975B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111170725.1A CN113902975B (en) 2021-10-08 2021-10-08 Scene perception data enhancement method for SAR ship detection

Publications (2)

Publication Number Publication Date
CN113902975A (en) 2022-01-07
CN113902975B (en) 2023-05-05

Family

ID=79190453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111170725.1A Active CN113902975B (en) 2021-10-08 2021-10-08 Scene perception data enhancement method for SAR ship detection

Country Status (1)

Country Link
CN (1) CN113902975B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114970378A (en) * 2022-08-01 2022-08-30 青岛国数信息科技有限公司 Sea clutter sample library construction method based on GAN network

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650721A (en) * 2016-12-28 2017-05-10 吴晓军 Industrial character identification method based on convolution neural network
CN106897739A (en) * 2017-02-15 2017-06-27 国网江苏省电力公司电力科学研究院 A kind of grid equipment sorting technique based on convolutional neural networks
CN108491854A (en) * 2018-02-05 2018-09-04 西安电子科技大学 Remote sensing image object detection method based on SF-RCNN
CN109359661A (en) * 2018-07-11 2019-02-19 华东交通大学 A kind of Sentinel-1 radar image classification method based on convolutional neural networks
WO2020037960A1 (en) * 2018-08-21 2020-02-27 深圳大学 Sar target recognition method and apparatus, computer device, and storage medium
CN109800796A (en) * 2018-12-29 2019-05-24 上海交通大学 Ship target recognition methods based on transfer learning
CN111563473A (en) * 2020-05-18 2020-08-21 电子科技大学 Remote sensing ship identification method based on dense feature fusion and pixel level attention
CN112285712A (en) * 2020-10-15 2021-01-29 电子科技大学 Method for improving detection precision of ship on shore in SAR image
CN113469088A (en) * 2021-07-08 2021-10-01 西安电子科技大学 SAR image ship target detection method and system in passive interference scene

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIANG Kun et al.: "SAR Image Ship Detection Based on Deep Learning" *
ZHOU Li: "Research on Ship Target Detection in Synthetic Aperture Radar Images Based on Deep Learning" *
ZHANG Xiaoling et al.: "High-Speed and High-Accuracy SAR Ship Detection Based on Depthwise Separable Convolutional Neural Networks" *

Also Published As

Publication number Publication date
CN113902975B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
Cheng et al. FusionNet: Edge aware deep convolutional networks for semantic segmentation of remote sensing harbor images
Sharifzadeh et al. Ship classification in SAR images using a new hybrid CNN–MLP classifier
de Jong et al. Unsupervised change detection in satellite images using convolutional neural networks
CN108038445B (en) SAR automatic target identification method based on multi-view deep learning framework
Seydi et al. Oil spill detection based on multiscale multidimensional residual CNN for optical remote sensing imagery
Venugopal Automatic semantic segmentation with DeepLab dilated learning network for change detection in remote sensing images
Liu et al. Remote sensing image change detection based on information transmission and attention mechanism
CN112285712A (en) Method for improving detection precision of ship on shore in SAR image
Xiao et al. A review of remote sensing image spatiotemporal fusion: Challenges, applications and recent trends
Han et al. Research on multiple jellyfish classification and detection based on deep learning
Khesali et al. Semi automatic road extraction by fusion of high resolution optical and radar images
CN114973031A (en) Visible light-thermal infrared image target detection method under view angle of unmanned aerial vehicle
Bayramoğlu et al. Performance analysis of rule-based classification and deep learning method for automatic road extraction
Mathias et al. Deep Neural Network Driven Automated Underwater Object Detection.
Ucar et al. A novel ship classification network with cascade deep features for line-of-sight sea data
CN113902975A (en) Scene perception data enhancement method for SAR ship detection
Yu et al. Deep multi-feature learning for water body extraction from Landsat imagery
CN113378716A (en) Deep learning SAR image ship identification method based on self-supervision condition
Yao et al. LiDAR based navigable region detection for unmanned surface vehicles
Bi et al. Machine vision
Meng et al. A modified fully convolutional network for crack damage identification compared with conventional methods
Deepan et al. Comparative analysis of scene classification methods for remotely sensed images using various convolutional neural network
Dong et al. Fast infrared horizon detection algorithm based on gradient directional filtration
Chen et al. Land scene classification for remote sensing images with an improved capsule network
Liu et al. A novel deep transfer learning method for sar and optical fusion imagery semantic segmentation

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant