CN112507777A - Optical remote sensing image ship detection and segmentation method based on deep learning - Google Patents
- Publication number
- CN112507777A (Application No. CN202011080445.7A)
- Authority
- CN
- China
- Prior art keywords
- ship
- detection
- resolution
- mask
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Abstract
The invention discloses an optical remote sensing image ship detection and segmentation method based on deep learning, which comprises the following steps: step one, reading in image data and preprocessing the image according to a transfer learning method; step two, constructing a multi-resolution parallel convolution backbone network HRFPN to extract an image feature map; step three, generating ship candidate regions based on an RPN; step four, using the idea of multi-task cascade detection, adding a semantic segmentation branch, obtaining the classification probability value, localization box and mask of the ship, and calculating the loss function; and step five, obtaining refined detection results with an NMS method. The invention adds a multi-resolution parallel convolution module and a multi-task cascade detection module on top of deep neural network target detection and segmentation, effectively improves the accuracy of ship detection and segmentation in optical remote sensing images, and in particular detects small targets better.
Description
Technical Field
The invention relates to an optical remote sensing image ship detection and segmentation method based on deep learning, and belongs to the technical field of intelligent remote sensing image identification.
Background
In recent years, with the development of aerospace technology, high-resolution optical remote sensing images have become one of the important means for detecting marine ships. Compared with infrared and SAR images, optical remote sensing images have high spatial resolution and therefore represent the geometry of objects more distinctly, and the observed target scene contains sufficient feature information. More and more scholars and research institutions are focusing on identifying and locating marine targets in high-resolution remote sensing images, which further promotes the development of marine commercial activity.
With the development of deep learning technology and GPU computing power, deep convolutional neural networks have shown strong target feature extraction capability in computer vision tasks. At present, most methods only perform ship detection and neglect pixel-level segmentation of the target. One problem of existing rectangular-box ship detection methods is that background pixels remain inside the bounding box of the local candidate region, which is detrimental to classifying the candidate region; by performing instance segmentation on the image, a ship mask free of background pixels is obtained, enabling accurate classification and fine localization of ship targets.
The paper "Mask R-CNN" by Kaiming He et al. (16th IEEE International Conference on Computer Vision, ICCV 2017) proposes a method for performing target detection and instance segmentation simultaneously in one network. First, image features are extracted with a base network, ResNet-50 or ResNet-101, and fused with a Feature Pyramid Network (FPN); then ship candidate regions are obtained through the region proposal network RPN, and a region of interest (RoI) alignment operation is applied to the candidate region feature maps. For the classification and bounding-box prediction branches, the aligned feature vectors predict the category and position of each candidate box through fully connected layers; for the segmentation branch, the aligned feature vectors pass through a Fully Convolutional Network (FCN) to predict the mask of the target. The method improves target detection by supervising mask information. However, it still has a shortcoming: because optical remote sensing images are large and ships vary widely in size and orientation, ship targets, especially small ones, cannot be detected effectively.
Disclosure of Invention
The invention aims to solve the technical problem of providing an optical remote sensing image ship detection and segmentation method based on deep learning.
The invention is realized by the following scheme: an optical remote sensing image ship detection and segmentation method based on deep learning comprises the following steps:
step one, reading in image data, and preprocessing the image according to a transfer learning method;
step two, constructing a multi-resolution parallel convolution backbone network HRFPN to extract an image feature map;
step three, generating ship candidate regions based on the RPN;
step four, using the idea of multi-task cascade detection, adding a semantic segmentation branch, obtaining the classification probability value, localization box and mask of the ship, and calculating the loss function;
and step five, obtaining a refined detection result by utilizing an NMS method.
In the first step, model parameters obtained by training a convolutional neural network on a large data set are used as initial values of the feature extraction layers of the network, and model fine-tuning is then carried out.
In the first step, the HRNetV2-W40 model obtained by training on the ImageNet data set applies mean subtraction to its inputs during training; when the trained HRNetV2-W40 model is transferred to the ship detection and segmentation task, the same mean-subtraction preprocessing is therefore applied to the images.
The overall network in the second step comprises four stages. The resolution of the input image is down-sampled to 1/4 of the original by two 3×3 convolutions with stride 2, which serves as the input to the first stage and is also the feature map resolution of stage 1; stages 2, 3 and 4 contain feature maps of 2, 3 and 4 resolutions, respectively. Stage 1 contains 4 residual units, each consisting of a bottleneck module with 64 channels, after which the number of channels of the feature map is scaled to C by one 3×3 convolution. Stages 2, 3 and 4 consist of 1, 4 and 3 repeated modular multi-resolution blocks, respectively; a multi-resolution block consists of a multi-resolution group convolution and a multi-resolution convolution. Each branch of the multi-resolution group convolution contains 4 residual units, and for each resolution each unit contains two 3×3 convolutions. The inputs and outputs of the multi-resolution convolution are feature maps of different resolutions; to ensure that the resolution and channel number of the feature maps are consistent during fusion, a high-resolution feature map is fused into a low-resolution one through i 3×3 convolutions with stride 2, and a low-resolution feature map is fused into a high-resolution one through j bilinear upsampling operations, where i and j take values in [1, 3] according to the feature map sizes. The channel numbers of the feature maps at the 4 resolutions are C, 2C, 4C and 8C, respectively, and C is set to 40.
In the second step, the network forms feature maps C2, C3, C4 and C5 by connecting multi-resolution parallel convolutions and performing repeated information exchange between them, and fuses {C2, C3, C4, C5} to form the final feature maps {P2, P3, P4, P5}. The specific calculation formula is as follows:
P5 = Conv1×1(C5),
Pk = Conv3×3(Conv1×1(Ck) ⊕ Upsample(Pk+1)), k = 4, 3, 2,
P6 = Downsample(P5),
wherein Conv1×1 and Conv3×3 respectively represent a 1×1 convolutional layer and a 3×3 convolutional layer; Upsample represents bilinear upsampling followed by a 1×1 convolution operation; Downsample represents a 3×3 convolutional layer with stride 2; ⊕ represents the feature map addition operation. P6 is generated from P5 by one 3×3 convolution with stride 2, and the output channels of {P2, P3, P4, P5, P6} are all 256.
In the third step, anchors with areas of {32², 64², 128², 256², 512²} are set for {P2, P3, P4, P5, P6} respectively, and the aspect ratios of the ship candidate regions on each feature map level are set to {1:1, 1:2, 2:1}, giving 15 ship candidate region settings over the feature pyramid. Training positive and negative samples are assigned according to the IoU overlap between a ship candidate region and the corresponding label box: when IoU is greater than 0.7, the ship candidate region is a positive sample; when IoU is less than 0.3, it is a negative sample; and the total number of positive and negative samples in one image does not exceed 2000.
In step four, a multi-task cascade network is constructed to obtain the ship localization box and MASK. RoIAlign adjusts all ship candidate regions into fixed-size feature vectors: the feature vector of the classification and regression branch is 7×7 and that of the segmentation branch is 14×14. Using the idea of multi-task cascade detection, cascading and multi-task processing are combined at each stage to improve information flow, and spatial context is used to further improve accuracy. The whole network has 3 detection heads; CLS, BOX and MASK respectively denote the classification, bounding-box prediction and mask prediction branches. The IoU thresholds of the 3 detection heads are 0.5, 0.6 and 0.7 respectively, and the prediction of each stage is fed into the next stage to obtain high-quality predictions; the features of the current bounding box are obtained from the regressed bounding box of the previous stage through RoIAlign.
In the fourth step, the mask prediction branches of adjacent stages are connected to provide information flow between mask branches; the mask computation of two adjacent stages consists of four 3×3 convolutional layers and one deconvolution layer. The feature maps of the 5 FPN levels are first scaled to the same size for multi-scale feature fusion, features are then extracted through the four 3×3 convolutional layers, and fixed-size semantic features are obtained through a 1×1 convolution.
In the fourth step, for a single image, the multi-task loss function during training is defined as follows:
L = Σ_t (L_cls^t + L_box^t + L_mask^t) + L_seg, t = 1, 2, 3,
wherein L_cls, L_box and L_mask respectively represent the classification loss, localization box regression loss and mask prediction loss of stage t, with t taking values 1, 2 and 3, and L_seg represents the semantic segmentation loss. The classification loss is defined as
L_cls = −[p_i* log(p_i) + (1 − p_i*) log(1 − p_i)],
wherein i denotes the index of an anchor, p_i* denotes the label value of the ith anchor and p_i its predicted value; p_i* = 1 for a ship and p_i* = 0 for a non-ship. For the regression loss, define t_i = {t_x, t_y, t_w, t_h} as the predicted rectangular-box parameter values of the ship and t_i* as the label values of the rectangular box of the ship anchor; the four parameter values are computed as follows:
t_x = (x − x_a)/w_a, t_y = (y − y_a)/h_a,
t_w = log(w/w_a), t_h = log(h/h_a),
wherein x, y, w and h denote the center-point coordinates, width and height of the rectangular box, and the variables x, x_a and x* correspond respectively to the prediction box, the ship candidate region box and the label box (likewise for y, w and h). The regression loss function is defined as
L_box = Σ_i p_i* · smoothL1(t_i − t_i*), with smoothL1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise.
For the mask prediction branch, with the output resolution of each anchor set to an m×m binary mask map, the mask prediction loss function is defined as
L_mask = −(1/m²) Σ_i [m_i log(m̂_i) + (1 − m_i) log(1 − m̂_i)],
wherein m_i denotes the label value of the ith pixel and m̂_i denotes the sigmoid output of the ith pixel of the mask. For the semantic segmentation branch, with the semantic segmentation map output for each anchor denoted s and the label semantic segmentation map denoted s*, the semantic segmentation loss function is defined as
L_seg = −[s* log(s) + (1 − s*) log(1 − s)].
In step five, all detection boxes are sorted from high to low by score; candidate boxes with low mutual overlap and high scores are retained, and candidate boxes with high overlap and low scores are discarded.
The invention has the beneficial effects that it adds a multi-resolution parallel convolution module and a multi-task cascade detection module on top of deep neural network target detection and segmentation, effectively improves the accuracy of ship detection and segmentation in optical remote sensing images, and in particular detects small targets better.
Drawings
Fig. 1 is a ship detection flow chart.
Fig. 2 is a structure diagram of a feature extraction backbone network HRFPN.
FIG. 3 is a diagram of a multi-resolution set convolution structure.
Fig. 4 is a diagram of a multi-resolution convolution structure.
Fig. 5 is a diagram of a process of generating a ship candidate region (anchor) based on RPN.
Fig. 6 is a diagram of a multitasking cascade network architecture.
Fig. 7 is a diagram showing the final detection results.
Detailed Description
The invention will be further described with reference to fig. 1-7, without limiting the scope of the invention.
In the following description, for purposes of clarity, not all features of an actual implementation are described, and well-known functions or constructions are not described in detail, since they would obscure the invention with unnecessary detail. It should be understood that in the development of any actual embodiment, numerous implementation details must be set forth in order to achieve the developer's specific goals, such as compliance with system-related and business-related constraints that change from one implementation to another; such a development effort might be complex and time-consuming but would nevertheless be a routine undertaking for those of ordinary skill in the art.
An optical remote sensing image ship detection and segmentation method based on deep learning comprises the following steps:
Step one: reading in image data and preprocessing the image according to the transfer learning method
Transfer learning mainly means training a convolutional neural network on a large data set (such as ImageNet) first; once a certain feature extraction capability is achieved, other image training tasks no longer initialize network parameters randomly but use the trained model parameters as initial values for the feature extraction layers, followed by model fine-tuning. The method adopts the HRNetV2-W40 model trained on the ImageNet data set, which applies mean subtraction to the data during training; therefore, the same mean-subtraction preprocessing is applied to the images when the trained HRNetV2-W40 model is transferred to the ship detection and segmentation task.
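By way of illustration, the mean-subtraction preprocessing can be sketched in Python as follows; the per-channel values are the commonly used ImageNet RGB means and are an assumption, since the specification does not list them:

```python
import numpy as np

# Assumed per-channel ImageNet RGB means; the specification only requires
# that the same mean subtraction used during pre-training be reused here.
IMAGENET_MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)

def preprocess(image_rgb: np.ndarray) -> np.ndarray:
    """Subtract the pre-training channel means from an HxWx3 RGB image."""
    return image_rgb.astype(np.float32) - IMAGENET_MEAN
```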
Step two: constructing the multi-resolution parallel convolution backbone network HRFPN to extract an image feature map
The overall network comprises four stages: the resolution of the image is down-sampled to 1/4 of the original by two 3×3 convolutions with stride 2, which serves as the input to the first stage and is also the feature map resolution of stage 1; stages 2, 3 and 4 contain feature maps of 2, 3 and 4 resolutions, respectively.
Specifically, like the ResNet-50 structure, stage 1 contains 4 residual units, each consisting of a bottleneck module with 64 channels, and the number of channels of the feature map is then scaled to C by one 3×3 convolution.
In particular, stages 2, 3, 4 consist of 1, 4 and 3 repeated modular multi-resolution blocks, respectively. The multi-resolution block consists of a multi-resolution group convolution and a multi-resolution convolution.
Specifically, the structure diagram of the multi-resolution group convolution is shown in fig. 3, where each branch of the multi-resolution group convolution contains 4 residual units, and each unit contains 2 3 × 3 convolutions for each resolution.
Specifically, the structure of the multi-resolution convolution is shown in fig. 4. Its inputs and outputs are feature maps of different resolutions; to ensure that the resolution and channel number of the feature maps are consistent when they are fused, a high-resolution feature map is fused into a low-resolution one through i 3×3 convolutions with stride 2, and a low-resolution feature map is fused into a high-resolution one through j bilinear upsampling operations, where i and j take values in [1, 3] according to the feature map sizes.
Specifically, the feature map channels of 4 resolutions are C, 2C, 4C, and 8C, respectively, and C is set to 40.
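As a minimal sketch of the fusion rule described above (module name, branch indexing and the omission of batch normalization are illustrative assumptions, not taken from the specification), the high-to-low fusion by i stride-2 3×3 convolutions and the low-to-high fusion by bilinear upsampling plus a 1×1 convolution can be written in PyTorch as:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResolutionFuse(nn.Module):
    """Fuses branch r into branch s; the branch index increases as the
    resolution halves, so r < s is high-to-low and r > s is low-to-high."""
    def __init__(self, r: int, s: int, c_in: int, c_out: int):
        super().__init__()
        self.r, self.s = r, s
        if r < s:
            # i = s - r stride-2 3x3 convolutions (high -> low resolution)
            layers, c = [], c_in
            for k in range(s - r):
                nxt = c_out if k == s - r - 1 else c
                layers += [nn.Conv2d(c, nxt, 3, stride=2, padding=1), nn.ReLU()]
                c = nxt
            self.op = nn.Sequential(*layers)
        elif r > s:
            # bilinear upsampling by 2^(r-s), then a 1x1 convolution
            self.op = nn.Conv2d(c_in, c_out, 1)
        else:
            self.op = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.r > self.s:
            x = F.interpolate(x, scale_factor=2 ** (self.r - self.s),
                              mode="bilinear", align_corners=False)
        return self.op(x)

# Example: fuse the 2C branch (index 1) up into the C branch (index 0).
x = torch.randn(1, 80, 32, 32)
y = ResolutionFuse(r=1, s=0, c_in=80, c_out=40)(x)  # -> (1, 40, 64, 64)
```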
Specifically, the network forms the feature maps C2, C3, C4, and C5 by connecting multiple resolution (from high resolution to low resolution) parallel convolutions and performing repeated information exchange between the parallel convolutions.
Specifically, like the FPN feature pyramid network shown in fig. 2, the final feature maps {P2, P3, P4, P5} are formed by fusing {C2, C3, C4, C5}, and the specific calculation formula is as follows:
P5 = Conv1×1(C5),
Pk = Conv3×3(Conv1×1(Ck) ⊕ Upsample(Pk+1)), k = 4, 3, 2.
in particular, Conv1×1And Conv3×3Respectively represent a 1 × 1 convolutional layer and a 3 × 3 convolutional layer; upsamplie represents bilinear upsampling, followed by 1 × 1 convolution operation; down sample represents a 3 × 3 convolutional layer with a step size of 2;a signature addition operation is shown.
Specifically, P6 is generated from P5 by one 3×3 convolution with stride 2, i.e. P6 = Downsample(P5).
Specifically, the output channels of { P2, P3, P4, P5, P6} are all 256.
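A minimal PyTorch sketch of the P2-P6 construction under the formulas above follows; the class name and the example input sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidHead(nn.Module):
    """P5 = Conv1x1(C5); Pk = Conv3x3(Conv1x1(Ck) + Upsample(Pk+1));
    P6 = stride-2 3x3 conv on P5. Channel widths follow C = 40."""
    def __init__(self, in_channels=(40, 80, 160, 320), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, 3, padding=1)
             for _ in in_channels[:-1]])
        self.down = nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)

    @staticmethod
    def _up(x):
        return F.interpolate(x, scale_factor=2, mode="bilinear",
                             align_corners=False)

    def forward(self, c2, c3, c4, c5):
        p5 = self.lateral[3](c5)
        p4 = self.smooth[2](self.lateral[2](c4) + self._up(p5))
        p3 = self.smooth[1](self.lateral[1](c3) + self._up(p4))
        p2 = self.smooth[0](self.lateral[0](c2) + self._up(p3))
        return p2, p3, p4, p5, self.down(p5)

sizes = (64, 32, 16, 8)   # strides 4..32 on a 256x256 input image
c2, c3, c4, c5 = [torch.randn(1, c, s, s)
                  for c, s in zip((40, 80, 160, 320), sizes)]
outs = PyramidHead()(c2, c3, c4, c5)   # five maps, each with 256 channels
```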
Step three: generation of ship candidate region based on RPN
Specifically, anchors with areas of {32², 64², 128², 256², 512²} are set for {P2, P3, P4, P5, P6} respectively; the aspect ratios of the anchors on each feature map level are set to {1:1, 1:2, 2:1}, giving 15 anchor settings over the feature pyramid.
Specifically, training positive and negative samples are assigned according to the IoU overlap between an anchor and the corresponding label box. When IoU is greater than 0.7, the anchor is a positive sample; when IoU is less than 0.3, the anchor is a negative sample; and the total number of positive and negative samples in one image does not exceed 2000.
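The IoU-based sample assignment can be sketched as follows (an illustrative helper, not the patented implementation; the ignore label -1 for anchors between the two thresholds is an assumption):

```python
import torch
from torchvision.ops import box_iou

def assign_anchors(anchors: torch.Tensor, gt_boxes: torch.Tensor,
                   pos_thr: float = 0.7, neg_thr: float = 0.3) -> torch.Tensor:
    """Label (x1, y1, x2, y2) anchors: 1 = positive (IoU > 0.7),
    0 = negative (IoU < 0.3), -1 = ignored (in between)."""
    best_iou = box_iou(anchors, gt_boxes).max(dim=1).values  # best IoU per anchor
    labels = torch.full((anchors.shape[0],), -1, dtype=torch.long)
    labels[best_iou > pos_thr] = 1
    labels[best_iou < neg_thr] = 0
    return labels

anchors = torch.tensor([[0., 0., 32., 32.], [100., 100., 164., 164.]])
gt = torch.tensor([[2., 2., 34., 34.]])
print(assign_anchors(anchors, gt))  # tensor([1, 0])
```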
Step four: building a multi-task cascade network to obtain the ship localization box and mask
Specifically, RoIAlign is applied to adjust all anchors into fixed-size feature vectors; the feature vector of the classification and regression branch is 7×7, and that of the segmentation branch is 14×14.
Specifically, as shown in fig. 6, by using the idea of multitask cascade detection, the information flow is improved by combining cascade and multitask processing at each stage, and the accuracy is further improved by using the spatial context.
Specifically, the multi-task cascade network is characterized by: (a) alternating use of target localization box regression predictions; (b) feeding the mask features of the previous stage to the mask branch of the current stage and introducing a direct path to strengthen the information flow between mask branches; and (c) adding an additional semantic segmentation branch and fusing it with the box and mask branches to exploit more context information.
Specifically, as shown in fig. 6, the whole network has 3 detection heads, and CLS, BOX and MASK represent classification, bounding BOX prediction and MASK prediction branches, respectively. The IoU threshold values of the 3 detection heads are 0.5, 0.6 and 0.7 respectively, and the prediction of each stage is input into the next stage to obtain a high-quality prediction result.
Specifically, the features of the current bounding box are obtained from the regressed bounding box of the previous stage through RoIAlign.
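A sketch of the RoIAlign pooling into the 7×7 and 14×14 feature vectors, using torchvision's roi_align; the spatial_scale of 1/4 corresponds to the P2 level and is an assumption:

```python
import torch
from torchvision.ops import roi_align

feats = torch.randn(1, 256, 200, 200)              # one FPN level, 256 channels
rois = torch.tensor([[0., 40., 60., 120., 100.]])  # (batch_idx, x1, y1, x2, y2)

# spatial_scale maps image coordinates onto this level (1/4 assumed for P2).
cls_box_feat = roi_align(feats, rois, output_size=(7, 7),
                         spatial_scale=0.25, sampling_ratio=2)   # 7x7 branch
mask_feat = roi_align(feats, rois, output_size=(14, 14),
                      spatial_scale=0.25, sampling_ratio=2)      # 14x14 branch
```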
Specifically, connections are made between the mask prediction branches of adjacent stages, providing information flow for the mask branches. The mask computation of two adjacent stages consists of four 3×3 convolutional layers and one deconvolution layer.
Specifically, semantic segmentation is introduced into the multi-task cascade network because it performs fine pixel-level classification of the whole image and thereby provides strong spatial position information. The feature maps of the 5 FPN levels are first scaled to the same size for multi-scale feature fusion, features are then extracted through four 3×3 convolutional layers, and fixed-size semantic features are obtained through a 1×1 convolution.
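A minimal sketch of such a semantic segmentation branch follows (the class name, ReLU activations and the two-class output are assumptions not fixed by the specification):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticBranch(nn.Module):
    """Fuse the 5 FPN levels at a common size, extract features with four
    3x3 convolutions, then reduce with a 1x1 convolution."""
    def __init__(self, channels=256, num_classes=2):
        super().__init__()
        self.convs = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(4)])
        self.out = nn.Conv2d(channels, num_classes, 1)

    def forward(self, pyramid):           # list of (N, 256, Hi, Wi) maps
        size = pyramid[0].shape[-2:]      # rescale everything to the P2 size
        fused = sum(F.interpolate(p, size=size, mode="bilinear",
                                  align_corners=False) for p in pyramid)
        return self.out(self.convs(fused))
```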
Specifically, for a single image, the multi-task loss function during training is defined as follows:
L = Σ_t (L_cls^t + L_box^t + L_mask^t) + L_seg, t = 1, 2, 3,
wherein L_cls, L_box and L_mask respectively represent the classification loss, localization box regression loss and mask prediction loss of stage t, with t taking values 1, 2 and 3; L_seg represents the semantic segmentation loss.
Specifically, the classification loss is defined as follows:
L_cls = −[p_i* log(p_i) + (1 − p_i*) log(1 − p_i)],
wherein i denotes the index of an anchor, p_i* denotes the label value of the ith anchor, and p_i denotes its predicted value; p_i* = 1 for a ship and p_i* = 0 for a non-ship.
Specifically, for the regression loss, define t_i = {t_x, t_y, t_w, t_h} as the predicted rectangular-box parameter values of the ship and t_i* as the label values of the rectangular box of the ship anchor; the four parameter values are computed as follows:
t_x = (x − x_a)/w_a, t_y = (y − y_a)/h_a,
t_w = log(w/w_a), t_h = log(h/h_a),
wherein x, y, w and h respectively denote the center-point coordinates, width and height of the rectangular box; the variables x, x_a and x* correspond respectively to the prediction box, the anchor box and the label box (likewise for y, w and h). The regression loss function is defined as follows:
L_box = Σ_i p_i* · smoothL1(t_i − t_i*), with smoothL1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise.
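A small helper illustrating the (t_x, t_y, t_w, t_h) encoding just defined (the function name and tensor layout are assumptions); the label deltas t_i* are obtained the same way with the label box in place of the prediction box:

```python
import torch

def encode_deltas(box: torch.Tensor, anchor: torch.Tensor) -> torch.Tensor:
    """(tx, ty, tw, th) of `box` relative to `anchor`; both are given as
    (cx, cy, w, h) tensors, matching the formulas above."""
    tx = (box[0] - anchor[0]) / anchor[2]
    ty = (box[1] - anchor[1]) / anchor[3]
    tw = torch.log(box[2] / anchor[2])
    th = torch.log(box[3] / anchor[3])
    return torch.stack([tx, ty, tw, th])

pred = torch.tensor([50., 50., 40., 20.])   # prediction box (x, y, w, h)
anc = torch.tensor([48., 52., 32., 32.])    # ship candidate region box
print(encode_deltas(pred, anc))             # the t_i vector
```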
specifically, for the mask prediction branch, setting the output resolution of each anchor as an m × m binary mask map, the mask prediction loss function is defined as:wherein m isiRepresenting the confidence with which the object is predicted to be the target,and representing the output of each pixel in the ith mask after sigmoid.
Specifically, for the semantic segmentation branch, with the semantic segmentation map output for each anchor denoted s and the label semantic segmentation map denoted s*, the semantic segmentation loss function is defined as:
L_seg = −[s* log(s) + (1 − s*) log(1 − s)].
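Putting the pieces together, a one-stage slice of the multi-task loss might look like the following sketch; treating L_cls, L_mask and L_seg as binary cross-entropy and L_box as smooth L1 matches the definitions above, while the unweighted sum is an assumption:

```python
import torch.nn.functional as F

def stage_loss(cls_logit, cls_label, box_pred, box_target,
               mask_logit, mask_target):
    """Loss of one cascade stage: BCE classification, smooth-L1 box
    regression and per-pixel BCE mask loss (all unweighted, an assumption)."""
    return (F.binary_cross_entropy_with_logits(cls_logit, cls_label)
            + F.smooth_l1_loss(box_pred, box_target)
            + F.binary_cross_entropy_with_logits(mask_logit, mask_target))

def total_loss(stage_outputs, seg_logit, seg_target):
    """Sum the three stage losses (t = 1, 2, 3) and add L_seg once."""
    l = sum(stage_loss(*s) for s in stage_outputs)
    return l + F.binary_cross_entropy_with_logits(seg_logit, seg_target)
```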
Step five: obtaining refined detection results with the NMS method
Non-maximum suppression (NMS) specifically means: all detection boxes are sorted from high to low by score; candidate boxes with low mutual overlap and high scores are retained, and candidate boxes with high overlap and low scores are discarded.
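In code, this step reduces to a call to an off-the-shelf NMS routine; the IoU threshold of 0.5 is an assumed value, since the specification does not fix it:

```python
import torch
from torchvision.ops import nms

boxes = torch.tensor([[10., 10., 50., 50.],
                      [12., 12., 52., 52.],
                      [100., 100., 150., 150.]])
scores = torch.tensor([0.9, 0.6, 0.8])

# Keep high-scoring boxes and drop boxes that overlap a kept box above
# the IoU threshold (0.5 here is an assumption).
keep = nms(boxes, scores, iou_threshold=0.5)
refined = boxes[keep]
```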
The final test results are shown in fig. 7.
The optical remote sensing image ship target detection and segmentation results of the invention and of the prior-art Mask R-CNN (feature extraction backbone ResNet-101) are evaluated with two indexes, accuracy (AP) and recall (AR), calculated with the following formulas.
The accuracy AP is the number of correctly detected targets divided by the total number of detected targets.
The recall AR is the number of correctly detected targets divided by the total number of actual targets.
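These two definitions translate directly into code (an illustrative sketch; the function names are assumptions):

```python
def accuracy_ap(num_correct: int, num_detections: int) -> float:
    """AP: correctly detected targets / total detected targets."""
    return num_correct / num_detections

def recall_ar(num_correct: int, num_actual: int) -> float:
    """AR: correctly detected targets / total actual targets."""
    return num_correct / num_actual
```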
Specifically, the data set adopted in the experiment is the Airbus-ship data set of the Kaggle optical remote sensing ship detection competition, which contains 42615 images with ship positions and mask labels; according to the number of ship instances, the data set is divided into training, validation and test sets in the ratio 8:1:1.
Specifically, the parameter settings during training of the two networks are kept consistent: the initial learning rate is set to 0.001, the total number of training epochs is 24, the learning rate is multiplied by 0.1 at epochs 16 and 22, and the whole model is optimized with stochastic gradient descent (SGD) with a momentum of 0.9 and a weight decay of 0.0001.
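The stated schedule maps directly onto a standard SGD optimizer with a multi-step learning-rate schedule; the stand-in model below is an assumption:

```python
import torch

model = torch.nn.Linear(10, 2)   # stand-in for the detection network

optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
                            momentum=0.9, weight_decay=0.0001)
# Multiply the learning rate by 0.1 at epochs 16 and 22; 24 epochs total.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[16, 22], gamma=0.1)

for epoch in range(24):
    # ... one training epoch over the Airbus-ship training split ...
    scheduler.step()
```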
Specifically, the operating system used in the experiment is Ubuntu 18.04, a single RTX 2080 Ti GPU is used for training and testing, and the deep learning framework is PyTorch 1.5.0.
The ship detection accuracy and recall indexes of the invention and of the prior-art Mask R-CNN are listed in Table 1.
Table 1 Summary of simulation test results
Test index | Mask R-CNN | The method of the invention
---|---|---
AP | 70.0% | 80.7%
AR | 71.6% | 82.2%
It can be seen from Table 1 that the AP and AR values of the existing Mask R-CNN are 70.0% and 71.6% respectively, while those of the method of the invention are 80.7% and 82.2% respectively; the ship target detection results of the simulation experiment of the invention are better.
The ship segmentation accuracy and recall indexes of the invention and of the prior-art Mask R-CNN are listed in Table 2.
Table 2 Summary of simulation test results
Test index | Mask R-CNN | The method of the invention
---|---|---
AP | 64.1% | 78.2%
AR | 67.1% | 80.8%
It can be seen from Table 2 that the AP and AR values of the existing Mask R-CNN are 64.1% and 67.1% respectively, while those of the method of the invention are 78.2% and 80.8% respectively; the ship segmentation results of the simulation experiment of the invention are better.
Considering that the size of ship targets in remote sensing varies greatly, AP and AR are subdivided into AP_L, AP_M, AP_S and AR_L, AR_M, AR_S, where AP_L, AP_M and AP_S are the accuracies for large, medium and small targets and AR_L, AR_M and AR_S are the recalls for large, medium and small targets. Specifically, a large target is larger than 96×96 pixels, a medium target is between 32×32 and 96×96 pixels, and a small target is smaller than 32×32 pixels.
Table 3 lists the ship detection AP_L, AP_M, AP_S, AR_L, AR_M and AR_S indexes of the invention and of the prior-art Mask R-CNN.
Table 3 Summary of simulation test results
Test index | Mask R-CNN | The method of the invention
---|---|---
AP_L | 94.8% | 96.0%
AP_M | 95.9% | 97.5%
AP_S | 56.7% | 71.9%
AR_L | 97.4% | 97.4%
AR_M | 97.2% | 98.5%
AR_S | 58.4% | 73.8%
From Table 3 it can be seen that the AP_S and AR_S values of the existing Mask R-CNN are 56.7% and 58.4% respectively, while those of the method of the invention are 71.9% and 73.8% respectively; compared with Mask R-CNN, the invention detects small targets better.
Table 4 lists the ship segmentation AP_L, AP_M, AP_S, AR_L, AR_M and AR_S indexes of the invention and of the prior-art Mask R-CNN.
Table 4 Summary of simulation test results
Test index | Mask R-CNN | The method of the invention
---|---|---
AP_L | 85.0% | 91.1%
AP_M | 84.3% | 94.0%
AP_S | 48.6% | 70.4%
AR_L | 88.5% | 94.6%
AR_M | 87.0% | 95.6%
AR_S | 51.7% | 73.1%
From Table 4 it can be seen that the AP_S and AR_S values of the existing Mask R-CNN are 48.6% and 51.7% respectively, while those of the method of the invention are 70.4% and 73.1% respectively; compared with Mask R-CNN, the invention segments small targets better.
In conclusion, the multi-resolution parallel convolution module and the multi-task cascade detection module are added on the basis of deep neural network target detection and segmentation, so that the accuracy of optical remote sensing image ship detection and segmentation is effectively improved, and particularly, the method has better detection capability on small targets.
Although the invention has been described and illustrated in some detail, it should be understood that various modifications may be made to the described embodiments or equivalents may be substituted, as will be apparent to those skilled in the art, without departing from the spirit of the invention.
Claims (10)
1. An optical remote sensing image ship detection and segmentation method based on deep learning is characterized in that: which comprises the following steps:
step one, reading in image data, and preprocessing the image according to a transfer learning method;
step two, constructing a multi-resolution parallel convolution backbone network HRFPN to extract an image feature map;
step three, generating ship candidate regions based on the RPN;
step four, using the idea of multi-task cascade detection, adding a semantic segmentation branch, obtaining the classification probability value, localization box and mask of the ship, and calculating the loss function;
and step five, obtaining a refined detection result by utilizing an NMS method.
2. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: in the first step, model parameters obtained by training a convolutional neural network on a large data set are used as initial values of the feature extraction layers of the network, and model fine-tuning is then carried out.
3. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: in the first step, the HRNetV2-W40 model obtained by training on the ImageNet data set applies mean subtraction during training, and when the trained HRNetV2-W40 model is transferred to the ship detection and segmentation task, the same mean-subtraction preprocessing is applied to the images.
4. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: the overall network in the second step comprises four stages; the resolution of the image is down-sampled to 1/4 of the original by two 3×3 convolutions with stride 2 as input to the first stage, which is also the feature map resolution of stage 1; stages 2, 3 and 4 contain feature maps of 2, 3 and 4 resolutions, respectively; stage 1 contains 4 residual units, each consisting of a bottleneck module with 64 channels, after which the number of channels of the feature map is scaled to C by one 3×3 convolution; stages 2, 3 and 4 consist of 1, 4 and 3 repeated modular multi-resolution blocks, respectively; a multi-resolution block consists of a multi-resolution group convolution and a multi-resolution convolution; each branch of the multi-resolution group convolution contains 4 residual units, and for each resolution each unit contains two 3×3 convolutions; the inputs and outputs of the multi-resolution convolution are feature maps of different resolutions, and to ensure that the resolution and channel number of the feature maps are consistent during fusion, a high-resolution feature map is fused into a low-resolution one through i 3×3 convolutions with stride 2 and a low-resolution feature map is fused into a high-resolution one through j bilinear upsampling operations, where i and j take values in [1, 3] according to the feature map sizes; the channel numbers of the feature maps at the 4 resolutions are C, 2C, 4C and 8C respectively, with C set to 40.
5. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 4, wherein: in the second step, the network forms feature maps C2, C3, C4 and C5 by connecting multi-resolution parallel convolutions and performing repeated information exchange between them, and fuses {C2, C3, C4, C5} to form the final feature maps {P2, P3, P4, P5} according to the following calculation formula:
P5 = Conv1×1(C5),
Pk = Conv3×3(Conv1×1(Ck) ⊕ Upsample(Pk+1)), k = 4, 3, 2,
wherein Conv1×1 and Conv3×3 respectively represent a 1×1 convolutional layer and a 3×3 convolutional layer; Upsample represents bilinear upsampling followed by a 1×1 convolution operation; Downsample represents a 3×3 convolutional layer with stride 2; ⊕ represents the feature map addition operation; P6 is generated from P5 by one 3×3 convolution with stride 2; and the output channels of {P2, P3, P4, P5, P6} are all 256.
6. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: in the third step, anchors with areas of {32², 64², 128², 256², 512²} are set for {P2, P3, P4, P5, P6} respectively; the aspect ratios of the ship candidate regions on each feature map level are set to {1:1, 1:2, 2:1}, giving 15 ship candidate region settings over the feature pyramid; training positive and negative samples are assigned according to the IoU overlap between a ship candidate region and the corresponding label box; when IoU is greater than 0.7, the ship candidate region is a positive sample; when IoU is less than 0.3, the ship candidate region is a negative sample; and the total number of positive and negative samples in one image does not exceed 2000.
7. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: in step four, a multi-task cascade network is constructed to obtain the ship localization box and MASK; RoIAlign adjusts all ship candidate regions into fixed-size feature vectors, with the feature vector of the classification and regression branch being 7×7 and that of the segmentation branch being 14×14; using the idea of multi-task cascade detection, cascading and multi-task processing are combined at each stage to improve information flow, and spatial context is used to further improve accuracy; the whole network has 3 detection heads, with CLS, BOX and MASK respectively denoting the classification, bounding-box prediction and mask prediction branches; the IoU thresholds of the 3 detection heads are 0.5, 0.6 and 0.7 respectively; the prediction of each stage is fed into the next stage to obtain high-quality predictions; and the features of the current bounding box are obtained from the regressed bounding box of the previous stage through RoIAlign.
8. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 7, wherein: in the fourth step, the mask prediction branches of adjacent stages are connected to provide information flow between mask branches; the mask computation of two adjacent stages consists of four 3×3 convolutional layers and one deconvolution layer; the feature maps of the 5 FPN levels are first scaled to the same size for multi-scale feature fusion, features are then extracted through the four 3×3 convolutional layers, and fixed-size semantic features are obtained through a 1×1 convolution.
9. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 8, wherein: in the fourth step, for a single image, the multi-task loss function during training is defined as follows:
L = Σ_t (L_cls^t + L_box^t + L_mask^t) + L_seg, t = 1, 2, 3,
wherein L_cls, L_box and L_mask respectively represent the classification loss, localization box regression loss and mask prediction loss of stage t, with t taking values 1, 2 and 3, and L_seg represents the semantic segmentation loss; the classification loss is defined as
L_cls = −[p_i* log(p_i) + (1 − p_i*) log(1 − p_i)],
wherein i denotes the index of an anchor, p_i* denotes the label value of the ith anchor and p_i its predicted value, with p_i* = 1 for a ship and p_i* = 0 for a non-ship; for the regression loss, t_i = {t_x, t_y, t_w, t_h} is defined as the predicted rectangular-box parameter values of the ship and t_i* as the label values of the rectangular box of the ship anchor, the four parameter values being computed as:
t_x = (x − x_a)/w_a, t_y = (y − y_a)/h_a,
t_w = log(w/w_a), t_h = log(h/h_a),
wherein x, y, w and h denote the center-point coordinates, width and height of the rectangular box, and the variables x, x_a and x* correspond respectively to the prediction box, the ship candidate region box and the label box (likewise for y, w and h); the regression loss function is defined as
L_box = Σ_i p_i* · smoothL1(t_i − t_i*), with smoothL1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise;
for the mask prediction branch, with the output resolution of each anchor set to an m×m binary mask map, the mask prediction loss function is defined as
L_mask = −(1/m²) Σ_i [m_i log(m̂_i) + (1 − m_i) log(1 − m̂_i)],
wherein m_i denotes the label value of the ith pixel and m̂_i denotes the sigmoid output of the ith pixel of the mask; and for the semantic segmentation branch, with the semantic segmentation map output for each anchor denoted s and the label semantic segmentation map denoted s*, the semantic segmentation loss function is defined as
L_seg = −[s* log(s) + (1 − s*) log(1 − s)].
10. The optical remote sensing image ship detection and segmentation method based on deep learning of claim 1, wherein: in step five, all detection boxes are sorted from high to low by score, candidate boxes with low mutual overlap and high scores are retained, and candidate boxes with high overlap and low scores are discarded.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011080445.7A CN112507777A (en) | 2020-10-10 | 2020-10-10 | Optical remote sensing image ship detection and segmentation method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011080445.7A CN112507777A (en) | 2020-10-10 | 2020-10-10 | Optical remote sensing image ship detection and segmentation method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112507777A true CN112507777A (en) | 2021-03-16 |
Family
ID=74954106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011080445.7A Pending CN112507777A (en) | 2020-10-10 | 2020-10-10 | Optical remote sensing image ship detection and segmentation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112507777A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108491854A (en) * | 2018-02-05 | 2018-09-04 | 西安电子科技大学 | Remote sensing image object detection method based on SF-RCNN |
CN108960143A (en) * | 2018-07-04 | 2018-12-07 | 北京航空航天大学 | Detect deep learning method in a kind of naval vessel in High Resolution Visible Light remote sensing images |
CN109711295A (en) * | 2018-12-14 | 2019-05-03 | 北京航空航天大学 | A kind of remote sensing image offshore Ship Detection |
CN111461127A (en) * | 2020-03-30 | 2020-07-28 | 华南理工大学 | Example segmentation method based on one-stage target detection framework |
CN111723748A (en) * | 2020-06-22 | 2020-09-29 | 电子科技大学 | Infrared remote sensing image ship detection method |
Non-Patent Citations (1)
Title |
---|
Su Hao, "Remote Sensing Image Target Detection Method Based on Deep Learning", CNKI Master's Thesis, 15 July 2020 (2020-07-15), pages 1-107
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113239953A (en) * | 2021-03-30 | 2021-08-10 | 西安电子科技大学 | SAR image rotating ship detection method based on directed Gaussian function |
CN113239953B (en) * | 2021-03-30 | 2024-02-09 | 西安电子科技大学 | SAR image rotation ship detection method based on directed Gaussian function |
CN113111885A (en) * | 2021-04-14 | 2021-07-13 | 清华大学深圳国际研究生院 | Dynamic resolution instance segmentation method and computer readable storage medium |
CN113160246A (en) * | 2021-04-14 | 2021-07-23 | 中国科学院光电技术研究所 | Image semantic segmentation method based on depth supervision |
CN113269734A (en) * | 2021-05-14 | 2021-08-17 | 成都市第三人民医院 | Tumor image detection method and device based on meta-learning feature fusion strategy |
CN113312998A (en) * | 2021-05-19 | 2021-08-27 | 中山大学·深圳 | SAR image target identification method and device based on high-resolution network and storage medium |
CN113505634A (en) * | 2021-05-24 | 2021-10-15 | 安徽大学 | Double-flow decoding cross-task interaction network optical remote sensing image salient target detection method |
CN113436148A (en) * | 2021-06-02 | 2021-09-24 | 范加利 | Method and system for detecting critical points of ship-borne airplane wheel contour based on deep learning |
CN113420641A (en) * | 2021-06-21 | 2021-09-21 | 梅卡曼德(北京)机器人科技有限公司 | Image data processing method, image data processing device, electronic equipment and storage medium |
CN113378742A (en) * | 2021-06-21 | 2021-09-10 | 梅卡曼德(北京)机器人科技有限公司 | Image recognition method and device, electronic equipment and storage medium |
CN113468991A (en) * | 2021-06-21 | 2021-10-01 | 沈阳工业大学 | Parking space detection method based on panoramic video |
CN113420641B (en) * | 2021-06-21 | 2024-06-14 | 梅卡曼德(北京)机器人科技有限公司 | Image data processing method, device, electronic equipment and storage medium |
CN113468991B (en) * | 2021-06-21 | 2024-03-05 | 沈阳工业大学 | Parking space detection method based on panoramic video |
CN113343883A (en) * | 2021-06-22 | 2021-09-03 | 长光卫星技术有限公司 | Port ore pile segmentation method based on improved HRNetV2 network |
CN113343883B (en) * | 2021-06-22 | 2022-06-07 | 长光卫星技术股份有限公司 | Port ore pile segmentation method based on improved HRNetV2 network |
CN113256500A (en) * | 2021-07-02 | 2021-08-13 | 北京大学第三医院(北京大学第三临床医学院) | Deep learning neural network model system for multi-modal image synthesis |
CN114092364B (en) * | 2021-08-12 | 2023-10-03 | 荣耀终端有限公司 | Image processing method and related device |
CN114092364A (en) * | 2021-08-12 | 2022-02-25 | 荣耀终端有限公司 | Image processing method and related device |
CN114155247A (en) * | 2021-08-26 | 2022-03-08 | 航天恒星科技有限公司 | Training method and device for high-resolution remote sensing image instance segmentation model |
CN114155247B (en) * | 2021-08-26 | 2024-07-12 | 航天恒星科技有限公司 | Training method and device for high-resolution remote sensing image instance segmentation model |
CN113628208A (en) * | 2021-08-30 | 2021-11-09 | 北京中星天视科技有限公司 | Ship detection method, device, electronic equipment and computer readable medium |
CN113628208B (en) * | 2021-08-30 | 2024-02-06 | 北京中星天视科技有限公司 | Ship detection method, device, electronic equipment and computer readable medium |
CN113762204B (en) * | 2021-09-17 | 2023-05-12 | 中国人民解放军国防科技大学 | Multidirectional remote sensing target detection method and device and computer equipment |
CN113762204A (en) * | 2021-09-17 | 2021-12-07 | 中国人民解放军国防科技大学 | Multi-direction remote sensing target detection method and device and computer equipment |
CN113850783B (en) * | 2021-09-27 | 2022-08-30 | 清华大学深圳国际研究生院 | Sea surface ship detection method and system |
CN113850783A (en) * | 2021-09-27 | 2021-12-28 | 清华大学深圳国际研究生院 | Sea surface ship detection method and system |
CN113870286A (en) * | 2021-09-30 | 2021-12-31 | 重庆理工大学 | Foreground segmentation method based on multi-level feature and mask fusion |
CN113989665B (en) * | 2021-10-25 | 2023-04-07 | 电子科技大学 | SAR ship detection method based on route aggregation sensing FPN |
CN113989665A (en) * | 2021-10-25 | 2022-01-28 | 电子科技大学 | SAR ship detection method based on route aggregation sensing FPN |
CN114387492A (en) * | 2021-11-19 | 2022-04-22 | 西北工业大学 | Near-shore surface area ship detection method and device based on deep learning |
CN114219989A (en) * | 2021-11-25 | 2022-03-22 | 哈尔滨工程大学 | Foggy scene ship instance segmentation method based on interference suppression and dynamic contour |
CN114612769A (en) * | 2022-03-14 | 2022-06-10 | 电子科技大学 | Integrated sensing infrared imaging ship detection method integrated with local structure information |
CN115272242A (en) * | 2022-07-29 | 2022-11-01 | 西安电子科技大学 | YOLOv 5-based optical remote sensing image target detection method |
CN115272242B (en) * | 2022-07-29 | 2024-02-27 | 西安电子科技大学 | YOLOv 5-based optical remote sensing image target detection method |
CN116030351A (en) * | 2023-03-28 | 2023-04-28 | 南京信息工程大学 | Cascade network-based aerial image ship segmentation method |
CN117876884A (en) * | 2024-01-09 | 2024-04-12 | 中国科学院自动化研究所 | High-resolution visible light ship detection method and system guided by saliency information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |