CN113177456A - Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion - Google Patents
- Publication number
- CN113177456A (application CN202110442872.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- target
- feature
- data set
- remote sensing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
- G06T5/30—Erosion or dilatation, e.g. thinning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20064—Wavelet transform [DWT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses an optical remote sensing image target detection method based on multi-feature fusion, which mainly solves the prior-art problem of insufficient extraction of target features from optical remote sensing images. The implementation scheme is as follows: 1) extract the mathematical morphology features, linear scale-space features and nonlinear scale-space features of the original data, and fuse the three to obtain a fused feature map; 2) divide the fused feature maps into a training data set and a test data set, and perform small-target expansion on the training data set; 3) construct a target detection network and train it on the expanded training data set with a gradient descent algorithm; 4) test the test data set with the trained network to obtain detection results. The method enhances the contour and edge features of the target, helps improve the accuracy of target detection, and can be used for resource exploration, natural disaster assessment and target identification.
Description
Technical Field
The invention belongs to the technical field of optical remote sensing images, and particularly relates to a target detection method with multi-feature fusion, which can be used for resource exploration, natural disaster assessment and target identification.
Background
Remote sensing images have complex backgrounds, unbalanced target categories, large variations in target scale, and unusual shooting angles, which make target detection in remote sensing images difficult and challenging.
Traditional target detection methods rely mainly on hand-designed feature extraction operators, such as V-J detection, HOG detection and the DPM algorithm. Because such a detector can only fit a single type of image feature with a fixed extraction algorithm, these methods work only when the target has obvious features against a simple background, and cannot meet the requirements of remote sensing image target detection.
Deep-learning-based target detection methods extract image features with convolutional networks and can capture many rich features of the same target, so their detection accuracy far exceeds that of traditional hand-designed methods; they have become the industry mainstream and are widely used in remote sensing image target detection tasks.
Patent [CN112580439A] proposes a ship target detection method for remote sensing images that combines the YOLO v5 network structure with the attention mechanism of SENet.
Patent [CN110378297A] designs a multi-scale feature extraction network that extracts multi-scale image features and predicts candidate regions on the feature map of each image scale, effectively improving the accuracy of remote sensing image target detection.
Patent [CN112070729A] adopts an anchor-free target detection network: the remote sensing image data set is first linearly enhanced with a balance-coefficient mix-up scheme, and features are then extracted and fused by the deep residual network ResNet-50 and the feature pyramid network FPN. By fully exploiting contextual multi-feature fusion, it strengthens the network's feature extraction and category prediction capabilities and improves detection precision.
However, the above deep-convolutional-network methods either apply convolution directly to the original input image or preprocess the data with a simple linear enhancement. Neither eases the difficulty of feature extraction for the deep convolutional network; in particular, for target detection against the complex backgrounds of remote sensing images, such methods cannot accurately extract the feature information of the target region, which limits detection performance.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a remote sensing target detection method based on single-stage full convolution network and multi-feature fusion so as to accurately extract the feature information of a target part and improve the detection performance.
In order to achieve the purpose, the technical scheme of the invention comprises the following steps:
(1) respectively extracting morphological characteristics, linear scale space characteristics and nonlinear scale space characteristics of the optical remote sensing image:
1a) perform opening and closing operations on the original image to obtain 2n initial feature maps, then add all initial feature maps pixel by pixel and take the average to obtain the morphological feature map of the original image, where n is the number of opening (or closing) operations performed;
1b) filtering the original image by using a Gaussian filter and a Sobel edge extraction operator respectively to obtain a three-channel Gaussian fuzzy feature map and four single-channel local edge feature maps; summing the four local edge feature graphs pixel by pixel and averaging to obtain an integral edge feature graph; performing pixel-by-pixel fusion on each channel component of the three-channel Gaussian feature map and the integral edge feature map to obtain a linear multi-scale spatial feature map;
1c) converting an original optical remote sensing image into a single-channel gray-scale image, and performing wavelet decomposition on the single-channel gray-scale image by using a two-dimensional single-level wavelet transformation function to obtain four single-channel subgraphs, namely a low-frequency component diagram, a horizontal high-frequency component diagram, a vertical high-frequency component diagram and a diagonal high-frequency component diagram; discarding the low-frequency component subgraph, and performing channel splicing on the other three high-frequency component subgraphs to obtain a nonlinear multi-scale spatial feature graph;
(2) constructing a fusion feature map:
2a) fuse the morphological feature map and the linear multi-scale spatial feature map pixel by pixel with proportions α and β to obtain an initial fused image, where α and β satisfy α + β = 0.5;
2b) multiply the original image by a scale coefficient of 0.5, sum it pixel by pixel with the initial fused image, and then add the nonlinear multi-scale spatial feature map pixel by pixel to obtain the final feature-fused image;
(3) data set partitioning and small target expansion:
3a) for all optical remote sensing images, compute the maximum and minimum areas over all targets to be detected according to the labeling information, denoted S_max and S_min, and set a threshold S accordingly;
3b) randomly divide all optical remote sensing images into a training data set and a test data set at a ratio of 8:2;
3c) for each original image in the training set, traverse all targets to be detected in the image; if the target area S_i is less than the threshold S, select a target-free position in the original image and copy the minimal square region containing the target to the selected position, obtaining a new training image; otherwise leave the original image unchanged. After the traversal, a new training data set is obtained;
(4) training and detecting by using a deep learning-based target detection network:
4a) respectively extracting and fusing the characteristics of the test data set and the new training data set according to the operations (1) and (2) to obtain a training data set and a test data set after the characteristics are fused;
4b) training the existing single-stage full convolution target detection network by using a training data set after feature fusion through a gradient descent algorithm until the overall loss of the network is not changed any more, and obtaining a trained target detection network;
4c) input the test data set into the trained target detection network to obtain the target detection results of the optical remote sensing images.
Compared with the prior art, the invention has the following advantages:
First, before any neural network convolution, multi-feature extraction and fusion are applied to the original image, which enhances the contour and edge features of the target. The deep convolutional network is therefore more sensitive to the target region when extracting features, the extracted features are more accurate, and target detection accuracy improves.
Second, fusing the image morphological features with the linear and nonlinear multi-scale features enhances target saliency compared with existing simple linear data enhancement. Especially for small targets and complex background regions, the target features are effectively strengthened and the background is suppressed, which improves the accuracy with which the deep convolutional neural network extracts target features and thus the detection performance.
Drawings
FIG. 1 is a general flow chart of an implementation of the present invention;
FIG. 2 is a sub-flow diagram of the present invention for constructing morphological features of an image;
FIG. 3 is a sobel operator directional template used by the present invention;
FIG. 4 is a sub-flow diagram of the construction of a linear scale spatial feature map according to the present invention;
FIG. 5 is a sub-flow diagram of the construction of a non-linear scale space feature map according to the present invention.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation steps of this example are as follows:
the remote sensing image contains abundant spatial information and scale effect, multi-scale is a characteristic naturally existing in the remote sensing image ground object observation, different levels of ground object features and spatial relation rules can be obtained by analyzing from different scales, the multi-scale spatial information is very key for the accurate identification of the ground object, and a deep learning method generally extracts and classifies the features of the remote sensing image from a set certain scale level and lacks the comprehensive consideration of the multi-scale spatial information. Therefore, more and more scholars are beginning to research how to combine the multi-scale spatial features of the remote sensing images to improve the spatial comprehensive feature recognition capability. In addition, the unique role of mathematical morphology in quantitative description and analysis of image geometric features makes the remote sensing image processing research quite intensive, and the mathematical morphology is a classical nonlinear spatial information processing technology and can extract meaningful shape components from complex information of an optical remote sensing image and retain spatial geometric structural characteristics in the image. Therefore, the characteristics of remote sensing ground object classification can be better met. A large number of researches show that the mathematical morphology can accurately describe the contour and the spatial relationship of the ground feature, and the abundant spatial information can be effectively extracted from the remote sensing image based on the calculation and processing of the mathematical morphology method. Therefore, the invention designs a multi-feature extraction fusion technology based on the mathematical morphological features and the multi-scale features of the remote sensing images, and the specific implementation steps are as follows:
Step 1, extract the mathematical morphological features of the image.
Mathematical morphology comprises four basic operations: dilation, erosion, opening and closing. Dilation convolves an operation kernel point by point with the original image and takes the maximum pixel value under the kernel as the new pixel value at that position; erosion likewise takes the minimum pixel value under the kernel. Opening is erosion followed by dilation; closing is dilation followed by erosion. To effectively suppress image noise and smooth image edges, the mathematical morphological features are extracted with opening and closing operations.
Referring to fig. 2, the specific implementation of this step is as follows:
1.1) performing opening operation and closing operation on an original image by using operation cores with the sizes of 3 × 3 and 5 × 5 respectively to obtain two opening operation characteristic diagrams open _3 and open _5 and two closing operation characteristic diagrams close _3 and close _5 respectively;
1.2) sum the obtained feature maps open_3, open_5, close_3 and close_5 pixel by pixel and average them to obtain a three-channel morphological feature map I_M, which has the same resolution and dimensions as the original image.
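Steps 1.1)-1.2) can be sketched in plain NumPy as follows. This is a simplified single-channel sketch under assumptions the text does not fix (square kernels, edge padding); the patent applies the same averaging to three-channel images.

```python
import numpy as np

def _window_op(img, k, op):
    # Per-pixel min/max over a k x k neighbourhood with edge padding:
    # np.min gives grayscale erosion, np.max gives grayscale dilation.
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    shifts = [padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
              for dy in range(k) for dx in range(k)]
    return op(np.stack(shifts), axis=0)

def opening(img, k):   # erosion followed by dilation
    return _window_op(_window_op(img, k, np.min), k, np.max)

def closing(img, k):   # dilation followed by erosion
    return _window_op(_window_op(img, k, np.max), k, np.min)

def morphology_feature(img):
    # Average of open_3, open_5, close_3, close_5, as in step 1.2).
    maps = [opening(img, 3), opening(img, 5), closing(img, 3), closing(img, 5)]
    return np.mean(maps, axis=0)
```

The output keeps the resolution of the input, matching the requirement that I_M has the same size as the original image.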
Step 2, extract the linear multi-scale spatial features of the image.
In computer vision, multi-scale features can effectively improve results on tasks such as image classification and target detection; how to construct multi-scale image features appropriately, and how to fuse and exploit them effectively, have long been key concerns of researchers.
The Gaussian kernel is the only kernel that can generate a multi-scale space, so filtering an image with a Gaussian filter effectively builds such a space. For a two-dimensional image I(x, y), the Gaussian-filtered image is:
L(x,y,δ)=G(x,y,δ)*I(x,y)
where G(x, y, δ) is the Gaussian function

G(x, y, δ) = (1 / (2πδ²)) · exp(−((x − x₀)² + (y − y₀)²) / (2δ²))

in which (x₀, y₀) is the coordinate of the center point and δ is the scale parameter that determines the smoothness of the transformed image: the larger δ, the stronger the smoothing.
The Sobel operator is a discrete differential operator, and is commonly used for edge detection in image processing. In the implementation process of the Sobel operator, a 3 x 3 template is used as a convolution kernel to perform convolution operation with each pixel point in the image, and different direction templates are used to obtain edge detection characteristic maps in different directions.
The method for constructing linear multi-scale spatial features of images used in this example is mainly based on gaussian filter and Sobel edge extraction operator to extract edge features respectively, where the Sobel operator uses four directional templates (0 °,45 °,90 °,135 °), as shown in fig. 3.
Referring to fig. 4, the specific implementation of this step is as follows:
2.1) filter the image with a Gaussian filter to obtain the Gaussian blur feature map I_G;
2.2) convert the original image to a gray-scale image and use the Sobel operator with the four directional templates to extract four edge feature maps, representing the horizontal, vertical and two diagonal edge feature maps of the image;
2.3) fuse the four extracted edge feature maps pixel by pixel to obtain the overall edge feature map I_S.
Fusing the edge feature maps of the four directions enhances the edge parts of the original picture, whose pixel values become much greater than 0, while suppressing the non-edge parts, whose pixel values approach 0;
2.4) fuse the overall edge feature map I_S with each channel of the Gaussian blur feature map I_G to obtain the final linear multi-scale spatial feature map I_L:

I_L^(i) = I_G^(i) + r × I_S, i = 1, 2, 3

where r = 0.3 and I_G^(i) denotes the ith channel component of I_G.
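Steps 2.1)-2.4) can be sketched as a minimal NumPy illustration. The 3×3 Gaussian kernel, the grayscale conversion by channel averaging, and the additive per-channel fusion with weight r = 0.3 are assumptions, since the text does not fully specify them.

```python
import numpy as np

SOBEL = {  # directional 3x3 templates: 0 deg, 45 deg, 90 deg, 135 deg
    0:   np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float),
    45:  np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], float),
    90:  np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float),
    135: np.array([[0, -1, -2], [1, 0, -1], [2, 1, 0]], float),
}

def conv2d(img, kernel):
    # Same-size 2-D correlation with edge padding.
    p = kernel.shape[0] // 2
    padded = np.pad(img, p, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def linear_scale_feature(rgb, r=0.3):
    gray = rgb.mean(axis=2)                       # simple grayscale conversion
    edges = [np.abs(conv2d(gray, k)) for k in SOBEL.values()]
    edge_map = np.mean(edges, axis=0)             # overall edge feature map I_S
    gk = np.outer([1.0, 2.0, 1.0], [1.0, 2.0, 1.0]) / 16.0
    blur = np.stack([conv2d(rgb[..., c], gk) for c in range(3)], axis=-1)  # I_G
    return blur + r * edge_map[..., None]         # fuse I_S into each channel
```

Because every Sobel template sums to zero, flat regions yield an edge response near 0 while true edges are amplified, which is the suppression/enhancement behaviour described above.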
Step 3, extract the nonlinear multi-scale spatial features of the image.
This example uses the wavelet transform as the main method of constructing the nonlinear multi-scale spatial features of an image; the decomposition is performed with the two-dimensional single-level wavelet transform function dwt2().
Referring to fig. 5, the specific implementation of this step is as follows:
3.1) converting the common three-channel optical image into a single-channel gray-scale image;
3.2) carrying out wavelet decomposition on the gray level image in 3.1) by using a two-dimensional single-level wavelet transformation function to respectively obtain a low-frequency component subgraph, a horizontal high-frequency component subgraph, a vertical high-frequency component subgraph and a diagonal high-frequency component subgraph of the gray level image, wherein the resolution of each subgraph is only one fourth of that of the original image;
3.3) discard the low-frequency component subgraph from 3.2), keep only the three high-frequency component subgraphs, and expand them to the resolution of the original image by bilinear interpolation;
3.4) splice the three resolution-expanded high-frequency component maps along the channel dimension to obtain the nonlinear multi-scale spatial feature map I_NL.
Step 4, construct the fused feature map.
Weight and sum the original image I_p, the morphological feature map I_M, the linear multi-scale spatial feature map I_L and the nonlinear multi-scale spatial feature map I_NL to obtain the final fused image I:

I = 0.5 × I_p + α × I_M + β × I_L + I_NL

where α and β are two hyperparameters with different values satisfying α + β = 0.5; because the pixel values in I_NL are very small, that term is added pixel by pixel without a weight coefficient.
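The weighted sum of step 4 is direct to express in code. The equal split α = β = 0.25 below is only an illustrative choice, since the text fixes only α + β = 0.5:

```python
import numpy as np

def fuse(i_p, i_m, i_l, i_nl, alpha=0.25, beta=0.25):
    # I = 0.5*I_p + alpha*I_M + beta*I_L + I_NL with alpha + beta = 0.5;
    # I_NL is added without a weight because its pixel values are small.
    assert abs(alpha + beta - 0.5) < 1e-9
    return 0.5 * i_p + alpha * i_m + beta * i_l + i_nl
```

All four inputs must share the original image's resolution, which the earlier steps guarantee.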
Step 5, expand small targets.
Small-target detection has long been difficult in computer vision, and detection precision is generally low across existing algorithm frameworks. This method therefore expands the small-target samples in the data preprocessing stage, implemented as follows:
5.1) for all samples in the data set, compute the area of each ground-truth target box from the labeling information and find the maximum S_max and minimum S_min;
5.2) set a threshold S based on S_max and S_min, and divide all data into a training data set and a test data set at a ratio of 8:2;
5.3) for each picture in the training set, compute the label-box areas S_i ∈ (S_1, S_2, ..., S_n) of all targets, traverse each S_i, and compare it with the set threshold:
if S_i < S holds, copy the rectangular region containing the target, randomly select a new position in the image for pasting, and execute 5.4);
if S_i < S does not hold, do nothing and move on to the next S_i;
5.4) selecting a new position:
5.4.1) randomly select a point (x, y) in the image and compute the new label box [x, y, x + w_i, y + h_i], where w_i and h_i are the width and height of the target's original label box;
5.4.2) judge whether the new position overlaps any existing label box in the image:
if it does not overlap, paste at the new position;
if it overlaps, return to 5.4.1) and count the number of returns; after 100 returns, abandon the pasting operation;
5.5) repeat 5.3) a total of 5 times so that small targets are sufficiently expanded.
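The copy-paste expansion of steps 5.3)-5.4) can be sketched as below. The (x1, y1, x2, y2) box convention and the helper names are illustrative; the retry limit of 100 and the abandon-on-failure behaviour follow the text.

```python
import random
import numpy as np

def boxes_overlap(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return not (ax2 <= bx1 or bx2 <= ax1 or ay2 <= by1 or by2 <= ay1)

def expand_small_target(img, boxes, box, threshold, max_tries=100, rng=random):
    # Copy a small target (area < threshold) to a random non-overlapping
    # position, retrying up to max_tries times, as in steps 5.3)-5.4).
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    if w * h >= threshold:
        return boxes                          # not a small target: no operation
    for _ in range(max_tries):
        nx = rng.randrange(0, img.shape[1] - w)
        ny = rng.randrange(0, img.shape[0] - h)
        new = (nx, ny, nx + w, ny + h)
        if not any(boxes_overlap(new, b) for b in boxes):
            img[ny:ny + h, nx:nx + w] = img[y1:y2, x1:x2]   # paste the patch
            return boxes + [new]
    return boxes                              # abandon pasting after max_tries
```

In practice this runs over every target in every training image, and the whole pass is repeated 5 times per step 5.5).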
Step 6, construct the deep convolutional network for training and detection.
6.1) data preprocessing
Perform the multi-feature fusion of steps 1-4 on all optical remote sensing images to obtain feature-fused images, divide them into a training data set and a test data set at a ratio of 8:2, and apply the small-target expansion of step 5 to all images in the training data set;
6.2) constructing a target detection network
This example adopts an existing single-stage fully convolutional target detection network as the detection framework. It comprises a backbone network ResNet-50, a feature pyramid network FPN, a classification head Class_Head and a detection head Detection_Head, where the FPN contains five feature layers P3, P4, P5, P6 and P7, and target boxes are predicted on each of the five layers;
in this example the top layer P7 of the FPN is removed, and target boxes are predicted only on the four feature layers P3, P4, P5 and P6, whose target-box regression ranges are (0, 64], (64, 128], (128, 256] and (256, ∞) respectively.
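If the four prediction ranges are read as (0, 64], (64, 128], (128, 256] and (256, ∞), an interpretation of the figures in the text, the per-layer assignment rule (FCOS-style; the function name and range encoding are assumptions) can be sketched:

```python
def assign_fpn_level(scale,
                     ranges=((0, 64), (64, 128), (128, 256), (256, float("inf")))):
    # Map a target's regression scale to one of P3..P6 by the per-layer ranges.
    for level, (lo, hi) in enumerate(ranges, start=3):
        if lo < scale <= hi:
            return f"P{level}"
    raise ValueError("scale outside all ranges")
```

With P7 removed, any target larger than 256 pixels falls to P6 instead of a dedicated top level.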
6.3) network training
Feed the training data set with expanded small targets from 6.1) into the single-stage fully convolutional target detection network constructed in 6.2), and train with a gradient descent algorithm until the network converges, obtaining the trained single-stage fully convolutional target detection network.
6.4) network testing and result evaluation
Feed the test data set from 6.1) into the network trained in 6.3) to obtain the detection results for all targets in the test data set.
The effect of the present invention is further explained by combining the simulation experiment as follows:
firstly, simulation experiment conditions:
the hardware platform of the simulation experiment of the invention is as follows: the CPU model is Intel Xeon E5-2630 v4, 20 cores, the main frequency is 2.4GHz, and the memory size is 64 GB; the GPU is NVIDIA GeForce GTX 1080Ti/PCIe/SSE2, and the video memory size is 20 GB.
The software platform of the simulation experiment of the invention is as follows: the operating system is Ubuntu20.04 LTS, the cuda version is 10.1, and the version of Pytrch is 1.5.0. The opencv version is 4.4.0.
The data set used for the experiment was the public remote sensing image data set LEVIR.
Second, simulation experiment and results
Experiment 1: train and test the existing single-stage fully convolutional target detection network on the original data, and compute the mean average precision (mAP) and average recall from the test results.
Experiment 2: preprocess the original data with multi-feature fusion, train and test the same network on the preprocessed data, and compute mAP and average recall.
Experiment 3: preprocess the original data with small-target enhancement, train and test the same network on the preprocessed data, and compute mAP and average recall.
Experiment 4: preprocess the original data with both multi-feature fusion and small-target enhancement, train and test the same network on the preprocessed data, and compute mAP and average recall.
The results of the above experiments are shown in table 1.
TABLE 1 comparison of simulation test results
Experimental setup | mAP | Recall |
---|---|---|
Experiment one | 90.3% | 72.5% |
Experiment two | 90.6% | 72.9% |
Experiment three | 91.1% | 75.8% |
Experiment four | 91.4% | 76.1% |
Comparing experiment two and experiment three each against experiment one shows that preprocessing the data with either the multi-feature fusion mode or the small-target enhancement mode effectively improves the detection performance of the existing single-stage full-convolution target detection network.
Comparing experiment four against experiments two and three shows that the improvement is most pronounced when the small-target enhancement mode and the multi-feature fusion mode are used together for data preprocessing.
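For reference, the average recall reported in Table 1 is conventionally computed by matching each ground-truth box against the detections at an intersection-over-union (IoU) threshold; the 0.5 threshold and the helper names below are assumptions, as the text does not state them. A minimal sketch:

```python
def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def recall(gt_boxes, det_boxes, thr=0.5):
    """Fraction of ground-truth boxes matched by at least one detection."""
    matched = sum(any(iou(g, d) >= thr for d in det_boxes) for g in gt_boxes)
    return matched / float(len(gt_boxes)) if gt_boxes else 0.0
```

mAP additionally averages precision over recall levels and over classes; the recall sketch above captures only the matching rule shared by both metrics.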
Claims (5)
1. A target detection method of an optical remote sensing image based on multi-feature fusion is characterized by comprising the following steps:
(1) respectively extracting mathematical morphology characteristics, linear scale space characteristics and nonlinear scale space characteristics of the optical remote sensing image:
1a) performing opening and closing operations on the original image respectively to obtain 2n initial feature maps, then adding all the initial feature maps pixel by pixel and taking the average to obtain the mathematical morphology feature map of the original image, wherein n represents the number of opening (or closing) operations;
1b) filtering the original image with a Gaussian filter and with a Sobel edge extraction operator respectively to obtain a three-channel Gaussian blur feature map and four single-channel local edge feature maps; summing the four local edge feature maps pixel by pixel and averaging to obtain an overall edge feature map; fusing each channel component of the three-channel Gaussian feature map with the overall edge feature map pixel by pixel to obtain a linear multi-scale spatial feature map;
1c) converting the original optical remote sensing image into a single-channel gray-scale image and performing wavelet decomposition on it with a two-dimensional single-level wavelet transform to obtain four single-channel sub-maps, namely a low-frequency component map, a horizontal high-frequency component map, a vertical high-frequency component map and a diagonal high-frequency component map; discarding the low-frequency component sub-map and concatenating the remaining three high-frequency component sub-maps along the channel dimension to obtain a nonlinear multi-scale spatial feature map;
(2) constructing a fusion feature map:
2a) fusing the mathematical morphology feature map and the linear multi-scale spatial feature map pixel by pixel in proportions alpha and beta to obtain an initial fusion image, wherein alpha and beta satisfy alpha + beta = 0.5;
2b) multiplying the original image by a scale coefficient of 0.5, summing it pixel by pixel with the initial fusion image, and then adding the nonlinear multi-scale spatial feature map pixel by pixel to obtain the final feature-fusion image;
(3) data set partitioning and small target expansion:
3a) for all optical remote sensing images, calculating from the labelling information the maximum and minimum areas over all targets to be detected, denoted Smax and Smin, and setting a threshold value S;
3b) randomly dividing all the optical remote sensing images into a training data set and a test data set in a ratio of 8:2;
3c) for each original image in the training set, traversing all the targets to be detected in the image; if the target area Si is less than the threshold S, selecting a target-free position in the original image and copying the minimum square region containing the target to the selected position to obtain a new training image; otherwise, leaving the original image unchanged; after traversal is completed, a new training data set is obtained;
(4) training and detecting by using a deep learning-based target detection network:
4a) respectively extracting and fusing the characteristics of the test data set and the new training data set according to the operations (1) and (2) to obtain a training data set and a test data set after the characteristics are fused;
4b) training the existing single-stage full convolution target detection network by using a training data set after feature fusion through a gradient descent algorithm until the overall loss of the network is not changed any more, and obtaining a trained target detection network;
4c) and inputting the test data set into a trained target detection network to obtain a target detection result of the optical remote sensing image.
2. The method according to claim 1, wherein the opening and closing operations in 1a) are performed on the original optical remote sensing image with convolution kernels of sizes 3 × 3 and 5 × 5 respectively, the opening operation being erosion followed by dilation and the closing operation being dilation followed by erosion, so as to obtain two opening-operation feature maps and two closing-operation feature maps.
3. The method according to claim 1, wherein the filtering of the original image with the Sobel edge extraction operator in 1b) is performed by convolving the original image with four convolution kernels of size 3 × 3, oriented at 0°, 45°, 90° and 135° respectively, to obtain four local edge feature maps.
4. The method of claim 1, wherein said selecting a target-free position in the original image in 3c) is performed as follows:
3c1) randomly selecting a position (x, y) in the original image and forming the label-frame information [x, y, x+wi, y+hi] of the new position, wherein wi and hi respectively represent the width and height of the new target frame;
3c2) judging whether the new position overlaps an existing label frame in the current image; if not, selecting the position for the subsequent operations, otherwise returning to 3c1);
3c3) ending the position selection when a position is successfully selected or when the random selection in 3c1) has been repeated 100 times.
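The position-selection procedure of claim 4 amounts to rejection sampling with a retry cap: draw a random anchor, form the candidate box [x, y, x+w, y+h], accept it only if it overlaps no existing labelled box, and give up after 100 draws. A sketch, where `select_paste_position` and the [x1, y1, x2, y2] box layout are illustrative assumptions:

```python
import random

def boxes_overlap(a, b):
    """Axis-aligned overlap test for boxes given as [x1, y1, x2, y2]."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def select_paste_position(img_w, img_h, w, h, existing_boxes, max_tries=100):
    """Return a non-overlapping box for a w x h target, or None on failure."""
    for _ in range(max_tries):
        x = random.randint(0, img_w - w)
        y = random.randint(0, img_h - h)
        cand = [x, y, x + w, y + h]
        if not any(boxes_overlap(cand, b) for b in existing_boxes):
            return cand
    return None  # per 3c3): stop after 100 random draws
```

On success, the minimum square region containing the small target is copied to the returned box and the box is appended to the image's label list.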
5. The method of claim 1, wherein the training of the existing single-stage full-convolution target detection network by the gradient descent algorithm in 4b) is implemented as follows:
4b1) deleting the topmost feature layer P7 of the FPN in the single-stage full-convolution target detection network, and reserving the feature layers P3, P4, P5 and P6;
4b2) sending the training data into the network for forward propagation and performing target-frame regression on the P3, P4, P5 and P6 feature layers retained in 4b1) to obtain the target prediction results, wherein the target size ranges predicted by the four feature layers are [0, 64), [64, 128), [128, 256) and [256, ∞) respectively;
4b3) calculating the overall loss between the prediction results obtained in 4b2) and the real labels, then performing back propagation and updating the network parameters;
4b4) repeat 4b2) -4b3) until the network converges.
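The level assignment implied by 4b2) can be expressed as a small lookup over the breakpoints 0, 64, 128, 256, ∞. Measuring target size by the longer side of the box is an assumption here, since the claim does not fix the size measure:

```python
def assign_fpn_level(box_w, box_h):
    """Map a target box to the FPN level (P3, P4, P5 or P6 retained in
    4b1)) responsible for regressing it, by the breakpoints of 4b2)."""
    size = max(box_w, box_h)  # assumed size measure: longer box side
    bounds = [(0, 64, 'P3'), (64, 128, 'P4'),
              (128, 256, 'P5'), (256, float('inf'), 'P6')]
    for lo, hi, level in bounds:
        if lo <= size < hi:
            return level
```

Deleting P7 and narrowing the per-level ranges in this way pushes more capacity toward the smaller targets that dominate remote sensing imagery.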
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110442872.3A CN113177456B (en) | 2021-04-23 | 2021-04-23 | Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113177456A true CN113177456A (en) | 2021-07-27 |
CN113177456B CN113177456B (en) | 2023-04-07 |
Family
ID=76924464
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110442872.3A Active CN113177456B (en) | 2021-04-23 | 2021-04-23 | Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113177456B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610838A (en) * | 2021-08-25 | 2021-11-05 | 华北电力大学(保定) | Bolt defect data set expansion method |
CN114155208A (en) * | 2021-11-15 | 2022-03-08 | 中国科学院深圳先进技术研究院 | Atrial fibrillation assessment method and device based on deep learning |
CN116168302A (en) * | 2023-04-25 | 2023-05-26 | 耕宇牧星(北京)空间科技有限公司 | Remote sensing image rock vein extraction method based on multi-scale residual error fusion network |
CN116229319A (en) * | 2023-03-01 | 2023-06-06 | 广东宜教通教育有限公司 | Multi-scale feature fusion class behavior detection method and system |
CN116453078A (en) * | 2023-03-14 | 2023-07-18 | 电子科技大学长三角研究院(湖州) | Traffic fixation target detection method based on significance priori |
CN116823838A (en) * | 2023-08-31 | 2023-09-29 | 武汉理工大学三亚科教创新园 | Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629378A (en) * | 2012-03-01 | 2012-08-08 | 西安电子科技大学 | Remote sensing image change detection method based on multi-feature fusion |
CN107092871A (en) * | 2017-04-06 | 2017-08-25 | 重庆市地理信息中心 | Remote sensing image building detection method based on multiple dimensioned multiple features fusion |
CN107292339A (en) * | 2017-06-16 | 2017-10-24 | 重庆大学 | The unmanned plane low altitude remote sensing image high score Geomorphological Classification method of feature based fusion |
CN108154192A (en) * | 2018-01-12 | 2018-06-12 | 西安电子科技大学 | High Resolution SAR terrain classification method based on multiple dimensioned convolution and Fusion Features |
CN108537238A (en) * | 2018-04-13 | 2018-09-14 | 崔植源 | A kind of classification of remote-sensing images and search method |
CN109214439A (en) * | 2018-08-22 | 2019-01-15 | 电子科技大学 | A kind of infrared image icing River detection method based on multi-feature fusion |
CN109271928A (en) * | 2018-09-14 | 2019-01-25 | 武汉大学 | A kind of road network automatic update method based on the fusion of vector road network with the verifying of high score remote sensing image |
CN109325395A (en) * | 2018-04-28 | 2019-02-12 | 二十世纪空间技术应用股份有限公司 | The recognition methods of image, convolutional neural networks model training method and device |
CN112132006A (en) * | 2020-09-21 | 2020-12-25 | 西南交通大学 | Intelligent forest land and building extraction method for cultivated land protection |
CN112329677A (en) * | 2020-11-12 | 2021-02-05 | 北京环境特性研究所 | Remote sensing image river target detection method and device based on feature fusion |
CN112395958A (en) * | 2020-10-29 | 2021-02-23 | 中国地质大学(武汉) | Remote sensing image small target detection method based on four-scale depth and shallow layer feature fusion |
CN112465880A (en) * | 2020-11-26 | 2021-03-09 | 西安电子科技大学 | Target detection method based on multi-source heterogeneous data cognitive fusion |
CN112580439A (en) * | 2020-12-01 | 2021-03-30 | 中国船舶重工集团公司第七0九研究所 | Method and system for detecting large-format remote sensing image ship target under small sample condition |
Non-Patent Citations (5)
Title |
---|
GUANGHUI WANG 等: "CHANGE DETECTION OF HIGH-RESOLUTION REMOTE SENSING IMAGES BASED ON ADAPTIVE FUSION OF MULTIPLE FEATURES", 《THE INTERNATIONAL ARCHIVES OF THE PHOTOGRAMMETRY, REMOTE SENSING AND SPATIAL INFORMATION SCIENCES》 * |
JIAHUAN ZHANG 等: "Multi-Feature Fusion for Weak Target Detection on Sea-Surface Based on FAR Controllable Deep Forest Model", 《REMOTE SENSING》 * |
VINCENT HAVYARIMANA 等: "A Fusion Framework Based on Sparse Gaussian–Wigner Prediction for Vehicle Localization Using GDOP of GPS Satellites", 《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》 * |
姚群力 等: "基于多尺度融合特征卷积神经网络的遥感图像飞机目标检测", 《测绘学报》 * |
张庆春 等: "基于多特征融合和软投票的遥感图像河流检测", 《光学学报》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610838A (en) * | 2021-08-25 | 2021-11-05 | 华北电力大学(保定) | Bolt defect data set expansion method |
CN114155208A (en) * | 2021-11-15 | 2022-03-08 | 中国科学院深圳先进技术研究院 | Atrial fibrillation assessment method and device based on deep learning |
CN116229319A (en) * | 2023-03-01 | 2023-06-06 | 广东宜教通教育有限公司 | Multi-scale feature fusion class behavior detection method and system |
CN116453078A (en) * | 2023-03-14 | 2023-07-18 | 电子科技大学长三角研究院(湖州) | Traffic fixation target detection method based on significance priori |
CN116168302A (en) * | 2023-04-25 | 2023-05-26 | 耕宇牧星(北京)空间科技有限公司 | Remote sensing image rock vein extraction method based on multi-scale residual error fusion network |
CN116168302B (en) * | 2023-04-25 | 2023-07-14 | 耕宇牧星(北京)空间科技有限公司 | Remote sensing image rock vein extraction method based on multi-scale residual error fusion network |
CN116823838A (en) * | 2023-08-31 | 2023-09-29 | 武汉理工大学三亚科教创新园 | Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling |
CN116823838B (en) * | 2023-08-31 | 2023-11-14 | 武汉理工大学三亚科教创新园 | Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling |
Also Published As
Publication number | Publication date |
---|---|
CN113177456B (en) | 2023-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113177456B (en) | Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion | |
CN108961235B (en) | Defective insulator identification method based on YOLOv3 network and particle filter algorithm | |
CN108596055B (en) | Airport target detection method of high-resolution remote sensing image under complex background | |
CN111753828B (en) | Natural scene horizontal character detection method based on deep convolutional neural network | |
WO2016155371A1 (en) | Method and device for recognizing traffic signs | |
CN109685152A (en) | A kind of image object detection method based on DC-SPP-YOLO | |
CN110675370A (en) | Welding simulator virtual weld defect detection method based on deep learning | |
CN108564085B (en) | Method for automatically reading of pointer type instrument | |
CN107506761A (en) | Brain image dividing method and system based on notable inquiry learning convolutional neural networks | |
CN111950488B (en) | Improved Faster-RCNN remote sensing image target detection method | |
CN110659601B (en) | Depth full convolution network remote sensing image dense vehicle detection method based on central point | |
CN112396619A (en) | Small particle segmentation method based on semantic segmentation and internally complex composition | |
CN109345559B (en) | Moving target tracking method based on sample expansion and depth classification network | |
CN116758421A (en) | Remote sensing image directed target detection method based on weak supervised learning | |
CN111612747A (en) | Method and system for rapidly detecting surface cracks of product | |
CN116740528A (en) | Shadow feature-based side-scan sonar image target detection method and system | |
CN113313678A (en) | Automatic sperm morphology analysis method based on multi-scale feature fusion | |
CN113516771A (en) | Building change feature extraction method based on live-action three-dimensional model | |
CN114445356A (en) | Multi-resolution-based full-field pathological section image tumor rapid positioning method | |
CN116012310A (en) | Cross-sea bridge pier surface crack detection method based on linear residual error attention | |
CN107292268A (en) | The SAR image semantic segmentation method of quick ridge ripple deconvolution Structure learning model | |
CN112465821A (en) | Multi-scale pest image detection method based on boundary key point perception | |
Gooda et al. | Automatic detection of road cracks using EfficientNet with residual U-net-based segmentation and YOLOv5-based detection | |
CN112329677A (en) | Remote sensing image river target detection method and device based on feature fusion | |
CN116597275A (en) | High-speed moving target recognition method based on data enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||