CN113177456A - Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion - Google Patents

Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion

Info

Publication number
CN113177456A
Authority
CN
China
Prior art keywords
image
target
feature
data set
remote sensing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110442872.3A
Other languages
Chinese (zh)
Other versions
CN113177456B (en)
Inventor
白静
温征
唐晓川
董泽委
郭亚泽
裴晓龙
闫逊
孙放
张秀华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110442872.3A priority Critical patent/CN113177456B/en
Publication of CN113177456A publication Critical patent/CN113177456A/en
Application granted granted Critical
Publication of CN113177456B publication Critical patent/CN113177456B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/30 Erosion or dilatation, e.g. thinning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20048 Transform domain processing
    • G06T2207/20064 Wavelet transform [DWT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an optical remote sensing image target detection method based on multi-feature fusion, which mainly addresses the insufficient extraction of target features from optical remote sensing images in the prior art. The implementation scheme is as follows: 1) extract the mathematical morphology features, linear scale-space features, and nonlinear scale-space features of the original data, and fuse the three to obtain a fused feature map; 2) divide the fused feature maps into a training data set and a test data set, and perform small-target expansion on the training data set; 3) construct a target detection network and train it on the expanded training data set with a gradient descent algorithm; 4) test the test data set with the trained network to obtain detection results. The method enhances the contour and edge features of targets, helps improve target detection accuracy, and can be used for resource exploration, natural disaster assessment, and target recognition.

Description

Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion
Technical Field
The invention belongs to the technical field of optical remote sensing images, and particularly relates to a target detection method with multi-feature fusion, which can be used for resource exploration, natural disaster assessment and target identification.
Background
Remote sensing images are characterized by complex backgrounds, imbalanced target categories, large variations in target scale, and unusual shooting angles, all of which make target detection in remote sensing images particularly difficult and challenging.
Traditional target detection methods rely mainly on manually designed feature extraction operators, including V-J (Viola-Jones) detection, HOG detection, and the DPM algorithm. Their defining limitation is that the detector can only fit a single type of image feature with a fixed feature extraction algorithm, so they work only when features are obvious and the background is simple, and cannot meet the requirements of remote sensing image target detection tasks.
Target detection methods based on deep learning use convolutional networks to extract image features and can extract a rich variety of features from the same target, so their detection accuracy is far higher than that of traditional hand-crafted methods; they have become the industry mainstream and are widely used in remote sensing image target detection tasks.
Patent CN112580439A proposes a remote sensing image ship target detection method using the YOLOv5 network structure and the attention mechanism from SENet.
Patent CN110378297A designs a multi-scale feature extraction network that extracts multi-scale image features and predicts candidate regions on the feature map corresponding to each image scale, effectively improving the accuracy of remote sensing image target detection.
Patent CN112070729A adopts an anchor-free target detection network: the acquired remote sensing image data set is first linearly enhanced with a balance-coefficient mixing scheme, and features are then extracted and fused with the deep residual network ResNet-50 and the feature pyramid network FPN. That invention makes full use of contextual multi-feature fusion, strengthens the network's feature extraction and category prediction capabilities, and improves detection precision.
However, the above deep convolutional neural network methods all apply convolution directly to the original input image, or preprocess the data with only simple linear enhancement. Such schemes do not ease the difficulty of feature extraction for the deep convolutional network; in particular, for target detection against the complex backgrounds of remote sensing images, they cannot accurately extract the feature information of the target region, which limits detection performance.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a remote sensing target detection method based on single-stage full convolution network and multi-feature fusion so as to accurately extract the feature information of a target part and improve the detection performance.
In order to achieve the purpose, the technical scheme of the invention comprises the following steps:
(1) respectively extracting morphological characteristics, linear scale space characteristics and nonlinear scale space characteristics of the optical remote sensing image:
1a) perform opening and closing operations on the original image to obtain 2n initial feature maps, then add all initial feature maps pixel by pixel and take the average to obtain the morphological feature map of the original image, where n is the number of opening (or closing) operations performed;
1b) filter the original image with a Gaussian filter and with the Sobel edge extraction operator to obtain a three-channel Gaussian blur feature map and four single-channel local edge feature maps; sum the four local edge feature maps pixel by pixel and average them to obtain an overall edge feature map; fuse each channel component of the three-channel Gaussian blur feature map with the overall edge feature map pixel by pixel to obtain the linear multi-scale spatial feature map;
1c) converting an original optical remote sensing image into a single-channel gray-scale image, and performing wavelet decomposition on the single-channel gray-scale image by using a two-dimensional single-level wavelet transformation function to obtain four single-channel subgraphs, namely a low-frequency component diagram, a horizontal high-frequency component diagram, a vertical high-frequency component diagram and a diagonal high-frequency component diagram; discarding the low-frequency component subgraph, and performing channel splicing on the other three high-frequency component subgraphs to obtain a nonlinear multi-scale spatial feature graph;
(2) constructing a fusion feature map:
2a) fuse the morphological feature map and the linear multi-scale space feature map pixel by pixel in the proportions α and β to obtain an initial fusion image, where α and β satisfy α + β = 0.5;
2b) multiply the original image by a scale coefficient of 0.5, sum it pixel by pixel with the initial fusion image, and then add the nonlinear multi-scale spatial feature map pixel by pixel to obtain the final feature fusion image;
(3) data set partitioning and small target expansion:
3a) for all optical remote sensing images, compute the maximum and minimum areas over all targets to be detected from the annotation information, denoted S_max and S_min, and set a threshold S determined from S_max and S_min (the threshold formula is rendered only as an image in the source);
3b) randomly divide all optical remote sensing images into a training data set and a test data set at a ratio of 8:2;
3c) for each original image in the training set, traverse all targets to be detected in the image; if a target's area S_i is less than the threshold S, select a target-free position in the original image and copy the minimum square region containing the target to the selected position to obtain a new training image; otherwise leave the original image unchanged; after the traversal is complete, a new training data set is obtained;
(4) training and detecting by using a deep learning-based target detection network:
4a) perform the feature extraction and fusion of operations (1) and (2) on the new training data set and the test data set, respectively, to obtain a feature-fused training data set and test data set;
4b) train the existing single-stage fully convolutional target detection network on the feature-fused training data set with a gradient descent algorithm until the overall network loss no longer changes, obtaining a trained target detection network;
4c) input the test data set into the trained target detection network to obtain the target detection results of the optical remote sensing images.
Compared with the prior art, the invention has the following advantages:
First, before the convolution operations of the neural network, multi-feature extraction and fusion are applied to the original image, enhancing the contour and edge features of targets. When the deep convolutional network then extracts target features, it is more sensitive to the target region and the extracted features are more accurate, which helps improve target detection accuracy.
Second, by fusing image morphological features, linear multi-scale features, and nonlinear multi-scale features, the invention enhances target saliency compared with existing simple linear data enhancement. Especially for small targets and complex background regions, target features are effectively strengthened and the background is suppressed, improving the accuracy with which the deep convolutional neural network extracts target features and thus the detection performance.
Drawings
FIG. 1 is a general flow chart of an implementation of the present invention;
FIG. 2 is a sub-flow diagram of the present invention for constructing morphological features of an image;
FIG. 3 shows the Sobel operator directional templates used by the present invention;
FIG. 4 is a sub-flow diagram of the construction of a linear scale spatial feature map according to the present invention;
FIG. 5 is a sub-flow diagram of the construction of a non-linear scale space feature map according to the present invention.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation steps of this example are as follows:
the remote sensing image contains abundant spatial information and scale effect, multi-scale is a characteristic naturally existing in the remote sensing image ground object observation, different levels of ground object features and spatial relation rules can be obtained by analyzing from different scales, the multi-scale spatial information is very key for the accurate identification of the ground object, and a deep learning method generally extracts and classifies the features of the remote sensing image from a set certain scale level and lacks the comprehensive consideration of the multi-scale spatial information. Therefore, more and more scholars are beginning to research how to combine the multi-scale spatial features of the remote sensing images to improve the spatial comprehensive feature recognition capability. In addition, the unique role of mathematical morphology in quantitative description and analysis of image geometric features makes the remote sensing image processing research quite intensive, and the mathematical morphology is a classical nonlinear spatial information processing technology and can extract meaningful shape components from complex information of an optical remote sensing image and retain spatial geometric structural characteristics in the image. Therefore, the characteristics of remote sensing ground object classification can be better met. A large number of researches show that the mathematical morphology can accurately describe the contour and the spatial relationship of the ground feature, and the abundant spatial information can be effectively extracted from the remote sensing image based on the calculation and processing of the mathematical morphology method. Therefore, the invention designs a multi-feature extraction fusion technology based on the mathematical morphological features and the multi-scale features of the remote sensing images, and the specific implementation steps are as follows:
step 1, extracting mathematical morphology characteristics of an image.
Mathematical morphology comprises four basic operations: dilation, erosion, opening, and closing. Dilation convolves an operation kernel point by point with the original image and takes the maximum pixel value in the region covered by the kernel as the new pixel value at that position. Erosion likewise convolves the kernel point by point but takes the minimum pixel value in the covered region. Opening is erosion followed by dilation; closing is dilation followed by erosion. To effectively remove image noise and smooth image edges, the mathematical morphological features are extracted here using opening and closing operations.
Referring to fig. 2, the specific implementation of this step is as follows:
1.1) perform opening and closing operations on the original image with kernels of sizes 3 × 3 and 5 × 5, obtaining two opening feature maps, open_3 and open_5, and two closing feature maps, close_3 and close_5;
1.2) sum the feature maps open_3, open_5, close_3, and close_5 pixel by pixel and average them to obtain a three-channel morphological feature map I_M; this feature map has the same resolution and dimensions as the original image.
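As an illustration, the following is a minimal Python/OpenCV sketch of this step; the function name and the use of uniform square structuring elements are assumptions for demonstration:

    import cv2
    import numpy as np

    def morphological_feature_map(image: np.ndarray) -> np.ndarray:
        """I_M: average of opening/closing results with 3x3 and 5x5 kernels."""
        maps = []
        for k in (3, 5):
            kernel = np.ones((k, k), np.uint8)
            maps.append(cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel))   # open_k
            maps.append(cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel))  # close_k
        # pixel-by-pixel sum and average of the 2n feature maps (n = 2 here)
        return np.mean(np.stack(maps, axis=0), axis=0).astype(image.dtype)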
Step 2, extracting the linear multi-scale spatial features of the image.
In the field of computer vision, multi-scale features can effectively improve the results of tasks such as image classification and target detection; constructing image multi-scale features in a suitable way and fusing and exploiting them effectively have long been key concerns of researchers.
The Gaussian kernel is the only kernel that can generate a multi-scale space, and filtering an image with a Gaussian filter effectively builds a multi-scale space. For a two-dimensional image I(x, y), the Gaussian-filtered image is:
L(x,y,δ)=G(x,y,δ)*I(x,y)
where G(x, y, δ) is the Gaussian function

G(x, y, δ) = (1 / (2πδ²)) · exp(−((x − x₀)² + (y − y₀)²) / (2δ²))

in which (x₀, y₀) is the center coordinate and δ is the scale parameter that determines the smoothness of the transformed image: the larger δ, the stronger the smoothing.
The Sobel operator is a discrete differential operator commonly used for edge detection in image processing. It convolves a 3 × 3 template, used as a convolution kernel, with each pixel of the image; templates with different orientations yield edge detection feature maps in different directions.
The linear multi-scale spatial feature construction used in this example extracts features with a Gaussian filter and with the Sobel edge extraction operator, the latter using four directional templates (0°, 45°, 90°, 135°), as shown in fig. 3.
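Fig. 3 is not reproduced here; the kernels below are a commonly used set of directional Sobel templates and are an assumption about the exact templates in fig. 3:

    import numpy as np

    # Commonly used directional Sobel templates (assumed forms of fig. 3).
    SOBEL_0   = np.array([[-1,  0,  1], [-2, 0, 2], [-1,  0,  1]], dtype=np.float32)
    SOBEL_45  = np.array([[-2, -1,  0], [-1, 0, 1], [ 0,  1,  2]], dtype=np.float32)
    SOBEL_90  = np.array([[-1, -2, -1], [ 0, 0, 0], [ 1,  2,  1]], dtype=np.float32)
    SOBEL_135 = np.array([[ 0,  1,  2], [-1, 0, 1], [-2, -1,  0]], dtype=np.float32)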
Referring to fig. 4, the specific implementation of this step is as follows:
2.1) filter the image with a Gaussian filter to obtain the Gaussian blur feature map I_G;
2.2) convert the original image into a gray-scale image and use the Sobel operator to extract four edge feature maps in the four directions, I_S0, I_S45, I_S90, and I_S135, which denote the horizontal edge feature map, the vertical edge feature map, and the two diagonal edge feature maps of the image;
2.3) fuse the four extracted edge feature maps pixel by pixel to obtain the overall edge feature map I_S:

I_S = (I_S0 + I_S45 + I_S90 + I_S135) / 4
Fusing the edge feature maps of the four directions enhances the edge regions of the original picture, whose pixel values become much greater than 0, while suppressing non-edge regions, whose pixel values stay close to 0;
2.4) fuse the overall edge feature map I_S with the Gaussian blur feature map I_G to obtain the final linear multi-scale space feature map I_L, combining I_S with each channel component I_Gi of I_G using the weighting factor r = 0.3 (the exact per-channel fusion formula is rendered only as an image in the source).
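A sketch of this step, assuming the per-channel fusion I_L[i] = I_G[i] + r · I_S with r = 0.3; since the formula above survives only as an image, this form, like the blur kernel size and sigma, is an assumption:

    import cv2
    import numpy as np

    SOBEL_KERNELS = (SOBEL_0, SOBEL_45, SOBEL_90, SOBEL_135)  # from the block above

    def linear_multiscale_feature_map(image: np.ndarray, r: float = 0.3) -> np.ndarray:
        blur = cv2.GaussianBlur(image, (5, 5), 1.5)                # I_G, three channels
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
        edges = [np.abs(cv2.filter2D(gray, -1, k)) for k in SOBEL_KERNELS]
        edge_map = np.mean(edges, axis=0)                          # I_S, overall edge map
        channels = [blur[..., c].astype(np.float32) + r * edge_map for c in range(3)]
        return np.stack(channels, axis=-1)                         # I_L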
Step 3, extracting the nonlinear multi-scale spatial features of the image.
This example uses the wavelet transform as the primary method of constructing the image's nonlinear multi-scale spatial features, performing the decomposition with the two-dimensional single-level wavelet transform function dwt2().
Referring to fig. 5, the specific implementation of this step is as follows:
3.1) converting the common three-channel optical image into a single-channel gray-scale image;
3.2) carrying out wavelet decomposition on the gray level image in 3.1) by using a two-dimensional single-level wavelet transformation function to respectively obtain a low-frequency component subgraph, a horizontal high-frequency component subgraph, a vertical high-frequency component subgraph and a diagonal high-frequency component subgraph of the gray level image, wherein the resolution of each subgraph is only one fourth of that of the original image;
3.3) discard the low-frequency component subgraph from 3.2), keep only the three high-frequency component subgraphs, and expand them to the resolution of the original image by bilinear interpolation;
3.4) concatenate the three resolution-expanded high-frequency component maps along the channel dimension to obtain the nonlinear multi-scale spatial feature map I_NL.
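A sketch of this step with PyWavelets; the source names only dwt2(), so the 'haar' basis below is an assumption:

    import cv2
    import numpy as np
    import pywt

    def nonlinear_multiscale_feature_map(image: np.ndarray) -> np.ndarray:
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
        _cA, (cH, cV, cD) = pywt.dwt2(gray, 'haar')   # discard the low-frequency cA
        h, w = gray.shape
        ups = [cv2.resize(c, (w, h), interpolation=cv2.INTER_LINEAR)  # bilinear expand
               for c in (cH, cV, cD)]
        return np.stack(ups, axis=-1)                 # I_NL, three channels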
Step 4, constructing the fused feature map.
Weighted summation of the original image I_p, the morphological feature map I_M, the linear multi-scale space feature map I_L, and the nonlinear multi-scale feature map I_NL yields the final fused image I:

I = 0.5 × I_p + α × I_M + β × I_L + I_NL

where α and β are two hyperparameters with different values satisfying α + β = 0.5; because the pixel values in I_NL are very small, this term carries no weight coefficient and is added pixel by pixel directly.
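Putting the pieces together; α = 0.3, β = 0.2 is only an illustrative split of α + β = 0.5, not a value given in the source:

    def fuse_features(I_p, I_M, I_L, I_NL, alpha=0.3, beta=0.2):
        """Final fused image I = 0.5*I_p + alpha*I_M + beta*I_L + I_NL."""
        assert abs(alpha + beta - 0.5) < 1e-6
        return 0.5 * I_p.astype("float32") + alpha * I_M + beta * I_L + I_NL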
Step 5, expanding small targets.
Small-target detection has long been a difficulty in computer vision, and detection precision for small targets is generally low across existing algorithm frameworks. The invention therefore expands the small-target samples in the data preprocessing stage (a code sketch follows the steps below), implemented as follows:
5.1) for all samples in the data set, compute the area of each ground-truth target box from the annotation information and find the maximum S_max and minimum S_min;
5.2) set a threshold S determined from S_max and S_min (the formula is rendered only as an image in the source), and divide all data into a training data set and a test data set at a ratio of 8:2;
5.3) compute the areas S_i ∈ {S_1, S_2, ..., S_n} of the label boxes of all targets in each picture in the training set, traverse each S_i, and compare it with the set threshold:
if S_i < S holds, copy the rectangular region where the target is located, randomly select a new position in the image for pasting, and execute 5.4);
if S_i < S does not hold, perform no operation and move on to the next S_i;
5.4) select a new position:
5.4.1) randomly select a point (x, y) in the image and compute the label box of the new position as [x, y, x + w_i, y + h_i], where w_i and h_i are the width and height of the pasted target's label box;
5.4.2) judge whether the new position overlaps an existing label box in the image:
if it does not overlap, paste at the new position;
if it overlaps, return to 5.4.1) and count the number of returns; if 100 returns are reached, abandon this pasting operation;
5.5) repeat 5.3) a total of 5 times so that the small targets are fully expanded.
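A minimal sketch of this copy-paste expansion, assuming axis-aligned boxes stored as (x, y, w, h); the helper names and the simple rectangle-overlap test are illustrative:

    import random
    import numpy as np

    def boxes_overlap(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return not (ax + aw <= bx or bx + bw <= ax or
                    ay + ah <= by or by + bh <= ay)

    def expand_small_targets(image, boxes, threshold, max_tries=100, repeats=5):
        h, w = image.shape[:2]
        for _ in range(repeats):                      # 5.5): repeat 5.3) five times
            for (x, y, bw, bh) in list(boxes):
                if bw * bh >= threshold:              # only small targets are expanded
                    continue
                patch = image[y:y + bh, x:x + bw].copy()
                for _ in range(max_tries):            # 5.4.2): give up after 100 tries
                    nx = random.randint(0, w - bw)
                    ny = random.randint(0, h - bh)
                    new_box = (nx, ny, bw, bh)
                    if not any(boxes_overlap(new_box, b) for b in boxes):
                        image[ny:ny + bh, nx:nx + bw] = patch
                        boxes.append(new_box)
                        break
        return image, boxes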
Step 6, constructing the deep convolutional network for training and detection.
6.1) data preprocessing
Perform the multi-feature fusion of steps 1-4 on all optical remote sensing images to obtain feature-fusion images, divide all feature-fusion images into a training data set and a test data set at a ratio of 8:2, and perform small-target expansion on all images in the training data set according to step 5;
6.2) constructing a target detection network
This example adopts an existing single-stage fully convolutional target detection network as the detection framework. The network comprises a ResNet-50 backbone, a feature pyramid network (FPN), a classification head (Class_Head), and a detection head (Detection_Head), where the FPN contains five feature layers P3, P4, P5, P6, and P7, on each of which target boxes are predicted;
in this example, the top-most layer P7 of the FPN is removed, and target box prediction is performed only on the four feature layers P3, P4, P5, and P6; the target box prediction ranges of these four layers are (0, 64], (64, 128], (128, 256], and (256, ∞), respectively.
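The range assignment can be expressed as below; treating the upper bound as inclusive follows common FCOS-style conventions and is an assumption:

    LEVEL_RANGES = {           # feature level -> (lower, upper) size range in pixels
        "P3": (0, 64),
        "P4": (64, 128),
        "P5": (128, 256),
        "P6": (256, float("inf")),
    }

    def assign_level(box_size: float) -> str:
        """Return the FPN level responsible for a target of the given size."""
        for level, (lo, hi) in LEVEL_RANGES.items():
            if lo < box_size <= hi:
                return level
        raise ValueError("box size out of range")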
6.3) network training
Feed the training data set after the small-target expansion of 6.1) into the single-stage fully convolutional target detection network constructed in 6.2), and train it with a gradient descent algorithm until the network converges, obtaining the trained single-stage fully convolutional target detection network.
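A generic PyTorch training-loop sketch for this step; `detector` and `train_loader` are placeholders, and returning a scalar loss from the forward pass is an assumption about the detector's interface:

    import torch

    def train(detector, train_loader, epochs=24, lr=0.01):
        optimizer = torch.optim.SGD(detector.parameters(), lr=lr,
                                    momentum=0.9, weight_decay=1e-4)
        detector.train()
        for _ in range(epochs):
            for images, targets in train_loader:
                loss = detector(images, targets)   # overall detection loss
                optimizer.zero_grad()
                loss.backward()                    # gradient descent update
                optimizer.step()
        return detector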
6.4) network testing and result evaluation
Feed the test data set of 6.1) into the network trained in 6.3) for testing to obtain the network's detection results on all targets in the test data set.
The effect of the invention is further illustrated through the following simulation experiments.
First, simulation experiment conditions:
the hardware platform of the simulation experiment of the invention is as follows: the CPU model is Intel Xeon E5-2630 v4, 20 cores, the main frequency is 2.4GHz, and the memory size is 64 GB; the GPU is NVIDIA GeForce GTX 1080Ti/PCIe/SSE2, and the video memory size is 20 GB.
The software platform of the simulation experiment is: Ubuntu 20.04 LTS operating system, CUDA 10.1, PyTorch 1.5.0, and OpenCV 4.4.0.
The data set used for the experiment was the public remote sensing image data set LEVIR.
Second, simulation experiment and results
Experiment 1: train and test the existing single-stage fully convolutional target detection network on the original data, and compute the mean average precision (mAP) and average recall from the test results.
Experiment 2: preprocess the original data with multi-feature fusion, train and test the same network on the preprocessed data, and compute mAP and average recall.
Experiment 3: preprocess the original data with small-target enhancement, train and test the same network on the preprocessed data, and compute mAP and average recall.
Experiment 4: preprocess the original data with both multi-feature fusion and small-target enhancement, train and test the same network on the preprocessed data, and compute mAP and average recall.
The results of the above experiments are shown in table 1.
TABLE 1. Comparison of simulation results

Experimental setup    mAP      Recall
Experiment 1          90.3%    72.5%
Experiment 2          90.6%    72.9%
Experiment 3          91.1%    75.8%
Experiment 4          91.4%    76.1%
Comparing experiments 2 and 3 against experiment 1 shows that preprocessing the data with multi-feature fusion or with small-target enhancement each effectively improves the detection performance of the existing single-stage fully convolutional target detection network.
Comparing experiment 4 against experiments 2 and 3 shows that preprocessing with small-target enhancement and multi-feature fusion together yields the most pronounced improvement in the network's performance.

Claims (5)

1. A target detection method of an optical remote sensing image based on multi-feature fusion is characterized by comprising the following steps:
(1) respectively extracting mathematical morphology characteristics, linear scale space characteristics and nonlinear scale space characteristics of the optical remote sensing image:
1a) perform opening and closing operations on the original image to obtain 2n initial feature maps, then add all initial feature maps pixel by pixel and take the average to obtain the mathematical morphology feature map of the original image, where n is the number of opening (or closing) operations performed;
1b) filter the original image with a Gaussian filter and with the Sobel edge extraction operator to obtain a three-channel Gaussian blur feature map and four single-channel local edge feature maps; sum the four local edge feature maps pixel by pixel and average them to obtain an overall edge feature map; fuse each channel component of the three-channel Gaussian blur feature map with the overall edge feature map pixel by pixel to obtain the linear multi-scale spatial feature map;
1c) converting an original optical remote sensing image into a single-channel gray-scale image, and performing wavelet decomposition on the single-channel gray-scale image by using a two-dimensional single-level wavelet transformation function to obtain four single-channel subgraphs, namely a low-frequency component diagram, a horizontal high-frequency component diagram, a vertical high-frequency component diagram and a diagonal high-frequency component diagram; discarding the low-frequency component subgraph, and performing channel splicing on the other three high-frequency component subgraphs to obtain a nonlinear multi-scale spatial feature graph;
(2) constructing a fusion feature map:
2a) fuse the mathematical morphology feature map and the linear multi-scale space feature map pixel by pixel in the proportions α and β to obtain an initial fusion image, where α and β satisfy α + β = 0.5;
2b) multiply the original image by a scale coefficient of 0.5, sum it pixel by pixel with the initial fusion image, and then add the nonlinear multi-scale spatial feature map pixel by pixel to obtain the final feature fusion image;
(3) data set partitioning and small target expansion:
3a) for all optical remote sensing images, compute the maximum and minimum areas over all targets to be detected from the annotation information, denoted S_max and S_min, and set a threshold S determined from S_max and S_min (the threshold formula is rendered only as an image in the source);
3b) randomly divide all optical remote sensing images into a training data set and a test data set at a ratio of 8:2;
3c) for each original image in the training set, traverse all targets to be detected in the image; if a target's area S_i is less than the threshold S, select a target-free position in the original image and copy the minimum square region containing the target to the selected position to obtain a new training image; otherwise leave the original image unchanged; after the traversal is complete, a new training data set is obtained;
(4) training and detecting by using a deep learning-based target detection network:
4a) perform the feature extraction and fusion of operations (1) and (2) on the new training data set and the test data set, respectively, to obtain a feature-fused training data set and test data set;
4b) train the existing single-stage fully convolutional target detection network on the feature-fused training data set with a gradient descent algorithm until the overall network loss no longer changes, obtaining a trained target detection network;
4c) input the test data set into the trained target detection network to obtain the target detection results of the optical remote sensing images.
2. The method according to claim 1, wherein the opening operation and the closing operation in 1a) are performed by applying erosion-then-dilation (opening) and dilation-then-erosion (closing) to the original optical remote sensing image with convolution kernels of sizes 3 × 3 and 5 × 5, respectively, to obtain two opening feature maps and two closing feature maps.
3. The method according to claim 1, wherein the filtering of the original image by using the Sobel edge extraction operator in 1b) is performed by convolving four convolution kernels with a size of 3 × 3 with the original image respectively to obtain four local edge feature maps, wherein directions of the four convolution kernels are respectively: 0 °,45 °,90 °,135 °.
4. The method of claim 1, wherein selecting a target-free position in the original image in 3c) is performed as follows:
3c1) randomly select a position (x, y) in the original image and compute the label box information of the new position as [x, y, x + w_i, y + h_i], where w_i and h_i denote the width and height of the new target box;
3c2) judging whether the new position is overlapped with an existing label frame in the current image, if not, selecting the position for subsequent operation, otherwise, returning to 3c 1);
3c3) when the selection is successful or the random selection of the position in 3c1) is repeated for 100 times, the position selection is finished.
5. The method of claim 1, wherein the training of the existing single-stage full-convolution target detection network by the gradient descent algorithm in 4b) is implemented as follows:
4b1) deleting the topmost feature layer P7 of the FPN in the single-stage full-convolution target detection network, and reserving the feature layers P3, P4, P5 and P6;
4b2) feed the training data into the network for forward propagation and perform target box regression on the feature layers P3, P4, P5, and P6 retained in 4b1) to obtain target prediction results, where the size ranges of the targets predicted on the four feature layers are (0, 64], (64, 128], (128, 256], and (256, ∞), respectively;
4b3) compute the overall loss between the prediction results of 4b2) and the ground-truth labels, then back-propagate and update the network parameters;
4b4) repeat 4b2) -4b3) until the network converges.
CN202110442872.3A 2021-04-23 2021-04-23 Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion Active CN113177456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442872.3A CN113177456B (en) 2021-04-23 2021-04-23 Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110442872.3A CN113177456B (en) 2021-04-23 2021-04-23 Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion

Publications (2)

Publication Number Publication Date
CN113177456A true CN113177456A (en) 2021-07-27
CN113177456B CN113177456B (en) 2023-04-07

Family

ID=76924464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442872.3A Active CN113177456B (en) 2021-04-23 2021-04-23 Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion

Country Status (1)

Country Link
CN (1) CN113177456B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610838A (en) * 2021-08-25 2021-11-05 华北电力大学(保定) Bolt defect data set expansion method
CN114155208A (en) * 2021-11-15 2022-03-08 中国科学院深圳先进技术研究院 Atrial fibrillation assessment method and device based on deep learning
CN116168302A (en) * 2023-04-25 2023-05-26 耕宇牧星(北京)空间科技有限公司 Remote sensing image rock vein extraction method based on multi-scale residual error fusion network
CN116229319A (en) * 2023-03-01 2023-06-06 广东宜教通教育有限公司 Multi-scale feature fusion class behavior detection method and system
CN116453078A (en) * 2023-03-14 2023-07-18 电子科技大学长三角研究院(湖州) Traffic fixation target detection method based on significance priori
CN116823838A (en) * 2023-08-31 2023-09-29 武汉理工大学三亚科教创新园 Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629378A (en) * 2012-03-01 2012-08-08 西安电子科技大学 Remote sensing image change detection method based on multi-feature fusion
CN107092871A (en) * 2017-04-06 2017-08-25 重庆市地理信息中心 Remote sensing image building detection method based on multiple dimensioned multiple features fusion
CN107292339A (en) * 2017-06-16 2017-10-24 重庆大学 The unmanned plane low altitude remote sensing image high score Geomorphological Classification method of feature based fusion
CN108154192A (en) * 2018-01-12 2018-06-12 西安电子科技大学 High Resolution SAR terrain classification method based on multiple dimensioned convolution and Fusion Features
CN108537238A (en) * 2018-04-13 2018-09-14 崔植源 A kind of classification of remote-sensing images and search method
CN109214439A (en) * 2018-08-22 2019-01-15 电子科技大学 A kind of infrared image icing River detection method based on multi-feature fusion
CN109271928A (en) * 2018-09-14 2019-01-25 武汉大学 A kind of road network automatic update method based on the fusion of vector road network with the verifying of high score remote sensing image
CN109325395A (en) * 2018-04-28 2019-02-12 二十世纪空间技术应用股份有限公司 The recognition methods of image, convolutional neural networks model training method and device
CN112132006A (en) * 2020-09-21 2020-12-25 西南交通大学 Intelligent forest land and building extraction method for cultivated land protection
CN112329677A (en) * 2020-11-12 2021-02-05 北京环境特性研究所 Remote sensing image river target detection method and device based on feature fusion
CN112395958A (en) * 2020-10-29 2021-02-23 中国地质大学(武汉) Remote sensing image small target detection method based on four-scale depth and shallow layer feature fusion
CN112465880A (en) * 2020-11-26 2021-03-09 西安电子科技大学 Target detection method based on multi-source heterogeneous data cognitive fusion
CN112580439A (en) * 2020-12-01 2021-03-30 中国船舶重工集团公司第七0九研究所 Method and system for detecting large-format remote sensing image ship target under small sample condition

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629378A (en) * 2012-03-01 2012-08-08 西安电子科技大学 Remote sensing image change detection method based on multi-feature fusion
CN107092871A (en) * 2017-04-06 2017-08-25 重庆市地理信息中心 Remote sensing image building detection method based on multiple dimensioned multiple features fusion
CN107292339A (en) * 2017-06-16 2017-10-24 重庆大学 The unmanned plane low altitude remote sensing image high score Geomorphological Classification method of feature based fusion
CN108154192A (en) * 2018-01-12 2018-06-12 西安电子科技大学 High Resolution SAR terrain classification method based on multiple dimensioned convolution and Fusion Features
CN108537238A (en) * 2018-04-13 2018-09-14 崔植源 A kind of classification of remote-sensing images and search method
CN109325395A (en) * 2018-04-28 2019-02-12 二十世纪空间技术应用股份有限公司 The recognition methods of image, convolutional neural networks model training method and device
CN109214439A (en) * 2018-08-22 2019-01-15 电子科技大学 A kind of infrared image icing River detection method based on multi-feature fusion
CN109271928A (en) * 2018-09-14 2019-01-25 武汉大学 A kind of road network automatic update method based on the fusion of vector road network with the verifying of high score remote sensing image
CN112132006A (en) * 2020-09-21 2020-12-25 西南交通大学 Intelligent forest land and building extraction method for cultivated land protection
CN112395958A (en) * 2020-10-29 2021-02-23 中国地质大学(武汉) Remote sensing image small target detection method based on four-scale depth and shallow layer feature fusion
CN112329677A (en) * 2020-11-12 2021-02-05 北京环境特性研究所 Remote sensing image river target detection method and device based on feature fusion
CN112465880A (en) * 2020-11-26 2021-03-09 西安电子科技大学 Target detection method based on multi-source heterogeneous data cognitive fusion
CN112580439A (en) * 2020-12-01 2021-03-30 中国船舶重工集团公司第七0九研究所 Method and system for detecting large-format remote sensing image ship target under small sample condition

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUANGHUI WANG et al.: "CHANGE DETECTION OF HIGH-RESOLUTION REMOTE SENSING IMAGES BASED ON ADAPTIVE FUSION OF MULTIPLE FEATURES", The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences *
JIAHUAN ZHANG et al.: "Multi-Feature Fusion for Weak Target Detection on Sea-Surface Based on FAR Controllable Deep Forest Model", Remote Sensing *
VINCENT HAVYARIMANA et al.: "A Fusion Framework Based on Sparse Gaussian–Wigner Prediction for Vehicle Localization Using GDOP of GPS Satellites", IEEE Transactions on Intelligent Transportation Systems *
姚群力 et al.: "基于多尺度融合特征卷积神经网络的遥感图像飞机目标检测", 《测绘学报》 *
张庆春 et al.: "基于多特征融合和软投票的遥感图像河流检测", 《光学学报》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610838A (en) * 2021-08-25 2021-11-05 华北电力大学(保定) Bolt defect data set expansion method
CN114155208A (en) * 2021-11-15 2022-03-08 中国科学院深圳先进技术研究院 Atrial fibrillation assessment method and device based on deep learning
CN116229319A (en) * 2023-03-01 2023-06-06 广东宜教通教育有限公司 Multi-scale feature fusion class behavior detection method and system
CN116453078A (en) * 2023-03-14 2023-07-18 电子科技大学长三角研究院(湖州) Traffic fixation target detection method based on significance priori
CN116168302A (en) * 2023-04-25 2023-05-26 耕宇牧星(北京)空间科技有限公司 Remote sensing image rock vein extraction method based on multi-scale residual error fusion network
CN116168302B (en) * 2023-04-25 2023-07-14 耕宇牧星(北京)空间科技有限公司 Remote sensing image rock vein extraction method based on multi-scale residual error fusion network
CN116823838A (en) * 2023-08-31 2023-09-29 武汉理工大学三亚科教创新园 Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling
CN116823838B (en) * 2023-08-31 2023-11-14 武汉理工大学三亚科教创新园 Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling

Also Published As

Publication number Publication date
CN113177456B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN113177456B (en) Remote sensing target detection method based on single-stage full convolution network and multi-feature fusion
CN108961235B (en) Defective insulator identification method based on YOLOv3 network and particle filter algorithm
CN108596055B (en) Airport target detection method of high-resolution remote sensing image under complex background
CN111753828B (en) Natural scene horizontal character detection method based on deep convolutional neural network
WO2016155371A1 (en) Method and device for recognizing traffic signs
CN109685152A (en) A kind of image object detection method based on DC-SPP-YOLO
CN110675370A (en) Welding simulator virtual weld defect detection method based on deep learning
CN108564085B (en) Method for automatically reading of pointer type instrument
CN107506761A (en) Brain image dividing method and system based on notable inquiry learning convolutional neural networks
CN111950488B (en) Improved Faster-RCNN remote sensing image target detection method
CN110659601B (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN112396619A (en) Small particle segmentation method based on semantic segmentation and internally complex composition
CN109345559B (en) Moving target tracking method based on sample expansion and depth classification network
CN116758421A (en) Remote sensing image directed target detection method based on weak supervised learning
CN111612747A (en) Method and system for rapidly detecting surface cracks of product
CN116740528A (en) Shadow feature-based side-scan sonar image target detection method and system
CN113313678A (en) Automatic sperm morphology analysis method based on multi-scale feature fusion
CN113516771A (en) Building change feature extraction method based on live-action three-dimensional model
CN116012310A (en) Cross-sea bridge pier surface crack detection method based on linear residual error attention
CN107292268A (en) The SAR image semantic segmentation method of quick ridge ripple deconvolution Structure learning model
Gooda et al. Automatic detection of road cracks using EfficientNet with residual U-net-based segmentation and YOLOv5-based detection
CN112465821A (en) Multi-scale pest image detection method based on boundary key point perception
CN112329677A (en) Remote sensing image river target detection method and device based on feature fusion
CN116597275A (en) High-speed moving target recognition method based on data enhancement
CN111652287A (en) Hand-drawing cross pentagon classification method for AD (analog-to-digital) scale based on convolution depth neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant