CN115937672A - Remote sensing rotating target detection method based on deep neural network - Google Patents

Remote sensing rotating target detection method based on deep neural network

Info

Publication number
CN115937672A
Authority
CN
China
Prior art keywords
network
module
feature
remote sensing
deep neural
Prior art date
Legal status
Pending
Application number
CN202211468221.2A
Other languages
Chinese (zh)
Inventor
沈雨晨 (Shen Yuchen)
宋智豪 (Song Zhihao)
业巧林 (Ye Qiaolin)
Current Assignee
Nanjing Forestry University
Original Assignee
Nanjing Forestry University
Priority date
Filing date
Publication date
Application filed by Nanjing Forestry University
Priority to CN202211468221.2A
Publication of CN115937672A

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the field of image recognition, in particular to a remote sensing rotating target detection method based on a deep neural network. Aimed at the feature fusion part of deep neural networks, the method addresses the neglect of global information in classical networks: a concatenation-based fusion operation along the channel dimension of the feature maps improves the fusion effect, enriches the information the feature maps carry, and raises the accuracy of the network on remote sensing image target detection tasks. With the improved network modules, the method is tested on the DOTA data set, a globally used public remote sensing image data set; compared with other methods it achieves higher accuracy and improves the detection of both small and large targets, while hardly increasing the number of network parameters.

Description

Remote sensing rotating target detection method based on deep neural network
Technical Field
The invention relates to the field of image recognition, in particular to a remote sensing rotating target detection method based on a deep neural network.
Background
Target detection in remote sensing images is a major branch of computer vision and one of its most fundamental yet challenging research topics. It has broad application prospects, and accurate bounding-box identification plays an important role in many fields, such as forest disturbance monitoring, land resource management and urban environment assessment.
In recent years, with the rapid development of deep convolutional neural networks, target detection based on deep learning has advanced greatly. Detection methods can be roughly divided into two categories. The first, exemplified by Dynamic R-CNN, CSL and PP-PicoDet, explicitly optimizes the training process to focus on high-quality samples by fine-tuning the label assignment criterion (i.e., the IoU threshold) and the loss function. The second, exemplified by DAL, AABO and ATSS, automatically adjusts the anchor configuration through new hyper-parameter optimization methods to customize more suitable anchors for a given data set. Many networks can complete simple remote sensing target detection tasks well, but most methods focus on the feature extraction of the backbone and the processing of the classifier while ignoring the importance of the feature fusion part, which still has great room for improvement.
Target detection tasks based on remote sensing images mostly face the following unsolved problems: the detected targets are often small and densely packed; the rotation angle of a target box is arbitrary; when the aspect ratio of a target is large, no suitable corresponding detection anchor exists; and remote sensing images contain a large amount of instance-level noise. These problems directly or indirectly reduce the accuracy of detection networks and hinder the development of remote sensing image target detection.
In the feature map fusion part of a deep neural network, classical networks mostly focus on the fusion and extraction of local features and ignore the role of global information in detection. This causes small targets to be lost from the feature maps after multiple layers of down-sampling, which is unfavorable for remote sensing detection tasks, especially small-target detection in remote sensing images.
Disclosure of Invention
The invention aims to provide a remote sensing rotating target detection method based on a deep neural network, so as to solve the problems described in the background section.
In order to solve the technical problems, the invention provides the following technical scheme:
A remote sensing rotating target detection method based on a deep neural network, characterized by comprising the following steps:
S1, collecting a remote sensing image data set: acquiring data from a globally used public remote sensing image data set, and preprocessing the collected remote sensing image data set;
S2, reading the preprocessed image data and performing online data enhancement on the data set with a composite data enhancement method;
S3, extracting multi-layer abstract features from the original image through the backbone extraction network and inputting them into the improved feature pyramid module for processing;
S4, extracting the feature map of the processed image and processing the feature map through the global information processing module;
S5, obtaining the result of the global information processing module according to S4, and performing convolution processing on the feature map through the feature refining module;
S6, transmitting the feature maps of all layers into the detector of the rotating target, the detector being a deep learning module containing a fully connected network whose input is the feature maps and whose output is the center coordinates x and y of the detected target, the width w and height h of the detection box and the rotation angle θ; obtaining the network prediction result from the feature maps, and constructing the network loss function according to the comparison between the prediction result and the real sample labels (an illustrative sketch of such a detection head is given after these steps);
and S7, based on minimization of the loss function, performing back-propagation iterations with the momentum stochastic gradient descent algorithm and updating the weights of the trainable parameters in the network to train the deep neural network.
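For illustration only, the rotating-target detection head of step S6 can be sketched as a small fully connected network in PyTorch; the class name, layer widths and angle encoding below are assumptions, not the patented implementation:

```python
import math
import torch
import torch.nn as nn

class RotatedBoxHead(nn.Module):
    """Sketch of the S6 rotating-target head: a fully connected network
    mapping a pooled feature vector to (x, y, w, h, theta)."""
    def __init__(self, in_features: int = 256):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(in_features, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 5),  # x, y, w, h, theta
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        out = self.fc(feats)
        xywh, theta = out[:, :4], out[:, 4:5]
        # assumed angle encoding: squash theta into (-pi/2, pi/2)
        theta = torch.tanh(theta) * (math.pi / 2)
        return torch.cat([xywh, theta], dim=1)

head = RotatedBoxHead()
print(head(torch.randn(2, 256)).shape)  # torch.Size([2, 5])
```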
Further, the method for preprocessing the acquired remote sensing image data set in the S1 includes the following steps:
S1.1, acquiring a remote sensing image data set;
S1.2, cutting the original image into uniform sizes by geometric transformation, the geometric transformations comprising cropping, scaling, rotation and flipping: the original image S_qian, of width W_qian and height H_qian, is cut into images of size 1024 × 1024, where S_hou(x, y) denotes the pixel at position (x, y) of a cropped image (a sketch of this tiling step follows the step list below);
and S1.3, taking the cut images as a set and recording the set as a training sample set.
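A minimal sketch of the S1.2 cutting step, assuming non-overlapping 1024 × 1024 crops with zero padding at the image border (stride and padding are not specified in the text):

```python
import numpy as np

def crop_to_tiles(image: np.ndarray, tile: int = 1024) -> list:
    """Cut an H x W x C remote sensing image into uniform tile x tile
    crops (S1.2); ragged edge tiles are zero-padded to full size."""
    h, w = image.shape[:2]
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            crop = image[top:top + tile, left:left + tile]
            if crop.shape[:2] != (tile, tile):
                padded = np.zeros((tile, tile) + image.shape[2:], image.dtype)
                padded[:crop.shape[0], :crop.shape[1]] = crop
                crop = padded
            tiles.append(crop)
    return tiles

training_set = crop_to_tiles(np.zeros((2048, 3000, 3), dtype=np.uint8))
print(len(training_set))  # 6 crops of 1024 x 1024
```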
Further, the method in S2 of reading the preprocessed image data and performing online data enhancement on the data set with a composite data enhancement method includes the following steps:
S2.1, obtaining the cropped training sample set according to S1.3;
and S2.2, taking a training sample and applying the rotation and flipping preprocessing operations to it.
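The rotation and flipping of S2.2 can be sketched as follows; random 90° rotations and axis flips are an assumed concrete choice, and the matching rotated-box labels (x, y, w, h, θ) would have to be transformed in the same way:

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Online data enhancement (S2.2): random rotation and flips."""
    image = np.rot90(image, k=int(rng.integers(4)))  # rotate 0/90/180/270 deg
    if rng.random() < 0.5:
        image = image[:, ::-1]                       # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]                       # vertical flip
    return np.ascontiguousarray(image)

sample = augment(np.zeros((1024, 1024, 3), dtype=np.uint8),
                 np.random.default_rng(0))
print(sample.shape)  # (1024, 1024, 3)
```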
Further, the method for inputting the extracted multilayer abstract features in the original image to the improved feature pyramid module for processing through the backbone extraction network in S3 includes the following steps:
S3.1, using a pre-trained residual network as the backbone to extract multi-layer abstract features from the original image, and constructing a feature pyramid module; a feature pyramid is a network structure that is generally narrow at the top and wide at the bottom, and the multi-layer abstract features extracted by the backbone grow larger layer by layer, matching this pyramid structure;
and S3.2, inputting the extracted multi-layer abstract features into the feature pyramid module for processing, where feature fusion is performed between two adjacent layers of the feature pyramid module.
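For reference, the adjacent-layer fusion of S3.2 follows the standard feature-pyramid pattern (lateral 1 × 1 projection plus an upsampled top-down pathway); the sketch below shows this classical step, on top of which the invention's improvements are applied:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNFuse(nn.Module):
    """Classical FPN fusion between two adjacent backbone levels (S3.2)."""
    def __init__(self, c_low: int, c_high: int, c_out: int = 256):
        super().__init__()
        self.lat_low = nn.Conv2d(c_low, c_out, 1)    # lateral 1x1 projections
        self.lat_high = nn.Conv2d(c_high, c_out, 1)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        top_down = F.interpolate(self.lat_high(high), scale_factor=2,
                                 mode="nearest")     # upsample coarser level
        return self.lat_low(low) + top_down          # element-wise fusion

fuse = FPNFuse(c_low=512, c_high=1024)
p = fuse(torch.randn(1, 512, 64, 64), torch.randn(1, 1024, 32, 32))
print(p.shape)  # torch.Size([1, 256, 64, 64])
```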
Further, the method for extracting the feature map of the processed image in S4 and processing the feature map by the global information processing module includes the following steps:
S4.1, obtaining the feature pyramid module according to S3.1, and processing the image feature map through the global information processing module within the feature pyramid module;
S4.2, taking D convolution kernels of size H × W whose channel count matches that of the input image, and a feature map of size 1024 × 1024 with C channels as input;
S4.3, denoting a convolution kernel by F ∈ R^(H×W×C), the input by M ∈ R^(1024×1024×C) and the output feature map by O ∈ R^(R×T×D), the output feature map channel corresponding to the jth convolution kernel is obtained according to the formula

O_{:,:,j} = Σ_{n=1}^{C} M_{:,:,n} * F^{(j)}_{:,:,n}

where * denotes the two-dimensional convolution operator, M_{:,:,n} is the 1024 × 1024 feature map of the nth input channel, and F^{(j)}_{:,:,n} is the nth channel of the jth convolution kernel F^{(j)}.
S4.4, normalization processing O :,:j To obtain
Figure BDA0003957278820000033
Where μ j represents the batch normalized channel mean, σ j represents the batch normalized channel standard deviation, γ j represents the scaling factor, and β j represents the offset;
S4.5, according to the formula

F′^{(j)} = O^{(j)}_{1×k} + O^{(j)}_{k×1} + b_j

the outputs of the 1 × k and k × 1 convolution kernels are fused into a single output, where F′^{(j)} denotes the global processing output result, b_j the bias, O^{(j)}_{1×k} the output of the 1 × k convolution kernel, and O^{(j)}_{k×1} the output of the k × 1 convolution kernel.
The invention processes the feature map through the global information processing module and uses the combination of 1 × k and k × 1 strip-shaped symmetric convolutions in place of the traditional k × k square convolution kernel, which greatly reduces the network parameters while effectively enlarging the receptive field, extracting more contextual detail and providing a data reference for the subsequent feature refining module.
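A minimal PyTorch sketch of this strip-convolution idea, assuming a kernel length k = 13, same-size padding and per-branch batch normalization (all choices here are assumptions); the two branch outputs are summed as in the fusion formula of S4.5:

```python
import torch
import torch.nn as nn

class StripConvBlock(nn.Module):
    """Global information module sketch: parallel 1 x k and k x 1
    convolutions replace a k x k kernel; their batch-normalized outputs
    are summed as F'^(j) = O_1xk + O_kx1 + b_j."""
    def __init__(self, channels: int, k: int = 13):
        super().__init__()
        p = k // 2
        self.horizontal = nn.Conv2d(channels, channels, (1, k),
                                    padding=(0, p), bias=False)
        self.vertical = nn.Conv2d(channels, channels, (k, 1),
                                  padding=(p, 0), bias=False)
        self.bn_h = nn.BatchNorm2d(channels)
        self.bn_v = nn.BatchNorm2d(channels)
        self.bias = nn.Parameter(torch.zeros(channels))  # b_j

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn_h(self.horizontal(x)) + self.bn_v(self.vertical(x))
        return out + self.bias.view(1, -1, 1, 1)

block = StripConvBlock(256)
print(block(torch.randn(1, 256, 64, 64)).shape)  # spatial size preserved
```

With C input and C output channels, a k × k kernel costs k²·C² weights while the 1 × k plus k × 1 pair costs 2k·C², roughly a 6.5-fold reduction for k = 13 while preserving the receptive field along each axis.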
Further, the method in S5 of obtaining the result of the global information processing module according to S4 and performing convolution processing on the feature map through the feature refining module includes the following steps:
S5.1, obtaining the global processing output result F′^{(j)} according to S4.5;
S5.2, inputting the feature map corresponding to the global processing output result F′^{(j)} into the feature refining module;
S5.3, performing dimensionality reduction on the feature map with a 1 × 1 convolution;
S5.4, performing feature fusion with a 3 × 3 convolution;
S5.5, unifying the number of channels of the fused features through five successive convolutions and inputting the result into the detection head.
According to the invention, the feature map is processed a second time by the feature refining module: on the basis of the feature pyramid, the traditional element-wise feature addition is replaced by concatenation along the channel dimension, and 1 × 1 and 3 × 3 convolutions reduce the number of channels and fuse the features, so that the number of parameters is effectively reduced while more feature information is retained, providing a data reference for the subsequent prediction of object classes and detection boxes.
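A sketch of this refinement step under stated assumptions: two adjacent pyramid levels are concatenated along the channel dimension (instead of added), reduced with a 1 × 1 convolution, fused with a 3 × 3 convolution, and passed through five successive convolutions; channel widths are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRefine(nn.Module):
    """Feature refining sketch: channel-wise concatenation of adjacent
    levels, 1x1 reduction, 3x3 fusion, then five successive convolutions
    to unify the channel count for the detection head."""
    def __init__(self, c_in: int = 256, c_out: int = 256):
        super().__init__()
        self.reduce = nn.Conv2d(2 * c_in, c_in, 1)       # 1x1 dim reduction
        self.fuse = nn.Conv2d(c_in, c_in, 3, padding=1)  # 3x3 feature fusion
        self.refine = nn.Sequential(*[
            nn.Conv2d(c_in, c_in if i < 4 else c_out, 3, padding=1)
            for i in range(5)                            # five convolutions
        ])

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        high = F.interpolate(high, size=low.shape[-2:], mode="nearest")
        x = torch.cat([low, high], dim=1)   # concatenation, not addition
        return self.refine(self.fuse(self.reduce(x)))

m = FeatureRefine()
print(m(torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32)).shape)
```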
Further, the method for constructing the network loss function according to the comparison result between the prediction result and the real sample label in S6 includes the following steps:
S6.1, obtaining the feature map information processed by the feature refining module and passed to the detector of the rotating target in S5.5; the detector outputs the center coordinates x and y of a detected target, the width w and height h of the detection box and the rotation angle θ, from which the network prediction result is obtained;
S6.2, comparing the prediction result with the original image and constructing the network loss function according to the comparison result:

L_log(Y, P) = −(1/N) Σ_q Σ_w Y_{q,w} log P_{q,w}

where N denotes the number of original images after the cutting process, P_{q,w} the predicted probability that the qth sample carries the wth label, Y the real sample labels recorded from the original image information, M the number of label values, and L_log(Y, P) the total data-set loss function.
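The constructed loss is the standard multi-class log loss; a direct numerical sketch of the formula above (Y one-hot over M labels):

```python
import numpy as np

def log_loss(Y: np.ndarray, P: np.ndarray) -> float:
    """L_log(Y, P) = -(1/N) * sum_q sum_w Y[q, w] * log(P[q, w]);
    Y is one-hot (N x M), P holds predicted probabilities (N x M)."""
    eps = 1e-12                       # guard against log(0)
    return float(-(Y * np.log(P + eps)).sum() / Y.shape[0])

Y = np.array([[1, 0, 0], [0, 1, 0]])              # two samples, three labels
P = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]])
print(round(log_loss(Y, P), 4))  # 0.2899
```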
Further, the method in S7 of training the deep neural network based on minimization of the loss function, through back-propagation iterations of the momentum stochastic gradient descent algorithm that update the weights of the trainable parameters in the network, includes the following steps:
S7.1, obtaining the data-set loss function L_log(Y, P) according to S6.2;
S7.2, according to the formula Min = min L_log(Y, P), obtaining the value that minimizes the loss function; based on this minimization, stochastic gradient descent with momentum is used as the optimizer, and back-propagation iteratively updates the weights of the trainable parameters in the network to train the deep neural network.
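The S7 update is ordinary back-propagation with momentum stochastic gradient descent; a minimal training-loop sketch with assumed hyperparameters (learning rate, momentum and weight decay are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 5, 3, padding=1)          # stand-in for the full network
criterion = nn.CrossEntropyLoss()              # log loss over class labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-4)

for step in range(10):                         # toy loop over random batches
    images = torch.randn(4, 3, 64, 64)
    labels = torch.randint(0, 5, (4, 64, 64))
    optimizer.zero_grad()
    loss = criterion(model(images), labels)    # compare prediction vs labels
    loss.backward()                            # back-propagate gradients
    optimizer.step()                           # update trainable weights
```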
A remote sensing rotating target detection system based on a deep neural network, characterized by comprising the following modules:
the data preprocessing and data enhancing module: the data preprocessing and data enhancing module is used for performing data enhancing processing on the original image, expanding a data set and realizing the preprocessing of the data set;
a trunk feature extraction network module: the trunk feature extraction network module is used for extracting multilayer abstract features from an original image, inputting multilayer feature maps in the trunk feature extraction network into the improved feature pyramid module and processing the multilayer feature maps;
the global information processing module: the global information processing module uses large convolution kernels to enlarge the receptive field of the network and enhance its performance, thereby capturing the global information of the feature map;
a feature refining module: the feature refining module merges adjacent feature layers along the channel dimension and, through five successive convolutions after merging, reduces the number of channels and parameters while strengthening the fusion of the feature maps and extracting more usable information;
a loss function back-propagation module: the feature maps of all layers are passed into the rotating detector to obtain the prediction result of the network, which is compared with the real sample labels to construct the network loss function; based on minimization of the loss function, back-propagation iterations of the momentum stochastic gradient descent algorithm update the weights of the trainable parameters in the network to train the deep neural network.
With the improved network modules, the method is tested on the DOTA data set, a globally used public remote sensing image data set; compared with other methods it achieves higher accuracy, notably alleviating the missed-detection and false-detection problems of traditional methods, improves the detection of both small and large targets, and does not greatly increase the network parameters.
Drawings
FIG. 1 is a schematic flow chart of the remote sensing rotating target detection method based on a deep neural network according to the present invention;
FIG. 2 is the overall network structure diagram of the remote sensing rotating target detection method based on a deep neural network according to the present invention;
FIG. 3 is a structure diagram of the global information processing module of the remote sensing rotating target detection system based on a deep neural network;
FIG. 4 is a structure diagram of the feature refining module of the remote sensing rotating target detection system based on a deep neural network;
FIG. 5 is the overall network diagram of the remote sensing rotating target detection method based on a deep neural network;
FIG. 6 is an algorithm effect diagram of the remote sensing rotating target detection method based on a deep neural network.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-5, in an embodiment of the present invention, a remote sensing rotating target detection method based on a deep neural network comprises the following steps:
S1, collecting a remote sensing image data set: acquiring data from a globally used public remote sensing image data set, and preprocessing the collected remote sensing image data set;
S2, reading the preprocessed image data and performing online data enhancement on the data set with a composite data enhancement method;
S3, extracting multi-layer abstract features from the original image through the backbone extraction network and inputting them into the improved feature pyramid module for processing;
S4, extracting the feature map of the processed image and processing the feature map through the global information processing module;
S5, obtaining the result of the global information processing module according to S4, and performing convolution processing on the feature map through the feature refining module;
S6, transmitting the feature maps of all layers into the detector of the rotating target to obtain the network prediction result, and constructing the network loss function according to the comparison between the prediction result and the real sample labels;
And S7, based on minimization of the loss function, performing back-propagation iterations with the momentum stochastic gradient descent algorithm and updating the weights of the trainable parameters in the network to train the deep neural network.
The method for preprocessing the acquired remote sensing image data set in the S1 comprises the following steps:
S1.1, acquiring a remote sensing image data set;
S1.2, cutting the original image into uniform sizes by geometric transformation, the geometric transformations comprising cropping, scaling, rotation and flipping: the original image S_qian, of width W_qian and height H_qian, is cut into images of size 1024 × 1024, where S_hou(x, y) denotes the pixel at position (x, y) of a cropped image;
and S1.3, taking the cut images as a set and recording the set as a training sample set.
The method in S2 of reading the preprocessed image data and performing online data enhancement on the data set with a composite data enhancement method comprises the following steps:
S2.1, obtaining the cropped training sample set according to S1.3;
and S2.2, taking a training sample and applying the rotation and flipping preprocessing operations to it.
In the step S3, the method for inputting the extracted multilayer abstract features in the original image to the improved feature pyramid module for processing through the backbone extraction network includes the following steps:
S3.1, extracting multi-layer abstract features from the original image using a pre-trained residual network as the backbone network, and constructing a feature pyramid module;
and S3.2, inputting the extracted multi-layer abstract features into the feature pyramid module for processing, where feature fusion is performed between two adjacent layers of the feature pyramid module.
The method in S4 of extracting the feature map of the processed image and processing it through the global information processing module comprises the following steps:
S4.1, obtaining the feature pyramid module according to S3.1, and processing the image feature map through the global information processing module within the feature pyramid module;
S4.2, taking D convolution kernels of size H × W whose channel count matches that of the input image, and a feature map of size 1024 × 1024 with C channels as input;
S4.3, denoting a convolution kernel by F ∈ R^(H×W×C), the input by M ∈ R^(1024×1024×C) and the output feature map by O ∈ R^(R×T×D), the output feature map channel corresponding to the jth convolution kernel is obtained according to the formula

O_{:,:,j} = Σ_{n=1}^{C} M_{:,:,n} * F^{(j)}_{:,:,n}

where * denotes the two-dimensional convolution operator, M_{:,:,n} is the 1024 × 1024 feature map of the nth input channel, and F^{(j)}_{:,:,n} is the nth channel of the jth convolution kernel F^{(j)}.
S4.4, normalization processing O :,:j To obtain
Figure BDA0003957278820000072
Wherein μ j represents the batch normalized channel mean, σ j represents the batch normalized channel standard deviation, γ j represents the scaling factor, and β j represents the offset;
S4.5, according to the formula

F′^{(j)} = O^{(j)}_{1×k} + O^{(j)}_{k×1} + b_j

the outputs of the 1 × k and k × 1 convolution kernels are fused into a single output, where F′^{(j)} denotes the global processing output result, b_j the bias, O^{(j)}_{1×k} the output of the 1 × k convolution kernel, and O^{(j)}_{k×1} the output of the k × 1 convolution kernel.
In the step S5, the result of the global information processing module is obtained according to the step S4, and the method for performing convolution processing on the feature map by the feature refining module includes the following steps:
S5.1, obtaining the global processing output result F′^{(j)} according to S4.5;
S5.2, inputting the feature map corresponding to the global processing output result F′^{(j)} into the feature refining module;
S5.3, performing dimensionality reduction on the feature map with a 1 × 1 convolution;
S5.4, performing feature fusion with a 3 × 3 convolution;
and S5.5, unifying the number of channels of the fused features through five successive convolutions and inputting the result into the detection head.
The method for constructing the network loss function according to the comparison result of the prediction result and the real sample label in the S6 comprises the following steps:
S6.1, obtaining the feature map information processed by the feature refining module and passed to the detector of the rotating target in S5.5, and processing it to obtain the network prediction result;
S6.2, comparing the prediction result with the original image and constructing the network loss function according to the comparison result:

L_log(Y, P) = −(1/N) Σ_q Σ_w Y_{q,w} log P_{q,w}

where N denotes the number of original images after the cutting process, P_{q,w} the predicted probability that the qth sample carries the wth label, Y the real sample labels recorded from the original image information, M the number of label values, and L_log(Y, P) the total data-set loss function.
Based on minimization of the loss function, the method in S7 of training the deep neural network by back-propagation iterations of the momentum stochastic gradient descent algorithm that update the weights of the trainable parameters in the network comprises the following steps:
S7.1, obtaining the data-set loss function L_log(Y, P) according to S6.2;
S7.2, according to the formula Min = min L_log(Y, P), obtaining the value that minimizes the loss function; based on this minimization, stochastic gradient descent with momentum is used as the optimizer, and back-propagation iteratively updates the weights of the trainable parameters in the network to train the deep neural network.
A remote sensing rotating target detection system based on a deep neural network (as shown in fig. 2), characterized in that the system comprises the following modules:
the data preprocessing and data enhancing module: the data preprocessing and data enhancing module is used for performing data enhancement processing on the original image, and the data enhancement is to expand a limited data set by some methods, increase the number and diversity of training sets and improve the generalization capability of the model;
a trunk feature extraction network module: the trunk feature extraction network module is used for extracting multilayer abstract features from an original image, inputting multilayer feature maps in the trunk feature extraction network into the improved feature pyramid module and processing the multilayer feature maps;
global information processing module (as shown in fig. 3): the global information processing module uses a large convolution kernel, which enlarges the receptive field of the network and enhances its performance but would greatly increase the network parameters; the invention therefore uses the combination of 1 × k and k × 1 strip convolutions in place of the traditional k × k square convolution kernel, greatly reducing the network parameters while capturing the global information of the feature map;
feature refining module (as shown in fig. 4): the feature refining module merges adjacent feature layers along the channel dimension, which retains more useful information and is more interpretable than the traditional direct addition. After merging, five successive convolutions reduce the number of channels and parameters while strengthening the fusion of the feature maps and extracting more usable information;
loss function back-propagation module: the feature maps of all layers are passed into the rotating detector to obtain the prediction result of the network, which is compared with the real sample labels to construct the network loss function; the loss function is minimized by back-propagation iterations of the momentum stochastic gradient descent algorithm, updating the weights of the trainable parameters in the network to train the deep neural network, as shown in the overall network diagram of fig. 5.
In this embodiment, the remote sensing rotating target detection method based on the deep neural network preprocesses the acquired original images, uses a pre-trained residual network as the feature extraction network, applies global information processing and feature refinement to the extracted feature maps, fuses the features with 3 × 3 convolutions, unifies the channel count of the fused features by convolution, and feeds the result into the detection head network to predict the target class and detection box; fig. 6 shows the concrete prediction results.
The network improved with the proposed modules is tested on the DOTA data set, a globally used public remote sensing image data set, and achieves higher accuracy than other methods. In particular, it alleviates the missed-detection and false-detection problems of traditional methods and improves the detection of both small and large targets without greatly increasing the network parameters. The average precision of the proposed modules reaches 79.37%, an improvement of 6.28 percentage points over the 73.09% of the classical method without them, and it also surpasses other classical methods.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A remote sensing rotating target detection method based on a deep neural network, characterized by comprising the following steps:
S1, collecting a remote sensing image data set: acquiring data from a globally used public remote sensing image data set, and preprocessing the collected remote sensing image data set;
S2, reading the preprocessed image data and performing online data enhancement on the data set with a composite data enhancement method;
S3, extracting multi-layer abstract features from the original image through the backbone extraction network and inputting them into the improved feature pyramid module for processing;
S4, extracting the feature map of the processed image and processing the feature map through the global information processing module;
S5, obtaining the result of the global information processing module according to S4, and performing convolution processing on the feature map through the feature refining module;
S6, transmitting the feature maps of all layers into the detector of the rotating target to obtain the network prediction result, and constructing the network loss function according to the comparison between the prediction result and the real sample labels;
and S7, based on minimization of the loss function, performing back-propagation iterations with the momentum stochastic gradient descent algorithm and updating the weights of the trainable parameters in the network to train the deep neural network.
2. The method for detecting the remote sensing rotating target based on the deep neural network as claimed in claim 1, wherein the method for preprocessing the acquired remote sensing image data set in the step S1 comprises the following steps:
S1.1, acquiring a remote sensing image data set;
S1.2, cutting the original image into uniform sizes by geometric transformation, the geometric transformations comprising cropping, scaling, rotation and flipping: the original image S_qian, of width W_qian and height H_qian, is cut into images of size 1024 × 1024, where S_hou(x, y) denotes the pixel at position (x, y) of a cropped image;
and S1.3, taking the cut images as a set and recording the set as a training sample set.
3. The remote sensing rotating target detection method based on the deep neural network as claimed in claim 2, wherein the method for performing online data enhancement on the data set by reading the preprocessed image data and adopting a complex data enhancement method in the S2 comprises the following steps:
S2.1, obtaining the cropped training sample set according to S1.3;
and S2.2, taking a training sample and applying the rotation and flipping preprocessing operations to it.
4. The method for detecting the remote sensing rotating target based on the deep neural network as claimed in claim 3, wherein in the step S3, the method for inputting the extracted multilayer abstract features in the original image into the improved feature pyramid module for processing through the trunk extraction network comprises the following steps:
S3.1, extracting multi-layer abstract features from the original image using a pre-trained residual network as the backbone network, and constructing a feature pyramid module;
and S3.2, inputting the extracted multi-layer abstract features into the feature pyramid module for processing, where feature fusion is performed between two adjacent layers of the feature pyramid module.
5. The method for detecting the remote sensing rotating target based on the deep neural network as claimed in claim 4, wherein the method for extracting the feature map of the processed image in the S4 and processing the feature map through the global information processing module comprises the following steps:
S4.1, obtaining the feature pyramid module according to S3.1, and processing the image feature map through the global information processing module within the feature pyramid module;
S4.2, taking D convolution kernels of size H × W whose channel count matches that of the input image, and a feature map of size 1024 × 1024 with C channels as input;
S4.3, denoting a convolution kernel by F ∈ R^(H×W×C), the input by M ∈ R^(1024×1024×C) and the output feature map by O ∈ R^(R×T×D), the output feature map channel corresponding to the jth convolution kernel is obtained according to the formula

O_{:,:,j} = Σ_{n=1}^{C} M_{:,:,n} * F^{(j)}_{:,:,n}

where * denotes the two-dimensional convolution operator, M_{:,:,n} is the 1024 × 1024 feature map of the nth input channel, and F^{(j)}_{:,:,n} is the nth channel of the jth convolution kernel F^{(j)}.
S4.4, normalizing O_{:,:,j} to obtain

O′_{:,:,j} = γ_j · (O_{:,:,j} − μ_j) / σ_j + β_j

where μ_j denotes the batch-normalized channel mean, σ_j the batch-normalized channel standard deviation, γ_j the scaling factor and β_j the offset;
S4.5, according to the formula

F′^{(j)} = O^{(j)}_{1×k} + O^{(j)}_{k×1} + b_j

the outputs of the 1 × k and k × 1 convolution kernels are fused into a single output, where F′^{(j)} denotes the global processing output result, b_j the bias, O^{(j)}_{1×k} the output of the 1 × k convolution kernel, and O^{(j)}_{k×1} the output of the k × 1 convolution kernel.
6. The method for detecting the remote sensing rotating target based on the deep neural network as claimed in claim 5, wherein the result of the global information processing module is obtained in the step S5 according to the step S4, and the method for performing convolution processing on the feature map through the feature refining module comprises the following steps:
S5.1, obtaining the global processing output result F′^{(j)} according to S4.5;
S5.2, inputting the feature map corresponding to the global processing output result F′^{(j)} into the feature refining module;
S5.3, performing dimensionality reduction on the feature map with a 1 × 1 convolution;
S5.4, performing feature fusion with a 3 × 3 convolution;
and S5.5, unifying the number of channels of the fused features through five successive convolutions and inputting the result into the detection head.
7. The method for detecting the remote sensing rotating target based on the deep neural network as claimed in claim 6, wherein the method for constructing the network loss function according to the comparison result between the prediction result and the real sample label in the step S6 comprises the following steps:
S6.1, obtaining the feature map information processed by the feature refining module and passed to the rotating detector in S5.5, and processing it to obtain the network prediction result;
S6.2, comparing the prediction result with the original image and constructing the network loss function according to the comparison result:

L_log(Y, P) = −(1/N) Σ_q Σ_w Y_{q,w} log P_{q,w}

where N denotes the number of original images after the cutting process, P_{q,w} the predicted probability that the qth sample carries the wth label, Y the real sample labels recorded from the original image information, M the number of label values, and L_log(Y, P) the total data-set loss function.
8. The remote sensing rotating target detection method based on the deep neural network of claim 7, wherein the method in S7 of training the deep neural network based on minimization of the loss function, through back-propagation iterations of the momentum stochastic gradient descent algorithm that update the weights of the trainable parameters in the network, comprises the following steps:
S7.1, obtaining the data-set loss function L_log(Y, P) according to S6.2;
S7.2, according to the formula Min = min L_log(Y, P), obtaining the value that minimizes the loss function; based on this minimization, stochastic gradient descent with momentum is used as the optimizer, and back-propagation iteratively updates the weights of the trainable parameters in the network to train the deep neural network.
9. A remote sensing rotating target detection system based on a deep neural network, characterized by comprising the following modules:
the data preprocessing and data enhancing module: the data preprocessing and data enhancing module is used for performing data enhancement processing on the original image, expanding a data set and realizing the preprocessing of the data set;
a trunk feature extraction network module: the trunk feature extraction network module is used for extracting multilayer abstract features from an original image, inputting multilayer feature maps in the trunk feature extraction network into the improved feature pyramid module and processing the multilayer feature maps;
a global information processing module: the global information processing module uses large convolution kernels to enlarge the receptive field of the network and enhance its performance, thereby capturing the global information of the feature map;
a feature refining module: the feature refining module merges adjacent feature layers along the channel dimension and, through five successive convolutions after merging, reduces the number of channels and parameters while strengthening the fusion of the feature maps and extracting more usable information;
a loss function back-propagation module: the feature maps of all layers are passed into the rotating detector to obtain the prediction result of the network, which is compared with the real sample labels to construct the network loss function; based on minimization of the loss function, back-propagation iterations of the momentum stochastic gradient descent algorithm update the weights of the trainable parameters in the network to train the deep neural network.
CN202211468221.2A 2022-11-22 2022-11-22 Remote sensing rotating target detection method based on deep neural network Pending CN115937672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211468221.2A CN115937672A (en) 2022-11-22 2022-11-22 Remote sensing rotating target detection method based on deep neural network


Publications (1)

Publication Number Publication Date
CN115937672A true CN115937672A (en) 2023-04-07

Family

ID=86556884




Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2020104006A4 (en) * 2020-12-10 2021-02-18 Naval Aviation University Radar target recognition method based on feature pyramid lightweight convolutional neural network
CN113111727A (en) * 2021-03-19 2021-07-13 西北工业大学 Method for detecting rotating target in remote sensing scene based on feature alignment
CN114519819A (en) * 2022-02-10 2022-05-20 西北工业大学 Remote sensing image target detection method based on global context awareness
CN115187786A (en) * 2022-07-21 2022-10-14 北京工业大学 Rotation-based CenterNet2 target detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUCHEN SHEN et al.: "Learning to Reduce Information Bottleneck for Object Detection in Aerial Images", arXiv:2204.02033v1, 5 April 2022, pages 1-5 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168033B (en) * 2023-04-25 2023-08-22 厦门福信光电集成有限公司 Wafer lattice dislocation image detection method and system based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination