CN110532859B - Remote sensing image target detection method based on deep evolution pruning convolution net - Google Patents

Remote sensing image target detection method based on deep evolution pruning convolution net

Info

Publication number
CN110532859B
Authority
CN
China
Prior art keywords
convolution
layer
network
deep
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910648586.5A
Other languages
Chinese (zh)
Other versions
CN110532859A (en)
Inventor
Jiao Licheng
Li Lingling
Jiang Sheng
Guo Yuwei
Cheng Xina
Ding Jingyi
Zhang Mengxuan
Yang Shuyuan
Hou Biao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201910648586.5A
Publication of CN110532859A
Application granted
Publication of CN110532859B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a remote sensing image target detection method based on a deep evolution pruning convolution network, which solves the problem that existing remote sensing image target detection cannot simultaneously, globally and effectively optimize detection speed and detection precision. The method comprises the following specific steps: processing the data sets; constructing a deep convolution feature extraction sub-network; constructing a full convolution FCN detection sub-network; constructing and training a deep convolution target detection network; constructing and training a target detection network based on the deep evolution pruning convolution network; carrying out target detection on the test data set with the trained model; and outputting the test result. The method builds an inverse residual structure from depth separable convolutions, which greatly reduces the model parameters while keeping detection precision high, and combines the target detection network with evolutionary pruning to achieve overall acceleration. The method greatly reduces the amount of computation, markedly improves the target detection speed with high detection precision, and is used for the fast and accurate detection of small targets such as airplanes and ships in remote sensing images.

Description

Remote sensing image target detection method based on deep evolution pruning convolution net
Technical Field
The invention belongs to the technical field of image processing and further relates to remote sensing image target detection, in particular to a remote sensing image target detection method based on a deep evolution pruning convolution network, which can be applied to detecting airplane and ship targets in different areas of a remote sensing image.
Background
The target detection technology is one of core problems in the field of computer vision, and the remote sensing image target detection means that an image captured by a remote sensing satellite is used as a data source, and an image processing technology is adopted to position and classify an interested target in the image. The remote sensing image target detection is used as a key technology in the application of the remote sensing image, can capture an attack target in high-tech military countermeasure, provides accurate position and category information and the like, has great influence on the military field, and has important application and research values.
In the prior art, because remote sensing images are large, have low resolution, contain small targets and have fuzzy target edges, existing methods often cannot learn the characteristics of the targets well, so the accuracy of target detection is low; moreover, the huge data volume of remote sensing images and the huge parameters of network models greatly limit the detection speed.
The efficiency and accuracy of existing target detection techniques are often incompatible: two-stage detection models such as Faster R-CNN achieve high accuracy but bring a huge amount of calculation, while one-stage detection models such as YOLO and SSD are fast but their accuracy is not satisfactory.
Tsung-Yi Lin et al. proposed the general one-stage target detection model RetinaNet in the paper "Focal Loss for Dense Object Detection" (ICCV 2017). The model uses the residual network ResNet for the primary extraction of image features, and adds a feature pyramid network FPN to fuse the feature maps of different layers generated by the residual network, which enhances the semantic information of the output features, makes small targets easier to identify and further improves detection performance; classification and regression prediction are then carried out on each pyramid layer. Finally, the Focal Loss function is used to solve the class-imbalance problem caused by excessive background, which affects the accuracy of one-stage target detection models; the detection result of this one-stage model on the COCO data set was the first to exceed the most advanced two-stage target detection models of the time. However, the method still has the defect that a large amount of redundant information exists in the residual network ResNet and the feature pyramid network FPN; the parameter and operation amounts are large, which affects the computational complexity and speed of the model and does not meet the requirements for deployment in embedded devices.
Shaohui Lin et al. proposed the global dynamic pruning method GDP in the paper "Accelerating Convolutional Networks via Global & Dynamic Filter Pruning" (IJCAI 2018). The method first provides a global discriminant function based on prior knowledge of each filter and prunes filters with low saliency across all layers in a global range; it then dynamically updates the saliency of the filters of the whole pruned sparse network, re-codes and retrains wrongly pruned filters to recover model accuracy, and performs global optimization with a stochastic gradient descent method based on a greedy algorithm. However, the method still has the disadvantage that the global discriminant function based on filter prior knowledge needs to be designed for the specific task, and using the same global discriminant function in different applications may introduce discriminant bias and lose overall accuracy.
When existing target detection algorithms are applied to large, low-resolution optical remote sensing images, the huge data volume and model parameter count, together with the small target size and fuzzy target edges, prevent the detection accuracy and detection speed of the prior art from being optimal at the same time, making fast and accurate detection of optical remote sensing images difficult.
Disclosure of Invention
The invention aims to provide a remote sensing image target detection method based on a deep evolution pruning convolution network, which maintains higher accuracy, greatly reduces the computation complexity and greatly improves the overall network operation speed, aiming at the defects of the prior art.
The invention relates to a remote sensing image target detection method based on a deep evolution pruning convolution net, which is characterized by comprising the following steps:
(1) processing the training dataset and the validation dataset: selecting a plurality of optical remote sensing images containing various targets and processing them into 512 × 512 image blocks, of which 70% form the training data set and 30% form the verification data set, and performing data enhancement on the training data set;

(2) processing the test data set: inputting another plurality of optical remote sensing images containing various targets and processing them into 512 × 512 image blocks to form the test data set;
(3) constructing a deep convolution feature extraction sub-network: respectively constructing a depth separable convolution inverse residual connecting module and a feature pyramid convolution module, then sequentially using a 7 × 7 convolution layer and a maximum pooling layer and alternately connecting the depth separable convolution inverse residual connecting modules and the feature pyramid convolution modules to form the deep convolution feature extraction sub-network;
the specific structure of the sub-network for extracting the depth convolution features is that an original image input layer → 7 × 7 convolution layer → a first maximum pooling layer → a first depth separable convolution inverse residual connection module C1 → a second depth separable convolution inverse residual connection module C2 → a first feature pyramid convolution module P1 → a third depth separable convolution inverse residual connection module C3 → a second feature pyramid convolution module P2 → a fourth depth separable convolution inverse residual connection module C4 → a third feature pyramid convolution module P3 → a second maximum pooling layer → a fourth feature pyramid convolution module P4 → a third maximum pooling layer → a fifth feature pyramid convolution module P5 → a current stage feature map output layer;
(4) constructing a fully-convoluted FCN detection subnetwork:
(4a) constructing a full convolution FCN classification subnet with the structure: classification subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → classification subnet output layer; the classification subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the classification subnet and performs classification detection in turn;

(4b) constructing a full convolution FCN regression subnet with the structure: regression subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → regression subnet output layer; the regression subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the regression subnet and performs regression detection in turn;
(5) constructing and training a deep convolution target detection network:
(5a) constructing a deep convolution target detection network: sequentially constructing the deep convolution target detection network from the deep convolution feature extraction sub-network, the full convolution FCN classification subnet and the full convolution FCN regression subnet, with the structure: original image input layer → deep convolution feature extraction sub-network → full convolution FCN classification and regression subnets;

(5b) training the deep convolution target detection network: training the deep convolution target detection network with the training data set and the verification data set as input to obtain the trained deep convolution target detection network, and saving its weight file;
(6) constructing and training a target detection network based on a deep evolution pruning convolution network:
(6a) performing layer-by-layer DNA coding on the convolution filters participating in pruning in the trained deep convolution target detection network, and recording the coding as DNA_{1,...,l-1,l};

(6b) optimizing the DNA_{1,...,l-1,l} coding with an evolutionary algorithm to obtain the final optimized coding DNA'_{1,...,l-1,l};

(6c) combining the optimized coding DNA'_{1,...,l-1,l} with the pruning rule to construct the target detection network based on the deep evolution pruning convolution network, the pruning rule being that a code of 0 means the convolution filter is finally pruned and a code of 1 means the convolution filter is finally retained; fine-tuning with the training data set to obtain the trained target detection network based on the deep evolution pruning convolution network, i.e. the trained model, and saving the trained model weight file;
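By way of illustration only, the pruning rule of step (6c) can be sketched as follows; PyTorch and the helper name prune_conv_by_dna are assumptions of this sketch, not part of the patent:

```python
import torch
import torch.nn as nn

def prune_conv_by_dna(conv: nn.Conv2d, dna_bits):
    """Keep only the output filters whose DNA bit is 1 (0 = pruned)."""
    keep = [i for i, bit in enumerate(dna_bits) if bit == 1]
    pruned = nn.Conv2d(conv.in_channels, len(keep),
                       kernel_size=conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])   # copy the surviving filters
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

# Example: a layer with 8 filters, of which 5 survive pruning.
layer = nn.Conv2d(16, 8, 3, padding=1)
dna = [1, 0, 1, 1, 0, 1, 0, 1]
print(prune_conv_by_dna(layer, dna))  # Conv2d(16, 5, ...)
```

In a real network the input channels of the following layer would also be pruned to match; that bookkeeping, and the subsequent fine-tuning, are omitted here.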
(7) and carrying out target detection on the test data set by using the trained model:
(7a) sequentially inputting the data blocks in the test data set into a trained target detection network based on a deep evolution pruning convolutional network to obtain a candidate frame of each data block in the test data set, a classification confidence score corresponding to the candidate frame and a target category corresponding to the candidate frame;
(7b) discarding all candidate frames whose classification confidence score for the target category is lower than the threshold 0.3, and performing non-maximum suppression on the retained candidate frames;
(7c) mapping the coordinates of all the retained candidate frames onto the optical remote sensing image before cutting, and performing a second non-maximum suppression to obtain the final detection result image of the optical remote sensing image.
The invention discloses a remote sensing image target detection method based on a deep evolution pruning convolution network, which mainly solves the problem that existing remote sensing image target detection technology cannot simultaneously, globally and effectively optimize detection speed and target detection precision.
Compared with the prior art, the invention has the following advantages:
an optimization scheme is provided: the method provided by the invention optimizes the model accuracy and the operation speed simultaneously, has obvious advantages in the aspects of calculation complexity and operation speed, and improves the model accuracy compared with the prior art.
The target detection network is combined with a global dynamic pruning method based on an evolutionary algorithm to realize network acceleration: a brand-new global and dynamic pruning scheme based on an evolutionary algorithm is provided, and redundant filters are removed by pruning to accelerate the CNN. Most previous approaches prune filters sequentially in a fixed layer-by-layer manner, which cannot dynamically recover previously removed filters, ignores complex associations between filters, offers poor flexibility and can significantly degrade network evaluation performance. The invention jointly codes the filters to be pruned in all layers, optimizes the network to be pruned with an evolutionary algorithm, uses the performance of the network on a test data set as the fitness of the evolutionary algorithm, and completes the iterative optimization of the network structure by retraining, so that the model achieves an ideal acceleration effect while its performance is guaranteed.

The model parameter amount is greatly reduced: the remote sensing image target detection method based on the deep evolution pruning convolution network replaces the standard convolutions in a ResNet network with depth separable convolutions, and after the 1 × 1 point-by-point convolution in the depth separable convolution unit uses a linear activation function rather than a ReLU activation function so that the feature information is not damaged. This maintains the detection precision of the model while reducing the parameter amount and computation needed to fit the data, accelerates the convergence of the network model, overcomes the loss of running speed caused by the large parameter counts of prior-art networks, and makes the method applicable to computing devices with limited computing and storage resources.

High accuracy is maintained while the parameter amount is reduced: a traditional residual connecting structure uses a 1 × 1 convolution layer to reduce and then restore the channel dimension of the input feature map, but the dimension reduction compresses the features and removes part of the useful feature information in the image, and the reduced target feature information lowers the detection accuracy of the model. The invention designs an inverse residual connecting module which first doubles the number of channels of the input feature map to obtain more image feature information and then applies a depth separable unit to extract features and reduce the channel dimension, so that high accuracy is maintained while the parameter amount is reduced and the network runs faster.
Drawings
FIG. 1 is a flow chart of the present invention;
fig. 2 is a block diagram of a depth separable convolution element of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings.
Example 1:
Remote sensing image target detection is an application of great interest in the field of remote sensing image processing and analysis, for example judging whether targets such as airplanes and ships exist in a remote sensing image and identifying, classifying and accurately positioning them. With the continuous development of satellite technology, the data volume of optical remote sensing images grows ever larger, while airplane and ship targets are small and sparse compared with the vast sea areas around them, so detecting them quickly and accurately in massive optical remote sensing images is a challenging task. Existing remote sensing image target detection techniques usually focus on learning the feature information of the target better in order to improve detection accuracy, but the huge data volume of remote sensing images and the huge parameters of network models greatly limit the detection speed.
The invention develops research aiming at the current situation, provides a remote sensing image target detection method taking detection accuracy and detection speed into consideration, in particular to a remote sensing image target detection method based on a deep evolution pruning convolution network, and the method is shown in figure 1 and comprises the following steps:
(1) processing the training dataset and the validation dataset: selecting a plurality of optical remote sensing images containing various targets and cutting them into 512 × 512 image blocks, of which 70% form the training data set and 30% form the verification data set, and performing data enhancement on the training data set.
(1a) Inputting a plurality of large-scale optical remote sensing images containing various targets to be processed.
(1b) And marking the targets by using a marking tool for a plurality of large-amplitude optical remote sensing images containing various targets.
(1c) cutting the optical remote sensing image into 512 × 512 image blocks centered on each target.
(1d) And naming each cut image block according to a data set naming rule, and forming a training data set and a verification data set by all named image blocks, wherein the training data set accounts for 70%, the verification data set accounts for 30%, and data enhancement is performed on the training data set.
The data set naming rule is that the file name of each remote sensing image to be cut is joined, with an underscore '_' character, to the sliding-window step number of the corresponding cut data block, generating a .jpg file.
(2) Processing the test data set: inputting a plurality of optical remote sensing images containing various targets, and cutting the images into image blocks with 512 x 512 pixels to form a test data set.
(2a) Inputting another plurality of large-scale optical remote sensing images containing various targets to be processed.
(2b) And marking the target by using a marking tool for the large-amplitude optical remote sensing image to be tested.
(2c) setting the overlap to 100 pixels in an overlapping sliding-window manner, the picture is sequentially cut into 512 × 512 image blocks, as sketched after step (2d).
(2d) And naming each cut image block according to a data set naming rule, and forming a test data set by all the named image blocks.
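A minimal sketch of the overlapping sliding-window cutting of steps (2c)-(2d), assuming NumPy arrays and illustrative function names:

```python
import numpy as np

def origins(size, block, stride):
    """Window origins along one axis; the last window stays flush with the border."""
    xs = list(range(0, size - block + 1, stride))
    if xs[-1] != size - block:
        xs.append(size - block)
    return xs

def crop_with_overlap(image, block=512, overlap=100):
    """Cut an H x W x C image into block x block tiles whose windows overlap by `overlap` pixels."""
    stride = block - overlap   # 412-pixel step between window origins
    h, w = image.shape[:2]
    return [((top, left), image[top:top + block, left:left + block])
            for top in origins(h, block, stride)
            for left in origins(w, block, stride)]

tiles = crop_with_overlap(np.zeros((2048, 2048, 3), dtype=np.uint8))
print(len(tiles))  # 25 tiles for a 2048 x 2048 image
```

Keeping each tile's origin allows the detection boxes to be mapped back onto the uncut image in step (7c).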
(3) Constructing a deep convolution feature extraction sub-network: respectively constructing a depth separable convolution inverse residual connecting module and a feature pyramid convolution module, then sequentially using a 7 × 7 convolution layer and a maximum pooling layer and alternately connecting the depth separable convolution inverse residual connecting modules and the feature pyramid convolution modules to form the deep convolution feature extraction sub-network.
The specific structure of the sub-network for extracting the depth convolution features is that an original image input layer → 7 × 7 convolution layer → a first maximum pooling layer → a first depth separable convolution inverse residual connection module C1 → a second depth separable convolution inverse residual connection module C2 → a first feature pyramid convolution module P1 → a third depth separable convolution inverse residual connection module C3 → a second feature pyramid convolution module P2 → a fourth depth separable convolution inverse residual connection module C4 → a third feature pyramid convolution module P3 → a second maximum pooling layer → a fourth feature pyramid convolution module P4 → a third maximum pooling layer → a fifth feature pyramid convolution module P5 → a current stage feature map output layer.
(4) Constructing a fully-convoluted FCN detection subnetwork:
(4a) constructing a full convolution FCN classification subnet with the structure: classification subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → classification subnet output layer; the classification subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the classification subnet and performs classification detection, the sizes of the input feature maps being 64 × 64, 32 × 32, 16 × 16, 8 × 8 and 4 × 4 respectively.
Calculating the classification features output by the fifth 3 × 3 convolutional layer to obtain the classification confidence of each default frame in each classification category, inputting the feature values of the feature map output by the fifth 3 × 3 convolutional layer into a sigmoid function, and outputting the probability that the default frame belongs to the corresponding category, namely the classification confidence of the default frame in each category, wherein the calculation formula of the sigmoid function is as follows:
σ(x) = 1 / (1 + e^(−x))
wherein x represents a characteristic value of a characteristic diagram input into the sigmoid function.
(4b) constructing a full convolution FCN regression subnet with the structure: regression subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → regression subnet output layer; the regression subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the regression subnet and performs regression detection, the sizes of the input feature maps being 64 × 64, 32 × 32, 16 × 16, 8 × 8 and 4 × 4 respectively.
(5) Constructing and training a deep convolution target detection network:
(5a) constructing a deep convolution target detection network: the method comprises the steps of sequentially building a deep convolution target detection network by using a deep convolution feature extraction sub-network, a full convolution FCN classification sub-network and a full convolution FCN regression sub-network, wherein the structure of the deep convolution target detection network is that an original image input layer → the deep convolution feature extraction sub-network → the full convolution FCN classification regression sub-network.
(5b) Training a deep convolution target detection network: and training the deep convolution target detection network by using the training data set and the verification data set as input to obtain the trained deep convolution target detection network, and storing a weight file of the trained deep convolution target detection network.
(6) Constructing and training a target detection network based on a deep evolution pruning convolution network:
(6a) performing layer-by-layer DNA coding on the convolution filters participating in pruning in the trained deep convolution target detection network, and recording the coding as DNA_{1,...,l-1,l}.

(6b) optimizing the DNA_{1,...,l-1,l} coding with an evolutionary algorithm to obtain the final optimized coding DNA'_{1,...,l-1,l}.

(6c) combining the optimized coding DNA'_{1,...,l-1,l} with the pruning rule to construct the target detection network based on the deep evolution pruning convolutional network, the pruning rule being that a code of 0 means the convolution filter is finally pruned and a code of 1 means it is finally retained; fine-tuning with the training data set to obtain the trained target detection network based on the deep evolution pruning convolutional network, i.e. the trained model, and saving the trained model weight file.
(7) And carrying out target detection on the test data set by using the trained model:
(7a) and sequentially inputting the data blocks in the test data set into a trained target detection network based on the deep evolution pruning convolutional network to obtain a candidate frame of each data block in the test data set, a classification confidence score corresponding to the candidate frame and a target category corresponding to the candidate frame.
(7b) discarding all candidate frames whose classification confidence score for the target category is lower than the threshold 0.3, and performing non-maximum suppression on the retained candidate frames.
(7c) mapping the coordinates of all the retained candidate frames onto the optical remote sensing image before cutting, and performing a second non-maximum suppression to obtain the final detection result image of the optical remote sensing image.
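A simplified sketch of steps (7b)-(7c); the greedy IoU-based non-maximum suppression and the IoU threshold of 0.5 are common-practice assumptions, since the patent fixes only the 0.3 confidence threshold:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; boxes are (x1, y1, x2, y2) rows."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        order = rest[inter / (area_i + areas - inter) <= iou_thresh]
    return keep

def postprocess_tile(boxes, scores, tile_origin, conf_thresh=0.3):
    """Step (7b): drop boxes scoring below 0.3 and suppress duplicates; step (7c):
    shift the survivors by the tile origin to map them onto the uncut image."""
    m = scores >= conf_thresh
    boxes, scores = boxes[m], scores[m]
    keep = nms(boxes, scores)
    top, left = tile_origin
    return boxes[keep] + np.array([left, top, left, top]), scores[keep]

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [200, 200, 240, 260]], float)
scores = np.array([0.9, 0.6, 0.2])
print(*postprocess_tile(boxes, scores, (512, 412)), sep="\n")
```

In the full pipeline, the boxes mapped back from all tiles would then be pooled and the second non-maximum suppression applied over the whole image.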
The invention optimizes the detection accuracy and the running speed of the target detection network at the same time, providing an optimization scheme that achieves both; it has obvious advantages in computational complexity and running speed and improves model accuracy compared with the prior art.
The idea of the invention is as follows: firstly, constructing an inverse residual error network based on a depth separable convolution unit capable of greatly reducing the parameter quantity of a model to extract the basic characteristics of an input image, taking the basic characteristics as the input of a characteristic pyramid convolution network to perform more precise characteristic extraction, using a full convolution FCN classification subnet and a full convolution FCN regression subnet to perform detection, and finally performing global evolution pruning optimization to realize network acceleration. The extracted features are more suitable for the remote sensing image target detection task, the accuracy rate of remote sensing image target detection can be improved, and the network operation speed is greatly improved.
Example 2:
the remote sensing image target detection method based on the deep evolution pruning convolution network is the same as the method in the embodiment 1, and the step (3) of constructing the deep convolution feature extraction sub-network comprises the following specific steps:
(3a) constructing a depth separable convolution inverse residual connecting module: the module structure is that the characteristic diagram input layer of the previous stage → 1 × 1 convolution layer → depth separable convolution unit → point-by-point addition layer → characteristic diagram output layer of the current stage.
The 1 × 1 convolution layer in the inverse residual connecting module and the depth separable convolution unit appear as a pair, and the point-by-point addition layer is a feature processing layer formed by adding, point by point, the output feature map of the preceding depth separable convolution unit and the feature map from the input layer of the inverse residual connecting module.

"Inverse residual" means that while the traditional residual connecting structure first reduces and then increases the channel dimension of the input feature map, the inverse residual connecting module first increases and then reduces it: the 1 × 1 convolution layer increases the channel dimension of the input feature map 2 times, and the depth separable convolution unit performs feature extraction and a 2-times channel dimension reduction, so that the number of channels of the output feature map of the constructed depth separable convolution inverse residual connecting module is consistent with that of the input feature map.
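A minimal PyTorch-style sketch of the inverse residual connecting module described above; the class and layer names are illustrative, while the 2-times expansion and the linear activation after the point-by-point convolution follow the text:

```python
import torch
import torch.nn as nn

class DepthSeparableUnit(nn.Module):
    """3x3 depth conv -> BN -> ReLU -> 1x1 point-by-point conv -> BN -> linear activation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.relu = nn.ReLU(inplace=True)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)  # 2x channel reduction here
        self.bn2 = nn.BatchNorm2d(out_ch)                         # linear: no ReLU after this

    def forward(self, x):
        return self.bn2(self.pointwise(self.relu(self.bn1(self.depthwise(x)))))

class InverseResidual(nn.Module):
    """1x1 conv expands channels 2x, the separable unit extracts features and reduces
    channels back, and the input is added point by point (channel count unchanged)."""
    def __init__(self, channels):
        super().__init__()
        self.expand = nn.Conv2d(channels, 2 * channels, 1, bias=False)  # 2x dimension increase
        self.unit = DepthSeparableUnit(2 * channels, channels)          # 2x dimension reduction

    def forward(self, x):
        return x + self.unit(self.expand(x))   # point-by-point addition layer

x = torch.randn(1, 64, 128, 128)
print(InverseResidual(64)(x).shape)  # torch.Size([1, 64, 128, 128])
```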
(3b) Constructing a characteristic pyramid convolution module: the feature pyramid convolution module is a two-layer input and single-layer output structure, and the module structure is input feature map 1 → the first convolution layer of input feature map 1 → two times of upsampling layer → output feature map 1, input feature map 2 → the first convolution layer of input feature map 2 → output feature map 2, point-by-point addition layer → second convolution layer → current stage feature map output layer.
The input feature map 1 is a stage feature map with the same size as an input feature map and an output feature map in a depth separable convolution inverse residual connection module, the input feature map 2 is a feature map with the same spatial size as the output feature map 1 in the inverse residual connection module, the point-by-point addition layer is a feature processing layer formed by point-by-point addition of the output feature map 1 and the output feature map 2, and the double-up sampling layer amplifies the scale of the input feature map 1 processed by the first convolution layer through a bilinear interpolation algorithm.
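A sketch of this two-input, one-output module, assuming PyTorch; the 1 × 1 lateral convolutions and the 3 × 3 output convolution match the parameter settings given later in embodiment 6, while the channel numbers are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramidModule(nn.Module):
    def __init__(self, ch1, ch2, out_ch):
        super().__init__()
        self.lateral1 = nn.Conv2d(ch1, out_ch, 1)   # first conv layer of input feature map 1
        self.lateral2 = nn.Conv2d(ch2, out_ch, 1)   # first conv layer of input feature map 2
        self.smooth = nn.Conv2d(out_ch, out_ch, 3, padding=1)  # second conv layer

    def forward(self, feat1, feat2):
        up = F.interpolate(self.lateral1(feat1), scale_factor=2,
                           mode='bilinear', align_corners=False)  # two-times upsampling
        return self.smooth(up + self.lateral2(feat2))             # point-by-point addition

m = FeaturePyramidModule(256, 128, 256)
out = m(torch.randn(1, 256, 16, 16), torch.randn(1, 128, 32, 32))
print(out.shape)  # torch.Size([1, 256, 32, 32])
```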
(3c) Starting from a 7 × 7 convolution layer and a maximum pooling layer, the depth separable convolution inverse residual connecting modules are alternately connected with the feature pyramid convolution modules to construct the deep convolution feature extraction sub-network, whose specific structure is: original image input layer → 7 × 7 convolution layer → first maximum pooling layer → first depth separable convolution inverse residual connection module C1 → second depth separable convolution inverse residual connection module C2 → first feature pyramid convolution module P1 → third depth separable convolution inverse residual connection module C3 → second feature pyramid convolution module P2 → fourth depth separable convolution inverse residual connection module C4 → third feature pyramid convolution module P3 → second maximum pooling layer → fourth feature pyramid convolution module P4 → third maximum pooling layer → fifth feature pyramid convolution module P5 → current stage feature map output layer.
High accuracy is maintained while the parameter amount is reduced: the invention provides an inverse residual connecting module. A traditional residual connecting structure uses a 1 × 1 convolution layer to reduce and then restore the channel dimension of the input feature map, but the dimension reduction compresses the features and removes part of the useful feature information in the image, and the reduced target feature information lowers the detection accuracy of the model. The inverse residual connecting module of the invention first doubles the number of channels of the input feature map to obtain more image feature information and then applies a depth separable unit to extract features and reduce the channel dimension, so that a higher detection accuracy is maintained while the parameter amount is reduced and the network runs faster.
Example 3:
the remote sensing image target detection method based on the deep evolution pruning convolutional network is the same as that in embodiment 1-2, and the deep separable convolution unit in the step (3a) is shown in fig. 2, and the unit structure is that a feature map input layer in the last stage → 3 × 3 deep convolutional layer → first batch normalization layer → ReLU activation function layer → 1 × 1 point-by-point convolutional layer → second batch normalization layer → linear activation function layer → output feature map layer.
The depth separable convolution unit divides the standard convolution into depth convolution and point-by-point convolution to realize the space and channel separation and respective processing of the features, thereby greatly reducing the parameter quantity and the calculation complexity.
After the 1 × 1 point-by-point convolution layer, a ReLU activation function is no longer used; a linear activation function is used instead, which prevents the ReLU activation function from causing a large information loss on tensors with a low channel number and from damaging the feature information.
Assume the input feature map size is H_in × W_in × C_in, where H_in, W_in and C_in are the height, width and channel number of the input feature map, a convolution kernel of size K × K is used, and the output feature map size is H_out × W_out × C_out, where H_out, W_out and C_out are the height, width and channel number of the output feature map. Standard convolution considers the spatial and channel information of the input feature map simultaneously, and its computation amount is:

K × K × C_in × H_out × W_out × C_out

The depth separable convolution separates the channels of the input feature map from space using a 3 × 3 depth convolution and a 1 × 1 point-by-point convolution, which are processed separately, and its computation amount is:

K × K × C_in × H_out × W_out + C_in × H_out × W_out × C_out

The ratio of the computation amount of the depth separable convolution of the invention to that of the standard convolution is therefore:

(K × K × C_in × H_out × W_out + C_in × H_out × W_out × C_out) / (K × K × C_in × H_out × W_out × C_out) = 1/C_out + 1/K²

For a convolution kernel of size 3 × 3, the computation amount is reduced by a factor of about 9.
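This ratio can be checked numerically; the channel counts and map size below are arbitrary examples:

```python
K, C_in, C_out, H_out, W_out = 3, 256, 256, 64, 64

standard  = K * K * C_in * H_out * W_out * C_out                         # standard convolution
separable = K * K * C_in * H_out * W_out + C_in * H_out * W_out * C_out  # depthwise + pointwise

print(separable / standard)   # 0.115..., i.e. 1/C_out + 1/K^2
print(standard / separable)   # about 8.7x fewer multiply-accumulates
```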
The model parameter amount is greatly reduced: the remote sensing image target detection method based on the deep evolution pruning convolution network replaces the standard convolutions in a ResNet network with depth separable convolutions, and after the 1 × 1 point-by-point convolution in the depth separable convolution unit uses a linear activation function rather than a ReLU activation function so that the feature information is not damaged. This maintains the detection precision of the model while reducing the parameter amount and computation needed to fit the data, accelerates the convergence of the network model, overcomes the loss of running speed caused by the large parameter counts of prior-art networks, and makes the method applicable to computing devices with limited computing and storage resources.
Example 4:
the remote sensing image target detection method based on the deep evolution pruning convolutional network is the same as that of the embodiment 1-3, and the step (6a) of performing layer-by-layer DNA coding on the convolutional filter participating in pruning in the trained deep convolutional target detection network refers to the following steps:
(6a1) performing the convolution operation on the output feature map: denote the l-th layer feature map of the trained deep convolution target detection network, with height H_l, width W_l and channel number C_l, as the output feature map Z_l ∈ R^(H_l × W_l × C_l), and denote the feature sub-map of the k-th channel of the l-th layer feature map as Z_l^(k). Z_l^(k) is obtained by a convolution operation (*) between the parameters W_l^(k) of the corresponding convolution filter and the preceding layer feature map Z_(l-1), with f denoting the activation function; the calculation formula of Z_l^(k) is:

Z_l^(k) = f(Z_(l-1) * W_l^(k))

In common deep learning frameworks such as TensorFlow and Caffe, the convolution operation on tensors is converted into a matrix multiplication by reshaping the input and transposing the convolution filter; after this transformation the l-th layer output feature map Z_l* is given by:

Z_l* = f(Z_(l-1)* W_l*)

where Z_(l-1)* is the (l-1)-th layer feature map after the transformation of the convolution operation and W_l* is the parameter matrix of the convolution filters corresponding to the l-th layer feature map after the transformation.
(6a2) mask-coding the convolution filters that need pruning or retention: for the C_l convolution filters that produce the l-th layer output feature map of the trained deep convolution target detection network, introduce a mask m_l ∈ {0, 1}^(C_l) coding the filters that need pruning or retention, where a code of 0 means the convolution filter is pruned and a code of 1 means the convolution filter is retained; with ⊙ denoting the element-wise product, the convolution formula with global feature channel pruning becomes:

Z_l* = f((m_l ⊙ W_l*) Z_(l-1)*)
(6a3) coding the convolution filters participating in pruning layer by layer: using the trained target detection network based on depth separable convolution, the convolution filters participating in pruning are coded layer by layer; the coding of all layers to be pruned is recorded as DNA_{1,...,l-1,l} = {m_1, ..., m_(l-1), m_l}, where DNA_{1,...,l-1,l} denotes the DNA coding of the 1st to l-th layers to be pruned and m_(l-1) denotes the coding symbols of the (l-1)-th layer feature map.
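A small sketch of the mask coding of steps (6a2)-(6a3), assuming PyTorch and illustrative shapes; the mask silences pruned filters without removing them, so a wrongly pruned filter can later be restored by flipping its bit:

```python
import torch
import torch.nn.functional as F

def masked_conv(x, weight, mask):
    """Convolution with global feature channel pruning: mask is a {0, 1} vector
    over the output filters; code 0 silences (prunes) a filter, code 1 keeps it."""
    w = weight * mask.view(-1, 1, 1, 1)   # zero out the pruned filters' parameters
    return F.relu(F.conv2d(x, w, padding=1))

C_l = 8
weight = torch.randn(C_l, 16, 3, 3)      # W_l: C_l filters over 16 input channels
mask = torch.tensor([1., 0., 1., 1., 0., 1., 0., 1.])   # m_l, this layer's DNA segment
x = torch.randn(1, 16, 32, 32)
y = masked_conv(x, weight, mask)
print(y.shape, (y[0, 1] == 0).all().item())  # pruned channel 1 outputs all zeros
```

The DNA of the whole network is then the concatenation of the per-layer masks m_1, ..., m_l.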
According to the remote sensing image target detection method based on the deep evolution pruning convolution network, a pruning algorithm is used for the trained deep convolution target detection network to remove a large number of redundant convolution filters existing in the target detection network, so that the overfitting risk of the network is reduced, the network structure is greatly simplified, the remote sensing image target detection method is easier to deploy in embedded equipment due to less parameter quantity, and meanwhile, the reasoning speed is remarkably accelerated.
Most previous approaches prune filters sequentially in a fixed layer-by-layer manner, which cannot dynamically recover previously removed filters, ignores complex associations between filters, offers poor flexibility and can significantly degrade network evaluation performance. The invention jointly codes the filters to be pruned in all layers, which gives strong flexibility and makes full use of the associations between filters, improving network performance while accelerating the network.
Example 5:
The remote sensing image target detection method based on the deep evolution pruning convolution net is the same as in embodiments 1-4; the step (6b) of optimizing the DNA_{1,...,l-1,l} coding with an evolutionary algorithm to obtain the final optimized coding DNA'_{1,...,l-1,l} proceeds as follows:

(6b1) initialization: set the evolution generation counter t = 0, set the maximum number of generations T, and set the pruning ratio ratio_cut = 0.5; according to ratio_cut, randomly generate M individuals with DNA_{1,...,l-1,l} coding as the initial population P_0 = {DNA^1, ..., DNA^(M-1), DNA^M}, where DNA^(M-1) denotes the DNA coding of the (M-1)-th individual over the filters of layers 1 to l.
(6b2) adjusting network parameters with the training data set: for the population P_t of the t-th round, the networks generated from the individuals are retrained and their parameters adjusted, using the training data set and the masked convolution formula of global feature channel pruning Z_l* = f((m_l ⊙ W_l*) Z_(l-1)*).

(6b3) fitness calculation: using the verification data set and the masked convolution formula of global feature channel pruning, compute the fitness fitness^m of each individual of the population P_t, where the fitness is determined by L_val, the loss on the verification data set.
(6b4) generating new individuals: according to the fitness fitness^m, individuals with higher fitness are selected for crossover and mutation to generate new individuals; the crossover operation randomly crosses parent individuals according to the crossover probability p_c = 0.9, and the mutation operation randomly mutates parent individuals according to the mutation probability p_m = 0.9; through steps (6b1) to (6b4), the population P_t yields the next-generation population P_(t+1) after the selection, crossover and mutation operations.

(6b5) judging whether to terminate the evolution: if t = T, the individual with the maximum fitness obtained during the evolution is output as the optimal solution, the calculation terminates, its coding is recorded as DNA'_{1,...,l-1,l}, and step (6c) is executed to construct the target detection network based on the deep evolution pruning convolution network; otherwise, if t < T, return to step (6b2) and repeat steps (6b2) to (6b5) to continue the evolutionary optimization of the coding.
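The loop of steps (6b1)-(6b5) can be summarized by the following genetic-algorithm skeleton; the fitness function shown is a toy stand-in, whereas the patent derives fitness from the verification loss of the retrained masked network:

```python
import random

def evolve(num_filters, fitness, M=20, T=50, ratio_cut=0.5, p_c=0.9, p_m=0.9):
    """Optimize a binary filter-pruning code with a genetic algorithm.
    `fitness(dna)` should retrain the masked network and return a score
    that decreases with the verification loss L_val."""
    def random_dna():
        return [1 if random.random() > ratio_cut else 0 for _ in range(num_filters)]

    pop = [random_dna() for _ in range(M)]                  # (6b1) initial population
    for t in range(T):
        scored = sorted(pop, key=fitness, reverse=True)     # (6b2)-(6b3) retrain + score
        parents = scored[:M // 2]                           # (6b4) select the fitter half
        children = []
        while len(children) < M - len(parents):
            a, b = random.sample(parents, 2)
            if random.random() < p_c:                       # crossover at a random cut point
                cut = random.randrange(1, num_filters)
                a = a[:cut] + b[cut:]
            if random.random() < p_m:                       # mutation: flip one random bit
                i = random.randrange(num_filters)
                a = a[:i] + [1 - a[i]] + a[i + 1:]
            children.append(a)
        pop = parents + children
    return max(pop, key=fitness)                            # (6b5) best code DNA'

# toy fitness: prefer keeping about half of the filters (stands in for -L_val)
best = evolve(32, fitness=lambda d: -abs(sum(d) - 16))
print(sum(best), "filters kept out of 32")
```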
The invention combines the target detection network with a global dynamic pruning method based on an evolutionary algorithm and realizes network acceleration: redundant filters are removed by pruning, so the CNN is accelerated. Unlike the GDP method discussed in the background, no prior function needs to be designed in advance; the pruning process is optimized by a global dynamic evolutionary algorithm, which reduces the implementation difficulty. The invention jointly codes the filters to be pruned in all layers, optimizes the network to be pruned with the evolutionary algorithm, uses the performance of the network on the test set as the fitness of the evolutionary algorithm, and completes the iterative optimization of the network structure by retraining, so that the model finally achieves an ideal acceleration effect while its performance is guaranteed.
A more complete and thorough example is given below to further describe the present invention.
Example 6:
The remote sensing image target detection method based on the deep evolution pruning convolution network is the same as in embodiments 1 to 5; referring to FIG. 1,
step 1, processing and determining a training data set and a verification data set:
Inputting a plurality of large optical remote sensing images to be processed, which contain a plurality of targets, and marking the targets with a marking tool; cutting the optical remote sensing image into 512 × 512 image blocks centered on each target; naming each cut image block according to the data set naming rule, and forming a training data set and a verification data set from all named image blocks, the training data set accounting for 70% and the verification data set for 30%; and performing data enhancement operations on the image blocks in the training data set, such as image scale transformation, image translation, image rotation, image mirroring, image contrast and brightness adjustment and image noise addition (sketched at the end of this step), to form the final training data set.
The data set naming rule is that the file name of each remote sensing image to be cut is joined, with an underscore '_' character, to the sliding-window step number of the corresponding cut data block, generating a .jpg file.
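Some of the listed enhancement operations can be sketched with common array transforms (NumPy assumed; scale transformation and translation are omitted for brevity):

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator):
    """Randomly mirror, rotate by a multiple of 90 degrees, jitter contrast and
    brightness, and add Gaussian noise to one 512 x 512 training block."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                                   # horizontal mirror
    img = np.rot90(img, k=rng.integers(0, 4))                # rotation
    alpha, beta = rng.uniform(0.8, 1.2), rng.uniform(-20, 20)
    img = np.clip(alpha * img.astype(np.float32) + beta, 0, 255)  # contrast/brightness
    img = np.clip(img + rng.normal(0, 5, img.shape), 0, 255)      # Gaussian noise
    return img.astype(np.uint8)

rng = np.random.default_rng(0)
block = rng.integers(0, 256, (512, 512, 3), dtype=np.uint8)
print(augment(block, rng).shape)  # (512, 512, 3)
```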
Step 2, processing and determining a test data set:
Inputting another plurality of large optical remote sensing images to be processed, which contain various targets, and marking the targets on the large optical remote sensing images to be tested with a marking tool; setting the overlap to 100 pixels in an overlapping sliding-window manner, sequentially cutting the picture into 512 × 512 image blocks; and naming each cut image block according to the data set naming rule, all named image blocks forming the test data set.
And 3, constructing a deep convolution feature extraction sub-network, which comprises the following specific steps:
(3a) constructing a depth separable convolution inverse residual connecting module: the module structure is that the characteristic diagram input layer of the previous stage → 1 × 1 convolution layer → depth separable convolution unit → point-by-point addition layer → characteristic diagram output layer of the current stage.
The unit structure of the depth separable convolution unit is: previous stage feature map input layer → 3 × 3 depth convolution layer → first batch normalization layer → ReLU activation function layer → 1 × 1 point-by-point convolution layer → second batch normalization layer → linear activation function layer → output feature map layer; after the 1 × 1 point-by-point convolution layer a ReLU activation function is no longer used, a linear activation function being used instead to prevent the ReLU activation function from damaging the feature information.
Assume the input feature map size is denoted H_in × W_in × C_in, where H_in, W_in and C_in are the height, width and channel number of the input feature map, a convolution kernel of size K × K is used, and the output feature map size is denoted H_out × W_out × C_out, where H_out, W_out and C_out are the height, width and channel number of the output feature map. Standard convolution considers the spatial and channel information of the input feature map simultaneously, and its computation amount is:

K × K × C_in × H_out × W_out × C_out

The depth separable convolution separates the channels of the input feature map from space using a 3 × 3 depth convolution and a 1 × 1 point-by-point convolution, which are processed separately, and its computation amount is:

K × K × C_in × H_out × W_out + C_in × H_out × W_out × C_out

The ratio of the computation amount of the depth separable convolution of the invention to that of the standard convolution is therefore:

(K × K × C_in × H_out × W_out + C_in × H_out × W_out × C_out) / (K × K × C_in × H_out × W_out × C_out) = 1/C_out + 1/K²

For a convolution kernel of size 3 × 3, the computation amount is reduced by a factor of about 9, so a speed increase of 7 to 9 times can be achieved.
The 1 × 1 convolution layer in the inverse residual connecting module and the depth separable convolution unit appear as a pair, and the point-by-point addition layer is a feature processing layer formed by adding, point by point, the output feature map of the preceding depth separable convolution unit and the feature map from the input layer of the inverse residual connecting module.
The inverse residual connecting module means that the conventional residual connecting structure firstly reduces and then increases the dimension of the channel of the input feature map, and the inverse residual connecting module firstly increases and then decreases the dimension of the channel of the input feature map, wherein the 1 × 1 convolutional layer performs 2 times of dimension increase on the channel of the input feature map, the 3 × 3 depth convolutional layer in the depth separable convolution unit performs feature extraction, and the 1 × 1 point-by-point convolutional layer in the depth separable convolution unit performs 2 times of dimension reduction on the channel of the input feature map, so that the number of channels of the output feature map of the constructed depth separable convolution inverse residual connecting module is consistent with that of the input feature map.
(3b) Constructing a characteristic pyramid convolution module: the module is a double-layer input single-layer output structure, and the module structure is as follows, input feature diagram 1 → the first convolution layer of input feature diagram 1 → two times of upsampling layer → output feature diagram 1, input feature diagram 2 → the first convolution layer of input feature diagram 2 → output feature diagram 2, point-by-point addition layer → second convolution layer → current stage feature diagram output layer.
The input feature map 1 is a stage feature map with the same size as an input feature map and an output feature map in a depth separable convolution inverse residual connection module, the input feature map 2 is a feature map with the same spatial size as the output feature map 1 in the inverse residual connection module, the point-by-point addition layer is a feature processing layer formed by point-by-point addition of the output feature map 1 and the output feature map 2, and the double-up sampling layer amplifies the scale of the input feature map 1 processed by the first convolution layer through a bilinear interpolation algorithm.
The specific parameters of the feature pyramid convolution module are set as follows: the filter size of the first convolution layer for input feature map 1 is 1 × 1 with convolution step 1; the filter size of the first convolution layer for input feature map 2 is 1 × 1 with convolution step 1; and the filter size of the second convolution layer is 3 × 3 with convolution step 1.
The feature pyramid convolution module effectively takes the depth separable inverse residual convolutions as input and extracts features stage by stage, and it merges semantic features from higher layers by upsampling, so the network can effectively combine deep and shallow features and overcome the semantic gaps between feature maps of different stages; the deep and shallow features can then be applied effectively and simultaneously to classification and regression, which improves the overall accuracy of detecting small targets such as airplanes and ships in remote sensing images.
(3c) Starting from a 7 × 7 convolution layer and a maximum pooling layer, the depth separable convolution inverse residual connecting modules are alternately connected with the feature pyramid convolution modules to construct the deep convolution feature extraction sub-network, whose specific structure is: original image input layer → 7 × 7 convolution layer → first maximum pooling layer → first depth separable convolution inverse residual connection module C1 → second depth separable convolution inverse residual connection module C2 → first feature pyramid convolution module P1 → third depth separable convolution inverse residual connection module C3 → second feature pyramid convolution module P2 → fourth depth separable convolution inverse residual connection module C4 → third feature pyramid convolution module P3 → second maximum pooling layer → fourth feature pyramid convolution module P4 → third maximum pooling layer → fifth feature pyramid convolution module P5 → current stage feature map output layer.
And 4, constructing a full convolution FCN detection sub-network:
constructing a full convolution FCN classification subnet: the feature map of each feature pyramid convolution module is selected in turn as the input; the structure is: classification subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → classification subnet output layer.
The classification subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the classification subnet for classification detection; the input feature map sizes are 64 × 64, 32 × 32, 16 × 16, 8 × 8 and 4 × 4 respectively.
The parameters of the full convolution FCN classification subnet are set as follows:

The first four layers each perform a 3 × 3 convolution with convolution stride 1.

The fifth layer performs a convolution on the output of the fourth 3 × 3 convolutional layer to obtain the classification features; the convolution stride is 1 and the number of filters is 9 × 2, where "9" is the number of default frames corresponding to each pixel of the feature map output by the fourth 3 × 3 convolutional layer and "2" is the number of classification categories of the classification subnet.
The classification features output by the fifth 3 × 3 convolutional layer are used to obtain the classification confidence of each default frame for each classification category: each feature value of the feature map output by the fifth 3 × 3 convolutional layer is fed into a sigmoid function, which outputs the probability that the default frame belongs to the corresponding category, i.e., the classification confidence of the default frame for that category. The sigmoid function is calculated as:

sigmoid(x) = 1 / (1 + e^(-x))

where x denotes a feature value of the feature map input into the sigmoid function.
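For concreteness, a hedged PyTorch sketch of this classification subnet is given below; the 256-channel width and function names are assumptions, while the four stride-1 3 × 3 convolutions, the 9 × 2 output filters and the sigmoid follow the text:

import torch
import torch.nn as nn

def make_cls_subnet(in_ch: int = 256, num_anchors: int = 9, num_classes: int = 2):
    layers = []
    for _ in range(4):                        # first four 3x3 convolutions, stride 1
        layers += [nn.Conv2d(in_ch, in_ch, 3, stride=1, padding=1), nn.ReLU()]
    # fifth 3x3 convolution: 9 * 2 filters -> per-pixel, per-default-frame class scores
    layers.append(nn.Conv2d(in_ch, num_anchors * num_classes, 3, stride=1, padding=1))
    return nn.Sequential(*layers)

cls_head = make_cls_subnet()
scores = torch.sigmoid(cls_head(torch.randn(1, 256, 64, 64)))  # confidences in (0, 1)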
Constructing a full convolution FCN regression subnet: the feature map of each feature pyramid convolution module is selected in turn as the input; the structure is: regression subnet input layer → first 3 × 3 convolution layer → second 3 × 3 convolution layer → third 3 × 3 convolution layer → fourth 3 × 3 convolution layer → fifth 3 × 3 convolution layer → regression subnet output layer.
The regression subnet input layer takes the feature map of each feature pyramid convolution module in turn as the input of the regression subnet for regression detection; the input feature map sizes are 64 × 64, 32 × 32, 16 × 16, 8 × 8 and 4 × 4 respectively.
The parameters of the full convolution FCN regression subnet are set as follows:

The first four layers each perform a 3 × 3 convolution with convolution stride 1.

The fifth layer performs a convolution on the output of the fourth 3 × 3 convolutional layer to obtain the default frame position offsets; the convolution stride is 1 and the number of filters is 9 × 4, where "9" is the number of default frames corresponding to each pixel of the feature map output by the fourth 3 × 3 convolutional layer and "4" is the number of position offsets for the four coordinate values of the default frame's top-left and bottom-right corners.
The two full convolution FCN subnets are independent of each other and do not share parameters.
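A companion sketch for the regression subnet follows, under the same assumptions as the classification sketch above; the fifth 3 × 3 convolution emits 9 × 4 filters (9 default frames per pixel, 4 coordinate offsets each), and the head is built as a separate module so that no parameters are shared with the classification subnet:

import torch
import torch.nn as nn

def make_reg_subnet(in_ch: int = 256, num_anchors: int = 9, num_offsets: int = 4):
    layers = []
    for _ in range(4):                        # first four 3x3 convolutions, stride 1
        layers += [nn.Conv2d(in_ch, in_ch, 3, stride=1, padding=1), nn.ReLU()]
    # fifth 3x3 convolution: 9 * 4 filters -> per-pixel, per-default-frame offsets
    layers.append(nn.Conv2d(in_ch, num_anchors * num_offsets, 3, stride=1, padding=1))
    return nn.Sequential(*layers)

reg_head = make_reg_subnet()
offsets = reg_head(torch.randn(1, 256, 64, 64))   # shape (1, 36, 64, 64)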
Step 5, constructing and training a deep convolution target detection network:
constructing a deep convolution target detection network: the deep convolution feature extraction sub-network, the full convolution FCN classification subnet and the full convolution FCN regression subnet are assembled in sequence into the deep convolution target detection network, whose structure is: original image input layer → deep convolution feature extraction sub-network → full convolution FCN classification and regression subnets.
Training the deep convolution target detection network: the network is trained with the training data set and the verification data set as input, yielding the trained deep convolution target detection network, whose weight file is saved.
Step 6, constructing and training the target detection network based on the deep evolution pruning convolution network:
(6a) Performing layer-by-layer DNA coding on the convolution filters participating in pruning in the trained deep convolution target detection network, recording the coding as DNA_{1,...,l-1,l}, as follows:
(6a1) Expressing the output feature map by the convolution operation: record the l-th layer feature map of the trained deep convolution target detection network, with height H_l, width W_l and C_l channels, as

Z_l ∈ R^(H_l × W_l × C_l)

and record the feature sub-map of the k-th channel of the l-th layer as Z_l^(k). Then Z_l^(k) is obtained by convolving (*) the previous layer's feature map Z_{l-1} with the parameters W_l^(k) of the corresponding convolution filter, with f denoting the activation function:

Z_l^(k) = f(Z_{l-1} * W_l^(k))
In common deep learning frameworks such as TensorFlow and Caffe, the tensor convolution is converted into a matrix multiplication by reshaping the input and transposing the convolution filter. Writing Z*_{l-1} for the (l-1)-th layer feature map after this transformation and W*_l for the parameter matrix of the convolution filters corresponding to the l-th layer, the l-th layer feature map Z*_l is given by:

Z*_l = f(Z*_{l-1} W*_l)
(6a2) Mask-coding each convolution filter to be pruned or retained: for the l-th layer of the trained deep convolution target detection network, with C_l output channels, introduce a mask m_l^(k) ∈ {0, 1} for every convolution filter that is to be pruned or retained; a code of 0 indicates that the convolution filter is pruned, and a code of 1 indicates that it is retained. With ⊙ denoting the element-wise (inner) product of the mask with the filter parameters, the global feature channel pruning convolution formula Z*_l = f(Z*_{l-1} W*_l) changes to:

Z*_l = f(Z*_{l-1} (m_l ⊙ W*_l))
(6a3) Coding the convolution filters participating in pruning layer by layer: using the trained deep convolution target detection network, the filters participating in pruning are coded layer by layer; the coding containing all layers to be pruned is recorded as

DNA_{1,...,l-1,l} = {m_1, m_2, ..., m_{l-1}, m_l}

where DNA_{1,...,l-1,l} denotes the DNA codes of the 1st to l-th layers to be pruned, and m_{l-1} denotes the coding symbols of layer l-1.
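A toy Python sketch of this mask/DNA representation is given below; all function names, shapes and the example layer sizes are assumptions:

import numpy as np

def random_dna(filters_per_layer, prune_ratio=0.5, rng=np.random.default_rng(0)):
    """One 0/1 gene per convolution filter; roughly prune_ratio of genes are 0 (pruned)."""
    return [(rng.random(n) >= prune_ratio).astype(np.int8) for n in filters_per_layer]

def apply_mask(feature_map, mask):
    """feature_map: (H, W, C_l); mask: (C_l,) of 0/1 -> zero out pruned channels."""
    return feature_map * mask[None, None, :]

dna = random_dna([16, 32, 64])          # DNA_{1,...,l} for a 3-layer example
z = np.random.randn(8, 8, 16)
z_masked = apply_mask(z, dna[0])        # channels with gene 0 are suppressed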
(6b) Optimizing the DNA_{1,...,l-1,l} coding with an evolutionary algorithm to obtain the final optimized result coding DNA'_{1,...,l-1,l}, as follows:
(6b1) Initialization: set the evolution generation counter t = 0, set the maximum number of generations T, and set the pruning ratio ratio_cut = 0.5; according to ratio_cut, randomly generate M individuals with the DNA_{1,...,l-1,l} coding as the initial population

P_0 = {DNA^1, DNA^2, ..., DNA^M}

where DNA^m denotes the DNA coding of the m-th individual over the filters of layers 1 to l.
(6b2) Adjusting network parameters with the training data set: for the t-th round population P_t, retrain each generated network and adjust its parameters using the training data set and the masked global feature channel pruning convolution formula Z*_l = f(Z*_{l-1} (m_l ⊙ W*_l)).
(6b3) Fitness calculation: using the verification data set and the masked global feature channel pruning convolution formula, compute the fitness of each individual in the population P_t from L_val, the loss of the corresponding pruned network on the verification data set; lower verification loss corresponds to higher fitness.
(6b4) Generating new individuals: based on individual fitness, select the individuals with higher fitness for crossover and mutation to generate new individuals; the crossover operation randomly crosses parent individuals with crossover probability p_c = 0.9, and the mutation operation randomly mutates parent individuals with mutation probability p_m = 0.9. Through steps (6b1) to (6b4), the population P_t undergoes selection, crossover and mutation to yield the next generation population P_{t+1}.
(6b5) Judging whether to terminate the evolution: if t = T, output the individual with the maximum fitness obtained during the evolution as the optimal solution, terminate the calculation, record its coding as DNA'_{1,...,l-1,l}, and execute step (6c) to construct the target detection network based on the deep evolution pruning convolution network; otherwise, if t < T, return to step (6b2) and repeat steps (6b2) to (6b5) to continue the evolutionary optimization of the coding.
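To make steps (6b1)-(6b5) concrete, here is a compact Python sketch of the evolutionary loop; the 0.5 pruning ratio and the 0.9 crossover and mutation probabilities follow the text, while the population size, fitness-proportional selection, single-point crossover and per-gene mutation scaling are illustrative assumptions (a real run would retrain the masked network inside eval_val_loss as in step (6b2)):

import numpy as np

rng = np.random.default_rng(0)

def evolve(n_genes, eval_val_loss, M=20, T=10, p_c=0.9, p_m=0.9, ratio_cut=0.5):
    pop = (rng.random((M, n_genes)) >= ratio_cut).astype(np.int8)   # (6b1)
    best, best_fit = None, -np.inf
    for t in range(T):
        fit = np.array([1.0 / (eval_val_loss(ind) + 1e-8) for ind in pop])  # (6b3)
        if fit.max() > best_fit:
            best, best_fit = pop[fit.argmax()].copy(), fit.max()
        # (6b4) fitness-proportional selection of parents
        parents = pop[rng.choice(M, size=M, p=fit / fit.sum())]
        children = parents.copy()
        for i in range(0, M - 1, 2):                # single-point crossover
            if rng.random() < p_c:
                cut = rng.integers(1, n_genes)
                children[i, cut:], children[i + 1, cut:] = \
                    parents[i + 1, cut:].copy(), parents[i, cut:].copy()
        flip = rng.random(children.shape) < p_m / n_genes  # sparse bit-flip mutation
        pop = np.where(flip, 1 - children, children)
    return best                                      # coding DNA' at termination (6b5)

# eval_val_loss(dna) would briefly retrain the masked network (6b2) and return its
# verification loss; here a stand-in rewards keeping about half of the filters.
dna_opt = evolve(64, lambda d: abs(d.mean() - 0.5) + 0.1)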
(6c) Combining the optimized result coding DNA'_{1,...,l-1,l} with the pruning rule, construct the target detection network based on the deep evolution pruning convolution network; the pruning rule is that a code of 0 means the convolution filter is finally pruned and a code of 1 means it is finally retained. Fine-tune the pruned network on the training data set to obtain the trained target detection network based on the deep evolution pruning convolution network, i.e., the trained model, and save the trained model's weight file.
Step 7, performing target detection on the test data set with the trained model:
The data blocks in the test data set are input in sequence into the trained target detection network based on the deep evolution pruning convolutional network, yielding for each data block its candidate frames, the classification confidence score of each candidate frame, and the target category of each candidate frame.
All candidate frames whose classification confidence score for their target category is below the threshold 0.3 are discarded, and non-maximum suppression is applied to the remaining frames. Non-maximum suppression means: sort all detection frames by classification confidence score from high to low; keep candidate frames with low mutual overlap and high scores, and discard candidate frames with high mutual overlap and low scores; repeat until the lowest-confidence frame in the current detection sequence has been processed. This gives the detection results higher accuracy and a lower false-alarm rate. The threshold can be adjusted according to the actual situation.
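The procedure described here is standard greedy non-maximum suppression; the following Python sketch implements it (the IoU cutoff of 0.5 is an assumed value):

import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,). Returns indices of kept frames."""
    order = scores.argsort()[::-1]                  # highest confidence first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # IoU of the top frame with every remaining frame
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                  (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thresh]        # drop high-overlap, lower-score frames
    return keep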
The coordinates of all retained candidate frames are then mapped back onto the optical remote sensing image before cutting, and a second round of non-maximum suppression is performed to obtain the final detection result image of the optical remote sensing image.
By constructing the inverse residual connection structure with depth separable convolutions, the method greatly reduces model parameters and computation while maintaining high detection accuracy; global evolutionary pruning on top of the deep convolution target detection network achieves network acceleration, greatly improving the model's overall detection speed while effectively improving the detection precision for airplanes and ships in optical remote sensing images.
The effect of the present invention is further illustrated by the following simulation experiment.
Simulation conditions:
The simulation experiment was carried out on two Intel Xeon E5-2697 v4 CPUs with a main frequency of 2.4 GHz, 64 GB of memory and two GeForce GTX 1080 GPUs, using the Darknet framework under a Linux system.
Simulation content and result analysis:
The simulation experiment applies the method of the present invention and the prior-art RetinaNet method with global dynamic pruning (GDP) to target detection on a Quickbird satellite remote sensing image of the Hong Kong International Airport area.
Two indexes, precision and mean average precision (mAP), are used to evaluate the optical remote sensing image target detection results of the present invention and of the prior-art RetinaNet + GDP, computed as follows:

Recall = number of correctly detected targets / total number of actual targets

Precision = number of correctly detected targets / total number of detected targets

A precision-recall curve is drawn; the detection precision AP of the target detection is the area under the curve, and averaging the APs over the categories gives the mean average precision mAP.
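A short numpy sketch of this AP computation follows; it assumes the matching of detections to ground truth has already been done and integrates the precision-recall curve point by point:

import numpy as np

def average_precision(confidences, is_correct, n_actual):
    """confidences: (N,) scores; is_correct: (N,) 1 if a detection matches a
    ground-truth target; n_actual: total number of actual targets."""
    order = np.argsort(confidences)[::-1]
    tp = np.cumsum(np.asarray(is_correct)[order])
    recall = tp / n_actual                          # correct detections / actual targets
    precision = tp / np.arange(1, len(tp) + 1)      # correct detections / detections
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):             # area under the P-R curve
        ap += (r - prev_r) * p
        prev_r = r
    return ap

ap = average_precision([0.9, 0.8, 0.6], [1, 0, 1], n_actual=3)
mAP = np.mean([ap])                                 # average the APs over all categories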
Table 1 lists the airplane detection precision, ship detection precision and mAP of the present invention and of the prior-art RetinaNet + GDP.
Table 1. Detection precision results of the simulation experiment

Method                         RetinaNet+GDP    The method of the invention
Aircraft detection precision   0.9236           0.9575
Ship detection precision       0.6319           0.6508
Average precision mAP          0.7778           0.8042
As Table 1 shows, the mAP of the prior-art RetinaNet + GDP is 77.78% while the mAP of the method of the present invention is 80.42%, so the method of the present invention achieves higher detection precision on airplane and ship targets. Existing remote sensing image target detection techniques have reached a bottleneck in improving detection precision, and it is difficult to effectively improve the detection precision of small targets in remote sensing images, particularly airplanes and ships; the proposed method greatly improves the detection speed while also effectively improving the detection precision of such small targets.
Table 2 lists the detection speed, in frames per second (FPS), of the present invention and of the prior-art RetinaNet + GDP.
Table 2. Simulation detection speed (FPS) results

Method                         RetinaNet+GDP    The method of the invention
Detection speed (FPS)          23               35
As Table 2 shows, the detection speed of the prior-art RetinaNet + GDP is 23 FPS while the method of the present invention reaches 35 FPS, so the method of the present invention is faster at detecting airplane and ship targets. Existing remote sensing image target detection techniques usually sacrifice detection speed when detecting targets, yet real-time, rapid detection of remote sensing images is very important in practical applications, particularly in the military field.
In summary, the remote sensing image target detection method based on the deep evolution pruning convolution network disclosed by the present invention mainly solves the problem that existing remote sensing image target detection acceleration techniques do not globally and simultaneously optimize detection speed and detection precision. The specific steps are: construct the training and verification data sets; construct the test data set; construct the deep convolution feature extraction sub-network; construct the full convolution FCN detection sub-network; construct and train the deep convolution target detection network; construct and train the target detection network based on the deep evolution pruning convolution network; perform target detection on the test data set with the trained model; and output the test results. Building the inverse residual connection structure from depth separable convolutions greatly reduces model parameters and computation while maintaining high detection accuracy, and combining the target detection network with a global dynamic pruning method based on an evolutionary algorithm achieves network acceleration. The method greatly reduces computational complexity and model parameters, markedly improves the target detection speed on optical remote sensing images while retaining high target detection precision, and can be used to quickly and accurately detect airplane and ship targets in different regions of remote sensing images.

Claims (5)

1. A remote sensing image target detection method based on a deep evolution pruning convolution net is characterized by comprising the following steps:
(1) processing the training dataset and the validation dataset: selecting a plurality of optical remote sensing images containing various targets, cutting the images into image blocks with 512 x 512 pixels, wherein 70% of the image blocks of the optical remote sensing images form training data, 30% of the image blocks form a verification data set, and performing data enhancement on the training data set;
(2) processing the test data set: inputting another plurality of optical remote sensing images containing various targets, and cutting the images into image blocks with 512 x 512 pixels to form a test data set;
(3) constructing a deep convolution feature extraction sub-network: respectively constructing a depth separable convolution inverse residual error connection module and a characteristic pyramid convolution module, and alternately connecting the depth separable convolution inverse residual error connection module and the characteristic pyramid convolution module by using a 7 multiplied by 7 convolution layer and a maximum pooling layer in sequence to form a depth convolution characteristic extraction sub-network;
the specific structure of the sub-network for extracting the depth convolution features is that an original image input layer → 7 × 7 convolution layer → a first maximum pooling layer → a first depth separable convolution inverse residual connection module C1 → a second depth separable convolution inverse residual connection module C2 → a first feature pyramid convolution module P1 → a third depth separable convolution inverse residual connection module C3 → a second feature pyramid convolution module P2 → a fourth depth separable convolution inverse residual connection module C4 → a third feature pyramid convolution module P3 → a second maximum pooling layer → a fourth feature pyramid convolution module P4 → a third maximum pooling layer → a fifth feature pyramid convolution module P5 → a current stage feature map output layer;
(4) constructing a full convolution FCN detection sub-network:
(4a) constructing a full convolution FCN classification subnet: the structure is, classifying the subnet input layer → the first 3 x 3 convolution layer → the second 3 x 3 convolution layer → the third 3 x 3 convolution layer → the fourth 3 x 3 convolution layer → the fifth 3 x 3 convolution layer → classifying the subnet output layer; the classified subnet input layer takes the characteristic graph of each characteristic pyramid convolution module as the input of the classified subnet in turn, and carries out classification detection in turn;
(4b) constructing a fully-convoluted FCN regression subnet: the structure is, the regression subnet input layer → the first 3 x 3 convolution layer → the second 3 x 3 convolution layer → the third 3 x 3 convolution layer → the fourth 3 x 3 convolution layer → the fifth 3 x 3 convolution layer → the regression subnet output layer; the regression subnet input layer takes the characteristic graph of each characteristic pyramid convolution module as the input of the regression subnet in turn, and carries out regression detection in turn;
(5) constructing and training a deep convolution target detection network:
(5a) constructing a deep convolution target detection network: sequentially constructing a deep convolution target detection network by using a deep convolution feature extraction sub-network, a full convolution FCN classification sub-network and a full convolution FCN regression sub-network, wherein the structure of the deep convolution target detection network is that an original image input layer → the deep convolution feature extraction sub-network → the full convolution FCN classification regression sub-network;
(5b) training a deep convolution target detection network: training the deep convolution target detection network by using the training data set and the verification data set as input to obtain a trained deep convolution target detection network, and storing a weight file of the trained deep convolution target detection network;
(6) constructing and training a target detection network based on a deep evolution pruning convolution network:
(6a) performing layer-by-layer DNA coding on the convolution filters participating in pruning in the trained deep convolution target detection network, recording the coding as DNA_{1,...,l-1,l};
(6b) optimizing the DNA_{1,...,l-1,l} coding using an evolutionary algorithm to obtain the final optimized result coding DNA'_{1,...,l-1,l};
(6c) combining the optimized result coding DNA'_{1,...,l-1,l} with the pruning rule to construct the target detection network based on the deep evolution pruning convolutional network, wherein the pruning rule is that a code of 0 means the convolution filter is finally pruned and a code of 1 means it is finally retained; fine-tuning with the training data set to obtain the trained target detection network based on the deep evolution pruning convolutional network, i.e., the trained model, and saving the trained model weight file;
(7) and carrying out target detection on the test data set by using the trained model:
(7a) sequentially inputting the data blocks in the test data set into a trained target detection network based on a deep evolution pruning convolutional network to obtain a candidate frame of each data block in the test data set, a classification confidence score corresponding to the candidate frame and a target category corresponding to the candidate frame;
(7b) discarding all candidate frames whose classification confidence score for their target category is below the threshold 0.3, and performing non-maximum suppression processing on the remaining candidate frames;
(7c) and mapping the coordinates of all the reserved candidate frames, mapping the coordinates onto the optical remote sensing image before cutting, and performing secondary non-maximum suppression processing to obtain a final detection result image of the optical remote sensing image.
2. The method for detecting the target of the remote sensing image based on the deep evolution pruning convolution network as claimed in claim 1, wherein the step (3) of constructing the deep convolution feature extraction sub-network comprises the following specific steps:
(3a) constructing a depth separable convolution inverse residual connecting module: the module structure is that the characteristic diagram input layer of the previous stage → 1 multiplied by 1 convolution layer → depth separable convolution unit → point-by-point addition layer → characteristic diagram output layer of the current stage;
the 1 x 1 convolution layer and the depth separable convolution unit in the inverse residual connecting module are grouped, and the point-by-point addition layer is a feature processing layer formed by performing point-by-point addition on the output feature map of the depth separable convolution unit in the previous layer and the feature map from the input layer of the inverse residual connecting module;
the inverse residual connection module first increases and then decreases the channel dimension of the input feature map, wherein the 1 × 1 convolution layer increases the number of channels of the input feature map by 2 times, and the depth separable convolution unit performs feature extraction and reduces the number of channels by 2 times, so that the number of channels of the output feature map of the constructed depth separable convolution inverse residual connection module is consistent with that of the input feature map;
(3b) constructing a characteristic pyramid convolution module: the module is a double-layer input single-layer output structure, and the module structure is that an input feature diagram 1 → a first convolution layer of the input feature diagram 1 → a double upsampling layer → an output feature diagram 1, an input feature diagram 2 → a first convolution layer of the input feature diagram 2 → an output feature diagram 2, point-by-point addition layers → a second convolution layer → a feature diagram output layer at the current stage;
the input feature map 1 is the stage feature map whose input and output feature maps have the same size in a depth separable convolution inverse residual connection module, the input feature map 2 is a feature map with the same spatial size as the output feature map 1, the point-by-point addition layer is a feature processing layer formed by adding the output feature map 1 and the output feature map 2 point by point, and the double upsampling layer enlarges the scale of the input feature map 1 processed by its first convolution layer through a bilinear interpolation algorithm;
(3c) and (3) sequentially using the 7 x 7 convolution layer and the maximum pooling layer, and alternately connecting the depth separable convolution inverse residual error connecting module and the feature pyramid convolution module to construct a depth convolution feature extraction sub-network.
3. The remote sensing image target detection method based on the deep evolution pruning convolution network according to claim 2, characterized in that the depth separable convolution unit in step (3a) has the unit structure: previous-stage feature map input layer → 3 × 3 depth convolution layer → first batch normalization layer → ReLU activation function layer → 1 × 1 point-by-point convolution layer → second batch normalization layer → linear activation function layer → output feature map layer;
the depth separable convolution unit divides the standard convolution into depth convolution and point-by-point convolution so as to realize the separation and the respective processing of the space and the channel of the characteristics;
the ReLU activation function is no longer used after the activation function after the 1 x 1 point-by-point convolution layer, but instead a linear activation function is used.
4. The remote sensing image target detection method based on the deep evolution pruning convolutional network according to claim 1, wherein the layer-by-layer DNA coding of the convolution filters participating in pruning in the trained deep convolution target detection network in step (6a) is performed as follows:
(6a1) expressing the output feature map by the convolution operation: record the l-th layer feature map of the trained deep convolution target detection network, with height H_l, width W_l and C_l channels, as

Z_l ∈ R^(H_l × W_l × C_l)

and record the feature sub-map of the k-th channel of the l-th layer as Z_l^(k); then Z_l^(k) is obtained by convolving (*) the previous layer's feature map Z_{l-1} with the parameters W_l^(k) of the corresponding convolution filter, f denoting the activation function:

Z_l^(k) = f(Z_{l-1} * W_l^(k))
in common deep learning frameworks such as TensorFlow and Caffe, the tensor convolution is converted into a matrix multiplication by reshaping the input and transposing the convolution filter; writing Z*_{l-1} for the (l-1)-th layer feature map after this transformation and W*_l for the parameter matrix of the convolution filters corresponding to the l-th layer, the l-th layer feature map Z*_l is given by:

Z*_l = f(Z*_{l-1} W*_l);
(6a2) mask-coding each convolution filter to be pruned or retained: for the l-th layer of the trained deep convolution target detection network, with C_l output channels, introduce a mask m_l^(k) ∈ {0, 1} for every convolution filter that is to be pruned or retained; a code of 0 indicates that the convolution filter is pruned, and a code of 1 indicates that it is retained; with ⊙ denoting the element-wise (inner) product of the mask with the filter parameters, the global feature channel pruning convolution formula Z*_l = f(Z*_{l-1} W*_l) changes to:

Z*_l = f(Z*_{l-1} (m_l ⊙ W*_l));
(6a3) coding the convolution filters participating in pruning layer by layer: using the trained deep convolution target detection network, the filters participating in pruning are coded layer by layer; the coding containing all layers to be pruned is recorded as

DNA_{1,...,l-1,l} = {m_1, m_2, ..., m_{l-1}, m_l}

where DNA_{1,...,l-1,l} denotes the DNA codes of layers 1 to l, and m_{l-1} denotes the coding symbols of the (l-1)-th layer feature map.
5. The remote sensing image target detection method based on the deep evolution pruning convolution network according to claim 1, characterized in that the optimization of the DNA_{1,...,l-1,l} coding with the evolutionary algorithm in step (6b), yielding the final optimized result DNA'_{1,...,l-1,l}, is performed as follows:
(6b1) initialization: set the evolution generation counter t = 0, set the maximum number of generations T, and set the pruning ratio ratio_cut = 0.5; according to ratio_cut, randomly generate M individuals with the DNA_{1,...,l-1,l} coding as the initial population

P_0 = {DNA^1, DNA^2, ..., DNA^M}

where DNA^m denotes the DNA coding of the m-th individual over the filters of layers 1 to l;
(6b2) adjusting network parameters using the training data set: for the t-th round population P_t, retraining each generated network and adjusting its parameters using the training data set and the masked global feature channel pruning convolution formula Z*_l = f(Z*_{l-1} (m_l ⊙ W*_l));
(6b3) fitness calculation: using the verification data set and the masked global feature channel pruning convolution formula, computing the fitness of each individual in the population P_t from L_val, the loss of the corresponding pruned network on the verification data set, with lower verification loss corresponding to higher fitness;
(6b4) generating new individuals: based on individual fitness, selecting the individuals with higher fitness for crossover and mutation to generate new individuals, wherein the crossover operation randomly crosses parent individuals with crossover probability p_c = 0.9 and the mutation operation randomly mutates parent individuals with mutation probability p_m = 0.9; through steps (6b1) to (6b4), the population P_t undergoes selection, crossover and mutation to obtain the next generation population P_{t+1};
(6b5) judging whether to terminate the evolution: if t = T, taking the individual with the maximum fitness obtained during the evolution as the optimal solution output, terminating the calculation, and recording its coding as DNA'_{1,...,l-1,l}, then executing step (6c) to construct the target detection network based on the deep evolution pruning convolution network; otherwise, if t < T, returning to step (6b2) and repeating steps (6b2) to (6b5) to continue the evolutionary optimization of the coding.
CN201910648586.5A 2019-07-18 2019-07-18 Remote sensing image target detection method based on deep evolution pruning convolution net Active CN110532859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910648586.5A CN110532859B (en) 2019-07-18 2019-07-18 Remote sensing image target detection method based on deep evolution pruning convolution net

Publications (2)

Publication Number Publication Date
CN110532859A CN110532859A (en) 2019-12-03
CN110532859B true CN110532859B (en) 2021-01-22

Family

ID=68660598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910648586.5A Active CN110532859B (en) 2019-07-18 2019-07-18 Remote sensing image target detection method based on deep evolution pruning convolution net

Country Status (1)

Country Link
CN (1) CN110532859B (en)

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062321B (en) * 2019-12-17 2023-05-30 佛山科学技术学院 SAR detection method and system based on deep convolutional network
CN111126303B (en) * 2019-12-25 2023-06-09 北京工业大学 Multi-parking-place detection method for intelligent parking
CN111222457B (en) * 2020-01-06 2023-06-16 电子科技大学 Detection method for identifying authenticity of video based on depth separable convolution
CN111259603B (en) * 2020-01-17 2024-01-30 南京星火技术有限公司 Electronic device, model design apparatus, and computer-readable medium
CN111582446B (en) * 2020-04-28 2022-12-06 北京达佳互联信息技术有限公司 System for neural network pruning and neural network pruning processing method
CN111640116B (en) * 2020-05-29 2023-04-18 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN111783794B (en) * 2020-06-08 2023-08-22 湖北工业大学 Multi-scale target detection method based on depth separable convolution residual block and improved NMS (network management system)
CN111783576B (en) * 2020-06-18 2023-08-18 西安电子科技大学 Pedestrian re-identification method based on improved YOLOv3 network and feature fusion
CN111539224B (en) * 2020-06-25 2023-08-25 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111753787A (en) * 2020-07-01 2020-10-09 江苏金海星导航科技有限公司 Separated traffic sign detection and identification method
CN111833360B (en) * 2020-07-14 2024-03-26 腾讯科技(深圳)有限公司 Image processing method, device, equipment and computer readable storage medium
CN111832576A (en) * 2020-07-17 2020-10-27 济南浪潮高新科技投资发展有限公司 Lightweight target detection method and system for mobile terminal
CN111914924B (en) * 2020-07-28 2024-02-06 西安电子科技大学 Rapid ship target detection method, storage medium and computing equipment
CN112564716B (en) * 2020-08-05 2022-12-13 新疆大学 PC-SCMA system joint decoding method based on pruning iteration
CN112102241B (en) * 2020-08-11 2023-10-20 中山大学 Single-stage remote sensing image target detection algorithm
CN111783974A (en) * 2020-08-12 2020-10-16 成都佳华物链云科技有限公司 Model construction and image processing method and device, hardware platform and storage medium
CN112164035B (en) * 2020-09-15 2023-04-28 郑州金惠计算机系统工程有限公司 Image-based defect detection method and device, electronic equipment and storage medium
CN113159082B (en) * 2020-09-30 2023-06-02 北京理工大学 Incremental learning target detection network model construction and weight updating method
CN112287881A (en) * 2020-11-19 2021-01-29 国网湖南省电力有限公司 Satellite remote sensing image smoke scene detection method and system and computer storage medium
CN112434745B (en) * 2020-11-27 2023-01-24 西安电子科技大学 Occlusion target detection and identification method based on multi-source cognitive fusion
CN112508625B (en) * 2020-12-18 2022-10-21 国网河南省电力公司经济技术研究院 Intelligent inspection modeling method based on multi-branch residual attention network
CN112580381A (en) * 2020-12-23 2021-03-30 成都数之联科技有限公司 Two-dimensional code super-resolution reconstruction enhancing method and system based on deep learning
CN112668584A (en) * 2020-12-24 2021-04-16 山东大学 Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network
CN112784839A (en) * 2021-02-03 2021-05-11 华南理工大学 Scene character detection model lightweight method based on mobile terminal, electronic equipment and storage medium
CN112818871B (en) * 2021-02-04 2024-03-29 南京师范大学 Target detection method of full fusion neural network based on half-packet convolution
CN113160062B (en) * 2021-05-25 2023-06-06 烟台艾睿光电科技有限公司 Infrared image target detection method, device, equipment and storage medium
CN113487550B (en) * 2021-06-30 2024-01-16 佛山市南海区广工大数控装备协同创新研究院 Target detection method and device based on improved activation function
CN113487551B (en) * 2021-06-30 2024-01-16 佛山市南海区广工大数控装备协同创新研究院 Gasket detection method and device for improving dense target performance based on deep learning
CN113723472B (en) * 2021-08-09 2023-11-24 北京大学 Image classification method based on dynamic filtering constant-variation convolutional network model
CN113537399A (en) * 2021-08-11 2021-10-22 西安电子科技大学 Polarized SAR image classification method and system of multi-target evolutionary graph convolution neural network
CN113744220B (en) * 2021-08-25 2024-03-26 中国科学院国家空间科学中心 PYNQ-based detection system without preselection frame
CN113837284B (en) * 2021-09-26 2023-09-15 天津大学 Double-branch filter pruning method based on deep learning
CN113807466B (en) * 2021-10-09 2023-12-22 中山大学 Logistics package autonomous detection method based on deep learning
TWI790789B (en) * 2021-10-22 2023-01-21 大陸商星宸科技股份有限公司 Convolution operation method
CN114025198B (en) * 2021-11-08 2023-06-27 深圳万兴软件有限公司 Video cartoon method, device, equipment and medium based on attention mechanism
CN114220019B (en) * 2021-11-10 2024-03-29 华南理工大学 Lightweight hourglass type remote sensing image target detection method and system
CN114627282A (en) * 2022-03-15 2022-06-14 平安科技(深圳)有限公司 Target detection model establishing method, target detection model application method, target detection model establishing device, target detection model application device and target detection model establishing medium
CN116597486A (en) * 2023-05-16 2023-08-15 暨南大学 Facial expression balance recognition method based on increment technology and mask pruning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11093832B2 (en) * 2017-10-19 2021-08-17 International Business Machines Corporation Pruning redundant neurons and kernels of deep convolutional neural networks
US11250325B2 (en) * 2017-12-12 2022-02-15 Samsung Electronics Co., Ltd. Self-pruning neural networks for weight parameter reduction

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875373A (en) * 2016-12-14 2017-06-20 浙江大学 Mobile phone screen MURA defect inspection methods based on convolutional neural networks pruning algorithms
CN107609525A (en) * 2017-09-19 2018-01-19 吉林大学 Remote Sensing Target detection method based on Pruning strategy structure convolutional neural networks
CN108491854A (en) * 2018-02-05 2018-09-04 西安电子科技大学 Remote sensing image object detection method based on SF-RCNN
CN109544621A (en) * 2018-11-21 2019-03-29 马浩鑫 Light field depth estimation method, system and medium based on convolutional neural networks
CN109711288A (en) * 2018-12-13 2019-05-03 西安电子科技大学 Remote sensing ship detecting method based on feature pyramid and distance restraint FCN
CN109919008A (en) * 2019-01-23 2019-06-21 平安科技(深圳)有限公司 Moving target detecting method, device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Hongxiang Fan et al., "A Real-Time Object Detection Accelerator with Compressed SSDLite on FPGA," 2018 International Conference on Field-Programmable Technology (FPT), 2019-06-20, full text. *
刘万军 (Liu Wanjun) et al., "Lightweight multi-object detection network based on an inverse residual structure" (基于反残差结构的轻量级多目标检测网络), 《激光与光电子学进展》 (Laser & Optoelectronics Progress), 2019-05-21, full text. *

Also Published As

Publication number Publication date
CN110532859A (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN110532859B (en) Remote sensing image target detection method based on deep evolution pruning convolution net
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN110378844B (en) Image blind motion blur removing method based on cyclic multi-scale generation countermeasure network
CN111126359B (en) High-definition image small target detection method based on self-encoder and YOLO algorithm
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
CN108230278B (en) Image raindrop removing method based on generation countermeasure network
CN112288008B (en) Mosaic multispectral image disguised target detection method based on deep learning
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN112163602A (en) Target detection method based on deep neural network
CN109191418B (en) Remote sensing image change detection method based on feature learning of contraction self-encoder
CN109492596B (en) Pedestrian detection method and system based on K-means clustering and regional recommendation network
CN113435253B (en) Multi-source image combined urban area ground surface coverage classification method
CN113628294A (en) Image reconstruction method and device for cross-modal communication system
CN113240683B (en) Attention mechanism-based lightweight semantic segmentation model construction method
CN112489168A (en) Image data set generation and production method, device, equipment and storage medium
CN112200123B (en) Hyperspectral open set classification method combining dense connection network and sample distribution
CN111126278A (en) Target detection model optimization and acceleration method for few-category scene
CN112381733B (en) Image recovery-oriented multi-scale neural network structure searching method and network application
CN116469020A (en) Unmanned aerial vehicle image target detection method based on multiscale and Gaussian Wasserstein distance
CN115048870A (en) Target track identification method based on residual error network and attention mechanism
CN112132207A (en) Target detection neural network construction method based on multi-branch feature mapping
CN116385281A (en) Remote sensing image denoising method based on real noise model and generated countermeasure network
CN114494893B (en) Remote sensing image feature extraction method based on semantic reuse context feature pyramid
CN115375966A (en) Image countermeasure sample generation method and system based on joint loss function
CN111860668B (en) Point cloud identification method for depth convolution network of original 3D point cloud processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant