CN114170512A - Remote sensing SAR target detection method based on combination of network pruning and parameter quantization

Remote sensing SAR target detection method based on combination of network pruning and parameter quantization

Info

Publication number
CN114170512A
Application number
CN202111488427.7A
Authority
CN (China)
Prior art keywords
network, training, feature extraction, parameter, pruning
Priority date
2021-12-08
Filing date
2021-12-08
Legal status
Pending
Other languages
Chinese (zh)
Inventors
雷杰, 王嘉轩, 杨埂, 谢卫莹, 李云松
Current Assignee
Xidian University
Original Assignee
Xidian University
Application filed by Xidian University

Classifications

    • G06F18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/045 — Neural networks; Architecture; Combinations of networks
    • G06N3/082 — Neural networks; Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention discloses a remote sensing SAR target detection method based on the combination of network pruning and parameter quantization, which mainly addresses the high model complexity and low inference speed of existing remote sensing SAR target detection methods. The implementation scheme is as follows: obtain the pre-divided training set and test set from a public remote sensing SAR target detection data set and perform data expansion on the training set; apply data enhancement to the expanded training set; adjust an existing lightweight network and construct a reference model; train the reference model and compute its performance indexes; evaluate the importance of each feature extraction module in the adjusted network, cut off the unimportant modules, and then perform filter pruning; set a quantization search space and search for a mixed-precision quantization scheme for the pruned model to obtain the final model, which is used for SAR target detection. The method improves detection accuracy, saves training cost, and can be used in target recognition scenarios with limited computing resources and high real-time requirements.

Description

Remote sensing SAR target detection method based on combination of network pruning and parameter quantization
Technical Field
The invention belongs to the technical field of computer vision, and in particular relates to a remote sensing SAR target detection method that can be used in SAR image recognition scenarios with limited computing resources and high real-time requirements.
Background
Synthetic Aperture Radar (SAR) is currently one of the most important means of earth observation and is widely applied in civil fields such as marine rescue and maritime law enforcement, as well as military fields such as real-time maritime monitoring and detection. SAR image data are usually single-channel gray-scale images and are characterized by many small targets, inconspicuous target features, unbalanced target distribution, and high similarity between targets and background. In recent years, convolutional neural networks (CNN) have been widely used for SAR image target detection. Compared with traditional detection algorithms, which require tedious feature design and modeling, CNN-based methods show better performance thanks to their ability to learn parameters autonomously and extract features automatically.
Currently, CNN-based target detection algorithms fall mainly into two categories. The first is the two-stage target detection algorithms represented by the RCNN (Region-CNN) series, which first generate candidate regions likely to contain targets and then perform detection on them. The second is the single-stage target detection algorithms represented by SSD (Single Shot MultiBox Detector) and YOLOv3 (You Only Look Once). These algorithms do not generate candidate regions but perform target detection directly by regression; they have a small computational load and high inference speed and perform better in resource-limited environments, but their detection accuracy is generally inferior to that of two-stage methods.
The convolutional neural network CNN is the core of a target detection algorithm and is used to extract image features. In pursuit of higher accuracy, researchers often design complex networks with a large number of redundant parameters. Such network structures consume large amounts of computing and storage resources in both the training and inference phases, so in practical applications a trade-off between accuracy and resource consumption is required, and the main way to achieve it is to compress the designed CNN. Although compression causes some loss of accuracy, it can greatly reduce the redundant parameters of the model, increase inference speed, lower model complexity and resource consumption, and thus realize efficient SAR target detection.
The most common method of model compression is model pruning, which can be divided, from fine to coarse granularity, into weight pruning, channel/filter pruning and layer pruning. Specifically:
Weight pruning, also known as unstructured pruning, sparsifies the original weight matrix and reduces the computational load during inference by removing connections between neurons. This method reduces computation without modifying the network structure, but an actual speedup requires special hardware that implements sparse convolution.
Channel/filter pruning and layer pruning, also known as structured pruning, both alter the original network structure and therefore tend to speed up inference, but usually at the cost of some accuracy. Zhou et al. applied channel pruning based on Lasso regression and least squares to the two-stage Faster R-CNN model in "SAR image ship detection optimization algorithm based on channel pruning", Aerospace Shanghai (Chinese & English), Vol. 37, No. 4, 2020, cutting 56% of the model parameters; however, because the original network structure is complex, the inference speed is still not ideal.
Traditional pruning schemes mostly follow this route: 1) train a large, over-parameterized network to obtain a reference model; 2) prune the reference model according to some strategy; 3) fine-tune the pruned model to obtain the compressed CNN model. Lin et al., in "Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao. HRank: Filter Pruning using High-Rank Feature Map. CVPR 2020", found that the average rank of the feature maps generated by a single filter is always the same, and therefore proposed a filter pruning method based on feature-map rank. Although this method does not require fine-tuning training, the compression ratio of the model cannot reach a high level because only filter pruning is performed.
Most existing pruning schemes borrow ideas from NAS (Neural Architecture Search) and knowledge distillation and find a better structured pruning scheme through complex iterative training. In the patent document "SAR ship target detection method based on network pruning and knowledge distillation" (application number CN202011308276.8, publication number CN112308019A) filed by the National University of Defense Technology of the Chinese People's Liberation Army, a single-stage YOLOv3 detector is used as the reference detection framework, and after channel pruning of the network, knowledge distillation guided by the interrelation between feature maps is used to recover performance. However, NAS-based methods require substantial GPU support and are generally suitable only for research use.
Another common model compression method is parameter quantization. Han et al. first used shared weights to quantize network parameters in Deep Compression, compressing the model by a factor of 2-3. Parameter quantization replaces the default 32-bit floating-point computation with low-bit fixed-point computation and is mostly used to save memory when deploying models on hardware. Although quantization can effectively reduce model size, parameter count and computational intensity, it often causes a large loss of accuracy; the challenge is even greater when fewer than 4 bits, or even only 1 bit, are used.
Traditional quantization methods generally store model parameters with a uniform fixed bit width, but because layers at different depths of a network have different feature extraction abilities, fixed-bit quantization cannot adapt to the characteristics of the network structure, so important information may be lost and an optimal compression ratio is hard to reach. Mixed-precision quantization, which uses different bit widths for parameters at different depths of the network, is now a commonly used alternative to fixed-bit quantization. Because network structures are complex and variable, finding a good mixed-precision quantization strategy usually relies on complex methods such as grid search or reinforcement learning, which consume large amounts of computing resources and time.
In summary, existing pruning and quantization methods have three shortcomings: 1) each compresses only one aspect of the model and does not combine the advantages of multiple compression methods; 2) the implementation is overly complex and requires large amounts of computing resources; 3) they are not designed specifically for the target detection task on remote sensing SAR images. These deficiencies make it difficult for network models to reach high compression ratios and prevent high-speed SAR target detection in resource-constrained environments.
Disclosure of Invention
The invention aims to provide, in view of the shortcomings of the prior art, a remote sensing SAR target detection method based on the combination of network pruning and parameter quantization, so as to reduce the consumption of computing and storage resources, increase the compression ratio of the network model, and improve the SAR target detection speed in resource-limited environments.
In order to achieve the purpose, the technical scheme of the invention comprises the following steps:
(1) Obtain the pre-divided training set D_train and test set D_test from a public remote sensing SAR target detection data set, and perform data expansion on the training-set images using geometric transformations to obtain an expanded training set D_exp;
(2) In the expanded training set D_exp, count the proportions of images containing targets of different sizes and of images consisting largely of background, select the under-represented images as hard samples of the expanded training set, and apply offline data enhancement and online data enhancement to these samples in turn to obtain an enhanced training set D_aug;
(3) Constructing an SAR target detection reference model:
(3a) Adjust an existing lightweight network structure, i.e. change the original classification layer into a detection layer and modify some down-sampling layers in the deep part of the network into non-down-sampling layers, and take the adjusted lightweight network as the feature extraction backbone network N; the backbone network N comprises an input layer, hidden layers and a detection layer, the hidden layers consist of several feature extraction modules with the same structure, each feature extraction module contains an optional down-sampling layer and several non-down-sampling layers, a down-sampling layer is a convolution layer, a pooling layer or a rearrangement layer, and each non-down-sampling layer is a convolution layer, a BN layer, an activation layer or a connection layer;
(3b) Use the K-Means clustering algorithm to cluster the labelled target boxes of the expanded training set D_exp to obtain prior anchor boxes for this data set, use these anchor boxes as the anchor boxes of an existing YOLO-series single-stage detector to obtain an SAR target detector, and connect the detector with the backbone network N to form the reference model for SAR target detection;
(4) updating the network weight parameters of the reference model:
(4a) Take the YOLO loss function as the loss function of the reference model, and initialize the weight parameters of the backbone network N with a random number seed S;
(4b) Input the enhanced training set D_aug into the initialized reference model to start training, optimize the YOLO loss function with the momentum stochastic gradient descent (SGD) algorithm to update the network weight parameters, save the network weight parameters once every 10 epochs until the set maximum number of iterations is reached, and then stop training to obtain several sets of updated network weight parameters;
(5) evaluating the performance index of the reference model:
(5a) Update the reference model with each of the saved sets of network weight parameters, compute the F1 score of each updated reference model on the test set D_test, record the maximum F1 score as F1_0, and denote the corresponding network weight parameters by W_0;
(5b) Compute the parameter count P and the floating-point operations FLOPs of W_0, and record the results as P_0 and FLOPs_0 respectively;
(6) Perform coarse pruning on the backbone network N:
(6a) Set a module mask m for the n feature extraction modules in the backbone network N, and perform one-hot encoding of all the feature extraction modules of the network with the module mask m to obtain n mask subnets;
(6b) Let all n mask subnets share the weight parameters W_0, fine-tune each mask subnet, and compute its performance indexes: F1 score F1_i, parameter count P_i and floating-point operations FLOPs_i;
(6c) According to the performance indexes of each mask subnet, compute the importance index I of each feature extraction module in the backbone network N, and compare each with a set importance threshold I_thr; when I < I_thr, cut off the corresponding feature extraction module from the backbone network N to obtain the coarsely pruned backbone network N_m;
(7) Perform fine pruning on the coarsely pruned backbone network N_m:
(7a) Evaluate the importance of the filters in the convolutional layers of the coarsely pruned backbone network N_m based on the HRank method, divide them into important and unimportant filters, and cut off the unimportant filters of N_m accordingly to obtain the finely pruned network N_P;
(7b) Initialize the finely pruned network N_P with the same random number seed S, then train N_P with the same training method as in (4), and save the network weight parameters with the highest F1 score, recorded as W_P;
(8) Perform parameter quantization on the finely pruned network N_P:
(8a) Modify the feature extraction modules in the finely pruned network N_P into quantization modules;
(8b) Select quantization strategies with different precisions for the weight parameters and the activation outputs, and design a quantization search space according to the importance index I of the feature extraction modules;
(8c) Iteratively search for the best quantization scheme on the enhanced training set D_aug, and apply the best scheme found to the finely pruned network N_P to obtain the final network N_Q;
(8d) Use W_P as the pre-training parameters of the final network N_Q, and obtain the final network weight parameters W_Q by fine-tuning training;
(9) Replace the backbone network N of the reference model with the final network N_Q and update the network weight parameters with the final weight parameters W_Q to obtain the final SAR target detection model;
(10) Input the test set D_test into the final SAR target detection model to obtain accurate SAR target detection results.
Compared with the prior art, the invention has the following advantages:
1. The invention selects a fast single-stage target detection algorithm and further applies a compression scheme of module pruning, filter pruning and mixed-precision quantization; compared with existing compression methods that use a two-stage detection algorithm with channel pruning, and with methods that directly use a single-stage detection algorithm, the compression ratio of the model is greatly improved, so SAR target detection can be realized on more marginal hardware devices;
2. In view of the data characteristics of remote sensing SAR targets, the invention enhances the original training data with several data enhancement methods to improve data richness, and constructs a high-performance reference model by reducing the down-sampling rate of the backbone network, thereby improving the accuracy of SAR target recognition;
3. The invention performs coarse-grained module pruning by analysing module importance and fine-grained filter pruning based on the HRank method; compared with existing model compression methods that use reinforcement learning and knowledge distillation, the training cost is reduced;
4. The invention combines training from random initialization with pre-training plus fine-tuning, using different training methods for different models, which reduces training difficulty and improves the target detection accuracy of the models.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is an illustration of a sample remote sensing SAR image used in the present invention;
FIG. 3 is a sub-flow diagram of the present invention for implementing network pruning;
fig. 4 is a schematic diagram of designing a parameter quantization search space in the present invention.
Detailed Description
Embodiments of the present invention are further described below with reference to the accompanying drawings.
Referring to fig. 1, the invention provides a remote sensing SAR target detection method based on the combination of network pruning and parameter quantization, which comprises five stages: data preprocessing, building and training a reference model, computing performance indexes, network pruning, and parameter quantization.
The concrete implementation is as follows:
step 1: and (4) preprocessing data.
1.1) acquiring a public data set and carrying out data expansion:
Obtain the pre-divided training set D_train and test set D_test from a public remote sensing SAR target detection data set; a single image in the data set can be represented as X_{1×H×W}, where (1×H×W) denotes the single channel, height and width of the image respectively;
since public remote sensing SAR data sets usually contain little data, the acquired data set needs to be expanded: apply translation transformation and rotation transformation to the images of D_train respectively, and add both the translated images G'_{i',j'} and the rotated images G'_{W',Q'} to the training set D_train to obtain the expanded training set D_exp;
The translation transformation formula is as follows:
G'_{i',j'} = G_{i+x, j+y}
where G_{i,j} denotes the original image, (i, j) the coordinates before translation, (i', j') the coordinates after translation, and (x, y) the translation direction;
the rotational transformation formula is as follows:
G'_{W',Q'} = G_{W cos θ + Q sin θ, −W sin θ + Q cos θ}
where (W, Q) denotes the coordinates before rotation, (W', Q') the coordinates after rotation, and θ the rotation angle;
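As an illustration of this expansion step, the following is a minimal Python sketch of the translation and rotation transforms using OpenCV and NumPy; it is not part of the original disclosure, and the function names as well as the fixed shift and angle values are hypothetical choices (in a detection setting the box labels would need the same transforms).

```python
import cv2
import numpy as np

def translate(image, x, y):
    """Shift the image content by (x, y) pixels (cf. G'_{i',j'} = G_{i+x, j+y})."""
    h, w = image.shape[:2]
    m = np.float32([[1, 0, x], [0, 1, y]])
    return cv2.warpAffine(image, m, (w, h))

def rotate(image, theta_deg):
    """Rotate the image by theta degrees about its centre."""
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), theta_deg, 1.0)
    return cv2.warpAffine(image, m, (w, h))

def expand_training_set(d_train):
    """Return D_exp = D_train plus translated and rotated copies."""
    d_exp = list(d_train)
    for img in d_train:
        d_exp.append(translate(img, 10, 10))   # hypothetical shift
        d_exp.append(rotate(img, 15))          # hypothetical angle
    return d_exp
```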
1.2) Perform offline data enhancement and online data enhancement on the expanded training set D_exp:
1.2.1) According to the COCO evaluation metrics, compute the ground-truth box area of every target in the images of the expanded training set D_exp, and divide all targets into small, medium and large targets using 32² and 96² as dividing points; count the number of images containing targets of each size, and manually select images containing a large amount of background noise as hard-example images, as shown in fig. 2, where fig. 2(a) is an image containing small targets, fig. 2(b) an image containing medium targets, fig. 2(c) an image containing large targets, and fig. 2(d) an image containing a large amount of background noise; the white boxes are ground-truth target boxes added manually afterwards;
1.2.2) Offline data enhancement: scale the images containing small targets among the hard samples, and apply background noise reduction to the images that consist largely of background;
1.2.3) Online data enhancement: read the images of the offline-enhanced training set into memory, then apply dynamic random changes of random flipping, random expansion and random erasing to them in turn, so that the same image differs between different training epochs, obtaining the enhanced data set D_aug.
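A minimal sketch of this online stage, assuming NumPy images already in memory; the probabilities, expansion ratio and erase-patch size are illustrative assumptions rather than values from the disclosure.

```python
import numpy as np

def random_flip(img, p=0.5):
    """Horizontally flip the image with probability p."""
    return img[:, ::-1] if np.random.rand() < p else img

def random_expand(img, max_ratio=1.5, fill=0):
    """Paste the image at a random position on a larger canvas."""
    h, w = img.shape[:2]
    ratio = np.random.uniform(1.0, max_ratio)
    nh, nw = int(h * ratio), int(w * ratio)
    canvas = np.full((nh, nw) + img.shape[2:], fill, dtype=img.dtype)
    top = np.random.randint(0, nh - h + 1)
    left = np.random.randint(0, nw - w + 1)
    canvas[top:top + h, left:left + w] = img
    return canvas

def random_erase(img, size=32):
    """Blank out a random square patch (random erasing)."""
    out = img.copy()
    h, w = out.shape[:2]
    top = np.random.randint(0, max(1, h - size))
    left = np.random.randint(0, max(1, w - size))
    out[top:top + size, left:left + size] = 0
    return out

def online_augment(img):
    # Applied anew every epoch, so the same image differs between epochs.
    return random_erase(random_expand(random_flip(img)))
```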
Step 2: Construct and train the reference model.
2.1) Select a lightweight network structure according to the available computing resources, and decide, according to requirements, whether to attach feature pyramid FPN and PAN modules to it;
the existing lightweight network structure comprises an input layer, hidden layers and a detection layer; the hidden layers generally consist of several feature extraction modules with the same structure, each containing an optional down-sampling layer and several non-down-sampling layers, where a down-sampling layer is a convolution layer, a pooling layer or a rearrangement layer, and each non-down-sampling layer is a convolution layer, a BN layer, an activation layer or a connection layer. In this example, because computing resources are limited, the feature pyramid FPN and PAN modules are not added;
2.2) adjusting the selected lightweight network structure:
usually the lightweight network structure contains 5 down-sampling layers with stride 2, which reduce the size of the network output feature map to 1/32 of the original image and cause the loss of small-target information; it therefore needs adjustment, i.e. the classification layer in the original structure is changed into a detection layer, and some down-sampling layers in the deep part of the network are modified into non-down-sampling layers. The modification depends on the structure of the down-sampling layer: if it is a convolution layer, change its stride to 1; if it is a pooling layer, replace it with an empty layer. After modification, the number of network modules that do not change the feature map scale increases, so the number of modules can be reduced appropriately;
the adjusted lightweight network is used as the feature extraction backbone network N; this enhances the network's ability to recognise small targets, and although the computational load increases compared with the original network, the target detection performance is greatly improved;
2.3) Use the K-Means clustering algorithm to cluster the labelled target boxes of the expanded training set D_exp to obtain several groups of prior anchor boxes for this data set, use them as the anchor boxes of an existing YOLO-series single-stage detector to obtain an SAR target detector, and connect the detector with the backbone network N to form the SAR target detection reference model;
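A sketch of prior-anchor generation by clustering the labelled box widths and heights with K-Means; using scikit-learn Euclidean K-Means here is an implementation assumption (YOLO-style pipelines often use a 1 − IoU distance instead), and the number of anchors is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_anchors(boxes_wh, num_anchors=9, seed=0):
    """boxes_wh: (M, 2) array of ground-truth box (width, height) from D_exp.
    Returns num_anchors prior anchor boxes sorted by area."""
    km = KMeans(n_clusters=num_anchors, random_state=seed, n_init=10).fit(boxes_wh)
    anchors = km.cluster_centers_
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```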
2.4) Derive the YOLOv2 loss function from the YOLO detector and take it as the loss function of the reference model:
the YOLO detector splits the input image into S² grid cells, and each grid cell is assigned A groups of prior anchor boxes obtained by clustering. Suppose the top-left corner coordinates of a grid cell are c_x and c_y and the width and height of a prior anchor box are p_w and p_h; for each prior anchor box the backbone network regresses 5 outputs, t_x, t_y, t_w, t_h and t_o, which correspond respectively to the x, y coordinates, width, height and confidence of the regression box;
the 5 outputs are converted as follows to obtain b_x, b_y, b_w, b_h and P(object), which correspond respectively to the x, y coordinates, width and height of the prediction box and the probability that it contains an object:

b_x = σ(t_x) + c_x

b_y = σ(t_y) + c_y

b_w = p_w · e^{t_w}

b_h = p_h · e^{t_h}

P(object) = σ(t_o) · IOU(box_pred, box_truth)

where σ(·) is the sigmoid function and IOU(box_pred, box_truth) is the overlap ratio between the regression box and the truth box;
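The conversion above can be written directly in code; the sketch below assumes per-anchor raw outputs t_x, t_y, t_w, t_h, t_o as NumPy scalars and is only an illustration of the decoding formulas, not code from the disclosure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode_prediction(tx, ty, tw, th, to, cx, cy, pw, ph, iou_with_truth):
    """Decode raw regressor outputs into a predicted box and objectness."""
    bx = sigmoid(tx) + cx            # box centre x, in grid units
    by = sigmoid(ty) + cy            # box centre y
    bw = pw * np.exp(tw)             # box width scaled from the prior anchor
    bh = ph * np.exp(th)             # box height
    p_object = sigmoid(to) * iou_with_truth
    return bx, by, bw, bh, p_object
```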
the loss function of YOLOv2 is then built from these converted quantities. It sums, over all S² grid cells (index i) and the A prior anchor boxes of each cell (index j), a coordinate regression term between the prediction box variables b and the truth box variables g, a confidence term for anchors responsible for an object, a confidence suppression term for anchors not responsible for any object, and a classification term over the target categories. Here g denotes the coordinate variables of a truth box, b the coordinate variables of a prediction box, i the grid index (S² grid cells in total), j the prior anchor box index (A prior anchor boxes per grid cell), and 1^{obj}_{ij} and 1^{noobj}_{ij} are flag bits describing the overlap relation between a prior anchor box and the truth boxes, which decide which group of prior anchor boxes is used to predict the result: if the overlap ratio of the j-th prior anchor box in the i-th grid cell with some truth box is greater than the set overlap threshold, then 1^{obj}_{ij} = 1, otherwise 1^{obj}_{ij} = 0; if the overlap ratio of the j-th prior anchor box in the i-th grid cell with all truth boxes is less than the set overlap threshold, then 1^{noobj}_{ij} = 1, otherwise 1^{noobj}_{ij} = 0;
2.5) Initialize the weight parameters of the backbone network N with the random number seed S, input the enhanced training set D_aug into the initialized reference model to start training, and optimize the YOLO loss function with the momentum stochastic gradient descent (SGD) algorithm to update the network weight parameters:
2.5.1) Set the initial iteration number t = 0, the maximum iteration number t_max and the initial velocity v_0 = 0, and initialize the network weight parameters θ_0 with the random number seed S;
2.5.2) At each iteration, take a batch of training samples (X_t, Y_t)_n of size n from the enhanced training set D_aug, randomly select one training sample (x_t^(i), y_t^(i)) from the batch, and evaluate the YOLO loss function with this training sample and θ_t as its arguments to obtain the current loss L(θ_t), where i ∈ {1, 2, ..., n};
2.5.3) At the t-th iteration, update the network weight parameters θ with the following expressions to obtain the updated parameters θ_{t+1}:

θ_{t+1} = θ_t − v_t

v_t = α·v_{t−1} + η·∇_θ L(θ_t)

where v_t is the accumulated velocity at time t (at the first iteration v_t = v_0), θ_t is the model weight parameter at time t (at the first iteration θ_t = θ_0), α is the momentum with value 0.9, η is the learning rate, and ∇_θ L(θ_t) is the gradient of the current loss with respect to θ_t;
2.5.4) Repeat 2.5.2) and 2.5.3); one full pass over the enhanced training set D_aug is recorded as one epoch, and the network weight parameters are saved every 10 epochs, until the set maximum iteration number t_max is reached; then stop training, obtaining several sets of network weight parameters.
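A PyTorch-style sketch of the training loop in 2.5), writing the momentum update of 2.5.3) by hand. The dataset, model and loss objects are placeholders, the hyper-parameter values other than α = 0.9 are assumptions, and the update is applied over the whole mini-batch for simplicity rather than over a single randomly selected sample.

```python
import torch

def train_reference_model(model, yolo_loss, loader, lr=1e-3, alpha=0.9,
                          max_iters=50000, save_every_epochs=10):
    """Momentum SGD: v_t = alpha*v_{t-1} + lr*grad, theta_{t+1} = theta_t - v_t."""
    velocity = [torch.zeros_like(p) for p in model.parameters()]
    saved, t, epoch = [], 0, 0
    while t < max_iters:
        for images, targets in loader:                    # one full pass = one epoch
            loss = yolo_loss(model(images), targets)
            model.zero_grad()
            loss.backward()
            with torch.no_grad():
                for p, v in zip(model.parameters(), velocity):
                    v.mul_(alpha).add_(p.grad, alpha=lr)  # v_t = a*v_{t-1} + eta*grad
                    p.sub_(v)                             # theta_{t+1} = theta_t - v_t
            t += 1
            if t >= max_iters:
                break
        epoch += 1
        if epoch % save_every_epochs == 0:
            saved.append({k: w.clone() for k, w in model.state_dict().items()})
    return saved
```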
Step 3: Compute the performance indexes.
The performance indexes include the F1 score, the parameter count P and the floating-point operations FLOPs, and are computed as follows:
3.1) Update the reference model with each of the saved sets of network weight parameters, input the test set D_test into each updated reference model, and obtain from each model's target detection results the following variable values:
true positive TP: the number of targets that are correctly detected;
false positive FP: the number of targets that are detected incorrectly, i.e., false alarms;
false negative FN: the number of objects which are not identified, namely missing detection;
using these three variable values, compute the accuracy Pr, recall Re and F1 score of each updated reference model:

Pr = TP / (TP + FP)

Re = TP / (TP + FN)

F1 = 2 · Pr · Re / (Pr + Re)
record the maximum F1 score as F1_0 and denote the corresponding network weight parameters by W_0;
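These metrics follow directly from the detection counts; the sketch below is a transcription of the formulas, and the checkpoint-selection helper in the usage comment is hypothetical.

```python
def f1_score(tp, fp, fn):
    """Compute precision, recall and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical usage: pick the saved checkpoint whose weights maximise F1 on D_test.
# scores = [f1_score(*count_detections(model, w, d_test))[2] for w in saved_weights]
# w0 = saved_weights[max(range(len(scores)), key=scores.__getitem__)]
```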
3.2) Compute the parameter count P and floating-point operations FLOPs of W_0:
3.2.1) Compute the parameter count P_i of each convolutional layer in the backbone network N corresponding to W_0 using the following formula:

P_i = K² × C_in × C_out

where K is the convolution kernel size, C_in the number of input channels and C_out the number of output channels;
3.2.2) Sum the parameter counts P_i of all convolutional layers in the backbone network N corresponding to W_0 to obtain the total parameter count P_0 of W_0:

P_0 = Σ_{i=1}^{N} P_i

where N is the number of convolutional layers in the backbone network N;
3.2.3) Compute the floating-point operations FLOPs_i of each convolutional layer in the backbone network N corresponding to W_0 using the following formula:

FLOPs_i = 2K² × C_in × C_out × H_out × W_out

where K is the convolution kernel size, H_out the output feature map height and W_out the output feature map width;
3.2.4) Sum the floating-point operations FLOPs_i of all convolutional layers in the backbone network N corresponding to W_0 to obtain the total floating-point operations FLOPs_0 of W_0:

FLOPs_0 = Σ_{i=1}^{N} FLOPs_i
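A sketch that walks a PyTorch backbone and applies the two formulas above to every convolutional layer; the output sizes are taken from a dry run with a dummy input whose shape is an assumption, and grouped convolutions and bias terms are ignored for simplicity.

```python
import torch
import torch.nn as nn

def count_params_and_flops(backbone, input_shape=(1, 1, 512, 512)):
    """Sum P_i = K^2*Cin*Cout and FLOPs_i = 2*K^2*Cin*Cout*Hout*Wout over conv layers."""
    stats = {"params": 0, "flops": 0}
    hooks = []

    def hook(module, inputs, output):
        k = module.kernel_size[0]
        cin, cout = module.in_channels, module.out_channels
        hout, wout = output.shape[-2], output.shape[-1]
        stats["params"] += k * k * cin * cout
        stats["flops"] += 2 * k * k * cin * cout * hout * wout

    for m in backbone.modules():
        if isinstance(m, nn.Conv2d):
            hooks.append(m.register_forward_hook(hook))
    with torch.no_grad():
        backbone(torch.zeros(input_shape))   # dummy single-channel SAR-sized input (assumed)
    for h in hooks:
        h.remove()
    return stats["params"], stats["flops"]
```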
Step 4: Perform network pruning according to the performance indexes computed in step 3.
The goal of network pruning is to reduce the parameter count P and the floating-point operations FLOPs while ensuring that the F1 score drops only within a reasonable expectation;
referring to fig. 3, the implementation of the coarse-grained module pruning and the fine-grained filter pruning on the network in this step is as follows:
4.1) carrying out module pruning on the backbone network N:
4.1.1) Set a module mask m = {m_1, m_2, ..., m_i, ..., m_n}, where m_i corresponds to the i-th feature extraction module in the backbone network N, m_i ∈ {0, 1}, i = 1, 2, ..., n, and n denotes the number of feature extraction modules in the backbone network N;
4.1.2) Let m_j = 1 for j ≠ i and m_i = 0 to obtain a mask subnet in which the i-th feature extraction module is masked;
4.1.3) repeating 4.1.2) to obtain n mask subnets;
4.1.4) Let all n mask subnets share W_0, fine-tune each mask subnet, and compute its performance indexes: F1 score F1_i, parameter count P_i and floating-point operations FLOPs_i;
4.1.5) Compute the differences between the performance indexes of each mask subnet and the reference model performance indexes F1_0, P_0 and FLOPs_0, denoted ΔF1_i, ΔP_i and ΔFLOPs_i:

ΔF1_i = F1_0 − F1_i

ΔP_i = P_0 − P_i

ΔFLOPs_i = FLOPs_0 − FLOPs_i
4.1.6) Define the importance index I_i of the i-th feature extraction module in the backbone network N as a combination of ΔF1_i, ΔP_i and ΔFLOPs_i weighted by the constants α, β and γ, which are the influence factors of ΔF1_i, ΔP_i and ΔFLOPs_i respectively. I_i measures the importance of each module, namely the change in F1 balanced against each unit of saved parameters and floating-point operations; the larger I_i, the more important the module;
4.1.7) Compare each module importance index I with the set importance threshold I_thr; when I < I_thr, cut off the corresponding feature extraction module from the backbone network N to obtain the module-pruned backbone network N_m;
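The sketch below shows how the n one-hot mask subnets of 4.1.1)–4.1.3) and the importance comparison of 4.1.6)–4.1.7) fit together. Because the exact combination of ΔF1, ΔP and ΔFLOPs behind I_i is given only symbolically in the text, the ratio used here (weighted F1 change per unit of weighted parameter and FLOP reduction) is an assumed form, not the patented formula.

```python
def one_hot_masks(n):
    """Mask m with m_i = 0 and m_j = 1 (j != i): one subnet per feature extraction module."""
    return [[0 if j == i else 1 for j in range(n)] for i in range(n)]

def importance(delta_f1, delta_p, delta_flops, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed form of I_i: F1 change balanced against parameter and FLOP savings."""
    return (alpha * delta_f1) / (beta * delta_p + gamma * delta_flops + 1e-12)

def modules_to_prune(f1_0, p_0, flops_0, subnet_stats, i_thr):
    """subnet_stats[i] = (F1_i, P_i, FLOPs_i) of the subnet that masks module i."""
    prune = []
    for i, (f1_i, p_i, flops_i) in enumerate(subnet_stats):
        i_score = importance(f1_0 - f1_i, p_0 - p_i, flops_0 - flops_i)
        if i_score < i_thr:
            prune.append(i)          # module i is unimportant: cut it from N
    return prune
```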
4.2) Perform filter pruning on the module-pruned backbone network N_m based on the HRank method:
since the average rank of the feature maps generated by a single filter is always the same, the importance of a filter can be judged by the rank of its feature maps, and the unimportant filters can then be cut off to obtain the pruned network. The pruning procedure for each convolutional layer of N_m is as follows:
4.2.1) Take a few images from the training set D_aug, input them into the backbone network N_m, and compute the average rank R = {r_1, r_2, ..., r_i, ..., r_n} of the output feature maps of each convolutional layer, where r_i is the average rank of the feature maps output by the i-th filter w_i, i ∈ {1, 2, ..., n}, and n denotes the number of filters of that layer;
4.2.2) Sort R in descending order to obtain the sorted average ranks of the convolutional layer output feature maps R' = {r'_1, r'_2, ..., r'_n}, where r'_i denotes the i-th largest value in R;
4.2.3) Set the number of filters to keep, n_1, and the number of filters to prune, n_2, where n_1 + n_2 = n;
4.2.4) Take the filters corresponding to the first n_1 values of R' to obtain the important filter set S_w = {s_1, s_2, ..., s_j, ..., s_{n_1}}, and the filters corresponding to the last n_2 values of R' to obtain the unimportant filter set U_w = {u_1, u_2, ..., u_k, ..., u_{n_2}}, where s_j is the filter corresponding to the j-th value of R', j ∈ {1, 2, ..., n_1}, and u_k is the filter corresponding to the k-th remaining value, k ∈ {1, 2, ..., n_2};
4.2.5) Cut off the unimportant filter sets U_w of all convolutional layers to obtain the finely pruned network N_P; initialize N_P with the same random number seed S, then train N_P, compute its performance indexes, and save the network weight parameters with the highest F1 score, recorded as W_P.
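A sketch of the HRank-style selection in 4.2): feed a few images through, average the rank of each filter's output feature map over the batch, and keep the n_1 filters with the highest average rank. The use of torch.linalg.matrix_rank and the per-layer indexing are implementation assumptions.

```python
import torch

def average_feature_map_ranks(feature_maps):
    """feature_maps: (B, C, H, W) outputs of one conv layer for a few images.
    Returns r_i, the average rank of filter i's feature maps over the batch."""
    b, c, _, _ = feature_maps.shape
    ranks = torch.zeros(c)
    for i in range(c):
        for j in range(b):
            ranks[i] += torch.linalg.matrix_rank(feature_maps[j, i]).float()
    return ranks / b

def split_filters(ranks, n_keep):
    """Sort filters by descending average rank; keep the first n_keep, prune the rest."""
    order = torch.argsort(ranks, descending=True)
    return order[:n_keep].tolist(), order[n_keep:].tolist()   # (S_w, U_w) indices
```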
Step 5: Quantize the network parameters to obtain the final SAR target detection model.
After pruning, the network parameters are further compressed by parameter quantization:
5.1) Quantize the weight parameters in the full-precision network modules with a quantization function to obtain the quantized weight parameters w_q:

w_q = Quan_k(w_τ)

where Quan_k(·) is a k-bit quantization function, k is the number of quantization bits, w_τ is the original network weight parameter, and w_q is the quantized network weight parameter;
5.2) Quantize the activation outputs in the full-precision network modules with the quantization function to obtain the quantized activation outputs y_q, where x is the activation function input, y is the activation output, y_q is the quantized activation output, and α is a learnable parameter used in the quantization;
5.3) Set different quantization bit ranges for the weight parameters and the activation outputs of the pruned network N_P, i.e. set the weight quantization bit range to Q_w = {4, 5, ...} and the activation quantization bit range to Q_a = {3, 4, ..., 7};
5.4) Sort the importance indexes I = {I_1, I_2, ..., I_i, ..., I_n} of all feature extraction modules in ascending order to obtain the ascending-sorted importance indexes I_o, where I_i denotes the importance index of the i-th feature extraction module and the i-th element of I_o is the i-th smallest value in I, i ∈ {1, 2, ..., n};
referring to fig. 4, in this example I_o, Q_w and Q_a are each divided into 3 subsets, and the three groups of subsets are matched one-to-one in order to form the quantization search space; that is, the feature extraction module corresponding to each value in I_o performs its quantization search within the matched Q_w subset and Q_a subset;
5.5) In each quantization search, select one value from the Q_w subset and one value from the Q_a subset corresponding to each feature extraction module as the k values of the quantization functions for its weights and activation outputs respectively, to obtain a quantized network; then fine-tune it on the enhanced training set D_aug and compute its performance indexes;
5.6) Repeat 5.5) to search for the best quantization scheme, and apply the best quantization scheme found to the finely pruned network N_P to obtain the final network N_Q;
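A sketch of the search-space construction and a simple random search over it, following the description in 5.3)–5.6) and fig. 4; the upper bound of Q_w, the number of trials, the random-search strategy and the evaluate() callback are assumptions, not details from the disclosure.

```python
import random

def build_search_space(importance, q_w=(4, 5, 6, 7, 8), q_a=(3, 4, 5, 6, 7), groups=3):
    """Sort modules by ascending importance and map each group to a slice of Q_w / Q_a,
    so that less important modules may be quantized more aggressively (assumed pairing)."""
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    space = {}
    for g in range(groups):
        members = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        w_sub = q_w[g * len(q_w) // groups:(g + 1) * len(q_w) // groups]
        a_sub = q_a[g * len(q_a) // groups:(g + 1) * len(q_a) // groups]
        for m in members:
            space[m] = (w_sub, a_sub)
    return space

def search_quantization_scheme(space, evaluate, trials=20):
    """evaluate(scheme) -> F1 of the fine-tuned quantized network (placeholder callback)."""
    best_scheme, best_f1 = None, -1.0
    for _ in range(trials):
        scheme = {m: (random.choice(w), random.choice(a)) for m, (w, a) in space.items()}
        f1 = evaluate(scheme)
        if f1 > best_f1:
            best_scheme, best_f1 = scheme, f1
    return best_scheme
```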
5.7) Because the quantized final network N_Q is small, it is difficult to train from scratch with randomly initialized parameters; therefore the weight parameters W_P of the finely pruned network are used as the pre-training parameters of the final network N_Q, and the final network weight parameters W_Q are obtained by fine-tuning training;
5.8) Replace the backbone network N of the reference model with the final network N_Q and update the network weight parameters with the final weight parameters W_Q to obtain the final SAR target detection model.
Step 6: Input the test set D_test into the final SAR target detection model to obtain accurate SAR target detection results.
The foregoing description is only an example of the present invention and is not intended to limit the invention; it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made without departing from the spirit and scope of the invention.

Claims (12)

1. A remote sensing SAR target detection method based on the combination of network pruning and parameter quantization, characterized by comprising the following steps:
(1) Obtain the pre-divided training set D_train and test set D_test from a public remote sensing SAR target detection data set, and perform data expansion on the training-set images using geometric transformations to obtain an expanded training set D_exp;
(2) In the expanded training set D_exp, count the proportions of images containing targets of different sizes and of images consisting largely of background, select the under-represented images as hard samples of the expanded training set, and apply offline data enhancement and online data enhancement to these samples in turn to obtain an enhanced training set D_aug;
(3) Construct the SAR target detection reference model:
(3a) Adjust an existing lightweight network structure, i.e. change the original classification layer into a detection layer and modify some down-sampling layers in the deep part of the network into non-down-sampling layers, and take the adjusted lightweight network as the feature extraction backbone network N; the backbone network N comprises an input layer, hidden layers and a detection layer, the hidden layers consist of several feature extraction modules with the same structure, each feature extraction module contains an optional down-sampling layer and several non-down-sampling layers, a down-sampling layer is a convolution layer, a pooling layer or a rearrangement layer, and each non-down-sampling layer is a convolution layer, a BN layer, an activation layer or a connection layer;
(3b) Use the K-Means clustering algorithm to cluster the labelled target boxes of the expanded training set D_exp to obtain prior anchor boxes for this data set, use these anchor boxes as the anchor boxes of an existing YOLO single-stage detector to obtain an SAR target detector, and connect the detector with the backbone network N to form the reference model for SAR target detection;
(4) Update the network weight parameters of the reference model:
(4a) Take the YOLO loss function as the loss function of the reference model, and initialize the weight parameters of the backbone network N with a random number seed S;
(4b) Input the enhanced training set D_aug into the initialized reference model to start training, optimize the YOLO loss function with the momentum stochastic gradient descent (SGD) algorithm to update the network weight parameters, save the network weight parameters once every 10 epochs until the set maximum number of iterations is reached, and then stop training to obtain several sets of updated network weight parameters;
(5) Evaluate the performance indexes of the reference model:
(5a) Update the reference model with each of the saved sets of network weight parameters, compute the F1 score of each updated reference model on the test set D_test, record the maximum F1 score as F1_0, and denote the corresponding network weight parameters by W_0;
(5b) Compute the parameter count P and the floating-point operations FLOPs of W_0, and record the results as P_0 and FLOPs_0 respectively;
(6) Perform coarse pruning on the backbone network N:
(6a) Set a module mask m for the n feature extraction modules in the backbone network N, and perform one-hot encoding of all the feature extraction modules of the network with the module mask m to obtain n mask subnets;
(6b) Let all n mask subnets share the weight parameters W_0, fine-tune each mask subnet, and compute its performance indexes: F1 score F1_i, parameter count P_i and floating-point operations FLOPs_i;
(6c) According to the performance indexes of each mask subnet, compute the importance index I of each feature extraction module in the backbone network N, and compare each with a set importance threshold I_thr; when I < I_thr, cut off the corresponding feature extraction module from the backbone network N to obtain the coarsely pruned backbone network N_m;
(7) Perform fine pruning on the coarsely pruned backbone network N_m:
(7a) Evaluate the importance of the filters in the convolutional layers of the coarsely pruned backbone network N_m based on the HRank method, divide them into important and unimportant filters, and cut off the unimportant filters of N_m accordingly to obtain the finely pruned network N_P;
(7b) Initialize the finely pruned network N_P with the same random number seed S, then train N_P with the same training method as in (4), and save the network weight parameters with the highest F1 score, recorded as W_P;
(8) Perform parameter quantization on the finely pruned network N_P:
(8a) Modify the feature extraction modules in the finely pruned network N_P into quantization modules;
(8b) Select quantization strategies with different precisions for the weight parameters and the activation outputs, and design a quantization search space according to the importance index I of the feature extraction modules;
(8c) Iteratively search for the best quantization scheme on the enhanced training set D_aug, and apply the best scheme found to the finely pruned network N_P to obtain the final network N_Q;
(8d) Use W_P as the pre-training parameters of the final network N_Q, and obtain the final network weight parameters W_Q by fine-tuning training;
(9) Replace the backbone network N of the reference model with the final network N_Q and update the network weight parameters with the final weight parameters W_Q to obtain the final SAR target detection model;
(10) Input the test set D_test into the final SAR target detection model to obtain accurate SAR target detection results.
2. The method of claim 1, wherein the images are data-expanded in (1) as follows:
1a) Apply translation transformation to the images of the obtained training set D_train to obtain the translated images G'_{i',j'}:

G'_{i',j'} = G_{i+x, j+y}

where G denotes the original image, (i, j) the coordinates before translation, (i', j') the coordinates after translation, and (x, y) the translation direction;
1b) Apply rotation transformation to the obtained training set images to obtain the rotated images G'_{W',Q'}:

G'_{W',Q'} = G_{W cos θ + Q sin θ, −W sin θ + Q cos θ}

where (W, Q) denotes the coordinates before rotation, (W', Q') the coordinates after rotation, and θ the rotation angle;
1c) Add both the translated images G'_{i',j'} and the rotated images G'_{W',Q'} to the training set D_train to obtain the expanded training set D_exp.
3. The method of claim 1, wherein the hard samples of the expanded training set D_exp in (2) are subjected to offline data enhancement and online data enhancement in turn, as follows:
2a) Select hard samples: in the images of the expanded training set D_exp, compute the ground-truth box area of all targets and divide all targets into small, medium and large targets using 32² and 96² as dividing points; count the number of images containing targets of each size, and manually select images containing a large amount of background noise as hard samples;
2b) Offline data enhancement: scale the images containing small targets among the hard samples, and apply background noise reduction to the images that consist largely of background, obtaining the training set after offline data enhancement;
2c) Online data enhancement: read the images of the offline-enhanced training set into memory, then apply dynamic random changes of random flipping, random expansion and random erasing to them in turn, so that the same image differs between different training epochs, obtaining the enhanced data set D_aug.
4. The method of claim 1, wherein the detection reference model loss function in (4a) is the YOLOv2 loss, which sums, over all S² grid cells (index i) and the A prior anchor boxes of each cell (index j), a coordinate regression term between the prediction box variables b and the truth box variables g, confidence terms controlled by the flag bits 1^{obj}_{ij} and 1^{noobj}_{ij} describing the overlap relation between a prior anchor box and the truth boxes, and a classification term over the target categories; here g is the coordinate variable of a truth box, b is the coordinate variable of a prediction box, i is the grid index (S² grid cells in total), j is the prior anchor box index (A prior anchor boxes per grid cell), σ(·) is the sigmoid function, and IOU(box_pred, box_truth) is the overlap ratio between the prior anchor box and the truth box; the prediction box is obtained from b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w·e^{t_w}, b_h = p_h·e^{t_h}, where t_x, t_y, t_w, t_h and t_o are the 5 outputs of the model for each prior anchor box, corresponding respectively to the x, y coordinates, width, height and confidence of the regression box, c_x and c_y are the x, y coordinates of the top-left corner of the grid cell, and p_w and p_h are the width and height of the prior anchor box.
5. The method of claim 1, wherein in (4b) the momentum stochastic gradient descent (SGD) algorithm is used to optimize the YOLO loss function and update the network weight parameters, as follows:
4b1) Set the initial iteration number t = 0, the maximum iteration number t_max and the initial velocity v_0 = 0, and randomly initialize the network weight parameters θ_0;
4b2) At each iteration, randomly select one training sample (x_t^(i), y_t^(i)) from a batch of training samples (X_t, Y_t)_n of size n, and evaluate the YOLO loss function with this training sample and θ_t as its arguments to obtain the current loss L(θ_t), where i ∈ {1, 2, ..., n};
4b3) At the t-th iteration, update the network weight parameters θ with the following expressions to obtain the updated parameters θ_{t+1}:

θ_{t+1} = θ_t − v_t

v_t = α·v_{t−1} + η·∇_θ L(θ_t)

where v_t is the accumulated velocity at time t (at the first iteration v_t = v_0), θ_t is the model weight parameter at time t (at the first iteration θ_t = θ_0), α is the momentum with value 0.9, and η is the learning rate;
4b4) Repeat 4b2) and 4b3); when the set maximum iteration number t_max is reached, stop updating to obtain the final network weight parameters θ_{t_max}.
6. The method of claim 1, wherein in (5a) the F1 score of each updated reference model is computed on the test set D_test as follows:
5a1) Input the test set D_test into each updated reference model to obtain the target detection results of each updated reference model;
5a2) From the target detection results of each updated reference model, obtain the following variable values:
true positives TP: the number of targets that are correctly detected;
false positives FP: the number of targets that are detected incorrectly, i.e. false alarms;
false negatives FN: the number of targets that are not identified, i.e. missed detections;
5a3) Using these three variable values, compute the accuracy Pr and recall Re of each updated reference model:

Pr = TP / (TP + FP)

Re = TP / (TP + FN)

5a4) Using the computed accuracy Pr and recall Re, compute the F1 score of each updated reference model:

F1 = 2 · Pr · Re / (Pr + Re)
7. The method of claim 1, wherein in (5b) the parameter count P and floating-point operations FLOPs of the network weight parameters W_0 that maximize the F1 score are computed as follows:
5b1) Compute the parameter count P_i of each convolutional layer in the backbone network N corresponding to W_0:

P_i = K² × C_in × C_out

where K is the convolution kernel size, C_in the number of input channels and C_out the number of output channels;
5b2) Sum the parameter counts P_i of all convolutional layers in the backbone network N corresponding to W_0 to obtain the total parameter count P_0 of W_0:

P_0 = Σ_{i=1}^{N} P_i

where N is the number of convolutional layers in the backbone network N;
5b3) Compute the floating-point operations FLOPs_i of each convolutional layer in the backbone network N corresponding to W_0:

FLOPs_i = 2K² × C_in × C_out × H_out × W_out

where K is the convolution kernel size, H_out the output feature map height and W_out the output feature map width;
5b4) Sum the floating-point operations FLOPs_i of all convolutional layers in the backbone network N corresponding to W_0 to obtain the total floating-point operations FLOPs_0 of W_0:

FLOPs_0 = Σ_{i=1}^{N} FLOPs_i
8. The method of claim 1, wherein in (6b) all feature extraction modules of the backbone network N are one-hot encoded with the module mask as follows:
6b1) Set a module mask m = {m_1, m_2, ..., m_i, ..., m_n}, where m_i corresponds to the i-th feature extraction module in the backbone network N, m_i ∈ {0, 1}, i = 1, 2, ..., n, and n denotes the number of feature extraction modules in the backbone network N;
6b2) Let m_j = 1 for j ≠ i and m_i = 0 to obtain a mask subnet in which the i-th feature extraction module is masked;
6b3) Repeat 6b2) to obtain n mask subnets.
9. The method according to claim 1, wherein in (6c) the importance index I of each feature extraction module in the backbone network N is computed from the performance indexes of each mask subnet as follows:
6c1) Compute the differences between the performance indexes of each mask subnet and the performance indexes of the reference model:

ΔF1_i = F1_0 − F1_i

ΔP_i = P_0 − P_i

ΔFLOPs_i = FLOPs_0 − FLOPs_i

6c2) Define the importance index I_i of the i-th feature extraction module in the backbone network N as a combination of ΔF1_i, ΔP_i and ΔFLOPs_i weighted by the constants α, β and γ, which are the influence factors of ΔF1_i, ΔP_i and ΔFLOPs_i respectively.
10. The method according to claim 1, wherein in (7a) the importance of the filters in each convolutional layer of the coarsely pruned backbone network N_m is evaluated based on the HRank method, and the filters are divided into important filters and unimportant filters, as follows:
7a1) Select a few images from the training set D_aug and input them into the backbone network N_m, and calculate the average ranks R = {r_1, r_2, ..., r_i, ..., r_n} of the feature maps output by each convolutional layer of the network, where r_i is the average rank of the feature map output by the ith filter w_i, i ∈ {1, 2, ..., n}, and n denotes the number of filters in the layer;
7a2) Sort R in descending order to obtain the sorted average ranks of the convolutional layer output feature maps R^o = {r^o_1, r^o_2, ..., r^o_n}, where r^o_i denotes the ith largest value in R;
7a3) Set the number n_1 of filters to be retained and the number n_2 of filters to be pruned, where n_1 + n_2 = n;
7a4) From R^o, select the filters corresponding to the first n_1 values to obtain the important filter set W^I = {w^I_1, ..., w^I_j, ..., w^I_{n_1}}, and select the filters corresponding to the last n_2 values to obtain the unimportant filter set W^U = {w^U_1, ..., w^U_k, ..., w^U_{n_2}}, where w^I_j is the filter corresponding to the jth value in R^o, j ∈ {1, 2, ..., n_1}, and w^U_k is the filter corresponding to the kth value counted from the end of R^o, k ∈ {1, 2, ..., n_2}.
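A minimal PyTorch sketch of step 7a) follows; how the feature maps are collected (here, passed in as a single tensor) and the tensor shapes are assumptions for illustration.

```python
# Sketch of step 7a): estimate the average rank of each filter's output feature
# map over a few images, then split the filters by descending average rank.
import torch

def average_feature_map_ranks(feature_maps: torch.Tensor) -> torch.Tensor:
    """feature_maps: (num_images, num_filters, H, W) -> average rank per filter."""
    ranks = torch.linalg.matrix_rank(feature_maps.float())  # (num_images, num_filters)
    return ranks.float().mean(dim=0)                         # r_i for each filter

def split_filters(avg_ranks: torch.Tensor, n1: int):
    """Return (important, unimportant) filter index lists, ranked by average rank."""
    order = torch.argsort(avg_ranks, descending=True)
    return order[:n1].tolist(), order[n1:].tolist()

maps = torch.randn(8, 16, 32, 32)   # 8 images, 16 filters, 32x32 feature maps (hypothetical)
keep, prune = split_filters(average_feature_map_ranks(maps), n1=12)
```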
11. The method of claim 1, wherein in (8a) the full-precision network module is converted into a quantized network module as follows:
8b1) Quantize the weight parameters in the full-precision network module with a quantization function to obtain the quantized weight parameters w_q:
w_q = Quan_k(w_τ)
where Quan_k is the quantization function, expressed as:
Quan_k(x) = round((2^k - 1) × x) / (2^k - 1)
k is the number of quantization bits;
w_τ is the original network weight parameter and w_q is the quantized network weight parameter;
8b2) Quantize the activation output in the full-precision network module with the quantization function to obtain the quantized activation output:
y = clip(x, 0, α)
y_q = α × Quan_k(y / α)
where x is the activation function input, y is the clipped activation output, y_q is the quantized activation output, and α is a learnable parameter.
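A minimal PyTorch sketch of step 8b) is given below, assuming a standard uniform k-bit quantizer and a learnable clipping bound α for activations; the exact formulas and the min–max rescaling of weights are assumptions, not taken verbatim from the claim.

```python
# Sketch of step 8b): uniform k-bit quantization of weights and activations.
import torch

def quan_k(x: torch.Tensor, k: int) -> torch.Tensor:
    """Uniform k-bit quantizer for inputs in [0, 1]."""
    levels = 2 ** k - 1
    return torch.round(x * levels) / levels

def quantize_weights(w: torch.Tensor, k: int) -> torch.Tensor:
    # Assumption: rescale weights to [0, 1] before quantizing, then map back.
    w_min, w_max = w.min(), w.max()
    w_norm = (w - w_min) / (w_max - w_min + 1e-12)
    return quan_k(w_norm, k) * (w_max - w_min) + w_min

def quantize_activation(x: torch.Tensor, alpha: torch.Tensor, k: int) -> torch.Tensor:
    y = torch.minimum(torch.relu(x), alpha)   # clip activations to [0, alpha]
    return alpha * quan_k(y / alpha, k)

w_q = quantize_weights(torch.randn(64, 32, 3, 3), k=4)
a_q = quantize_activation(torch.randn(1, 32, 52, 52), alpha=torch.tensor(6.0), k=3)
```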
12. The method of claim 1, wherein in (8b) quantization strategies of different precisions are selected for the weight parameters and the activation outputs, and the quantization search space is designed according to the importance indices I of the feature extraction modules, as follows:
8a1) Set the quantization-bit value range of the weight parameters Q_w = {4, 5, ...} and the quantization-bit value range of the activation outputs Q_a = {3, 4, ..., 7};
8a2) Sort the importance indices I = {I_1, I_2, ..., I_i, ..., I_n} of all feature extraction modules in ascending order to obtain the sorted importance indices I^o = {I^o_1, I^o_2, ..., I^o_n}, where I_i denotes the importance index of the ith feature extraction module and I^o_i denotes the ith smallest value in I, i ∈ {1, 2, ..., n};
8a3) Divide I^o, Q_w and Q_a into x subsets respectively and make the three sets of subsets correspond to each other in order, forming the quantization search space; that is, for each value in I^o, a quantization search is performed in its corresponding Q_w subset and Q_a subset.
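A minimal sketch of step 8a3) follows, pairing bit-width candidate subsets with importance-sorted feature extraction modules. The subset split, the example importance values, and the weight-bit range used in the example are assumptions for illustration (the weight-bit range is only partially legible in the source).

```python
# Sketch of step 8a3): map each module, ordered by ascending importance, to one
# subset of weight-bit candidates and one subset of activation-bit candidates.
def build_search_space(importance, q_w, q_a, x):
    """Return {module_index: (weight-bit subset, activation-bit subset)}."""
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    def split(seq, parts):
        n = len(seq)
        return [seq[p * n // parts:(p + 1) * n // parts] for p in range(parts)]
    w_subsets, a_subsets = split(q_w, x), split(q_a, x)
    space = {}
    for pos, module_idx in enumerate(order):
        group = min(pos * x // len(order), x - 1)   # less important -> fewer bits
        space[module_idx] = (w_subsets[group], a_subsets[group])
    return space

space = build_search_space(importance=[0.2, 0.9, 0.5, 0.1],
                           q_w=[4, 5, 6, 7, 8], q_a=[3, 4, 5, 6, 7], x=2)
print(space)
```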
CN202111488427.7A 2021-12-08 2021-12-08 Remote sensing SAR target detection method based on combination of network pruning and parameter quantification Pending CN114170512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111488427.7A CN114170512A (en) 2021-12-08 2021-12-08 Remote sensing SAR target detection method based on combination of network pruning and parameter quantification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111488427.7A CN114170512A (en) 2021-12-08 2021-12-08 Remote sensing SAR target detection method based on combination of network pruning and parameter quantification

Publications (1)

Publication Number Publication Date
CN114170512A true CN114170512A (en) 2022-03-11

Family

ID=80484098

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111488427.7A Pending CN114170512A (en) 2021-12-08 2021-12-08 Remote sensing SAR target detection method based on combination of network pruning and parameter quantification

Country Status (1)

Country Link
CN (1) CN114170512A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781640A (en) * 2022-06-16 2022-07-22 阿里巴巴达摩院(杭州)科技有限公司 Model deployment method, system, storage medium and electronic device
CN115439684A (en) * 2022-08-25 2022-12-06 艾迪恩(山东)科技有限公司 Household garbage classification method based on lightweight YOLOv5 and APP
CN115439684B (en) * 2022-08-25 2024-02-02 艾迪恩(山东)科技有限公司 Household garbage classification method and APP based on lightweight YOLOv5
CN116992944A (en) * 2023-09-27 2023-11-03 之江实验室 Image processing method and device based on leavable importance judging standard pruning
CN116992944B (en) * 2023-09-27 2023-12-19 之江实验室 Image processing method and device based on leavable importance judging standard pruning

Similar Documents

Publication Publication Date Title
CN110399850B (en) Continuous sign language recognition method based on deep neural network
CN114170512A (en) Remote sensing SAR target detection method based on combination of network pruning and parameter quantification
CN112101430B (en) Anchor frame generation method for image target detection processing and lightweight target detection method
CN113052211B (en) Pruning method based on characteristic rank and channel importance
CN110929577A (en) Improved target identification method based on YOLOv3 lightweight framework
CN110516095B (en) Semantic migration-based weak supervision deep hash social image retrieval method and system
CN108764138B (en) Plateau area cloud and snow classification method based on multidimensional and multi-granularity cascade forest
CN109871749B (en) Pedestrian re-identification method and device based on deep hash and computer system
CN115131760B (en) Lightweight vehicle tracking method based on improved feature matching strategy
CN115100709B (en) Feature separation image face recognition and age estimation method
CN104318271B (en) Image classification method based on adaptability coding and geometrical smooth convergence
CN116206185A (en) Lightweight small target detection method based on improved YOLOv7
CN116824585A (en) Aviation laser point cloud semantic segmentation method and device based on multistage context feature fusion network
CN115393690A (en) Light neural network air-to-ground observation multi-target identification method
CN114627424A (en) Gait recognition method and system based on visual angle transformation
CN114547365A (en) Image retrieval method and device
CN117217282A (en) Structured pruning method for deep pedestrian search model
CN117392406A (en) Low-bit-width mixed precision quantization method for single-stage real-time target detection model
CN112132207A (en) Target detection neural network construction method based on multi-branch feature mapping
CN116152644A (en) Long-tail object identification method based on artificial synthetic data and multi-source transfer learning
CN113327227B (en) MobileneetV 3-based wheat head rapid detection method
Liu et al. Target detection of hyperspectral image based on faster R-CNN with data set adjustment and parameter turning
Azzawi et al. Face recognition based on mixed between selected feature by multiwavelet and particle swarm optimization
Zheng et al. A real-time face detector based on an end-to-end CNN
CN112733925A (en) Method and system for constructing light image classification network based on FPCC-GAN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination