CN113408723A - Convolutional neural network pruning and quantization synchronous compression method for remote sensing application - Google Patents


Info

Publication number
CN113408723A
Authority
CN
China
Prior art keywords
model
training
pruning
quantization
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110545477.8A
Other languages
Chinese (zh)
Other versions
CN113408723B (en)
Inventor
陈禾
齐保贵
陈亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202110545477.8A
Publication of CN113408723A
Application granted
Publication of CN113408723B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Compared with other convolutional neural network pruning and quantization compression methods, the convolutional neural network pruning and quantization synchronous compression method for remote sensing application integrates the model pruning and quantization processes, prunes and quantizes the convolutional neural network model synchronously, and reduces the precision loss after model compression while improving the compression ratio of the model parameters. Retraining the pruned and quantized model yields more accurate parameter values and improves the network precision. A regularity constraint is imposed on the encoding, so that when the convolutional neural network model is implemented on an actual remote sensing platform processor, convolution kernels at the same position in different filters of the same layer are pruned together, which increases the generality of the computing units and the parallelism of the computation. The compressed model can be ported to resource-limited platform processors such as airborne and satellite-borne platforms.

Description

Convolutional neural network pruning and quantization synchronous compression method for remote sensing application
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a convolutional neural network compression method for remote sensing application.
Background
With the wide application of convolutional neural network models in remote sensing data processing, the demand for deploying such models on mobile platforms such as airborne, satellite-borne, and vehicle-mounted platforms keeps growing. On platforms with strict resource constraints, the selectable special-purpose devices offer limited resources, and reliability must be guaranteed through measures such as multi-mode redundancy, so the resources available for implementing a convolutional neural network model are very limited. High-performance convolutional neural networks usually have a large number of parameters and floating-point operations, are difficult to deploy on resource-limited platforms, and therefore require parameter compression work such as pruning and quantization. However, current convolutional neural network pruning and quantization compression methods suffer from a large loss of model precision after compression: both pruning and quantization affect model accuracy, the precision losses of the two steps accumulate, and pruning first and quantizing afterwards leads to a large overall loss of model accuracy.
Disclosure of Invention
In view of this, the present invention provides a convolutional neural network compression method for remote sensing applications, which improves the compression ratio of model parameters and ensures the accuracy of the compressed model.
A convolution neural network pruning and quantization synchronous compression method for remote sensing application comprises the following steps:
S1, training the parameters of the convolutional neural network with data containing remote sensing images to obtain a model M2;
S2, encoding the pruning and quantization parameters of the model M2, specifically as follows:
for the parameters of each layer in the convolutional neural network, the encoding is defined as:
O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n)
wherein n represents the number of parameters in each layer of the network; p_1, p_2, ..., p_n take the value 0 or 1 and indicate whether each parameter is pruned, where 0 means the parameter is pruned and 1 means the parameter is retained and not pruned; q_1, q_2, ..., q_n represent the number of quantization bits of each parameter;
S3, generating a set number of initial populations according to the encoding of step S2, each population containing a plurality of individuals;
S4, applying parameter pruning and quantization to the model M2 according to the encoding of each individual in the populations of step S3, thereby obtaining one model per individual, and training the parameter values of these models with remote sensing image data to obtain the models after parameter training;
S5, evaluating the precision of the models obtained after parameter training, removing the individuals with the worst precision from each population, randomly changing the encodings of the remaining individuals, and crossing and/or migrating the encodings among individuals to generate new individuals and populations;
S6, repeating step S5 until the set condition is met, and then stopping the training of the model parameters;
during the training process, the parameter values of the model used in the current round of training are the parameter values obtained in the previous round of training;
and S7, evaluating the precision of the models obtained in S6 and retaining the individuals with the highest precision, thereby completing the pruning and quantization of the convolutional neural network.
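To make the flow of steps S1 to S7 concrete, the following is a minimal, self-contained Python sketch of the population-based search loop. It is an illustrative assumption rather than part of the claimed method: the toy encoding, the toy accuracy function, and all names in it (for example toy_accuracy, search) are placeholders, and real use would replace the stand-ins with actual pruning, quantization, training, and evaluation on remote sensing data.

    import random

    def toy_accuracy(individual):
        # stand-in for "prune and quantize M2 per the encoding, train briefly, evaluate"
        p, q = individual
        return sum(p) / len(p) + 0.01 * random.random()

    def search(n_params=16, n_populations=5, pop_size=20, rounds=10, eliminate=5):
        # S3: random initial populations of encodings O_k = (p_1..p_n, q_1..q_n)
        pops = [[([random.randint(0, 1) for _ in range(n_params)],
                  [random.choice((4, 8)) for _ in range(n_params)])
                 for _ in range(pop_size)]
                for _ in range(n_populations)]
        for _ in range(rounds):                                  # S5-S6 loop
            for pop in pops:
                pop.sort(key=toy_accuracy, reverse=True)         # precision evaluation
                del pop[len(pop) - eliminate:]                   # remove the worst individuals
                children = [([x for x in p], [x for x in q])     # copy some survivors
                            for p, q in random.sample(pop, eliminate)]
                for p, q in children:                            # random encoding change
                    i = random.randrange(n_params)
                    p[i] = 1 - p[i]
                    q[i] = random.choice((4, 8)) if p[i] else 0
                pop.extend(children)
        # S7: keep the most accurate individual of each population
        return [max(pop, key=toy_accuracy) for pop in pops]

    best = search()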
Further, the retraining of the individuals with the highest precision retained in S7 is specifically as follows:
firstly, a convolutional neural network model with floating-point parameters is constructed according to the pruning result of S7, and the parameters of this network model are trained with the training data;
then, on the basis of the parameters of the pruned model, the quantized parameters of each layer of the model are trained in turn, specifically: the parameters of the first layer of the network model are quantized according to the parameter quantization result of S7 while the other layers keep floating-point parameters, and the whole convolutional neural network is trained from the pruned model parameters; after this training, the quantized parameters of the first layer are fixed, the parameters of the second layer are quantized according to the result of S7 while the third and later layers keep floating-point parameters, and the network is trained again; and so on until the last layer of the network, completing the final training of the model parameters.
Further, in S2, the encoding satisfies a regularity constraint, specifically: when pruning is constrained to the filter level of the convolutional neural network, the parameters in the same filter require the same pruning code; when pruning is constrained to the convolution kernel level, all parameters in the same convolution kernel require the same pruning code.
Preferably, in S3, the initial populations and individuals are generated as follows:
under the condition that the parameter compression ratio is satisfied, the values of the elements in the encoding O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n) of each layer of the network are assigned randomly;
or, an existing pruning or quantization algorithm is used to obtain a pruned or quantized model structure, and the encodings of the individuals in the initial populations are assigned according to that structure to obtain the individuals.
Preferably, the set condition in S6 is: training is stopped when the precision, the parameter count, and the computation amount of any individual reach the specified thresholds, or when a predetermined number of training rounds is reached.
Preferably, when the model parameters are trained, every model is trained for the same number of cycles, using the smallest number of training rounds under which the performance of the models can still be distinguished.
Preferably, when the model precision is evaluated in S5, if two models have the same precision, the individual with more model parameters is eliminated.
Preferably, when the model accuracy is evaluated in S5, the evaluation method is selected according to different task types.
Preferably, in S5, encoding crossover is performed between different individuals in the same population, the best individual in each population is migrated to the other populations, and the individuals and populations are updated.
Preferably, in S1, when the remote sensing image data is insufficient, the network is first trained with natural scene image data to obtain a pre-trained model M1; the pre-trained model M1 is then trained with the remote sensing image data to obtain the model M2.
The invention has the following beneficial effects:
compared with other convolutional neural network pruning and quantization compression methods, the convolutional neural network pruning and quantization synchronous compression method for remote sensing application integrates the model pruning and quantization processes, realizes synchronous pruning and quantization of the convolutional neural network model, and reduces the precision loss after model compression while improving the compression ratio of model parameters;
the model after pruning and quantization is retrained, so that more accurate parameter values can be obtained, and the network precision is improved;
setting a rule degree constraint condition to be observed during coding, namely, when a convolutional neural network model is realized in an actual remote sensing platform processor, pruning convolutional kernels of different filters in the same layer at the same position simultaneously to improve the universality degree of a computing unit and improve the parallelism degree of computation;
the compressed model can be transplanted and applied to platform processors with limited resources such as airborne and satellite-borne platforms.
Detailed Description
The present invention will be described in detail with reference to examples.
In order to solve the precision loss and related problems of traditional convolutional neural network pruning and quantization compression methods, the invention, building on existing convolutional neural network compression algorithms, prunes and quantizes the model synchronously, improving the compression ratio of the model parameters while ensuring the precision of the compressed model.
The invention provides a convolutional neural network pruning and quantization synchronous compression method for remote sensing application, which mainly comprises the following steps:
S1, building a deep convolutional neural network model and training its parameters with natural scene image data to obtain a pre-trained model M1;
S2, training the parameters of the pre-trained model M1 with remote sensing image data to obtain a model M2;
pre-training the model with a large amount of natural scene image data in scenarios where remote sensing image data is scarce improves the feature-extraction capability of the model; when the amount of remote sensing image data is not smaller than the amount of natural scene image data, the pre-training of step S1 is unnecessary.
S3, encoding the pruning and quantization parameters of the model M2; the encoding scheme is designed as follows:
pruning of a model parameter has two possibilities, pruned or not pruned, while quantization allows a range of quantization bit widths; for a given layer in the network, the encoding is defined as:
O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n)
where n represents the number of parameters in each layer of the network. p_1 to p_n take the value 0 or 1: 0 means the parameter is pruned, and 1 means the parameter is retained and not pruned. q_1 to q_n represent the number of quantization bits of each parameter and are chosen from the desired range of bit widths, e.g. 4 to 8.
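As a purely illustrative aid (an assumption, not part of the patent description), the per-layer encoding can be represented as two lists, a pruning code p and a bit-width code q, and a random encoding under a target keep ratio might be generated as in the following Python sketch:

    import random

    def random_layer_encoding(n_params, keep_ratio=0.5, bit_choices=(4, 5, 6, 7, 8)):
        """p[i] is 1 if parameter i is kept and 0 if it is pruned;
        q[i] is the quantization bit width of parameter i (0 for pruned parameters)."""
        p = [1 if random.random() < keep_ratio else 0 for _ in range(n_params)]
        q = [random.choice(bit_choices) if keep else 0 for keep in p]
        return p, q

    p, q = random_layer_encoding(n_params=8)
    print("pruning code p:", p)
    print("bit widths  q:", q)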
S4, randomly generating N initial populations (generally more than 5) according to the encoding scheme of step S3, each population containing K individuals (generally more than 20), under the condition that the encodings of the populations satisfy the regularity constraint. The random population generation scheme is designed according to the intended model compression approach; two typical methods of generating the initial populations and individuals are:
first, under the condition that the set parameter compression ratio is satisfied, randomly assigning values to the elements of the encoding O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n) of each layer of the network;
second, obtaining a pruned or quantized model structure with an existing pruning or quantization algorithm, and assigning the encodings of the initial population individuals according to that structure.
The regularity constraint must be satisfied during encoding. Pruning can be classified by granularity into the filter level, the convolution kernel level, the kernel parameter level, and so on. When a convolutional neural network model is implemented on an actual remote sensing platform processor, pruning convolution kernels at the same position in different filters of the same layer improves the generality of the computing units and the parallelism of the computation, so the regularity constraint is set flexibly according to the hardware computing characteristics of the processor. For example, when pruning is constrained to the filter level, the parameters in the same filter require the same pruning code; when pruning is constrained to the convolution kernel level, all parameters in the same convolution kernel require the same pruning code. When a position of the pruning code is 0, the quantization code at the corresponding position is also set to 0, e.g. when p_1 = 0, then q_1 = 0.
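The following self-contained sketch (an illustrative assumption, not taken from the patent) shows one way such a kernel-level regularity constraint could be enforced when generating a layer encoding: every filter prunes the convolution kernel at the same input-channel position, all parameters inside a kernel share one pruning code, and q is forced to 0 wherever p is 0.

    import random

    def constrained_layer_encoding(out_channels, in_channels, kh, kw,
                                   keep_ratio=0.5, bit_choices=(4, 8)):
        # one pruning decision per input-channel position, shared by all filters of the layer
        keep_position = [1 if random.random() < keep_ratio else 0 for _ in range(in_channels)]
        p, q = [], []
        for _ in range(out_channels):            # every filter uses the same pruned positions
            for c in range(in_channels):         # one convolution kernel per position
                bits = random.choice(bit_choices) if keep_position[c] else 0
                p.extend([keep_position[c]] * (kh * kw))   # same pruning code inside a kernel
                q.extend([bits] * (kh * kw))               # p = 0 at a position implies q = 0
        return p, q

    p, q = constrained_layer_encoding(out_channels=2, in_channels=3, kh=3, kw=3)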
S5, training the parameters of the models corresponding to the K individual encodings in each of the N populations of step S4 with remote sensing image data (the same data as in step S2);
during training, each individual retains the parameters of model M2, and pruning and quantization are applied on that basis. Every model is trained for the same number of cycles, and the number of training rounds is kept as small as possible while still allowing the performance of the models to be distinguished, so as to reduce training time.
S6, evaluating the precision of the models obtained in S5, eliminating the individuals with the worst precision in each population, randomly changing the encodings of the remaining individuals, crossing the encodings of different individuals within the same population (exchanging the element values at the same positions of the encodings O_k of different individuals), and migrating the best individual in each population to the other populations, thereby adding new individuals to the populations.
The precision evaluation method is selected according to the task type; for example, for an image classification task, the classification accuracy of the model is used as the precision criterion. When two models have the same precision, the individual with more model parameters is eliminated. When the encodings are changed randomly, the regularity constraint of S4 must still be satisfied.
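A simplified, self-contained sketch of this S6 update is given below as an assumption; the fitness function, mutation rate, and helper names are placeholders. The worst individuals are eliminated, the survivors are mutated, encodings are crossed within a population by exchanging element values at the same positions, and the best individual of each population is migrated to the others.

    import random

    BITS = (4, 5, 6, 7, 8)

    def mutate(individual, rate=0.05):
        p, q = individual
        for i in range(len(p)):
            if random.random() < rate:
                p[i] = 1 - p[i]
            if random.random() < rate:
                q[i] = random.choice(BITS)
            q[i] = q[i] if p[i] else 0           # pruned positions keep a zero bit width

    def crossover(ind_a, ind_b):
        """Exchange the element values of two encodings from a random position onwards."""
        cut = random.randrange(len(ind_a[0]))
        for part in (0, 1):                      # swap both the p segment and the q segment
            ind_a[part][cut:], ind_b[part][cut:] = ind_b[part][cut:], ind_a[part][cut:]

    def update_populations(populations, fitness, n_eliminate=2):
        """fitness(ind) should return e.g. (accuracy, -parameter_count) so that
        ties in accuracy are broken in favour of the smaller model."""
        for pop in populations:
            pop.sort(key=fitness, reverse=True)
            del pop[len(pop) - n_eliminate:]     # eliminate the worst individuals
            for ind in pop:
                mutate(ind)
            crossover(*random.sample(pop, 2))
        best = [max(pop, key=fitness) for pop in populations]
        for i, pop in enumerate(populations):    # migrate each population's best individual
            pop.extend([list(b[0]), list(b[1])] for j, b in enumerate(best) if j != i)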
S7, training the parameters of the new populations and individuals obtained in S6, and repeating S6;
during this training, each individual retains the parameter values it inherited, and training and parameter evaluation proceed from the parameters of the previous training round. If a parameter at some position was pruned in the previous round and therefore has no corresponding value, the search traces back through earlier rounds until the corresponding parameter value is found, ultimately falling back to the value of the corresponding parameter in model M2.
S8, stopping training when the precision, the parameter count, and the computation amount of any individual reach the specified thresholds, or when a predetermined number of training rounds is reached;
S9, evaluating the precision of the models obtained in S8 and retaining the T individuals with the highest precision, thereby completing the pruning and quantization of the convolutional neural network.
At this point, whether each parameter of the pruned and quantized model is pruned, and its bit width, are determined; the specific parameter values still need to be trained and optimized, specifically as follows:
retraining is performed layer by layer. First, a convolutional neural network model with floating-point parameters is constructed according to the pruning result of S9, and the network model is trained with the training data; this training of the floating-point pruned model may follow steps S1 and S2 in sequence. Then, on the basis of the parameters of the pruned model, the quantized parameters of each layer of the model are trained in turn. For example, the parameters of the first layer of the network model are quantized according to the parameter quantization result of S9 while the other layers of the network remain floating point, and the whole convolutional neural network is trained from the pruned model parameters; after that, the quantized parameters of the first layer are fixed, the parameters of the second layer are quantized according to the quantization result of S9 while the third and later layers remain floating point, and the network is trained again; and so on until the last layer of the network, completing the training of the model parameters.
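The following is a toy, self-contained sketch of this layer-by-layer procedure, offered as an assumption for illustration; the quantizer is a simple uniform one and the retraining step is only a placeholder for actual fine-tuning in which the already quantized layers would stay fixed.

    def quantize(weights, bits):
        """Uniform symmetric quantization of a list of floats to the given bit width."""
        scale = max(abs(w) for w in weights) or 1.0
        levels = 2 ** (bits - 1) - 1
        return [round(w / scale * levels) * scale / levels for w in weights]

    def retrain(layers, frozen):
        """Placeholder for fine-tuning: only layers not in `frozen` would be updated."""
        return layers

    def layerwise_quantization_training(layers, bit_widths):
        frozen = set()
        for k in range(len(layers)):
            layers[k] = quantize(layers[k], bit_widths[k])   # quantize layer k per the S9 result
            layers = retrain(layers, frozen)                 # later layers stay floating point
            frozen.add(k)                                    # fix layer k's quantized parameters
        return layers

    layers = [[0.4, -0.2, 0.05], [0.9, -0.7], [0.3, 0.1, -0.6]]
    print(layerwise_quantization_training(layers, bit_widths=[8, 6, 4]))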
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A convolutional neural network pruning and quantization synchronous compression method for remote sensing application is characterized by comprising the following steps:
S1, training the parameters of the convolutional neural network with data containing remote sensing images to obtain a model M2;
S2, encoding the pruning and quantization parameters of the model M2, specifically as follows:
for the parameters of each layer in the convolutional neural network, the encoding is defined as:
O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n)
wherein n represents the number of parameters in each layer of the network; p_1, p_2, ..., p_n take the value 0 or 1 and indicate whether each parameter is pruned, where 0 means the parameter is pruned and 1 means the parameter is retained and not pruned; q_1, q_2, ..., q_n represent the number of quantization bits of each parameter;
S3, generating a set number of initial populations according to the encoding of step S2, each population containing a plurality of individuals;
S4, applying parameter pruning and quantization to the model M2 according to the encoding of each individual in the populations of step S3, thereby obtaining one model per individual, and training the parameter values of these models with remote sensing image data to obtain the models after parameter training;
S5, evaluating the precision of the models obtained after parameter training, removing the individuals with the worst precision from each population, randomly changing the encodings of the remaining individuals, and crossing and/or migrating the encodings among individuals to generate new individuals and populations;
S6, repeating step S5 until the set condition is met, and then stopping the training of the model parameters;
during the training process, the parameter values of the model used in the current round of training are the parameter values obtained in the previous round of training;
and S7, evaluating the precision of the models obtained in S6 and retaining the individuals with the highest precision, thereby completing the pruning and quantization of the convolutional neural network.
2. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1, wherein the retraining of the individuals with the highest precision retained in S7 is specifically:
firstly, a convolutional neural network model with floating-point parameters is constructed according to the pruning result of S7, and the parameters of this network model are trained with the training data;
then, on the basis of the parameters of the pruned model, the quantized parameters of each layer of the model are trained in turn, specifically: the parameters of the first layer of the network model are quantized according to the parameter quantization result of S7 while the other layers keep floating-point parameters, and the whole convolutional neural network is trained from the pruned model parameters; after this training, the quantized parameters of the first layer are fixed, the parameters of the second layer are quantized according to the result of S7 while the third and later layers keep floating-point parameters, and the network is trained again; and so on until the last layer of the network, completing the final training of the model parameters.
3. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein in S2 the encoding satisfies a regularity constraint, specifically: when pruning is constrained to the filter level of the convolutional neural network, the parameters in the same filter require the same pruning code; when pruning is constrained to the convolution kernel level, all parameters in the same convolution kernel require the same pruning code.
4. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein in S3 the initial populations and individuals are generated as follows:
under the condition that the parameter compression ratio is satisfied, the values of the elements in the encoding O_k = (p_1, p_2, ..., p_n, q_1, q_2, ..., q_n) of each layer of the network are assigned randomly;
or, an existing pruning or quantization algorithm is used to obtain a pruned or quantized model structure, and the encodings of the individuals in the initial populations are assigned according to that structure to obtain the individuals.
5. The convolutional neural network pruning and quantization synchronous compression method for remote sensing applications as claimed in claim 1 or 2, wherein the set condition in S6 is: training is stopped when the precision, the parameter count, and the computation amount of any individual reach the specified thresholds, or when a predetermined number of training rounds is reached.
6. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein when the model parameters are trained, every model is trained for the same number of cycles, using the smallest number of training rounds under which the performance of the models can still be distinguished.
7. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein when the model precision is evaluated in S5, if two models have the same precision, the individual with more model parameters is eliminated.
8. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein when evaluating model accuracy in S5, the evaluation method is selected according to different task types.
9. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein in S5, encoding crossover is performed between different individuals in the same population, the best individual in each population is migrated to the other populations, and the individuals and populations are updated.
10. The convolutional neural network pruning and quantization synchronous compression method for remote sensing application as claimed in claim 1 or 2, wherein in S1, when the remote sensing image data is insufficient, the network is first trained with natural scene image data to obtain a pre-trained model M1; the pre-trained model M1 is then trained with the remote sensing image data to obtain the model M2.
CN202110545477.8A 2021-05-19 2021-05-19 Convolutional neural network pruning and quantization synchronous compression method for remote sensing application Active CN113408723B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545477.8A CN113408723B (en) 2021-05-19 2021-05-19 Convolutional neural network pruning and quantization synchronous compression method for remote sensing application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110545477.8A CN113408723B (en) 2021-05-19 2021-05-19 Convolutional neural network pruning and quantization synchronous compression method for remote sensing application

Publications (2)

Publication Number Publication Date
CN113408723A true CN113408723A (en) 2021-09-17
CN113408723B CN113408723B (en) 2023-04-07

Family

ID=77678936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545477.8A Active CN113408723B (en) 2021-05-19 2021-05-19 Convolutional neural network pruning and quantization synchronous compression method for remote sensing application

Country Status (1)

Country Link
CN (1) CN113408723B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102017124573A1 (en) * 2016-10-21 2018-04-26 Nvidia Corporation SYSTEMS AND METHOD FOR CRITING NEURONAL NETWORKS FOR AN OPERATIONAL EFFICIENT CONCLUSION
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN110197257A (en) * 2019-05-28 2019-09-03 浙江大学 A kind of neural network structure Sparse methods based on increment regularization
CN110378468A (en) * 2019-07-08 2019-10-25 浙江大学 A kind of neural network accelerator quantified based on structuring beta pruning and low bit
CN111382863A (en) * 2018-12-28 2020-07-07 上海欧菲智能车联科技有限公司 Neural network compression method and device
CN111652366A (en) * 2020-05-09 2020-09-11 哈尔滨工业大学 Combined neural network model compression method based on channel pruning and quantitative training
CN111738401A (en) * 2019-03-25 2020-10-02 北京三星通信技术研究有限公司 Model optimization method, grouping compression method, corresponding device and equipment
CN112016674A (en) * 2020-07-29 2020-12-01 魔门塔(苏州)科技有限公司 Knowledge distillation-based convolutional neural network quantification method
CN112215353A (en) * 2020-09-29 2021-01-12 电子科技大学 Channel pruning method based on variational structure optimization network
CN112651499A (en) * 2020-12-28 2021-04-13 浙江大学 Structural model pruning method based on ant colony optimization algorithm and interlayer information

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102017124573A1 (en) * 2016-10-21 2018-04-26 Nvidia Corporation SYSTEMS AND METHOD FOR CRITING NEURONAL NETWORKS FOR AN OPERATIONAL EFFICIENT CONCLUSION
CN111382863A (en) * 2018-12-28 2020-07-07 上海欧菲智能车联科技有限公司 Neural network compression method and device
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN111738401A (en) * 2019-03-25 2020-10-02 北京三星通信技术研究有限公司 Model optimization method, grouping compression method, corresponding device and equipment
CN110197257A (en) * 2019-05-28 2019-09-03 浙江大学 A kind of neural network structure Sparse methods based on increment regularization
CN110378468A (en) * 2019-07-08 2019-10-25 浙江大学 A kind of neural network accelerator quantified based on structuring beta pruning and low bit
CN111652366A (en) * 2020-05-09 2020-09-11 哈尔滨工业大学 Combined neural network model compression method based on channel pruning and quantitative training
CN112016674A (en) * 2020-07-29 2020-12-01 魔门塔(苏州)科技有限公司 Knowledge distillation-based convolutional neural network quantification method
CN112215353A (en) * 2020-09-29 2021-01-12 电子科技大学 Channel pruning method based on variational structure optimization network
CN112651499A (en) * 2020-12-28 2021-04-13 浙江大学 Structural model pruning method based on ant colony optimization algorithm and interlayer information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
孙彦丽 et al.: "Convolutional Neural Network Compression Method Based on Pruning and Quantization" (基于剪枝与量化的卷积神经网络压缩方法), Computer Science (《计算机科学》) *
郑哲: "Sparsity- and Quantization-Driven Deep Convolutional Network Compression" (稀疏与量化驱动的深度卷积网络压缩), China Masters' Theses Full-text Database (Electronic Journals), Information Science and Technology (《中国优秀硕士学位论文全文数据库(电子期刊)信息科技辑》) *

Also Published As

Publication number Publication date
CN113408723B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
RU2747445C1 (en) Data processing device, data processing method and data carrier
CN111275187A (en) Compression method and device of deep neural network model
CN113163203B (en) Deep learning feature compression and decompression method, system and terminal
WO2023207039A1 (en) Data processing method and apparatus, and device and storage medium
CN116524299A (en) Image sample generation method, device, equipment and storage medium
CN112488304A (en) Heuristic filter pruning method and system in convolutional neural network
CN115115815A (en) Training method, device and system for feature extraction network of three-dimensional grid model
WO2021038793A1 (en) Learning system, learning method, and program
CN113408723B (en) Convolutional neural network pruning and quantization synchronous compression method for remote sensing application
CN109063835B (en) Neural network compression device and method
CN112132062B (en) Remote sensing image classification method based on pruning compression neural network
CN113408704A (en) Data processing method, device, equipment and computer readable storage medium
CN110287706B (en) Security detection system and method for mimicry defense system
CN109558819B (en) Depth network lightweight method for remote sensing image target detection
Kwak et al. Quantization aware training with order strategy for CNN
CN115759238A (en) Method and device for generating quantization model, electronic equipment and storage medium
CN112200275B (en) Artificial neural network quantification method and device
CN116245162A (en) Neural network pruning method and system based on improved adaptive genetic algorithm
CN112712855A (en) Joint training-based clustering method for gene microarray containing deletion value
CN113033804A (en) Convolution neural network compression method for remote sensing image
CN113592085A (en) Nuclear pruning method, device, equipment and medium based on high-rank convolution graph
Lindmar et al. Intrinsic sparse LSTM using structured targeted dropout for efficient hardware inference
CN113569479A (en) Long-term multi-step control method, device and storage medium for rock fracture development of stone cave temple
CN111783936A (en) Convolutional neural network construction method, device, equipment and medium
CN111143641A (en) Deep learning model training method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant