CN116245162A - Neural network pruning method and system based on improved adaptive genetic algorithm - Google Patents

Neural network pruning method and system based on improved adaptive genetic algorithm

Info

Publication number
CN116245162A
CN116245162A
Authority
CN
China
Prior art keywords
neural network
network
pruning
value
individuals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211492198.0A
Other languages
Chinese (zh)
Inventor
袁沁
孙闽红
包建荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202211492198.0A priority Critical patent/CN116245162A/en
Publication of CN116245162A publication Critical patent/CN116245162A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/086: Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/12: Computing arrangements based on biological models using genetic models
    • G06N3/126: Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention belongs to the technical field of neural networks, and particularly relates to a neural network pruning method and system based on an improved adaptive genetic algorithm. The pruning method comprises the following steps: S1, pre-training a convolutional neural network to obtain trained network parameter weights; S2, searching for the optimal network structure of each layer with the improved adaptive genetic algorithm and pruning redundant convolution kernels; S3, retraining to recover network accuracy; S4, moving to the next layer and repeating steps S2-S3; S5, once all convolutional layers have been pruned, obtaining the neural network with the optimal structure. While maintaining the accuracy of the neural network model, the invention reduces the number of convolution kernels in the convolutional layers, cuts redundant parameters, accelerates computation, and achieves model compression.

Description

Neural network pruning method and system based on improved adaptive genetic algorithm
Technical Field
The invention belongs to the technical field of neural networks, and particularly relates to a neural network pruning method and system based on an improved self-adaptive genetic algorithm.
Background
Neural networks have enjoyed great success in computer vision applications, for example image classification, face recognition, object detection, and machine translation. However, complex tasks require deeper neural networks, and as the depth of a network model increases, the number of parameters and the amount of computation multiply. Ever-deeper network models are difficult to deploy on embedded devices such as smartphones or wearable devices. Although the latest hardware provides acceleration frameworks dedicated to neural networks, there is still a need to reduce model size to save memory, while keeping the impact on model accuracy small.
Neural networks contain complex structures and enormous numbers of parameters. Pruning can remove this redundancy, that is, the unnecessary or unimportant parameters and network structures, and can be classified into unstructured pruning and structured pruning according to the granularity of the pruned parameters.
Unstructured pruning is the finest-granularity pruning and is sometimes referred to as weight pruning. It is simple to implement but has limitations: deleting a small number of scattered parameters does little to reduce network storage and computation, and unstructured pruning tends to produce large sparse matrices, which usually require specialized, costly hardware before any computational speedup is realized.
Structured pruning is coarser-grained than unstructured pruning and, ordered from fine to coarse granularity, can be roughly classified into convolution kernel pruning, filter pruning, and layer pruning. It gives the network model structured sparsity, which is very beneficial for saving computing resources in hardware systems such as embedded computers and parallel computing systems. However, determining the redundant structure is an NP-hard problem; most existing methods rely on criteria such as the L1 or L2 norm, or introduce sparsity penalty terms during training. These approaches tend to have high computational complexity and limited model-compression effect, making the optimal structure difficult to find.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a neural network pruning method and a neural network pruning system based on an improved self-adaptive genetic algorithm, and aims to find an optimal structure of a neural network, reduce the number of parameters, reduce the network performance loss and achieve the effect of compressing a model.
In order to achieve the above object, the present invention adopts the following technical scheme:
a neural network pruning method based on an improved adaptive genetic algorithm comprises the following steps:
s1, pre-training a convolutional neural network to obtain trained network parameter weights;
s2, searching an optimal network structure of each layer by utilizing an improved self-adaptive genetic algorithm, and pruning a redundant convolution kernel;
s3, a retraining process is carried out, and network precision is recovered;
s4, searching the next layer, and repeating the steps S2-S3;
s5, trimming all the convolution layers, and obtaining the neural network with the optimal structure.
Preferably, in the step S2, the improved adaptive genetic algorithm includes the following steps:
s21, setting the maximum number of iterations r, the population size m, the number of parents k, and the hyperparameter ω;
s22, coding a convolution kernel matrix of a convolution layer into binary codes, wherein each code is a chromosome, and carrying out random value taking on an initial chromosome to generate a population with m individuals;
s23, calculating the fitness value of each individual;
s24, selecting k individuals as parents by using a roulette selection method;
s25, randomly selecting two of the best individuals to perform a crossover operation, wherein each position has probability P_c of being exchanged, generating two offspring;
s26, carrying out a mutation operation on each offspring, wherein each coding position of each offspring mutates with probability P_m, producing new individuals;
s27, repeating the steps S23-S26, and enabling the k individuals to enter the next iteration until the maximum iteration number is reached.
In a preferred embodiment, in the step S22, a series of binary code strings is randomly generated during encoding, the number of strings being equal to the population size; the expression is:
S = {s_1, s_2, ..., s_n}
where n is the number of convolution kernels in the layer, s_i is the code value of the ith convolution kernel, i = 1, 2, ..., n, and s_i ∈ {0, 1}: s_i = 1 means the convolution kernel is kept, and s_i = 0 means it is deleted.
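This encoding step lends itself to a direct implementation; below is a minimal sketch using the numpy library that the detailed description later names, where the function name and example hyperparameter values are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def init_population(m, n, rng=None):
    """Randomly generate m binary code strings of length n.

    Each row S = {s_1, ..., s_n} is one chromosome (individual);
    s_i = 1 keeps the ith convolution kernel, s_i = 0 deletes it.
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.integers(0, 2, size=(m, n), dtype=np.int8)

# example: a population of 20 individuals for a layer with 64 kernels
population = init_population(m=20, n=64)
```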
Preferably, in the step S23, the fitness value f_j of the jth individual is calculated as:
[equation not reproduced in the source: f_j combines the model loss L and the parameter ratio ρ_j/ρ_0, weighted by ω]
where ρ_j is the number of network parameters retained by the jth individual's code, ρ_0 is the number of initial network parameters before encoding, ω is a constant in the interval (0, 1) that weights the loss function against the network-structure term, and L is the loss function for model accuracy, obtained by computing the cross entropy of the outputs:
[equation not reproduced in the source: L is the cross entropy computed from the outputs y_pq]
where y_pq denotes the qth output value of the pth convolutional layer.
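Because the fitness equation itself did not survive in this text, the sketch below assumes one plausible combination consistent with the stated definitions — ω trading off a parameter-reduction term 1 - ρ_j/ρ_0 against a loss term — purely as an illustration; the patent's exact formula may differ:

```python
def fitness(loss_L, rho_j, rho_0, omega):
    """Hedged sketch of the fitness f_j (exact formula not shown in
    this text). Assumption: fitness grows as the loss L shrinks and
    as more parameters are pruned, weighted by omega in (0, 1)."""
    structure_term = 1.0 - rho_j / rho_0   # fraction of parameters removed
    loss_term = 1.0 / (1.0 + loss_L)       # assumed monotone reward for low loss
    return omega * structure_term + (1.0 - omega) * loss_term
```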
In a preferred embodiment, in the step S24, k individuals are selected as parents by roulette-wheel selection: the probability that an individual appears in the offspring is computed from its fitness value, so individuals with larger fitness values are more likely to be selected, while individuals with smaller fitness values tend not to be; the expression is:
P_j = f_j / Σ_{i=1}^{m} f_i
where P_j is the probability that the jth individual is selected.
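A minimal roulette-wheel selection consistent with the expression above (an illustrative sketch; it assumes fitness values are non-negative):

```python
import numpy as np

def roulette_select(population, fitness_values, k, rng=None):
    """Select k parents with probability P_j = f_j / sum(f)."""
    rng = np.random.default_rng() if rng is None else rng
    f = np.asarray(fitness_values, dtype=float)
    idx = rng.choice(len(population), size=k, p=f / f.sum())
    return population[idx]
```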
Preferably, in step S25, two chromosomes are recombined by single-point crossover, i.e., a position is selected at random and the right-hand parts are cut and swapped, where the crossover probability P_c is:
[equation not reproduced in the source: the adaptive P_c is formed from k_1, k_2, k_3 and f_c, f_max, f_min]
where k_1, k_2, k_3 are constants between 0 and 1 with k_2 > k_3, f_c is the higher fitness value of the two parents entering the crossover operation, and f_max and f_min are the maximum and minimum fitness in the population.
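The single-point crossover itself is unambiguous; the adaptive probability below is a stand-in, since the patent's exact P_c expression did not survive extraction. The assumed form interpolates linearly between k_2 and k_3 as f_c moves from f_min to f_max, scaled by k_1, so fitter parents are disturbed less:

```python
import numpy as np

def adaptive_pc(f_c, f_max, f_min, k1, k2, k3):
    """Assumed adaptive crossover probability (illustrative only):
    decreases from k1*k2 toward k1*k3 as f_c approaches f_max."""
    if f_max == f_min:
        return k1 * k2
    return k1 * (k2 - (k2 - k3) * (f_c - f_min) / (f_max - f_min))

def single_point_crossover(a, b, p_c, rng=None):
    """With probability p_c, cut both chromosomes at a random point
    and swap the right-hand parts, producing two offspring."""
    rng = np.random.default_rng() if rng is None else rng
    a, b = a.copy(), b.copy()
    if rng.random() < p_c:
        cut = int(rng.integers(1, len(a)))     # cut point in 1..n-1
        a[cut:], b[cut:] = b[cut:].copy(), a[cut:].copy()
    return a, b
```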
Preferably, in the step S26, the mutation probability P_m is:
[equation not reproduced in the source: the adaptive P_m is formed from k_4, k_5, k_6 and f_c, f_max, f_min]
where k_4, k_5, k_6 are constants between 0 and 1 with k_5 > k_6, f_c is the higher fitness value of the two parents before the mutation operation, and f_max and f_min are the maximum and minimum fitness in the population.
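A matching sketch for mutation, with the same caveat: the assumed P_m mirrors the crossover form using k_4, k_5, k_6, while the bit-flip itself follows the description of fig. 4:

```python
import numpy as np

def adaptive_pm(f_c, f_max, f_min, k4, k5, k6):
    """Assumed adaptive mutation probability (illustrative only),
    mirroring the crossover case with constants k4, k5, k6."""
    if f_max == f_min:
        return k4 * k5
    return k4 * (k5 - (k5 - k6) * (f_c - f_min) / (f_max - f_min))

def mutate(chromosome, p_m, rng=None):
    """Flip each coding position independently with probability p_m
    (1 becomes 0, 0 becomes 1), producing a new individual."""
    rng = np.random.default_rng() if rng is None else rng
    flips = rng.random(len(chromosome)) < p_m
    return np.where(flips, 1 - chromosome, chromosome).astype(chromosome.dtype)
```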
Preferably, in the step S3, the retraining process includes the following steps:
s31, inputting a data set into a neural network by applying the network structure of the offspring obtained last time, and calculating a loss function value L of the data set;
s32, updating the network parameter weights by gradient descent:
W' = W - η·∂L/∂W
where W' is the weight after the update, W is the weight before the update, η is the learning rate, and ∂L/∂W is the partial derivative of L with respect to W;
s33, repeating the steps S31-S32 until the loss function L converges.
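A minimal PyTorch version of the retraining loop S31-S33 (the detailed description later names PyTorch for building the networks); the model, data_loader, learning rate, and convergence tolerance here are placeholders, not values from the patent:

```python
import torch

def retrain(model, data_loader, lr=0.01, tol=1e-4, max_epochs=50):
    """S31-S33: feed the dataset through the pruned network, update the
    weights by gradient descent W' = W - lr * dL/dW, and stop once the
    epoch loss L stops changing by more than tol."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for inputs, targets in data_loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)   # S31: loss value L
            loss.backward()                            # dL/dW
            optimizer.step()                           # S32: weight update
            epoch_loss += loss.item()
        if abs(prev_loss - epoch_loss) < tol:          # S33: convergence
            break
        prev_loss = epoch_loss
    return model
```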
The invention also provides a neural network pruning system based on the improved adaptive genetic algorithm, which applies the neural network pruning method according to any one of the above schemes and comprises the following modules:
the pre-training module is used for pre-training the convolutional neural network to obtain trained network parameter weights;
the pruning module is used for searching the optimal network structure of each layer by utilizing the improved self-adaptive genetic algorithm and pruning the redundant convolution kernel;
the retraining module is used for retraining the pruned neural network and recovering network precision;
and the output module is used for outputting the neural network with the optimal structure obtained after all the convolution layers are trimmed.
Compared with the prior art, the invention has the following advantages:
1. Compared with methods that train the model with sparsity penalty terms, LASSO operators, and the like, using a pre-trained model allows the training process to be skipped, reducing computation time;
2. The invention uses a genetic algorithm to search for the optimal structure of each layer. As a computational simulation technique, the genetic algorithm has strong global search capability and performs probability-based global optimization. In existing research, most pruning methods measure the importance of a network structure with criteria such as its L1 norm, L2 norm, or entropy, which have high computational complexity. Compared with these methods, the present method requires no strict mathematical derivation, imposes no restrictions on the functions used, can adaptively adjust its search path, and is broadly applicable to a variety of problems.
3. Compared with an ordinary genetic algorithm, the invention does not require the crossover probability, mutation probability, and other hyperparameters to be set in advance; these parameters change automatically according to fitness values during crossover and mutation, so no manual tuning is needed and efficiency is higher.
4. The invention improves the fitness calculation of the genetic algorithm: unlike other genetic-algorithm-based pruning methods, which generally use only the model's loss function as the fitness value, the fitness here also accounts for the network structure, so that compression and accuracy are balanced.
Drawings
FIG. 1 is a flow chart of a neural network pruning method based on an improved adaptive genetic algorithm;
FIG. 2 is a diagram of the encoding process in the genetic algorithm of the present invention;
FIG. 3 is a cross-process diagram in the genetic algorithm of the present invention;
FIG. 4 is a graph showing the mutation process in the genetic algorithm of the present invention.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention, specific embodiments of the present invention will be described below with reference to the accompanying drawings. It is evident that the drawings in the following description are only examples of the invention, from which other drawings and other embodiments can be obtained by a person skilled in the art without inventive effort.
The neural network pruning method based on the improved adaptive genetic algorithm works as follows: a convolutional neural network is pre-trained to obtain trained network parameter weights; the improved adaptive genetic algorithm then searches for the optimal structure of each layer and prunes redundant convolution kernels; when the maximum number of iterations is reached, the model is retrained to recover network accuracy, and the search moves to the next layer, until all convolutional layers have been pruned and the neural network with the optimal structure is obtained. The invention mainly addresses the redundant parameters and large computational cost of neural networks. The main flow is: pre-train the neural network to obtain trained parameter values; then, using the improved adaptive genetic algorithm, encode the convolution kernels of each convolutional layer to form many candidate individuals, compute each individual's fitness value, select population members with a selection strategy, and obtain diverse individuals through crossover and mutation, iterating until the maximum number of iterations is reached; retrain the model to recover part of the accuracy, then move to the next layer and search for its optimal structure with the genetic algorithm again, until all convolutional layers have been processed and the final structure is obtained as the optimal solution. While maintaining the accuracy of the neural network model, the invention reduces the number of convolution kernels in the convolutional layers, cuts redundant parameters, accelerates computation, and achieves model compression.
As shown in fig. 1, the specific process of the neural network pruning method based on the improved adaptive genetic algorithm in the embodiment of the invention comprises the following steps:
s1, the pre-training process of the invention is as follows: datasets such as MNIST, CIFAR-10, and CIFAR-100 are downloaded from open-source websites, the convolutional neural network is trained until the weights become stable, and the network parameters are saved locally.
S2, searching an optimal network structure of each layer by utilizing an improved self-adaptive genetic algorithm, and pruning a redundant convolution kernel; the adaptive genetic algorithm searching process of the embodiment of the invention comprises the following steps:
(1) First, set the maximum number of iterations r, the population size m, the number of parents k, and the hyperparameter ω.
(2) The convolution kernel matrix of the convolutional layer is encoded into binary codes; each code is a chromosome, and the initial chromosomes take random values to generate a population of m individuals. That is, the convolution kernels of the current convolutional layer are encoded: m binary code strings are generated according to the number of convolution kernels in the layer, with 1s and 0s assigned at random, as shown in fig. 2, where 1 means the convolution kernel is kept (dark gray portion in fig. 2) and 0 means it is pruned (white portion in fig. 2). This can be written as S = {s_1, s_2, ..., s_n}, each code string representing an individual, where n is the number of convolution kernels in the layer, s_i is the code value of the ith convolution kernel, i = 1, 2, ..., n, and s_i ∈ {0, 1}: s_i = 1 means the convolution kernel is kept, and s_i = 0 means it is deleted.
(3) Calculating fitness value of each individual:
[equation not reproduced in the source: f_j combines the model loss L and the parameter ratio ρ_j/ρ_0, weighted by ω]
where ρ_j is the number of network parameters retained by the jth individual's code, ρ_0 is the number of initial network parameters before encoding, ω is a constant in the interval (0, 1) that weights the loss function against the network-structure term, and L is the model-accuracy loss, which can be computed from the cross entropy of the outputs:
[equation not reproduced in the source: L is the cross entropy computed from the outputs y_pq]
where y_pq denotes the qth output value of the pth convolutional layer. The higher the fitness value, the better the model and the more excellent the chromosome's genes;
(4) Select k individuals as parents: sum the fitness values of all individuals and compute each individual's probability of appearing in the offspring from its fitness value; the larger the fitness, the more likely the individual is selected, and individuals with small fitness tend not to be selected. The expression is:
P_j = f_j / Σ_{i=1}^{m} f_i
where P_j is the probability that the jth individual is selected.
(5) Recombine two chromosomes by single-point crossover: a position is chosen at random and the parts to its right are cut and swapped, as shown in fig. 3, where the crossover probability P_c is adaptive:
[equation not reproduced in the source: the adaptive P_c is formed from k_1, k_2, k_3 and f_c, f_max, f_min]
where k_1, k_2, k_3 are constants between 0 and 1 with k_2 > k_3, f_c is the higher fitness value of the two parents entering the crossover operation, and f_max and f_min are the maximum and minimum fitness in the population. Crossover generates two offspring that inherit the good genes of their parents, ready for the next round of selection and evaluation.
(6) In the mutation strategy, mutation is performed at random positions of each individual: a position coded 1 becomes 0, and a position coded 0 becomes 1, as shown in fig. 4, where the mutation probability P_m is adaptive:
[equation not reproduced in the source: the adaptive P_m is formed from k_4, k_5, k_6 and f_c, f_max, f_min]
where k_4, k_5, k_6 are constants between 0 and 1 with k_5 > k_6, f_c is the higher fitness value of the two parents before the mutation operation, and f_max and f_min are the maximum and minimum fitness in the population. Mutation changes part of the chromosome's genes, preventing the algorithm from falling into a locally optimal solution;
(7) Enter the next iteration: compute the fitness values of the population individuals, repeat the selection, crossover, and mutation steps, keep the excellent individuals (network structures), and eliminate the inferior ones (one generation of this loop is assembled in the sketch following step (8));
(8) When the maximum number of iterations r is reached, the codes of the k best individuals are obtained; the code with the largest fitness value is selected, convolution kernels whose code value is 1 are kept in the network structure, and convolution kernels whose code value is 0 are removed.
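For illustration, one generation of steps (4)-(7) can be assembled from the helper sketches suggested earlier; the constants kc and km are placeholders, and roulette selection is inlined so that the parents' fitness values stay paired with their indices:

```python
import numpy as np

def next_generation(pop, fits, k, kc=(0.9, 0.8, 0.3), km=(0.2, 0.1, 0.02),
                    rng=None):
    """One GA iteration: roulette-select k parent indices, then fill a
    new population with adaptively crossed and mutated offspring.
    Relies on adaptive_pc, single_point_crossover, adaptive_pm, and
    mutate from the sketches above."""
    rng = np.random.default_rng() if rng is None else rng
    fits = np.asarray(fits, dtype=float)
    f_max, f_min = fits.max(), fits.min()
    parent_idx = rng.choice(len(pop), size=k, p=fits / fits.sum())
    children = []
    while len(children) < len(pop):
        i, j = rng.choice(parent_idx, size=2)
        f_c = max(fits[i], fits[j])                 # better parent's fitness
        a, b = single_point_crossover(
            pop[i], pop[j], adaptive_pc(f_c, f_max, f_min, *kc), rng)
        p_m = adaptive_pm(f_c, f_max, f_min, *km)
        children.extend([mutate(a, p_m, rng), mutate(b, p_m, rng)])
    return np.array(children[:len(pop)])
```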
S3, retraining: the network is trained again to recover part of the accuracy. The retraining process specifically comprises the following steps:
s31, inputting a data set into a neural network by applying the network structure of the offspring obtained last time, and calculating a loss function value L of the data set;
s32, updating the network parameter weights by gradient descent:
W' = W - η·∂L/∂W
where W' is the weight after the update, W is the weight before the update, η is the learning rate, and ∂L/∂W is the partial derivative of L with respect to W;
s33, repeating the steps S31-S32 until the loss function L converges.
S4, searching the next layer, and repeating the steps S2-S3.
And S5, after all convolutional layers have been processed, the final structure is retained as the optimal network structure.
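As a hedged sketch of how the retained code string might be applied to a layer in PyTorch — here by zeroing pruned kernels in place; a real implementation could instead rebuild the layer with fewer output channels. The function name is illustrative, not from the patent:

```python
import torch

def apply_code_to_layer(conv, code):
    """Keep convolution kernels whose code value is 1 and zero out
    those whose code value is 0.

    conv : torch.nn.Conv2d with out_channels == len(code)
    code : binary sequence taken from the fittest individual
    """
    mask = torch.as_tensor(list(code), dtype=conv.weight.dtype)
    with torch.no_grad():
        conv.weight.mul_(mask.view(-1, 1, 1, 1))   # mask output kernels
        if conv.bias is not None:
            conv.bias.mul_(mask)
```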
The neural network pruning method of the embodiment of the invention can be implemented in Python; the encoding, computation, and other operations involved in the algorithm are implemented with the numpy library.
The network structures mainly used in the embodiment of the invention are LeNet, ResNet-32, and AlexNet, which can be built with the PyTorch library.
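As a small self-contained illustration of the quantities ρ_0 and ρ_j used by the fitness function — the layer shape here is arbitrary, not taken from the patent:

```python
import numpy as np
import torch

conv = torch.nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
rho_0 = conv.weight.numel()                    # initial weight count

code = np.random.default_rng(0).integers(0, 2, size=16)   # one individual
rho_j = int(code.sum()) * conv.weight[0].numel()           # weights kept

print(f"rho_0 = {rho_0}, rho_j = {rho_j}, "
      f"pruned fraction = {1 - rho_j / rho_0:.2f}")
```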
The embodiment of the invention also provides a neural network pruning system based on the improved self-adaptive genetic algorithm, which comprises a pre-training module, a pruning module, a retraining module and an output module.
Specifically, the pre-training module of the embodiment of the invention is used for pre-training the convolutional neural network to obtain trained network parameter weights: datasets such as MNIST, CIFAR-10, and CIFAR-100 are downloaded from open-source websites, the convolutional neural network is trained until its weights become stable, and the network parameters are saved locally.
The pruning module of the embodiment of the invention is used for searching the optimal network structure of each layer by utilizing the improved self-adaptive genetic algorithm and pruning the redundant convolution kernel.
Specifically, the adaptive genetic algorithm searching process of the embodiment of the invention is as follows:
(1) First, set the maximum number of iterations r, the population size m, the number of parents k, and the hyperparameter ω.
(2) Encode the convolution kernels of the current convolutional layer: m binary code strings are generated according to the number of convolution kernels in the layer, with 1s and 0s assigned at random, as shown in fig. 2, where 1 means the convolution kernel is kept (dark gray portion in fig. 2) and 0 means it is pruned (white portion in fig. 2). This can be written as S = {s_1, s_2, ..., s_n}, each code string representing an individual, where n is the number of convolution kernels in the layer, s_i is the code value of the ith convolution kernel, i = 1, 2, ..., n, and s_i ∈ {0, 1}: s_i = 1 means the convolution kernel is kept, and s_i = 0 means it is deleted.
(3) Calculating fitness value of each individual:
[equation not reproduced in the source: f_j combines the model loss L and the parameter ratio ρ_j/ρ_0, weighted by ω]
where ρ_j is the number of network parameters retained by the jth individual's code, ρ_0 is the number of initial network parameters before encoding, ω is a constant in the interval (0, 1) that weights the loss function against the network-structure term, and L is the model-accuracy loss, which can be computed from the cross entropy of the outputs:
[equation not reproduced in the source: L is the cross entropy computed from the outputs y_pq]
where y_pq denotes the qth output value of the pth convolutional layer. The higher the fitness value, the better the model and the more excellent the chromosome's genes;
(4) Select k individuals as parents: sum the fitness values of all individuals and compute each individual's probability of appearing in the offspring from its fitness value; the larger the fitness, the more likely the individual is selected, and individuals with small fitness tend not to be selected. The expression is:
P_j = f_j / Σ_{i=1}^{m} f_i
where P_j is the probability that the jth individual is selected.
(5) Recombine two chromosomes by single-point crossover: partition at a randomly selected position and swap the right-hand parts, as shown in fig. 3, where the crossover probability P_c is adaptive:
[equation not reproduced in the source: the adaptive P_c is formed from k_1, k_2, k_3 and f_c, f_max, f_min]
where k_1, k_2, k_3 are constants between 0 and 1 with k_2 > k_3, f_c is the higher fitness value of the two parents entering the crossover operation, and f_max and f_min are the maximum and minimum fitness in the population. Crossover generates two offspring that inherit the good genes of their parents, ready for the next round of selection and evaluation.
(6) In the mutation strategy, mutation is performed at random positions of each individual: a position coded 1 becomes 0, and a position coded 0 becomes 1, as shown in fig. 4, where the mutation probability P_m is adaptive:
[equation not reproduced in the source: the adaptive P_m is formed from k_4, k_5, k_6 and f_c, f_max, f_min]
where k_4, k_5, k_6 are constants between 0 and 1 with k_5 > k_6, f_c is the higher fitness value of the two parents before the mutation operation, and f_max and f_min are the maximum and minimum fitness in the population. Mutation changes part of the chromosome's genes, preventing the algorithm from falling into a locally optimal solution;
(7) Enter the next iteration: compute the fitness values of the population individuals, repeat the selection, crossover, and mutation steps, keep the excellent individuals (network structures), and eliminate the inferior ones;
(8) When the maximum number of iterations r is reached, the codes of the k best individuals are obtained; the code with the largest fitness value is selected, convolution kernels whose code value is 1 are kept in the network structure, and convolution kernels whose code value is 0 are removed.
The retraining module is used for retraining the pruned neural network and recovering network accuracy. The specific retraining process is as follows:
(1) Applying the last obtained network structure of the offspring, inputting the data set into the neural network, and calculating the loss function value L of the data set;
(2) The network parameter weights are updated by gradient descent:
W' = W - η·∂L/∂W
where W' is the weight after the update, W is the weight before the update, η is the learning rate, and ∂L/∂W is the partial derivative of L with respect to W;
(3) Repeating the steps (1) to (2) until the loss function L converges.
The output module of the embodiment of the invention is used for outputting the neural network with the optimal structure obtained after trimming all the convolution layers.
The foregoing is only illustrative of the preferred embodiments and principles of the present invention, and changes in specific embodiments will occur to those skilled in the art upon consideration of the teachings provided herein, and such changes are intended to be included within the scope of the invention as defined by the claims.

Claims (9)

1. The neural network pruning method based on the improved adaptive genetic algorithm is characterized by comprising the following steps of:
s1, pre-training a convolutional neural network to obtain trained network parameter weights;
s2, searching an optimal network structure of each layer by utilizing an improved self-adaptive genetic algorithm, and pruning a redundant convolution kernel;
s3, a retraining process is carried out, and network precision is recovered;
s4, searching the next layer, and repeating the steps S2-S3;
s5, trimming all the convolution layers, and obtaining the neural network with the optimal structure.
2. The neural network pruning method according to claim 1, wherein in the step S2, the improved adaptive genetic algorithm comprises the following steps:
s21, setting the maximum number of iterations r, the population size m, the number of parents k, and the hyperparameter ω;
s22, coding a convolution kernel matrix of a convolution layer into binary codes, wherein each code is a chromosome, and carrying out random value taking on an initial chromosome to generate a population with m individuals;
s23, calculating the fitness value of each individual;
s24, selecting k individuals as parents by using a roulette selection method;
s25, randomly selecting two of the best individuals to perform a crossover operation, wherein each position has probability P_c of being exchanged, generating two offspring;
s26, carrying out a mutation operation on each offspring, wherein each coding position of each offspring mutates with probability P_m, producing new individuals;
s27, repeating the steps S23-S26, and enabling the k individuals to enter the next iteration until the maximum iteration number is reached.
3. The neural network pruning method according to claim 2, wherein in the step S22, a series of binary code strings is randomly generated during encoding, the number of strings being equal to the population size, with the expression:
S = {s_1, s_2, ..., s_n}
where n is the number of convolution kernels in the layer, s_i is the code value of the ith convolution kernel, i = 1, 2, ..., n, and s_i ∈ {0, 1}: s_i = 1 means the convolution kernel is kept, and s_i = 0 means it is deleted.
4. The neural network pruning method according to claim 3, wherein in the step S23, the fitness value f_j of the jth individual is calculated as:
[equation not reproduced in the source: f_j combines the model loss L and the parameter ratio ρ_j/ρ_0, weighted by ω]
where ρ_j is the number of network parameters retained by the jth individual's code, ρ_0 is the number of initial network parameters before encoding, ω is a constant in the interval (0, 1) that weights the loss function against the network-structure term, and L is the loss function for model accuracy, obtained by computing the cross entropy of the outputs:
[equation not reproduced in the source: L is the cross entropy computed from the outputs y_pq]
where y_pq denotes the qth output value of the pth convolutional layer.
5. The neural network pruning method according to claim 4, wherein in the step S24, k individuals are selected as parents by roulette-wheel selection: the probability that an individual appears in the offspring is computed from its fitness value, so individuals with larger fitness values are more likely to be selected, while individuals with smaller fitness values tend not to be; the expression is:
P_j = f_j / Σ_{i=1}^{m} f_i
where P_j is the probability that the jth individual is selected.
6. The neural network pruning method according to claim 5, wherein in the step S25, two chromosomes are recombined by single-point crossover, i.e., a position is selected at random and the right-hand parts are cut and swapped, wherein the crossover probability P_c is:
[equation not reproduced in the source: the adaptive P_c is formed from k_1, k_2, k_3 and f_c, f_max, f_min]
where k_1, k_2, k_3 are constants between 0 and 1 with k_2 > k_3, f_c is the higher fitness value of the two parents entering the crossover operation, and f_max and f_min are the maximum and minimum fitness in the population.
7. The neural network pruning method according to claim 6, wherein in the step S26, the mutation probability P_m is:
[equation not reproduced in the source: the adaptive P_m is formed from k_4, k_5, k_6 and f_c, f_max, f_min]
where k_4, k_5, k_6 are constants between 0 and 1 with k_5 > k_6, f_c is the higher fitness value of the two parents before the mutation operation, and f_max and f_min are the maximum and minimum fitness in the population.
8. The neural network pruning method according to claim 7, wherein in the step S3, the retraining process includes the steps of:
s31, inputting a data set into a neural network by applying the network structure of the offspring obtained last time, and calculating a loss function value L of the data set;
s32, updating the network parameter weights by gradient descent:
W' = W - η·∂L/∂W
where W' is the weight after the update, W is the weight before the update, η is the learning rate, and ∂L/∂W is the partial derivative of L with respect to W;
s33, repeating the steps S31-S32 until the loss function L converges.
9. A neural network pruning system based on an improved adaptive genetic algorithm, applying the neural network pruning method according to any one of claims 1-8, characterized in that the neural network pruning system comprises:
the pre-training module is used for pre-training the convolutional neural network to obtain trained network parameter weights;
the pruning module is used for searching the optimal network structure of each layer by utilizing the improved self-adaptive genetic algorithm and pruning the redundant convolution kernel;
the retraining module is used for retraining the pruned neural network and recovering network precision;
and the output module is used for outputting the neural network with the optimal structure obtained after all the convolution layers are trimmed.
CN202211492198.0A 2022-11-25 2022-11-25 Neural network pruning method and system based on improved adaptive genetic algorithm Pending CN116245162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211492198.0A CN116245162A (en) 2022-11-25 2022-11-25 Neural network pruning method and system based on improved adaptive genetic algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211492198.0A CN116245162A (en) 2022-11-25 2022-11-25 Neural network pruning method and system based on improved adaptive genetic algorithm

Publications (1)

Publication Number Publication Date
CN116245162A true CN116245162A (en) 2023-06-09

Family

ID=86630175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211492198.0A Pending CN116245162A (en) 2022-11-25 2022-11-25 Neural network pruning method and system based on improved adaptive genetic algorithm

Country Status (1)

Country Link
CN (1) CN116245162A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117668701A (en) * 2024-01-30 2024-03-08 云南迅盛科技有限公司 AI artificial intelligence machine learning system and method
CN117668701B (en) * 2024-01-30 2024-04-12 云南迅盛科技有限公司 AI artificial intelligence machine learning system and method

Similar Documents

Publication Publication Date Title
Chen et al. Chasing sparsity in vision transformers: An end-to-end exploration
CN106919942B (en) Accelerated compression method of deep convolution neural network for handwritten Chinese character recognition
Acharya et al. Online embedding compression for text classification using low rank matrix factorization
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
CN111340227A (en) Method and device for compressing business prediction model through reinforcement learning model
CN112215353B (en) Channel pruning method based on variational structure optimization network
CN113657421B (en) Convolutional neural network compression method and device, and image classification method and device
WO2021042857A1 (en) Processing method and processing apparatus for image segmentation model
CN113837376B (en) Neural network pruning method based on dynamic coding convolution kernel fusion
CN110929798A (en) Image classification method and medium based on structure optimization sparse convolution neural network
CN109543029A (en) File classification method, device, medium and equipment based on convolutional neural networks
CN116245162A (en) Neural network pruning method and system based on improved adaptive genetic algorithm
CN110826692B (en) Automatic model compression method, device, equipment and storage medium
CN113822419A (en) Self-supervision graph representation learning operation method based on structural information
CN115829024A (en) Model training method, device, equipment and storage medium
CN114463036A (en) Information processing method and device and storage medium
CN111626404A (en) Deep network model compression training method based on generation of antagonistic neural network
CN108647206A (en) Chinese spam filtering method based on chaotic particle swarm optimization CNN networks
CN112860856B (en) Intelligent problem solving method and system for arithmetic application problem
Naik et al. Survey on comparative study of pruning mechanism on mobilenetv3 model
Wang et al. Efficient deep convolutional model compression with an active stepwise pruning approach
CN111274359B (en) Query recommendation method and system based on improved VHRED and reinforcement learning
CN108417204A (en) Information security processing method based on big data
CN108388942A (en) Information intelligent processing method based on big data
CN113761834A (en) Method, device and storage medium for acquiring word vector of natural language processing model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination