CN113673697A

CN113673697A - Model pruning method and device based on adjacent convolution and storage medium

Info

Publication number: CN113673697A
Application number: CN202110975018.3A
Authority: CN
Inventors: 王晓锐; 郑强; 高鹏
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2021-08-24
Filing date: 2021-08-24
Publication date: 2021-11-19
Also published as: WO2023024407A1

Abstract

The invention relates to the technical field of artificial intelligence and discloses a model pruning method based on adjacent convolution, comprising the following steps: obtaining, by using an absolute value function, a filter Manhattan distance of a filter matrix and a channel Manhattan distance of a channel matrix in the convolutional layer to be evaluated, and from these obtaining a convolutional layer filter mode parameter and a convolutional layer channel mode parameter; multiplying the convolutional layer filter mode parameter by the convolutional layer channel mode parameter to form a filter pruning probability parameter for judging the filter pruning probability; sorting the filter pruning probability parameters according to a preset rule, and determining the filter to be pruned according to the sorting result; and pruning the determined filter. The convolution model to be pruned can be a neural network model for intelligent medical treatment. The invention achieves higher accuracy after pruning while retaining relatively good model performance of the convolution model.

Description

Model pruning method and device based on adjacent convolution and storage medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a model pruning method and device based on adjacent convolution and a computer readable storage medium.

Background

Convolutional Neural Networks (CNNs) are a class of feedforward neural networks that contain convolution computations and have a deep structure, and are one of the representative algorithms of deep learning. Although a convolutional neural network performs well, the model requires a large amount of computation and contains a large amount of redundant information, so it needs to be compressed. In the field of intelligent medical treatment, convolutional neural networks widely support functions such as auxiliary disease diagnosis, health management and remote consultation; the existing compression methods for convolutional neural network models applied to intelligent medical treatment include model pruning, quantization and distillation. Model pruning in the prior art selects and removes the relatively unimportant filters of the convolution kernels, and then fine-tunes the model from which those filters have been removed, so as to recover the accuracy lost through their removal.

In the prior art, the method for selecting the relatively unimportant filters of a convolution kernel does not consider the relationship between filters, so the redundant information in the convolution filters of the model cannot be fully mined; moreover, only a single convolutional layer is considered, and the relationship between two convolutional layers is ignored.

Therefore, a convolution pruning method that fully mines redundant information in the convolution filter of the model and fully considers the relationship between two layers of convolution is needed.

Disclosure of Invention

The invention provides a model pruning method and system based on adjacent convolution, electronic equipment and a storage medium, and mainly aims to solve the problem that in the existing intelligent medical scene, the convolution pruning process is only limited to single-layer convolution and does not consider the relationship between two layers of convolution.

In order to achieve the above object, the present invention provides a model pruning method based on adjacent convolution, applied to an electronic device, including:

obtaining a filter Manhattan distance of a filter matrix in the convolutional layer to be evaluated and a channel Manhattan distance of a channel matrix by using an absolute value function, obtaining a convolutional layer filter mode parameter according to the filter Manhattan distance, and obtaining a convolutional layer channel mode parameter according to the channel Manhattan distance;

multiplying the convolution layer filter mode parameter and the convolution layer channel mode parameter to form a filter pruning probability parameter for judging the filter pruning probability;

sorting the filter pruning probability parameters according to a preset rule, and determining a filter to be pruned according to the sorting result of the filter pruning probability parameters;

and cutting the determined filter to be pruned.

Further, preferably, the method for clipping the determined filter to be clipped includes:

acquiring a filter to be pruned, and training a pruning model based on adjacent convolution according to the filter to be pruned and a preset clipping threshold;

acquiring a mask matrix according to the original parameters of the pruning model based on the adjacent convolution; the mask matrix is consistent with the original parameter matrix size of the pruning model based on adjacent convolution, and the mask matrix is a training matrix comprising 0 and 1;

adjusting parameters of the pruning model based on the adjacent convolution by utilizing the mask matrix;

and pruning is carried out by utilizing the pruning model based on the adjacent convolution after the parameters are adjusted.

Further, preferably, the method for adjusting the parameters of the pruning model based on the adjacent convolution by using the mask matrix includes:

multiplying parameters of the neighboring convolution-based pruning model with the mask matrix;

selecting the model parameters of the pruning model whose mask value is 1, and adjusting the values of these masked model parameters through training and back propagation;

storing the model parameter values adjusted by back propagation and the corresponding matrix positions;

and acquiring the final parameters of the pruning model based on the adjacent convolution according to the model parameter values and the corresponding matrix positions thereof, and completing the adjustment of the parameters of the pruning model.

Further, preferably, the method for determining the filter to be pruned according to the sorting result of the filter pruning probability parameters in the step includes:

sorting the filter pruning probability parameters from large to small, and taking a channel whose filter pruning probability parameter is smaller than a preset threshold as a filter to be pruned; wherein the preset threshold is 1%.

Further, preferably, the loss function used together with the absolute value function and the Manhattan distance is obtained by the following formula:

wherein p = [p_0, …, p_{C-1}] is a probability distribution, and each element p_i indicates the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot encoded representation of the sample label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; C is the total number of classes.

Further, preferably, when pruning is carried out using the pruning model based on adjacent convolution after parameter adjustment, the pruning includes cutting channels in a convolution kernel, cutting the channels of the input feature map that correspond to the cut channels of the convolution kernel, and cutting the convolution kernels of the preceding layer that output the current input feature map.

Further, preferably, the input of the convolutional layer to be evaluated is an input feature map of size H × W × C_in, the convolution kernel W has size (C_in × k_h × k_w) × C_out, and the output feature map has size (H × W) × C_out; where H and W are the height and width of the output feature map, respectively.

In order to solve the above problem, the present invention further provides a model pruning apparatus based on adjacent convolution, the apparatus comprising:

the filter mode parameter and channel mode parameter acquisition unit is used for acquiring the Manhattan distance of a filter matrix in the convolutional layer to be evaluated and the Manhattan distance of a channel matrix by using an absolute value function, acquiring the mode parameter of the convolutional layer filter according to the Manhattan distance of the filter, and acquiring the channel mode parameter of the convolutional layer according to the Manhattan distance of the channel;

a filter pruning probability parameter obtaining unit, configured to multiply the convolutional layer filter mode parameter with the convolutional layer channel mode parameter to form a filter pruning probability parameter for determining a filter pruning probability;

the filter to be pruned determining unit is used for sequencing the filter pruning probability parameters according to a preset rule and determining the filter to be pruned according to a sequencing result of the filter pruning probability parameters;

and the pruning unit is used for cutting the determined filter to be pruned.

In order to solve the above problem, the present invention also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the aforementioned neighboring convolution based model pruning method.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, which stores at least one instruction, which is executed by a processor in an electronic device to implement the neighboring convolution-based model pruning method described above.

The model pruning method based on adjacent convolution provided by the invention fuses the importance of two adjacent convolutional layers, and thereby solves the problem in the prior art that convolution pruning is limited to a single convolutional layer and does not consider the relationship between two convolutional layers. The unimportant filters in a convolution can be truly identified, so that higher accuracy is achieved after pruning while relatively good model performance of the convolution model is retained.

Drawings

FIG. 1 is a schematic flow chart of a model pruning method based on adjacent convolution according to an embodiment of the present invention;

FIG. 2 is a block diagram of a logic structure of a model pruning device based on adjacent convolution according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an internal structure of an electronic device implementing a model pruning method based on adjacent convolution according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology in the application is a machine learning technology based on a convolutional neural network. Convolutional neural network based applications can be used in many different fields, such as speech recognition, medical diagnostics, testing of applications, etc.

The invention addresses the problem that, in the prior art, the method for selecting the relatively unimportant filters of a convolution kernel does not consider the relationship between filters, so that the redundant information in the convolution filters of the model cannot be fully mined. The model pruning method based on adjacent convolution fully considers both the redundant information in the convolution filters and the relationship between two convolutional layers, so that high accuracy is achieved while relatively good model performance of the convolution model is retained.

In the prior art, the relatively unimportant filters of a convolution kernel are selected in one of the following ways: 1) judging the importance of a filter according to the magnitude of the parameters (weights) of the batch normalization (BN) layer; although this is easy to understand and implement, the weights of the BN layer are a poor measure of the amount of information actually carried by the corresponding filters, so the information correlation between filters cannot be measured; 2) judging the importance of a filter according to its Manhattan distance (L1 norm) or Euclidean norm; although this is easy to understand and implement, the information correlation between filters cannot be measured by the magnitude of a single value alone; 3) judging the importance of a filter by the geometric median of the space in which the filters lie; that is, the filter closest to the geometric median of the set of all filters is found by calculation, judged to be unimportant and pruned. However, the amount of information at the geometric median cannot stand in for the amount of information in the filter.

The model pruning method based on adjacent convolution uses an NVIDIA V100 GPU as hardware and the PyTorch framework as software. PyTorch is a Python package developed by Facebook for training neural networks, and is Facebook's flagship deep learning framework. PyTorch provides a NumPy-like abstraction for representing tensors (multidimensional arrays), which can be accelerated by a GPU (Graphics Processing Unit). Through a technique called reverse-mode automatic differentiation, PyTorch allows the behavior of a network to be changed arbitrarily with zero delay or overhead. As an end-to-end machine learning framework, PyTorch offers TorchScript, distributed training, mobile support (experimental), tools and libraries, native ONNX support, a C++ front end, and cloud partners. PyTorch is compact and easy to use, and models run fast; many models may run faster than in frameworks such as TensorFlow. PyTorch is therefore suitable for the adjacent-convolution model pruning scenario of the present invention.

A conventional convolutional neural network is formed by repeatedly stacking, in this order, a convolutional layer, a BN layer (batch normalization layer) and a ReLU layer (nonlinear activation layer). Each group consisting of a convolutional layer, a batch normalization layer and a nonlinear activation layer is taken as a feature extraction unit of the convolutional neural network, and the feature extraction units are arranged in sequence along the depth direction of the network. The output feature maps of one feature extraction unit serve as the input feature maps of the next. Each cuboid in a convolutional layer is a filter, and each filter has a plurality of channels from front to back. The invention judges the importance within a convolutional layer along two dimensions: the filter and the channel.

Specifically, as an example, fig. 1 is a schematic flowchart of a model pruning method based on adjacent convolution according to an embodiment of the present invention. Referring to fig. 1, the present invention provides a model pruning method based on adjacent convolution, which may be performed by a device, which may be implemented by software and/or hardware.

In this embodiment, the model pruning method based on adjacent convolution includes: steps S110 to S140:

S110: obtaining a filter Manhattan distance of a filter matrix and a channel Manhattan distance of a channel matrix in the convolutional layer to be evaluated by using an absolute value function, obtaining a convolutional layer filter mode parameter according to the filter Manhattan distance, and obtaining a convolutional layer channel mode parameter according to the channel Manhattan distance. Specifically, filter-level sparsification is performed on the convolutional layer by adding a group lasso penalty term as a constraint to the objective function of the network.

The input of the convolutional layer to be evaluated is an input feature map of size H × W × C_in, the convolution kernel W has size (C_in × k_h × k_w) × C_out, and the output feature map has size (H × W) × C_out; where H and W are the height and width of the output feature map, respectively. As can be seen from the matrix multiplication, the corresponding rows of the convolution kernel are multiplied only by specific columns of the input feature map matrix.

The loss function used together with the absolute value function (ABS function) and the Manhattan distance is obtained by the following formula:

wherein p = [p_0, …, p_{C-1}] is a probability distribution, and each element p_i indicates the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot encoded representation of the sample label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; C is the total number of classes.
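The symbols defined above (a probability distribution p, a one-hot label y and C classes) match the standard cross-entropy loss. Assuming that this is the loss intended, a minimal sketch in plain Python follows; the function name `cross_entropy` and the clamping constant `eps` are illustrative assumptions and do not appear in the original text.

```python
import math

def cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted distribution p and a one-hot label y.

    Assumption: the loss has the standard form -sum_i y_i * log(p_i).
    eps is a hypothetical guard against log(0).
    """
    if len(p) != len(y):
        raise ValueError("p and y must both have C entries")
    # Because y is one-hot, only the true class contributes to the sum.
    return -sum(y_i * math.log(max(p_i, eps)) for p_i, y_i in zip(p, y))

# C = 3 classes; the sample belongs to class 0.
p = [0.7, 0.2, 0.1]
y = [1, 0, 0]
loss = cross_entropy(p, y)
```

With a one-hot label the loss reduces to the negative log-probability assigned to the true class.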

It should be noted that the filter-mode importance calculation focuses on the filter dimension: with the filter as the unit, an index is calculated that allows each filter to be compared with the others. The Manhattan distance (that is, the L1 norm) is used as the importance index, the parameters in the filters serve as the basis for evaluation, and the mean Manhattan distances of the parameters are compared across filters. When the mean Manhattan distance of a filter is small, the filter is of low importance: it does not play an important role in the computation of the neural network, its information is redundant, and it can be deleted in pruning. Conversely, if the mean Manhattan distance is large, the filter strongly influences the result of the computation and carries a large amount of information, so it cannot be deleted.

Take as an example a convolutional layer to be evaluated whose convolution parameters form an N × c × k × k matrix, where N is the number of filters and c is the number of channels in each filter. The absolute value function is an LV function for obtaining an absolute value. To obtain the Manhattan distance, the absolute value of every element in the c × k × k matrix of a filter is taken, and the c × k × k absolute values are then averaged; the resulting value is the index of the filter's pruning probability, that is, the index of the filter's importance. In summary, sparsifying at the filter level sparsifies the rows and columns of the 2D matrix, and the rows and columns whose values are all 0 are then cut away to reduce the dimensionality of the matrix, thereby improving the operational efficiency of the convolution model.
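The filter-mode index described above can be sketched as follows. This is a minimal illustration on nested Python lists rather than PyTorch tensors, so that it stays dependency-free; the function name `filter_mode_scores` and the toy weights are assumptions made for the example.

```python
def filter_mode_scores(weights):
    """Mean absolute value of each filter in an N x c x k x k weight array.

    `weights` is a nested list indexed as [filter][channel][row][column].
    """
    scores = []
    for filt in weights:  # one filter holds c x k x k parameters
        values = [abs(w) for channel in filt for row in channel for w in row]
        scores.append(sum(values) / len(values))
    return scores

# Toy layer with N = 2 filters, c = 2 channels and k = 2.
weights = [
    [[[1.0, -1.0], [2.0, -2.0]], [[0.5, -0.5], [1.5, -1.5]]],
    [[[0.1, -0.1], [0.0, 0.0]], [[0.2, -0.2], [0.0, 0.0]]],
]
scores = filter_mode_scores(weights)
```

In this toy layer the second filter has a much smaller mean absolute value than the first, so under the filter mode it is the more likely pruning candidate.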

The channel-mode importance calculation focuses on the channel dimension. Still taking as an example a convolutional layer to be evaluated whose convolution parameters form an N × c × k × k matrix, the filter mode divides it into N groups of c × k × k, while the channel mode divides it into c groups of N × k × k. As in the filter dimension, the mean Manhattan distance is used as the index for evaluating the importance of a channel. That is, the importance within a layer is reflected by comparing the importance of the channels in that layer. Different channels have different Manhattan distances: the larger the value, the more important the channel, and the less important channels can be removed in pruning. After several channels of the current layer have been cut, the output feature map of the current layer is reconstructed so that the information loss is minimized.
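For comparison with the filter mode, the channel-mode index can be sketched in the same way: the parameters are grouped by input channel across all N filters before averaging. The function name `channel_mode_scores` and the toy weights are illustrative assumptions, and plain nested lists are used instead of tensors.

```python
def channel_mode_scores(weights):
    """Mean absolute value of each channel in an N x c x k x k weight array.

    `weights` is a nested list indexed as [filter][channel][row][column];
    channel i gathers the N x k x k parameters at index i of every filter.
    """
    num_channels = len(weights[0])
    scores = []
    for ch in range(num_channels):
        values = [abs(w) for filt in weights for row in filt[ch] for w in row]
        scores.append(sum(values) / len(values))
    return scores

# Toy layer with N = 2 filters, c = 2 channels and k = 2.
weights = [
    [[[1.0, -1.0], [2.0, -2.0]], [[0.5, -0.5], [1.5, -1.5]]],
    [[[0.1, -0.1], [0.0, 0.0]], [[0.2, -0.2], [0.0, 0.0]]],
]
channel_scores = channel_mode_scores(weights)
```

The result has one score per channel (c values), whereas the filter mode yields one score per filter (N values).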

S120: multiplying the convolutional layer filter mode parameter by the convolutional layer channel mode parameter to form a filter pruning probability parameter for judging the filter pruning probability.

The filter pruning probability parameter, which fuses the filter mode and the channel mode, also serves as the filter importance parameter and is obtained by the following formula:

filter pruning probability parameter = convolutional layer filter mode parameter × convolutional layer channel mode parameter.

In two successive convolutions, the different channels of the second convolution are generated by the different filters of the first convolution. If the first convolution is evaluated in the filter mode and the second convolution is evaluated in the channel mode, both evaluations measure the importance of the filters of the first layer, so the two results can be fused. Specifically, filter pruning probability parameter = convolutional layer filter mode parameter × convolutional layer channel mode parameter. That is, the two results are multiplied to obtain a new, fused index that evaluates the importance of a filter more accurately. This evaluation takes into account not only the importance of the filter that generates the feature map, but also the importance of the channel that consumes the feature map.
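The fusion of the two adjacent layers can be sketched as follows, assuming that filter i of the first convolution feeds input channel i of the second. The names `mean_abs` and `fused_scores`, the nested-list layout and the toy weights are assumptions made for illustration.

```python
def mean_abs(values):
    """Mean of the absolute values of an iterable of numbers."""
    values = list(values)
    return sum(abs(v) for v in values) / len(values)

def fused_scores(first_layer, second_layer):
    """Fused importance of each filter of the first of two adjacent layers.

    first_layer:  N x c x k x k weights, indexed [filter][channel][row][col].
    second_layer: M x N x k x k weights; its input channel i is assumed to be
                  produced by filter i of first_layer.
    """
    n = len(first_layer)
    if any(len(filt) != n for filt in second_layer):
        raise ValueError("second layer needs one input channel per first-layer filter")
    # Filter mode on the first layer: one score per filter.
    filter_scores = [
        mean_abs(w for ch in filt for row in ch for w in row) for filt in first_layer
    ]
    # Channel mode on the second layer: one score per input channel.
    channel_scores = [
        mean_abs(w for filt in second_layer for row in filt[i] for w in row)
        for i in range(n)
    ]
    # Fusion: element-wise product of the two indices.
    return [f * c for f, c in zip(filter_scores, channel_scores)]

# First layer: N = 2 filters, c = 1, k = 1. Second layer: M = 1 filter, 2 input channels.
first_layer = [[[[2.0]]], [[[-1.0]]]]
second_layer = [[[[0.5]], [[-3.0]]]]
fused = fused_scores(first_layer, second_layer)
```

In this toy case the first filter scores higher in the filter mode, but its output is weighted lightly by the second layer, so the fused index ranks it below the second filter.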

S130: sorting the filter pruning probability parameters according to a preset rule, and determining a filter to be pruned according to the sorting result of the filter pruning probability parameters.

The method for determining the filter to be pruned according to the sorting result of the filter pruning probability parameters comprises the following steps: sorting the filter pruning probability parameters from large to small, and taking a channel whose filter pruning probability parameter is smaller than a preset threshold as a filter to be pruned; wherein the preset threshold is 1%.

After the importance results of the filter mode and the channel mode have been fused, the filters are sorted by the fused value and, for a preset pruning rate, a certain number of filters with relatively low fused values are removed together with their associated weights, which yields the pruned model. That is, the filter pruning probability parameters are sorted from large to small, and all channels whose importance is smaller than a preset threshold are then cut. In a specific implementation, the preset threshold may be 1%.
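Selection under a preset pruning rate can be sketched as follows. The function name `select_by_pruning_rate` is hypothetical, and truncating the filter count with `int()` is an assumption, since the text does not state how a fractional count is rounded.

```python
def select_by_pruning_rate(scores, pruning_rate):
    """Indices of the lowest-scoring filters for a pruning rate in [0, 1]."""
    if not 0.0 <= pruning_rate <= 1.0:
        raise ValueError("pruning_rate must lie in [0, 1]")
    num_to_prune = int(len(scores) * pruning_rate)  # assumption: round down
    # Sort filter indices by fused score from large to small.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # The tail of the ranking holds the least important filters.
    return sorted(ranked[len(scores) - num_to_prune:])

fused_scores = [0.9, 0.1, 0.5, 0.3]
to_prune = select_by_pruning_rate(fused_scores, pruning_rate=0.5)
```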

In one particular embodiment, the importance values within a layer of the network may be used to determine the channels that are pruned from that layer. In layer l, channels whose importance is less than p times the maximum importance in that layer are cut; that is, the set of channels cut from layer l consists of those channels whose importance is below p times the layer maximum, where p ∈ (0, 1) is the threshold. For example, if a certain convolution has four stacked channels whose importance values are calculated to be {1.5, 2.1, 0.003, 0.02}, and p is 0.01, the cut-off is 0.01 × 2.1 = 0.021, so the third and fourth channels are cut.
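The relative-threshold rule and its worked example can be checked with a short sketch. The function name `channels_below_relative_threshold` is an assumption; indices are zero-based, so the third and fourth channels appear as 2 and 3.

```python
def channels_below_relative_threshold(importance, p):
    """Zero-based indices of channels whose importance is below p * max(importance)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in the open interval (0, 1)")
    cutoff = p * max(importance)
    return [i for i, value in enumerate(importance) if value < cutoff]

# Example from the text: four channels, p = 0.01, cut-off = 0.01 * 2.1 = 0.021.
pruned = channels_below_relative_threshold([1.5, 2.1, 0.003, 0.02], p=0.01)
```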

Filter-level pruning changes only the number of filters and feature channels in the network, and the resulting model can run without any specially designed algorithm; this is called structured pruning. A module that is redundant at the current stage is not necessarily redundant for other stages. By jointly considering the importance of adjacent convolutions, the layers of the convolutional network are linked to one another, and pruning of the whole convolution model is thereby completed.

S140: pruning the determined filter to be pruned.

The layer to be pruned is determined through steps S110 to S130, and pruning is then performed according to a preset clipping threshold or ratio. In particular, the layers requiring pruning are typically fully connected layers.

The method for cutting the determined filter to be pruned comprises the following steps:

S141: obtaining a filter to be pruned, and training a pruning model based on adjacent convolution according to the filter to be pruned and a preset clipping threshold;

S142: obtaining a mask matrix according to the original parameters of the pruning model based on adjacent convolution; the mask matrix has the same size as the original parameter matrix of the pruning model based on adjacent convolution, and the mask matrix is a training matrix comprising 0 and 1;

S143: adjusting the parameters of the pruning model based on adjacent convolution by using the mask matrix.

The method for adjusting the parameters of the pruning model based on adjacent convolution by using the mask matrix comprises the following steps: multiplying the parameters of the pruning model based on adjacent convolution by the mask matrix; selecting the model parameters of the pruning model whose mask value is 1, and adjusting these masked model parameters through training and back propagation; storing the model parameter values adjusted by back propagation together with their matrix positions; and obtaining the final parameters of the pruning model based on adjacent convolution from the stored model parameter values and their matrix positions, thereby completing the adjustment of the parameters of the pruning model.
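The mask-based adjustment can be sketched on a flattened parameter list. This is an illustrative simplification: the gradients are supplied by hand rather than by back propagation, plain gradient descent stands in for the unspecified optimizer, and the names `masked_sgd_step` and `lr` are assumptions.

```python
def masked_sgd_step(params, mask, grads, lr=0.1):
    """One gradient step that keeps masked-out parameters (mask == 0) at zero."""
    updated = []
    for w, m, g in zip(params, mask, grads):
        w = w * m               # multiply the parameter by its mask value
        if m == 1:
            w = w - lr * g      # only parameters with mask 1 are adjusted
        updated.append(w)
    return updated

params = [0.5, -0.2, 0.8]
mask = [1, 0, 1]                # same size as the parameter list, values 0 or 1
grads = [1.0, 1.0, -1.0]        # hypothetical gradients from back propagation
new_params = masked_sgd_step(params, mask, grads)

# Store the adjusted values together with their positions.
kept = {i: w for i, w in enumerate(new_params) if mask[i] == 1}
```

Pruned positions stay at zero across updates, and `kept` records only the surviving values and where they sit in the parameter matrix.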

S144: pruning is carried out using the pruning model based on adjacent convolution after the parameters have been adjusted.

In the step of pruning using the pruning model based on adjacent convolution after parameter adjustment, pruning includes cutting channels in a convolution kernel, cutting the channels of the input feature map that correspond to the cut channels of the convolution kernel, and cutting the convolution kernels of the preceding layer that output the current input feature map.
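A minimal sketch of this structured cut across two adjacent layers follows, assuming that filter i of the first layer produces input channel i of the second. The function name `prune_adjacent_layers` and the nested-list layout are illustrative assumptions.

```python
def prune_adjacent_layers(first_layer, second_layer, pruned_filters):
    """Remove filters from the first layer and the matching input channels
    from every filter of the second layer.

    first_layer:  N x c x k x k weights, indexed [filter][channel][row][col].
    second_layer: M x N x k x k weights.
    """
    drop = set(pruned_filters)
    new_first = [filt for i, filt in enumerate(first_layer) if i not in drop]
    new_second = [
        [ch for i, ch in enumerate(filt) if i not in drop] for filt in second_layer
    ]
    return new_first, new_second

# First layer: 3 filters (c = 1, k = 1). Second layer: 2 filters with 3 input channels.
first_layer = [[[[1.0]]], [[[0.01]]], [[[2.0]]]]
second_layer = [
    [[[0.5]], [[0.02]], [[0.7]]],
    [[[0.3]], [[0.01]], [[0.9]]],
]
new_first, new_second = prune_adjacent_layers(first_layer, second_layer, [1])
```

Removing filter 1 from the first layer also removes one feature map, so every filter of the second layer loses the corresponding input channel and the shapes stay consistent.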

It should be noted that, in the specific implementation, the code is modified to add a mask matrix of the same size as the parameter matrix; the mask matrix contains only 0 and 1 and is what is actually applied when the network is retrained. The ratio of the number of pruned channels to the total number of channels in the network is defined as the pruning rate, denoted pruned_ratio. The number of channels to be pruned, pruned_channels, is pruned_ratio multiplied by the total number of channels in the network. The upper limit of the pruning rate, upper_ratio, is 1, and the lower limit, lower_ratio, is 0. The initial pruning rate is 0.5. The mean values of the channel feature maps are sorted in ascending order, that is, sort_{min→max}{ch_avg}; the mask values of the channel selection layer corresponding to the first pruned_channels channels are set to 0, and the mask values corresponding to the remaining channels are set to 1.
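The mask construction described above can be sketched as follows. The names `pruned_ratio`, `pruned_channels` and `ch_avg` follow the text; the function name `channel_mask` and the truncation of `pruned_channels` with `int()` are assumptions.

```python
def channel_mask(ch_avg, pruned_ratio=0.5):
    """Build a 0/1 mask that zeroes the channels with the smallest feature-map means."""
    if not 0.0 <= pruned_ratio <= 1.0:
        raise ValueError("pruned_ratio must lie between lower_ratio 0 and upper_ratio 1")
    pruned_channels = int(pruned_ratio * len(ch_avg))  # assumption: round down
    # Sort channel indices by mean feature-map value, from min to max.
    order = sorted(range(len(ch_avg)), key=lambda i: ch_avg[i])
    mask = [1] * len(ch_avg)
    for i in order[:pruned_channels]:
        mask[i] = 0
    return mask

# Four channels with the initial pruning rate of 0.5.
mask = channel_mask([0.4, 0.05, 0.9, 0.1], pruned_ratio=0.5)
```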

It should be noted that pruning is an iterative process: model pruning, usually called "iterative pruning", alternates repeatedly between pruning and model training. The aim of model pruning is to retain only the important weights, and pruning covers both fully connected layers and convolutional layers. Pruning affects a deep neural network in several ways. The largest effect is that the computational cost is reduced while the same performance is kept, and that inference and training are accelerated by deleting features that the deep network does not really use; the second effect is that reducing the number of parameters, that is, reducing the redundancy in the parameter space, can improve the generalization ability of the model.

In this embodiment, a VGG16 model is taken as an example to verify the effectiveness of the algorithm. The data set is CIFAR-10, each experiment performs 500 epochs of model compression training, the hardware is an NVIDIA V100 GPU, and the PyTorch framework is used. The accuracy of the uncompressed model is 93.99%, and the model accuracy after pruning is shown in Table 1 below.

Table 1: algorithmic comparison

Compression method	RawModel	APoZ	MeanAct	FPGM	TaylorFO	The invention
Precision (%)	93.99	91.89	92.77	93.45	93.54	93.65

In Table 1, APoZ is pruning based on the average percentage of zeros in the feature map; for details see "Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures". MeanAct is pruning based on the minimum mean activation value; for details see "Pruning Convolutional Neural Networks for Resource Efficient Inference". FPGM is channel pruning based on the geometric median; for details see "Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration". TaylorFO is channel pruning based on first-order Taylor expansion; for details see "Importance Estimation for Neural Network Pruning". The table shows that the method of the present invention achieves 93.65%, the highest accuracy among the compared pruning methods and 0.34 percentage points below the uncompressed model. The invention truly finds the unimportant filters in a convolution, is a good pruning method, and can effectively balance the loss rate and the accuracy in pruning.

In summary, by fusing the importance of two adjacent convolutional layers, the model pruning method based on adjacent convolution solves the problem in the prior art that convolution pruning is limited to a single convolutional layer and does not consider the relationship between two convolutional layers. The unimportant filters in a convolution can be truly identified, so that higher accuracy is achieved after pruning while relatively good model performance of the convolution model is retained.

Corresponding to the model pruning method based on the adjacent convolution, the invention also provides a model pruning device based on the adjacent convolution. Fig. 2 shows functional blocks of a model pruning apparatus based on adjacent convolution according to an embodiment of the present invention.

As shown in fig. 2, the model pruning device 200 based on adjacent convolution according to the present invention can be installed in an electronic device. According to the implemented functions, the model pruning device 200 based on adjacent convolution may include a filter mode parameter and channel mode parameter obtaining unit 210, a filter pruning probability parameter obtaining unit 220, a filter to be pruned determining unit 230, and a pruning unit 240. The units of the invention, which may also be referred to as modules, refer to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a certain fixed function, and that are stored in a memory of the electronic device.

In the present embodiment, the functions regarding the respective modules/units are as follows:

a filter mode parameter and channel mode parameter obtaining unit 210, configured to obtain a filter manhattan distance of a filter matrix and a channel manhattan distance of a channel matrix in a convolutional layer to be evaluated by using an absolute value function, obtain a convolutional layer filter mode parameter according to the filter manhattan distance, and obtain a convolutional layer channel mode parameter according to the channel manhattan distance;

a filter pruning probability parameter obtaining unit 220, configured to multiply the convolutional layer filter mode parameter with the convolutional layer channel mode parameter to form a filter pruning probability parameter for determining a filter pruning probability;

a to-be-pruned filter determining unit 230, configured to sort the filter pruning probability parameters according to a preset rule, and determine a to-be-pruned filter according to a sorting result of the filter pruning probability parameters;

and the pruning unit 240 is used for pruning the determined filter to be pruned.

The filter pruning probability parameter obtaining unit 220 obtains the filter pruning probability parameter according to the following formula: filter pruning probability parameter = convolutional layer filter mode parameter × convolutional layer channel mode parameter.
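The product above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the weight layout `(C_out, C_in, k_h, k_w)`, the function name `filter_pruning_scores`, the use of the raw Manhattan (L1) distance as the "mode parameter", and the pairing of filter j of the evaluated layer with input channel j of the following layer are all assumptions made for the example.

```python
import numpy as np

def filter_pruning_scores(w_cur: np.ndarray, w_next: np.ndarray) -> np.ndarray:
    """Score each filter of the evaluated layer by the product of two L1 norms.

    w_cur  : weights of the layer being evaluated, shape (C_out, C_in, k_h, k_w)
    w_next : weights of the following layer, shape (C_next, C_out, k_h, k_w)
    (Both layouts are assumptions of this sketch.)
    """
    if w_cur.shape[0] != w_next.shape[1]:
        raise ValueError("filters of w_cur must match input channels of w_next")
    # Filter mode parameter: sum of |w| over each output filter of w_cur.
    filter_norm = np.abs(w_cur).sum(axis=(1, 2, 3))
    # Channel mode parameter: sum of |w| over the matching input channel of w_next.
    channel_norm = np.abs(w_next).sum(axis=(0, 2, 3))
    # Filter pruning probability parameter: element-wise product of the two.
    return filter_norm * channel_norm
```

A filter receives a low score when either its own weights or the weights that consume its output are small, which is one way to read the fusion of the importance of two adjacent layers.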

In an embodiment of the present invention, the pruning unit 240 further includes a model training subunit, a parameter adjusting subunit and a model pruning subunit (not shown in the figure).

The model training subunit is used for acquiring a filter to be pruned and training a pruning model based on adjacent convolution according to the filter to be pruned and a preset pruning threshold; acquiring a mask matrix according to the original parameters of the pruning model based on the adjacent convolution; the mask matrix is consistent with the original parameter matrix size of the pruning model based on adjacent convolution, and the mask matrix is a training matrix comprising 0 and 1;

a parameter adjusting subunit, configured to adjust parameters of the pruning model based on the adjacent convolution by using the mask matrix;

and the model pruning subunit is used for pruning by using the pruning model based on the adjacent convolution after the parameter adjustment.

In summary, by fusing the importance of two adjacent convolutional layers, the model pruning device based on adjacent convolution solves the prior-art problem that convolution pruning is limited to a single convolutional layer and does not consider the relationship between the two layers. The unimportant filters in a convolution can therefore be genuinely identified, so that higher accuracy is achieved while relatively good performance of the convolution model is maintained.

More specific implementation manners of the model pruning device based on adjacent convolution provided by the present invention can be described with reference to the above embodiments of the model pruning method based on adjacent convolution, and are not listed here.

According to the above embodiments, by fusing the importance of two adjacent convolutional layers, the model pruning method based on adjacent convolution solves the prior-art problem that convolution pruning is limited to a single convolutional layer and does not consider the relationship between the two layers. The unimportant filters in a convolution can therefore be genuinely identified, so that higher accuracy is achieved while relatively good performance of the convolution model is maintained.

As shown in fig. 3, the present invention provides an electronic device 3 that implements the model pruning method based on adjacent convolution.

The electronic device 3 may comprise a processor 30, a memory 31 and a bus, and may further comprise a computer program, such as a model pruning program 32 based on adjacent convolutions, stored in the memory 31 and executable on said processor 30.

The memory 31 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 31 may in some embodiments be an internal storage unit of the electronic device 3, for example a removable hard disk of the electronic device 3. The memory 31 may also be an external storage device of the electronic device 3 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 3. Further, the memory 31 may also include both an internal storage unit and an external storage device of the electronic device 3. The memory 31 may be used not only to store application software installed in the electronic device 3 and various types of data, such as codes of a model pruning program based on adjacent convolution, etc., but also to temporarily store data that has been output or is to be output.

The processor 30 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 30 is the control unit of the electronic device; it connects the various components of the electronic device by using various interfaces and lines, and executes the various functions of the electronic device 3 and processes its data by running or executing programs or modules (e.g., the model pruning program based on adjacent convolution, etc.) stored in the memory 31 and calling data stored in the memory 31.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 31 and at least one processor 30 or the like.

Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 3, and may comprise fewer or more components than those shown, or some components may be combined, or a different arrangement of components.

For example, although not shown, the electronic device 3 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 30 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 3 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Further, the electronic device 3 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 3 and other electronic devices.

Optionally, the electronic device 3 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), or optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 3 and for displaying a visualized user interface.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The memory 31 in the electronic device 3 stores a neighboring convolution based model pruning program 32 that is a combination of instructions that, when executed in the processor 30, may implement: obtaining a filter Manhattan distance of a filter matrix in the convolutional layer to be evaluated and a channel Manhattan distance of a channel matrix by using an absolute value function, obtaining a convolutional layer filter mode parameter according to the filter Manhattan distance, and obtaining a convolutional layer channel mode parameter according to the channel Manhattan distance; multiplying the convolution layer filter mode parameter and the convolution layer channel mode parameter to form a filter pruning probability parameter for judging the filter pruning probability; sequencing the filter pruning probability parameters according to a preset rule, and determining a filter to be pruned according to a sequencing result of the filter pruning probability parameters; and cutting the determined filter to be pruned.

Specifically, the processor 30 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again. It should be emphasized that, in order to further ensure the privacy and security of the model pruning program based on adjacent convolution, the model pruning program based on adjacent convolution is stored in the node of the block chain where the server cluster is located.

Further, the integrated modules/units of the electronic device 3, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).

An embodiment of the present invention further provides a computer-readable storage medium, where the storage medium may be nonvolatile or volatile, and the storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements: obtaining a filter Manhattan distance of a filter matrix in the convolutional layer to be evaluated and a channel Manhattan distance of a channel matrix by using an absolute value function, obtaining a convolutional layer filter mode parameter according to the filter Manhattan distance, and obtaining a convolutional layer channel mode parameter according to the channel Manhattan distance; fusing the mode parameters of the convolutional layer filter and the mode parameters of the convolutional layer channel to form filter pruning probability parameters for judging the filter pruning probability; sequencing the filter pruning probability parameters according to a preset rule, and determining a filter to be pruned according to a sequencing result of the filter pruning probability parameters; and cutting the determined filter to be pruned.

Further, preferably, the filter pruning probability parameter is obtained by the following formula: filter pruning probability parameter = convolutional layer filter mode parameter × convolutional layer channel mode parameter.

Further, preferably, the method for clipping the determined filter to be clipped includes: acquiring a filter to be pruned, and training a pruning model based on adjacent convolution according to the filter to be pruned and a preset clipping threshold; acquiring a mask matrix according to the original parameters of the pruning model based on the adjacent convolution; the mask matrix is consistent with the original parameter matrix size of the pruning model based on adjacent convolution, and the mask matrix is a training matrix comprising 0 and 1; adjusting parameters of the pruning model based on the adjacent convolution by utilizing the mask matrix; and pruning is carried out by utilizing the pruning model based on the adjacent convolution after the parameters are adjusted.

Further, preferably, the method for adjusting the parameters of the pruning model based on the adjacent convolution by using the mask matrix includes: multiplying the parameters of the pruning model based on adjacent convolution by the mask matrix; selecting the model parameters of the pruning model whose mask value is 1, and training and adjusting the masked model parameters through back propagation; storing the model parameter values adjusted by back propagation and their corresponding matrix positions; and acquiring the final parameters of the pruning model based on the adjacent convolution according to the model parameter values and their corresponding matrix positions, thereby completing the adjustment of the parameters of the pruning model.
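The mask-based adjustment described above can be illustrated with a small NumPy sketch. It assumes a plain gradient-descent update in place of a full training loop; the function names `masked_sgd_step` and `kept_entries`, the learning rate, and the use of a precomputed gradient array are hypothetical choices for the example rather than details given in the text.

```python
import numpy as np

def masked_sgd_step(params, grads, mask, lr=0.1):
    """One gradient step that only updates positions where mask == 1."""
    if not (params.shape == grads.shape == mask.shape):
        raise ValueError("params, grads and mask must have the same shape")
    # Multiply the parameters by the 0/1 mask: pruned positions become 0.
    masked = params * mask
    # Masking the gradient as well keeps pruned positions at 0 after the update.
    return masked - lr * grads * mask

def kept_entries(params, mask):
    """Return (positions, values) of the parameters whose mask entry is 1."""
    positions = np.argwhere(mask == 1)   # matrix positions, row-major order
    values = params[mask == 1]           # values in the same row-major order
    return positions, values
```

Storing positions together with values, as `kept_entries` does, is what allows the final parameter matrix to be rebuilt from the surviving entries alone.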

Further, preferably, the method for determining the filter to be pruned according to the sorting result of the filter pruning probability parameters includes: sorting the filter pruning probability parameters from large to small, and taking a channel whose filter pruning probability parameter is smaller than a preset threshold as a filter to be pruned; wherein the preset threshold is 1%.
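The sorting and threshold step can be sketched as follows. The text does not state what the 1% threshold is measured against, so this example assumes, purely for illustration, that each score is compared relative to the largest score in the layer; the function name `select_filters_to_prune` is likewise hypothetical.

```python
import numpy as np

def select_filters_to_prune(scores, threshold=0.01):
    """Return indices of filters whose relative score is below the threshold.

    Indices are returned in the sorted (large to small) order of the scores.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.size == 0:
        return []
    peak = scores.max()
    # Assumption: the threshold is relative to the largest score in the layer.
    relative = scores / peak if peak > 0 else np.zeros_like(scores)
    order = np.argsort(-relative, kind="stable")  # large to small
    return [int(i) for i in order if relative[i] < threshold]

# Filters 3 and 1 fall below 1% of the largest score (5.0).
print(select_filters_to_prune([5.0, 0.01, 2.0, 0.04]))  # [3, 1]
```

A different reading of the threshold, for example pruning the lowest-ranked 1% of filters, would change only the final selection line.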

Further, preferably, the Manhattan distance is obtained by using the absolute value function, and the loss function is obtained by the following formula:

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i indicating the probability that a sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot representation of the sample label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the total number of classes.
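The symbol definitions above (a predicted distribution p, a one-hot label y, and C classes) match the standard cross-entropy loss. Assuming that this is the loss intended here, which the text itself does not confirm, it can be written as:

```latex
L(p, y) = -\sum_{i=0}^{C-1} y_i \log p_i
```

Because y is one-hot, only the term for the true class contributes, so the loss reduces to the negative logarithm of the probability assigned to that class.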

Specifically, the specific implementation method of the computer program when being executed by the processor may refer to the description of the relevant steps in the model pruning method based on adjacent convolution in the embodiment, which is not described herein again.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each of which contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer, and may store medical data such as personal health profiles, prescriptions, examination reports, and the like.

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A model pruning method based on adjacent convolution is applied to an electronic device and is characterized by comprising the following steps:

and cutting the determined filter to be pruned.

2. The neighbor convolution-based model pruning method according to claim 1, wherein the method of clipping the determined filter to be pruned comprises:

acquiring a filter to be pruned, and training a pruning model based on adjacent convolution according to the filter to be pruned and a preset clipping threshold; acquiring a mask matrix according to the original parameters of the pruning model based on the adjacent convolution; the mask matrix is consistent with the original parameter matrix size of the pruning model based on adjacent convolution, and the mask matrix is a training matrix comprising 0 and 1;

3. The neighbor convolution-based model pruning method of claim 2, wherein the method of adjusting the parameters of the neighbor convolution-based pruning model using the mask matrix comprises:

screening model parameters of a pruning model with mask code of 1, masking the model parameters, and training and back propagation adjustment of the model parameters;

4. The neighbor convolution-based model pruning method according to claim 1, wherein the step of determining the filter to be pruned according to the sorted results of the filter pruning probability parameters comprises:

5. The neighbor convolution-based model pruning method of claim 1, wherein the Manhattan distance is obtained by using the absolute value function; and the loss function is obtained by the following formula:

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i indicating the probability that a sample belongs to class i; y = [y_0, …, y_{C-1}] is a representation of the sample label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the total number of classes.

6. The neighbor convolution-based model pruning method of claim 3,

the step of pruning by using the pruning model based on the adjacent convolution after the parameter adjustment comprises: cutting channels in a convolution kernel, cutting the channels of the input feature map that correspond to the cut channels of the convolution kernel, and cutting the convolution kernel of the previous layer that outputs the current input feature map.

7. The neighbor convolution-based model pruning method of claim 2,

the input of the convolutional layer to be evaluated is an input feature map of size (H*W)*C_in, the convolution kernel W has size (C_in*k_h*k_w)*(C_out), and the output is an output feature map of size (H*W)*(C_out); where H and W are the height and width of the output feature map, respectively.

8. An apparatus for model pruning based on adjacent convolution, the apparatus comprising:

and the pruning unit is used for cutting the determined filter to be pruned.

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps in the neighbor convolution based model pruning method of any of claims 1 to 7.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a neighbor convolution-based model pruning method according to any one of claims 1 to 7.