CN115660071A - Model pruning method and device - Google Patents

Model pruning method and device Download PDF

Info

Publication number
CN115660071A
CN115660071A CN202211595555.6A CN202211595555A CN115660071A CN 115660071 A CN115660071 A CN 115660071A CN 202211595555 A CN202211595555 A CN 202211595555A CN 115660071 A CN115660071 A CN 115660071A
Authority
CN
China
Prior art keywords
filter
target
convolutional layer
filters
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211595555.6A
Other languages
Chinese (zh)
Inventor
兰婷婷
支涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yunji Technology Co Ltd
Original Assignee
Beijing Yunji Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yunji Technology Co Ltd filed Critical Beijing Yunji Technology Co Ltd
Priority to CN202211595555.6A priority Critical patent/CN115660071A/en
Publication of CN115660071A publication Critical patent/CN115660071A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of machine learning, and provides a model pruning method and device. The method comprises the following steps: screening out a target convolution layer needing channel pruning from the target model; clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm; performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be continuously close until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training; each group of filters in the target convolutional layer are fused into one filter to obtain a target convolutional layer after channel pruning; and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning. By adopting the technical means, the problem that the precision of the model is reduced due to the existing model channel pruning technology in the prior art is solved.

Description

Model pruning method and device
Technical Field
The disclosure relates to the technical field of machine learning, in particular to a model pruning method and device.
Background
As the convolutional network becomes wider and deeper, the memory occupation, the power consumption and the floating point operation times per second of the convolutional network are all increased sharply. In this context, convolutional network compression and acceleration methods have gained wide attention. Compared with compression methods such as model quantization and sparsification, channel pruning (also called filter pruning) is independent of model parameter precision, and does not depend on a special hardware structure, so that a better compression acceleration effect can be obtained, and the method is a research focus in recent years. The current channel pruning method estimates the importance of the filters through various indexes artificially designed, directly prunes some filters (the weight is 0), and reconstructs the network by using the rest filters. However, although those pruned filters are less important in a sense, they are not fully redundant, and thus the model accuracy is degraded by the pruning reconstruction operation.
In the course of implementing the disclosed concept, the inventors found that there are at least the following technical problems in the related art: the existing model channel pruning technology can cause the problem of reduced model precision.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a model pruning method and apparatus, an electronic device, and a computer-readable storage medium, so as to solve the problem in the prior art that the accuracy of a model is reduced due to the existing model channel pruning technology.
In a first aspect of the embodiments of the present disclosure, a model pruning method is provided, including: screening a target convolutional layer needing channel pruning from the target model; clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm; performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be close to each other continuously until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and finishing the training; fusing each group of filters in the target convolutional layer into one filter to obtain a target convolutional layer after channel pruning; and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning.
In a second aspect of the embodiments of the present disclosure, there is provided a model pruning device, including: the screening module is configured to screen out a target convolutional layer needing channel pruning from the target model; a clustering module configured to cluster the filters in the target convolution layer into a preset number of filter banks using a clustering algorithm; the training module is configured to perform multiple rounds of training on the target convolutional layer, so that the parameters of the same group of filters in the target convolutional layer are continuously close to each other until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and the training is finished; the first fusion module is configured to fuse each group of filters in the target convolutional layer into one filter to obtain a target convolutional layer after channel pruning; and the second fusion module is configured to perform fusion processing on the filter in the convolution layer next to the target convolution layer in the target model according to the target convolution layer after channel pruning to obtain the target model after channel pruning.
In a third aspect of the embodiments of the present disclosure, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
Compared with the prior art, the embodiment of the disclosure has the following beneficial effects: screening out a target convolution layer needing channel pruning from the target model; clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm; performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be continuously close until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training; fusing each group of filters in the target convolutional layer into one filter to obtain a target convolutional layer after channel pruning; and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning. By adopting the technical means, the problem that the precision of the model is reduced due to the existing model channel pruning technology in the prior art can be solved, and a method which can not reduce the precision of the model after channel pruning is further provided.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed for the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive efforts.
FIG. 1 is a scenario diagram of an application scenario of an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of a model pruning method provided by an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a model pruning device provided by the embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
A model pruning method and apparatus according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a scene schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 101, 102, and 103, server 104, and network 105.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 104, including but not limited to smart phones, robots, laptop portable computers, desktop computers, and the like (e.g., 102 may be a robot); when the terminal apparatuses 101, 102, and 103 are software, they can be installed in the electronic apparatus as above. The terminal devices 101, 102, and 103 may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, which is not limited by the embodiments of the present disclosure. Further, various applications, such as a data processing application, an instant messaging tool, social platform software, a search type application, a shopping type application, and the like, may be installed on the terminal devices 101, 102, and 103.
The server 104 may be a server providing various services, for example, a backend server receiving a request sent by a terminal device establishing a communication connection with the server, and the backend server may receive and analyze the request sent by the terminal device and generate a processing result. The server 104 may be a server, may also be a server cluster composed of a plurality of servers, or may also be a cloud computing service center, which is not limited in this disclosure.
The server 104 may be hardware or software. When the server 104 is hardware, it may be various electronic devices that provide various services to the terminal devices 101, 102, and 103. When the server 104 is software, it may be multiple software or software modules providing various services for the terminal devices 101, 102, and 103, or may be a single software or software module providing various services for the terminal devices 101, 102, and 103, which is not limited by the embodiment of the present disclosure.
The network 105 may be a wired network connected by a coaxial cable, a twisted pair and an optical fiber, or may be a wireless network that can interconnect various Communication devices without wiring, for example, bluetooth (Bluetooth), near Field Communication (NFC), infrared (Infrared), and the like, which is not limited in the embodiment of the present disclosure.
The target user can establish a communication connection with the server 104 via the network 105 through the terminal devices 101, 102, and 103 to receive or transmit information or the like. It should be noted that the specific types, numbers and combinations of the terminal devices 101, 102 and 103, the server 104 and the network 105 may be adjusted according to the actual requirements of the application scenario, and the embodiment of the present disclosure does not limit this.
Fig. 2 is a schematic flow chart of a model pruning method according to an embodiment of the present disclosure. The model pruning method of fig. 2 may be performed by the terminal device or the server of fig. 1. As shown in fig. 2, the model pruning method includes:
s201, screening out a target convolutional layer needing channel pruning from a target model;
s202, clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm;
s203, performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be close to each other continuously until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and finishing the training;
s204, fusing each group of filters in the target convolutional layer into one filter to obtain a target convolutional layer after channel pruning;
s205, according to the target convolutional layer after channel pruning, the filter in the convolutional layer next to the target convolutional layer in the target model is subjected to fusion processing, and the target model after channel pruning is obtained.
The target model may be any model used in machine learning. The channel pruning is performed on the target model, and is performed on a plurality of target convolutional layers in the target model, so that the number of the target convolutional layers in the embodiment of the present disclosure may be multiple. When clustering the filter, a common clustering algorithm may be used, such as k-means. The filters in the convolutional layer are a matrix of size m x n that is used to detect specific features in the image, with different filters having different parameters. And performing multiple rounds of training on the target convolutional layer to enable the parameters of the same group of filters in the target convolutional layer to be continuously close, and when the parameters of the same group of filters are close to a certain degree (namely the difference value of the parameters of the same group of filters is smaller than a preset threshold), replacing the group of filters with one filter. Filter fusion is understood to be a superposition of filters, after which a filter can be used instead of the set of filters. After the channel pruning of the target convolutional layer is completed, because the output layer of the previous layer in the adjacent convolutional layers must be consistent with the input layer of the next layer in the number of channels (or the number of filters is consistent, and the number of filters is equal to the number of channels), the channel pruning can be performed on the convolutional layer of the next layer of the target convolutional layer in the target model according to the principle, and according to the method, the channel pruning of the target model is completed.
Optionally, the target convolutional layer to be subjected to channel pruning is screened from the target model according to the principle that the convolutional layers after channel pruning cannot influence each other. For example, the channel pruning of the first layer of the convolutional layer can affect the second layer of the convolutional layer, but does not affect the third layer of the convolutional layer, so that a target convolutional layer needing the channel pruning is screened from the target model, and the target convolutional layer can be the first layer of the convolutional layer and the third layer of the convolutional layer (because the channel pruning of the first layer of the convolutional layer can affect the second layer of the convolutional layer, the channel pruning of the second layer of the convolutional layer can be completed by utilizing the effect, so that the second layer of the convolutional layer does not need to be picked independently) \\8230
According to the technical scheme provided by the embodiment of the disclosure, a target convolutional layer needing channel pruning is screened from a target model; clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm; performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be continuously close until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training; each group of filters in the target convolutional layer are fused into one filter to obtain a target convolutional layer after channel pruning; and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning. By adopting the technical means, the problem that the precision of the model is reduced due to the existing model channel pruning technology in the prior art can be solved, and a method which can not reduce the precision of the model after channel pruning is further provided.
Performing multiple rounds of training on the target convolutional layer to enable the parameters of the same group of filters in the target convolutional layer to be continuously close to each other until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and finishing the training, wherein the training comprises the following steps: calculating a central point matrix of each filter group according to all filters in the filter group; sequentially calculating the distance sum of each filter in each filter group to the central point matrixes of all other filter groups; determining the filter with the maximum corresponding distance sum in each filter group as a target filter of the filter group; and performing multiple rounds of training on each filter group by using a gradient descent function according to the target filter of each filter group, so that the parameters of the filters in the same group are continuously close to each other until the difference value of the parameters of the filters in the same group is smaller than a preset threshold value, and finishing the training.
The center point matrix of each filter bank can be obtained by summing and averaging all the filters in the filter bank.
Sequentially calculating the distance sum of each filter in each filter group to the central point matrix of all other filter groups, wherein the distance sum comprises the following steps: for each filter bank, sequentially calculating the distance sum of each filter in the filter bank to the central point matrix of all other filter banks by using the following formula:
Figure 17330DEST_PATH_IMAGE001
wherein, P i For the ith filter in the filter bank, M j For the jth filter bank except the filter bank, K is the number of filter banks in the target convolutional layer and | is the norm operator.
For example for a filter bank, P i The sum of the distances to which the filter corresponds is maximum, then P i Is the target filter of the filter bank. So that each filter bank can determine a target filter.
According to the target filter of each filter group, performing multi-round training on each filter group by using a gradient descent function, so that the parameters of the filters in the same group are continuously close to each other until the difference value of the parameters of the filters in the same group is smaller than a preset threshold value, and finishing the training, wherein the training comprises the following steps: the following steps are executed in a circulating mode for multiple rounds of training: training each filter bank by using a gradient descent function according to the target filter of each filter bank, and adding one to the training round; when the difference value of the parameters of the filters in the same group after training is smaller than a preset threshold value, finishing the training; and when the difference value of the parameters of the filters in the same group after training is not less than the preset threshold value, updating the target filter of each filter group after training, and continuing the next round of training.
When the difference value of the parameters of the filters in the same group after training is not less than a preset threshold value, updating the central point matrix of each filter group according to all the filters in each filter group after training; sequentially updating the distance sum of each filter in each filter group to the central point matrixes of all other filter groups; and updating the target filter according to the updated distance sum.
The gradient descent function is:
Figure 531488DEST_PATH_IMAGE002
wherein, P 0 For the target filter in each filter bank, P h For the H-th filter, H, in each filter bank i For the number of filters in each filter bank, L is the loss function of the target model.
The training of each filter bank is the training of the filters in each filter bank. And calculating the gradient value of the filter in each filter bank in the direction with the fastest gradient descent by using a gradient descent function, and further updating the parameters of the filter in each filter bank by using a gradient back propagation algorithm.
The gradient descent function may be derived by:
because the gradient is descending in the direction of
Figure 395539DEST_PATH_IMAGE003
P h To P 0 The gradient of the approach is in the descending direction
Figure 413173DEST_PATH_IMAGE004
So a gradient descent function of
Figure 970056DEST_PATH_IMAGE005
Further, the gradient descent function can be derived as:
Figure 971510DEST_PATH_IMAGE006
wherein, P 0 For the target filter, P, in each filter bank h For the H-th filter, H, in each filter bank i For the number of filters in each filter bank, L is the loss function of the target model, η is the attenuation factor, ε is the step size, P i Is the ith filter in the filter bank.
Further, the gradient descent function can be derived as:
Figure 639252DEST_PATH_IMAGE007
a and b are constants that can be adjusted.
According to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning, wherein the fusion processing comprises the following steps: according to the filter in the output layer of the target convolutional layer after channel pruning, carrying out fusion processing on the filter in the input layer of the next convolutional layer of the target convolutional layer in the target model; and according to the fused filter in the input layer of the next layer of the target convolutional layer, carrying out fusion processing on the filters in other layers of the next layer of the target convolutional layer to obtain a target model after channel pruning.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of a model pruning device provided in an embodiment of the present disclosure. As shown in fig. 3, the model pruning device includes:
the screening module 301 is configured to screen out a target convolutional layer which needs to be subjected to channel pruning from a target model;
a clustering module 302 configured to cluster the filters in the target convolution layer into a preset number of filter banks using a clustering algorithm;
a training module 303 configured to perform multiple rounds of training on the target convolutional layer, so that the parameters of the same group of filters in the target convolutional layer are continuously close to each other until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training;
a first fusion module 304 configured to fuse each group of filters in the target convolutional layer into one filter, so as to obtain a channel pruned target convolutional layer;
a second fusion module 305, configured to perform fusion processing on the filter in the convolutional layer next to the target convolutional layer in the target model according to the target convolutional layer after channel pruning, to obtain the target model after channel pruning.
The target model may be any model used in machine learning. The channel pruning is performed on the target model, and is performed on a plurality of target convolutional layers in the target model, so that the number of the target convolutional layers in the embodiment of the present disclosure may be multiple. When clustering the filter, a common clustering algorithm may be used, such as k-means. The filters in the convolutional layer are a matrix of m x n size that is used to detect specific features in the image, with different filters having different parameters. And performing multiple rounds of training on the target convolutional layer to enable the parameters of the same group of filters in the target convolutional layer to be continuously close, and when the parameters of the same group of filters are close to a certain degree (namely the difference value of the parameters of the same group of filters is smaller than a preset threshold), replacing the group of filters with one filter. Filter fusion is understood to be a superposition of filters, after which a filter can be used instead of the set of filters. After the channel pruning of the target convolutional layer is completed, because the output layer of the previous layer in the adjacent convolutional layers must be consistent with the input layer of the next layer in the number of channels (or the number of filters is consistent, and the number of filters is equal to the number of channels), the channel pruning can be performed on the convolutional layer of the next layer of the target convolutional layer in the target model according to the principle, and according to the method, the channel pruning of the target model is completed.
Optionally, the screening module 301 is further configured to screen out a target convolutional layer to be subjected to channel pruning from the target model according to a principle that the convolutional layers after channel pruning cannot affect each other. For example, the channel pruning of the first layer of the convolutional layer can affect the second layer of the convolutional layer, but does not affect the third layer of the convolutional layer, so that a target convolutional layer needing the channel pruning is screened from the target model, and the target convolutional layer can be the first layer of the convolutional layer and the third layer of the convolutional layer (because the channel pruning of the first layer of the convolutional layer can affect the second layer of the convolutional layer, the channel pruning of the second layer of the convolutional layer can be completed by utilizing the effect, so that the second layer of the convolutional layer does not need to be picked independently) \\8230
According to the technical scheme provided by the embodiment of the disclosure, a target convolutional layer needing channel pruning is screened from a target model; clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm; performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be close to each other continuously until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and finishing the training; fusing each group of filters in the target convolutional layer into one filter to obtain a target convolutional layer after channel pruning; and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning. By adopting the technical means, the problem that the precision of the model is reduced due to the existing model channel pruning technology in the prior art can be solved, and a method which can not reduce the precision of the model after channel pruning is further provided.
Optionally, the training module 303 is further configured to calculate a central point matrix of each filter bank according to all filters in the filter bank; sequentially calculating the distance sum of each filter in each filter group to the central point matrixes of all other filter groups; determining the filter with the maximum corresponding distance in each filter group as a target filter of the filter group; and performing multiple rounds of training on each filter group by using a gradient descent function according to the target filter of each filter group, so that the parameters of the filters in the same group are continuously close to each other until the difference value of the parameters of the filters in the same group is smaller than a preset threshold value, and finishing the training.
The center point matrix of each filter bank can be obtained by summing and averaging all the filters in the filter bank.
Optionally, the training module 303 is further configured to, for each filter bank, sequentially calculate a sum of distances from each filter in the filter bank to the center point matrix of all other filter banks using the following formula:
Figure 511393DEST_PATH_IMAGE001
wherein, P i For the ith filter, M, in the filter bank j For the jth filter bank except the filter bank, K is the number of filter banks in the target convolution layer and | | is the norm operator.
For example for a filter bank, P i The sum of the distances to which the filter corresponds is maximum, then P i Is the target filter of the filter bank. So that each filter bank can determine a target filter.
Optionally, the training module 303 is further configured to perform multiple rounds of training by performing the following steps in a loop: training each filter bank by using a gradient descent function according to the target filter of each filter bank, and adding one to the training round; when the difference value of the parameters of the filters in the same group after training is smaller than a preset threshold value, finishing the training; and when the difference value of the parameters of the filters in the same group after training is not less than the preset threshold value, updating the target filter of each filter group after training, and continuing the next round of training.
When the difference value of the parameters of the filters in the same group after training is not less than a preset threshold value, updating the central point matrix of each filter group according to all the filters in each filter group after training; sequentially updating the distance sum of each filter in each filter group to the central point matrixes of all other filter groups; and updating the target filter according to the updated distance sum.
The gradient descent function is:
Figure 239178DEST_PATH_IMAGE002
wherein, P 0 For the target filter in each filter bank, P h For the H-th filter in each filter bank, H i L is the loss function of the target model for the number of filters in each filter bank.
The training of each filter bank is the training of the filters in each filter bank. And calculating the gradient value of the filter in each filter bank in the direction with the fastest gradient descent by using a gradient descent function, and further updating the parameters of the filter in each filter bank by using a gradient back propagation algorithm.
The gradient descent function may be derived by:
since the gradient is descending in the direction of
Figure 462349DEST_PATH_IMAGE003
P h To P 0 The gradient of the approach is in the descending direction
Figure 933781DEST_PATH_IMAGE004
So a gradient descent function of
Figure 926008DEST_PATH_IMAGE005
Further, the gradient descent function can be derived as:
Figure 824694DEST_PATH_IMAGE006
wherein, P 0 For the target filter in each filter bank, P h For the H-th filter in each filter bank, H i For the number of filters in each filter bank, L is the loss function of the target model, η is the attenuation factor, ε is the step size, P i Is the ith filter in the filter bank.
Further, it can be derived that the gradient descent function is:
Figure 269582DEST_PATH_IMAGE007
a and b are constants that can be adjusted.
Optionally, the second fusion module 305 is further configured to perform fusion processing on the filter in the input layer of the convolutional layer next to the target convolutional layer in the target model according to the filter in the output layer of the target convolutional layer after channel pruning; and according to the fused filter in the input layer of the next layer of the target convolutional layer, carrying out fusion processing on the filters in other layers of the next layer of the target convolutional layer to obtain a target model after channel pruning.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
Fig. 4 is a schematic diagram of an electronic device 4 provided by the embodiment of the present disclosure. As shown in fig. 4, the electronic apparatus 4 of this embodiment includes: a processor 401, a memory 402 and a computer program 403 stored in the memory 402 and executable on the processor 401. The steps in the various method embodiments described above are implemented when the processor 401 executes the computer program 403. Alternatively, the processor 401 implements the functions of the respective modules/units in the above-described respective apparatus embodiments when executing the computer program 403.
Illustratively, the computer program 403 may be partitioned into one or more modules/units, which are stored in the memory 402 and executed by the processor 401 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of the computer program 403 in the electronic device 4.
The electronic device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or other electronic devices. The electronic device 4 may include, but is not limited to, a processor 401 and a memory 402. Those skilled in the art will appreciate that fig. 4 is merely an example of the electronic device 4, and does not constitute a limitation of the electronic device 4, and may include more or less components than those shown, or combine certain components, or different components, e.g., the electronic device may also include input-output devices, network access devices, buses, etc.
The Processor 401 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The storage 402 may be an internal storage unit of the electronic device 4, for example, a hard disk or a memory of the electronic device 4. The memory 402 may also be an external storage device of the electronic device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 4. Further, the memory 402 may also include both internal storage units of the electronic device 4 and external storage devices. The memory 402 is used for storing computer programs and other programs and data required by the electronic device. The memory 402 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative, and for example, a module or a unit may be divided into only one logical function, and may be implemented in other ways, and multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, the present disclosure may implement all or part of the flow of the method in the above embodiments, and may also be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of the above methods and embodiments. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying computer program code, recording medium, U.S. disk, removable hard disk, magnetic disk, optical disk, computer Memory, read-Only Memory (ROM), random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution media, and the like. It should be noted that the computer-readable medium may contain suitable additions or subtractions depending on the requirements of legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer-readable media may not include electrical carrier signals or telecommunication signals in accordance with legislation and patent practice.
The above examples are only intended to illustrate the technical solution of the present disclosure, not to limit it; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (10)

1. A method of model pruning, comprising:
screening a target convolutional layer needing channel pruning from the target model;
clustering the filters in the target convolution layer into a preset number of filter groups by using a clustering algorithm;
performing multiple rounds of training on the target convolutional layer to enable parameters of the same group of filters in the target convolutional layer to be continuously close until the difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training;
each group of filters in the target convolutional layer is fused into one filter to obtain the target convolutional layer after the channel is pruned;
and according to the target convolutional layer after the channel pruning, carrying out fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the target model after the channel pruning.
2. The method of claim 1, wherein performing multiple rounds of training on the target convolutional layer so that parameters of the same group of filters in the target convolutional layer are continuously close until the difference between the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold, and ending training comprises:
calculating a central point matrix of each filter group according to all filters in the filter group;
sequentially calculating the distance sum of each filter in each filter group to the central point matrixes of all other filter groups;
determining the filter with the maximum corresponding distance in each filter group as a target filter of the filter group;
and performing multiple rounds of training on each filter group by using a gradient descent function according to the target filter of each filter group, so that the parameters of the filters in the same group are continuously close to each other until the difference value of the parameters of the filters in the same group is smaller than the preset threshold value, and finishing the training.
3. The method of claim 2, wherein calculating the sum of distances from each filter in each filter bank to the center point matrix of all other filter banks in turn comprises:
for each filter bank, sequentially calculating the distance sum of each filter in the filter bank to the central point matrix of all other filter banks by using the following formula:
Figure 68642DEST_PATH_IMAGE001
wherein, P i For the ith filter in the filter bank, M j For the jth filter bank except the filter bank, K is the number of filter banks in the target convolution layer, | | | | is a normA number operator.
4. The method according to claim 2, wherein the performing multiple rounds of training on each filter bank according to the target filter of each filter bank by using a gradient descent function so that the parameters of the filters in the same group are continuously close until the difference value of the parameters of the filters in the same group is smaller than the preset threshold value, and ending the training comprises:
the following steps are executed in a circulating mode for the multiple rounds of training:
according to the target filter of each filter group, training each filter group by using the gradient descent function, and adding one to the training round;
when the difference value of the parameters of the filters in the same group after training is smaller than the preset threshold value, finishing the training;
and when the difference value of the parameters of the filters in the same group after training is not less than the preset threshold value, updating the target filter of each filter group after training, and continuing the next round of training.
5. The method according to claim 2 or 4, wherein the gradient descent function is:
Figure 660160DEST_PATH_IMAGE002
wherein, P 0 For the target filter, P, in each filter bank h For the H-th filter, H, in each filter bank i For the number of filters in each filter bank, L is the loss function of the target model.
6. The method according to claim 2 or 4, wherein the gradient descent function is:
Figure 823288DEST_PATH_IMAGE003
wherein, P 0 For the target filter in each filter bank, P h For the H-th filter in each filter bank, H i For the number of filters in each filter bank, L is the loss function of the object model, η is the attenuation factor, ε is the step size, P i For the ith filter in each filter bank.
7. The method of claim 1, wherein the performing, according to the channel-pruned target convolutional layer, a fusion process on a filter in a convolutional layer next to the target convolutional layer in the target model to obtain the channel-pruned target model comprises:
according to the filter in the output layer of the target convolutional layer after the channel pruning, carrying out fusion processing on the filter in the input layer of the convolutional layer next to the target convolutional layer in the target model;
and according to the fused filter in the input layer of the next layer of the target convolutional layer, carrying out fusion processing on the filters in other layers of the next layer of the target convolutional layer to obtain the target model after channel pruning.
8. A model pruning device, comprising:
the screening module is configured to screen out a target convolutional layer needing channel pruning from the target model;
a clustering module configured to cluster the filters in the target convolution layer into a preset number of filter banks using a clustering algorithm;
a training module configured to perform multiple rounds of training on the target convolutional layer, so that parameters of the same group of filters in the target convolutional layer are continuously close until a difference value of the parameters of the same group of filters in the target convolutional layer is smaller than a preset threshold value, and ending the training;
a first fusion module configured to fuse each group of filters in the target convolutional layer into one filter, so as to obtain a target convolutional layer after channel pruning;
and the second fusion module is configured to perform fusion processing on a filter in a convolutional layer next to the target convolutional layer in the target model according to the target convolutional layer subjected to channel pruning to obtain the target model subjected to channel pruning.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202211595555.6A 2022-12-13 2022-12-13 Model pruning method and device Pending CN115660071A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211595555.6A CN115660071A (en) 2022-12-13 2022-12-13 Model pruning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211595555.6A CN115660071A (en) 2022-12-13 2022-12-13 Model pruning method and device

Publications (1)

Publication Number Publication Date
CN115660071A true CN115660071A (en) 2023-01-31

Family

ID=85019230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211595555.6A Pending CN115660071A (en) 2022-12-13 2022-12-13 Model pruning method and device

Country Status (1)

Country Link
CN (1) CN115660071A (en)

Similar Documents

Publication Publication Date Title
CN111581291B (en) Data processing method, device, electronic equipment and readable medium
CN109214543B (en) Data processing method and device
CN115473841B (en) Network path determining method, device and storage medium
EP4386579A1 (en) Retrieval model training method and apparatus, retrieval method and apparatus, device and medium
CN114205690A (en) Flow prediction method, flow prediction device, model training method, model training device, electronic equipment and storage medium
CN107679158A (en) Data managing method, device, computer-readable medium and electronic equipment
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN113965313B (en) Model training method, device, equipment and storage medium based on homomorphic encryption
CN114700957B (en) Robot control method and device with low computational power requirement of model
CN110728118B (en) Cross-data-platform data processing method, device, equipment and storage medium
CN110084255A (en) The detection method and device of abnormal data
CN115660071A (en) Model pruning method and device
CN113630476B (en) Communication method and communication device applied to computer cluster
CN115048430A (en) Data verification method, system, device and storage medium
CN113642654B (en) Image feature fusion method and device, electronic equipment and storage medium
CN115204178A (en) Text sorting matching method, system, device and storage medium
CN113779335A (en) Information generation method and device, electronic equipment and computer readable medium
CN109712011B (en) Community discovery method and device
CN111161067A (en) Method and device for determining transaction route
CN111582456A (en) Method, apparatus, device and medium for generating network model information
CN114970535B (en) Intention recognition method, system, device and storage medium
CN116501993B (en) House source data recommendation method and device
CN115062630B (en) Method and device for confirming nickname of robot
CN113902098A (en) Multi-type data parallel learning method and device, computer equipment and medium
CN114494817A (en) Image processing method, model training method, related device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination