WO2023159760A1 - Convolutional neural network model pruning method and apparatus, electronic device, and storage medium - Google Patents

Convolutional neural network model pruning method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023159760A1
Authority
WO
WIPO (PCT)
Prior art keywords
pruning
filter
model
convolutional layer
pruned
Prior art date
Application number
PCT/CN2022/090721
Other languages
English (en)
French (fr)
Inventor
王晓锐
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2023159760A1 publication Critical patent/WO2023159760A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a convolutional neural network model pruning method and device, electronic equipment and storage media.
  • model pruning is an important direction of model compression technology.
  • the detection model and segmentation model in the deep learning model can remove redundant parameters through pruning, so as to ensure the accuracy of the model as much as possible, compress the model size, and improve the operation speed of the model.
  • the current model pruning methods select the pruning filters by considering only the information of a single filter; they neither consider the relationship between filters nor use that relationship to obtain the redundant information of the filters inside each convolutional layer of the model and then prune with this redundant information, which makes the pruning accuracy and model compression precision of the convolutional neural network model low.
  • the embodiment of the present application proposes a convolutional neural network model pruning method, including:
  • acquiring convolutional layer information of the model to be pruned; performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  • the embodiment of the present application proposes a convolutional neural network model pruning device, including:
  • a convolutional layer information acquisition module, configured to obtain the convolutional layer information in the model to be pruned;
  • a filter similarity calculation module, configured to perform convolution calculations according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer;
  • a pruning importance index calculation module, configured to calculate the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
  • a pruning module, configured to prune the model to be pruned according to the preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  • the embodiment of the present application provides an electronic device, including:
  • the program is stored in a memory, and the processor executes the at least one program to implement a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
  • acquiring convolutional layer information of the model to be pruned; performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  • the embodiment of the present application provides a storage medium, the storage medium being a computer-readable storage medium that stores computer-executable instructions, the computer-executable instructions being used to cause a computer to execute a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
  • acquiring convolutional layer information of the model to be pruned; performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  • the convolutional neural network model pruning method and apparatus, electronic device, and storage medium proposed in the embodiments of the present application obtain filter importance values by performing convolution calculations on the filters in the convolutional layers, and from these obtain the pruning importance index corresponding to each convolutional layer. The convolution operation quantifies the importance of the filters in a convolutional layer; the filter importance values yield the redundant information of the filters inside each convolutional layer of the model, and using this redundant information for pruning can improve the pruning accuracy of the convolutional neural network model as well as the model compression precision and operation speed.
  • FIG. 1 is a flow chart of a convolutional neural network model pruning method provided by an embodiment of the present application.
  • Fig. 2 is another flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • Fig. 3 is a schematic diagram of a convolutional layer in a convolutional neural network model.
  • Fig. 4 is a schematic diagram of a filter in a convolutional neural network model.
  • Fig. 5 is another flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • Fig. 6 is another flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • Fig. 7 is another flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • Fig. 8 is another flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • FIG. 9 is a structural block diagram of a convolutional neural network model pruning device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • Convolutional neural network (Convolutional Neural Networks, CNN): a class of feed-forward neural networks that involve convolution calculations and have a deep structure; it is one of the representative algorithms of deep learning.
  • the convolutional neural network has the ability of representation learning, and can perform translation invariant classification on the input information according to its hierarchical structure.
  • the convolutional neural network imitates the biological visual perception mechanism, and can perform supervised learning and unsupervised learning.
  • the sharing of convolution kernel parameters within the hidden layers and the sparsity of inter-layer connections enable a convolutional neural network to extract gridded features with a small amount of computation.
  • a common convolutional neural network structure is input layer-convolution layer-pooling layer-fully connected layer-output layer.
  • Convolution: a mathematical method of integral transformation; a mathematical operator that generates a third function from two functions f and g, characterizing the integral of the product of f and the flipped, translated g over the length of their overlap. If one of the functions participating in the convolution is regarded as the indicator function of an interval, convolution can also be viewed as a "sliding average".
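  • For reference, the continuous form of the definition above can be written in standard notation (a textbook fact, not a formula taken from the patent):

$$(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$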
  • Model pruning is an important direction of model compression technology.
  • the detection model and segmentation model in the deep learning model can remove redundant parameters through pruning, so as to ensure the accuracy of the model as much as possible, compress the model size, and improve the operation speed of the model.
  • model pruning mainly consists of two steps: first, select and remove the relatively unimportant convolution-kernel filters; then fine-tune and optimize the model from which the unimportant filters were removed, in order to recover the precision lost by removing them. The pruning methods in the related art therefore all address how to select the relatively unimportant convolution-kernel filters. Three approaches are common: 1) directly use the magnitude of the BN-layer weights: this is easy to understand and implement, but a BN-layer weight is a poor measure of the information a filter actually carries, and the two are not strongly correlated, so the information correlation between filters cannot be measured; 2) use the magnitude of the L1 or L2 norm of a filter as the importance criterion: this has a shortcoming similar to the first method, relying only on the magnitude of the values without considering the correlation between filters; 3) use the geometric median of the space in which the filters lie: this method first computes the filter nearest to the geometric median of all filters and then prunes it, but there is no rigorous evidence that the information content of the geometric median can really be replaced by the information content of the other filters.
  • the embodiment of the present application provides a convolutional neural network model pruning method and device, electronic equipment, and storage media.
  • by performing convolution calculations on the filters in the convolutional layers, filter importance values are obtained, and from these the pruning importance index corresponding to each convolutional layer is obtained.
  • Embodiments of the present application provide a convolutional neural network model pruning method and device, electronic equipment, and a storage medium, which are specifically described through the following embodiments. First, the convolutional neural network model pruning method in the embodiment of the present application is described.
  • Artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • the convolutional neural network model pruning method provided in the embodiment of the present application relates to the technical field of artificial intelligence, especially to the technical field of data mining.
  • the convolutional neural network model pruning method provided in the embodiment of the present application can be applied to a terminal, can also be applied to a server, and can also be software running on a terminal or a server.
  • the terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer or a smart watch, etc.
  • the server can be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms;
  • the application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and so on.
  • This application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.
  • Fig. 1 is an optional flow chart of the convolutional neural network model pruning method provided by the embodiment of the present application.
  • the method in Fig. 1 may include but is not limited to steps S110 to S140.
  • Step S110: acquiring the convolutional layer information in the model to be pruned.
  • the model to be pruned is a convolutional neural network model
  • the convolutional layer information may be a filter included in the convolutional layer.
  • when pruning the model to be pruned, this embodiment takes into account the relationship between the filters and the redundant information of the filters inside each convolutional layer of the model, in order to quantify the importance of the filters, improve the accuracy of convolutional neural network model pruning, and improve model compression precision and operation speed.
  • Step S120: performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer.
  • referring to Fig. 2, step S120 includes but is not limited to steps S121 to S122:
  • Step S121: acquiring the filters corresponding to each convolutional layer, where each convolutional layer corresponds to at least two filters.
  • Step S122: performing pairwise convolution calculations on the filters in each convolutional layer to obtain the multiple filter similarity values corresponding to each filter in the convolutional layer.
  • each convolutional layer contains multiple filters. Referring to Fig. 3, a schematic diagram of a convolutional layer, X and Y in Fig. 3 are two consecutive feature maps (Feature map) in a convolutional neural network model (such as the model to be pruned), with one or more convolutional layers between X and Y.
  • the filters identify particular features of an image, and each filter slides over and convolves the feature map of the previous layer; Conv1 shown in the figure denotes one of these convolutional layers, and Feature map Y is obtained from Feature map X after the computation of the convolutional layer.
  • each convolutional layer is composed of multiple filters (Filter), and each filter has several channels (Channel) from front to back.
  • a filter is a tool used to extract features from an image, such as edge features and texture features; a filter is composed of multiple 2D filters.
  • referring to Fig. 4, a schematic diagram of a filter: the filter shown in Fig. 4 can be used to detect edges and is a common LoG (Laplacian of Gaussian) filter in image processing.
  • the model to be pruned is a convolutional neural network model, and the values in the filters in the convolutional neural network model are obtained by training the model with training data, and are not artificially designed filters.
  • when extracting features, the filters in a convolutional layer use the convolution operation: the filter passes over regions of the original image in turn and takes the dot product with the original pixel values of each region. If the features in the image are very similar to the filter's features, the dot-product sum is high; when there is no correspondence between the image region and the filter, the value obtained by dot multiplication and summation is very small. If a filter is dot-multiplied and summed with itself, a large value is obtained.
  • since each convolutional layer is composed of multiple filters, the filters can be compared: if the comparison shows that two filters are quite similar, the roles they play in the convolutional layer are also similar.
  • this can therefore be treated as redundant information of the convolutional layer, and one of the two filters can be removed for model pruning.
  • the purpose of pruning the model to be pruned is to remove a certain proportion of filters. For example, the filter shown by the dotted line in FIG. 3 can be pruned.
  • to keep the final prediction performance of the pruned model from degrading, the importance of each filter in a convolutional layer must be evaluated and the least important filters removed, minimizing the impact on model accuracy. The related art computes the similarity between two filters by directly taking the difference of corresponding elements of their values. This method has the following defects: 1) it does not consider the filter's role of extracting features; 2) it treats all elements of a filter as having the same effect, without considering the different positions of the elements within the filter, although position plays a large, non-negligible role in feature extraction; 3) the direct-difference method has no physical meaning and lacks theoretical basis.
  • in one embodiment, the convolution calculation between two filters is used to measure filter similarity. The more similar two filters are, the more similar the features they extract, and the larger the value obtained by the convolution operation between them; this further indicates that the importance of the corresponding filter is relatively weak, that it does not play a large role in the computation of the convolutional layer, that its information is redundant, and that it can be deleted during pruning. Conversely, if the values obtained by the convolution operations are small, the filter has a significant influence on the result of the convolutional layer's computation and carries a significant amount of information, so it cannot be deleted. Therefore, this embodiment first obtains the filters in each convolutional layer and performs pairwise convolution calculations on them, obtaining the multiple filter similarity values corresponding to each filter in the layer.
  • the convolution operation between filters is described as: computing the sum of the products of the corresponding positions of the two filters.
  • the pairwise convolution process between filters is described as follows: for example, a convolutional layer of the model to be pruned contains five filters, filter 1 (denoted F1) through filter 5 (denoted F5); pairwise convolution yields the filter similarity values {S1=F1*F2, S2=F1*F3, S3=F1*F4, S4=F1*F5, S5=F2*F3, S6=F2*F4, S7=F2*F5, S8=F3*F4, S9=F3*F5, S10=F4*F5}, where "*" denotes the convolution operation.
  • each filter thus has multiple filter similarity values. Specifically, the filter similarity values of filter 1 are {S1, S2, S3, S4}; of filter 2, {S1, S5, S6, S7}; of filter 3, {S2, S5, S8, S9}; of filter 4, {S3, S6, S8, S10}; and of filter 5, {S4, S7, S9, S10} (a code sketch follows).
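  • A minimal sketch of this pairwise product-sum in PyTorch (the helper name is ours; it assumes the standard layout in which a convolutional layer's filters are stored as a weight tensor of shape [num_filters, in_channels, k, k]):

```python
import torch

def pairwise_filter_similarity(weight: torch.Tensor) -> torch.Tensor:
    # Flattening each filter turns the "sum of products of corresponding
    # positions" into a plain inner product between filter vectors.
    flat = weight.flatten(start_dim=1)   # [num_filters, in_channels*k*k]
    return flat @ flat.t()               # S[i, j] = sum(F_i * F_j)

w = torch.randn(5, 8, 3, 3)              # e.g. filters F1..F5 of one layer
S = pairwise_filter_similarity(w)        # S[0, 1] plays the role of S1 = F1*F2
```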
  • Step S130: calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values.
  • referring to Fig. 5, step S130 includes but is not limited to steps S131 to S132:
  • Step S131: determining the mean or sum of the multiple filter similarity values corresponding to each filter as the filter importance value of that filter.
  • the filter importance value of each filter can be calculated either by summing and averaging or by summing alone; that is, the sum can be used directly, or the mean can be taken after summing, and the specific calculation can be chosen according to actual needs.
  • the filter importance value in the above example is calculated by summing and averaging, expressed as:
  • the filter importance value of filter 1 is: (S1+S2+S3+S4)/4;
  • the filter importance value of filter 2 is: (S1+S5+S6+S7)/4;
  • the filter importance value of filter 3 is: (S2+S5+S8+S9)/4;
  • the filter importance value of filter 4 is: (S3+S6+S8+S10)/4;
  • the filter importance value of filter 5 is: (S4+S7+S9+S10)/4.
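  • Continuing the sketch above, the filter importance value is the mean (or sum) of each filter's similarity values, i.e. a row statistic of S with the self-similarity term excluded (helper name is ours):

```python
import torch

def filter_importance(S: torch.Tensor, use_mean: bool = True) -> torch.Tensor:
    n = S.size(0)
    off_diag = S.masked_fill(torch.eye(n, dtype=torch.bool), 0.0)  # drop S[i, i]
    total = off_diag.sum(dim=1)          # sum of the n - 1 pairwise values
    return total / (n - 1) if use_mean else total

imp = filter_importance(S)               # imp[0] = (S1 + S2 + S3 + S4) / 4
```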
  • in step S132, the pruning importance index corresponding to the convolutional layer is obtained according to the filter importance value of each filter in the convolutional layer.
  • referring to Fig. 6, step S132 includes but is not limited to steps S1321 to S1322:
  • Step S1321: sorting the filter importance values of the filters in the convolutional layer to obtain a sorting result.
  • Step S1322: obtaining the pruning importance index corresponding to the convolutional layer according to the sorting result.
  • the filters contained in each convolutional layer are sorted by importance from large to small, and a filter ranked higher has greater importance. It can be understood that the sorting may instead be done in ascending order, in which case the pruning filters are selected in reverse order when pruning; no specific limitation is made here.
  • the importance information of the filters in a convolutional layer is the pruning importance index of that convolutional layer, and the model to be pruned is then pruned according to the pruning importance index.
  • Step S140: pruning the model to be pruned according to the preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  • after the pruning importance index of each convolutional layer is obtained, filters can be selected for pruning according to it. A larger filter importance value for a selected filter indicates that the filter is more important in the model to be pruned, and cutting it off would have a greater impact on the performance of the model to be pruned. Therefore, during pruning this embodiment selects the filters with smaller filter importance values, i.e., the filters ranked lower in the sorting result, as the pruning filters and prunes them.
  • referring to Fig. 7, step S140 includes but is not limited to steps S141 to S142:
  • Step S141: determining the number of pruning filters in each convolutional layer according to the preset pruning rate.
  • the preset pruning rate must be set according to actual needs in the pruning operation: if the pruning rate is too high, the accuracy of the model is reduced; if it is too low, the improvement in the model's operating efficiency is poor.
  • pruning is performed according to the preset pruning rate, which decides how many filters to cut; that is, the number of pruning filters in each convolutional layer can be determined from the preset pruning rate. For example, at a 50% pruning rate, a convolutional layer with 64 filters would have 32 pruning filters.
  • Step S142: pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  • the pruning filters are determined from the multiple filters of each convolutional layer according to the preset pruning rate and the pruning importance index, and the pruning filters are pruned to obtain the pruned model. For example, if the preset pruning rate is set to 75%, the pruning operation removes 3/4 of the filters: according to the sorting result above, the 3/4 of filters with the smaller filter importance values are removed as pruning filters. Filters with smaller importance values play a weaker role in the model to be pruned and carry redundant information, so removing them does not greatly affect the performance of the model to be pruned, while effectively reducing its model parameters and lowering its computation and storage space (a selection sketch follows).
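  • A sketch of the selection in steps S141-S142 under the embodiment's rule of cutting the filters with the smaller importance values (the function name is ours):

```python
import torch

def select_pruning_filters(imp: torch.Tensor, pruning_rate: float) -> torch.Tensor:
    num_prune = int(imp.numel() * pruning_rate)  # S141: count from the rate
    order = torch.argsort(imp)                   # ascending importance
    return order[:num_prune]                     # S142: lowest-ranked filters

to_prune = select_pruning_filters(imp, pruning_rate=0.75)  # e.g. a 75% rate
```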
  • after the pruned model is obtained, the pruned model needs to be fine-tuned to restore the accuracy of the model, in order to compensate for the accumulated error caused by filter pruning.
  • referring to Fig. 8, the steps for fine-tuning the pruned model include but are not limited to steps S810 to S820:
  • Step S810: selecting some filters of the pruned model according to a preset selection rule.
  • the preset selection rule may be to select some filters close to the input end of the pruned model; the number of filters selected may be set according to actual requirements and is not limited here.
  • Step S820: performing model training on the remaining filters and the corresponding fully connected layers in the pruned model to obtain the fine-tuned pruned model.
  • model training is performed on the target data set for the remaining filters (for example, the filters near the output end) and the corresponding fully connected layers, realizing fine-tuning compensation for the pruned model and achieving the goal of not affecting the model's operational performance while maximizing the model compression scale.
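  • One way to read steps S810-S820 as code, strictly as a sketch: the model, the data loader, and the parameter-name prefix standing in for the "filters near the input" selected by the preset rule are all our assumptions, not an API prescribed by the embodiment:

```python
import torch
import torch.nn as nn

def finetune(model: nn.Module, loader, epochs: int = 100,
             frozen_prefix: str = "features.0"):
    for name, p in model.named_parameters():    # S810: freeze the selected
        p.requires_grad = not name.startswith(frozen_prefix)  # input-side part
    opt = torch.optim.SGD((p for p in model.parameters() if p.requires_grad),
                          lr=1e-3, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):                     # S820: train the remaining
        for x, y in loader:                     # filters and the FC layers
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
```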
  • in a specific application scenario, the VGG16 model is used as the model to be pruned to verify the effectiveness of the convolutional neural network model pruning method of the above embodiments; the VGG16 model is a convolutional neural network suited to classification and localization tasks.
  • the model consists of 5 convolutional layers, 3 fully connected layers, and a softmax output layer; the layers are separated by max-pooling, and the activation units of all hidden layers use the ReLU function.
  • the VGG16 model uses multiple convolutional layers with smaller convolution kernels (such as 3x3) in place of a single convolutional layer with a larger kernel; on the one hand this reduces parameters, and on the other hand it amounts to more nonlinear mappings, increasing the fitting/expressive power of the network.
  • the data set used for verification is the CIFAR-10 data set, which contains 60,000 color images of size 32*32, divided into 10 classes with 6,000 images each.
  • each experiment performed 100 epochs of pruned-model compression training.
  • the hardware used for the verification was an NVIDIA V100 GPU, and the PyTorch framework was used throughout.
  • the preset pruning rate was 50%.
  • the compression methods (i.e., pruning methods) used for verification include:
  • 1) the APoZ model pruning method: the objects to prune are determined according to the percentage of zeros in the activation function's output, and APoZ is used to predict the importance of each filter in the network.
  • 2) the minimum-activation model pruning method: before activation, the model weights and biases are set to 0; after activation, the filters with the least influence on the activation values of the next layer, i.e., those with the smallest average activation (meaning the least-used filters), are cut off.
  • 3) the L1 model pruning method: pruning based on the L1-norm weight parameters; each convolutional layer cuts off a certain proportion of the filters with smaller L1 norms.
  • 4) the convolutional neural network model pruning method of the above embodiments.
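  • For comparison, the L1 baseline (method 3 above) needs no pairwise term at all; reusing the weight tensor w from the earlier sketch:

```python
import torch

l1_imp = w.abs().flatten(start_dim=1).sum(dim=1)                  # L1 norm per filter
l1_to_prune = torch.argsort(l1_imp)[: int(0.5 * l1_imp.numel())]  # e.g. 50% rate
```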
  • according to the verification results, the computational accuracy of the model to be pruned without pruning is 93.99%.
  • the table below compares the computational accuracy of the pruned models obtained by the different pruning methods:

    Pruning method          | Unpruned | APoZ   | Min. activation | L1     | Present application
    Computational accuracy  | 93.99%   | 92.24% | 92.81%          | 93.05% | 93.42%
  • as the table shows, the convolutional neural network model pruning method of the embodiment of the present application achieves the highest computational accuracy, 93.42%, approaching the 93.99% accuracy of the unpruned model to be pruned.
  • it can be seen that the pruning method of the embodiments of the present application does not greatly affect the computational accuracy of the pruned model, also has a certain regularization effect, and at the same time effectively reduces the model parameters of the model to be pruned, lowering its computation and storage space.
  • the convolutional neural network model pruning method proposed in the embodiment of the present application acquires the convolutional layer information in the model to be pruned, performs convolution calculations according to that information to obtain the filter similarity values corresponding to the filters in each convolutional layer, computes each convolutional layer's pruning importance index from the filter similarity values, and prunes the model to be pruned according to the preset pruning rate and each layer's pruning importance index to obtain the pruned model.
  • this embodiment performs convolution calculations on the filters in the convolutional layers to obtain the filter importance values, and from these obtains the pruning importance index corresponding to each convolutional layer.
  • the embodiment of the present application also provides a convolutional neural network model pruning device, which can implement the above convolutional neural network model pruning method.
  • the device includes:
  • the convolutional layer information acquisition module 910 is used to acquire the convolutional layer information in the model to be pruned;
  • the filter similarity calculation module 920 is used to perform convolution calculation according to the convolutional layer information to obtain the filter similarity value corresponding to the filter in each convolutional layer;
  • the pruning importance index calculation module 930 is configured to calculate the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
  • the pruning module 940 is configured to prune the model to be pruned to obtain a pruned model according to the preset pruning rate and the pruning importance index corresponding to each convolution layer.
  • the specific implementation manner of the convolutional neural network model pruning device of this embodiment is basically the same as the specific implementation manner of the above-mentioned convolutional neural network model pruning method, and will not be repeated here.
  • the embodiment of the present application also provides an electronic device, including:
  • the program is stored in a memory, and the processor executes the at least one program to implement a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
  • acquiring convolutional layer information of the model to be pruned; performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  • the electronic device may be any intelligent terminal including a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA for short), and a vehicle-mounted computer.
  • FIG. 10 illustrates a hardware structure of an electronic device in another embodiment.
  • the electronic device includes:
  • the processor 1001 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute relevant programs to implement the technical solutions provided by the embodiments of the present application;
  • the memory 1002 may be implemented in the form of a ROM (ReadOnly Memory, read-only memory), a static storage device, a dynamic storage device, or a RAM (Random Access Memory, random access memory).
  • the memory 1002 can store operating systems and other application programs.
  • the relevant program codes are stored in the memory 1002, and the processor 1001 calls them to execute the convolutional neural network model pruning method of the embodiments of the present application;
  • the input/output interface 1003 is used for inputting and outputting information;
  • the communication interface 1004 is used to realize the communication and interaction between the device and other devices, and the communication can be realized through a wired method (such as USB, network cable, etc.), or can be realized through a wireless method (such as a mobile network, WIFI, Bluetooth, etc.); and
  • bus 1005 which transmits information between various components of the device (such as processor 1001, memory 1002, input/output interface 1003 and communication interface 1004);
  • the processor 1001 , the memory 1002 , the input/output interface 1003 and the communication interface 1004 are connected to each other within the device through the bus 1005 .
  • the embodiment of the present application also provides a storage medium, the storage medium is a computer-readable storage medium, and the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to make a computer execute a convolutional neural network A network model pruning method, wherein the convolutional neural network model pruning method includes:
  • acquiring convolutional layer information of the model to be pruned; performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; calculating the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  • the convolutional neural network model pruning method, convolutional neural network model pruning apparatus, electronic device, and storage medium proposed in the embodiments of the present application obtain the filter importance values by performing convolution calculations on the filters in the convolutional layers, and then obtain the pruning importance index corresponding to each convolutional layer. The convolution operation quantifies the importance of the filters in a convolutional layer; the redundant information of the filters inside each convolutional layer of the model is obtained from the filter importance values, and using this redundant information for pruning can improve the accuracy of convolutional neural network model pruning and improve model compression precision and operation speed.
  • the computer-readable storage medium may be non-volatile or volatile.
  • memory can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • "at least one (item)" means one or more, and "multiple" means two or more.
  • "and/or" is used to describe an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
  • "at least one of the following" or similar expressions refers to any combination of these items, including any combination of single items or plural items.
  • for example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a division by logical function; in actual implementation there may be other division methods.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • the integrated unit is realized in the form of a software function unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes multiple instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage media include: USB flash drives, removable hard disks, read-only memories (Read-Only Memory, ROM), random access memories (Random Access Memory, RAM), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application provide a convolutional neural network model pruning method and apparatus, an electronic device, and a storage medium, relating to the field of artificial intelligence. The convolutional neural network model pruning method includes: acquiring the convolutional layer information in a model to be pruned; performing convolution calculations according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer; computing the pruning importance index corresponding to each convolutional layer according to the filter similarity values; and pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model. This embodiment quantifies the importance of the filters in a convolutional layer through convolution operations, derives from the filter importance values the redundant information of the filters inside each convolutional layer of the model, and then uses this redundant information for pruning, which can improve the pruning accuracy of the convolutional neural network model and improve model compression precision and operation speed.

Description

Convolutional neural network model pruning method and apparatus, electronic device, and storage medium
This application claims priority to Chinese patent application No. 202210163245.0, filed with the Chinese Patent Office on February 22, 2022 and entitled "Convolutional neural network model pruning method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular to a convolutional neural network model pruning method and apparatus, an electronic device, and a storage medium.
Background Art
With the development of Internet technology and artificial intelligence, models based on convolutional neural networks have shown good performance in many tasks; convolutional neural network models for object detection, for example, are widely applied. However, these models require huge computational overhead and memory usage when used, and since they usually contain a large amount of redundant information, compressing the model to reduce the computational overhead and memory usage during use becomes an essential step. Model pruning is an important direction of model compression technology: the detection models and segmentation models in deep learning can remove redundant parameters through pruning, preserving model accuracy as far as possible while compressing the model size and increasing the model's operation speed.
Technical Problem
The technical problems in the prior art recognized by the inventor are as follows: current model pruning methods select the pruning filters by considering only the information of a single filter; they neither consider the relationship between filters nor use that relationship to obtain the redundant information of the filters inside each convolutional layer of the model and then prune with this redundant information, so the pruning accuracy and model compression precision of the convolutional neural network model are low.
Technical Solution
In a first aspect, an embodiment of the present application proposes a convolutional neural network model pruning method, including:
acquiring convolutional layer information of a model to be pruned;
performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
In a second aspect, an embodiment of the present application proposes a convolutional neural network model pruning apparatus, including:
a convolutional layer information acquisition module, configured to acquire the convolutional layer information in a model to be pruned;
a filter similarity calculation module, configured to perform convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer;
a pruning importance index calculation module, configured to compute the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
a pruning module, configured to prune the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
In a third aspect, an embodiment of the present application proposes an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes the at least one program to implement a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
acquiring convolutional layer information of a model to be pruned;
performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
In a fourth aspect, an embodiment of the present application proposes a storage medium, the storage medium being a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to cause a computer to execute a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
acquiring convolutional layer information of a model to be pruned;
performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
Beneficial Effects
The convolutional neural network model pruning method and apparatus, electronic device, and storage medium proposed in the embodiments of the present application obtain filter importance values by performing convolution calculations on the filters in the convolutional layers, and then obtain the pruning importance index corresponding to each convolutional layer. Quantifying the importance of the filters in a convolutional layer through the convolution operation, deriving the redundant information of the filters inside each convolutional layer of the model from the filter importance values, and then pruning with this redundant information can improve the pruning accuracy of the convolutional neural network model and improve model compression precision and operation speed.
Brief Description of the Drawings
Fig. 1 is a flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 2 is another flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 3 is a schematic diagram of a convolutional layer in a convolutional neural network model.
Fig. 4 is a schematic diagram of a filter in a convolutional neural network model.
Fig. 5 is another flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 6 is another flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 7 is another flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 8 is another flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application.
Fig. 9 is a structural block diagram of the convolutional neural network model pruning apparatus provided by an embodiment of the present application.
Fig. 10 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
Embodiments of the Present Invention
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the apparatus schematic diagrams and a logical order is shown in the flow charts, in some cases the steps shown or described may be executed with a module division different from that in the apparatus or in an order different from that in the flow charts.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
First, several terms involved in the present application are explained:
Convolutional neural network (Convolutional Neural Networks, CNN): a class of feed-forward neural networks that involve convolution calculations and have a deep structure; it is one of the representative algorithms of deep learning. A convolutional neural network has representation-learning capability and can perform shift-invariant classification of input information according to its hierarchical structure. Convolutional neural networks are built by imitating the biological visual perception mechanism and can perform both supervised and unsupervised learning; the sharing of convolution kernel parameters within the hidden layers and the sparsity of inter-layer connections enable a convolutional neural network to extract gridded features with a small amount of computation. A common convolutional neural network structure is input layer - convolutional layer - pooling layer - fully connected layer - output layer.
Convolution: a mathematical method of integral transformation; a mathematical operator that generates a third function from two functions f and g, characterizing the integral of the product of f and the flipped, translated g over the length of their overlap. If one of the functions participating in the convolution is regarded as the indicator function of an interval, convolution can also be viewed as a "sliding average".
With the development of Internet technology and artificial intelligence, models based on convolutional neural networks have shown good performance in many tasks, but these models require huge computational overhead and memory usage when used; since they usually contain a large amount of redundant information, compressing the model to reduce the computational overhead and memory usage during use becomes an essential step. Model pruning is an important direction of model compression technology: the detection models and segmentation models in deep learning can remove redundant parameters through pruning, preserving model accuracy as far as possible while compressing the model size and increasing the model's operation speed.
The operation of model pruning mainly consists of two steps: first, the relatively unimportant convolution-kernel filters are selected and removed; then the model from which the unimportant filters were removed is fine-tuned and optimized to recover the precision lost by removing the filters. The pruning methods in the related art therefore all address how to select the relatively unimportant convolution-kernel filters. Three approaches are common: 1) directly use the magnitude of the BN-layer weights: this is easy to understand and implement, but a BN-layer weight is a poor measure of the information a filter actually carries, and the two are not strongly correlated, so the information correlation between filters cannot be measured; 2) use the magnitude of the L1 or L2 norm of a filter as the importance criterion: this has a shortcoming similar to the first method, relying only on the magnitude of the values without considering the correlation between filters; 3) use the geometric median of the space in which the filters lie: this method first computes the filter nearest to the geometric median of all filters and then prunes it, but there is no rigorous evidence that the information content of the geometric median can really be replaced by the information content of other filters.
It can thus be seen that the current model pruning methods in the related art select pruning filters by considering only the information of a single filter, without considering the relationship between filters or using that relationship to obtain the redundant information of the filters inside each convolutional layer of the model and then pruning with this redundant information, so the pruning accuracy and model compression precision of the convolutional neural network model are low.
On this basis, the embodiments of the present application provide a convolutional neural network model pruning method and apparatus, an electronic device, and a storage medium, which obtain filter importance values by performing convolution calculations on the filters in the convolutional layers and thereby obtain the pruning importance index corresponding to each convolutional layer. Quantifying the importance of the filters in a convolutional layer through the convolution operation, deriving the redundant information of the filters inside each convolutional layer of the model from the filter importance values, and then pruning with this redundant information can improve the pruning accuracy of the convolutional neural network model and improve model compression precision and operation speed.
Embodiments of the present application provide a convolutional neural network model pruning method and apparatus, an electronic device, and a storage medium, which are specifically described through the following embodiments; the convolutional neural network model pruning method in the embodiments of the present application is described first.
The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly covers computer vision technology, robotics, biometrics, speech processing technology, natural language processing technology, and machine learning/deep learning.
The convolutional neural network model pruning method provided in the embodiments of the present application relates to the field of artificial intelligence technology, and in particular to the field of data mining technology. The method may be applied in a terminal or on a server side, or may be software running in a terminal or on a server side. In some embodiments, the terminal may be a smart phone, tablet computer, notebook computer, desktop computer, smart watch, or the like; the server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms; the software may be an application or the like that implements the convolutional neural network model pruning method, but is not limited to the above forms.
The present application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and so on. The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Fig. 1 is an optional flow chart of the convolutional neural network model pruning method provided by an embodiment of the present application; the method in Fig. 1 may include but is not limited to steps S110 to S140.
Step S110: acquiring the convolutional layer information in the model to be pruned.
In one embodiment, the model to be pruned is a convolutional neural network model, and the convolutional layer information may be the filters contained in the convolutional layer. When pruning the model to be pruned, this embodiment takes into account the relationship between the filters and the redundant information of the filters inside each convolutional layer of the model, in order to quantify the importance of the filters, improve the accuracy of convolutional neural network model pruning, and improve model compression precision and operation speed.
Step S120: performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer.
In one embodiment, referring to Fig. 2, step S120 includes but is not limited to steps S121 to S122:
Step S121: acquiring the filters corresponding to each convolutional layer, where each convolutional layer corresponds to at least two filters.
Step S122: performing pairwise convolution calculations on the filters in each convolutional layer to obtain the multiple filter similarity values corresponding to each filter in the convolutional layer.
In one embodiment, each convolutional layer contains multiple filters. Referring to Fig. 3, a schematic diagram of a convolutional layer, X and Y in Fig. 3 are two consecutive feature maps (Feature map) in a convolutional neural network model (such as the model to be pruned), with one or more convolutional layers between X and Y. The filters identify particular features of an image, and each filter slides over and convolves the feature map of the previous layer; for example, Conv1 shown in the figure denotes one of these convolutional layers, and Feature map Y is obtained from Feature map X after the computation of the convolutional layer. Each convolutional layer is composed of multiple filters (Filter), and each filter has several channels (Channel) from front to back.
In one embodiment, a filter is a tool used to extract features from an image, for example edge features and texture features; a filter is composed of multiple 2D filters. Referring to Fig. 4, a schematic diagram of a filter: the filter shown in Fig. 4 can be used to detect edges and is a common LoG filter in image processing.
In this embodiment the model to be pruned is a convolutional neural network model; the values in the filters of the convolutional neural network model are obtained by training the model on training data and are not manually designed filters. When extracting features, the filters in a convolutional layer use the convolution operation: the filter passes over regions of the original image in turn and takes the dot product with the original pixel values of each region. If the features in the image are very similar to the features of the filter, the dot-product sum is high; when there is no correspondence between the image region and the filter, the value obtained by dot multiplication and summation is very small; if a filter is dot-multiplied and summed with itself, a large value is obtained.
Since each convolutional layer is composed of multiple filters, the filters can be compared; if the comparison shows that two filters are quite similar, the roles they play in the convolutional layer are also similar, so this can be treated as redundant information of the convolutional layer, and one of the filters can be removed for model pruning. The purpose of pruning the model to be pruned is to remove a certain proportion of the filters; for example, the filter shown with a dotted line in Fig. 3 can be pruned.
In the above embodiment, the final prediction performance of the pruned model should degrade as little as possible, so the importance of each filter in a convolutional layer must be evaluated and the least important filters removed according to that importance, minimizing the impact on model accuracy. The related art computes the similarity between filters by directly taking the difference of the corresponding elements of two filters' values. This method has the following defects: 1) it does not consider the filter's role of extracting features; 2) it treats all elements of a filter as having the same effect, without considering the different positions of the elements within the filter, although position plays a large, non-negligible role in feature extraction; 3) the direct-difference method has no physical meaning and lacks theoretical basis.
In one embodiment, the convolution calculation between two filters is used to measure filter similarity. The more similar two filters are, the more similar the features they extract, and the larger the value obtained by the convolution operation between them, which further indicates that the importance of the corresponding filter is relatively weak: it does not play a large role in the computation of the convolutional layer, its information is redundant, and it can be deleted during pruning. Conversely, if the values obtained by the convolution operations are small, the filter has a significant influence on the result of the convolutional layer's computation and carries a significant amount of information, so it cannot be deleted. Therefore, this embodiment first obtains the filters of each convolutional layer and performs pairwise convolution calculations on them, obtaining the multiple filter similarity values corresponding to each filter in the layer.
In one embodiment, the convolution operation between filters is described as: computing the sum of the products of the corresponding positions of the two filters. The pairwise convolution process between filters is described as follows:
For example, a convolutional layer of the model to be pruned includes five filters: filter 1 (denoted F1), filter 2 (F2), filter 3 (F3), filter 4 (F4), and filter 5 (F5). Pairwise convolution of the filters yields the filter similarity values {S1=F1*F2, S2=F1*F3, S3=F1*F4, S4=F1*F5, S5=F2*F3, S6=F2*F4, S7=F2*F5, S8=F3*F4, S9=F3*F5, S10=F4*F5}, where "*" denotes the convolution operation.
That is, in the above example each filter has multiple filter similarity values. Specifically, the filter similarity values of filter 1 are {S1, S2, S3, S4}; of filter 2, {S1, S5, S6, S7}; of filter 3, {S2, S5, S8, S9}; of filter 4, {S3, S6, S8, S10}; and of filter 5, {S4, S7, S9, S10}. A small worked example follows.
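To make the product-sum concrete, here is a toy check with hypothetical 2×2 filters (values chosen by us purely for illustration):

$$F_1 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad F_2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad F_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

$$S_{12} = \sum_{i,j} F_1(i,j)\,F_2(i,j) = 1\cdot1 + 2\cdot2 + 0\cdot0 + 1\cdot1 = 6, \qquad S_{13} = 1\cdot0 + 2\cdot(-1) + 0\cdot1 + 1\cdot0 = -2$$

The identical pair yields the large value and the dissimilar pair the small one, matching the redundancy reading above.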
Step S130: computing the pruning importance index corresponding to each convolutional layer according to the filter similarity values.
In one embodiment, referring to Fig. 5, step S130 includes but is not limited to steps S131 to S132:
Step S131: determining the mean or sum of the multiple filter similarity values corresponding to each filter as the filter importance value of that filter.
In one embodiment, the filter importance value of each filter can be calculated either by summing and averaging or by summing alone; that is, the sum alone can be used, or the mean can be taken after summing, and the specific calculation can be chosen according to actual needs.
For example, taking the mean after summing, the filter importance values in the above example are expressed as:
the filter importance value of filter 1 is (S1+S2+S3+S4)/4;
the filter importance value of filter 2 is (S1+S5+S6+S7)/4;
the filter importance value of filter 3 is (S2+S5+S8+S9)/4;
the filter importance value of filter 4 is (S3+S6+S8+S10)/4;
the filter importance value of filter 5 is (S4+S7+S9+S10)/4.
Step S132: obtaining the pruning importance index of the convolutional layer according to the filter importance value of each filter in the convolutional layer.
In one embodiment, referring to Fig. 6, step S132 includes but is not limited to steps S1321 to S1322:
Step S1321: sorting the filter importance values of the filters in the convolutional layer to obtain a sorting result.
Step S1322: obtaining the pruning importance index of the convolutional layer according to the sorting result.
In one embodiment, the filters contained in each convolutional layer are sorted by importance from large to small, and a filter ranked higher has greater importance. It can be understood that the sorting may instead be done in ascending order, in which case the pruning filters are selected in reverse order during pruning; no specific limitation is made here. The importance information of the filters in a convolutional layer is the pruning importance index of that layer, and the model to be pruned is then pruned according to the pruning importance index.
Step S140: pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
In one embodiment, after the pruning importance index of each convolutional layer is obtained, filters can be selected for pruning according to it. A larger filter importance value for a selected filter indicates that the filter is more important in the model to be pruned, and cutting it off would have a greater impact on the performance of the model to be pruned; therefore, during pruning this embodiment selects the filters with smaller importance values, i.e., the filters ranked lower in the sorting result, as the pruning filters and prunes them.
In one embodiment, referring to Fig. 7, step S140 includes but is not limited to steps S141 to S142:
Step S141: determining the number of pruning filters in each convolutional layer according to the preset pruning rate.
In one embodiment, the preset pruning rate must be set according to actual needs in the pruning operation: too high a pruning rate reduces model accuracy, and too low a rate yields little improvement in the model's operating efficiency. Pruning according to the preset pruning rate decides how many filters to cut; that is, the number of pruning filters in each convolutional layer can be determined from the preset pruning rate.
Step S142: pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
In one embodiment, the pruning filters are determined from the multiple filters of each convolutional layer according to the preset pruning rate and the pruning importance index, and the pruning filters are pruned to obtain the pruned model. For example, if the preset pruning rate is set to 75%, 3/4 of the filters are removed by the pruning operation: according to the sorting result above, the 3/4 of filters with the smaller importance values are removed as pruning filters. Such filters play a weaker role in the model to be pruned and carry redundant information, so removing them does not greatly affect the performance of the model to be pruned while effectively reducing its model parameters and lowering its computation and storage space. A layer-level sketch follows.
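As a minimal PyTorch sketch of physically removing the selected filters from a single layer (the helper name is ours; a complete pruner must also shrink the input channels of the following layer and any batch-normalization parameters to match):

```python
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_idx: torch.Tensor) -> nn.Conv2d:
    # Build a new Conv2d that keeps only the filters listed in keep_idx.
    new_conv = nn.Conv2d(conv.in_channels, len(keep_idx), conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep_idx].clone()
    return new_conv
```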
In some embodiments, after the pruned model is obtained, the pruned model needs to be fine-tuned to restore the model's accuracy in order to compensate for the accumulated error caused by filter pruning. Referring to Fig. 8, the steps for fine-tuning the pruned model include but are not limited to steps S810 to S820:
Step S810: selecting some filters of the pruned model according to a preset selection rule.
In one embodiment, the preset selection rule may be to select some filters close to the input end of the pruned model; the number of filters selected may be set according to actual requirements and is not limited here.
Step S820: performing model training on the remaining filters and the corresponding fully connected layers in the pruned model to obtain the fine-tuned pruned model.
In one embodiment, model training is performed on the target data set for the remaining filters (for example, the filters near the output end) and the corresponding fully connected layers, realizing fine-tuning compensation for the pruned model and achieving the goal of not affecting the model's operational performance while maximizing the model compression scale.
In a specific application scenario, the VGG16 model is used as the model to be pruned to verify the effectiveness of the convolutional neural network model pruning method of the above embodiments. The VGG16 model is a convolutional neural network model suited to classification and localization tasks; it consists of 5 convolutional layers, 3 fully connected layers, and a softmax output layer, with max-pooling separating the layers, and the activation units of all hidden layers use the ReLU function. The VGG16 model also uses multiple convolutional layers with smaller convolution kernels (such as 3x3) in place of a single convolutional layer with a larger kernel; on the one hand this reduces parameters, and on the other hand it amounts to more nonlinear mappings, increasing the fitting/expressive power of the network.
The data set used for verification is the CIFAR-10 data set, which contains 60,000 color images of size 32*32, divided into 10 classes with 6,000 images each. 50,000 of the images are used for the training process, forming 5 training batches of 10,000 images each; the other 10,000 images are used for the testing process and form a single batch. The test batch contains exactly 1,000 randomly selected images from each of the 10 classes, and the remaining images are randomly arranged to form the training batches.
During verification, each experiment performed 100 epochs of pruned-model compression training; the hardware used was an NVIDIA V100 GPU, the PyTorch framework was used throughout, and the preset pruning rate was 50% (a data-loading sketch follows).
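This data loading maps onto standard torchvision calls; a sketch (the batch size here is our assumption, not specified in the text):

```python
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=T.ToTensor())
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=T.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False)
```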
The compression methods (i.e., pruning methods) used for verification include:
1) the APoZ model pruning method: the objects to prune are determined according to the percentage of zeros in the activation function's output, and APoZ is used to predict the importance of each filter in the network.
2) the minimum-activation model pruning method: before activation, the model weights and biases are set to 0; after activation, the filters with the least influence on the activation values of the next layer, i.e., those with the smallest average activation (meaning the least-used filters), are cut off.
3) the L1 model pruning method: pruning based on the L1-norm weight parameters; each convolutional layer cuts off a certain proportion of the filters with smaller L1 norms.
4) the convolutional neural network model pruning method of the above embodiments.
According to the verification results, the computational accuracy of the model to be pruned without pruning is 93.99%. The table below compares the computational accuracy of the pruned models obtained by the above pruning methods:

Pruning method          | Unpruned model | APoZ   | Minimum activation | L1     | Present application
Computational accuracy  | 93.99%         | 92.24% | 92.81%             | 93.05% | 93.42%
As can be seen from the table above, the convolutional neural network model pruning method of the embodiment of the present application achieves the highest computational accuracy, 93.42%, approaching the 93.99% accuracy of the unpruned model to be pruned. It can be seen that the pruning method of the embodiments of the present application does not greatly affect the computational accuracy of the pruned model, also has a certain regularization effect, and at the same time effectively reduces the model parameters of the model to be pruned, lowering its computation and storage space.
The convolutional neural network model pruning method proposed in the embodiment of the present application acquires the convolutional layer information in the model to be pruned, performs convolution calculations according to that information to obtain the filter similarity values corresponding to the filters in each convolutional layer, computes each convolutional layer's pruning importance index from the similarity values, and prunes the model to be pruned according to the preset pruning rate and each layer's pruning importance index to obtain the pruned model. This embodiment performs convolution calculations on the filters in the convolutional layers to obtain the filter importance values and thus the pruning importance index of each convolutional layer; quantifying filter importance through the convolution operation, deriving the redundant information of the filters inside each convolutional layer of the model from the filter importance values, and then pruning with this redundant information can improve the pruning accuracy of the convolutional neural network model and improve model compression precision and operation speed.
In addition, an embodiment of the present application further provides a convolutional neural network model pruning apparatus that can implement the above convolutional neural network model pruning method. Referring to Fig. 9, the apparatus includes:
a convolutional layer information acquisition module 910, configured to acquire the convolutional layer information in the model to be pruned;
a filter similarity calculation module 920, configured to perform convolution calculations according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer;
a pruning importance index calculation module 930, configured to calculate the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
a pruning module 940, configured to prune the model to be pruned according to the preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
The specific implementation of the convolutional neural network model pruning apparatus of this embodiment is substantially the same as that of the convolutional neural network model pruning method described above and is not repeated here.
An embodiment of the present application further provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes the at least one program to implement a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
acquiring convolutional layer information of a model to be pruned;
performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer;
computing the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
The electronic device may be any intelligent terminal, including a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a vehicle-mounted computer, and the like.
Referring to Fig. 10, which illustrates the hardware structure of an electronic device of another embodiment, the electronic device includes:
a processor 1001, which may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute relevant programs to implement the technical solutions provided by the embodiments of the present application;
a memory 1002, which may be implemented in the form of a ROM (Read-Only Memory), a static storage device, a dynamic storage device, or a RAM (Random Access Memory). The memory 1002 can store an operating system and other application programs; when the technical solutions provided by the embodiments of this specification are implemented by software or firmware, the relevant program code is stored in the memory 1002 and called by the processor 1001 to execute the convolutional neural network model pruning method of the embodiments of the present application;
an input/output interface 1003, configured to input and output information;
a communication interface 1004, configured to realize communication and interaction between this device and other devices, where communication can be realized in a wired manner (e.g., USB, network cable) or in a wireless manner (e.g., mobile network, WiFi, Bluetooth); and
a bus 1005, which transfers information between the components of the device (such as the processor 1001, the memory 1002, the input/output interface 1003, and the communication interface 1004);
wherein the processor 1001, the memory 1002, the input/output interface 1003, and the communication interface 1004 are communicatively connected to one another inside the device through the bus 1005.
An embodiment of the present application further provides a storage medium, the storage medium being a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to cause a computer to execute a convolutional neural network model pruning method, wherein the convolutional neural network model pruning method includes:
acquiring convolutional layer information of a model to be pruned;
performing convolution calculation according to the convolutional layer information to obtain the filter similarity values corresponding to the filters in each convolutional layer;
computing the pruning importance index corresponding to each convolutional layer according to the filter similarity values;
pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
The convolutional neural network model pruning method, convolutional neural network model pruning apparatus, electronic device, and storage medium proposed in the embodiments of the present application obtain the filter importance values by performing convolution calculations on the filters in the convolutional layers, and then obtain the pruning importance index corresponding to each convolutional layer. Quantifying the importance of the filters in a convolutional layer through the convolution operation, deriving the redundant information of the filters inside each convolutional layer of the model from the filter importance values, and then pruning with this redundant information can improve the pruning accuracy of the convolutional neural network model and improve model compression precision and operation speed.
The computer-readable storage medium may be non-volatile or volatile. As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memory disposed remotely from the processor, and these remote memories may be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present application are intended to explain the technical solutions of the embodiments of the present application more clearly and do not constitute a limitation on them; those skilled in the art will know that, as technology evolves and new application scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
Those skilled in the art can understand that the technical solutions shown in Figs. 1-8 do not constitute a limitation on the embodiments of the present application, which may include more or fewer steps than illustrated, combine certain steps, or use different steps.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art can understand that all or some of the steps of the methods and the functional modules/units of the systems and devices disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof.
The terms "first", "second", "third", "fourth", etc. (if present) in the specification of the present application and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such a process, method, product, or device.
It should be understood that in the present application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" is used to describe an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refers to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or multiple.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a division by logical function, and in actual implementation there may be other divisions, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may physically exist separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes multiple instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application. The foregoing storage media include: USB flash drives, removable hard disks, read-only memories (Read-Only Memory, ROM), random access memories (Random Access Memory, RAM), magnetic disks, optical discs, and other media that can store program code.
The preferred embodiments of the embodiments of the present application have been described above with reference to the accompanying drawings, which does not thereby limit the scope of rights of the embodiments of the present application. Any modification, equivalent replacement, or improvement made by those skilled in the art without departing from the scope and essence of the embodiments of the present application shall fall within the scope of rights of the embodiments of the present application.

Claims (20)

  1. A convolutional neural network model pruning method, comprising:
    acquiring convolutional layer information of a model to be pruned;
    performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
    computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
    pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  2. The convolutional neural network model pruning method according to claim 1, wherein the performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer comprises:
    acquiring at least two filters corresponding to each convolutional layer;
    performing pairwise convolution calculation on the filters in the convolutional layer to obtain the multiple filter similarity values corresponding to each filter in the convolutional layer.
  3. The convolutional neural network model pruning method according to claim 1, wherein the computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values comprises:
    determining the mean or sum of the multiple filter similarity values corresponding to each filter as the filter importance value corresponding to that filter;
    obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer.
  4. The convolutional neural network model pruning method according to claim 3, wherein the obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer comprises:
    sorting the filter importance values of the filters in the convolutional layer to obtain a sorting result;
    obtaining the pruning importance index corresponding to the convolutional layer according to the sorting result.
  5. The convolutional neural network model pruning method according to claim 1, wherein the pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model comprises:
    determining the number of pruning filters in each convolutional layer according to the preset pruning rate;
    pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  6. The convolutional neural network model pruning method according to claim 5, wherein the pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model comprises:
    determining the pruning filters from the multiple filters of each convolutional layer according to the preset pruning rate and the pruning importance index;
    pruning the pruning filters to obtain the pruned model.
  7. The convolutional neural network model pruning method according to any one of claims 1 to 6, further comprising, after the pruned model is obtained:
    selecting some filters of the pruned model according to a preset selection rule;
    performing model training on the remaining filters and the corresponding fully connected layers in the pruned model to obtain the pruned model.
  8. A convolutional neural network model pruning apparatus, comprising:
    a convolutional layer information acquisition module, configured to acquire convolutional layer information in a model to be pruned;
    a filter similarity calculation module, configured to perform convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
    a pruning importance index calculation module, configured to compute a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
    a pruning module, configured to prune the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  9. An electronic device, comprising:
    at least one memory;
    at least one processor;
    at least one program;
    wherein the program is stored in the memory, and the processor executes the at least one program to implement a convolutional neural network model pruning method, the convolutional neural network model pruning method comprising:
    acquiring convolutional layer information of a model to be pruned;
    performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
    computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
    pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  10. The electronic device according to claim 9, wherein the performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer comprises:
    acquiring at least two filters corresponding to each convolutional layer;
    performing pairwise convolution calculation on the filters in the convolutional layer to obtain the multiple filter similarity values corresponding to each filter in the convolutional layer.
  11. The electronic device according to claim 9, wherein the computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values comprises:
    determining the mean or sum of the multiple filter similarity values corresponding to each filter as the filter importance value corresponding to that filter;
    obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer.
  12. The electronic device according to claim 11, wherein the obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer comprises:
    sorting the filter importance values of the filters in the convolutional layer to obtain a sorting result;
    obtaining the pruning importance index corresponding to the convolutional layer according to the sorting result.
  13. The electronic device according to claim 9, wherein the pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model comprises:
    determining the number of pruning filters in each convolutional layer according to the preset pruning rate;
    pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  14. The electronic device according to claim 13, wherein the pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model comprises:
    determining the pruning filters from the multiple filters of each convolutional layer according to the preset pruning rate and the pruning importance index;
    pruning the pruning filters to obtain the pruned model.
  15. A storage medium, the storage medium being a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, the computer-executable instructions being configured to cause a computer to execute a convolutional neural network model pruning method, the convolutional neural network model pruning method comprising:
    acquiring convolutional layer information of a model to be pruned;
    performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer;
    computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values;
    pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model.
  16. The storage medium according to claim 15, wherein the performing convolution calculation according to the convolutional layer information to obtain filter similarity values corresponding to the filters in each convolutional layer comprises:
    acquiring at least two filters corresponding to each convolutional layer;
    performing pairwise convolution calculation on the filters in the convolutional layer to obtain the multiple filter similarity values corresponding to each filter in the convolutional layer.
  17. The storage medium according to claim 15, wherein the computing a pruning importance index corresponding to each convolutional layer according to the filter similarity values comprises:
    determining the mean or sum of the multiple filter similarity values corresponding to each filter as the filter importance value corresponding to that filter;
    obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer.
  18. The storage medium according to claim 17, wherein the obtaining the pruning importance index corresponding to the convolutional layer according to the filter importance value of each filter in the convolutional layer comprises:
    sorting the filter importance values of the filters in the convolutional layer to obtain a sorting result;
    obtaining the pruning importance index corresponding to the convolutional layer according to the sorting result.
  19. The storage medium according to claim 15, wherein the pruning the model to be pruned according to a preset pruning rate and the pruning importance index corresponding to each convolutional layer to obtain a pruned model comprises:
    determining the number of pruning filters in each convolutional layer according to the preset pruning rate;
    pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model.
  20. The storage medium according to claim 19, wherein the pruning the model to be pruned according to the number of pruning filters and the pruning importance index corresponding to each convolutional layer to obtain the pruned model comprises:
    determining the pruning filters from the multiple filters of each convolutional layer according to the preset pruning rate and the pruning importance index;
    pruning the pruning filters to obtain the pruned model.
PCT/CN2022/090721 2022-02-22 2022-04-29 Convolutional neural network model pruning method and apparatus, electronic device, and storage medium WO2023159760A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210163245.0 2022-02-22
CN202210163245.0A CN114492799A (zh) 2022-02-22 2022-05-13 Convolutional neural network model pruning method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023159760A1 true WO2023159760A1 (zh) 2023-08-31

Family

ID=81483093

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090721 WO2023159760A1 (zh) 2022-02-22 2022-04-29 Convolutional neural network model pruning method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114492799A (zh)
WO (1) WO2023159760A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992944A (zh) * 2023-09-27 2023-11-03 之江实验室 Image processing method and apparatus based on pruning with a learnable importance evaluation criterion

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115496210B (zh) * 2022-11-21 2023-12-08 深圳开鸿数字产业发展有限公司 Filter pruning method and system for a network model, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190294929A1 (en) * 2018-03-20 2019-09-26 The Regents Of The University Of Michigan Automatic Filter Pruning Technique For Convolutional Neural Networks
CN112561041A (zh) * 2021-02-25 2021-03-26 之江实验室 Neural network model acceleration method and platform based on filter distribution
CN113240085A (zh) * 2021-05-12 2021-08-10 平安科技(深圳)有限公司 Model pruning method, apparatus, device, and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992944A (zh) * 2023-09-27 2023-11-03 之江实验室 Image processing method and apparatus based on pruning with a learnable importance evaluation criterion
CN116992944B (zh) * 2023-09-27 2023-12-19 之江实验室 Image processing method and apparatus based on pruning with a learnable importance evaluation criterion

Also Published As

Publication number Publication date
CN114492799A (zh) 2022-05-13

Similar Documents

Publication Publication Date Title
WO2023134086A1 Convolutional neural network model pruning method and apparatus, electronic device, and storage medium
CN108491817B Event detection model training method and apparatus, and event detection method
Yuan et al. Factorization-based texture segmentation
CN110348572B Neural network model processing method and apparatus, electronic device, and storage medium
WO2023159760A1 (zh) Convolutional neural network model pruning method and apparatus, electronic device, and storage medium
CN105574534A Salient object detection method based on sparse subspace clustering and low-rank representation
CN109840524B Character type recognition method, apparatus, device, and storage medium
CN107943897B User recommendation method
CN112132279B Convolutional neural network model compression method, apparatus, device, and storage medium
CN111831844A Image retrieval method, image retrieval apparatus, image retrieval device, and medium
CN111222548A Similar image detection method, apparatus, device, and storage medium
CN112529068B Multi-view image classification method and system, computer device, and storage medium
CN111914908A Image recognition model training method, image recognition method, and related devices
CN114092474A Machining defect detection method and system for the complex textured background of mobile phone casings
CN113128612B Method for processing outliers in electric power data and terminal device
US20200364259A1 Image retrieval
CN109902720B Image classification and recognition method based on subspace decomposition for deep feature estimation
CN111079930A Method and apparatus for determining data set quality parameters, and electronic device
Elashry et al. Feature matching enhancement using the graph neural network (gnn-ransac)
CN116151323A Model generation method and apparatus, electronic device, and storage medium
CN111860557A Image processing method and apparatus, electronic device, and computer storage medium
CN115908907A Hyperspectral remote sensing image classification method and system
CN114565772A Set feature extraction method and apparatus, electronic device, and storage medium
CN115115920A Data training method and apparatus
Gaikwad et al. Pruning the convolution neural network (SqueezeNet) based on L2 normalization of activation maps

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22928032

Country of ref document: EP

Kind code of ref document: A1