CN110705708A - Compression method and device of convolutional neural network model and computer storage medium - Google Patents

Compression method and device of convolutional neural network model and computer storage medium

Info

Publication number
CN110705708A
Authority
CN
China
Prior art keywords
pruning
convolutional
neural network
network model
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910959688.9A
Other languages
Chinese (zh)
Inventor
付宇卓
刘婷
钱刘宸
吉学刚
曹德明
申子正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Zhongtong Bus Holding Co Ltd
Original Assignee
Shanghai Jiaotong University
Zhongtong Bus Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, Zhongtong Bus Holding Co Ltd
Priority to CN201910959688.9A
Publication of CN110705708A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Computing arrangements based on biological models using neural network models
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
    • G06N 3/045

Abstract

The invention discloses a compression method and device for a convolutional neural network model, and a computer storage medium. The compression method comprises the following steps: pruning each convolutional layer of a first convolutional neural network model separately to determine the pruning step length of each convolutional layer; and then pruning the first convolutional neural network model layer by layer according to the pruning step length of each convolutional layer to obtain a compressed second convolutional neural network model. The invention preserves the compression effect of the model while reducing compression time and improving compression efficiency, and therefore has clear practical value.

Description

Compression method and device of convolutional neural network model and computer storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for compressing a convolutional neural network model and a computer storage medium.
Background
With the development of deep learning techniques, many machine vision algorithms have begun to use convolutional neural networks. Because of the strong feature extraction capability of convolutional layers, these networks use convolutional layers heavily; however, convolutional layers usually contain a large number of parameters and require a large amount of computation, which makes the models increasingly large. For this reason, convolutional neural networks are usually run on desktop computers or servers in the prior art, and their huge model size and computation make them difficult to deploy on embedded systems with limited resources. At the same time, the demand for deploying machine vision algorithms on embedded systems is becoming more pressing.
In recent years, model compression techniques have emerged to facilitate the deployment of machine vision algorithms on embedded platforms. The goal of model compression is to reduce the number of parameters and the amount of computation of a model; common techniques fall into three categories: pruning, quantization and weight decomposition. Pruning removes unimportant weights, selected by some discrimination method, to achieve compression. Quantization converts the weights into a data type that occupies fewer bits, typically 2, 3 or 8 bits. Weight decomposition compresses the model by decomposing the weight matrices; common decomposition methods include CP decomposition and SVD decomposition.
However, each of these compression methods has its own problems. Pruning methods do not consider the differences between layers and usually select the layer with the largest computation or the largest number of parameters for pruning, so pruning efficiency is low. Quantization methods are limited in effect: once the network or the dataset is changed, model performance fluctuates widely, so their generality is poor. Weight decomposition methods decompose the weights, which makes it difficult to apply other compression methods on top of them, so their extensibility is poor.
It is noted that the information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The technical problem to be solved by the present invention is to overcome the defects of the prior art. A first object of the present invention is to provide a compression method for a convolutional neural network model that greatly improves pruning efficiency and pruning effect through a dynamic pruning method with adaptive step length; a second object is to provide a compression apparatus for a convolutional neural network model; and a third object is to provide a computer storage medium.
In order to achieve the first object, the invention provides a method for compressing a convolutional neural network model, which comprises the following steps:
S100: pruning each convolutional layer of the first convolutional neural network model separately, and determining the pruning step length of each convolutional layer;
S200: pruning the first convolutional neural network model layer by layer according to the pruning step length of each convolutional layer to obtain a compressed second convolutional neural network model.
Optionally, the method for determining the pruning step length of each convolutional layer in step S100 includes setting an initial pruning step length and a pruning order of the convolution kernels, and then performing the following steps:
S110: carrying out a pruning attempt on the convolutional layer to obtain a pruned intermediate convolutional neural network model;
S120: fine-tuning the intermediate convolutional neural network model, and running the fine-tuned intermediate convolutional neural network model on a verification set to obtain the mAP after the attempted pruning;
S130: if the mAP after the attempted pruning is lower than a first preset mAP threshold, reducing the initial pruning step length to a first pruning step length; if the mAP after the attempted pruning is higher than the first preset mAP threshold, increasing the initial pruning step length to a second pruning step length; otherwise, determining the initial pruning step length as the pruning step length of the convolutional layer;
S140: if the changed pruning step length satisfies the preset change threshold, executing step S110 to continue pruning the convolutional layer; otherwise, taking the initial pruning step length as the pruning step length of the convolutional layer.
Optionally, the method for setting the initial pruning step length includes:
setting an initial pruning proportion;
and obtaining the initial pruning step length of the convolutional layer from the initial pruning proportion and the number of convolution kernels.
Optionally, the method for setting the pruning order of the convolution kernels includes:
calculating the sum of the absolute values of the weights contained in each convolution kernel;
and setting the order of the convolution kernels, from the smallest sum of absolute weights to the largest, as the pruning order of the convolution kernels.
Optionally, before step S200, the method further includes sorting the convolutional layers and taking the order of the convolutional layers from high sensitivity to low sensitivity as the pruning order of the convolutional layers.
Optionally, pruning the first convolutional neural network model layer by layer in step S200 includes pruning the convolutional layers one by one in the pruning order of the convolutional layers, each with its own pruning step length, and judging layer by layer whether pruning is completed; the next convolutional layer is pruned after the previous convolutional layer has been pruned, until all convolutional layers are pruned, so as to obtain the compressed second convolutional neural network model.
Optionally, the method for judging whether pruning of a convolutional layer is completed in step S200 includes the following steps:
step S210: fine-tuning the pruned first convolutional neural network model, and running the fine-tuned first convolutional neural network model on a verification set to obtain the mAP after pruning;
step S220: if the AP of any category is 0, pruning of the convolutional layer is completed; otherwise, executing step S230;
step S230: judging whether the mAP after pruning is lower than a second preset mAP threshold; if so, pruning of the convolutional layer is completed; otherwise, executing step S240;
step S240: if the pruning step length of the convolutional layer is smaller than the number of remaining convolution kernels, continuing to prune the convolutional layer with its pruning step length; otherwise, correcting the pruning step length of the convolutional layer to a third pruning step length and continuing to prune the convolutional layer.
Optionally, before step S100, the method further includes training to obtain the first convolutional neural network model based on a target detection algorithm.
To achieve the second object of the present invention, the present invention also provides a compression apparatus for a convolutional neural network model, comprising a processor adapted to execute instructions, and a storage device adapted to store a plurality of instructions, the instructions being adapted to be loaded and executed by the processor to perform the steps of any of the compression methods of a convolutional neural network model described above.
To achieve the third object of the present invention, the present invention also provides a computer storage medium having computer executable instructions stored thereon, which when executed implement the method of compressing a convolutional neural network model as described in any one of the above.
Compared with the prior art, the compression method of the convolutional neural network model provided by the invention has the following beneficial effects:
(1) Insensitive layers are pruned heavily while sensitive layers are pruned only slightly, which effectively improves the performance of the pruned model.
(2) The pruning step length is determined dynamically, which helps improve pruning speed.
(3) Compared with traditional pruning methods, the method dynamically determines the pruning step length according to the actual situation of each layer, greatly improving pruning efficiency and pruning effect.
The compression device of the convolutional neural network model and the computer storage medium have the same inventive concept, so the same beneficial effects are achieved.
Drawings
FIG. 1 is a flowchart illustrating an overall method for compressing a convolutional neural network model according to an embodiment of the present invention;
FIG. 2 is a flowchart of determining the pruning step length according to layer sensitivity in the compression method of the convolutional neural network model according to an embodiment of the present invention;
FIG. 3 is a flowchart of layer-by-layer pruning in the compression method of the convolutional neural network model according to an embodiment of the present invention.
Detailed Description
To make the objects, advantages and features of the present invention more apparent, the present invention is further described below through specific embodiments in conjunction with the accompanying drawings. It is to be understood that the described embodiments are merely some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first" and "second" in the description, the claims and the drawings of the present invention are used for descriptive purposes only; they do not describe a particular order or sequence and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
The core idea of the invention is to provide a compression method for a convolutional neural network model that dynamically determines the pruning step length of each layer according to its sensitivity, prunes the convolutional layers layer by layer according to the sensitivity order and the determined pruning step lengths, and starts pruning the next convolutional layer only after pruning of the current convolutional layer is completed. Insensitive layers are pruned heavily while sensitive layers are pruned only slightly; within a convolutional layer, the convolution kernels are pruned in order of increasing importance.
Specifically, referring to FIG. 1, the compression method of the convolutional neural network model provided in this embodiment includes the following steps:
S100: pruning each convolutional layer of the first convolutional neural network model separately, and determining the pruning step length of each convolutional layer;
S200: pruning the first convolutional neural network model layer by layer according to the pruning step length of each convolutional layer to obtain a compressed second convolutional neural network model.
In other embodiments, before step S100 is performed, the first convolutional neural network model may be obtained by training based on a target detection algorithm. The present invention limits neither the target detection algorithm used to obtain the first convolutional neural network model nor the application scenario of that model; that is, the compression method described above can be applied to any field that uses convolutional neural network models, including but not limited to image recognition, speech recognition and document analysis.
Specifically, the method for determining the pruning step length of each convolutional layer in step S100 includes setting an initial pruning step length and the pruning order of the convolution kernels, and then performing the steps shown in FIG. 2. For ease of understanding, the setting of the initial pruning step length and of the pruning order of the convolution kernels is described first; the method for determining the pruning step length is then described in detail with reference to FIG. 2.
In one embodiment, the method for setting the initial pruning step length includes the following steps:
The first step: set an initial pruning proportion. Specifically, the initial pruning proportion is 3% to 10% of the number of convolution kernels in the convolutional layer concerned; in this embodiment, it is 5% of the number of convolution kernels in that layer.
The second step: obtain the initial pruning step length of the convolutional layer from the initial pruning proportion and the number of convolution kernels.
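As an illustrative, non-limiting example, the initial pruning step length of this embodiment can be derived as in the following Python sketch; the function name and the rounding to at least one kernel are assumptions made for illustration.

```python
def initial_pruning_step_length(num_kernels: int, proportion: float = 0.05) -> int:
    """Initial pruning step length = initial pruning proportion x number of kernels."""
    # Round to at least one kernel so that a pruning attempt is always possible.
    return max(1, round(proportion * num_kernels))

# For a convolutional layer with 64 kernels and the 5% proportion of this
# embodiment, the initial pruning step length is 3 kernels per attempt.
```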
As a preferred embodiment, the method for setting the pruning order of the convolution kernels includes the following steps:
First, the sum of the absolute values of the weights contained in each convolution kernel is calculated.
Then, the order of the convolution kernels, from the smallest sum of absolute weights to the largest, is set as the pruning order of the convolution kernels. That is, the importance of each convolution kernel is ranked by the sum of the absolute values of its weights: the larger the sum, the more important the kernel, and the less important a kernel is, the more likely it is to be pruned.
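As an illustrative, non-limiting example, the kernel pruning order of this embodiment can be computed as in the following Python sketch; the array layout and function names are assumptions made for illustration.

```python
import numpy as np

def kernel_pruning_order(weights: np.ndarray) -> np.ndarray:
    """Return kernel indices ordered from least to most important.

    Assumes `weights` is shaped [num_kernels, in_channels, kH, kW],
    i.e. one slice per convolution kernel of the layer.
    """
    # Importance of each kernel: sum of the absolute values of its weights.
    importance = np.abs(weights).sum(axis=(1, 2, 3))
    # Ascending order: kernels with the smallest sums are pruned first.
    return np.argsort(importance)
```

For example, `kernel_pruning_order(w)[:step_length]` would give the kernels removed in one pruning step of length `step_length`.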
With reference to FIG. 2, the method for determining the pruning step length is described as follows and includes the following steps:
S110: a pruning attempt is performed on the convolutional layer to obtain a pruned intermediate convolutional neural network model.
S120: the intermediate convolutional neural network model is fine-tuned, and the fine-tuned intermediate convolutional neural network model is run on a verification set to obtain the mAP after the attempted pruning. Compared with training from scratch, fine-tuning saves a large amount of computing resources and computing time, improves computing efficiency, may even improve accuracy, and effectively avoids risks such as non-convergence of the model, insufficient parameter optimization, low accuracy and poor generalization ability. mAP (mean Average Precision) is obtained by first computing the Average Precision (AP) of each category and then averaging the APs over all categories; in target detection, mAP is used to judge recognition accuracy, i.e. to measure how reliably objects are detected.
S130: the initial pruning step length is corrected according to the mAP after the attempted pruning. If the mAP after the attempted pruning is lower than a first preset mAP threshold, the initial pruning step length is reduced to a first pruning step length; if the mAP after the attempted pruning is higher than the first preset mAP threshold, the initial pruning step length is increased to a second pruning step length; otherwise, the initial pruning step length is determined as the pruning step length of the convolutional layer. According to actual needs, the first preset mAP threshold may be set to 70% to 90% of the initial mAP; in this embodiment it is 80% of the initial mAP, where the initial mAP is the mAP of the first convolutional neural network model before pruning. The first pruning step length is 30% to 70% of the initial pruning step length, and the second pruning step length is the initial pruning step length plus 3% to 10% of the number of convolution kernels.
Specifically, in this embodiment, when the mAP after the attempted pruning is lower than 80% of the initial mAP, the pruning step length is halved, i.e. the first pruning step length is 50% of the initial pruning step length; when the mAP after the attempted pruning is higher than 80% of the initial mAP, the initial pruning step length is increased by 5% of the number of convolution kernels of the convolutional layer.
S140: if the changed pruning step length satisfies the preset change threshold, step S110 is executed to continue pruning the convolutional layer; otherwise, the initial pruning step length is taken as the pruning step length of the convolutional layer. In this embodiment, pruning of the convolutional layer continues as long as the changed pruning step length is greater than 1 and less than 30% of the number of convolution kernels in the convolutional layer. It can be seen that the pruning step length of each convolutional layer of the convolutional neural network model is determined dynamically, which balances the pruning proportion against the mAP after pruning.
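For illustration only, the step-length search of steps S110 to S140 may be sketched in Python as follows, using the values of this embodiment (80% of the initial mAP as the first preset threshold, halving on a drop, growing by 5% of the kernel count otherwise, and stopping once the step length leaves the range from 1 to 30% of the kernel count). The callbacks `try_prune`, `fine_tune` and `evaluate_map` are hypothetical placeholders for pruning a copy of the model, fine-tuning it, and measuring its mAP on the verification set.

```python
def determine_pruning_step_length(model, layer, num_kernels, initial_map,
                                  try_prune, fine_tune, evaluate_map):
    step = max(1, round(0.05 * num_kernels))           # initial pruning step length (5%)
    threshold = 0.8 * initial_map                      # first preset mAP threshold
    while 1 < step < 0.3 * num_kernels:                # S140: preset change-step-length range
        pruned = try_prune(model, layer, step)         # S110: pruning attempt
        fine_tune(pruned)                              # S120: fine-tune the intermediate model
        map_after = evaluate_map(pruned)               #       mAP on the verification set
        if map_after < threshold:
            step = max(1, step // 2)                   # S130: halve the step length
        elif map_after > threshold:
            step += max(1, round(0.05 * num_kernels))  # S130: grow by 5% of the kernel count
        else:
            break                                      # S130: keep the current step length
    return step
```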
Thus, the pruning step length of each convolutional layer and the pruning order of the convolution kernels within each convolutional layer have been determined dynamically. In one embodiment, step S200 is executed next to prune the first convolutional neural network model layer by layer according to the pruning step length of each convolutional layer, so as to obtain the compressed second convolutional neural network model.
In another embodiment of the present invention, as shown in FIG. 3, before step S200 the method further includes sorting the convolutional layers and taking the order of the convolutional layers from high sensitivity to low sensitivity as their pruning order; that is, the convolutional layers of the first convolutional neural network are arranged in descending order of sensitivity, with the most sensitive layers first.
Specifically, pruning the first convolutional neural network model layer by layer in step S200 includes pruning the convolutional layers one by one in the pruning order of the convolutional layers, each with its own pruning step length, and judging layer by layer whether pruning is completed; the next convolutional layer is pruned only after the previous convolutional layer has been pruned, until all convolutional layers are pruned, so as to obtain the compressed second convolutional neural network model.
In this embodiment, the method for judging whether pruning of a convolutional layer is completed in step S200 includes the following steps:
Step S210: the pruned first convolutional neural network model is fine-tuned, and the fine-tuned first convolutional neural network model is run on a verification set to obtain the mAP after pruning.
Step S220: if the AP of any category is 0, pruning of the convolutional layer is completed; otherwise, step S230 is executed. The purpose of this step is to ensure that every category can still be detected.
Step S230: it is judged whether the mAP after pruning is lower than a second preset mAP threshold; if so, pruning of the convolutional layer is completed; otherwise, step S240 is executed. The second preset mAP threshold is 70% to 90% of the initial mAP; in this embodiment it is 80% of the initial mAP.
Step S240: if the pruning step length of the convolutional layer is smaller than the number of remaining convolution kernels, the convolutional layer continues to be pruned with its pruning step length; otherwise, the pruning step length of the convolutional layer is corrected to a third pruning step length and pruning continues. The third pruning step length is 10% to 50% of the number of convolution kernels remaining in the convolutional layer; in this embodiment it is 25% of the remaining convolution kernels.
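For illustration only, the layer-by-layer pruning of step S200 together with the completion checks S210 to S240 may be sketched in Python as follows, using this embodiment's values (second preset mAP threshold of 80% of the initial mAP, third pruning step length of 25% of the remaining kernels). The callbacks `remove_kernels`, `fine_tune` and `evaluate_per_class_ap` are hypothetical placeholders and not part of the disclosure; mAP is taken as the mean of the per-category APs, as described above.

```python
def prune_layer_by_layer(model, layers_by_sensitivity, step_lengths, kernels_left,
                         initial_map, remove_kernels, fine_tune, evaluate_per_class_ap):
    threshold = 0.8 * initial_map                      # second preset mAP threshold
    for layer in layers_by_sensitivity:                # most sensitive layers pruned first
        step = step_lengths[layer]
        while kernels_left[layer] > 0:
            if step >= kernels_left[layer]:            # S240: correct an oversized step length
                step = max(1, round(0.25 * kernels_left[layer]))
            remove_kernels(model, layer, step)         # prune the least important kernels
            kernels_left[layer] -= step
            fine_tune(model)                           # S210: fine-tune the pruned model
            aps = evaluate_per_class_ap(model)         # per-category APs on the verification set
            if min(aps) == 0:                          # S220: a category can no longer be detected
                break
            if sum(aps) / len(aps) < threshold:        # S230: mAP has dropped below the threshold
                break
    return model                                       # the compressed second model
```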
It should be noted that the above embodiments separate the method for determining the pruning step length of a convolutional layer from the step of actually performing the pruning only for convenience of description; this does not constitute any limitation of the present invention. Clearly, based on the disclosure of the present invention, those skilled in the art can, without creative labor, integrate the method for determining the pruning step length of each convolutional layer into the layer-by-layer pruning operation in other embodiments, and such embodiments fall within the scope of the present invention.
In yet another embodiment of the present invention, there is provided a compression apparatus for a convolutional neural network model, including a processor adapted to execute instructions and a storage device adapted to store a plurality of instructions, the instructions being adapted to be loaded and executed by the processor to perform the steps of the compression method of a convolutional neural network model according to any one of the above embodiments; the specific steps have been described in detail above and are not repeated here.
In another embodiment of the present invention, there is further provided a computer-readable storage medium having computer-executable instructions stored thereon which, when executed, implement the steps of the method for compressing a convolutional neural network model; the specific steps have been described in detail above and are not repeated here.
Through the description of the above embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better embodiment. Based on this understanding, the portions of the present invention that contribute to the prior art can be embodied in the form of software products. The computer software product is stored on a computer-readable storage medium and includes instructions for causing an apparatus, including but not limited to a computer and/or a mobile phone, to perform the method according to the various embodiments of the present invention.
In summary, the present invention discloses a compression method and apparatus for a convolutional neural network model and a computer-readable storage medium. The compression method uses the sum of the absolute values of the weights contained in each convolution kernel to determine the kernel's importance, and dynamically determines the pruning step length according to the actual situation of each convolutional layer. This addresses the huge parameter count and computation of conventional convolutional neural networks, effectively improves pruning speed while preserving model performance, and makes it possible to deploy convolutional neural network models on resource-constrained embedded systems.
The above embodiments merely illustrate the principles and effects of the present invention and do not limit its scope, which is not restricted to the configurations listed in the above embodiments; any changes and modifications made by those skilled in the art on the basis of the above disclosure fall within the scope of the claims.

Claims (10)

1. A compression method of a convolutional neural network model is characterized by comprising the following steps:
S100: pruning each convolutional layer of the first convolutional neural network model separately, and determining the pruning step length of each convolutional layer;
S200: pruning the first convolutional neural network model layer by layer according to the pruning step length of each convolutional layer to obtain a compressed second convolutional neural network model.
2. The method of claim 1, wherein the method for determining the pruning step length of each convolutional layer in step S100 comprises setting an initial pruning step length and a pruning order of the convolution kernels, and then performing the following steps:
S110: carrying out a pruning attempt on the convolutional layer to obtain a pruned intermediate convolutional neural network model;
S120: fine-tuning the intermediate convolutional neural network model, and running the fine-tuned intermediate convolutional neural network model on a verification set to obtain an mAP after the attempted pruning;
S130: if the mAP after the attempted pruning is lower than a first preset mAP threshold, reducing the initial pruning step length to a first pruning step length; if the mAP after the attempted pruning is higher than the first preset mAP threshold, increasing the initial pruning step length to a second pruning step length; otherwise, executing step S150;
S140: if the changed pruning step length satisfies the preset change threshold, executing step S110 to continue pruning the convolutional layer; otherwise, executing step S150;
S150: taking the initial pruning step length as the pruning step length of the convolutional layer.
3. The method of claim 2, wherein the step of setting the initial pruning step length comprises:
setting an initial pruning proportion;
and obtaining the initial pruning step length of the convolutional layer according to the initial pruning proportion and the number of the convolutional kernels.
4. The method of claim 2, wherein the step of setting the pruning order of the convolution kernels comprises:
calculating the sum of absolute values of weights contained in each convolution kernel;
and setting the order of the convolution kernels, from the smallest sum of absolute weights to the largest, as the pruning order of the convolution kernels.
5. The method of claim 1, further comprising, before step S200, sorting each of the convolutional layers, and taking the order of the sensitivity of the convolutional layer from high to low as a pruning order of the convolutional layer.
6. The method according to claim 5, wherein the step S200 of pruning the first convolutional neural network model layer by layer comprises pruning the convolutional layers one by one in the pruning order of the convolutional layers, each with its own pruning step length, and judging layer by layer whether pruning is completed, wherein the next convolutional layer is pruned after the previous convolutional layer has been pruned, until all convolutional layers are pruned, to obtain the compressed second convolutional neural network model.
7. The method of claim 6, wherein the step S200 of determining whether the convolutional layer has been pruned comprises the following steps,
step S210: fine-tuning the first convolutional neural network model after pruning, and operating the first convolutional neural network model after fine tuning on a verification set to obtain the mAP after pruning;
step S220: if the AP of any category is 0, pruning of the convolutional layer is completed; otherwise, executing step S230;
step S230: judging whether the mAP after pruning is lower than a second preset mAP threshold value or not, and if so, completing pruning of the convolutional layer; otherwise, go to step S240;
step S240: if the pruning step length of the convolutional layer is smaller than the number of remaining convolution kernels, continuing to prune the convolutional layer according to its pruning step length; otherwise, correcting the pruning step length of the convolutional layer to a third pruning step length, and continuing to prune the convolutional layer.
8. The method according to any one of claims 1 to 7, further comprising training the first convolutional neural network model based on a target detection algorithm before step S100.
9. An apparatus for compressing a convolutional neural network model, comprising a processor adapted to execute instructions, and a storage device adapted to store a plurality of instructions, the instructions being adapted to be loaded and executed by the processor to perform the steps of the compression method of a convolutional neural network model as claimed in any one of claims 1 to 8.
10. A computer storage medium having computer-executable instructions stored thereon that, when executed, implement a method of compressing a convolutional neural network model as defined in any one of claims 1-8.

Priority Applications (1)

Application Number: CN201910959688.9A
Priority Date / Filing Date: 2019-10-10
Title: Compression method and device of convolutional neural network model and computer storage medium

Publications (1)

Publication Number: CN110705708A
Publication Date: 2020-01-17

Family

ID=69200148

Family Applications (1)

Application Number: CN201910959688.9A
Status: Pending
Priority Date / Filing Date: 2019-10-10
Title: Compression method and device of convolutional neural network model and computer storage medium

Country Status (1)

Country Link
CN (1) CN110705708A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100628A (en) * 2020-11-16 2020-12-18 支付宝(杭州)信息技术有限公司 Method and device for protecting safety of neural network model
WO2021208151A1 (en) * 2020-04-13 2021-10-21 商汤集团有限公司 Model compression method, image processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200117