CN112183746A - Neural network pruning method, system and device for sensitivity analysis and reinforcement learning - Google Patents

Neural network pruning method, system and device for sensitivity analysis and reinforcement learning

Info

Publication number
CN112183746A
CN112183746A
Authority
CN
China
Prior art keywords
precision
reinforcement learning
weight
pruning
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011056171.8A
Other languages
Chinese (zh)
Inventor
陈海波
关翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenlan Artificial Intelligence Shenzhen Co Ltd
Original Assignee
Shenlan Artificial Intelligence Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenlan Artificial Intelligence Shenzhen Co Ltd filed Critical Shenlan Artificial Intelligence Shenzhen Co Ltd
Priority to CN202011056171.8A priority Critical patent/CN112183746A/en
Publication of CN112183746A publication Critical patent/CN112183746A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a neural network pruning method, system and device based on sensitivity analysis and reinforcement learning, comprising the following steps: setting sparsity thresholds and selecting the weights with low sensitivity for pruning; obtaining a clipping method and its accuracy by determining, from the sensitivity analysis, which weights to randomly prune, randomly clipping each selected weight, and putting the clipping methods and accuracies of multiple rounds of random clipping into a buffer; training a reinforcement learning agent on the data in the buffer, and putting the clipping method and accuracy produced after training back into the buffer; and repeating these steps until the network accuracy reaches a preset value. The invention selects the low-sensitivity weights for pruning and sets a sparsity threshold for each weight, so that after each selected weight is clipped at its current sparsity the accuracy drop of the network stays within a preset range. While guaranteeing the network accuracy, the compression ratio of the neural network is maximized.

Description

Neural network pruning method, system and device for sensitivity analysis and reinforcement learning
Technical Field
The application relates to the technical field of deep learning compression, in particular to a neural network pruning method based on sensitivity analysis and reinforcement learning.
Background
Pruning is a compression technique for convolutional neural networks (CNNs), used mainly to reduce their computational load. A pruning algorithm typically reduces the computation of the whole network by cutting away unimportant tensors from the network's weights.
Which tensors in the weights are unimportant is determined by their sparsity: the ratio of the number of zero elements in a tensor to its total size. Cutting away the weight tensors with higher sparsity therefore compresses the CNN.
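To make the definition concrete, sparsity can be computed as the fraction of zero elements in a tensor (a minimal sketch in plain Python; the function name is ours, not the patent's):

```python
def sparsity(tensor):
    """Sparsity = (number of zero elements) / (total number of elements)."""
    flat = [x for row in tensor for x in row]
    return flat.count(0.0) / len(flat)

# A 2x4 weight tensor with 4 zeros among its 8 elements:
w = [[0.0, 0.5, 0.0, -1.2],
     [0.3, 0.0, 0.0, 2.1]]
print(sparsity(w))  # 0.5
```

A tensor whose sparsity is high carries little information, so clipping it removes computation at small cost to accuracy.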
The criterion for CNN compression is to preserve the accuracy of the network while reducing its computation. Sensitivity analysis was proposed in document 1 to decide how sparse a weight tensor must be before it can be clipped: the tensors of each weight are clipped independently, and the accuracy of the resulting network is measured on a validation set. In this way the sensitivity of each weight can be analyzed to determine how much of the current weight's tensor may be clipped. However, pruning based on sensitivity analysis treats each weight in isolation and ignores the correlations between different weights, so it cannot reach the best compression efficiency.
Neural network pruning based on reinforcement learning is an automatic pruning technique: it automatically analyzes the sparsity of the network's weights and then makes a reasonable pruning decision. In most cases both the accuracy and the compression ratio of the pruned network are good.
Reinforcement-learning-based pruning proceeds in three steps. In the first step, multiple weights of the neural network are randomly clipped, the clipped network is fine-tuned, its accuracy is recorded, and the clipping method together with the resulting accuracy is put into a data buffer; this step is repeated n times. Once enough data has accumulated in the buffer, the second step trains a reinforcement learning agent on the buffered data. In the third step, the trained agent predicts a concrete clipping action, the network is pruned accordingly and fine-tuned, the post-fine-tuning accuracy is recorded, and the predicted action together with its accuracy is added to the buffer; the process then jumps back to the second step, and the loop stops when the accuracy after fine-tuning reaches the expected value.
The reinforcement-learning approach, in effect, teaches the agent which way of clipping the network yields a high return (network accuracy). This requires that the data used to train the agent be good enough and contain sufficiently complete information. In practice, however, the network accuracy obtained after the random clipping of the first step is sometimes poor, and it is difficult to train an effective agent on such "bad" data, or the training time of the agent increases. Thus, although reinforcement-learning-based pruning does consider the impact of clipping single and multiple weights on network accuracy, when valid data cannot be obtained a good agent cannot be trained, and a bad agent rarely produces a good pruning method.
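The three-step loop can be sketched in miniature. Everything below is a toy stand-in (the environment, agent, and accuracy function are ours, not the patent's): step 1 fills a buffer with random clipping actions and their accuracies; step 2 "trains" the agent, here reduced to remembering the best action seen; step 3 lets the agent act with a little exploration and feeds the result back into the buffer:

```python
import random

def accuracy_after(action):
    """Toy stand-in for clip + fine-tune + evaluate: accuracy peaks when
    the (scalar) sparsity action is near 0.5."""
    return 1.0 - abs(action - 0.5)

def rl_prune(n_random=20, target=0.95, max_iters=50, seed=0):
    rng = random.Random(seed)
    buffer = []
    # Step 1: n random clipping actions seed the data buffer.
    for _ in range(n_random):
        a = rng.random()
        buffer.append((a, accuracy_after(a)))
    # Steps 2 and 3, repeated until the accuracy reaches the target.
    for _ in range(max_iters):
        best_action, best_acc = max(buffer, key=lambda x: x[1])  # "training"
        if best_acc >= target:
            break
        a = min(1.0, max(0.0, best_action + rng.gauss(0, 0.05)))  # explore
        buffer.append((a, accuracy_after(a)))
    return max(buffer, key=lambda x: x[1])

action, acc = rl_prune()
print(round(action, 2), round(acc, 2))
```

The weakness the background describes shows up directly here: if the random phase happens to produce only poor actions, the "agent" starts from a bad region of the action space and needs many more iterations.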
Disclosure of Invention
1. Objects of the invention
The invention provides a neural network pruning method based on sensitivity analysis and reinforcement learning, aiming to solve the problem of low network accuracy caused by training data in reinforcement learning methods that does not contain complete information.
2. The technical scheme adopted by the invention
The invention provides a neural network pruning method based on sensitivity analysis and reinforcement learning, comprising the following steps:
setting sparsity thresholds, and selecting the weights with low sensitivity for pruning;
obtaining a clipping method and its accuracy: determining which weights to randomly prune according to the sensitivity analysis, randomly clipping each selected weight, and putting the clipping methods and accuracies of multiple rounds of random clipping into a buffer;
training reinforcement learning: training a reinforcement learning agent on the data in the buffer, and putting the clipping method and accuracy produced after training back into the buffer; repeating these steps until the network accuracy reaches a preset value.
Preferably, the sparsity-threshold-setting step selects the weights with low sensitivity for pruning, that is: it sets a sparsity threshold for every selected weight such that clipping at the current threshold keeps the accuracy drop of the network within a preset range.
Preferably, in the step of obtaining a clipping method and its accuracy, each selected weight is randomly clipped while its sparsity is kept below the sparsity threshold.
Preferably, the step of training reinforcement learning includes: training a reinforcement learning agent on the data in the buffer; using the trained agent to predict, for the currently selected weights, a corresponding clipping method; clipping each network weight with the generated method; fine-tuning the clipped network multiple times; recording the final network accuracy; and putting the trained clipping method and its accuracy into the buffer.
Preferably, the accuracy drop of the network is kept within 20%.
The invention further provides a neural network pruning system based on sensitivity analysis and reinforcement learning, comprising:
a sparsity-threshold-setting module, for selecting the weights with low sensitivity for pruning;
a clipping-method-and-accuracy acquisition module, for determining which weights to randomly prune according to the sensitivity analysis, randomly clipping each selected weight, and putting the clipping methods and accuracies of multiple rounds of random clipping into a buffer;
a reinforcement-learning training module, for training a reinforcement learning agent on the data in the buffer and putting the clipping method and accuracy produced after training back into the buffer; the steps are repeated until the network accuracy reaches a preset value.
Preferably, the sparsity-threshold-setting module selects the weights with low sensitivity for pruning, that is: it sets a sparsity threshold for every selected weight such that clipping at the current threshold keeps the accuracy drop of the network within a preset range.
Preferably, the clipping-method-and-accuracy acquisition module randomly clips each selected weight while keeping its sparsity below the sparsity threshold.
Preferably, the reinforcement-learning training module trains a reinforcement learning agent on the data in the buffer, uses the trained agent to predict a corresponding clipping method for the currently selected weights, clips each network weight with the generated method, fine-tunes the clipped network multiple times, records the final network accuracy, and puts the trained clipping method and its accuracy into the buffer.
Preferably, the accuracy drop of the network is kept within 20%.
The invention provides a neural network pruning device for sensitivity analysis and reinforcement learning, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the method when executing the computer program.
The invention proposes a computer-readable storage medium on which a computer program is stored which, when being executed by a processor, carries out the method steps.
3. Advantageous effects of the present invention
(1) According to the invention, the low-sensitivity weights are selected for pruning, and a sparsity threshold is set for each weight so that, after each selected weight is clipped at its current sparsity, the accuracy drop of the network stays within a preset range; high network accuracy is thus maintained throughout the pruning process.
(2) The invention adopts structure pruning rather than element pruning, which maximizes the compression ratio of the neural network while ensuring high accuracy. Specifically:
The compression ratio can be measured in two ways. One is the ratio based on the number of model parameters,
CR_para = Para_pruned / Para_unpruned,
where Para_pruned is the number of model parameters after pruning and Para_unpruned is the number of parameters of the original model. The other is the ratio based on the computation of the model,
CR_MAC = MAC_pruned / MAC_unpruned,
where MAC_pruned is the number of multiply-accumulate operations of the pruned model and MAC_unpruned is that of the original model. Most of the methods in the prior art use element pruning; although element pruning can greatly reduce CR_para, it requires the support of a specific chip architecture, and developing a specific chip is a long process. The invention instead uses structure pruning, which can be conveniently accelerated under existing hardware (such as ARM NEON, x86 SSE2, GPUs, and the like).
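Both compression-ratio measures can be computed directly. The sketch below (toy layer sizes and helper names of our own choosing) also shows why structure (channel) pruning reduces parameters and MACs together: removing output channels of one convolution also shrinks the input channels of the next:

```python
def conv_params(cin, cout, k):
    """Parameter count of a k x k convolution (bias ignored)."""
    return cin * cout * k * k

def conv_macs(cin, cout, k, h, w):
    """Multiply-accumulate count of that convolution over an h x w output map."""
    return cin * cout * k * k * h * w

# Two stacked 3x3 convolutions on 32x32 feature maps. Pruning layer 1
# from 64 to 32 output channels also halves layer 2's input channels.
orig = [(3, 64), (64, 64)]      # (cin, cout) per layer
pruned = [(3, 32), (32, 64)]

def para(layers):
    return sum(conv_params(ci, co, 3) for ci, co in layers)

def macs(layers):
    return sum(conv_macs(ci, co, 3, 32, 32) for ci, co in layers)

cr_para = para(pruned) / para(orig)   # CR_para = Para_pruned / Para_unpruned
cr_mac = macs(pruned) / macs(orig)    # CR_MAC  = MAC_pruned / MAC_unpruned
print(cr_para, cr_mac)  # 0.5 0.5
```

Element pruning could lower CR_para just as far on paper, but its zeros stay scattered inside dense tensors, so the MAC count only falls on hardware that can skip them; channel pruning shrinks the dense shapes themselves.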
(3) In the prior art, pruning is driven by model size: the size of the neural network model is easy to control, but the computation of the network, measured in MACs (multiply-accumulate counts), is hard to control well, and the pruned network must be deployed on specific hardware, lacking generality.
In conclusion, the method adopted by the invention prunes the model's computation and its storage footprint at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the steps of obtaining a clipping approach and precision according to the present invention;
FIG. 3 illustrates the training reinforcement learning procedure of the present invention.
Detailed Description
The technical solutions in the examples of the present invention are clearly and completely described below with reference to the drawings in the examples of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without inventive step, are within the scope of the present invention.
The present invention will be described in further detail with reference to the accompanying drawings.
Example 1
The neural network is first analyzed by sensitivity analysis, so that the initial data placed randomly into the reinforcement-learning data buffer falls within a reasonable range.
Sensitivity analysis determines which weights are too sensitive to be pruned and which weights are insensitive enough to be pruned.
The neural network pruning method for sensitivity analysis and reinforcement learning, as shown in fig. 1, includes:
s100, selecting weights with low sensitivity for pruning, setting the selected weights as W (W0, W1, W2.. wn), and then setting sparsity threshold values T (T0, T1, T2.. tn) of the weights. The thresholds are selected to ensure that the accuracy of the network degradation is kept within 20% after the clipped weight is clipped by the current sparsity.
S200, determine the weights that need random pruning according to the W obtained in step S100. As shown in fig. 2, this includes:
S201, randomly clip each selected weight wi while keeping its sparsity below ti. Perform m rounds of experiments; in each round fine-tune the pruned network p times and record the post-fine-tuning accuracy. Put the clipping methods and accuracies of the m rounds into a buffer B.
S300, train an agent with the data in buffer B. As shown in fig. 3, this includes:
S301, use the trained agent to predict, for the currently selected weights, a corresponding clipping method;
S302, clip each network weight with the generated clipping method;
S303, fine-tune the clipped network p times, record the final network accuracy, and put the clipping method and accuracy into buffer B. Repeat step S300 until the network accuracy meets the requirement.
The invention further provides a neural network pruning system based on sensitivity analysis and reinforcement learning, comprising: a sparsity-threshold-setting module, a clipping-method-and-accuracy acquisition module, and a reinforcement-learning training module.
The sparsity-threshold-setting module selects the weights with low sensitivity for pruning; that is, it sets a sparsity threshold for every selected weight such that clipping at the current threshold keeps the accuracy drop of the network within a preset range.
The clipping-method-and-accuracy acquisition module determines which weights to randomly prune according to the sensitivity analysis, randomly clips each selected weight while keeping its sparsity below the sparsity threshold, and puts the clipping methods and accuracies of multiple rounds of random clipping into the buffer.
The reinforcement-learning training module trains a reinforcement learning agent on the data in the buffer, uses the trained agent to predict a corresponding clipping method for the currently selected weights, clips each network weight with the generated method, fine-tunes the clipped network multiple times, records the final network accuracy, and puts the trained clipping method and accuracy into the buffer. The steps are repeated until the network accuracy reaches a preset value.
The machine-readable storage medium is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules (the illustrated obtaining module, the first determining module, the second determining module, and the object control module) corresponding to the virtual reality object control method in the embodiment of the present application. The processor detects the software program, the instructions and the modules stored in the machine-readable storage medium, so as to execute various functional applications and data processing of the terminal device, that is, to implement the above virtual reality object control method, which is not described herein again.
The machine-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the machine-readable storage medium may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The non-volatile memory may be a Read-only memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. Volatile Memory can be Random Access Memory (RAM), which acts as external cache Memory. By way of example, but not limitation, many forms of RAM are available, such as Static random access memory (Static RAM, SRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic random access memory (Synchronous DRAM, SDRAM), Double data rate Synchronous Dynamic random access memory (DDR SDRAM), Enhanced Synchronous SDRAM (ESDRAM), Synchronous link SDRAM (SLDRAM), and direct memory bus RAM (DR RAM). It should be noted that the memories of the systems and methods described herein are intended to comprise, without being limited to, these and any other suitable memory of a publishing node. In some examples, the machine-readable storage medium may further include memory located remotely from the processor, which may be connected to the virtual reality device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website site, computer, virtual reality device, or data center to another website site, computer, virtual reality device, or data center by wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a virtual reality device, a data center, etc., that incorporates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A neural network pruning method for sensitivity analysis and reinforcement learning is characterized by comprising the following steps:
setting a sparsity threshold, and selecting a weight with low sensitivity for pruning;
obtaining a cutting method and precision, and determining the weight of random pruning according to the sensitivity weight; randomly cutting each selected weight, and putting a pruning method and precision of multiple times of random cutting into a buffer area;
training reinforcement learning, namely training a reinforcement learning agent by using data in a buffer area, and putting a cutting method and precision generated after training into the buffer area; and repeating the steps until the network precision reaches a preset value.
2. The neural network pruning method for sensitivity analysis and reinforcement learning according to claim 1, wherein the sparsity threshold setting step selects the weights with low sensitivity for pruning, namely: setting a sparsity threshold for every selected weight such that clipping at the current sparsity threshold keeps the accuracy drop of the network within a preset range.
3. The neural network pruning method for sensitivity analysis and reinforcement learning according to claim 2, wherein the obtaining a clipping method and precision step performs random clipping on each selected weight to ensure that the sparsity is less than a sparsity threshold.
4. The neural network pruning method for sensitivity analysis and reinforcement learning of claim 3, wherein the step of training reinforcement learning comprises: training a reinforcement learning agent using data in a buffer; using the agent generated after training to predict, for the currently selected weights, a corresponding clipping method; clipping each network weight using the generated clipping method; fine-tuning the clipped network a plurality of times; recording the final network precision; and putting the trained clipping method and precision into the buffer.
5. The neural network pruning method for sensitivity analysis and reinforcement learning of claim 1, wherein the accuracy drop of the network is kept within 20%.
6. A neural network pruning system for sensitivity analysis and reinforcement learning, characterized by comprising:
a sparsity threshold setting module for selecting a weight with low sensitivity to prune;
a cutting method and precision module are obtained and used for determining the weight of random pruning according to the sensitivity weight; randomly cutting each selected weight, and putting a pruning method and precision of multiple times of random cutting into a buffer area;
the training reinforcement learning module is used for training a reinforcement learning agent by using data in the buffer area, and a cutting method and precision generated after training are put into the buffer area; and repeating the steps until the network precision reaches a preset value.
7. The neural network pruning system for sensitivity analysis and reinforcement learning according to claim 6, wherein the sparsity threshold module is configured to select the weights with low sensitivity for pruning, namely: to set a sparsity threshold for every selected weight such that clipping at the current sparsity threshold keeps the accuracy drop of the network within a preset range.
8. The neural network pruning system for sensitivity analysis and reinforcement learning of claim 7, wherein the acquisition clipping approach and precision module is configured to perform random clipping on each selected weight to ensure that the sparsity is less than a sparsity threshold.
9. The neural network pruning system for sensitivity analysis and reinforcement learning of claim 8, wherein the reinforcement learning training module is configured to: train a reinforcement learning agent using the data in the buffer; use the trained agent to predict on the currently selected weights and determine a corresponding clipping method; clip each network weight with the generated clipping method; fine-tune the clipped network several times and record the final network precision; and put the resulting clipping method and precision into the buffer.
10. The neural network pruning system for sensitivity analysis and reinforcement learning according to claim 6, wherein the drop in network accuracy is kept within 20%.
11. A neural network pruning device for sensitivity analysis and reinforcement learning, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the method steps of any one of claims 1 to 5.
12. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 5.
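Claims 2 and 7 select low-sensitivity weights by finding a per-weight sparsity threshold that keeps the accuracy drop within a preset range (claim 5's 20%). The sketch below is one hedged reading of that search; the claims do not fix the pruning criterion or the evaluation procedure, so magnitude-based pruning and the `evaluate` callback are illustrative assumptions:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights (assumed criterion)."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def sparsity_threshold(weights, evaluate, baseline_acc, max_drop=0.20, step=0.2):
    """Largest sparsity whose accuracy drop stays within `max_drop`."""
    best, s = 0.0, step
    while s < 1.0:
        acc = evaluate(magnitude_prune(weights, s))
        if baseline_acc - acc > max_drop:
            break  # this weight is too sensitive beyond the previous sparsity
        best, s = s, s + step
    return best
```

A weight tensor that tolerates little pruning before `evaluate` degrades gets a low threshold and is deprioritized, which is how "weights with low sensitivity" end up selected.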
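Claims 3 and 8 randomly clip each selected weight while keeping its sparsity below the threshold from the sensitivity analysis. A minimal sketch of one such random clipping, assuming a uniform sparsity draw (the claims do not specify the sampling distribution):

```python
import random

def random_clip(weights, max_sparsity, rng=None):
    """Prune a random subset of weights at a random sparsity below `max_sparsity`."""
    rng = rng or random.Random()
    sparsity = rng.uniform(0.0, max_sparsity)  # stays under the per-weight threshold
    k = int(len(weights) * sparsity)
    drop = set(rng.sample(range(len(weights)), k))
    pruned = [0.0 if i in drop else w for i, w in enumerate(weights)]
    return pruned, sparsity
```

Repeating this for each selected weight tensor and recording the resulting (clipping method, precision) pairs fills the buffer that later seeds the reinforcement learning agent.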
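Claims 4 and 9 iterate: train an agent on the buffer, let it propose a clipping method for the selected weights, clip and fine-tune, record the precision, and push the result back until a preset precision is reached. The sketch below substitutes a trivial perturb-the-best search for a real reinforcement learning agent, since the claims do not fix the algorithm; `evaluate` stands in for the clip, fine-tune, and measure-precision steps:

```python
import random

def prune_search(evaluate, n_weights, target_acc, warmup=20, steps=50, seed=0):
    """Buffer-driven search for per-weight sparsities in [0, 1]."""
    rng = random.Random(seed)
    buffer = []  # (clipping method, precision) pairs, as in claim 3
    for _ in range(warmup):  # random clipping phase seeds the buffer
        action = [rng.uniform(0.0, 1.0) for _ in range(n_weights)]
        buffer.append((action, evaluate(action)))
    for _ in range(steps):  # stand-in for the trained agent's predictions
        best_action, best_acc = max(buffer, key=lambda p: p[1])
        if best_acc >= target_acc:  # precision reached the preset value
            return best_action, best_acc
        action = [min(1.0, max(0.0, s + rng.gauss(0.0, 0.05))) for s in best_action]
        buffer.append((action, evaluate(action)))
    return max(buffer, key=lambda p: p[1])
```

In the patent's setting the proposal step would come from an agent trained on the buffer rather than from Gaussian perturbation, but the outer loop structure is the same.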
CN202011056171.8A 2020-09-30 2020-09-30 Neural network pruning method, system and device for sensitivity analysis and reinforcement learning Pending CN112183746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011056171.8A CN112183746A (en) 2020-09-30 2020-09-30 Neural network pruning method, system and device for sensitivity analysis and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011056171.8A CN112183746A (en) 2020-09-30 2020-09-30 Neural network pruning method, system and device for sensitivity analysis and reinforcement learning

Publications (1)

Publication Number Publication Date
CN112183746A true CN112183746A (en) 2021-01-05

Family

ID=73947039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011056171.8A Pending CN112183746A (en) 2020-09-30 2020-09-30 Neural network pruning method, system and device for sensitivity analysis and reinforcement learning

Country Status (1)

Country Link
CN (1) CN112183746A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128664A (en) * 2021-03-16 2021-07-16 Guangdong Electric Power Information Technology Co., Ltd. Neural network compression method, device, electronic equipment and storage medium
CN113011588A (en) * 2021-04-21 2021-06-22 Huaqiao University Pruning method, device, equipment and medium for convolutional neural network
CN113011588B (en) * 2021-04-21 2023-05-30 Huaqiao University Pruning method, device, equipment and medium of convolutional neural network
WO2023082278A1 (en) * 2021-11-15 2023-05-19 Intel Corporation Apparatus and method for reinforcement learning based post-training sparsification
CN114936078A (en) * 2022-05-20 2022-08-23 Tianjin University Micro-grid group edge scheduling and intelligent body lightweight cutting method

Similar Documents

Publication Publication Date Title
CN112183746A (en) Neural network pruning method, system and device for sensitivity analysis and reinforcement learning
CN108089814B (en) Data storage method and device
US11741339B2 (en) Deep neural network-based method and device for quantifying activation amount
CN110764715B (en) Bandwidth control method, device and storage medium
US10142252B2 (en) Server intelligence for network speed testing control
JP2004126595A5 (en)
US10416907B2 (en) Storage system, storage control apparatus, and method of controlling a storage device
CN110799959A (en) Data compression method, decompression method and related equipment
CN110061930B (en) Method and device for determining data flow limitation and flow limiting values
CN112861996A (en) Deep neural network model compression method and device, electronic equipment and storage medium
CN110851333B (en) Root partition monitoring method and device and monitoring server
CN107783990B (en) Data compression method and terminal
CN112764681B (en) Cache elimination method and device with weight judgment and computer equipment
CN109343792B (en) Storage space configuration method and device, computer equipment and storage medium
CN108831504A (en) Determination method, apparatus, computer equipment and the storage medium of pitch period
CN107305531B (en) Method and device for determining limit value of cache capacity and computing equipment
CN114281808A (en) Traffic big data cleaning method, device, equipment and readable storage medium
CN111598233A (en) Compression method, device and equipment of deep learning model
CN109982110B (en) Method and device for video playing
CN111352825B (en) Data interface testing method and device and server
US20140362895A1 (en) Method, program product, and test device for testing bit error rate of network module
CN109962857B (en) Flow control method, flow control device and computer readable storage medium
CN114816232B (en) Method and device for efficiently accessing geological disaster big data
CN114138179B (en) Method and device for dynamically adjusting write cache space
CN116506116B (en) Encryption control method combining soft and hard

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination