CN109886397A - A structured pruning compression optimization method for neural network convolutional layers - Google Patents

A structured pruning compression optimization method for neural network convolutional layers

Info

Publication number
CN109886397A
Authority
CN
China
Prior art keywords
convolutional layer
pruning
value
sparse
sparse value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910218652.5A
Other languages
Chinese (zh)
Inventor
梅魁志
张良
张增
薛建儒
鄢健宇
常藩
张向楠
王晓
陶纪安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN201910218652.5A
Publication of CN109886397A
Legal status: Pending (Current)


Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses a structured pruning compression optimization method for the convolutional layers of a neural network, comprising: (1) sparsity allocation for each convolutional layer: (1.1) training the original model to obtain the weight parameters of each prunable convolutional layer and computing an importance score for each convolutional layer; (1.2) sorting the layers by importance score in ascending order, partitioning the score range into equal-width intervals between the minimum and maximum scores, assigning sparsity values from small to large to the convolutional layers in each interval in turn, and adjusting through model retraining to obtain the sparsity configuration of all prunable convolutional layers; (2) structured pruning: selecting a convolution filter according to the sparsity value determined in step (1.2) and performing structured pruning training, wherein each convolutional layer uses only one type of convolution filter. The optimization method of the invention allows a deep neural network to run more easily on a resource-constrained platform, saving parameter storage space while accelerating model inference.

Description

A structured pruning compression optimization method for neural network convolutional layers
Technical field
The invention belongs to the fields of artificial intelligence, deep neural network optimization and image recognition, and in particular relates to a structured pruning compression optimization method for neural network convolutional layers.
Background art
In the field of artificial intelligence, the deep neural network is one of its cornerstones, and its complexity and portability directly affect the application of artificial intelligence in everyday life. Research on the acceleration and compression optimization of deep networks allows artificial intelligence to be realized more conveniently and to serve daily life more easily.
Currently, there are several common methods for accelerating and compressing deep networks: 1. Low-Rank: low-rank decomposition; 2. Pruning, which is further divided into structured pruning, kernel pruning and gradient pruning, and has a wide range of application; 3. Quantization, which is further divided into low-bit quantization, quantization for accelerating overall training, and gradient quantization for distributed training; 4. Knowledge Distillation; 5. Compact Network Design, which optimizes the model at the level of the network structure.
The present invention mainly makes further improvements on the second type of compression method, pruning. The idea of structured pruning is also used in the prior art, where each convolutional layer uses multiple convolution filters and the types of these convolution filters are obtained through training. The existing methods not only require very long training cycles and enormous computing resources (making it impossible to smoothly use large-scale training datasets), but such structured pruning also cannot save much computation and storage in the forward pass of the model.
In summary, a novel structured pruning compression optimization method for neural networks is needed.
Summary of the invention
The purpose of the present invention is to provide a structured pruning compression optimization method for neural network convolutional layers, so as to solve one or more of the above technical problems. The optimization method of the invention allows a deep neural network to run more easily on a resource-constrained platform, saving parameter storage space while accelerating model inference.
In order to achieve the above objectives, the invention adopts the following technical scheme:
A structured pruning compression optimization method for neural network convolutional layers, comprising:
(1) sparsity allocation for each convolutional layer, comprising:
(1.1) training the original model to obtain the weight parameters of each prunable convolutional layer, and computing an importance score for each convolutional layer;
(1.2) sorting the layers by importance score in ascending order, partitioning the score range into equal-width intervals between the minimum and maximum scores, assigning sparsity values from small to large to the convolutional layers in each interval in turn, and adjusting through model retraining to obtain the sparsity configuration of all prunable convolutional layers;
(2) structured pruning, comprising:
selecting a convolution filter according to the sparsity value determined in step (1.2), and performing structured pruning training;
wherein each convolutional layer uses only one type of convolution filter.
A further improvement of the present invention is that, in step 1, training the original model to obtain the weight parameters of each prunable convolutional layer specifically includes: the weight parameters k_{l,nchw}, where l is the layer index and n, c, h, w are the indices of the 4-D weight tensor of the convolutional layer; n is the input channel index, c is the output channel index, and h, w are respectively the height and width indices of the convolution kernel; N is the total number of input channels, C is the total number of output channels, and H, W are respectively the total height and width of the convolution kernel; n, c, h, w are positive integers, with n ∈ [1, N], c ∈ [1, C], h ∈ [1, H], w ∈ [1, W].
A further improvement of the present invention is that, in step 1, the importance score of each convolutional layer is calculated as follows:
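(The expression itself appears only as an image in the original publication. A plausible reconstruction, given that M_l is defined below as the mean of the squared kernel weights of layer l, is:

M_l = (1 / (N·C·H·W)) · Σ_{n=1}^{N} Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} (k_{l,nchw})²  )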
In the formula, for a given layer l, M_l denotes the average of the squared values of the convolution kernel operators of that layer; n, c, h, w are the indices of the 4-D weight tensor of the convolutional layer, where n is the input channel index, c is the output channel index, and h, w are respectively the height and width indices of the convolution kernel; N is the total number of input channels, C is the total number of output channels, and H, W are respectively the total height and width of the convolution kernel; n, c, h, w are positive integers, with n ∈ [1, N], c ∈ [1, C], h ∈ [1, H], w ∈ [1, W].
A further improvement of the present invention is that, in step 2, the specific steps of assigning sparsity values from small to large to the convolutional layers in each interval include: for each prunable convolutional layer, the sparsity configuration includes changing its sparsity value and retraining the model; if the model performance remains good, its sparsity value continues to be increased, and if the model performance suffers a large loss, the previous sparsity value is taken as its final sparsity value; the sparsity configuration of the convolutional layers is repeated until the sparsity configuration of the convolutional layers in the last interval is completed, yielding the initial structured-pruning sparsity configuration of all prunable convolutional layers;
wherein the evaluation criterion of the model performance is the accuracy or the mAP value in object detection; if the accuracy or mAP value does not decline, the model performance is considered to remain good, and if it declines beyond a preset threshold, the model performance is considered to suffer a large loss.
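As an illustration only, the following Python sketch outlines the interval-by-interval sparsity search described above; the callback names (retrain, evaluate), the sparsity steps and the drop threshold are hypothetical placeholders rather than part of the patent.

def assign_sparsity(intervals, sparsity_steps, retrain, evaluate, baseline, drop_threshold=0.01):
    """Greedy sparsity assignment: walk the intervals from least to most important,
    increasing each interval's sparsity until the retrained model degrades too much.
    retrain(layers, s) retrains the model with sparsity s on the given layers;
    evaluate() returns the current accuracy or mAP."""
    config = {}
    for layers in intervals:                 # least-important interval first
        chosen = 0.0
        for s in sparsity_steps:             # e.g. [0.2, 0.4, 0.6, 0.8]
            retrain(layers, s)
            if baseline - evaluate() <= drop_threshold:
                chosen = s                   # performance still good: keep increasing
            else:
                break                        # large loss: keep the previous sparsity value
        for layer in layers:
            config[layer] = chosen
    return config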
A further improvement of the present invention is that, after the initial structured-pruning sparsity configuration of all prunable convolutional layers is obtained, the convolutional layers close to the ends of the importance intervals are fine-tuned;
the fine-tuning includes increasing the sparsity value at the end with the smaller value and decreasing the sparsity value at the end with the larger value, performing a retraining operation immediately after each change, to obtain the final sparsity configuration of all prunable convolutional layers.
A further improvement of the present invention is that, in step (2), the convolution filter is a pruning template of the same size as the convolution kernel operator.
A further improvement of the present invention is that, in step (2), the convolution filter is described using three parameters: Kp_stride is the stride of pruning or retention, Kp_offset = i means that the position index of the first pruned value is i, and Kp_keepset = j means that the position index of the first retained value is j.
Compared with prior art, the invention has the following advantages:
The optimization method of the invention reasonably allocates a sparsity value to each convolutional layer according to its importance score, and performs structured pruning at the level of the convolution operator with one type of convolution filter per layer, through a training procedure of tuning parameters, retraining, and tuning parameters again to obtain the final model. On the premise that the performance does not drop significantly, the entire convolutional neural network obtains a reasonable structured pruning compression optimization, which not only greatly reduces the parameter storage space but also has great potential for computational optimization. In addition, after structured pruning, one data stream only needs to perform a single partially-regular data-reading operation, and the read data can be reused; this saves a large amount of storage resources on the hardware platform and a large number of arithmetic operations, has great potential for computational acceleration, allows the deep neural network to run more easily on a resource-constrained platform, and saves parameter storage space while accelerating model inference.
Further, after the initial structured-pruning sparsity configuration of all prunable convolutional layers is obtained, because one or several convolutional layers in the same importance interval were previously changed to the same sparsity value at the same time, the sparsity values of the convolutional layers close to the ends of the importance intervals may not be accurate. It is therefore subsequently necessary to fine-tune some of these convolutional layers near the interval ends by increasing the sparsity value at the end with the smaller value and decreasing the sparsity value at the end with the larger value, performing a retraining operation immediately after each change and comparing the model performance before and after, to obtain the final sparsity configuration of all prunable convolutional layers; the criterion for judging model performance is the accuracy or the mAP value in object detection.
Further, the invention chooses to adjust the sparsity rate manually, and each convolutional layer uses only one type of convolution filter, so that no lengthy training is needed to determine the sparsity value of each layer; and because each layer uses only one type of convolution filter, the forward pass of the model can save a large amount of storage and computing resources.
Detailed description of the invention
Fig. 1 is a schematic diagram of the structured pruning principle in the optimization method of the embodiment of the present invention;
Fig. 2 is a schematic diagram of the structured pruning operation principle in the optimization method of the embodiment of the present invention;
Fig. 3 is a schematic diagram of the sparsity value configuration principle in the optimization method of the embodiment of the present invention;
Fig. 4 is a schematic diagram of the convolution filter structures for a 3*3 convolution operator in the optimization method of the embodiment of the present invention.
Specific embodiment
The invention is further described in detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, Fig. 1 shows the overall pruning compression optimization principle. The structured pruning compression optimization method for deep neural network convolutional layers of the embodiment of the present invention consists of two parts: sparsity allocation for each convolutional layer and structured pruning.
(1) The sparsity allocation steps for each convolutional layer are as follows. First, the original model is trained to obtain the parameter data of each prunable convolutional layer, and the single-layer importance scores are calculated. The per-layer importance scores M_l are summed to obtain M, and the global importance proportion of each layer, D_l = M_l / M, is calculated. The layers are ranked by D_l in ascending order, and the range between the maximum and minimum of D_l is divided into equal-width intervals; the specific number of intervals is chosen empirically by observing the distribution of D_l, following the rule that the total number of intervals does not exceed half the total number of layers, so that the convolutional layers are distinguished as far as possible. Sparsity values are then configured from small to large, starting from the interval of convolutional layers with the smallest importance scores. A retraining operation is performed after every change of the sparsity value; if the model performance remains good, the sparsity value continues to be increased, until the model performance suffers a large loss, in which case the previously tested sparsity value is taken as the final value. Then, on this basis, the above work is repeated for the convolutional layers in the next importance-score interval, until the sparsity configuration of the convolutional layers in the last interval is completed, yielding the initial structured-pruning sparsity configuration of all prunable convolutional layers. Finally, the sparsity values of a few convolutional layers may be modified with retraining fine-tuning to obtain the final sparsity configuration of all prunable convolutional layers. The evaluation criterion of the model performance is the accuracy or the mAP value in object detection; if the accuracy or mAP value does not decline, the model performance is considered to remain good, and if it declines beyond a preset threshold, the model performance is considered to suffer a large loss.
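For illustration only, a minimal numpy sketch of the importance scoring M_l, the global proportion D_l = M_l / M, and the equal-width interval segmentation described above; the function and variable names are assumptions, not part of the patent.

import numpy as np

def importance_scores(weights):
    """weights: dict {layer_name: 4-D array of that layer's kernel weights}.
    M_l is the mean of the squared kernel values of layer l."""
    return {name: float(np.mean(np.square(w))) for name, w in weights.items()}

def segment_by_importance(scores, num_intervals):
    """D_l = M_l / M; split [min(D_l), max(D_l)] into equal-width intervals.
    num_intervals should not exceed half the number of prunable layers."""
    total = sum(scores.values())
    d = {name: m / total for name, m in scores.items()}
    lo, hi = min(d.values()), max(d.values())
    width = (hi - lo) / num_intervals or 1.0
    intervals = [[] for _ in range(num_intervals)]
    for name in sorted(d, key=d.get):                      # ascending importance
        idx = min(int((d[name] - lo) / width), num_intervals - 1)
        intervals[idx].append(name)
    return intervals                                        # least-important interval first

layers = {"conv1": np.random.randn(16, 3, 3, 3), "conv2": np.random.randn(32, 16, 3, 3),
          "conv3": np.random.randn(64, 32, 3, 3)}
print(segment_by_importance(importance_scores(layers), 1))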
The original neural network model selected in the embodiment of the present invention is the object detection network YOLOv3, whose convolution operators have sizes 3*3 and 1*1; the convolutional layers with 3*3 operators are selected for structured pruning.
The Pascal VOC 2012 dataset is selected; its training and validation set includes 11540 = 5717 + 5823 pictures, and its test set includes 4952 pictures. The official configuration file yolov3-voc.cfg is then used to train the original model, and the model is tested to obtain its mAP value at this point.
The importance scores of the convolutional layers whose convolution operators have size 3*3 are obtained from the original model parameters:
In the formula, for a given layer l, M_l denotes the average of the squared values of the convolution kernel operators of that layer; n, c, h, w are the indices of the 4-D weight tensor of the convolutional layer, where N is the total number of input channels, C is the total number of output channels, and H, W are respectively the height and width of the convolution kernel; n, c, h, w are positive integers, with n ∈ [1, N], c ∈ [1, C], h ∈ [1, H], w ∈ [1, W].
Referring to Fig. 3, the original model is trained to obtain the parameter data of each prunable convolutional layer, and the single-layer importance scores are calculated. The per-layer importance scores M_l are summed to obtain M, and the global importance proportion of each layer, D_l = M_l / M, is calculated. The layers are ranked by D_l in ascending order, and the range between the maximum and minimum of D_l is divided into equal-width intervals; the specific number of intervals is chosen empirically by observing the distribution of D_l, following the rule that the total number of intervals does not exceed half the total number of layers, so that the convolutional layers are distinguished as far as possible. A retraining operation is performed after every change of the sparsity value; if the model performance remains good, the sparsity value continues to be increased, until the model performance suffers a large loss, in which case the previously tested sparsity value is taken as the final value. The evaluation criterion of the model performance is the accuracy or the mAP value in object detection; if the accuracy or mAP value does not decline, the model performance is considered to remain good, and if it declines beyond a preset threshold, the model performance is considered to suffer a large loss. Then, on this basis, the above work is repeated for the convolutional layers in the next importance-score interval, until the sparsity configuration of the convolutional layers in the last interval is completed, yielding the initial structured-pruning sparsity configuration of all prunable convolutional layers. Finally, the sparsity values of a few convolutional layers may be modified with retraining fine-tuning to obtain the final sparsity configuration of all prunable convolutional layers. Because one or several convolutional layers in the same importance interval were previously changed to the same sparsity value at the same time, the sparsity values of the convolutional layers close to the ends of the importance intervals may not be accurate, so it is subsequently necessary to fine-tune some of these convolutional layers near the interval ends by increasing the sparsity value at the end with the smaller value and decreasing the sparsity value at the end with the larger value, performing a retraining operation immediately after each change and comparing the model performance before and after, to obtain the final sparsity configuration of all prunable convolutional layers; the criterion for judging model performance is the accuracy or the mAP value in object detection.
(2) The structured pruning steps are as follows: according to the sparsity value, one type is randomly selected from the convolution filters corresponding to that sparsity value, while observing the rule that one convolutional layer may select only one type of convolution filter; a convolution filter is simply a pruning template of the same size as the convolution kernel operator. Each convolutional layer uses the same convolution filter throughout, and structured pruning training is performed.
Referring to Fig. 2 and Fig. 4, according to the configured sparsity value, a 3*3 convolution filter is selected from those shown in Fig. 4, and a retraining operation is performed according to the structured pruning operation principle shown in Fig. 2. The convolution filter is described using three parameters: Kp_stride is the stride of pruning (or retention), Kp_offset = i means that the position index of the first pruned value is i, and Kp_keepset = j means that the position index of the first retained value is j.
In Fig. 2, the normal operation takes the input data (5*5) obtained from the previous layer and, according to the convolution kernel (3*3), rearranges the image blocks into columns (im2col) to obtain an input data matrix (9*9), which is then multiplied by the convolution kernel (9*1) to obtain the result (9*1). With structured pruning, the input data (5*5) undergoes, according to the convolution filter (3*3), an im2col operation that does not read the pruned part of the data, yielding an input data matrix (9*4), which is then multiplied by the convolution kernel pruned by the convolution filter (4*1) to obtain the result (9*1). Since each layer selects only one type of convolution filter, the input data of the previous layer only needs one im2col pass to obtain the input data matrix (9*4), which can then be reused by the other convolution kernels of this layer after pruning by the convolution filter, without having to perform multiple im2col operations on the input data because several different types of convolution filters exist in one layer. This both reduces the amount of computation and saves a large amount of storage resources; a sketch of the pruned im2col computation is given below.
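The following numpy sketch illustrates the pruned im2col computation described above for a 5*5 input and a 3*3 kernel, with a hypothetical filter mask that keeps 4 of the 9 kernel positions; it is a sketch of the principle under these assumptions, not the patented implementation.

import numpy as np

def im2col_pruned(x, ksize, keep_idx):
    """Gather only the kernel positions retained by the convolution filter.
    x: 2-D input; ksize: kernel side length; keep_idx: retained positions (row-major)."""
    out = x.shape[0] - ksize + 1
    cols = np.empty((out * out, len(keep_idx)))
    for r in range(out):
        for c in range(out):
            patch = x[r:r + ksize, c:c + ksize].reshape(-1)
            cols[r * out + c] = patch[keep_idx]          # pruned positions are never read
    return cols

x = np.arange(25, dtype=float).reshape(5, 5)             # 5*5 input from the previous layer
kernel = np.random.randn(3, 3)
keep = [0, 2, 6, 8]                                      # hypothetical mask keeping 4 of 9 positions
cols = im2col_pruned(x, 3, keep)                         # (9, 4) input data matrix
result = cols @ kernel.reshape(-1)[keep]                 # (9,) result, i.e. the 9*1 output column
print(cols.shape, result.shape)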
The present invention uses the convolution kernel size 3*3 to introduce the convolution filters in Fig. 4; in the figure, a "0" value indicates that the corresponding position of the convolution kernel is pruned, and a "1" value indicates that the weight parameter at the corresponding position of the convolution kernel is retained. The convolution filter shapes under each sparsity value are obtained by enumeration: the two combinations (Kp_stride, Kp_offset) and (Kp_stride, Kp_keepset) are enumerated with Kp_stride, Kp_offset, Kp_keepset ∈ [0, ksize² - 1], where the three convolution filter parameters are integers and ksize is the side length of the convolution kernel. Convolution filters whose pattern of "1" and "0" values is not symmetric are then weeded out.
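A rough Python sketch of this enumeration follows, under the assumptions that (Kp_stride, Kp_offset) prunes every Kp_stride-th position starting at Kp_offset, that (Kp_stride, Kp_keepset) keeps every Kp_stride-th position starting at Kp_keepset, and that "symmetric" means 180-degree rotational symmetry of the 0/1 pattern; the patent does not spell these details out.

import itertools
import numpy as np

def enumerate_filters(ksize=3):
    """Enumerate candidate 0/1 pruning templates ('0' = pruned, '1' = retained)."""
    n = ksize * ksize
    masks = set()
    # A stride of 0 selects nothing meaningful, so strides start at 1 in this sketch.
    for stride, start in itertools.product(range(1, n), range(n)):
        pruned = np.ones(n, dtype=int)
        pruned[start::stride] = 0            # (Kp_stride, Kp_offset): prune these positions
        kept = np.zeros(n, dtype=int)
        kept[start::stride] = 1              # (Kp_stride, Kp_keepset): keep these positions
        for m in (pruned, kept):
            grid = m.reshape(ksize, ksize)
            if np.array_equal(grid, np.rot90(grid, 2)):   # weed out asymmetric templates
                masks.add(tuple(int(v) for v in m))
    return [np.array(m).reshape(ksize, ksize) for m in sorted(masks)]

templates = enumerate_filters(3)
print(len(templates))                        # number of symmetric 3*3 convolution filters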
Although existing unstructured pruning compression methods can reach very high compression ratios, the compressed models are difficult to optimize computationally, which is unfavorable for realizing convolutional neural networks on resource-constrained hardware platforms. To address this problem, the present invention performs structured pruning operations on each convolutional layer. First, convolution filters whose data reading is smoother are selected according to the size of the convolution operator of the convolutional layer, and they are classified according to sparsity value; then the importance of each prunable convolutional layer relative to the entire convolutional neural network is assessed, and each prunable convolutional layer is assigned a suitable sparsity value and a suitable convolution filter; a training procedure of retraining, tuning parameters, and retraining again is used so that, on the premise that the performance does not drop significantly, the entire convolutional neural network obtains a reasonable structured pruning compression optimization, which not only greatly reduces the parameter storage space but also has great potential for computational optimization.
In summary, the structured pruning compression optimization method for neural network convolutional layers of the invention belongs to the structured branch of pruning methods, and the object of pruning is the convolution kernel operator. By allocating a suitable sparsity value to each convolutional layer according to its importance and then using the same convolution filter within each layer, the deep network model is not only greatly reduced in parameter storage space, but this one-filter-per-layer arrangement can also bring a large computational acceleration effect. After structured pruning with the method of the invention, one data stream only needs to perform a single partially-regular data-reading operation, and the read data can be reused, which saves a large amount of storage resources on the hardware platform and a large number of arithmetic operations, and has great potential for computational acceleration.
The above embodiments are merely illustrative of the technical scheme of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art may still modify or equivalently replace specific embodiments of the invention; any modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the pending claims of the invention.

Claims (7)

1. A structured pruning compression optimization method for neural network convolutional layers, characterized by comprising:
(1) sparsity allocation for each convolutional layer, comprising:
(1.1) training the original model to obtain the weight parameters of each prunable convolutional layer, and computing an importance score for each convolutional layer;
(1.2) sorting the layers by importance score in ascending order, partitioning the score range into equal-width intervals between the minimum and maximum scores, assigning sparsity values from small to large to the convolutional layers in each interval in turn, and adjusting through model retraining to obtain the sparsity configuration of all prunable convolutional layers;
(2) structured pruning, comprising:
selecting a convolution filter according to the sparsity value determined in step (1.2), and performing structured pruning training;
wherein each convolutional layer uses only one type of convolution filter.
2. The structured pruning compression optimization method for neural network convolutional layers according to claim 1, characterized in that, in step 1, training the original model to obtain the weight parameters of each prunable convolutional layer specifically includes:
the weight parameters k_{l,nchw}, where l is the layer index and n, c, h, w are the indices of the 4-D weight tensor of the convolutional layer; n is the input channel index, c is the output channel index, and h, w are respectively the height and width indices of the convolution kernel; N is the total number of input channels, C is the total number of output channels, and H, W are respectively the total height and width of the convolution kernel; n, c, h, w are positive integers, with n ∈ [1, N], c ∈ [1, C], h ∈ [1, H], w ∈ [1, W].
3. The structured pruning compression optimization method for neural network convolutional layers according to claim 1, characterized in that, in step 1, the importance score of each convolutional layer is calculated as follows:
in the formula, for a given layer l, M_l denotes the average of the squared values of the convolution kernel operators of that layer; n, c, h, w are the indices of the 4-D weight tensor of the convolutional layer, where n is the input channel index, c is the output channel index, and h, w are respectively the height and width indices of the convolution kernel; N is the total number of input channels, C is the total number of output channels, and H, W are respectively the total height and width of the convolution kernel; n, c, h, w are positive integers, with n ∈ [1, N], c ∈ [1, C], h ∈ [1, H], w ∈ [1, W].
4. The structured pruning compression optimization method for neural network convolutional layers according to claim 1, characterized in that, in step 2, the specific steps of assigning sparsity values from small to large to the convolutional layers in each interval include:
for each prunable convolutional layer, the sparsity configuration includes changing its sparsity value and retraining the model; if the model performance remains good, its sparsity value continues to be increased, and if the model performance suffers a large loss, the previous sparsity value is taken as its final sparsity value;
the sparsity configuration of the convolutional layers is repeated until the sparsity configuration of the convolutional layers in the last interval is completed, yielding the initial structured-pruning sparsity configuration of all prunable convolutional layers;
wherein the evaluation criterion of the model performance is the accuracy or the mAP value in object detection; if the accuracy or mAP value does not decline, the model performance is considered to remain good, and if it declines beyond a preset threshold, the model performance is considered to suffer a large loss.
5. The structured pruning compression optimization method for neural network convolutional layers according to claim 4, characterized in that, after the initial structured-pruning sparsity configuration of all prunable convolutional layers is obtained, the convolutional layers close to the ends of the importance intervals are fine-tuned;
the fine-tuning includes increasing the sparsity value at the end with the smaller value and decreasing the sparsity value at the end with the larger value, performing a retraining operation immediately after each change, to obtain the final sparsity configuration of all prunable convolutional layers.
6. The structured pruning compression optimization method for neural network convolutional layers according to claim 1, characterized in that, in step (2), the convolution filter is a pruning template of the same size as the convolution kernel operator.
7. The structured pruning compression optimization method for neural network convolutional layers according to claim 1, characterized in that, in step (2), the convolution filter is described using three parameters: Kp_stride is the stride of pruning or retention, Kp_offset = i means that the position index of the first pruned value is i, and Kp_keepset = j means that the position index of the first retained value is j.
CN201910218652.5A 2019-03-21 2019-03-21 A structured pruning compression optimization method for neural network convolutional layers Pending CN109886397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910218652.5A CN109886397A (en) A structured pruning compression optimization method for neural network convolutional layers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910218652.5A CN109886397A (en) A structured pruning compression optimization method for neural network convolutional layers

Publications (1)

Publication Number Publication Date
CN109886397A true CN109886397A (en) 2019-06-14

Family

ID=66933609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910218652.5A Pending CN109886397A (en) A structured pruning compression optimization method for neural network convolutional layers

Country Status (1)

Country Link
CN (1) CN109886397A (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287857A (en) * 2019-06-20 2019-09-27 厦门美图之家科技有限公司 A kind of training method of characteristic point detection model
CN110378468A (en) * 2019-07-08 2019-10-25 浙江大学 A kind of neural network accelerator quantified based on structuring beta pruning and low bit
CN110598731A (en) * 2019-07-31 2019-12-20 浙江大学 Efficient image classification method based on structured pruning
CN110619391A (en) * 2019-09-19 2019-12-27 华南理工大学 Detection model compression method and device and computer readable storage medium
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110969240A (en) * 2019-11-14 2020-04-07 北京达佳互联信息技术有限公司 Pruning method, device, equipment and medium for deep convolutional neural network
CN111079691A (en) * 2019-12-27 2020-04-28 中国科学院重庆绿色智能技术研究院 Pruning method based on double-flow network
CN111340225A (en) * 2020-02-28 2020-06-26 中云智慧(北京)科技有限公司 Deep convolution neural network model compression and acceleration method
CN111368699A (en) * 2020-02-28 2020-07-03 交叉信息核心技术研究院(西安)有限公司 Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111461322A (en) * 2020-03-13 2020-07-28 中国科学院计算技术研究所 Deep neural network model compression method
CN112132062A (en) * 2020-09-25 2020-12-25 中南大学 Remote sensing image classification method based on pruning compression neural network
WO2020260991A1 (en) * 2019-06-26 2020-12-30 International Business Machines Corporation Dataset dependent low rank decomposition of neural networks
CN112241509A (en) * 2020-09-29 2021-01-19 上海兆芯集成电路有限公司 Graphics processor and method for accelerating the same
CN112613610A (en) * 2020-12-25 2021-04-06 国网江苏省电力有限公司信息通信分公司 Deep neural network compression method based on joint dynamic pruning
CN112749797A (en) * 2020-07-20 2021-05-04 腾讯科技(深圳)有限公司 Pruning method and device for neural network model
CN112949814A (en) * 2019-11-26 2021-06-11 联合汽车电子有限公司 Compression and acceleration method and device of convolutional neural network and embedded equipment
WO2021129570A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Network pruning optimization method based on network activation and sparsification
CN113128694A (en) * 2019-12-31 2021-07-16 北京超星未来科技有限公司 Method, device and system for data acquisition and data processing in machine learning
CN113392953A (en) * 2020-03-12 2021-09-14 澜起科技股份有限公司 Method and apparatus for pruning convolutional layers in a neural network
CN113762463A (en) * 2021-07-26 2021-12-07 华南师范大学 Model pruning method and system for raspberry pi processor
CN114186633A (en) * 2021-12-10 2022-03-15 北京百度网讯科技有限公司 Distributed training method, device, equipment and storage medium of model
CN110298446B (en) * 2019-06-28 2022-04-05 济南大学 Deep neural network compression and acceleration method and system for embedded system
CN114548884A (en) * 2022-04-27 2022-05-27 中国科学院微电子研究所 Package identification method and system based on pruning lightweight model
CN115935263A (en) * 2023-02-22 2023-04-07 和普威视光电股份有限公司 Yoov 5 pruning-based edge chip detection and classification method and system
CN117114148A (en) * 2023-08-18 2023-11-24 湖南工商大学 Lightweight federal learning training method

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287857A (en) * 2019-06-20 2019-09-27 厦门美图之家科技有限公司 A kind of training method of characteristic point detection model
WO2020260991A1 (en) * 2019-06-26 2020-12-30 International Business Machines Corporation Dataset dependent low rank decomposition of neural networks
CN113826120B (en) * 2019-06-26 2023-02-14 国际商业机器公司 Data set dependent low rank decomposition of neural networks
CN113826120A (en) * 2019-06-26 2021-12-21 国际商业机器公司 Data set dependent low rank decomposition of neural networks
GB2600055A (en) * 2019-06-26 2022-04-20 Ibm Dataset dependent low rank decomposition of neural networks
CN110298446B (en) * 2019-06-28 2022-04-05 济南大学 Deep neural network compression and acceleration method and system for embedded system
CN110378468A (en) * 2019-07-08 2019-10-25 浙江大学 A kind of neural network accelerator quantified based on structuring beta pruning and low bit
WO2021004366A1 (en) * 2019-07-08 2021-01-14 浙江大学 Neural network accelerator based on structured pruning and low-bit quantization, and method
CN110598731A (en) * 2019-07-31 2019-12-20 浙江大学 Efficient image classification method based on structured pruning
CN110598731B (en) * 2019-07-31 2021-08-20 浙江大学 Efficient image classification method based on structured pruning
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110619385B (en) * 2019-08-31 2022-07-29 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110619391B (en) * 2019-09-19 2023-04-18 华南理工大学 Detection model compression method and device and computer readable storage medium
CN110619391A (en) * 2019-09-19 2019-12-27 华南理工大学 Detection model compression method and device and computer readable storage medium
CN110969240A (en) * 2019-11-14 2020-04-07 北京达佳互联信息技术有限公司 Pruning method, device, equipment and medium for deep convolutional neural network
CN110969240B (en) * 2019-11-14 2022-12-09 北京达佳互联信息技术有限公司 Pruning method, device, equipment and medium for deep convolutional neural network
CN112949814B (en) * 2019-11-26 2024-04-26 联合汽车电子有限公司 Compression and acceleration method and device of convolutional neural network and embedded device
CN112949814A (en) * 2019-11-26 2021-06-11 联合汽车电子有限公司 Compression and acceleration method and device of convolutional neural network and embedded equipment
WO2021129570A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Network pruning optimization method based on network activation and sparsification
CN111079691A (en) * 2019-12-27 2020-04-28 中国科学院重庆绿色智能技术研究院 Pruning method based on double-flow network
CN113128694A (en) * 2019-12-31 2021-07-16 北京超星未来科技有限公司 Method, device and system for data acquisition and data processing in machine learning
CN111368699B (en) * 2020-02-28 2023-04-07 交叉信息核心技术研究院(西安)有限公司 Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111368699A (en) * 2020-02-28 2020-07-03 交叉信息核心技术研究院(西安)有限公司 Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111340225A (en) * 2020-02-28 2020-06-26 中云智慧(北京)科技有限公司 Deep convolution neural network model compression and acceleration method
CN113392953A (en) * 2020-03-12 2021-09-14 澜起科技股份有限公司 Method and apparatus for pruning convolutional layers in a neural network
CN111461322A (en) * 2020-03-13 2020-07-28 中国科学院计算技术研究所 Deep neural network model compression method
CN111461322B (en) * 2020-03-13 2024-03-08 中国科学院计算技术研究所 Deep neural network model compression method
CN112749797A (en) * 2020-07-20 2021-05-04 腾讯科技(深圳)有限公司 Pruning method and device for neural network model
CN112132062A (en) * 2020-09-25 2020-12-25 中南大学 Remote sensing image classification method based on pruning compression neural network
CN112241509A (en) * 2020-09-29 2021-01-19 上海兆芯集成电路有限公司 Graphics processor and method for accelerating the same
CN112241509B (en) * 2020-09-29 2024-03-12 格兰菲智能科技有限公司 Graphics processor and acceleration method thereof
CN112613610A (en) * 2020-12-25 2021-04-06 国网江苏省电力有限公司信息通信分公司 Deep neural network compression method based on joint dynamic pruning
CN113762463A (en) * 2021-07-26 2021-12-07 华南师范大学 Model pruning method and system for raspberry pi processor
CN114186633A (en) * 2021-12-10 2022-03-15 北京百度网讯科技有限公司 Distributed training method, device, equipment and storage medium of model
CN114548884A (en) * 2022-04-27 2022-05-27 中国科学院微电子研究所 Package identification method and system based on pruning lightweight model
CN114548884B (en) * 2022-04-27 2022-07-12 中国科学院微电子研究所 Package identification method and system based on pruning lightweight model
CN115935263A (en) * 2023-02-22 2023-04-07 和普威视光电股份有限公司 Yoov 5 pruning-based edge chip detection and classification method and system
CN117114148A (en) * 2023-08-18 2023-11-24 湖南工商大学 Lightweight federal learning training method
CN117114148B (en) * 2023-08-18 2024-04-09 湖南工商大学 Lightweight federal learning training method

Similar Documents

Publication Publication Date Title
CN109886397A (en) A structured pruning compression optimization method for neural network convolutional layers
Li et al. Group sparsity: The hinge between filter pruning and decomposition for network compression
Lin et al. Hrank: Filter pruning using high-rank feature map
Li et al. OICSR: Out-in-channel sparsity regularization for compact deep neural networks
WO2018227800A1 (en) Neural network training method and device
CN110378468A (en) A kind of neural network accelerator quantified based on structuring beta pruning and low bit
CN108009594B (en) A kind of image-recognizing method based on change grouping convolution
CN106503731A (en) A kind of based on conditional mutual information and the unsupervised feature selection approach of K means
CN109740734B (en) Image classification method of convolutional neural network by optimizing spatial arrangement of neurons
CN109754359A (en) A kind of method and system that the pondization applied to convolutional neural networks is handled
CN111259933B (en) High-dimensional characteristic data classification method and system based on distributed parallel decision tree
CN109472352A (en) A kind of deep neural network model method of cutting out based on characteristic pattern statistical nature
Gao et al. Vacl: Variance-aware cross-layer regularization for pruning deep residual networks
Kiaee et al. Alternating direction method of multipliers for sparse convolutional neural networks
CN112598129A (en) Adjustable hardware-aware pruning and mapping framework based on ReRAM neural network accelerator
JP2022101461A (en) Joint sparse method based on mixed particle size used for neural network
Qi et al. Learning low resource consumption cnn through pruning and quantization
US11875263B2 (en) Method and apparatus for energy-aware deep neural network compression
CN115983366A (en) Model pruning method and system for federal learning
CN114781639A (en) Depth model compression method for multilayer shared codebook vector quantization of edge equipment
CN113610350B (en) Complex working condition fault diagnosis method, equipment, storage medium and device
CN107463528A (en) The gauss hybrid models split-and-merge algorithm examined based on KS
CN113505804A (en) Image identification method and system based on compressed deep neural network
CN107291897A (en) A kind of time series data stream clustering method based on small wave attenuation summary tree
CN111783976A (en) Neural network training process intermediate value storage compression method and device based on window gradient updating

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20190614