CN108304928A - Compression method for deep neural networks based on improved clustering - Google Patents

Compression method for deep neural networks based on improved clustering

Info

Publication number
CN108304928A
Authority
CN
China
Prior art keywords
weights
network
cluster centre
cluster
deep neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810075486.3A
Other languages
Chinese (zh)
Inventor
刘涵
马琰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an University of Technology
Original Assignee
Xi'an University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an University of Technology
Priority to CN201810075486.3A priority Critical patent/CN108304928A/en
Publication of CN108304928A publication Critical patent/CN108304928A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/23 - Clustering techniques
    • G06F 18/232 - Non-hierarchical techniques
    • G06F 18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a compression method for deep neural networks based on improved clustering. First, pruning turns the normally trained network into a sparse network, achieving preliminary compression. Then K-Means++ clustering yields the cluster centres of each layer's weights, and the original weight values are represented by the cluster-centre values to realize weight sharing. Finally, per-layer clustering is used to quantize the weights of each layer, and retraining updates the cluster centres to achieve the final compression. Through pruning, weight sharing and weight quantization, the present invention compresses the deep neural network by 30 to 40 times overall while accuracy is even improved. The compression method based on improved clustering is simple and effective: the deep neural network is effectively compressed without loss of (and even with improved) accuracy, which makes deploying deep networks on mobile terminals feasible.

Description

Compression method for deep neural networks based on improved clustering
Technical field
The present invention relates to the field of machine learning, and more particularly to a compression method for deep neural networks based on improved clustering.
Background technology
Deep neural networks have shown clear advantages in a range of tasks such as speech recognition and computer vision. Apart from powerful computing platforms and diverse training frameworks, the strong performance of deep neural networks is mainly attributed to their large number of learnable parameters. As network depth increases, the learning ability of the network also grows. However, this gain in learning ability comes at the cost of increased memory and other computing resources: a large number of weights consumes considerable memory and memory bandwidth. Mobile terminals such as mobile phones and in-vehicle devices increasingly require deep neural networks, yet at current model sizes most models simply cannot be ported to mobile-phone apps or embedded chips.
Deep neural networks are typically over-parameterized, and deep learning models contain severe redundancy, which leads to wasted computation and storage. Many methods have been proposed to compress deep learning models; the main techniques involve network pruning, quantization, low-rank decomposition, transfer learning and the like, and the objects of compression are mainly deep convolutional neural networks. However, compression is largely limited to the fully connected layers, compression ratios are not high, and accuracy suffers a certain loss; these problems remain to be solved.
The compression method for deep neural networks based on improved clustering achieves effective compression without loss of (and even with improved) accuracy through three steps: pruning, weight sharing and weight quantization. The compression method is simple and effective, which makes deploying deep networks on mobile terminals feasible. Research on compressing deep neural networks based on improved clustering is therefore of significance both for practical applications of deep neural networks and for further theoretical study.
Summary of the invention
The purpose of the present invention is to provide a compression method for deep neural networks based on improved clustering that achieves effective compression of deep neural networks without loss of (and even with improved) accuracy, so that deploying deep neural networks on mobile terminals becomes possible.
To achieve the above object, the present invention adopts the following technical scheme:
A compression method for deep neural networks based on improved clustering, comprising the following steps:
1) Pruning strategy;
The pruning process is divided into three main steps: first, the network is trained normally and the trained model is saved; then connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse network model is saved; finally, the sparse network is retrained to guarantee the accuracy of the CNN, and the final model is saved after retraining. Each round of pruning and retraining is one iteration; as the number of training iterations increases, accuracy gradually rises, and after several iterations the best set of connections is found;
After pruning is completed, the original network becomes a sparse network; considering practical conditions, the sparse network structure is finally stored in the CSC format of scipy;
2) Weight sharing based on the K-Means++ algorithm;
The K-Means++ algorithm is chosen for clustering: the original n weights W = {w1, w2, ..., wn} are partitioned into k classes C = {c1, c2, ..., ck}, where n >> k and ||wi - wj|| denotes the Euclidean distance between wi and wj. The cost function of W with respect to C is defined as:
φ_W(C) = Σ_{w∈W} min_{c∈C} ||w - c||²
The goal of K-Means is to choose C so as to minimize the cost function φ_W(C); K-Means++ has the same optimization objective but improves the selection of the initial cluster centres. The basic idea of K-Means++ initial-centre selection is that the initial cluster centres should be as far away from each other as possible;
3) Weight quantization;
Per-layer clustering is used to quantize the weights of each layer, and retraining is finally performed to update the cluster centres. Quantizing the weights reduces the number of bits used to represent them, so weight quantization compresses the deep neural network further;
For each weight, only the index of the cluster centre it belongs to is stored. When the network is trained, forward propagation replaces every weight with its corresponding cluster centre; during backpropagation the weight gradients within each class are computed, summed and propagated back to update the cluster centre;
After weight sharing and quantization, all cluster centres are stored in a codebook, and a weight is no longer represented by its original 32-bit floating-point value but by the index of its corresponding cluster centre. This greatly reduces the amount of data to be stored; the final stored result is a codebook plus an index table. Assuming the weights are clustered into k classes, log2(k) bits are needed to encode an index. For a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as:
r = n·b / (n·log2(k) + k·b)
As a further solution of the present invention, in the K-Means++ algorithm the initial cluster centres are as far away from each other as possible, and the algorithm steps are as follows:
Step 1: randomly select one weight from the input W = {w1, w2, ..., wn} as the first cluster centre c1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest cluster centre already selected and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new ci as a cluster centre, choosing ci = w' ∈ W with probability D(w')/Sum(D(w)), i.e. points with larger D(w') are more likely to be selected as cluster centres;
Step 4: repeat Step 2 and Step 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
As a further solution of the present invention, in the K-Means++ algorithm Step 3 is implemented as follows: first take a random value Random that falls within Sum(D(w)), then repeatedly compute Random -= D(w) until the value is less than or equal to 0; the point reached at that moment is the next cluster centre. In the experiments the value Random = Sum(D(w)) * λ with λ ∈ (0, 1) is used. With Random = Sum(D(w)) * λ, the value falls with larger probability into an interval where D(w) is large, so the corresponding point w3 is selected as the new cluster centre with larger probability;
After every layer's weights have been clustered with the K-Means++ algorithm, the original weight values are represented by the cluster-centre values to realize weight sharing: multiple connections within the same layer share the same weight, while weights are not shared across layers. This reduces the number of weights and compresses the deep neural network once again.
Compared with the prior art, the present invention has the following advantages: through pruning, weight sharing and weight quantization, the compression method for deep neural networks based on improved clustering of the present invention compresses the deep neural network by 30 to 40 times overall while accuracy is even improved. The compression method based on improved clustering is simple and effective, and the deep neural network is effectively compressed without loss of (and even with improved) accuracy, which makes deploying deep networks on mobile terminals feasible.
Description of the drawings
Fig. 1 is the overall block diagram of the deep neural network compression method based on improved clustering of the present invention.
Fig. 2 is a schematic diagram of the sparse network after pruning in the present invention.
Fig. 3 is a schematic diagram of K-Means++ initial cluster-centre selection in the present invention.
Fig. 4 is a schematic diagram of the weight sharing and quantization process in the present invention.
Fig. 5 shows the change in accuracy loss after pruning of the LeNet-300-100 network.
Fig. 6 shows the top-1 error of the deep neural networks under different compression schemes.
Detailed description of the embodiments
The present invention is further elaborated below with reference to the drawings and specific embodiments.
The present invention proposes a deep neural network compression method based on improved clustering. First, pruning turns the normally trained network into a sparse network, achieving preliminary compression. Then K-Means++ clustering yields the cluster centres of each layer's weights, and the original weight values are represented by the cluster-centre values to realize weight sharing. Finally, per-layer clustering is used to quantize the weights of each layer, and retraining updates the cluster centres to achieve the final compression. The overall block diagram of the algorithm is shown in Fig. 1. The compression process based on improved clustering is divided into the following three stages:
1. Pruning strategy;
After a conventional convolutional neural network (CNN) has been trained, the model is very large: the weight matrices of the fully connected layers contain hundreds of thousands or even millions of parameter values, but many of those parameters have very small absolute values and contribute little to the training or test results of the CNN. We can therefore try to remove these small-valued parameters by pruning, which reduces both the model size and the amount of computation. The pruning process is divided into three main steps, shown as Step 1 in Fig. 1.
First the network is trained normally and the trained model is saved. Then connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse network model is saved. Pruning connections affects the accuracy of the network, so the sparse network must be retrained to guarantee the accuracy of the CNN, and the final model is saved after retraining. Each round of pruning and retraining is one iteration; as the number of training iterations increases, accuracy gradually rises, and after several iterations the best set of connections is found.
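For illustration, a minimal NumPy sketch of this magnitude-based pruning step is given below; the layer shape and the keep-the-largest-10%-of-weights threshold rule are hypothetical choices for the example, not values taken from the patent.

```python
import numpy as np

def prune_by_magnitude(weights, threshold):
    """Zero out connections whose absolute weight falls below the threshold.

    Returns the pruned weight matrix and the binary mask that should be
    re-applied after every retraining step so pruned connections stay zero.
    """
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

# Example: prune a randomly initialized fully connected layer.
rng = np.random.default_rng(0)
fc_weights = rng.normal(scale=0.05, size=(300, 100))

# Hypothetical rule: keep only the largest 10% of weights by magnitude.
threshold = np.quantile(np.abs(fc_weights), 0.9)
pruned, mask = prune_by_magnitude(fc_weights, threshold)
print("remaining connections:", int(mask.sum()), "of", mask.size)
```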
After pruning is completed, the original network becomes a sparse network (as shown in Fig. 2), so an efficient sparse-matrix storage format is needed. Typical sparse-matrix storage formats include Coordinate (COO) and Compressed Sparse Row/Column (CSR/CSC). Considering practical conditions, we finally store the sparse network structure in the CSC format of scipy.
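The CSC storage can be illustrated with scipy.sparse; this is a generic usage sketch rather than the patent's own storage code, and the matrix size and sparsity level are made up for the example.

```python
import numpy as np
from scipy import sparse

# A pruned layer is mostly zeros; CSC keeps only the non-zero weights,
# their row indices, and one column-pointer array.
rng = np.random.default_rng(0)
dense = rng.normal(size=(300, 100))
dense[np.abs(dense) < 1.5] = 0.0          # sparsify for illustration only

csc = sparse.csc_matrix(dense)
print("non-zeros:", csc.nnz)
print("dense bytes:", dense.nbytes,
      "-> CSC bytes:", csc.data.nbytes + csc.indices.nbytes + csc.indptr.nbytes)
```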
2. Weight sharing based on the K-Means++ algorithm;
The present invention obtains the cluster centres of each layer's weights by clustering and chooses the K-Means++ algorithm for this purpose, because randomly selecting the initial cluster centres as plain K-Means does can make the clustering result deviate greatly from the actual distribution of the data, and linear initialization of the cluster centres also gives unsatisfactory results. The K-Means++ algorithm solves this problem by selecting the initial points effectively, and according to proofs given in the literature K-Means++ outperforms K-Means in both speed and accuracy. We partition the original n weights W = {w1, w2, ..., wn} into k classes C = {c1, c2, ..., ck}, where n >> k and ||wi - wj|| denotes the Euclidean distance between wi and wj. The cost function of W with respect to C is defined as:
φ_W(C) = Σ_{w∈W} min_{c∈C} ||w - c||²
The goal of K-Means is to choose C so as to minimize the cost function φ_W(C); K-Means++ has the same optimization objective but improves the selection of the initial cluster centres. The basic idea of K-Means++ initial-centre selection is that the initial cluster centres should be as far away from each other as possible. The algorithm steps are as follows:
Step 1: randomly select one weight from the input W = {w1, w2, ..., wn} as the first cluster centre c1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest cluster centre already selected and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new ci as a cluster centre, choosing ci = w' ∈ W with probability D(w')/Sum(D(w)), i.e. points with larger D(w') are more likely to be selected as cluster centres;
Step 4: repeat Step 2 and Step 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
In the K-Means++ algorithm, Step 3 is implemented as follows: first take a random value Random that falls within Sum(D(w)), then repeatedly compute Random -= D(w) until the value is less than or equal to 0; the point reached at that moment is the next cluster centre. In the experiments the value Random = Sum(D(w)) * λ with λ ∈ (0, 1) is used; for ease of understanding, the present invention illustrates this with Fig. 3.
With Random = Sum(D(w)) * λ, the value falls with larger probability into an interval where D(w) is large (in Fig. 3 it falls into D(w3) with large probability), so the corresponding point w3 is selected as the new cluster centre with larger probability.
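A minimal sketch of this initialization, assuming one-dimensional layer weights and following Steps 1 to 5 with the roulette selection just described; the function name, the layer size and k = 64 are illustrative assumptions, not values from the patent.

```python
import numpy as np

def kmeanspp_init(weights, k, lam=None, seed=0):
    """Pick k initial cluster centres from a 1-D array of layer weights.

    Implements the roulette selection described above: Random = Sum(D(w)) * lam,
    then Random -= D(w) until the running value drops to zero or below; the
    weight reached at that point becomes the next centre.  When lam is None,
    it is redrawn uniformly from (0, 1) for every new centre.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=np.float64)
    centres = [w[rng.integers(len(w))]]              # Step 1: first centre at random
    while len(centres) < k:                          # Step 4: repeat until k centres
        # Step 2: D(w) = distance of every weight to its nearest selected centre.
        d = np.min(np.abs(w[:, None] - np.array(centres)[None, :]), axis=1)
        # Step 3: roulette selection, larger D(w) -> larger selection probability.
        random_val = d.sum() * (lam if lam is not None else rng.uniform())
        for wi, di in zip(w, d):
            random_val -= di
            if random_val <= 0:
                centres.append(wi)
                break
        else:                                        # numerical corner case
            centres.append(w[int(np.argmax(d))])
    return np.array(centres)

# Example: initialize 64 centres (6-bit indices) for one layer's weights.
layer_weights = np.random.default_rng(1).normal(scale=0.1, size=5000)
init_centres = kmeanspp_init(layer_weights, k=64)
print(init_centres[:5])
```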
After every layer's weights have been clustered with the K-Means++ algorithm, the original weight values are represented by the cluster-centre values to realize weight sharing: multiple connections within the same layer share the same weight (weights are not shared across layers). This reduces the number of weights and compresses the deep neural network once again.
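A hedged sketch of the weight-sharing step under the same assumptions: each weight of a layer is mapped to the index of its nearest cluster centre, and the layer is stored as a per-layer codebook plus an index table. The stand-in uniform-grid codebook here only keeps the example self-contained; in the method itself the codebook would come from the K-Means++ clustering above.

```python
import numpy as np

def share_layer_weights(weights, centres):
    """Replace each weight of one layer by the index of its nearest cluster centre.

    The layer is then stored as a small per-layer codebook (the centres) plus one
    small integer index per weight; different layers keep separate codebooks.
    """
    idx = np.argmin(np.abs(weights[:, None] - centres[None, :]), axis=1)
    dtype = np.uint8 if len(centres) <= 256 else np.uint16
    return idx.astype(dtype)

# Toy usage: quantize one layer to a 64-entry codebook and rebuild it for inference.
rng = np.random.default_rng(2)
layer = rng.normal(scale=0.1, size=1000)
codebook = np.linspace(layer.min(), layer.max(), 64)   # stand-in for K-Means++ centres
indices = share_layer_weights(layer, codebook)
shared_layer = codebook[indices]                        # every w_i becomes c_{idx(i)}
```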
3. Weight quantization;
The present invention quantizes each layer's weights via per-layer clustering and finally retrains to update the cluster centres. Quantizing the weights reduces the number of bits used to represent them, so weight quantization compresses the deep neural network further.
For each weight we only need to store the index of the cluster centre it belongs to. When the network is trained, forward propagation replaces every weight with its corresponding cluster centre; during backpropagation the weight gradients within each class are computed, summed and propagated back to update the cluster centre. The detailed process is shown in Fig. 4 (weights of the same colour are grouped into one class).
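The centre update during retraining can be sketched as a scatter-add of per-weight gradients onto their cluster centres; the learning rate and the random gradients below are placeholders, not values used in the patent's experiments.

```python
import numpy as np

def update_codebook(codebook, indices, weight_grads, lr=0.01):
    """Update cluster centres during retraining.

    The gradient of a centre is the sum of the gradients of all weights assigned
    to it (dL/dc_j = sum of dL/dw_i over i with idx(i) == j), after which the
    centre is moved by one plain gradient-descent step.
    """
    centre_grads = np.zeros_like(codebook)
    np.add.at(centre_grads, indices, weight_grads)   # scatter-add per cluster
    return codebook - lr * centre_grads

# Toy usage with random per-weight gradients for a 64-entry codebook.
rng = np.random.default_rng(3)
codebook = rng.normal(scale=0.1, size=64)
indices = rng.integers(0, 64, size=1000)
weight_grads = rng.normal(scale=1e-3, size=1000)
codebook = update_codebook(codebook, indices, weight_grads)
```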
After weight sharing and quantization, all cluster centres are stored in a codebook. A weight is no longer represented by its original 32-bit floating-point value but by the index of its corresponding cluster centre. This greatly reduces the amount of data to be stored; the final stored result is a codebook plus an index table. Assuming the weights are clustered into k classes, log2(k) bits are needed to encode an index. For a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as:
r = n·b / (n·log2(k) + k·b)
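A worked instance of the compression-ratio formula above, with a hypothetical layer size; it covers the quantization stage only and ignores the additional gain from pruning.

```python
import math

def compression_ratio(n, k, b=32):
    """r = n*b / (n*ceil(log2(k)) + k*b): n connections, k shared weights,
    b bits per original weight."""
    return (n * b) / (n * math.ceil(math.log2(k)) + k * b)

# Hypothetical layer with one million connections, quantized to k = 64 clusters
# (6-bit indices) starting from 32-bit floats: roughly 5.3x from quantization alone.
print(round(compression_ratio(n=1_000_000, k=64), 2))
```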
The present invention tests deep neural networks under a Linux operating system based on the Caffe framework, performs parallel computation under the CUDA 8.0 architecture, uses the cuBLAS library for BLAS operations, and uses cuSPARSE for sparse computation on sparse matrices. After pruning and shared-weight quantization, considerable compression is achieved without losing (and even while improving) accuracy. The experimental results on each network are presented below.
1. LeNet networks;
The present invention tests the LeNet-300-100 and LeNet-5 networks on the MNIST data set. First the normally trained network is pruned, and the pruning rate of each layer of the deep network is determined by trial and error. After pruning is completed, we retrain LeNet-300-100; Fig. 5 shows the accuracy and loss curves over 50 retraining iterations of the network. It can be seen that pruning achieves a good result without losing accuracy and generalizes well. After that, we perform K-Means++-based weight sharing and weight quantization, quantizing all network weights to 6 bits. Table 1 gives the compression parameters and compression effect of each stage of the LeNet-300-100 network (Weights% (P) denotes the proportion of each layer's parameters remaining after pruning; the tables below follow the same convention).
Table 1
It can be seen that pruning compresses the model by 13 times, and combined with shared quantization the network finally reaches a compression ratio of 34 times with an accuracy of 98.5%.
The present invention tests LeNet-5 with the same method; the compressed parameters are shown in Table 2.
Table 2
It can be seen that pruning compresses the model by 12 times, and the final network reaches a compression ratio of 36 times with an accuracy of 99.3%, achieving a good compression effect.
2. AlexNet network;
The present invention tests the AlexNet network on the ImageNet ILSVRC-2012 data set. The experimental procedure is the same as for LeNet-300-100 and is not repeated here. The detailed compression results of the network are shown in Table 3.
Table 3
It can be seen that pruning compresses the network by 10 times, and the final network reaches a compression ratio of 30 times with a Top-1 accuracy of 57.3% and a Top-5 accuracy of 80.4%; accuracy is improved while the network is compressed.
3. VGG-16 network;
The present invention tests the VGG-16 network on the ImageNet ILSVRC-2012 data set. With the same method, both the convolutional layers and the fully connected layers are pruned and further compressed, as shown in Table 4.
Table 4
It can be seen that pruning compresses the network by 15 times, and the network finally reaches a compression ratio of 40 times with a Top-1 accuracy of 68.9% and a Top-5 accuracy of 89.2%, achieving effective compression of the network.
4. Summary;
The detailed parameters and performance of each network before and after compression are shown in Table 5.
Table 5
As can be seen from Table 5, the deep neural network compression method based on improved clustering of the present invention compresses deep neural networks by 30 to 40 times overall, achieving considerable compression. Although there is room for improvement in the compression ratio, the compressed models are small enough to make deploying deep neural networks on mobile terminals possible. It should be pointed out that, on the one hand, after the networks are compressed with the method of the present invention, accuracy is not lost but improved to a certain degree (as shown in Fig. 6), which benefits from the improvement of the clustering method in the present invention. On the other hand, the present invention quantizes both the convolutional layers and the fully connected layers to 6 bits, which avoids the redundancy caused by unequal code lengths and eliminates the Huffman coding stage, so the compression method of the present invention is simpler and more effective.
In conclusion, the compression method for deep neural networks based on improved clustering of the present invention is simple and effective; it solves problems of conventional compression methods such as low compression ratio and accuracy loss, and achieves effective compression of deep neural networks without losing (and even with improved) accuracy, so that deploying deep neural networks on mobile terminals becomes possible.
The above are preferred embodiments of the present invention. For those of ordinary skill in the art, changes, modifications, replacements and variations made to the embodiments according to the teaching of the present invention, without departing from the principle and spirit of the present invention, still fall within the protection scope of the present invention.

Claims (3)

1. A compression method for deep neural networks based on improved clustering, characterized in that it comprises the following steps:
1) Pruning strategy;
The pruning process is divided into three main steps: first, the network is trained normally and the trained model is saved; then connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse network model is saved; finally, the sparse network is retrained to guarantee the accuracy of the CNN, and the final model is saved after retraining; each round of pruning and retraining is one iteration, and as the number of training iterations increases the accuracy gradually rises, the best set of connections being found after several iterations;
After pruning is completed, the original network becomes a sparse network; considering practical conditions, the sparse network structure is finally stored in the CSC format of scipy;
2) Weight sharing based on the K-Means++ algorithm;
The K-Means++ algorithm is chosen for clustering: the original n weights W = {w1, w2, ..., wn} are partitioned into k classes C = {c1, c2, ..., ck}, where n >> k and ||wi - wj|| denotes the Euclidean distance between wi and wj; the cost function of W with respect to C is defined as:
φ_W(C) = Σ_{w∈W} min_{c∈C} ||w - c||²
The goal of K-Means is to choose C so as to minimize the cost function φ_W(C); K-Means++ has the same optimization objective but improves the selection of the initial cluster centres, the basic idea of K-Means++ initial-centre selection being that the initial cluster centres should be as far away from each other as possible;
3) Weight quantization;
Per-layer clustering is used to quantize the weights of each layer, and retraining is finally performed to update the cluster centres; quantizing the weights reduces the number of bits used to represent them, and weight quantization thus compresses the deep neural network further;
For each weight, the index of the cluster centre it belongs to is stored; when the network is trained, forward propagation replaces every weight with its corresponding cluster centre, and during backpropagation the weight gradients within each class are computed, summed and propagated back to update the cluster centre; after weight sharing and quantization, all cluster centres are stored in a codebook, and a weight is no longer represented by its original 32-bit floating-point value but by the index of its corresponding cluster centre, which greatly reduces the amount of data to be stored, the final stored result being a codebook plus an index table; assuming the weights are clustered into k classes, log2(k) bits are needed to encode an index; for a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as:
r = n·b / (n·log2(k) + k·b)
2. The compression method for deep neural networks based on improved clustering according to claim 1, characterized in that in the K-Means++ algorithm the initial cluster centres are as far away from each other as possible, and the algorithm steps are as follows:
Step 1: randomly select one weight from the input W = {w1, w2, ..., wn} as the first cluster centre c1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest cluster centre already selected and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new ci as a cluster centre, choosing ci = w' ∈ W with probability D(w')/Sum(D(w)), i.e. points with larger D(w') are more likely to be selected as cluster centres;
Step 4: repeat Step 2 and Step 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
3. The compression method for deep neural networks based on improved clustering according to claim 2, characterized in that in the K-Means++ algorithm Step 3 is implemented as follows: first take a random value Random that falls within Sum(D(w)), then repeatedly compute Random -= D(w) until the value is less than or equal to 0, the point reached at that moment being the next cluster centre; in the experiments the value Random = Sum(D(w)) * λ with λ ∈ (0, 1) is used; with Random = Sum(D(w)) * λ, the value falls with larger probability into an interval where D(w) is large, so the corresponding point w3 is selected as the new cluster centre with larger probability;
After every layer's weights have been clustered with the K-Means++ algorithm, the original weight values are represented by the cluster-centre values to realize weight sharing: multiple connections within the same layer share the same weight, while weights are not shared across layers; this reduces the number of weights and compresses the deep neural network once again.
CN201810075486.3A 2018-01-26 2018-01-26 Compression method for deep neural networks based on improved clustering Pending CN108304928A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810075486.3A CN108304928A (en) 2018-01-26 2018-01-26 Compression method for deep neural networks based on improved clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810075486.3A CN108304928A (en) 2018-01-26 2018-01-26 Compression method for deep neural networks based on improved clustering

Publications (1)

Publication Number Publication Date
CN108304928A true CN108304928A (en) 2018-07-20

Family

ID=62866401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810075486.3A Pending CN108304928A (en) 2018-01-26 2018-01-26 Compression method for deep neural networks based on improved clustering

Country Status (1)

Country Link
CN (1) CN108304928A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063666A (en) * 2018-08-14 2018-12-21 电子科技大学 The lightweight face identification method and system of convolution are separated based on depth
CN109522949A (en) * 2018-11-07 2019-03-26 北京交通大学 Model of Target Recognition method for building up and device
CN109522949B (en) * 2018-11-07 2021-01-26 北京交通大学 Target recognition model establishing method and device
CN109523016B (en) * 2018-11-21 2020-09-01 济南大学 Multi-valued quantization depth neural network compression method and system for embedded system
CN109543766A (en) * 2018-11-28 2019-03-29 钟祥博谦信息科技有限公司 Image processing method and electronic equipment, storage medium
CN109635938A (en) * 2018-12-29 2019-04-16 电子科技大学 A kind of autonomous learning impulsive neural networks weight quantization method
CN109635938B (en) * 2018-12-29 2022-05-17 电子科技大学 Weight quantization method for autonomous learning impulse neural network
CN109858613A (en) * 2019-01-22 2019-06-07 鹏城实验室 A kind of compression method of deep neural network, system and terminal device
CN109858613B (en) * 2019-01-22 2021-02-19 鹏城实验室 Compression method and system of deep neural network and terminal equipment
CN109978144B (en) * 2019-03-29 2021-04-13 联想(北京)有限公司 Model compression method and system
CN109978144A (en) * 2019-03-29 2019-07-05 联想(北京)有限公司 A kind of model compression method and system
CN109993304A (en) * 2019-04-02 2019-07-09 北京同方软件有限公司 A kind of detection model compression method based on semantic segmentation
CN110288004A (en) * 2019-05-30 2019-09-27 武汉大学 A kind of diagnosis method for system fault and device excavated based on log semanteme
CN110175262A (en) * 2019-05-31 2019-08-27 武汉斗鱼鱼乐网络科技有限公司 Deep learning model compression method, storage medium and system based on cluster
CN110298446A (en) * 2019-06-28 2019-10-01 济南大学 The deep neural network compression of embedded system and accelerated method and system
CN112749782A (en) * 2019-10-31 2021-05-04 上海商汤智能科技有限公司 Data processing method and related product
CN110909667A (en) * 2019-11-20 2020-03-24 北京化工大学 Lightweight design method for multi-angle SAR target recognition network
US11928599B2 (en) 2019-11-29 2024-03-12 Inspur Suzhou Intelligent Technology Co., Ltd. Method and device for model compression of neural network
WO2021103597A1 (en) * 2019-11-29 2021-06-03 苏州浪潮智能科技有限公司 Method and device for model compression of neural network
CN111263163A (en) * 2020-02-20 2020-06-09 济南浪潮高新科技投资发展有限公司 Method for realizing depth video compression framework based on mobile phone platform
CN111476366B (en) * 2020-03-16 2024-02-23 清华大学 Model compression method and system for deep neural network
CN113673693B (en) * 2020-05-15 2024-03-12 宏碁股份有限公司 Deep neural network compression method
CN113673693A (en) * 2020-05-15 2021-11-19 宏碁股份有限公司 Method for deep neural network compression
CN111723912A (en) * 2020-06-18 2020-09-29 南强智视(厦门)科技有限公司 Neural network decoupling method
CN112016672A (en) * 2020-07-16 2020-12-01 珠海欧比特宇航科技股份有限公司 Method and medium for neural network compression based on sensitivity pruning and quantization
CN112132024B (en) * 2020-09-22 2024-02-27 中国农业大学 Underwater target recognition network optimization method and device
CN112132024A (en) * 2020-09-22 2020-12-25 中国农业大学 Underwater target recognition network optimization method and device
CN113114454B (en) * 2021-03-01 2022-11-29 暨南大学 Efficient privacy outsourcing k-means clustering method
CN113114454A (en) * 2021-03-01 2021-07-13 暨南大学 Efficient privacy outsourcing k-means clustering method

Similar Documents

Publication Publication Date Title
CN108304928A (en) Compression method for deep neural networks based on improved clustering
CN107239825B (en) Deep neural network compression method considering load balance
CN110222821B (en) Weight distribution-based convolutional neural network low bit width quantization method
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
CN108229681A (en) A kind of neural network model compression method, system, device and readable storage medium storing program for executing
CN109002889A (en) Adaptive iteration formula convolutional neural networks model compression method
CN109886406A (en) A kind of complex convolution neural network compression method based on depth-compression
CN111105035A (en) Neural network pruning method based on combination of sparse learning and genetic algorithm
CN111814448B (en) Pre-training language model quantization method and device
CN111626404A (en) Deep network model compression training method based on generation of antagonistic neural network
CN108197707A (en) Compression method based on the convolutional neural networks that global error is rebuild
CN112329910A (en) Deep convolutional neural network compression method for structure pruning combined quantization
CN114970853A (en) Cross-range quantization convolutional neural network compression method
CN113837940A (en) Image super-resolution reconstruction method and system based on dense residual error network
CN109523016B (en) Multi-valued quantization depth neural network compression method and system for embedded system
CN108268950A (en) Iterative neural network quantization method and system based on vector quantization
Qi et al. Learning low resource consumption cnn through pruning and quantization
Wang et al. RFPruning: A retraining-free pruning method for accelerating convolutional neural networks
CN110263917A (en) A kind of neural network compression method and device
CN116976428A (en) Model training method, device, equipment and storage medium
CN112115837A (en) Target detection method based on YoloV3 and dual-threshold model compression
CN114781639A (en) Depth model compression method for multilayer shared codebook vector quantization of edge equipment
CN114595802A (en) Data compression-based impulse neural network acceleration method and device
CN110399975A (en) A kind of lithium battery depth diagnostic model compression algorithm towards hardware transplanting
CN114372565A (en) Target detection network compression method for edge device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180720