CN108304928A - Deep neural network compression method based on improved clustering - Google Patents
Deep neural network compression method based on improved clustering
- Publication number
- CN108304928A (application CN201810075486.3A)
- Authority
- CN
- China
- Prior art keywords
- weights
- network
- cluster centre
- cluster
- deep neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a compression method for deep neural networks based on improved clustering. First, pruning turns the normally trained network into a sparse network, achieving preliminary compression. Then K-Means++ clustering yields the cluster centres of each layer's weights, and representing the original weight values by their centre values realizes weight sharing. Finally, per-layer clustering quantizes each layer's weights, and retraining updates the cluster centres, achieving the final compression. Through the three steps of pruning, weight sharing and weight quantization, the deep neural network is compressed 30 to 40 times overall while accuracy improves. The method is simple and effective: the network is compressed without loss of (and even with a gain in) accuracy, which makes deploying deep networks on mobile devices possible.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a compression method for deep neural networks based on improved clustering.
Background technology
In a range of tasks such as speech recognition and computer vision, deep neural networks have shown clear advantages. Beyond powerful computing platforms and diverse training frameworks, their strong performance is mainly attributable to their large number of learnable parameters. As network depth increases, so does learning capacity, but this gain comes at the cost of memory and other computing resources: the huge number of weights consumes considerable memory and memory bandwidth. Mobile platforms such as phones and in-vehicle systems increasingly demand deep neural networks, yet at current model sizes most models cannot be ported to a phone app or an embedded chip at all.
Deep neural networks are typically over-parameterized, and deep learning models contain severe redundancy, which wastes both computation and storage. Many methods have been proposed to compress deep learning models, chiefly network pruning, quantization, low-rank decomposition and transfer learning, mostly targeting deep convolutional neural networks. However, they generally compress only the fully connected layers, achieve limited compression ratios, and lose some accuracy; these problems remain to be solved.
The compression method for deep neural networks based on improved clustering achieves effective compression without loss of (and even with a gain in) accuracy through three steps: pruning, weight sharing and weight quantization. The method is simple and effective, making the deployment of deep networks on mobile devices possible. Research on this compression method therefore has real significance for both the practical application and the further theoretical study of deep neural networks.
Summary of the invention
The purpose of the present invention is to provide a compression method for deep neural networks based on improved clustering that achieves effective compression without loss of (and even with a gain in) accuracy, making the deployment of deep neural networks on mobile devices possible.
The present invention adopts the following technical scheme to achieve the above object.
A compression method for deep neural networks based on improved clustering comprises the following steps:
1) Pruning.
The pruning process is divided into three steps: first, the network is trained conventionally and the trained model is saved; then, connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse model is saved; finally, the sparse network is retrained to preserve the effectiveness of the CNN, and the final model is saved after retraining. Each prune-and-retrain pass is one iteration; accuracy gradually increases with the number of iterations, and after several iterations the best connections are found.
Once pruning is complete, the original network has become a sparse network; considering practical conditions, the sparse network structure is finally stored in scipy's CSC format.
2) Weight sharing based on the K-Means++ algorithm.
The K-Means++ algorithm is chosen for clustering, partitioning the original n weights W = {w_1, w_2, ..., w_n} into k classes C = {c_1, c_2, ..., c_k}, where n >> k and ||w_i − w_j|| denotes the Euclidean distance between w_i and w_j. The cost function of W with respect to C is defined as:
φ_W(C) = Σ_{w ∈ W} min_{c ∈ C} ||w − c||²
The goal of K-Means is to choose C so as to minimize the cost function φ_W(C). K-Means++ shares this optimization objective but improves the selection of the initial cluster centres; its basic idea is that the initial cluster centres should be as far from each other as possible.
3) Weight quantization.
Per-layer clustering quantizes each layer's weights, and a final retraining updates the cluster centres. Quantizing the weights reduces the number of bits used to represent them, so weight quantization compresses the deep neural network further.
For each weight, the index of its cluster centre is stored. During training, the forward pass replaces each weight with its corresponding cluster centre; the backward pass computes the weight gradients within each class and propagates their sum back to update the cluster centre.
After shared quantization, all cluster centres are stored in a codebook, and each weight is represented not by the original 32-bit floating-point number but by the index of its cluster centre, which greatly reduces the stored data; the final result is a codebook plus an index table. Assuming the weights are clustered into k classes, log2(k) bits are needed to index the codebook. For a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as:
r = (n · b) / (n · log2(k) + k · b)
As a further solution of the present invention, in the K-Means++ algorithm the initial cluster centres are as far from each other as possible; the algorithm steps are as follows:
Step 1: from the input W = {w_1, w_2, ..., w_n}, randomly select one element as the first cluster centre c_1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest already-selected cluster centre and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new c_i as a cluster centre, choosing c_i = w' ∈ W with probability D(w')/Sum(D(w)), i.e. points with larger D(w') are more likely to be selected as cluster centres;
Step 4: repeat steps 2 and 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
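Steps 1 to 5 can be sketched as follows (a minimal 1-D illustration; following the document, the roulette wheel weights each point by its distance D(w) to the nearest already-chosen centre):

```python
import random

def kmeans_pp_init(weights, k, rng=None):
    """Select k initial centres: the first uniformly at random, each
    subsequent one by a roulette wheel over D(w), the distance of each
    point to its nearest already-selected centre (Steps 1-4)."""
    rng = rng or random.Random(0)
    centres = [rng.choice(weights)]
    while len(centres) < k:
        d = [min(abs(w - c) for c in centres) for w in weights]  # D(w)
        r = sum(d) * rng.random()   # Random = Sum(D(w)) * lambda
        for w, dw in zip(weights, d):
            r -= dw                 # Random -= D(w)
            if r <= 0:
                centres.append(w)
                break
        else:
            centres.append(weights[-1])  # floating-point rounding fallback
    return centres

centres = kmeans_pp_init([0.10, 0.11, 0.52, 0.50, 0.90], k=3)
```

Step 5 would then run standard Lloyd iterations from these centres.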
As a further solution of the present invention, in the K-Means++ algorithm step 3 is implemented as follows: first draw a random value Random that falls within Sum(D(w)), then repeatedly apply Random -= D(w) until the value is less than or equal to 0; the point reached at that moment is the next cluster centre. In the experiments the value is taken as Random = Sum(D(w)) · λ with λ ∈ (0, 1). With this choice, the value falls with larger probability into a large D(w) interval, so the corresponding point (w_3 in the illustration of Fig. 3) is selected as the new cluster centre with larger probability.
After each layer's weights have been clustered by the K-Means++ algorithm, the original weight values are represented by their cluster-centre values, realizing weight sharing: multiple connections within the same layer share the same weight, while weights are not shared across layers. This reduces the number of weights and compresses the deep neural network once more.
Compared with the prior art, the present invention has the following advantages: through the three steps of pruning, weight sharing and weight quantization, the deep-neural-network compression method based on improved clustering compresses the network 30 to 40 times overall while accuracy improves. The method is simple and effective; the network is compressed effectively without loss of (and even with a gain in) accuracy, which makes deploying deep networks on mobile devices possible.
Description of the drawings
Fig. 1 is the overall block diagram of the deep-neural-network compression method based on improved clustering of the present invention.
Fig. 2 is a schematic diagram of the sparse network after pruning.
Fig. 3 is a schematic diagram of K-Means++ initial cluster-centre selection.
Fig. 4 is a schematic diagram of the weight-sharing quantization process.
Fig. 5 shows the accuracy-loss curves of the LeNet-300-100 network after pruning.
Fig. 6 shows the top-1 error of deep neural networks under different compression schemes.
Detailed description of the embodiments
The present invention is further elaborated below with reference to the drawings and specific embodiments.
The present invention proposes a deep-neural-network compression method based on improved clustering. First, pruning turns the normally trained network into a sparse network, achieving preliminary compression. Then K-Means++ clustering yields the cluster centres of each layer's weights, and representing the original weight values by their centre values realizes weight sharing. Finally, per-layer clustering quantizes each layer's weights, and retraining updates the cluster centres, achieving the final compression. The overall block diagram of the algorithm is shown in Fig. 1.
The compression process based on improved clustering is divided into the following three stages:
1. Pruning.
After a conventional convolutional neural network (CNN) has been trained, the model is very large: the weight matrices of the fully connected layers hold hundreds of thousands to millions of parameter values, yet many of these parameters have very small absolute values and contribute little to the training or test results. We can therefore try to remove these small-valued parameters by pruning, which reduces both the model size and the amount of computation. The pruning process is divided into three steps, shown as Step 1 in Fig. 1.
First, the network is trained conventionally and the trained model is saved. Then, connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse model is saved. Pruning connections affects the accuracy of the network, so the sparse network must be retrained to preserve the effectiveness of the CNN; the final model is saved after retraining. Each prune-and-retrain pass is one iteration; accuracy gradually increases with the number of iterations, and after several iterations the best connections are found.
Once pruning is complete, the original network has become a sparse network (as shown in Fig. 2), and we need an efficient sparse-matrix storage format. Typical formats include Coordinate (COO) and Compressed Sparse Row/Column (CSR/CSC). Considering practical conditions, we finally store the sparse network structure in scipy's CSC format.
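Storing a pruned layer in CSC format with scipy can be sketched as follows (the toy matrix is assumed for illustration):

```python
import numpy as np
from scipy.sparse import csc_matrix

# A pruned weight matrix: only the nonzero values, their row indices and
# per-column pointers are stored, instead of the full dense array.
dense = np.array([[0.5, 0.0, 0.3],
                  [0.0, 0.8, 0.0],
                  [0.2, 0.0, 0.0]])
sparse = csc_matrix(dense)
```

Here `sparse.data` holds the four surviving weights in column-major order, `sparse.indices` their row numbers, and `sparse.indptr` marks where each column starts.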
2. Weight sharing based on the K-Means++ algorithm.
The present invention obtains the cluster centres of each layer's weights by clustering, and chooses the K-Means++ algorithm for this purpose. Randomly selecting initial cluster centres, as plain K-Means does, can produce a clustering far from the actual distribution of the data, and linear initialization of the cluster centres also gives unsatisfactory results; K-Means++ solves this problem by selecting the initial points effectively. Moreover, according to published proofs, K-Means++ outperforms K-Means in both speed and accuracy. We partition the original n weights W = {w_1, w_2, ..., w_n} into k classes C = {c_1, c_2, ..., c_k}, where n >> k, and ||w_i − w_j|| denotes the Euclidean distance between w_i and w_j. The cost function of W with respect to C is defined as:
φ_W(C) = Σ_{w ∈ W} min_{c ∈ C} ||w − c||²
The goal of K-Means is to choose C so as to minimize the cost function φ_W(C). K-Means++ shares this optimization objective but improves the selection of the initial cluster centres. Its basic idea is that the initial cluster centres should be as far from each other as possible; the algorithm steps are as follows:
Step 1: from the input W = {w_1, w_2, ..., w_n}, randomly select one element as the first cluster centre c_1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest already-selected cluster centre and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new c_i as a cluster centre, choosing c_i = w' ∈ W with probability D(w')/Sum(D(w)) (i.e. points with larger D(w') are more likely to be selected as cluster centres);
Step 4: repeat steps 2 and 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
In the K-Means++ algorithm, step 3 is implemented as follows: first draw a random value Random that falls within Sum(D(w)), then repeatedly apply Random -= D(w) until the value is less than or equal to 0; the point reached at that moment is the next cluster centre. In the experiments the value is taken as Random = Sum(D(w)) · λ with λ ∈ (0, 1); for ease of understanding, the present invention illustrates this with Fig. 3.
With Random = Sum(D(w)) · λ, the value falls with larger probability into a large D(w) interval (in the figure it falls into D(w_3) with large probability), so the corresponding point w_3 is selected as the new cluster centre with larger probability.
After each layer's weights have been clustered by the K-Means++ algorithm, the original weight values are represented by their cluster-centre values, realizing weight sharing: multiple connections within the same layer share the same weight (weights are not shared across layers). This reduces the number of weights and compresses the deep neural network once more.
3. Weight quantization.
The present invention quantizes each layer's weights through per-layer clustering, and finally retrains to update the cluster centres. Quantizing the weights reduces the number of bits used to represent them, so weight quantization compresses the deep neural network further.
For each weight, we only need to store the index of its cluster centre. During training, the forward pass replaces each weight with its corresponding cluster centre; the backward pass computes the weight gradients within each class and propagates their sum back to update the cluster centre. The detailed process is shown in Fig. 4 (weights of the same colour belong to the same cluster).
After shared quantization, all cluster centres are stored in a codebook. Each weight is represented not by the original 32-bit floating-point number but by the index of its cluster centre. This greatly reduces the stored data; the final result is a codebook plus an index table. Assuming the weights are clustered into k classes, log2(k) bits are needed to index the codebook. For a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as:
r = (n · b) / (n · log2(k) + k · b)
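The index/codebook representation and the centre update during back-propagation can be sketched as follows (toy values, assumed for illustration):

```python
import numpy as np

def quantize_to_codebook(weights, codebook):
    """Replace each weight by the index of its nearest cluster centre;
    the layer is then stored as a codebook plus an index table."""
    return np.argmin(np.abs(weights[:, None] - codebook[None, :]), axis=1)

def centroid_gradient_step(codebook, idx, grad, lr):
    """Retraining update: gradients of all weights that share a centre
    are summed and applied to that centre."""
    new_codebook = codebook.copy()
    for j in range(len(codebook)):
        new_codebook[j] -= lr * grad[idx == j].sum()
    return new_codebook

codebook = np.array([-0.5, 0.0, 0.5])      # hypothetical cluster centres
w = np.array([0.45, -0.48, 0.02, 0.51])    # toy layer weights
idx = quantize_to_codebook(w, codebook)    # per-weight index table
recon = codebook[idx]                      # weights used in the forward pass
new_cb = centroid_gradient_step(codebook, idx,
                                np.array([1.0, 0.0, 0.0, 1.0]), lr=0.1)
```

Only `codebook` and `idx` need to be stored; the forward pass reconstructs weights by the `codebook[idx]` lookup.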
The present invention tests deep neural networks on the Caffe framework under a Linux operating system, running parallel computations on the CUDA 8.0 architecture, with the cuBLAS library providing the BLAS routines and cuSPARSE performing the sparse-matrix computations. After pruning and shared quantization, the networks achieve considerable compression without losing (and even while gaining) accuracy. Our experimental results on each network are presented below.
1, LeNet networks;
The present invention tests the LeNet-300-100 and LeNet-5 networks on the MNIST dataset. First, the normally trained network is pruned, with the pruning rate of each layer of the deep network determined by trial and error. After pruning, we retrain LeNet-300-100; Fig. 5 shows the accuracy and loss curves over 50 retraining passes. As can be seen, pruning achieves good results without losing accuracy and generalizes well. After this, we apply the K-Means++-based weight sharing and weight quantization, with all network weights quantized to 6 bits. Table 1 gives the compression parameters and compression effect of each stage for the LeNet-300-100 network (Weights%(P) denotes the proportion of parameters remaining per layer after pruning; the same notation is used in the tables below).
Table 1
As can be seen that beta pruning process has finally obtained 34 times by 13 times of model compression, with shared quantization in conjunction with network
Compression ratio, precision have reached 98.5%.
The present invention tests LeNet-5 with the same method; the compression parameters are shown in Table 2.
Table 2
As can be seen that beta pruning process, by 12 times of model compression, final network has obtained 36 times of compression ratio, and precision reaches
99.3%, and realize preferable compression effectiveness.
2, AlexNet networks
The present invention tests the AlexNet network on the ImageNet ILSVRC-2012 dataset, following the same procedure as the LeNet-300-100 experiment; the details are not repeated here. The specific compression results are shown in Table 3.
Table 3
It can be seen that pruning compresses the network 10 times, and the final network reaches a 30x compression ratio with 57.3% Top-1 accuracy and 80.4% Top-5 accuracy; accuracy improves even as the network is compressed.
3, VGG-16 networks
The present invention tests the VGG-16 network on the ImageNet ILSVRC-2012 dataset with the same method, pruning and further compressing both the convolutional and the fully connected layers, as shown in Table 4.
Table 4
It can be seen that pruning compresses the network 15 times, and the network finally reaches a 40x compression ratio with 68.9% Top-1 accuracy and 89.2% Top-5 accuracy, achieving effective compression.
4. Summary.
The specific parameters and performance of each network before and after compression are shown in Table 5.
Table 5
As can be seen from Table 5, the deep-neural-network compression method based on improved clustering compresses the networks 30 to 40 times overall, a considerable compression. Although the compression ratio still has room for improvement, the compressed models are small enough to make deploying deep neural networks on mobile devices possible. It should be pointed out that, on the one hand, after compression with the method of the present invention accuracy is not lost but improves to a certain degree (as shown in Fig. 6), which is due to the improved clustering method of the present invention. On the other hand, the present invention quantizes both the convolutional and the fully connected layers to 6 bits, which avoids the redundancy caused by unequal code lengths and eliminates the Huffman-coding stage, so the compression method of the present invention is simpler and more effective.
In conclusion, the compression method for deep neural networks based on improved clustering of the present invention is simple and effective; it solves the problems of low compression ratios and accuracy loss in conventional compression methods, and achieves effective compression without loss of (and even with a gain in) accuracy, making the deployment of deep neural networks on mobile devices possible.
The above are preferred embodiments of the present invention. For those of ordinary skill in the art, changes, modifications, replacements and variations made to the embodiments according to the teaching of the present invention, without departing from its principle and spirit, still fall within the protection scope of the present invention.
Claims (3)
1. A compression method for deep neural networks based on improved clustering, characterized by comprising the following steps:
1) Pruning: the pruning process is divided into three steps: first, the network is trained conventionally and the trained model is saved; then, connections with small weights are pruned, turning the original network into a sparse network, and the pruned sparse model is saved; finally, the sparse network is retrained to ensure the effectiveness of the CNN, and the final model is saved after retraining; each prune-and-retrain pass is one iteration, accuracy gradually increases with the number of iterations, and after several iterations the best connections are found; once pruning is complete, the original network has become a sparse network, and considering practical conditions the sparse network structure is finally stored in scipy's CSC format;
2) Weight sharing based on the K-Means++ algorithm: the K-Means++ algorithm is chosen for clustering, partitioning the original n weights W = {w_1, w_2, ..., w_n} into k classes C = {c_1, c_2, ..., c_k}, where n >> k and ||w_i − w_j|| denotes the Euclidean distance between w_i and w_j; the cost function of W with respect to C is defined as
φ_W(C) = Σ_{w ∈ W} min_{c ∈ C} ||w − c||²;
the goal of K-Means is to choose C so as to minimize the cost function φ_W(C); K-Means++ shares this optimization objective but improves the selection of the initial cluster centres, its basic idea being that the initial cluster centres should be as far from each other as possible;
3) Weight quantization: per-layer clustering quantizes each layer's weights, and a final retraining updates the cluster centres; quantizing the weights reduces the number of bits used to represent them, so weight quantization compresses the deep neural network further;
for each weight, the index of its cluster centre is stored; during training, the forward pass replaces each weight with its corresponding cluster centre, and the backward pass computes the weight gradients within each class and propagates their sum back to update the cluster centre; after shared quantization, all cluster centres are stored in a codebook, and each weight is represented not by the original 32-bit floating-point number but by the index of its cluster centre, which greatly reduces the stored data, the final result being a codebook plus an index table; assuming the weights are clustered into k classes, log2(k) bits are needed to index the codebook; for a network with n connections, each originally represented with b bits and sharing k weights, the compression ratio r can be expressed as
r = (n · b) / (n · log2(k) + k · b).
2. The compression method for deep neural networks based on improved clustering according to claim 1, characterized in that in the K-Means++ algorithm the initial cluster centres are as far from each other as possible, with the following algorithm steps:
Step 1: from the input W = {w_1, w_2, ..., w_n}, randomly select one element as the first cluster centre c_1;
Step 2: for each w in the data set, compute its distance D(w) to the nearest already-selected cluster centre and store it in an array, then sum these distances to obtain Sum(D(w));
Step 3: select a new c_i as a cluster centre, choosing c_i = w' ∈ W with probability D(w')/Sum(D(w)), i.e. points with larger D(w') are more likely to be selected as cluster centres;
Step 4: repeat steps 2 and 3 until k cluster centres have been selected;
Step 5: run the standard K-Means algorithm with these k initial cluster centres.
3. The compression method for deep neural networks based on improved clustering according to claim 2, characterized in that in the K-Means++ algorithm step 3 is implemented as follows: first draw a random value Random that falls within Sum(D(w)), then repeatedly apply Random -= D(w) until the value is less than or equal to 0, the point reached at that moment being the next cluster centre; in the experiments the value is taken as Random = Sum(D(w)) · λ with λ ∈ (0, 1); with this choice the value falls with larger probability into a large D(w) interval, so the corresponding point w_3 is selected as the new cluster centre with larger probability;
after each layer's weights have been clustered by the K-Means++ algorithm, the original weight values are represented by their cluster-centre values, realizing weight sharing, with multiple connections within the same layer sharing the same weight while weights are not shared across layers, which reduces the number of weights and compresses the deep neural network once more.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810075486.3A CN108304928A (en) | 2018-01-26 | 2018-01-26 | Compression method based on the deep neural network for improving cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108304928A true CN108304928A (en) | 2018-07-20 |
Family
ID=62866401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810075486.3A Pending CN108304928A (en) | 2018-01-26 | 2018-01-26 | Compression method based on the deep neural network for improving cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108304928A (en) |
Worldwide applications (2018)
- 2018-01-26: CN application CN201810075486.3A filed, published as CN108304928A (status: Pending)
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063666A (en) * | 2018-08-14 | 2018-12-21 | 电子科技大学 | Lightweight face recognition method and system based on depthwise separable convolution |
CN109522949A (en) * | 2018-11-07 | 2019-03-26 | 北京交通大学 | Target recognition model establishing method and device |
CN109522949B (en) * | 2018-11-07 | 2021-01-26 | 北京交通大学 | Target recognition model establishing method and device |
CN109523016B (en) * | 2018-11-21 | 2020-09-01 | 济南大学 | Multi-valued quantization depth neural network compression method and system for embedded system |
CN109543766A (en) * | 2018-11-28 | 2019-03-29 | 钟祥博谦信息科技有限公司 | Image processing method, electronic device and storage medium |
CN109635938A (en) * | 2018-12-29 | 2019-04-16 | 电子科技大学 | Weight quantization method for autonomous learning impulse neural network |
CN109635938B (en) * | 2018-12-29 | 2022-05-17 | 电子科技大学 | Weight quantization method for autonomous learning impulse neural network |
CN109858613A (en) * | 2019-01-22 | 2019-06-07 | 鹏城实验室 | Compression method and system of deep neural network and terminal equipment |
CN109858613B (en) * | 2019-01-22 | 2021-02-19 | 鹏城实验室 | Compression method and system of deep neural network and terminal equipment |
CN109978144B (en) * | 2019-03-29 | 2021-04-13 | 联想(北京)有限公司 | Model compression method and system |
CN109978144A (en) * | 2019-03-29 | 2019-07-05 | 联想(北京)有限公司 | Model compression method and system |
CN109993304A (en) * | 2019-04-02 | 2019-07-09 | 北京同方软件有限公司 | Detection model compression method based on semantic segmentation |
CN110288004A (en) * | 2019-05-30 | 2019-09-27 | 武汉大学 | System fault diagnosis method and device based on log semantics mining |
CN110175262A (en) * | 2019-05-31 | 2019-08-27 | 武汉斗鱼鱼乐网络科技有限公司 | Deep learning model compression method, storage medium and system based on clustering |
CN110298446A (en) * | 2019-06-28 | 2019-10-01 | 济南大学 | Deep neural network compression and acceleration method and system for embedded systems |
CN112749782A (en) * | 2019-10-31 | 2021-05-04 | 上海商汤智能科技有限公司 | Data processing method and related product |
CN110909667A (en) * | 2019-11-20 | 2020-03-24 | 北京化工大学 | Lightweight design method for multi-angle SAR target recognition network |
US11928599B2 (en) | 2019-11-29 | 2024-03-12 | Inspur Suzhou Intelligent Technology Co., Ltd. | Method and device for model compression of neural network |
WO2021103597A1 (en) * | 2019-11-29 | 2021-06-03 | 苏州浪潮智能科技有限公司 | Method and device for model compression of neural network |
CN111263163A (en) * | 2020-02-20 | 2020-06-09 | 济南浪潮高新科技投资发展有限公司 | Method for implementing a deep video compression framework on a mobile phone platform |
CN111476366B (en) * | 2020-03-16 | 2024-02-23 | 清华大学 | Model compression method and system for deep neural network |
CN113673693B (en) * | 2020-05-15 | 2024-03-12 | 宏碁股份有限公司 | Deep neural network compression method |
CN113673693A (en) * | 2020-05-15 | 2021-11-19 | 宏碁股份有限公司 | Method for deep neural network compression |
CN111723912A (en) * | 2020-06-18 | 2020-09-29 | 南强智视(厦门)科技有限公司 | Neural network decoupling method |
CN112016672A (en) * | 2020-07-16 | 2020-12-01 | 珠海欧比特宇航科技股份有限公司 | Method and medium for neural network compression based on sensitivity pruning and quantization |
CN112132024B (en) * | 2020-09-22 | 2024-02-27 | 中国农业大学 | Underwater target recognition network optimization method and device |
CN112132024A (en) * | 2020-09-22 | 2020-12-25 | 中国农业大学 | Underwater target recognition network optimization method and device |
CN113114454B (en) * | 2021-03-01 | 2022-11-29 | 暨南大学 | Efficient privacy outsourcing k-means clustering method |
CN113114454A (en) * | 2021-03-01 | 2021-07-13 | 暨南大学 | Efficient privacy outsourcing k-means clustering method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108304928A (en) | Deep neural network compression method based on improved clustering | |
CN107239825B (en) | Deep neural network compression method considering load balance | |
CN110222821B (en) | Weight distribution-based convolutional neural network low bit width quantization method | |
CN110175628A (en) | Neural network pruning and compression algorithm based on automatic search and knowledge distillation | |
CN108229681A (en) | Neural network model compression method, system, device and readable storage medium | |
CN109002889A (en) | Adaptive iterative convolutional neural network model compression method | |
CN109886406A (en) | Complex convolutional neural network compression method based on deep compression | |
CN111105035A (en) | Neural network pruning method based on combination of sparse learning and genetic algorithm | |
CN111814448B (en) | Pre-training language model quantization method and device | |
CN111626404A (en) | Deep network model compression training method based on generative adversarial networks | |
CN108197707A (en) | Convolutional neural network compression method based on global error reconstruction | |
CN112329910A (en) | Deep convolutional neural network compression method combining structured pruning and quantization | |
CN114970853A (en) | Cross-range quantization convolutional neural network compression method | |
CN113837940A (en) | Image super-resolution reconstruction method and system based on dense residual error network | |
CN109523016B (en) | Multi-valued quantization depth neural network compression method and system for embedded system | |
CN108268950A (en) | Iterative neural network quantization method and system based on vector quantization | |
Qi et al. | Learning low resource consumption cnn through pruning and quantization | |
Wang et al. | RFPruning: A retraining-free pruning method for accelerating convolutional neural networks | |
CN110263917A (en) | Neural network compression method and device | |
CN116976428A (en) | Model training method, device, equipment and storage medium | |
CN112115837A (en) | Target detection method based on YoloV3 and dual-threshold model compression | |
CN114781639A (en) | Depth model compression method for multilayer shared codebook vector quantization of edge equipment | |
CN114595802A (en) | Data compression-based impulse neural network acceleration method and device | |
CN110399975A (en) | Lithium battery deep diagnostic model compression algorithm for hardware porting | |
CN114372565A (en) | Target detection network compression method for edge device |
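The technique shared by this patent and several of the documents above is weight sharing: cluster a layer's weights with k-means and store only a small centroid codebook plus per-weight cluster indices. The sketch below is purely illustrative, not the patented "improved clustering" algorithm; it assumes linear centroid initialization (centroids evenly spaced between the minimum and maximum weight), a common choice in weight-sharing compression, and the function names `kmeans_1d` and `quantize` are hypothetical.

```python
def kmeans_1d(weights, k, iters=20):
    """1-D k-means over scalar weight values, with linear centroid
    initialization (evenly spaced between min and max weight)."""
    lo, hi = min(weights), max(weights)
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    for _ in range(iters):
        # Assignment step: put each weight in the bucket of its nearest centroid.
        buckets = [[] for _ in range(k)]
        for w in weights:
            nearest = min(range(k), key=lambda i: abs(w - centroids[i]))
            buckets[nearest].append(w)
        # Update step: move each centroid to the mean of its bucket
        # (an empty bucket keeps its previous centroid).
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    return centroids

def quantize(weights, centroids):
    """Replace each weight by the index of its nearest centroid, so the
    layer stores small codebook indices instead of full-precision floats."""
    return [min(range(len(centroids)), key=lambda i: abs(w - centroids[i]))
            for w in weights]

# Example: six weights compressed to a 3-entry codebook plus 2-bit indices.
weights = [0.01, 0.02, 0.5, 0.51, -0.3, -0.31]
codebook = kmeans_1d(weights, 3)
indices = quantize(weights, codebook)
reconstructed = [codebook[i] for i in indices]
```

The compression ratio comes from the index width: with k centroids each weight needs only ceil(log2(k)) bits plus a shared codebook of k floats, rather than 32 bits per weight.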
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-07-20 |