CN108734267A - Compression method and device, terminal, and storage medium for a deep neural network model - Google Patents
Compression method and device, terminal, and storage medium for a deep neural network model
- Publication number
- CN108734267A CN108734267A CN201710266595.9A CN201710266595A CN108734267A CN 108734267 A CN108734267 A CN 108734267A CN 201710266595 A CN201710266595 A CN 201710266595A CN 108734267 A CN108734267 A CN 108734267A
- Authority
- CN
- China
- Prior art keywords
- neural network
- network model
- deep neural
- layer
- weight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
A compression method and device, a terminal, and a storage medium for a deep neural network model. The method includes: obtaining a trained deep neural network model; and simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model. The above scheme can balance the precision and effectiveness of a deep neural network model while compressing it.
Description
Technical field
The present invention relates to the field of information processing technologies, and in particular to a compression method and device, a terminal, and a storage medium for a deep neural network model.
Background
With the rapid development of deep neural network research, a large number of deep-neural-network-based techniques have emerged in related fields, such as convolutional neural networks applied to computer vision and recurrent neural networks applied to speech recognition and natural language processing. These techniques have greatly improved processing accuracy in their respective fields.
Compared with shallow learning, deep neural networks have enormous potential. The multi-level processing structure of a deep neural network model can extract and analyze representative features of a sample, transforming and computing sample features layer by layer, from shallow to deep, to produce the processing result. By widening and deepening a deep neural network model, comparatively better processing results can be obtained.
However, the number of parameters of a deep neural network model is typically on the order of millions, tens of millions, or even hundreds of millions, which places high demands on computing and storage devices. Problems such as parameter transfer during storage and computation restrict the application of deep neural network models on mobile devices.
At present, compression of a deep neural network model is usually achieved by reducing the number of parameters and nodes of the model or by changing its numerical diversity. Although such methods can compress the neural network model to some extent, the precision and effectiveness of the compressed deep neural network model are relatively low.
Summary of the invention
The technical problem solved by the present invention is how to balance the precision and effectiveness of a deep neural network model while compressing it.
To solve the above technical problem, an embodiment of the present invention provides a compression method for a deep neural network model, including: obtaining a trained deep neural network model; and simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
Optionally, simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model includes: traversing the layers of the trained deep neural network model in back-to-front order and obtaining the retained weights of the currently traversed layer; calculating the overall connection values of the retained weights of the currently traversed layer, where the overall connection value is associated with the contribution of the parameters of the current layer in the deep neural network; comparing the overall connection values of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determining the final retained weights of the current layer according to the comparison results; and obtaining the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
Optionally, when the deep neural network model has M layers, calculating the overall connection values of the retained weights of the currently traversed layer includes:
where O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
Optionally, determining the retained weights of the current layer according to the comparison results includes: when the calculated overall connection value is less than or equal to the simplification threshold corresponding to the current layer, deleting the corresponding retained weight; and when the calculated overall connection value is greater than the simplification threshold corresponding to the current layer, keeping the corresponding retained weight of the current layer.
Optionally, the corresponding simplification threshold is obtained as follows:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of retained weights in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
Optionally, the method further includes: retraining the simplified deep neural network model.
An embodiment of the present invention further provides a compression device for a deep neural network model, including: an acquiring unit adapted to obtain a trained deep neural network model; and a simplification unit adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
Optionally, the simplification unit is adapted to traverse the layers of the trained deep neural network model in back-to-front order and obtain the retained weights of the currently traversed layer; calculate the overall connection values of the retained weights of the currently traversed layer, where the overall connection value is associated with the contribution of the parameters of the current layer in the deep neural network; compare the overall connection values of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determine the final retained weights of the current layer according to the comparison results; and obtain the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
Optionally, the simplification unit is adapted to, when the deep neural network model has M layers, calculate the overall connection value of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
Optionally, the simplification unit is adapted to delete the corresponding retained weight when the calculated overall connection value is less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the calculated overall connection value is greater than the simplification threshold corresponding to the current layer.
Optionally, the device further includes: a threshold calculation unit adapted to calculate the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of retained weights in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
Optionally, the device further includes: a training unit adapted to retrain the simplified deep neural network model.
An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when executed, perform the steps of any of the above compression methods for a deep neural network model.
An embodiment of the present invention further provides a terminal, including a memory and a processor, where the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the steps of any of the above compression methods for a deep neural network model.
Compared with the prior art, the technical solutions of the embodiments of the present invention have the following advantages:
In the above scheme, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters in the deep neural network model. This makes full use of the relationships between the parameters of the layers when compressing the model, and can therefore improve the precision and effectiveness of the compressed deep neural network model.
Further, the overall connection weight (Overall Connection Weight, OCW) corresponding to the parameters of each layer of the neural network is analyzed layer by layer in back-to-front order and used to measure the contribution of each parameter. Parameters with lower contribution can be deleted based on the distribution of the contributions of the parameters in the current layer, thereby simplifying the parameters of the deep neural network model. Because the overall connection is taken into account, the measure reflects not only a parameter's contribution to the current layer but also its overall contribution to the network output, which further improves the effectiveness of the simplification.
Further, retraining the simplified deep neural network model can further improve the performance of the simplified model.
Description of the drawings
Fig. 1 is a schematic flowchart of a compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an example of a compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a compression device for a deep neural network model according to an embodiment of the present invention.
Detailed description
As described in the background, current methods for simplifying and compressing deep neural network models fall broadly into two categories: methods that change the density of the model's parameters, and methods that change the diversity of the model's parameters.
Density-changing methods compress the model by changing the sparsity of the neural network. Some algorithms simply delete small-magnitude parameters below a small given threshold; this is rather subjective, and considerable parameter tuning is required for neural networks of different structures to obtain a good simplification result. Other algorithms screen input nodes according to the contribution relationship between input nodes and output responses; such algorithms only handle single-hidden-layer neural networks and do not process hidden-layer parameters, so they are unsuitable for deeper network structures.
The above methods only simplify relationships within a single layer or between a subset of layers, without considering the deep neural network as a whole, so it is difficult to guarantee the effectiveness of the simplified model. In addition, for a large deep neural network or a model composed of multiple deep neural networks, it is difficult to obtain the desired simplification result within a limited time.
To solve the above problems, the technical solution of the embodiments of the present invention simplifies the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model. This makes full use of the relationships between the parameters of the layers when compressing the model, and can improve the precision of the compressed deep neural network model.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a compression method for a deep neural network model in an embodiment of the present invention. Referring to Fig. 1, the simplification method for the deep neural network model in this embodiment may include the following steps:
Step S101: obtain a trained deep neural network model.
In a specific implementation, a deep neural network model whose training has been completed may be used as the deep neural network model to be simplified.
Step S102: simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
In a specific implementation, when simplifying the deep neural network model, the parameters of the model are deleted or retained layer by layer according to their overall contribution in the deep neural network model. This makes full use of the relationships between the parameters of the layers when compressing the model, and can therefore improve the precision and effectiveness of the compressed deep neural network model.
In the above scheme, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters in the deep neural network model, making full use of the relationships between the parameters of the layers and improving the precision of the compressed deep neural network model.
In a specific implementation, the parameters of each layer may be obtained layer by layer in a back-propagation manner, the connection value of each parameter may be calculated, and the calculated connection value may be used to measure the contribution of the parameter in the deep neural network model.
Referring to Fig. 2, the simplification method for the deep neural network model in an embodiment of the present invention simplifies the parameters of the deep neural network model layer by layer, and may specifically include the following steps:
Step S201: obtain a deep neural network model whose training has been completed.
Step S202: traverse all layers of the trained deep neural network model in back-to-front order, and obtain the retained weights of the currently traversed layer.
In an embodiment of the present invention, in order to improve the efficiency of parameter compression, when traversing all layers of the trained deep neural network model, the weights of all layers are obtained layer by layer in a back-propagation manner.
In a specific implementation, since the parameters of each layer of the deep neural network model are traversed layer by layer in a back-propagation manner, the retained weights of the current layer are all weights between the nodes of the current layer and the nodes corresponding to the final retained weights determined for the adjacent later layer.
It should be pointed out that when the current layer is the last layer of the deep neural network model, that is, the first layer of the backward traversal, all weights of that layer may be taken as the retained weights of the current layer.
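By way of illustration only, the following Python sketch shows one possible way of selecting the retained weights of the currently traversed layer in step S202; the matrix layout, the function name, and the use of numpy are assumptions of this sketch and are not specified by the embodiment.

```python
import numpy as np

def retained_weights(w_between, retained_later_nodes=None):
    """Select the retained weights of the currently traversed layer (step S202).

    w_between           : (n_later, n_current) weight matrix between the nodes of
                          the adjacent later layer (rows) and the nodes of the
                          current layer (columns).
    retained_later_nodes: boolean mask over the later layer's nodes, or None when
                          the current layer is the last layer, in which case all
                          of the layer's weights are taken as retained.
    """
    if retained_later_nodes is None:
        return w_between
    # Weights attached to deleted later-layer nodes drop out of the traversal.
    return w_between[retained_later_nodes]
```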
Step S203: calculate the overall connection values of the retained weights of the currently traversed layer.
In a specific implementation, the overall connection value of a weight may be used to measure the parameter's contribution to the output of the deep neural network. Taking the overall connection value of one weight of a node as an example, the overall connection values of the nodes of each layer of the deep neural network may be calculated using the following formulas:
and:
where M denotes the number of layers of the deep neural network model, O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
It should be pointed out that formulas (1) to (3) describe the calculation of the overall connection value of a single weight of a node as an example; in practice, the overall connection values of all weights of a node need to be calculated in order to determine whether the node is deleted or retained.
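Formulas (1) to (3) appear as images in the original publication and are not reproduced here. Purely as an illustration, the following Python sketch shows one backward accumulation that is consistent with the quantities defined above (weight magnitudes combined with the overall connection values already computed for the later layers); the exact combination rule of formulas (1) to (3) may differ, and the function names and the use of numpy are assumptions of this sketch.

```python
import numpy as np

def layer_ocw(weights, downstream_ocw=None):
    """Illustrative overall connection value (OCW) per retained weight of a layer.

    weights        : (n_out, n_in) retained weights of the layer, with deleted
                     weights already set to zero.
    downstream_ocw : per-output-node OCW accumulated from the later layers,
                     or None when the layer is the last layer of the network.
    """
    magnitude = np.abs(weights)
    if downstream_ocw is None:
        # Last layer: take the weight magnitude itself as the OCW.
        return magnitude
    # Earlier layers: scale each magnitude by the accumulated OCW of the
    # output node it feeds, so the value reflects the contribution to the
    # final network output rather than to the next layer only.
    return magnitude * downstream_ocw[:, None]

def node_ocw(per_weight_ocw):
    """Aggregate per-weight OCW into per-input-node values, to be used as the
    downstream OCW when the traversal moves one layer further back."""
    return per_weight_ocw.sum(axis=0)
```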
Step S204: compare the overall connection values of the retained weights of the current layer with the corresponding simplification threshold, and determine the final retained weights of the current layer according to the comparison results.
In a specific implementation, a larger overall connection value of a retained weight of the current layer indicates that the corresponding parameter contributes more to the deep neural network model; conversely, the parameter contributes less. Deleting parameters with smaller contributions has less influence on the output of the deep neural network model. Therefore, by comparing the overall connection value of each traversed parameter with the corresponding threshold, deleting the parameters whose overall connection value is less than or equal to the corresponding threshold, and retaining the parameters whose overall connection value is greater than the corresponding threshold, the precision of the simplified deep neural network model can be improved while the model is simplified.
In a specific implementation, the thresholds of the layers in the deep neural network model may be the same or different. In order to preserve the precision of the simplified deep neural network model, the threshold corresponding to each layer may be calculated based on the overall connection values of the parameters of that layer. In an embodiment of the present invention, the threshold corresponding to a layer is calculated using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of parameters in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
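The threshold formula itself is likewise given as an image in the original. Assuming, for illustration only, that the threshold θ_m is the mean overall connection value μ_m of the layer's retained weights scaled by the preset quality parameter α (one natural reading of the quantities defined above), a minimal sketch is:

```python
import numpy as np

def layer_threshold(ocw_values, alpha=1.0):
    """Illustrative per-layer simplification threshold theta_m.

    ocw_values : 1-D array of the overall connection values o_{m,n} of the
                 layer's retained weights (length N_m).
    alpha      : preset quality parameter; larger alpha prunes more aggressively.
    """
    mu_m = ocw_values.mean()   # mean OCW of the layer's retained weights, mu_m
    return alpha * mu_m        # assumed form: theta_m = alpha * mu_m

def keep_mask(ocw_values, theta_m):
    """Keep a weight only if its OCW is strictly greater than the threshold."""
    return ocw_values > theta_m
```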
Step S205: judge whether all layers of the deep neural network model have been traversed; if so, step S207 may be executed; otherwise, step S206 may be executed.
Step S206: obtain the retained weights of the next layer, and continue from step S202.
In a specific implementation, once the final retained weights of the current layer are determined, the final retained nodes and deleted nodes of the current layer can be determined, and the retained weights of the next layer are the weights between the nodes corresponding to the final retained weights of the current layer and all nodes of the next layer. After the retained weights of the next layer are obtained, execution may continue from step S202 to determine the final retained weights of the next layer.
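Tying steps S202 to S206 together, a minimal end-to-end sketch of the back-to-front pruning loop might look as follows; the overall-connection accumulation rule and the threshold form are the same illustrative assumptions used in the sketches above, and the function name prune_network is hypothetical.

```python
import numpy as np

def prune_network(layer_weights, alpha=1.0):
    """Illustrative back-to-front pruning loop covering steps S202 to S206.

    layer_weights : list of (n_out, n_in) weight matrices ordered from the first
                    layer to the last layer of the network.
    Returns one boolean mask per layer (True = weight retained).
    """
    masks = [np.ones_like(w, dtype=bool) for w in layer_weights]
    downstream = None  # accumulated per-node OCW; None for the last layer
    for m in reversed(range(len(layer_weights))):
        w = layer_weights[m]
        # Overall connection values of the layer's weights (assumed rule):
        # weight magnitude, scaled by the downstream node OCW if any.
        ocw = np.abs(w) if downstream is None else np.abs(w) * downstream[:, None]
        # Per-layer threshold (assumed form): mean retained OCW scaled by alpha.
        theta = alpha * ocw[masks[m]].mean()
        # Delete weights whose OCW is <= theta, keep the others (step S204).
        masks[m] = masks[m] & (ocw > theta)
        # Only retained weights pass their OCW on to the next (earlier) layer,
        # so weights attached to effectively deleted nodes stop contributing.
        downstream = (ocw * masks[m]).sum(axis=0)
    return masks
```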
Step S207: retrain the simplified deep neural network model.
In a specific implementation, during retraining of the simplified deep neural network model, batch normalization may be used to scale and shift (standardize) the network, so as to improve the performance of the retrained deep neural network model.
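A minimal retraining sketch is given below. It uses PyTorch and batch normalization layers; the choice of framework, the example network, and the masking of deleted weights during fine-tuning are assumptions of this sketch rather than requirements of the embodiment.

```python
import torch
import torch.nn as nn

# Hypothetical pruned network; BatchNorm1d provides the scaling/shifting
# standardization mentioned above in step S207.
model = nn.Sequential(
    nn.Linear(784, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 10),
)

def retrain(model, masks, loader, epochs=5, lr=1e-3):
    """Fine-tune the simplified model while keeping deleted weights at zero.

    masks : dict mapping each nn.Linear module to a boolean weight mask
            (True = retained), as produced by the pruning step.
    loader: iterable of (inputs, labels) mini-batches.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            # Re-apply the masks so that pruned weights stay deleted.
            with torch.no_grad():
                for linear, mask in masks.items():
                    linear.weight.mul_(mask.to(linear.weight.dtype))
    return model
```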
The compression method for the deep neural network model in the embodiment of the present invention is described below with reference to a specific example.
Referring to Fig. 3, take a deep neural network model with M layers as an example, where the M-layer deep neural network has two outputs, i.e. {y_k | k = 1, 2}. The fully connected iterative calculation over each layer's parameters using back-propagation proceeds as follows:
In the first iteration, the deleted node l_{M,1} is determined, i.e. the retained nodes of layer M are l_{M,2}, l_{M,3}, and l_{M,4}.
In the second iteration, since the first iteration determined that the retained nodes of layer M are l_{M,2}, l_{M,3}, and l_{M,4} and the deleted node is l_{M,1}, the weights between the deleted node l_{M,1} and the nodes of layer M-1 do not participate in the calculation of the overall connection values; only the overall connection values of the weights between the nodes of layer M-1 and the retained nodes l_{M,2}, l_{M,3}, and l_{M,4} of layer M are calculated. The deleted nodes are finally determined to be l_{M-1,2} and l_{M-1,4}, and the retained nodes are l_{M-1,1} and l_{M-1,3}.
In the third iteration, since the second iteration determined that the retained nodes of layer M-1 are l_{M-1,1} and l_{M-1,3} and the deleted nodes are l_{M-1,2} and l_{M-1,4}, the weights between the deleted nodes l_{M-1,2}, l_{M-1,4} and the nodes of layer M-2 do not participate in the calculation of the overall connection values; only the overall connection values of the weights between the retained nodes l_{M-1,1}, l_{M-1,3} of layer M-1 and the nodes of layer M-2 are calculated. This continues in the same way until all M layers have been traversed.
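The following toy trace mirrors the Fig. 3 example with hypothetical random weights: once node l_{M,1} is deleted in the first iteration, its weights contribute nothing to the overall connection values computed in the second iteration. All numbers and the accumulation rule are illustrative assumptions, not values from the original disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
w_out = rng.normal(size=(2, 4))   # outputs {y1, y2}   <- layer-M nodes l_{M,1..4}
w_m   = rng.normal(size=(4, 5))   # layer-M nodes      <- layer-(M-1) nodes

# Iteration 1: suppose node l_{M,1} is deleted and l_{M,2..4} are retained.
retained_M = np.array([False, True, True, True])

# Iteration 2: weights attached to the deleted node l_{M,1} no longer take part,
# so layer M-1's overall connection values accumulate only through l_{M,2..4}.
ocw_out  = np.abs(w_out)                       # last-layer OCW (assumed: magnitudes)
node_ocw = ocw_out.sum(axis=0) * retained_M    # zero for the deleted node
ocw_m1   = np.abs(w_m) * node_ocw[:, None]     # row for l_{M,1} is all zeros
print(ocw_m1.round(2))
```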
The method in the embodiment of the present invention has been described in detail above; the device corresponding to the above method is introduced below.
Referring to Fig. 4, a compression device 400 for a deep neural network model in an embodiment of the present invention may include an acquiring unit 401 and a simplification unit 402, where:
the acquiring unit 401 is adapted to obtain a trained deep neural network model;
the simplification unit 402 is adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
In an embodiment of the present invention, the simplification unit 402 is adapted to traverse the layers of the trained deep neural network model in back-to-front order and obtain the retained weights of the currently traversed layer; calculate the overall connection values of the retained weights of the currently traversed layer, where the overall connection value is associated with the contribution of the parameters of the current layer in the deep neural network; compare the overall connection values of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determine the final retained weights of the current layer according to the comparison results; and obtain the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
In an embodiment of the present invention, the simplification unit 402 is adapted to, when the deep neural network model has M layers, calculate the overall connection value of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
In an embodiment of the present invention, the simplification unit 402 is adapted to delete the corresponding retained weight when the calculated overall connection value is less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the calculated overall connection value is greater than the simplification threshold corresponding to the current layer.
In an embodiment of the present invention, the device 400 may further include a threshold calculation unit 403, where:
the threshold calculation unit 403 is adapted to calculate the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of retained weights in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
In an embodiment of the present invention, the device 400 may further include a training unit 404, where:
the training unit 404 is adapted to retrain the simplified deep neural network model.
An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when executed, perform the steps of the compression method for a deep neural network model in the above embodiments, which are not repeated here.
An embodiment of the present invention further provides a terminal, including a memory and a processor, where the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the steps of the compression method for a deep neural network model in the above embodiments, which are not repeated here.
With the above scheme of the embodiments of the present invention, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters in the deep neural network model, making full use of the relationships between the parameters of the layers when compressing the model, and improving the precision and effectiveness of the compressed deep neural network model.
Further, the overall connection weight (Overall Connection Weight, OCW) corresponding to the parameters of each layer of the neural network is analyzed layer by layer in back-to-front order and used to measure the contribution of each parameter. Parameters with lower contribution can be deleted based on the distribution of the contributions of the parameters in the current layer, thereby simplifying the parameters of the deep neural network model. Because the overall connection is taken into account, the measure reflects not only a parameter's contribution to the current layer but also its overall contribution to the network output, which further improves the effectiveness of the simplification.
Further, retraining the simplified deep neural network model can further improve the performance of the simplified model.
Those of ordinary skill in the art will understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing related hardware, and the program can be stored in a computer-readable storage medium, which may include: ROM, RAM, a magnetic disk, an optical disc, and the like.
Although the present invention is disclosed as above, the present invention is not limited thereto. Any person skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention, and therefore the protection scope of the present invention shall be subject to the scope defined by the claims.
Claims (14)
1. A compression method for a deep neural network model, characterized by comprising:
obtaining a trained deep neural network model;
simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
2. The compression method for a deep neural network model according to claim 1, characterized in that simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model comprises:
traversing the layers of the trained deep neural network model in back-to-front order, and obtaining the retained weights of the currently traversed layer;
calculating the overall connection values of the retained weights of the currently traversed layer, wherein the overall connection value is associated with the contribution of the parameters of the current layer in the deep neural network;
comparing the overall connection values of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determining the final retained weights of the current layer according to the comparison results;
obtaining the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
3. The compression method for a deep neural network model according to claim 2, characterized in that when the deep neural network model has M layers, calculating the overall connection values of the retained weights of the currently traversed layer comprises:
where O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
4. The compression method for a deep neural network model according to claim 2, characterized in that determining the retained weights of the current layer according to the comparison results comprises:
when the calculated overall connection value is less than or equal to the simplification threshold corresponding to the current layer, deleting the corresponding retained weight;
when the calculated overall connection value is greater than the simplification threshold corresponding to the current layer, keeping the corresponding retained weight of the current layer.
5. The compression method for a deep neural network model according to claim 4, characterized in that the corresponding simplification threshold is obtained as follows:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of retained weights in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
6. The compression method for a deep neural network model according to claim 1, characterized by further comprising: retraining the simplified deep neural network model.
7. A compression device for a deep neural network model, characterized by comprising:
an acquiring unit adapted to obtain a trained deep neural network model;
a simplification unit adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters in the deep neural network model, to obtain a simplified deep neural network model.
8. The compression device for a deep neural network model according to claim 7, characterized in that the simplification unit is adapted to traverse the layers of the trained deep neural network model in back-to-front order and obtain the retained weights of the currently traversed layer; calculate the overall connection values of the retained weights of the currently traversed layer, wherein the overall connection value is associated with the contribution of the parameters of the current layer in the deep neural network; compare the overall connection values of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determine the final retained weights of the current layer according to the comparison results; and obtain the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
9. The compression device for a deep neural network model according to claim 8, characterized in that the simplification unit is adapted to, when the deep neural network model has M layers, calculate the overall connection value of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection value of the n-th retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection value of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
10. The compression device for a deep neural network model according to claim 8, characterized in that the simplification unit is adapted to delete the corresponding retained weight when the calculated overall connection value is less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the calculated overall connection value is greater than the simplification threshold corresponding to the current layer.
11. The compression device for a deep neural network model according to claim 10, characterized by further comprising: a threshold calculation unit adapted to calculate the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to the m-th layer, α denotes a preset quality parameter, N_m denotes the number of retained weights in the m-th layer, o_{m,n} denotes the overall connection value of the n-th retained weight of the m-th layer, and μ_m denotes the mean of the overall connection values of the retained weights of the m-th layer.
12. The compression device for a deep neural network model according to claim 7, characterized by further comprising: a training unit adapted to retrain the simplified deep neural network model.
13. A computer-readable storage medium storing computer instructions, characterized in that the computer instructions, when executed, perform the steps of the compression method for a deep neural network model according to any one of claims 1 to 6.
14. A terminal, characterized by comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the steps of the compression method for a deep neural network model according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710266595.9A CN108734267A (en) | 2017-04-21 | 2017-04-21 | Compression method and device, terminal, and storage medium for a deep neural network model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108734267A true CN108734267A (en) | 2018-11-02 |
Family
ID=63933513
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710266595.9A Pending CN108734267A (en) | 2017-04-21 | 2017-04-21 | Compression method and device, terminal, and storage medium for a deep neural network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108734267A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112749797A (en) * | 2020-07-20 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Pruning method and device for neural network model |
US11436442B2 (en) * | 2019-11-21 | 2022-09-06 | Samsung Electronics Co., Ltd. | Electronic apparatus and control method thereof |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181102 |