CN108734267A - Compression method and apparatus, terminal, and storage medium for a deep neural network model

Compression method and apparatus, terminal, and storage medium for a deep neural network model

Info

Publication number
CN108734267A
Authority
CN
China
Prior art keywords
neural network
network model
deep neural
layer
weight
Prior art date
Legal status
Pending
Application number
CN201710266595.9A
Other languages
Chinese (zh)
Inventor
赵晓辉
林福辉
Current Assignee
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CN201710266595.9A
Publication of CN108734267A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A compression method and apparatus, terminal, and storage medium for a deep neural network model. The method includes: obtaining a trained deep neural network model; and simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model, to obtain a simplified deep neural network model. The scheme can preserve both the precision and the effectiveness of a deep neural network model when the model is compressed.

Description

Compression method and apparatus, terminal, and storage medium for a deep neural network model
Technical field
The present invention relates to the field of information processing, and in particular to a compression method and apparatus, terminal, and storage medium for a deep neural network model.
Background technology
With the rapid development of research on deep neural networks, a large number of related techniques have emerged, such as convolutional neural networks applied to computer vision and recurrent neural networks applied to speech recognition and natural language processing. These neural network techniques have greatly improved processing accuracy in their respective fields.
Compared with shallow learning, deep neural networks have enormous untapped potential. The multi-level processing structure of a deep neural network model can extract and analyze the characteristic features of a sample, transforming and computing the sample's features layer by layer from shallow to deep to produce the processing result. Widening and deepening a deep neural network model generally yields comparatively better results.
However, the parameters of a deep neural network model are usually on the order of millions, tens of millions, or even hundreds of millions, which places high demands on computing and storage devices. Problems such as parameter transfer when the model is stored and computed limit the application of deep neural network models on mobile devices.
At present, compression of a deep neural network model is generally achieved by reducing its number of parameters and nodes or by changing the diversity of its parameters. Although such methods can compress a neural network model to some extent, the precision and effectiveness of the compressed deep neural network model are relatively low.
Summary of the invention
The technical problem solved by the present invention is how to preserve both the precision and the effectiveness of a deep neural network model when the model is compressed.
To solve the above technical problem, an embodiment of the present invention provides a compression method for a deep neural network model, including: obtaining a trained deep neural network model; and simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model, to obtain a simplified deep neural network model.
Optionally, simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model includes: traversing the layers of the trained deep neural network model from back to front and obtaining the retained weights of the currently traversed layer; computing the overall connection weight of the retained weights of the current layer, the overall connection weight being associated with the contribution of the current layer's parameters within the deep neural network; comparing the overall connection weight of the retained weights of the current layer with the simplification threshold corresponding to the current layer and determining the final retained weights of the current layer according to the comparison result; and taking the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
Optionally, when the deep neural network model has M layers, computing the overall connection weight of the retained weights of the currently traversed layer includes:
where O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
Optionally, determining the retained parameters of the current layer according to the comparison result includes: when the computed overall connection weight is determined to be less than or equal to the simplification threshold corresponding to the current layer, deleting the corresponding retained weight; and when the computed overall connection weight is determined to be greater than the simplification threshold corresponding to the current layer, keeping the corresponding retained weight of the current layer.
Optionally, the corresponding simplification threshold is obtained as follows:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of retained weights of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
Optionally, the method further includes: retraining the simplified deep neural network model.
An embodiment of the present invention further provides a compression apparatus for a deep neural network model, including: an acquiring unit adapted to obtain a trained deep neural network model; and a simplification unit adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model, to obtain a simplified deep neural network model.
Optionally, the simplification unit is adapted to traverse the layers of the trained deep neural network model from back to front and obtain the retained weights of the currently traversed layer; compute the overall connection weight of the retained weights of the current layer, the overall connection weight being associated with the contribution of the current layer's parameters within the deep neural network; compare the overall connection weight of the retained weights of the current layer with the simplification threshold corresponding to the current layer and determine the final retained weights of the current layer according to the comparison result; and take the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
Optionally, the simplification unit is adapted, when the deep neural network model has M layers, to compute the overall connection weight of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
Optionally, the simplification unit is adapted to delete the corresponding retained weight when the computed overall connection weight is determined to be less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the computed overall connection weight is determined to be greater than the simplification threshold corresponding to the current layer.
Optionally, the apparatus further includes: a threshold calculation unit adapted to compute the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of retained weights of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
Optionally, the apparatus further includes: a training unit adapted to retrain the simplified deep neural network model.
An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when run, perform the steps of any one of the above compression methods for a deep neural network model.
An embodiment of the present invention further provides a terminal, including a memory and a processor, the memory storing computer instructions capable of running on the processor, and the processor, when running the computer instructions, performing the steps of any one of the above compression methods for a deep neural network model.
Compared with the prior art, the technical solutions of the embodiments of the present invention have the following beneficial effects:
In the above scheme, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters within the model. The relationships among the parameters of the different layers of the deep neural network model are thereby fully exploited when compressing it, which can improve the precision and effectiveness of the compressed deep neural network model.
Further, the Overall Connection Weight (OCW) corresponding to each layer's parameters is analyzed layer by layer, from back to front, to measure the contribution of each parameter; parameters with lower contribution can then be deleted based on the distribution of parameter contributions in the current layer, thereby simplifying the parameters of the deep neural network model. Because the overall connection is taken into account, this measure reflects not only a parameter's contribution to the current layer but also its overall contribution to the output of the neural network, which further improves the effectiveness of the simplification.
Further, retraining the simplified deep neural network model can further improve the performance of the simplified model.
Description of the drawings
Fig. 1 is a schematic flowchart of a compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 3 is a schematic example of a compression method for a deep neural network model according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a compression apparatus for a deep neural network model according to an embodiment of the present invention.
Detailed description of the embodiments
As described in the background, current methods for simplifying and compressing a deep neural network model fall broadly into two classes: methods that change the density of the deep neural network model, and methods that change the diversity of its parameters.
Density-changing methods achieve compression by changing the sparsity of the neural network. Some algorithms set a small threshold and delete the small-magnitude parameters of the deep neural network model; this is rather subjective, and extensive parameter tuning is needed for networks of different structures before a good simplification result can be obtained. Other algorithms screen the input nodes according to the contribution relationship between input nodes and output responses; such algorithms can only handle single-hidden-layer neural networks and do not process the hidden-layer parameters, so they are not suitable for deep neural networks with deeper structures.
The above methods only simplify the relationships within a single layer or between parts of the network, and do not consider the deep neural network as a whole, so it is difficult to guarantee the effectiveness of the simplified model. In addition, for a large deep neural network, or a model composed of multiple deep neural networks, it is difficult to obtain the desired simplification result within a limited time.
To solve the above problems, the technical solution of the embodiments of the present invention simplifies the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model. The relationships among the parameters of the different layers are thereby fully exploited when compressing the deep neural network model, which can improve the precision of the compressed deep neural network model.
To make the above objectives, features, and beneficial effects of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a flowchart of the compression method for a deep neural network model in an embodiment of the present invention. Referring to Fig. 1, the simplification method for a deep neural network model of this embodiment may include the following steps:
Step S101: obtain a trained deep neural network model.
In a specific implementation, a fully trained deep neural network model may be used as the deep neural network model to be simplified.
Step S102: simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model, to obtain a simplified deep neural network model.
In a specific implementation, when the deep neural network model is simplified, the parameters of the model are deleted or retained layer by layer according to their overall contribution within the model. The relationships among the parameters of the different layers are thereby fully exploited when compressing the deep neural network model, which improves the precision and effectiveness of the compressed deep neural network model.
In the above scheme, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters within the model, making full use of the relationships among the parameters of the different layers and improving the precision of the compressed deep neural network model.
In a specific implementation, the parameters of each layer may be obtained layer by layer in a back-propagation manner, the connection weight value of each layer's parameters may be computed, and the computed connection weight value may be used to measure the contribution of each parameter within the deep neural network model.
Referring to Fig. 2, the simplification method for a deep neural network model in this embodiment of the present invention simplifies the parameters of the deep neural network model layer by layer, and may specifically include the following steps:
Step S201: obtain a trained deep neural network model.
Step S202: traverse all layers of the trained deep neural network model from back to front, and obtain the retained weights of the currently traversed layer.
In an embodiment of the present invention, to improve the efficiency of parameter compression, when all layers of the trained deep neural network model are traversed, the weights of each layer of the deep neural network model are obtained layer by layer in a back-propagation manner.
In a specific implementation, since the parameters of each layer of the deep neural network model are traversed layer by layer in a back-propagation manner, the retained weights of the current layer are all the weights between the nodes corresponding to the final retained weights determined for the adjacent later layer and all the nodes of the current layer.
It should be pointed out here that when the current layer is the last layer of the deep neural network model, i.e., the first layer in the back-propagation order, all the weights of that layer may be taken as the retained weights of the current layer.
Step S203: compute the overall connection weight of the retained weights of the currently traversed layer.
In a specific implementation, the overall connection weight of a weight may be used as the contribution of the parameter to the output of the deep neural network. Taking the overall connection weight of one weight of a node as an example, the overall connection weight of each node of the deep neural network may be computed using the following formulas:
And:
where M denotes the number of layers of the deep neural network model, O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
It should be pointed out here that formulas (1) to (3) describe the computation of the overall connection weight using a single weight of a node as an example; in practical applications, the overall connection weights of all the weights of a node need to be computed in order to determine whether the node is deleted or retained.
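Formulas (1) to (3) themselves are not reproduced in this text; only their variable descriptions survive. As a rough illustration of the kind of computation they describe, the NumPy sketch below evaluates, for one layer, a per-weight overall connection value under the assumption that a weight's OCW is its magnitude multiplied by the OCW of the retained downstream node it feeds, with a node's OCW taken as the sum of the OCWs of its retained outgoing weights; the propagation rule, the function name layer_ocw, and the per-node aggregation are illustrative assumptions, not the patent's exact formulas.

```python
import numpy as np

def layer_ocw(weights, next_node_ocw, next_node_keep):
    """Assumed per-weight OCW of one layer.

    weights        : (n_m, n_next) weights from this layer's nodes to the next layer's nodes
    next_node_ocw  : (n_next,) per-node OCW already computed for the next (later) layer
    next_node_keep : (n_next,) booleans, True for retained nodes of the next layer
    """
    # Weights feeding deleted downstream nodes do not take part in the computation,
    # mirroring the behaviour described for the Fig. 3 example below.
    downstream = np.where(next_node_keep, next_node_ocw, 0.0)
    per_weight_ocw = np.abs(weights) * downstream[np.newaxis, :]
    # All weights of a node are needed to decide whether the node is deleted or retained,
    # so also aggregate to a per-node OCW for this layer.
    per_node_ocw = per_weight_ocw.sum(axis=1)
    return per_weight_ocw, per_node_ocw
```

For the last layer, where the back-to-front traversal starts, the weights themselves can stand in for their own OCW, as noted above.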
Step S204: compare the overall connection weight of each retained weight of the current layer with the corresponding precision threshold, and determine the final retained weights of the current layer according to the comparison results.
In a specific implementation, a larger overall connection weight of a retained weight of the current layer indicates that the corresponding parameter contributes more to the deep neural network model; conversely, a smaller value indicates that the parameter contributes less. Deleting parameters with smaller contributions has less influence on the output of the deep neural network model. Therefore, by comparing the overall connection weight of each traversed layer's parameters with the corresponding precision threshold, deleting the parameters whose overall connection weight is less than or equal to the corresponding precision threshold, and retaining the parameters whose overall connection weight is greater than the corresponding precision threshold, the neural network model can be simplified while the precision of the simplified deep neural network model is preserved.
In a specific implementation, the precision thresholds of the layers of the deep neural network model may be the same or different. To preserve the precision of the simplified deep neural network model, the precision threshold corresponding to each layer may be computed based on the overall connection weights of that layer's parameters. In an embodiment of the present invention, the precision threshold corresponding to a layer is computed using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the precision threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of parameters of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
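The threshold formula is likewise not reproduced here; from the variable descriptions above, one plausible reading is θ_m = α·μ_m, with μ_m the mean OCW over the N_m retained weights of layer m. The helper below encodes that assumed reading; both the formula and the name layer_threshold are assumptions rather than the patent's stated definition.

```python
import numpy as np

def layer_threshold(layer_ocw_values, alpha):
    """Assumed per-layer precision threshold: theta_m = alpha * mu_m,
    where mu_m is the mean OCW over the retained weights of layer m."""
    mu_m = np.asarray(layer_ocw_values, dtype=float).mean()
    return alpha * mu_m

# Example use: retain only the weights whose OCW exceeds the layer threshold.
# ocw = np.array([0.8, 0.1, 0.5, 0.05])
# keep = ocw > layer_threshold(ocw, alpha=0.5)   # -> [True, False, True, False]
```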
Step S205: judge whether all layers of the deep neural network model have been traversed; if so, execute step S207; otherwise, execute step S206.
Step S206: obtain the retained weights of the next layer, and continue execution from step S202.
In a specific implementation, when the final retained weights of the current layer are determined, the finally retained nodes and the deleted nodes of the current layer can be determined; the retained weights of the next layer are then the weights between the nodes corresponding to the final retained weights of the current layer and all the nodes of the next layer. After the retained weights of the next layer are obtained, execution continues from step S202 to determine the final retained weights of the next layer.
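Taken together, steps S202 to S206 amount to a single back-to-front sweep over the weight matrices. The self-contained sketch below shows one way such a sweep could look; the OCW propagation rule, the threshold form θ_m = α·μ_m, and the rule that a node of the earlier layer stays alive while it has at least one retained outgoing weight are assumptions layered on the textual description, not the patent's exact algorithm.

```python
import numpy as np

def prune_back_to_front(weight_mats, alpha=0.5):
    """weight_mats[m] is the (n_m, n_{m+1}) weight matrix from layer m to layer m+1
    of an already trained network. Returns boolean masks of the retained weights."""
    keep_masks = [None] * len(weight_mats)
    node_ocw, node_keep = None, None

    for m in range(len(weight_mats) - 1, -1, -1):      # steps S202/S203: back to front
        W = weight_mats[m]
        if m == len(weight_mats) - 1:
            # Last layer: its weights stand in for their own OCW (see step S202 above).
            ocw = np.abs(W)
        else:
            # Earlier layers: weights into deleted downstream nodes are excluded (cf. Fig. 3).
            downstream = np.where(node_keep, node_ocw, 0.0)
            ocw = np.abs(W) * downstream[np.newaxis, :]

        # Step S204: per-layer threshold (assumed form theta_m = alpha * mean OCW).
        positive = ocw[ocw > 0]
        theta = alpha * positive.mean() if positive.size else 0.0
        keep_masks[m] = ocw > theta

        # Step S206: carry the surviving nodes and their OCW to the previous layer.
        node_ocw = np.where(keep_masks[m], ocw, 0.0).sum(axis=1)
        node_keep = keep_masks[m].any(axis=1)

    return keep_masks
```

The resulting masks can then be applied to zero out the deleted weights before the retraining of step S207.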
Step S207: retrain the simplified deep neural network model.
In a specific implementation, during the retraining of the simplified deep neural network model, a batch-normalization method may be used to scale and shift the normalized activations of the network, so as to improve the performance of the retrained deep neural network model.
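The text above does not spell out how the deleted weights are held out of the model during retraining. One common choice, shown here purely as an assumption rather than the patent's procedure, is to re-apply the pruning masks after every update so that deleted weights stay at zero; in the PyTorch-style sketch below, the masks mapping and the training hyper-parameters are illustrative, and any batch-normalization layers already present in the model provide the scaling and shifting of normalized activations mentioned above.

```python
import torch

def retrain_pruned(model, masks, data_loader, epochs=5, lr=1e-3):
    """Fine-tune a pruned model while keeping the deleted weights at zero.
    `masks` maps parameter names to 0/1 tensors of matching shape
    (an assumed convention, not part of the patent)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])   # keep deleted weights at zero
    return model
```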
The compression method for a deep neural network model in the embodiments of the present invention is described below with reference to a specific example.
Referring to Fig. 3, take a deep neural network model with M layers as an example, where the M-th layer of the deep neural network has two outputs, i.e., {y_k | k = 1, 2}. In the iterative full-connection computation of each layer's parameters using back-propagation:
In the first iteration, node l_{M,1} is determined to be deleted; that is, the retained nodes of layer M are l_{M,2}, l_{M,3}, and l_{M,4}.
In the second iteration, since the retained nodes of layer M determined in the first iteration are l_{M,2}, l_{M,3}, and l_{M,4}, and the deleted node is l_{M,1}, the weights between the deleted node l_{M,1} and the nodes of layer M-1 do not participate in the computation of the overall connection weights; only the overall connection weights of the weights between the nodes of layer M-1 and the retained nodes l_{M,2}, l_{M,3}, and l_{M,4} of layer M are computed. The nodes l_{M-1,2} and l_{M-1,4} are finally determined to be deleted, and the retained nodes are l_{M-1,1} and l_{M-1,3}.
In the third iteration, since the retained nodes of layer M-1 determined in the second iteration are l_{M-1,1} and l_{M-1,3}, and the deleted nodes are l_{M-1,2} and l_{M-1,4}, the weights between the deleted nodes l_{M-1,2}, l_{M-1,4} and the nodes of layer M-2 do not participate in the computation of the overall connection weights; only the overall connection weights of the weights between the retained nodes l_{M-1,1}, l_{M-1,3} of layer M-1 and the nodes of layer M-2 are computed. This continues in the same manner until all M layers have been traversed.
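As a purely structural illustration of these three iterations (not a reproduction of the figure's specific deletions, which depend on the trained weights), the prune_back_to_front sketch above can be run on a toy network whose layer M has four nodes feeding two outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
weight_mats = [rng.normal(size=(5, 4)),   # layer M-2 -> layer M-1
               rng.normal(size=(4, 4)),   # layer M-1 -> layer M (nodes l_{M,1..4})
               rng.normal(size=(4, 2))]   # layer M   -> outputs {y_1, y_2}

masks = prune_back_to_front(weight_mats, alpha=0.5)   # sketch defined above
for m, mask in enumerate(masks):
    surviving = np.flatnonzero(mask.any(axis=1))
    print(f"weight matrix {m}: surviving source nodes {surviving.tolist()}")
```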
The method in the embodiments of the present invention has been described in detail above; the apparatus corresponding to the above method is introduced below.
Referring to Fig. 4, a compression apparatus 400 for a deep neural network model in an embodiment of the present invention may include an acquiring unit 401 and a simplification unit 402, where:
the acquiring unit 401 is adapted to obtain a trained deep neural network model;
the simplification unit 402 is adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the model, to obtain a simplified deep neural network model.
In an embodiment of the present invention, the simplification unit 402 is adapted to traverse the layers of the trained deep neural network model from back to front and obtain the retained weights of the currently traversed layer; compute the overall connection weight of the retained weights of the current layer, the overall connection weight being associated with the contribution of the current layer's parameters within the deep neural network; compare the overall connection weight of the retained weights of the current layer with the simplification threshold corresponding to the current layer and determine the final retained weights of the current layer according to the comparison result; and take the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
In an embodiment of the present invention, the simplification unit 402 is adapted, when the deep neural network model has M layers, to compute the overall connection weight of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
In an embodiment of the present invention, the simplification unit 402 is adapted to delete the corresponding retained weight when the computed overall connection weight is determined to be less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the computed overall connection weight is determined to be greater than the simplification threshold corresponding to the current layer.
In an embodiment of the present invention, the apparatus 400 may further include a threshold calculation unit 403, where:
the threshold calculation unit 403 is adapted to compute the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of retained weights of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
In an embodiment of the present invention, the apparatus 400 may further include a training unit 404, where:
the training unit 404 is adapted to retrain the simplified deep neural network model.
An embodiment of the present invention further provides a computer-readable storage medium storing computer instructions which, when run, perform the steps of the compression method for a deep neural network model in the above embodiments; details are not repeated here.
An embodiment of the present invention further provides a terminal, including a memory and a processor, the memory storing computer instructions capable of running on the processor, and the processor, when running the computer instructions, performing the steps of the compression method for a deep neural network model in the above embodiments; details are not repeated here.
With the above scheme of the embodiments of the present invention, the parameters of each layer of the trained deep neural network model are simplified based on the overall contribution of the parameters within the model. The relationships among the parameters of the different layers of the deep neural network model are thereby fully exploited when compressing it, which can improve the precision and effectiveness of the compressed deep neural network model.
Further, the Overall Connection Weight (OCW) corresponding to each layer's parameters is analyzed layer by layer, from back to front, to measure the contribution of each parameter; parameters with lower contribution can then be deleted based on the distribution of parameter contributions in the current layer, thereby simplifying the parameters of the deep neural network model. Because the overall connection is taken into account, this measure reflects not only a parameter's contribution to the current layer but also its overall contribution to the output of the neural network, which further improves the effectiveness of the simplification.
Further, retraining the simplified deep neural network model can further improve the performance of the simplified model.
Those of ordinary skill in the art will understand that all or part of the steps in the various methods of the above embodiments may be completed by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.
Although the present disclosure is described as above, the present invention is not limited thereto. Any person skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be subject to the scope defined by the claims.

Claims (14)

1. A compression method for a deep neural network model, characterized by comprising:
obtaining a trained deep neural network model;
simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the deep neural network model, to obtain a simplified deep neural network model.
2. The compression method for a deep neural network model according to claim 1, characterized in that simplifying the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the deep neural network model comprises:
traversing each layer of the trained deep neural network model from back to front, and obtaining the retained weights of the currently traversed layer;
computing the overall connection weight of the retained weights of the currently traversed layer, the overall connection weight being associated with the contribution of the current layer's parameters within the deep neural network;
comparing the overall connection weight of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determining the final retained weights of the current layer according to the comparison result;
taking the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
3. The compression method for a deep neural network model according to claim 2, characterized in that when the deep neural network model has M layers, computing the overall connection weight of the retained weights of the currently traversed layer comprises:
where O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
4. The compression method for a deep neural network model according to claim 2, characterized in that determining the retained parameters of the current layer according to the comparison result comprises:
when the computed overall connection weight is determined to be less than or equal to the simplification threshold corresponding to the current layer, deleting the corresponding retained weight;
when the computed overall connection weight is determined to be greater than the simplification threshold corresponding to the current layer, keeping the corresponding retained weight of the current layer.
5. The compression method for a deep neural network model according to claim 4, characterized in that the corresponding simplification threshold is obtained as follows:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of retained weights of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
6. The compression method for a deep neural network model according to claim 1, characterized by further comprising: retraining the simplified deep neural network model.
7. A compression apparatus for a deep neural network model, characterized by comprising:
an acquiring unit adapted to obtain a trained deep neural network model;
a simplification unit adapted to simplify the parameters of each layer of the trained deep neural network model based on the overall contribution of the parameters within the deep neural network model, to obtain a simplified deep neural network model.
8. The compression apparatus for a deep neural network model according to claim 7, characterized in that the simplification unit is adapted to traverse each layer of the trained deep neural network model from back to front and obtain the retained weights of the currently traversed layer; compute the overall connection weight of the retained weights of the currently traversed layer, the overall connection weight being associated with the contribution of the current layer's parameters within the deep neural network; compare the overall connection weight of the retained weights of the current layer with the simplification threshold corresponding to the current layer, and determine the final retained weights of the current layer according to the comparison result; and take the weights in the next layer that correspond to the final retained weights of the current layer as the retained weights of the next layer, until all layers of the deep neural network model have been traversed.
9. The compression apparatus for a deep neural network model according to claim 8, characterized in that the simplification unit is adapted, when the deep neural network model has M layers, to compute the overall connection weight of each retained weight of the currently traversed layer using the following formula:
where O_{M,n} denotes the overall connection weight of the n-th retained weight of layer M, n denotes the index of the retained weight of layer M, N_m denotes the number of retained weights of layer M, o_{M,n} denotes the overall connection weight of the n-th retained weight of the currently traversed layer among layers 1 to (M-1), n denotes the index of the retained weight of the currently traversed layer among layers 1 to (M-1), and K denotes the number of retained weights of the currently traversed layer among layers 1 to (M-1).
10. The compression apparatus for a deep neural network model according to claim 8, characterized in that the simplification unit is adapted to delete the corresponding retained weight when the computed overall connection weight is determined to be less than or equal to the simplification threshold corresponding to the current layer, and to keep the corresponding retained weight of the current layer when the computed overall connection weight is determined to be greater than the simplification threshold corresponding to the current layer.
11. The compression apparatus for a deep neural network model according to claim 10, characterized by further comprising: a threshold calculation unit adapted to compute the corresponding simplification threshold using the following formula:
where m denotes the m-th layer of the deep neural network model, θ_m denotes the simplification threshold corresponding to layer m of the deep neural network model, α denotes a preset quality parameter, N_m denotes the number of retained weights of layer m of the deep neural network model, o_{m,n} denotes the overall connection weight of the n-th retained weight of layer m of the deep neural network model, and μ_m denotes the mean of the overall connection weights of the retained weights of layer m of the deep neural network model.
12. The compression apparatus for a deep neural network model according to claim 7, characterized by further comprising: a training unit adapted to retrain the simplified deep neural network model.
13. A computer-readable storage medium storing computer instructions, characterized in that the computer instructions, when run, perform the steps of the compression method for a deep neural network model according to any one of claims 1 to 6.
14. A terminal, characterized by comprising a memory and a processor, the memory storing computer instructions capable of running on the processor, wherein the processor, when running the computer instructions, performs the steps of the compression method for a deep neural network model according to any one of claims 1 to 6.
CN201710266595.9A 2017-04-21 2017-04-21 Compression method and apparatus, terminal, and storage medium for a deep neural network model Pending CN108734267A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710266595.9A CN108734267A (en) 2017-04-21 2017-04-21 Compression method and apparatus, terminal, and storage medium for a deep neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710266595.9A CN108734267A (en) 2017-04-21 2017-04-21 Compression method and apparatus, terminal, and storage medium for a deep neural network model

Publications (1)

Publication Number Publication Date
CN108734267A true CN108734267A (en) 2018-11-02

Family

ID=63933513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710266595.9A Pending CN108734267A (en) 2017-04-21 2017-04-21 Compression method and apparatus, terminal, and storage medium for a deep neural network model

Country Status (1)

Country Link
CN (1) CN108734267A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112749797A (en) * 2020-07-20 2021-05-04 腾讯科技(深圳)有限公司 Pruning method and device for neural network model
US11436442B2 (en) * 2019-11-21 2022-09-06 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof


Similar Documents

Publication Publication Date Title
US20200311552A1 (en) Device and method for compressing machine learning model
CN109948029B (en) Neural network self-adaptive depth Hash image searching method
CN108734264A (en) Deep neural network model compression method and device, storage medium, terminal
CN110334580A (en) The equipment fault classification method of changeable weight combination based on integrated increment
CN111178520B (en) Method and device for constructing neural network
CN110188863B (en) Convolution kernel compression method of convolution neural network suitable for resource-limited equipment
CN111079899A (en) Neural network model compression method, system, device and medium
WO2020237904A1 (en) Neural network compression method based on power exponent quantization
CN109740734B (en) Image classification method of convolutional neural network by optimizing spatial arrangement of neurons
CN109067427B (en) A kind of frequency hop sequences prediction technique based on Optimization-type wavelet neural network
CN114154646A (en) Efficiency optimization method for federal learning in mobile edge network
CN107292458A A prediction method and prediction device applied to a neural network chip
CN109388779A (en) A kind of neural network weight quantization method and neural network weight quantization device
CN108734266A Compression method and apparatus, terminal, and storage medium for a deep neural network model
CN112153617B (en) Terminal equipment transmission power control method based on integrated neural network
CN114781650B (en) Data processing method, device, equipment and storage medium
CN108734287A Compression method and apparatus, terminal, and storage medium for a deep neural network model
CN112436992B (en) Virtual network mapping method and device based on graph convolution network
CN107169566A (en) Dynamic neural network model training method and device
CN109754122A (en) A kind of Numerical Predicting Method of the BP neural network based on random forest feature extraction
CN115392441A (en) Method, apparatus, device and medium for on-chip adaptation of quantized neural network model
CN108734267A Compression method and apparatus, terminal, and storage medium for a deep neural network model
CN116957106A (en) Federal learning model training method based on dynamic attention mechanism
CN117574429A (en) Federal deep learning method for privacy enhancement in edge computing network
CN115775026A (en) Federated learning method based on organization similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181102