CN110490323A - Network model compression method, device, storage medium and computer equipment - Google Patents
- Publication number
- CN110490323A (application number CN201910770193.1A)
- Authority
- CN
- China
- Prior art keywords
- network model
- network
- replacement
- compressed
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This application relates to a network model compression method, apparatus, storage medium and computer device. The method comprises: obtaining a network model to be compressed; determining a currently pending structure in the network model to be compressed; searching a search space for a target network structure and replacing the currently pending structure with the target network structure; evaluating the model effect of the replaced network model using an L1-regularized loss function; and, when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and taking the finally obtained replaced network model as the compressed network model. By compressing the network model through neural architecture search, the accuracy of the network model can be preserved; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy in the network model.
Description
Technical field
This application relates to the field of computer technology, and in particular to a network model compression method, apparatus, storage medium and computer device.
Background technique
With advances in science and technology, deep learning technology based on neural network models has developed rapidly and has achieved breakthrough results in many application fields, including image recognition, object detection, semantic segmentation, speech recognition and natural language processing.
However, deep neural network models contain an enormous number of parameters, which makes the model structure redundant and leads to high memory consumption.
Summary of the invention
In view of the above, it is necessary to address the technical problems in the prior art by providing a network model compression method, apparatus, storage medium and computer device that can effectively mitigate the structural redundancy of network models.
A network model compression method, comprising:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
A network model compression apparatus, comprising:
a model obtaining module, configured to obtain a network model to be compressed;
a structure determination module, configured to determine a currently pending structure in the network model to be compressed;
a structure replacement module, configured to search a search space for a target network structure and replace the currently pending structure with the target network structure to obtain a replaced network model; and
an effect evaluation module, configured to evaluate the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, to return to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and to take the finally obtained replaced network model as the compressed network model.
A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the following steps:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
With the above network model compression method, apparatus, storage medium and computer device, a network model to be compressed is obtained; a currently pending structure in the network model is determined; a target network structure is searched from a search space and replaces the currently pending structure to obtain a replaced network model; the model effect of the replaced network model is evaluated using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, the method returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and the finally obtained replaced network model is taken as the compressed network model. Compressing the network model through neural architecture search preserves the accuracy of the network model; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the network model compression method in one embodiment;
Fig. 2 is a flow diagram of the network model compression method in one embodiment;
Fig. 3 is a schematic diagram of the order in which currently pending structures are determined in one embodiment;
Fig. 4 is a schematic diagram of network layers and nodes in one embodiment;
Fig. 5 is a schematic diagram of the order in which currently pending structures are determined in another embodiment;
Fig. 6 is a schematic diagram of the processing flow when the target network structure includes a dropout unit in one embodiment;
Fig. 7 is a flow diagram of evaluating the model effect of the replaced network model using an L1-regularized loss function in one embodiment;
Fig. 8 is a flow diagram of evaluating the model effect of the trained network model using an L1-regularized loss function in one embodiment;
Fig. 9 is a structural block diagram of the network model compression apparatus in one embodiment;
Fig. 10 is a structural block diagram of the computer device in one embodiment.
Specific embodiment
In order to make the objectives, technical solutions and advantages of this application clearer, the application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the application, not to limit it.
Fig. 1 is a diagram of the application environment of the network model compression method in one embodiment. The network model compression method provided by this application can be applied in the environment shown in Fig. 1, in which the terminal 10 and the server 20 communicate over a network. The network model compression method of one embodiment may run on the server 20: the terminal 10 sends a network model to be compressed to the server 20; the server 20 obtains the network model to be compressed; determines a currently pending structure in the network model; searches a search space for a target network structure and replaces the currently pending structure with it to obtain a replaced network model; evaluates the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and takes the finally obtained replaced network model as the compressed network model. The terminal 10 may be a desktop terminal or a mobile terminal, such as a desktop computer, tablet computer, laptop computer or smartphone. The server 20 may be an independent physical server, a cluster of physical servers, or a virtual server.
In another embodiment, the network model compression method may run on the terminal 10: the terminal 10 obtains a network model to be compressed; determines a currently pending structure in the network model; searches a search space for a target network structure and replaces the currently pending structure with it to obtain a replaced network model; evaluates the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and takes the finally obtained replaced network model as the compressed network model.
As shown in Fig. 2, in one embodiment a network model compression method is provided. This embodiment is described mainly by applying the method to the terminal 10 (or server 20) in Fig. 1. Referring to Fig. 2, the network model compression method specifically comprises the following steps:
S110: obtain the network model to be compressed.
The network model to be compressed may be a network model whose structure is complex and redundant due to an enormous number of parameters, or any network model that needs to be compressed. In general, the size of a network is proportional to its effectiveness: the more parameters a model has, the larger its size and the higher its accuracy. For example, some high-performing deep learning models may have millions or even hundreds of millions of parameters. However, an enormous number of parameters also incurs high computational cost, which limits the deployment of network models on resource-constrained computing platforms such as field-programmable gate arrays (Field-Programmable Gate Array, FPGA), reduced instruction set computers (Reduced Instruction Set Computer, RISC) and ARM microprocessors (Advanced RISC Machine, ARM).
For this reason, the network model needs to be compressed to reduce the complexity of its structure. Specifically, network model compression refers to the process of removing redundant parameters and channels from a network model. For example, the basic units of a network model are network layers; by function, network layers can be divided into convolutional layers, activation layers, pooling layers, mapping layers, batch normalization layers and so on. Compressing a neural network model may mean changing, replacing or deleting the parameters of network layers. Taking a convolutional layer as an example, model compression may mean compressing the size and number of its convolution kernels, where the size of a convolution kernel comprises its number of channels, height and width.
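As a minimal illustration of the convolutional-layer example above, the sketch below counts the parameters of a 2-D convolutional layer before and after shrinking its kernel size and output-channel count. The specific layer sizes are hypothetical, chosen only to make the reduction visible.

```python
def conv_params(in_ch, out_ch, kh, kw, bias=True):
    """Parameter count of a 2-D convolution layer: one (in_ch x kh x kw)
    kernel per output channel, plus one bias per output channel."""
    return out_ch * (in_ch * kh * kw) + (out_ch if bias else 0)

# Hypothetical original layer: 256 -> 512 channels with 3x3 kernels.
before = conv_params(256, 512, 3, 3)

# Hypothetical compressed layer: fewer output channels, 1x1 kernels.
after = conv_params(256, 128, 1, 1)

print(before, after)  # → 1180160 32896
```

Even this simple substitution removes well over 97% of the layer's parameters, which is the kind of structural simplification the compression process aims for.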
S120: determine the currently pending structure in the network model to be compressed.
The currently pending structure is the network layer in the network model that needs to be compressed. After the network model to be compressed is obtained, the currently pending structure may be determined according to a preset rule. The preset rule may include a processing-order rule, for example determining the network layer structures that need compression one by one, in order from back to front.
Specifically, a single-layer structure in the network model may be determined as the currently pending structure, for example a single convolutional layer or a single pooling layer; processing one network layer at a time helps guarantee the accuracy of model compression. Alternatively, considering that the number of layers in a network model may be large, a multi-layer structure in the network model may be determined as the currently pending structure, such as a structure composed of multiple convolutional layers, or a structure composed of multiple convolutional layers and multiple pooling layers; this helps guarantee the efficiency of model compression.
S130: search a search space for a target network structure, and replace the currently pending structure with the target network structure to obtain a replaced network model.
In this application, when compressing the network model, neural architecture search (Neural Architecture Search, NAS) is mainly used to design the network architecture: the original network layers of the network model (the currently pending structure) are replaced by optimized, simplified layers (the target network structure). Determining the currently pending structure and searching for the target network structure may be performed by the controller of the neural architecture search, which searches the search space for the target network structure according to a search strategy.
The search strategy defines how the controller finds a suitable target network structure quickly and accurately; it may be any one of random search, Bayesian optimization, transfer learning, reinforcement learning, evolutionary algorithms, genetic algorithms, greedy algorithms and gradient-based algorithms. The search space is a set of network units, which include basic neural network units such as convolution kernels, activation functions, pooling units and RNN (Recurrent Neural Network) units. A convolution kernel (Convolution Kernel) is used in image processing: given an input image, each pixel of the output image is a weighted average of the pixels in a small region of the input image, where the weights are defined by a function called the convolution kernel. An activation function (Activation Function) is a function that runs on a neuron and maps the neuron's input to its output. A pooling (Pooling) unit is mainly used to reduce the dimensionality of parameters, compress the amount of data and parameters, reduce overfitting and improve the fault tolerance of the model. An RNN unit is a neural network unit for processing sequential data. The network units may also include other units, such as a dropout unit: dropout is a method of optimizing neural networks with deep structure by randomly zeroing some of the weights or outputs, reducing the interdependence between nodes, thereby regularizing (Regularization) the neural network and reducing its structural complexity.
The target network structure is a structure composed of network units from the search space, for example a structure composed of multiple convolution kernels, or a structure composed of multiple pooling units. After the currently pending structure is determined, the controller of the neural architecture search may search the search space for a target network structure according to the search strategy, and replace the currently pending structure with the target network structure to obtain the replaced network model.
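The search-and-replace step above can be sketched in miniature. The patent does not fix a particular search strategy, so the sketch uses random search over a toy search space; the structure names and parameter counts are hypothetical, and a real model would be a computation graph rather than a flat list.

```python
import random

# Toy search space of candidate replacement structures, each annotated
# with a hypothetical parameter count (names are illustrative only).
SEARCH_SPACE = [
    {"name": "conv3x3", "params": 9216},
    {"name": "conv1x1", "params": 1024},
    {"name": "pool",    "params": 0},
]

def search_and_replace(model, index, rng):
    """Random-search strategy: pick a candidate structure from the
    search space and substitute it for the structure at `index`,
    returning a new (replaced) model."""
    candidate = rng.choice(SEARCH_SPACE)
    replaced = list(model)
    replaced[index] = candidate
    return replaced

model = [{"name": "conv5x5", "params": 25600} for _ in range(3)]
rng = random.Random(0)  # seeded for repeatability
new_model = search_and_replace(model, 1, rng)
print(new_model[1]["name"])
```

In the full method this replacement is followed by the model-effect evaluation of step S140, and is re-attempted with a new candidate when the evaluation fails.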
S140: evaluate the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result.
An L1-regularized loss function is a loss function to which L1 regularization is applied. A loss function (Loss Function) maps a random event, or the value of a related random variable, to a non-negative real number representing the "risk" or "loss" of that event; in this application it is used to measure the model effect, and may be chosen as the mean squared error, cross entropy, hinge loss and so on. L1 regularization refers to driving the weights of the parameters in the network model toward 0. In a network model, correlations between parameters increase the complexity of the model without improving its interpretability; it is therefore desirable to perform parameter selection so that the network model becomes easier to interpret. For example, suppose the cause of a trigger event A is to be analyzed through a network model and there are 1000 possible influencing factors; the analysis is then difficult. If, through training, the weights of some of the influencing factors become 0, the analysis based on the remaining factors becomes much easier. By applying L1 regularization, the parameters of the network model are selected automatically: the weight of the model concentrates on the parameters of high importance, while the weights of unimportant parameters quickly tend to 0, so that the network model becomes sparse, achieving the purpose of network model compression.
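The patent does not spell out the mechanism by which L1 regularization drives unimportant weights exactly to zero; the standard explanation is the soft-thresholding (proximal) operator that an L1 penalty induces during optimization, sketched below under that assumption.

```python
def soft_threshold(w, lam):
    """Proximal step for an L1 penalty with strength lam: shrink each
    weight toward zero, and set weights with magnitude below lam to
    exactly zero — this is what makes the model sparse."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# Hypothetical weights: large ones survive, small ones are zeroed.
weights = [0.9, -0.03, 0.005, -1.2, 0.04]
sparse = [soft_threshold(w, 0.05) for w in weights]
print(sparse)  # the three small weights become exactly 0.0
```

This is why L1 (unlike L2, which only shrinks weights) produces exact zeros that can then be pruned from the model.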
Specifically, let L denote the L1-regularized loss function, L0 the loss function without regularization, H the total number of network layers of the network model, and Wk, k = 1, 2, ..., H the weight vector of each layer. Assuming that both the number of layers of the network model and the parameter amount of each layer are regularized, the L1-regularized loss function may take the form:
L = L0 + λ·H + Σ(k=1..H) μk·‖Wk‖1
where λ and μk, k = 1, 2, ..., H are regularization coefficients to be solved during the model effect evaluation.
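The regularized loss defined above can be computed directly. The sketch below follows the form given here (L = L0 + λ·H + Σ μk·‖Wk‖1); the weight values and coefficients are hypothetical.

```python
def l1_regularized_loss(base_loss, layer_weights, lam, mu):
    """L = L0 + lam*H + sum_k mu_k * ||W_k||_1, where H is the number
    of layers, lam penalizes depth, and each mu_k penalizes the L1
    norm of that layer's weight vector."""
    H = len(layer_weights)
    l1_terms = sum(m * sum(abs(w) for w in W)
                   for m, W in zip(mu, layer_weights))
    return base_loss + lam * H + l1_terms

# Two hypothetical layers with per-layer regularization coefficients.
W = [[0.5, -0.25], [1.0, -1.0, 0.5]]
L = l1_regularized_loss(base_loss=0.8, layer_weights=W,
                        lam=0.01, mu=[0.1, 0.2])
print(round(L, 4))  # → 1.395
```

Here 0.8 + 0.01·2 + 0.1·0.75 + 0.2·2.5 = 1.395: the depth term and the per-layer L1 norms both contribute to the loss, so minimizing L trades accuracy against model size.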
After the currently pending structure has been replaced by the target network structure and the replaced network model has been obtained, the controller of the neural architecture search evaluates the model effect of the replaced network model using the L1-regularized loss function. The evaluation process includes applying L1 regularization to the parameters of the network model, so that while the evaluation result is obtained, the number of parameters of the network model is reduced and the structural complexity of the model decreases.
S150: when the evaluation result meets a preset requirement, return to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and take the finally obtained replaced network model as the compressed network model.
After the evaluation result is obtained, the controller of the neural architecture search judges whether the evaluation result of the replaced model meets the preset requirement. If not, the controller re-searches for a new target network structure and repeats the above process of structure replacement and model effect evaluation, until a satisfactory target network structure is found. The process of target network structure search, structure replacement and model effect evaluation performed by the controller can be regarded as a controller iteration (Controller Epoch). During the controller iterations, when the number of iterations reaches a preset number (for example 2000) and no satisfactory target network structure has been found, the original structure of the currently pending structure may be kept unchanged, i.e. no structure replacement is performed on it.
It should be noted that in this application the compression of the network model is an iterative process, in which a single pass consists of performing structure replacement on all currently pending structures in the network model and evaluating the model effect of the corresponding replaced network models. When the iterative process meets a preset iteration-completion condition, the compression of the network model can be confirmed as complete; the preset iteration-completion condition includes at least one of a preset number of iterations and a preset iteration duration.
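The controller-iteration logic described above, including the fallback of keeping the original structure after the preset number of failed attempts, can be sketched as follows. The `evaluate` and `search` callables stand in for the model-effect evaluation and the search strategy; the structure names are hypothetical.

```python
def compress_structure(original, evaluate, search, max_iters=2000):
    """Controller loop sketch: keep searching for a replacement until
    a candidate meets the preset requirement; after max_iters failed
    attempts, keep the original structure unchanged."""
    for _ in range(max_iters):
        candidate = search()
        if evaluate(candidate):
            return candidate
    return original  # no acceptable replacement was found

# Hypothetical demo: only "conv1x1" passes the evaluation.
candidates = iter(["conv5x5", "conv3x3", "conv1x1"])
found = compress_structure(
    original="conv7x7",
    evaluate=lambda c: c == "conv1x1",
    search=lambda: next(candidates),
    max_iters=3,
)

# If no candidate ever satisfies the requirement, the original is kept.
kept = compress_structure(
    original="conv7x7",
    evaluate=lambda c: False,
    search=lambda: "conv3x3",
    max_iters=5,
)
print(found, kept)  # → conv1x1 conv7x7
```

The outer iteration over all pending structures, and the pass-level completion condition (iteration count or duration), would wrap this per-structure loop.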
This embodiment provides a network model compression method in which the network model is compressed through neural architecture search, preserving the accuracy of the network model; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy.
In one embodiment, the network model to be compressed is a deep neural network (Deep Neural Network, DNN). A neural network generally includes an input layer, hidden layers and an output layer; a deep neural network is a neural network containing multiple hidden layers. Since the number of hidden layers is large, a deep neural network has a relatively large number of parameters, and it can therefore be compressed to reduce its number of parameters and its structural complexity.
When compressing a deep neural network, determining the currently pending structure in the network model to be compressed comprises: determining a single-layer or multi-layer structure in the network model as the currently pending structure in the reverse order of output layer, hidden layers, input layer.
Specifically, Fig. 3 is a schematic diagram of the order in which the controller determines the currently pending structure. The deep neural network includes an input layer A, hidden layers B and an output layer C, where the hidden layers B comprise layers B1 through Bn; the more complex the structure of the network model, the larger n is. The arrow in the figure indicates the direction in which the currently pending structure is determined: the processor first determines the output layer C of the deep neural network as the currently pending structure; then, among the multiple hidden layers B, currently pending structures are determined in order from the hidden layer Bn connected to the output layer to the hidden layer B1 connected to the input layer; finally, the input layer A of the deep neural network is determined as the currently pending structure. Since the output of a neural network carries a certain error, processing in the reverse order of the network structure, i.e. optimizing the structure and updating the weights starting from the output layer, can reduce the error and improve model accuracy.
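The reverse processing order described above amounts to visiting the layers back to front, which can be stated in a few lines. The layer names mirror the A/B1...Bn/C labels of Fig. 3.

```python
def processing_order(layers):
    """Yield structures from the output layer back to the input layer,
    so structure optimization and weight updates start where the
    network's output error is introduced."""
    return list(reversed(layers))

layers = ["input_A", "hidden_B1", "hidden_B2", "hidden_Bn", "output_C"]
print(processing_order(layers))
# → ['output_C', 'hidden_Bn', 'hidden_B2', 'hidden_B1', 'input_A']
```

Grouping consecutive entries of this order into multi-layer structures gives the coarser-grained pending structures used early in compression.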
The controller may decide whether a single-layer or multi-layer structure is the currently pending structure according to the compression progress. For example, in the early stage of model compression, a multi-layer structure with more layers may be chosen as the currently pending structure to accelerate compression; in the middle stage, the number of layers in the multi-layer structure may be gradually decreased; in the late stage, a single-layer structure may be chosen as the currently pending structure to guarantee compression accuracy. The controller may also decide according to the weight of each layer structure: for a network layer with a larger weight, the corresponding single-layer structure may be determined as the currently pending structure; for network layers with smaller weights, the corresponding layers may first be combined into a multi-layer structure, which is then determined as the currently pending structure. In addition, when determining the currently pending structure, the processor may also determine the specific number of layers according to the search strategy.
In one embodiment, the neural architecture search can use an Efficient Neural Architecture Search (ENAS) model. ENAS defines the concept of a node (Node). A node functions similarly to a network layer in a network model; the difference is that network layers depend on one another, i.e., each network layer is attached after the previous one, whereas a node can arbitrarily change its preceding input, as shown in Fig. 4. What efficient neural architecture search needs to learn and select is the connection relationship between nodes: different connections produce different network structures, and selecting the optimal structure from among these different network structures amounts to "designing" a new network model.
With reference to Fig. 4, the first network model formed by connecting network layers with straight lines and the second network model formed by connecting nodes with crossing lines are not identical; the two are different computation graphs (Graph), and the checkpoints (Checkpoint) of the reduced model weights of the first network model cannot be directly imported into the second network model. Intuitively, however, the positions of the nodes do not change: if the input and output tensors (Tensor) and shapes (Shape) remain unchanged, the number of weights of these nodes is the same; that is, the weight of each network layer on the left can be copied to the corresponding node on the right. Based on this principle, ENAS can achieve the purpose of weight sharing. Specifically, a fixed number of nodes is defined first, and then a group of parameters controls which preceding node each node connects to; this group of parameters ultimately constitutes the selection parameters used to indicate the fixed network structure, and it can be tuned and selected through an optimization algorithm such as Bayesian optimization or DQN (Deep Q-Learning).
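The weight-sharing principle can be sketched in a few lines. This is a deliberately minimal model of the idea — real ENAS uses an RNN controller, full operation sets, and trained tensors; the class name, the scalar "weights", and the connection encoding below are illustrative assumptions only.

```python
import random

class ENASSearchSpace:
    """Sketch of ENAS-style weight sharing: a fixed set of nodes, where
    each node's predecessor is chosen by a controller parameter, and the
    weight of every (predecessor -> node) edge is stored once and reused
    by every sampled architecture that selects that edge."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.shared_weights = {}  # one shared weight per possible edge

    def edge_weight(self, pred, node):
        # create the edge weight lazily; reuse it on every later lookup
        key = (pred, node)
        if key not in self.shared_weights:
            self.shared_weights[key] = random.random()
        return self.shared_weights[key]

    def build(self, connections):
        """connections[i] is the predecessor index of node i+1.
        Returns the list of edge weights the sampled architecture uses."""
        return [self.edge_weight(pred, node + 1)
                for node, pred in enumerate(connections)]

space = ENASSearchSpace(num_nodes=4)
arch_a = space.build([0, 1, 2])   # chain 0 -> 1 -> 2 -> 3
arch_b = space.build([0, 0, 1])   # node 2 instead reads node 0
# the (0 -> 1) edge weight is shared by both sampled architectures
assert arch_a[0] == arch_b[0]
```

The list passed to `build` plays the role of the "group of parameters" in the text: it fixes the network structure, while the weights themselves live in the shared pool and survive across architecture samples.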
In one embodiment, determining, according to the reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure comprises: according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, determining that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
Specifically, as shown in Fig. 5, when determining the currently pending structure, the controller can also take into account the model effect assessment result of the network model after the last replacement; in Fig. 5, the dotted box indicates the determined currently pending structure. For example, suppose the structures of layers Bi and Bi-1 of the network model are similar: if, after performing a structure replacement on layer Bi, the controller judges that the model effect assessment result of the replaced network model is good, it can continue to perform a structure replacement on layer Bi-1, thereby further improving the model effect. By combining historical replacement record information when processing a new network layer, the replacement efficiency and model effect of the controller can be improved.
In one embodiment, the target network structure includes one or more of basic neural network units and random inactivation units. When the controller searches the search space for a target network structure, it may select units of different types and combine them into a target network structure, for example, a target network structure composed of a pooling unit and a convolution kernel; it may also select multiple units of the same type and combine them into a target network structure, for example, a target network structure composed of multiple convolution kernels; in addition, units of the same type may have parameters of different specifications, for example, a target network structure composed of multiple convolution kernels of different sizes and different strides, or a target network structure composed of pooling units of different pooling sizes, pooling strides, and pooling types, etc.
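The kinds of combinations just listed can be enumerated as a toy search space. The specific kernel sizes, strides, and pooling types below are illustrative assumptions, not values taken from the patent.

```python
from itertools import product

def candidate_structures():
    """Enumerate a toy search space of target network structures built
    from basic units: convolutions of several kernel sizes and strides,
    pooling units of two types, and two-unit combinations. All specific
    sizes here are illustrative assumptions."""
    convs = [{"unit": "conv", "kernel": k, "stride": s}
             for k, s in product([1, 3, 5], [1, 2])]   # 6 conv variants
    pools = [{"unit": "pool", "type": t, "size": 2}
             for t in ("max", "avg")]                  # 2 pooling variants
    singles = convs + pools
    # two-unit combinations: a pooling unit followed by a convolution
    pairs = [[p, c] for p in pools for c in convs]
    return [[u] for u in singles] + pairs

space = candidate_structures()  # 8 single-unit + 12 two-unit structures
```

A real controller would then sample one candidate from such a space (or a much larger one) as the target network structure used in the replacement step.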
In one embodiment, when the target network structure includes a random inactivation unit, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result comprises: after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
The principle of random inactivation is to randomly set some of the parameters to a failure state during each iteration, so that in each iteration only part of the parameters in the network model are in an effective state. Thus, as the number of iterations increases, since different parameters may fail at different times, the sensitivity between model parameters is reduced.
As shown in Fig. 6, taking layers Bi-1 and Bi as an example, the case where the target network structure includes a random inactivation unit is illustrated. In the network model before replacement, layer Bi-1 includes parameters M1, M2, M3, etc., and layer Bi includes parameters N1, N2, N3, N4, etc. After layer Bi is replaced with a target network structure that includes a random inactivation unit, parameters M1, M2, and M3 have channel connections to the random inactivation unit. At this point, when performing model effect assessment, the random inactivation unit internally and randomly sets one or more parameters to an invalid state; for example, if parameter M2 is set to the invalid state, the random inactivation unit uses only parameters M1 and M3. It should be noted that this setting of the invalid parameter state takes place only inside the random inactivation unit; that is, the random inactivation unit does not affect the actual state of the parameters of layer Bi-1.
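The key point of the example above — the mask is applied inside the unit, leaving the original layer parameters untouched — can be sketched directly. The parameter names M1–M3 follow the text; the drop probability and seed are illustrative assumptions.

```python
import random

def random_inactivation(params, drop_prob, rng=None):
    """Sketch of the random inactivation unit described above: in each
    evaluation pass, some incoming connection parameters (e.g. M1, M2,
    M3) are set to a failed state *inside* the unit. The caller's
    parameter dictionary keeps its true values."""
    rng = rng or random.Random(0)   # seeded here only for reproducibility
    active = dict(params)           # copy: never mutate the real layer
    for name in params:
        if rng.random() < drop_prob:
            active[name] = 0.0      # failed state: excluded from this pass
    return active

layer_params = {"M1": 0.5, "M2": -1.2, "M3": 0.8}
masked = random_inactivation(layer_params, drop_prob=0.5)
# the original Bi-1 parameters are unaffected by the inactivation
assert layer_params == {"M1": 0.5, "M2": -1.2, "M3": 0.8}
```

Each call produces a fresh mask, which is what makes repeated evaluation passes see different effective parameter subsets.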
In one embodiment, as shown in Fig. 7, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result includes steps S142 to S146.
S142: obtain sample data, the sample data including training sample data and test sample data;
S144: train the replaced network model using the training sample data to obtain a trained network model;
S146: according to the test sample data, perform model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
Optionally, as shown in Fig. 8, performing model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result includes steps S1462 to S1464.
S1462: perform L1 regularization processing on each network layer of the trained network model through the L1 regularization loss function, wherein the number of parameters with weight 0 in the trained network model after processing is greater than the number of parameters with weight 0 in the trained network model before processing;
S1464: perform model effect assessment on the processed trained network model to obtain the assessment result.
Specifically, after obtaining the whole sample data uploaded by the user, the whole sample data can be shuffled and split to obtain the training sample data and the test sample data; alternatively, pre-divided training sample data and test sample data can be obtained directly. Then, the replaced network model is trained using the training sample data; the training process specifically includes alternating updates of the parameters and the parameter weights, etc., until the replaced network model meets a convergence condition. After training, model effect assessment is performed using the test sample data and the L1 regularization loss function to obtain the assessment result, so that it can be determined from the assessment result whether the current structure replacement operation yields a better model effect than the original structure.
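The evaluation step S146/S1462 can be sketched as a loss with an L1 penalty plus a sparsity count. This is a minimal scalar illustration; the coefficient `lam` and the zero tolerance are illustrative assumptions, and a real implementation would apply the penalty per network layer during training.

```python
def l1_evaluate(weights, task_loss, lam=0.01, zero_tol=1e-6):
    """Sketch of the L1-regularized effect assessment: the total loss
    adds an L1 penalty on the weights (which, when optimized, drives
    small weights toward zero), and the count of (near-)zero weights
    serves as a simple sparsity/complexity signal."""
    l1_penalty = lam * sum(abs(w) for w in weights)
    total_loss = task_loss + l1_penalty
    zero_count = sum(1 for w in weights if abs(w) < zero_tol)
    return {"loss": total_loss, "zero_weights": zero_count}

result = l1_evaluate([0.0, 0.5, -0.25, 0.0], task_loss=1.0, lam=0.1)
# penalty = 0.1 * (0.5 + 0.25) = 0.075, so loss = 1.075; two zero weights
```

The zero-weight count is what makes the post-processing claim in S1462 checkable: after L1 processing it should be no smaller than before.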
In one embodiment, the assessment result includes an accuracy evaluation result and a complexity evaluation result.
Model accuracy can be used to measure how well the model performs its function. For example, when the model is applied to image recognition, model accuracy may specifically refer to the image recognition rate; when the model is applied to image classification, model accuracy may specifically refer to the classification accuracy, etc. The accuracy evaluation result can be obtained by using the test sample data as the input of the trained network model and calculating, through a loss function, the loss value between the output of the trained network model and the expected result.
Model complexity can be divided into time complexity and space complexity. Time complexity determines the training/prediction time of the model; if the time complexity is too high, model training and prediction will take considerable time, making it impossible both to quickly verify ideas and improve the model, and to achieve fast prediction. Space complexity determines the number of parameters of the model. Owing to the curse of dimensionality (Curse of Dimensionality), the more parameters a model has, the larger the amount of data required to train it; since data sets in practice are usually not very large, this makes the model more prone to overfitting during training. Time complexity can specifically be reflected by the number of operations of the model, and space complexity can specifically be reflected by the size of the model itself.
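For a convolution layer, both quantities have standard back-of-envelope formulas, used here as a sketch of the two complexity measures; the example layer shape is an arbitrary illustration.

```python
def conv_complexity(in_ch, out_ch, kernel, out_h, out_w):
    """Rough complexity of one square-kernel conv layer:
    - parameter count approximates space complexity (weights + biases);
    - multiply-accumulate (MAC) count approximates time complexity,
      i.e. kernel work repeated at every output position."""
    params = out_ch * (in_ch * kernel * kernel + 1)          # +1 per-filter bias
    macs = out_ch * in_ch * kernel * kernel * out_h * out_w  # operations
    return params, macs

# e.g. a 3x3 conv from 3 to 16 channels on a 32x32 output map
params, macs = conv_complexity(in_ch=3, out_ch=16, kernel=3,
                               out_h=32, out_w=32)
# params = 16 * (3*9 + 1) = 448; macs = 16 * 3 * 9 * 32 * 32 = 442368
```

Summing these per-layer figures over the whole model gives the operation count and model size that the text proposes as complexity signals.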
In one embodiment, replacing the currently pending structure with the target network structure to obtain the replaced network model comprises: when a historical target network structure whose structure is consistent with that of the target network structure exists, setting the weight of the historical target network structure as the initial weight of the target network structure, and then replacing the currently pending structure with the target network structure to obtain the replaced network model.
Specifically, in the compression process of the network model, based on the principle of parameter weight sharing, historical structure replacement information can be combined to improve processing efficiency when performing structure replacement on the network model. For example, suppose that in the i-th processing pass the target network structure S is used to replace the currently pending structure, the model effect assessment result of the replaced model is good, and the weight determined for the target network structure S in that pass is αS. Then, in the j-th processing pass, if the target network structure S is reused for structure replacement, αS can be used directly as the initial weight of the target network structure S in the j-th pass. Normally, the initial weight of a new network structure is taken as 0 or a random value, and using 0 or a random value as the initial weight requires spending a long time retraining the model; directly using the weight αS previously determined in the i-th pass as the initial weight in the j-th pass is equivalent to continuing training on top of the earlier training result, which can effectively reduce the model training time and improve training efficiency.
In the j-th processing pass, after a new currently pending structure is replaced with the target network structure S, model effect assessment on the newly replaced network model can be performed directly based on the initial weight αS.
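The weight-reuse mechanism amounts to a cache keyed by structure identity. A minimal sketch, assuming a string signature for each structure and a scalar weight; both are illustrative simplifications of the real per-layer weight tensors.

```python
class WeightCache:
    """Sketch of the historical-weight reuse described above: when a
    target network structure S was already trained in an earlier pass,
    its determined weight becomes the initial weight the next time S is
    selected, instead of starting from zero or a random value."""

    def __init__(self):
        self.history = {}

    def initial_weight(self, signature, default=0.0):
        # reuse the historical weight if this exact structure was seen
        return self.history.get(signature, default)

    def record(self, signature, trained_weight):
        self.history[signature] = trained_weight

cache = WeightCache()
cache.record("conv3x3-stride1", 0.42)         # pass i: weight determined for S
w0 = cache.initial_weight("conv3x3-stride1")  # pass j: warm start from 0.42
```

Structures never seen before fall back to the default initialization, matching the "0 or random value" case in the text.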
It should be understood that, although the steps in the flowcharts involved in the above embodiments are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages that are not necessarily completed at the same moment but may be executed at different times; the execution order of these sub-steps or stages is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 9, a network model compression apparatus is provided, which includes: a model acquisition module 110, a structure determination module 120, a structure replacement module 130, and an effect evaluation module 140.
The model acquisition module 110 is used to obtain a network model to be compressed;
the structure determination module 120 is used to determine a currently pending structure in the network model to be compressed;
the structure replacement module 130 is used to search a search space for a target network structure and replace the currently pending structure with the target network structure to obtain a replaced network model;
the effect evaluation module 140 is used to perform model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result; when the assessment result meets a preset requirement, the step of determining the currently pending structure in the network model to be compressed is returned to, until every structure to be processed in the network model to be compressed has been processed, and the finally obtained replaced network model is taken as the compressed network model.
The present application provides a network model compression apparatus that compresses a network model through neural architecture search, which can guarantee the accuracy of the network model; meanwhile, L1 regularization processing can effectively reduce the parameter amount of the network model and lower its complexity, thereby effectively alleviating the problem of network structure redundancy.
In one embodiment, the structure determination module 120 is used to: determine, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
In one embodiment, the structure determination module 120 is used to: determine, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
In one embodiment, the effect evaluation module 140 is used to: when the target network structure includes a random inactivation unit, after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, perform model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
In one embodiment, the effect evaluation module 140 is used to: obtain sample data, the sample data including training sample data and test sample data; train the replaced network model using the training sample data to obtain a trained network model; and, according to the test sample data, perform model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
In one embodiment, the structure replacement module 130 is used to: when a historical target network structure whose structure is consistent with that of the target network structure exists, set the weight of the historical target network structure as the initial weight of the target network structure, and then replace the currently pending structure with the target network structure to obtain the replaced network model.
Fig. 10 shows an internal structure diagram of the computer device in one embodiment. The computer device may specifically be the terminal 10 (or the server 20) in Fig. 1. As shown in Fig. 10, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the network model compression method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to execute the network model compression method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device of the computer device may be a touch layer covering the display screen, may be a key, trackball, or touchpad provided on the housing of the computer device, or may be an external keyboard, touchpad, or mouse, etc.
Those skilled in the art will understand that the structure shown in Fig. 10 is only a block diagram of part of the structure relevant to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
In one embodiment, the network model compression apparatus provided by the present application can be implemented in the form of a computer program, and the computer program can run on the computer device shown in Fig. 10. The memory of the computer device can store the program modules constituting the network model compression apparatus, for example, the model acquisition module, structure determination module, structure replacement module, and effect evaluation module shown in Fig. 9. The computer program constituted by these program modules causes the processor to execute the steps of the network model compression method of each embodiment of the present application described in this specification.
For example, the computer device shown in Fig. 10 can obtain the network model to be compressed through the model acquisition module of the network model compression apparatus shown in Fig. 9; determine the currently pending structure in the network model to be compressed through the structure determination module; search a search space for a target network structure through the structure replacement module and replace the currently pending structure with the target network structure to obtain a replaced network model; and, through the effect evaluation module, perform model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result. When the assessment result meets a preset requirement, the computer device returns to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and takes the finally obtained replaced network model as the compressed network model.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to execute the steps of the above network model compression method. The steps of the network model compression method here may be the steps in the network model compression method of each of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to execute the steps of the above network model compression method. The steps of the network model compression method here may be the steps in the network model compression method of each of the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program, and the program can be stored in a non-volatile computer-readable storage medium; when executed, the program may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), memory-bus (Rambus) direct RAM (RDRAM), direct memory-bus dynamic RAM (DRDRAM), and memory-bus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments have been described; however, as long as no contradiction exists in a combination of these technical features, the combination should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the patent scope of the present application. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application patent shall be subject to the appended claims.
Claims (15)
1. A network model compression method, characterized by comprising:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
performing model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result;
when the assessment result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and taking the finally obtained replaced network model as a compressed network model.
2. The method according to claim 1, characterized in that the network model to be compressed is a deep neural network, and determining the currently pending structure in the network model to be compressed comprises:
determining, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
3. The method according to claim 2, characterized in that determining, according to the reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure comprises:
determining, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
4. The method according to claim 1, characterized in that the target network structure includes one or more of basic neural network units and random inactivation units.
5. The method according to claim 4, characterized in that, when the target network structure includes a random inactivation unit, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result comprises:
after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
6. The method according to claim 1, characterized in that performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result comprises:
obtaining sample data, the sample data including training sample data and test sample data;
training the replaced network model using the training sample data to obtain a trained network model;
performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
7. The method according to claim 1 or 6, characterized in that the assessment result includes an accuracy evaluation result and a complexity evaluation result.
8. The method according to claim 6, characterized in that performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result comprises:
performing L1 regularization processing on each network layer of the trained network model through the L1 regularization loss function, wherein the number of parameters with weight 0 in the trained network model after processing is greater than the number of parameters with weight 0 in the trained network model before processing;
performing model effect assessment on the processed trained network model to obtain the assessment result.
9. The method according to claim 1, characterized in that replacing the currently pending structure with the target network structure to obtain the replaced network model comprises:
when a historical target network structure whose structure is consistent with that of the target network structure exists, setting the weight of the historical target network structure as the initial weight of the target network structure, and then replacing the currently pending structure with the target network structure to obtain the replaced network model.
10. A network model compression apparatus, characterized by comprising:
a model acquisition module, for obtaining a network model to be compressed;
a structure determination module, for determining a currently pending structure in the network model to be compressed;
a structure replacement module, for searching a search space for a target network structure and replacing the currently pending structure with the target network structure to obtain a replaced network model;
an effect evaluation module, for performing model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result; when the assessment result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and taking the finally obtained replaced network model as a compressed network model.
11. The apparatus according to claim 10, characterized in that the structure determination module is used to realize either of the following:
first: determining, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure;
second: determining, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
12. The apparatus according to claim 10, characterized in that the effect evaluation module is used to realize either of the following:
first: when the target network structure includes a random inactivation unit, after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result;
second: obtaining sample data, the sample data including training sample data and test sample data; training the replaced network model using the training sample data to obtain a trained network model; performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
13. The apparatus according to claim 10, characterized in that the structure replacement module is used to: when a historical target network structure whose structure is consistent with that of the target network structure exists, set the weight of the historical target network structure as the initial weight of the target network structure, and then replace the currently pending structure with the target network structure to obtain the replaced network model.
14. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the processor is caused to execute the steps of the method according to any one of claims 1 to 9.
15. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that, when the computer program is executed by the processor, the processor is caused to execute the steps of the method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910770193.1A CN110490323A (en) | 2019-08-20 | 2019-08-20 | Network model compression method, device, storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490323A true CN110490323A (en) | 2019-11-22 |
Family
ID=68552342
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910770193.1A Pending CN110490323A (en) | 2019-08-20 | 2019-08-20 | Network model compression method, device, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490323A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109284820A (en) * | 2018-10-26 | 2019-01-29 | 北京图森未来科技有限公司 | Structure search method and device for a deep neural network |
EP3493120A1 (en) * | 2017-12-01 | 2019-06-05 | Koninklijke Philips N.V. | Training a neural network model |
CN109948783A (en) * | 2019-03-29 | 2019-06-28 | 中国石油大学(华东) | Topology expansion method based on an attention mechanism |
CN110020667A (en) * | 2019-02-21 | 2019-07-16 | 广州视源电子科技股份有限公司 | Neural network structure search method, system, storage medium and device |
Non-Patent Citations (1)
Title |
---|
BARRET ZOPH: "Learning Transferable Architectures for Scalable Image Recognition", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 16 December 2018 (2018-12-16), pages 1 - 14 *
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275190A (en) * | 2020-02-25 | 2020-06-12 | 北京百度网讯科技有限公司 | Neural network model compression method and device, image processing method and processor |
CN111275190B (en) * | 2020-02-25 | 2023-10-10 | 北京百度网讯科技有限公司 | Compression method and device of neural network model, image processing method and processor |
CN111488986A (en) * | 2020-04-13 | 2020-08-04 | 商汤集团有限公司 | Model compression method, image processing method and device |
CN111526054B (en) * | 2020-04-21 | 2022-08-26 | 北京百度网讯科技有限公司 | Method and device for acquiring network |
CN111526054A (en) * | 2020-04-21 | 2020-08-11 | 北京百度网讯科技有限公司 | Method and device for acquiring network |
CN113658091A (en) * | 2020-05-12 | 2021-11-16 | Tcl科技集团股份有限公司 | Image evaluation method, storage medium and terminal equipment |
CN111709516A (en) * | 2020-06-09 | 2020-09-25 | 深圳先进技术研究院 | Compression method and compression device of neural network model, storage medium and equipment |
CN111709516B (en) * | 2020-06-09 | 2023-07-28 | 深圳先进技术研究院 | Compression method and compression device, storage medium and equipment of neural network model |
CN111985644A (en) * | 2020-08-28 | 2020-11-24 | 北京市商汤科技开发有限公司 | Neural network generation method and device, electronic device and storage medium |
CN111985644B (en) * | 2020-08-28 | 2024-03-08 | 北京市商汤科技开发有限公司 | Neural network generation method and device, electronic equipment and storage medium |
US11790039B2 (en) * | 2020-10-29 | 2023-10-17 | EMC IP Holding Company LLC | Compression switching for federated learning |
US20220138498A1 (en) * | 2020-10-29 | 2022-05-05 | EMC IP Holding Company LLC | Compression switching for federated learning |
CN112434725A (en) * | 2020-10-30 | 2021-03-02 | 四川新网银行股份有限公司 | Model compression method deployed to HTML5 |
CN112434725B (en) * | 2020-10-30 | 2023-06-09 | 四川新网银行股份有限公司 | Model compression method deployed to HTML5 |
CN112465115A (en) * | 2020-11-25 | 2021-03-09 | 科大讯飞股份有限公司 | GAN network compression method, device, equipment and storage medium |
CN112465115B (en) * | 2020-11-25 | 2024-05-31 | 科大讯飞股份有限公司 | GAN network compression method, device, equipment and storage medium |
CN114692816A (en) * | 2020-12-31 | 2022-07-01 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN114692816B (en) * | 2020-12-31 | 2023-08-25 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN113673694A (en) * | 2021-05-26 | 2021-11-19 | 阿里巴巴新加坡控股有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN113657592A (en) * | 2021-07-29 | 2021-11-16 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
CN113657592B (en) * | 2021-07-29 | 2024-03-05 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
WO2023071766A1 (en) * | 2021-10-28 | 2023-05-04 | 中兴通讯股份有限公司 | Model compression method, model compression system, server, and storage medium |
CN115543945A (en) * | 2022-11-29 | 2022-12-30 | 支付宝(杭州)信息技术有限公司 | Model compression method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490323A (en) | Network model compression method, device, storage medium and computer equipment | |
US11928574B2 (en) | Neural architecture search with factorized hierarchical search space | |
US10755026B1 (en) | Circuit design including design rule violation correction utilizing patches based on deep reinforcement learning | |
CN111126668B (en) | Spark operation time prediction method and device based on graph convolution network | |
WO2020214428A1 (en) | Using hyperparameter predictors to improve accuracy of automatic machine learning model selection | |
US20210081798A1 (en) | Neural network method and apparatus | |
US20210312261A1 (en) | Neural network search method and related apparatus | |
CN113762486B (en) | Method and device for constructing fault diagnosis model of converter valve and computer equipment | |
CN109313720 (en) | Memory-augmented neural network with sparse access to external memory |
CN110443165A (en) | Neural network quantization method, image recognition method, device and computer equipment |
Ayodeji et al. | Causal augmented ConvNet: A temporal memory dilated convolution model for long-sequence time series prediction | |
CN113239168B (en) | Interpretive method and system based on knowledge graph embedded prediction model | |
CN110990135A (en) | Spark operation time prediction method and device based on deep migration learning | |
Feng et al. | Finite strain FE2 analysis with data-driven homogenization using deep neural networks | |
CN116897356A (en) | Operator scheduling run time comparison method, device and storage medium | |
EP4009239A1 (en) | Method and apparatus with neural architecture search based on hardware performance | |
JP7024881B2 (en) | Pattern recognition device and pattern recognition method | |
US11675951B2 (en) | Methods and systems for congestion prediction in logic synthesis using graph neural networks | |
US20230090720A1 (en) | Optimization for artificial neural network model and neural processing unit | |
US11875263B2 (en) | Method and apparatus for energy-aware deep neural network compression | |
CN110825903A (en) | Visual question-answering method for improving Hash fusion mechanism | |
Joshi et al. | Area efficient VLSI ASIC implementation of multilayer perceptrons | |
US20220318684A1 (en) | Sparse ensembling of unsupervised models | |
CN115204463A (en) | Residual service life uncertainty prediction method based on multi-attention machine mechanism | |
CN114707718A (en) | GAT-LSTM-based information cascade prediction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||