CN110490323A - Network model compression method, device, storage medium and computer equipment - Google Patents
- Publication number
- CN110490323A (application number CN201910770193.1A)
- Authority
- CN
- China
- Prior art keywords
- network model
- network
- replacement
- compressed
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This application relates to a network model compression method, apparatus, storage medium and computer device. The method comprises: obtaining a network model to be compressed; determining a currently pending structure in the network model to be compressed; searching a search space for a target network structure and replacing the currently pending structure with the target network structure; evaluating the model effect of the replaced network model using an L1-regularized loss function; and, when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and taking the finally obtained replaced network model as the compressed network model. By compressing the network model through neural architecture search, the accuracy of the network model can be preserved; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy in the network model.
Description
Technical field
This application relates to the field of computer technology, and in particular to a network model compression method, apparatus, storage medium and computer device.
Background technique
With advances in science and technology, deep learning technology based on neural network models has developed rapidly and has achieved breakthrough results in many application fields, including image recognition, object detection, semantic segmentation, speech recognition and natural language processing.
However, deep neural network models contain an enormous number of parameters, which makes the model structure redundant and leads to high memory consumption.
Summary of the invention
In view of the above, it is necessary to address the technical problems in the prior art by providing a network model compression method, apparatus, storage medium and computer device that can effectively mitigate the structural redundancy of network models.
A network model compression method, comprising:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
A network model compression apparatus, comprising:
a model obtaining module, configured to obtain a network model to be compressed;
a structure determination module, configured to determine a currently pending structure in the network model to be compressed;
a structure replacement module, configured to search a search space for a target network structure and replace the currently pending structure with the target network structure to obtain a replaced network model; and
an effect evaluation module, configured to evaluate the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, to return to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and to take the finally obtained replaced network model as the compressed network model.
A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the following steps:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
evaluating the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and
when the evaluation result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and taking the finally obtained replaced network model as the compressed network model.
With the above network model compression method, apparatus, storage medium and computer device, a network model to be compressed is obtained; a currently pending structure in the network model is determined; a target network structure is searched from a search space and replaces the currently pending structure to obtain a replaced network model; the model effect of the replaced network model is evaluated using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, the method returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and the finally obtained replaced network model is taken as the compressed network model. Compressing the network model through neural architecture search preserves the accuracy of the network model; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the network model compression method in one embodiment;
Fig. 2 is a flow diagram of the network model compression method in one embodiment;
Fig. 3 is a schematic diagram of the order in which currently pending structures are determined in one embodiment;
Fig. 4 is a schematic diagram of network layers and nodes in one embodiment;
Fig. 5 is a schematic diagram of the order in which currently pending structures are determined in another embodiment;
Fig. 6 is a schematic diagram of the processing flow when the target network structure includes a dropout unit in one embodiment;
Fig. 7 is a flow diagram of evaluating the model effect of the replaced network model using an L1-regularized loss function in one embodiment;
Fig. 8 is a flow diagram of evaluating the model effect of the trained network model using an L1-regularized loss function in one embodiment;
Fig. 9 is a structural block diagram of the network model compression apparatus in one embodiment;
Fig. 10 is a structural block diagram of the computer device in one embodiment.
Specific embodiment
In order to make the objectives, technical solutions and advantages of this application clearer, the application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the application, not to limit it.
Fig. 1 is a diagram of the application environment of the network model compression method in one embodiment. The network model compression method provided by this application can be applied in the environment shown in Fig. 1, in which the terminal 10 and the server 20 communicate over a network. The network model compression method of one embodiment may run on the server 20: the terminal 10 sends a network model to be compressed to the server 20; the server 20 obtains the network model to be compressed; determines a currently pending structure in the network model; searches a search space for a target network structure and replaces the currently pending structure with it to obtain a replaced network model; evaluates the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and takes the finally obtained replaced network model as the compressed network model. The terminal 10 may be a desktop terminal or a mobile terminal, such as a desktop computer, tablet computer, laptop computer or smartphone. The server 20 may be an independent physical server, a cluster of physical servers, or a virtual server.
In another embodiment, the network model compression method may run on the terminal 10: the terminal 10 obtains a network model to be compressed; determines a currently pending structure in the network model; searches a search space for a target network structure and replaces the currently pending structure with it to obtain a replaced network model; evaluates the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result; and, when the evaluation result meets a preset requirement, returns to the step of determining the currently pending structure, until every pending structure in the network model has been processed, and takes the finally obtained replaced network model as the compressed network model.
As shown in Fig. 2, in one embodiment a network model compression method is provided. This embodiment is described mainly by applying the method to the terminal 10 (or server 20) in Fig. 1. Referring to Fig. 2, the network model compression method specifically comprises the following steps:
S110: obtain the network model to be compressed.
The network model to be compressed may be a network model whose structure is complex and redundant due to an enormous number of parameters, or any network model that needs to be compressed. In general, the size of a network is proportional to its effectiveness: the more parameters a model has, the larger its size and the higher its accuracy. For example, some high-performing deep learning models may have millions or even hundreds of millions of parameters. However, an enormous number of parameters also incurs high computational cost, which limits the deployment of network models on resource-constrained computing platforms such as field-programmable gate arrays (Field-Programmable Gate Array, FPGA), reduced instruction set computers (Reduced Instruction Set Computer, RISC) and ARM microprocessors (Advanced RISC Machine, ARM).
For this reason, the network model needs to be compressed to reduce the complexity of its structure. Specifically, network model compression refers to the process of removing redundant parameters and channels from a network model. For example, the basic units of a network model are network layers; by function, network layers can be divided into convolutional layers, activation layers, pooling layers, mapping layers, batch normalization layers and so on. Compressing a neural network model may mean changing, replacing or deleting the parameters of network layers. Taking a convolutional layer as an example, model compression may mean compressing the size and number of its convolution kernels, where the size of a convolution kernel comprises its number of channels, height and width.
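As a minimal illustration of the convolutional-layer example above, the sketch below counts the parameters of a 2-D convolutional layer before and after shrinking its kernel size and output-channel count. The specific layer sizes are hypothetical, chosen only to make the reduction visible.

```python
def conv_params(in_ch, out_ch, kh, kw, bias=True):
    """Parameter count of a 2-D convolution layer: one (in_ch x kh x kw)
    kernel per output channel, plus one bias per output channel."""
    return out_ch * (in_ch * kh * kw) + (out_ch if bias else 0)

# Hypothetical original layer: 256 -> 512 channels with 3x3 kernels.
before = conv_params(256, 512, 3, 3)

# Hypothetical compressed layer: fewer output channels, 1x1 kernels.
after = conv_params(256, 128, 1, 1)

print(before, after)  # → 1180160 32896
```

Even this simple substitution removes well over 97% of the layer's parameters, which is the kind of structural simplification the compression process aims for.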
S120: determine the currently pending structure in the network model to be compressed.
The currently pending structure is the network layer in the network model that needs to be compressed. After the network model to be compressed is obtained, the currently pending structure may be determined according to a preset rule. The preset rule may include a processing-order rule, for example determining the network layer structures that need compression one by one, in order from back to front.
Specifically, a single-layer structure in the network model may be determined as the currently pending structure, for example a single convolutional layer or a single pooling layer; processing one network layer at a time helps guarantee the accuracy of model compression. Alternatively, considering that the number of layers in a network model may be large, a multi-layer structure in the network model may be determined as the currently pending structure, such as a structure composed of multiple convolutional layers, or a structure composed of multiple convolutional layers and multiple pooling layers; this helps guarantee the efficiency of model compression.
S130: search a search space for a target network structure, and replace the currently pending structure with the target network structure to obtain a replaced network model.
In this application, when compressing the network model, neural architecture search (Neural Architecture Search, NAS) is mainly used to design the network architecture: the original network layers of the network model (the currently pending structure) are replaced by optimized, simplified layers (the target network structure). Determining the currently pending structure and searching for the target network structure may be performed by the controller of the neural architecture search, which searches the search space for the target network structure according to a search strategy.
The search strategy defines how the controller finds a suitable target network structure quickly and accurately; it may be any one of random search, Bayesian optimization, transfer learning, reinforcement learning, evolutionary algorithms, genetic algorithms, greedy algorithms and gradient-based algorithms. The search space is a set of network units, which include basic neural network units such as convolution kernels, activation functions, pooling units and RNN (Recurrent Neural Network) units. A convolution kernel (Convolution Kernel) is used in image processing: given an input image, each pixel of the output image is a weighted average of the pixels in a small region of the input image, where the weights are defined by a function called the convolution kernel. An activation function (Activation Function) is a function that runs on a neuron and maps the neuron's input to its output. A pooling (Pooling) unit is mainly used to reduce the dimensionality of parameters, compress the amount of data and parameters, reduce overfitting and improve the fault tolerance of the model. An RNN unit is a neural network unit for processing sequential data. The network units may also include other units, such as a dropout unit: dropout is a method of optimizing neural networks with deep structure by randomly zeroing some of the weights or outputs, reducing the interdependence between nodes, thereby regularizing (Regularization) the neural network and reducing its structural complexity.
The target network structure is a structure composed of network units from the search space, for example a structure composed of multiple convolution kernels, or a structure composed of multiple pooling units. After the currently pending structure is determined, the controller of the neural architecture search may search the search space for a target network structure according to the search strategy, and replace the currently pending structure with the target network structure to obtain the replaced network model.
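The search-and-replace step above can be sketched in miniature. The patent does not fix a particular search strategy, so the sketch uses random search over a toy search space; the structure names and parameter counts are hypothetical, and a real model would be a computation graph rather than a flat list.

```python
import random

# Toy search space of candidate replacement structures, each annotated
# with a hypothetical parameter count (names are illustrative only).
SEARCH_SPACE = [
    {"name": "conv3x3", "params": 9216},
    {"name": "conv1x1", "params": 1024},
    {"name": "pool",    "params": 0},
]

def search_and_replace(model, index, rng):
    """Random-search strategy: pick a candidate structure from the
    search space and substitute it for the structure at `index`,
    returning a new (replaced) model."""
    candidate = rng.choice(SEARCH_SPACE)
    replaced = list(model)
    replaced[index] = candidate
    return replaced

model = [{"name": "conv5x5", "params": 25600} for _ in range(3)]
rng = random.Random(0)  # seeded for repeatability
new_model = search_and_replace(model, 1, rng)
print(new_model[1]["name"])
```

In the full method this replacement is followed by the model-effect evaluation of step S140, and is re-attempted with a new candidate when the evaluation fails.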
S140: evaluate the model effect of the replaced network model using an L1-regularized loss function to obtain an evaluation result.
An L1-regularized loss function is a loss function to which L1 regularization is applied. A loss function (Loss Function) maps a random event, or the value of a related random variable, to a non-negative real number representing the "risk" or "loss" of that event; in this application it is used to measure the model effect, and may be chosen as the mean squared error, cross entropy, hinge loss and so on. L1 regularization refers to driving the weights of the parameters in the network model toward 0. In a network model, correlations between parameters increase the complexity of the model without improving its interpretability; it is therefore desirable to perform parameter selection so that the network model becomes easier to interpret. For example, suppose the cause of a trigger event A is to be analyzed through a network model and there are 1000 possible influencing factors; the analysis is then difficult. If, through training, the weights of some of the influencing factors become 0, the analysis based on the remaining factors becomes much easier. By applying L1 regularization, the parameters of the network model are selected automatically: the weight of the model concentrates on the parameters of high importance, while the weights of unimportant parameters quickly tend to 0, so that the network model becomes sparse, achieving the purpose of network model compression.
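The patent does not spell out the mechanism by which L1 regularization drives unimportant weights exactly to zero; the standard explanation is the soft-thresholding (proximal) operator that an L1 penalty induces during optimization, sketched below under that assumption.

```python
def soft_threshold(w, lam):
    """Proximal step for an L1 penalty with strength lam: shrink each
    weight toward zero, and set weights with magnitude below lam to
    exactly zero — this is what makes the model sparse."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# Hypothetical weights: large ones survive, small ones are zeroed.
weights = [0.9, -0.03, 0.005, -1.2, 0.04]
sparse = [soft_threshold(w, 0.05) for w in weights]
print(sparse)  # the three small weights become exactly 0.0
```

This is why L1 (unlike L2, which only shrinks weights) produces exact zeros that can then be pruned from the model.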
Specifically, let L denote the L1-regularized loss function, L0 the loss function without regularization, H the total number of network layers of the network model, and Wk, k = 1, 2, ..., H the weight vector of each layer. Assuming that both the number of layers of the network model and the parameter amount of each layer are regularized, the L1-regularized loss function may take the form:
L = L0 + λ·H + Σ(k=1..H) μk·‖Wk‖1
where λ and μk, k = 1, 2, ..., H are regularization coefficients to be solved during the model effect evaluation.
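The regularized loss defined above can be computed directly. The sketch below follows the form given here (L = L0 + λ·H + Σ μk·‖Wk‖1); the weight values and coefficients are hypothetical.

```python
def l1_regularized_loss(base_loss, layer_weights, lam, mu):
    """L = L0 + lam*H + sum_k mu_k * ||W_k||_1, where H is the number
    of layers, lam penalizes depth, and each mu_k penalizes the L1
    norm of that layer's weight vector."""
    H = len(layer_weights)
    l1_terms = sum(m * sum(abs(w) for w in W)
                   for m, W in zip(mu, layer_weights))
    return base_loss + lam * H + l1_terms

# Two hypothetical layers with per-layer regularization coefficients.
W = [[0.5, -0.25], [1.0, -1.0, 0.5]]
L = l1_regularized_loss(base_loss=0.8, layer_weights=W,
                        lam=0.01, mu=[0.1, 0.2])
print(round(L, 4))  # → 1.395
```

Here 0.8 + 0.01·2 + 0.1·0.75 + 0.2·2.5 = 1.395: the depth term and the per-layer L1 norms both contribute to the loss, so minimizing L trades accuracy against model size.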
After the currently pending structure has been replaced by the target network structure and the replaced network model has been obtained, the controller of the neural architecture search evaluates the model effect of the replaced network model using the L1-regularized loss function. The evaluation process includes applying L1 regularization to the parameters of the network model, so that while the evaluation result is obtained, the number of parameters of the network model is reduced and the structural complexity of the model decreases.
S150: when the evaluation result meets a preset requirement, return to the step of determining the currently pending structure in the network model to be compressed, until every pending structure in the network model to be compressed has been processed, and take the finally obtained replaced network model as the compressed network model.
After the evaluation result is obtained, the controller of the neural architecture search judges whether the evaluation result of the replaced model meets the preset requirement. If not, the controller re-searches for a new target network structure and repeats the above process of structure replacement and model effect evaluation, until a satisfactory target network structure is found. The process of target network structure search, structure replacement and model effect evaluation performed by the controller can be regarded as a controller iteration (Controller Epoch). During the controller iterations, when the number of iterations reaches a preset number (for example 2000) and no satisfactory target network structure has been found, the original structure of the currently pending structure may be kept unchanged, i.e. no structure replacement is performed on it.
It should be noted that in this application the compression of the network model is an iterative process, in which a single pass consists of performing structure replacement on all currently pending structures in the network model and evaluating the model effect of the corresponding replaced network models. When the iterative process meets a preset iteration-completion condition, the compression of the network model can be confirmed as complete; the preset iteration-completion condition includes at least one of a preset number of iterations and a preset iteration duration.
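The controller-iteration logic described above, including the fallback of keeping the original structure after the preset number of failed attempts, can be sketched as follows. The `evaluate` and `search` callables stand in for the model-effect evaluation and the search strategy; the structure names are hypothetical.

```python
def compress_structure(original, evaluate, search, max_iters=2000):
    """Controller loop sketch: keep searching for a replacement until
    a candidate meets the preset requirement; after max_iters failed
    attempts, keep the original structure unchanged."""
    for _ in range(max_iters):
        candidate = search()
        if evaluate(candidate):
            return candidate
    return original  # no acceptable replacement was found

# Hypothetical demo: only "conv1x1" passes the evaluation.
candidates = iter(["conv5x5", "conv3x3", "conv1x1"])
found = compress_structure(
    original="conv7x7",
    evaluate=lambda c: c == "conv1x1",
    search=lambda: next(candidates),
    max_iters=3,
)

# If no candidate ever satisfies the requirement, the original is kept.
kept = compress_structure(
    original="conv7x7",
    evaluate=lambda c: False,
    search=lambda: "conv3x3",
    max_iters=5,
)
print(found, kept)  # → conv1x1 conv7x7
```

The outer iteration over all pending structures, and the pass-level completion condition (iteration count or duration), would wrap this per-structure loop.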
This embodiment provides a network model compression method in which the network model is compressed through neural architecture search, preserving the accuracy of the network model; at the same time, L1 regularization effectively reduces the number of parameters and the complexity of the network model, thereby effectively mitigating structural redundancy.
In one embodiment, the network model to be compressed is a deep neural network (Deep Neural Network, DNN). A neural network generally includes an input layer, hidden layers and an output layer; a deep neural network is a neural network containing multiple hidden layers. Since the number of hidden layers is large, a deep neural network has a relatively large number of parameters, and it can therefore be compressed to reduce its number of parameters and its structural complexity.
When compressing a deep neural network, determining the currently pending structure in the network model to be compressed comprises: determining a single-layer or multi-layer structure in the network model as the currently pending structure in the reverse order of output layer, hidden layers, input layer.
Specifically, Fig. 3 is a schematic diagram of the order in which the controller determines the currently pending structure. The deep neural network includes an input layer A, hidden layers B and an output layer C, where the hidden layers B comprise layers B1 through Bn; the more complex the structure of the network model, the larger n is. The arrow in the figure indicates the direction in which the currently pending structure is determined: the processor first determines the output layer C of the deep neural network as the currently pending structure; then, among the multiple hidden layers B, currently pending structures are determined in order from the hidden layer Bn connected to the output layer to the hidden layer B1 connected to the input layer; finally, the input layer A of the deep neural network is determined as the currently pending structure. Since the output of a neural network carries a certain error, processing in the reverse order of the network structure, i.e. optimizing the structure and updating the weights starting from the output layer, can reduce the error and improve model accuracy.
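The reverse processing order described above amounts to visiting the layers back to front, which can be stated in a few lines. The layer names mirror the A/B1...Bn/C labels of Fig. 3.

```python
def processing_order(layers):
    """Yield structures from the output layer back to the input layer,
    so structure optimization and weight updates start where the
    network's output error is introduced."""
    return list(reversed(layers))

layers = ["input_A", "hidden_B1", "hidden_B2", "hidden_Bn", "output_C"]
print(processing_order(layers))
# → ['output_C', 'hidden_Bn', 'hidden_B2', 'hidden_B1', 'input_A']
```

Grouping consecutive entries of this order into multi-layer structures gives the coarser-grained pending structures used early in compression.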
The controller may decide whether a single-layer or multi-layer structure is the currently pending structure according to the compression progress. For example, in the early stage of model compression, a multi-layer structure with more layers may be chosen as the currently pending structure to accelerate compression; in the middle stage, the number of layers in the multi-layer structure may be gradually decreased; in the late stage, a single-layer structure may be chosen as the currently pending structure to guarantee compression accuracy. The controller may also decide according to the weight of each layer structure: for a network layer with a larger weight, the corresponding single-layer structure may be determined as the currently pending structure; for network layers with smaller weights, the corresponding layers may first be combined into a multi-layer structure, which is then determined as the currently pending structure. In addition, when determining the currently pending structure, the processor may also determine the specific number of layers according to the search strategy.
In one embodiment, the neural architecture search can use an Efficient Neural Architecture Search (ENAS) model. ENAS defines the concept of a node (Node). A node functions similarly to a network layer in a network model; the difference is that network layers depend on one another, i.e., each network layer is attached after the previous one, whereas a node can arbitrarily change its preceding input, as shown in Fig. 4. What efficient neural architecture search needs to learn and select is the connection relationship between nodes: different connections produce different network structures, and selecting the optimal structure from among these different network structures amounts to "designing" a new network model.
With reference to Fig. 4, the first network model formed by connecting network layers with straight lines and the second network model formed by connecting nodes with crossing lines are not identical; the two are different computation graphs (Graph), and the checkpoints (Checkpoint) of the reduced model weights of the first network model cannot be directly imported into the second network model. Intuitively, however, the positions of the nodes do not change: if the input and output tensors (Tensor) and shapes (Shape) remain unchanged, the number of weights of these nodes is the same; that is, the weight of each network layer on the left can be copied to the corresponding node on the right. Based on this principle, ENAS can achieve the purpose of weight sharing. Specifically, a fixed number of nodes is defined first, and then a group of parameters controls which preceding node each node connects to; this group of parameters ultimately constitutes the selection parameters used to indicate the fixed network structure, and it can be tuned and selected through an optimization algorithm such as Bayesian optimization or DQN (Deep Q-Learning).
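The weight-sharing principle can be sketched in a few lines. This is a deliberately minimal model of the idea — real ENAS uses an RNN controller, full operation sets, and trained tensors; the class name, the scalar "weights", and the connection encoding below are illustrative assumptions only.

```python
import random

class ENASSearchSpace:
    """Sketch of ENAS-style weight sharing: a fixed set of nodes, where
    each node's predecessor is chosen by a controller parameter, and the
    weight of every (predecessor -> node) edge is stored once and reused
    by every sampled architecture that selects that edge."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.shared_weights = {}  # one shared weight per possible edge

    def edge_weight(self, pred, node):
        # create the edge weight lazily; reuse it on every later lookup
        key = (pred, node)
        if key not in self.shared_weights:
            self.shared_weights[key] = random.random()
        return self.shared_weights[key]

    def build(self, connections):
        """connections[i] is the predecessor index of node i+1.
        Returns the list of edge weights the sampled architecture uses."""
        return [self.edge_weight(pred, node + 1)
                for node, pred in enumerate(connections)]

space = ENASSearchSpace(num_nodes=4)
arch_a = space.build([0, 1, 2])   # chain 0 -> 1 -> 2 -> 3
arch_b = space.build([0, 0, 1])   # node 2 instead reads node 0
# the (0 -> 1) edge weight is shared by both sampled architectures
assert arch_a[0] == arch_b[0]
```

The list passed to `build` plays the role of the "group of parameters" in the text: it fixes the network structure, while the weights themselves live in the shared pool and survive across architecture samples.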
In one embodiment, determining, according to the reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure comprises: according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, determining that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
Specifically, as shown in Fig. 5, when determining the currently pending structure, the controller can also take into account the model effect assessment result of the network model after the last replacement; in Fig. 5, the dotted box indicates the determined currently pending structure. For example, suppose the structures of layers Bi and Bi-1 of the network model are similar: if, after performing a structure replacement on layer Bi, the controller judges that the model effect assessment result of the replaced network model is good, it can continue to perform a structure replacement on layer Bi-1, thereby further improving the model effect. By combining historical replacement record information when processing a new network layer, the replacement efficiency and model effect of the controller can be improved.
In one embodiment, the target network structure includes one or more of basic neural network units and random inactivation units. When the controller searches the search space for a target network structure, it may select units of different types and combine them into a target network structure, for example, a target network structure composed of a pooling unit and a convolution kernel; it may also select multiple units of the same type and combine them into a target network structure, for example, a target network structure composed of multiple convolution kernels; in addition, units of the same type may have parameters of different specifications, for example, a target network structure composed of multiple convolution kernels of different sizes and different strides, or a target network structure composed of pooling units of different pooling sizes, pooling strides, and pooling types, etc.
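The kinds of combinations just listed can be enumerated as a toy search space. The specific kernel sizes, strides, and pooling types below are illustrative assumptions, not values taken from the patent.

```python
from itertools import product

def candidate_structures():
    """Enumerate a toy search space of target network structures built
    from basic units: convolutions of several kernel sizes and strides,
    pooling units of two types, and two-unit combinations. All specific
    sizes here are illustrative assumptions."""
    convs = [{"unit": "conv", "kernel": k, "stride": s}
             for k, s in product([1, 3, 5], [1, 2])]   # 6 conv variants
    pools = [{"unit": "pool", "type": t, "size": 2}
             for t in ("max", "avg")]                  # 2 pooling variants
    singles = convs + pools
    # two-unit combinations: a pooling unit followed by a convolution
    pairs = [[p, c] for p in pools for c in convs]
    return [[u] for u in singles] + pairs

space = candidate_structures()  # 8 single-unit + 12 two-unit structures
```

A real controller would then sample one candidate from such a space (or a much larger one) as the target network structure used in the replacement step.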
In one embodiment, when the target network structure includes a random inactivation unit, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result comprises: after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
The principle of random inactivation is to randomly set some of the parameters to a failure state during each iteration, so that in each iteration only part of the parameters in the network model are in an effective state. Thus, as the number of iterations increases, since different parameters may fail at different times, the sensitivity between model parameters is reduced.
As shown in Fig. 6, taking layers Bi-1 and Bi as an example, the case where the target network structure includes a random inactivation unit is illustrated. In the network model before replacement, layer Bi-1 includes parameters M1, M2, M3, etc., and layer Bi includes parameters N1, N2, N3, N4, etc. After layer Bi is replaced with a target network structure that includes a random inactivation unit, parameters M1, M2, and M3 have channel connections to the random inactivation unit. At this point, when performing model effect assessment, the random inactivation unit internally and randomly sets one or more parameters to an invalid state; for example, if parameter M2 is set to the invalid state, the random inactivation unit uses only parameters M1 and M3. It should be noted that this setting of the invalid parameter state takes place only inside the random inactivation unit; that is, the random inactivation unit does not affect the actual state of the parameters of layer Bi-1.
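The key point of the example above — the mask is applied inside the unit, leaving the original layer parameters untouched — can be sketched directly. The parameter names M1–M3 follow the text; the drop probability and seed are illustrative assumptions.

```python
import random

def random_inactivation(params, drop_prob, rng=None):
    """Sketch of the random inactivation unit described above: in each
    evaluation pass, some incoming connection parameters (e.g. M1, M2,
    M3) are set to a failed state *inside* the unit. The caller's
    parameter dictionary keeps its true values."""
    rng = rng or random.Random(0)   # seeded here only for reproducibility
    active = dict(params)           # copy: never mutate the real layer
    for name in params:
        if rng.random() < drop_prob:
            active[name] = 0.0      # failed state: excluded from this pass
    return active

layer_params = {"M1": 0.5, "M2": -1.2, "M3": 0.8}
masked = random_inactivation(layer_params, drop_prob=0.5)
# the original Bi-1 parameters are unaffected by the inactivation
assert layer_params == {"M1": 0.5, "M2": -1.2, "M3": 0.8}
```

Each call produces a fresh mask, which is what makes repeated evaluation passes see different effective parameter subsets.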
In one embodiment, as shown in Fig. 7, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result includes steps S142 to S146.
S142: obtain sample data, the sample data including training sample data and test sample data;
S144: train the replaced network model using the training sample data to obtain a trained network model;
S146: according to the test sample data, perform model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
Optionally, as shown in Fig. 8, performing model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result includes steps S1462 to S1464.
S1462: perform L1 regularization processing on each network layer of the trained network model through the L1 regularization loss function, wherein the number of parameters with weight 0 in the trained network model after processing is greater than the number of parameters with weight 0 in the trained network model before processing;
S1464: perform model effect assessment on the processed trained network model to obtain the assessment result.
Specifically, after obtaining the whole sample data uploaded by the user, the whole sample data can be shuffled and split to obtain the training sample data and the test sample data; alternatively, pre-divided training sample data and test sample data can be obtained directly. Then, the replaced network model is trained using the training sample data; the training process specifically includes alternating updates of the parameters and the parameter weights, etc., until the replaced network model meets a convergence condition. After training, model effect assessment is performed using the test sample data and the L1 regularization loss function to obtain the assessment result, so that it can be determined from the assessment result whether the current structure replacement operation yields a better model effect than the original structure.
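The evaluation step S146/S1462 can be sketched as a loss with an L1 penalty plus a sparsity count. This is a minimal scalar illustration; the coefficient `lam` and the zero tolerance are illustrative assumptions, and a real implementation would apply the penalty per network layer during training.

```python
def l1_evaluate(weights, task_loss, lam=0.01, zero_tol=1e-6):
    """Sketch of the L1-regularized effect assessment: the total loss
    adds an L1 penalty on the weights (which, when optimized, drives
    small weights toward zero), and the count of (near-)zero weights
    serves as a simple sparsity/complexity signal."""
    l1_penalty = lam * sum(abs(w) for w in weights)
    total_loss = task_loss + l1_penalty
    zero_count = sum(1 for w in weights if abs(w) < zero_tol)
    return {"loss": total_loss, "zero_weights": zero_count}

result = l1_evaluate([0.0, 0.5, -0.25, 0.0], task_loss=1.0, lam=0.1)
# penalty = 0.1 * (0.5 + 0.25) = 0.075, so loss = 1.075; two zero weights
```

The zero-weight count is what makes the post-processing claim in S1462 checkable: after L1 processing it should be no smaller than before.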
In one embodiment, the assessment result includes an accuracy evaluation result and a complexity evaluation result.
Model accuracy can be used to measure how well the model performs its function. For example, when the model is applied to image recognition, model accuracy may specifically refer to the image recognition rate; when the model is applied to image classification, model accuracy may specifically refer to the classification accuracy, etc. The accuracy evaluation result can be obtained by using the test sample data as the input of the trained network model and calculating, through a loss function, the loss value between the output of the trained network model and the expected result.
Model complexity can be divided into time complexity and space complexity. Time complexity determines the training/prediction time of the model; if the time complexity is too high, model training and prediction will take considerable time, making it impossible both to quickly verify ideas and improve the model, and to achieve fast prediction. Space complexity determines the number of parameters of the model. Owing to the curse of dimensionality (Curse of Dimensionality), the more parameters a model has, the larger the amount of data required to train it; since data sets in practice are usually not very large, this makes the model more prone to overfitting during training. Time complexity can specifically be reflected by the number of operations of the model, and space complexity can specifically be reflected by the size of the model itself.
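For a convolution layer, both quantities have standard back-of-envelope formulas, used here as a sketch of the two complexity measures; the example layer shape is an arbitrary illustration.

```python
def conv_complexity(in_ch, out_ch, kernel, out_h, out_w):
    """Rough complexity of one square-kernel conv layer:
    - parameter count approximates space complexity (weights + biases);
    - multiply-accumulate (MAC) count approximates time complexity,
      i.e. kernel work repeated at every output position."""
    params = out_ch * (in_ch * kernel * kernel + 1)          # +1 per-filter bias
    macs = out_ch * in_ch * kernel * kernel * out_h * out_w  # operations
    return params, macs

# e.g. a 3x3 conv from 3 to 16 channels on a 32x32 output map
params, macs = conv_complexity(in_ch=3, out_ch=16, kernel=3,
                               out_h=32, out_w=32)
# params = 16 * (3*9 + 1) = 448; macs = 16 * 3 * 9 * 32 * 32 = 442368
```

Summing these per-layer figures over the whole model gives the operation count and model size that the text proposes as complexity signals.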
In one embodiment, replacing the currently pending structure with the target network structure to obtain the replaced network model comprises: when a historical target network structure whose structure is consistent with that of the target network structure exists, setting the weight of the historical target network structure as the initial weight of the target network structure, and then replacing the currently pending structure with the target network structure to obtain the replaced network model.
Specifically, in the compression process of the network model, based on the principle of parameter weight sharing, historical structure replacement information can be combined to improve processing efficiency when performing structure replacement on the network model. For example, suppose that in the i-th processing pass the target network structure S is used to replace the currently pending structure, the model effect assessment result of the replaced model is good, and the weight determined for the target network structure S in that pass is αS. Then, in the j-th processing pass, if the target network structure S is reused for structure replacement, αS can be used directly as the initial weight of the target network structure S in the j-th pass. Normally, the initial weight of a new network structure is taken as 0 or a random value, and using 0 or a random value as the initial weight requires spending a long time retraining the model; directly using the weight αS previously determined in the i-th pass as the initial weight in the j-th pass is equivalent to continuing training on top of the earlier training result, which can effectively reduce the model training time and improve training efficiency.
In the j-th processing pass, after a new currently pending structure is replaced with the target network structure S, model effect assessment on the newly replaced network model can be performed directly based on the initial weight αS.
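The weight-reuse mechanism amounts to a cache keyed by structure identity. A minimal sketch, assuming a string signature for each structure and a scalar weight; both are illustrative simplifications of the real per-layer weight tensors.

```python
class WeightCache:
    """Sketch of the historical-weight reuse described above: when a
    target network structure S was already trained in an earlier pass,
    its determined weight becomes the initial weight the next time S is
    selected, instead of starting from zero or a random value."""

    def __init__(self):
        self.history = {}

    def initial_weight(self, signature, default=0.0):
        # reuse the historical weight if this exact structure was seen
        return self.history.get(signature, default)

    def record(self, signature, trained_weight):
        self.history[signature] = trained_weight

cache = WeightCache()
cache.record("conv3x3-stride1", 0.42)         # pass i: weight determined for S
w0 = cache.initial_weight("conv3x3-stride1")  # pass j: warm start from 0.42
```

Structures never seen before fall back to the default initialization, matching the "0 or random value" case in the text.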
It should be understood that, although the steps in the flowcharts involved in the above embodiments are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages that are not necessarily completed at the same moment but may be executed at different times; the execution order of these sub-steps or stages is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 9, a network model compression apparatus is provided, which includes: a model acquisition module 110, a structure determination module 120, a structure replacement module 130, and an effect evaluation module 140.
The model acquisition module 110 is used to obtain a network model to be compressed;
the structure determination module 120 is used to determine a currently pending structure in the network model to be compressed;
the structure replacement module 130 is used to search a search space for a target network structure and replace the currently pending structure with the target network structure to obtain a replaced network model;
the effect evaluation module 140 is used to perform model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result; when the assessment result meets a preset requirement, the step of determining the currently pending structure in the network model to be compressed is returned to, until every structure to be processed in the network model to be compressed has been processed, and the finally obtained replaced network model is taken as the compressed network model.
The present application provides a network model compression apparatus that compresses a network model through neural architecture search, which can guarantee the accuracy of the network model; meanwhile, L1 regularization processing can effectively reduce the parameter amount of the network model and lower its complexity, thereby effectively alleviating the problem of network structure redundancy.
In one embodiment, the structure determination module 120 is used to: determine, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
In one embodiment, the structure determination module 120 is used to: determine, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
In one embodiment, the effect evaluation module 140 is used to: when the target network structure includes a random inactivation unit, after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, perform model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
In one embodiment, the effect evaluation module 140 is used to: obtain sample data, the sample data including training sample data and test sample data; train the replaced network model using the training sample data to obtain a trained network model; and, according to the test sample data, perform model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
In one embodiment, the structure replacement module 130 is used to: when a historical target network structure whose structure is consistent with that of the target network structure exists, set the weight of the historical target network structure as the initial weight of the target network structure, and then replace the currently pending structure with the target network structure to obtain the replaced network model.
Fig. 10 shows an internal structure diagram of the computer device in one embodiment. The computer device may specifically be the terminal 10 (or the server 20) in Fig. 1. As shown in Fig. 10, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the network model compression method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to execute the network model compression method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device of the computer device may be a touch layer covering the display screen, may be a key, trackball, or touchpad provided on the housing of the computer device, or may be an external keyboard, touchpad, or mouse, etc.
Those skilled in the art will understand that the structure shown in Fig. 10 is only a block diagram of part of the structure relevant to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
In one embodiment, the network model compression apparatus provided by the present application can be implemented in the form of a computer program, and the computer program can run on the computer device shown in Fig. 10. The memory of the computer device can store the program modules constituting the network model compression apparatus, for example, the model acquisition module, structure determination module, structure replacement module, and effect evaluation module shown in Fig. 9. The computer program constituted by these program modules causes the processor to execute the steps of the network model compression method of each embodiment of the present application described in this specification.
For example, the computer device shown in Fig. 10 can obtain the network model to be compressed through the model acquisition module of the network model compression apparatus shown in Fig. 9; determine the currently pending structure in the network model to be compressed through the structure determination module; search a search space for a target network structure through the structure replacement module and replace the currently pending structure with the target network structure to obtain a replaced network model; and, through the effect evaluation module, perform model effect assessment on the replaced network model using the L1 regularization loss function to obtain an assessment result. When the assessment result meets a preset requirement, the computer device returns to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and takes the finally obtained replaced network model as the compressed network model.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to execute the steps of the above network model compression method. The steps of the network model compression method here may be the steps in the network model compression method of each of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to execute the steps of the above network model compression method. The steps of the network model compression method here may be the steps in the network model compression method of each of the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program, and the program can be stored in a non-volatile computer-readable storage medium; when executed, the program may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), memory-bus (Rambus) direct RAM (RDRAM), direct memory-bus dynamic RAM (DRDRAM), and memory-bus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments have been described; however, as long as no contradiction exists in a combination of these technical features, the combination should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the patent scope of the present application. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application patent shall be subject to the appended claims.
Claims (15)
1. A network model compression method, characterized by comprising:
obtaining a network model to be compressed;
determining a currently pending structure in the network model to be compressed;
searching a search space for a target network structure, and replacing the currently pending structure with the target network structure to obtain a replaced network model;
performing model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result;
when the assessment result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and taking the finally obtained replaced network model as a compressed network model.
2. The method according to claim 1, characterized in that the network model to be compressed is a deep neural network, and determining the currently pending structure in the network model to be compressed comprises:
determining, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
3. The method according to claim 2, characterized in that determining, according to the reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure comprises:
determining, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
4. The method according to claim 1, characterized in that the target network structure includes one or more of basic neural network units and random inactivation units.
5. The method according to claim 4, characterized in that, when the target network structure includes a random inactivation unit, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result comprises:
after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result.
6. The method according to claim 1, characterized in that performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result comprises:
obtaining sample data, the sample data including training sample data and test sample data;
training the replaced network model using the training sample data to obtain a trained network model;
performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
7. The method according to claim 1 or 6, characterized in that the assessment result includes an accuracy evaluation result and a complexity evaluation result.
8. The method according to claim 6, characterized in that performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result comprises:
performing L1 regularization processing on each network layer of the trained network model through the L1 regularization loss function, wherein the number of parameters with weight 0 in the trained network model after processing is greater than the number of parameters with weight 0 in the trained network model before processing;
performing model effect assessment on the processed trained network model to obtain the assessment result.
9. The method according to claim 1, characterized in that replacing the currently pending structure with the target network structure to obtain the replaced network model comprises:
when a historical target network structure whose structure is consistent with that of the target network structure exists, setting the weight of the historical target network structure as the initial weight of the target network structure, and then replacing the currently pending structure with the target network structure to obtain the replaced network model.
10. A network model compression apparatus, characterized by comprising:
a model acquisition module, for obtaining a network model to be compressed;
a structure determination module, for determining a currently pending structure in the network model to be compressed;
a structure replacement module, for searching a search space for a target network structure and replacing the currently pending structure with the target network structure to obtain a replaced network model;
an effect evaluation module, for performing model effect assessment on the replaced network model using an L1 regularization loss function to obtain an assessment result; when the assessment result meets a preset requirement, returning to the step of determining the currently pending structure in the network model to be compressed, until every structure to be processed in the network model to be compressed has been processed, and taking the finally obtained replaced network model as a compressed network model.
11. The apparatus according to claim 10, characterized in that the structure determination module is used to realize either of the following:
first: determining, according to a reverse order of output layer, hidden layers, and input layer, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure;
second: determining, according to the reverse order of output layer, hidden layers, and input layer, and based on the assessment result of the replaced network model from the last replacement, that a single-layer structure or multi-layer structure in the network model to be compressed is the currently pending structure.
12. The apparatus according to claim 10, characterized in that the effect evaluation module is used to realize either of the following:
first: when the target network structure includes a random inactivation unit, after randomly setting one or more of the parameters in the replaced network model that connect to the target network structure to a failure state, performing model effect assessment on the replaced network model using the L1 regularization loss function to obtain the assessment result;
second: obtaining sample data, the sample data including training sample data and test sample data; training the replaced network model using the training sample data to obtain a trained network model; performing, according to the test sample data, model effect assessment on the trained network model through the L1 regularization loss function to obtain the assessment result.
13. The apparatus according to claim 10, characterized in that the structure replacement module is used to: when a historical target network structure whose structure is consistent with that of the target network structure exists, set the weight of the historical target network structure as the initial weight of the target network structure, and then replace the currently pending structure with the target network structure to obtain the replaced network model.
14. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the processor is caused to execute the steps of the method according to any one of claims 1 to 9.
15. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that, when the computer program is executed by the processor, the processor is caused to execute the steps of the method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910770193.1A CN110490323A (en) | 2019-08-20 | 2019-08-20 | Network model compression method, device, storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490323A true CN110490323A (en) | 2019-11-22 |
Family
ID=68552342
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910770193.1A Pending CN110490323A (en) | 2019-08-20 | 2019-08-20 | Network model compression method, device, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490323A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109284820A (en) * | 2018-10-26 | 2019-01-29 | 北京图森未来科技有限公司 | Structure search method and device for a deep neural network |
EP3493120A1 (en) * | 2017-12-01 | 2019-06-05 | Koninklijke Philips N.V. | Training a neural network model |
CN109948783A (en) * | 2019-03-29 | 2019-06-28 | 中国石油大学(华东) | Topology expansion method based on an attention mechanism |
CN110020667A (en) * | 2019-02-21 | 2019-07-16 | 广州视源电子科技股份有限公司 | Neural network structure search method, system, storage medium and device |
Non-Patent Citations (1)
Title |
---|
BARRET ZOPH: "Learning Transferable Architectures for Scalable Image Recognition", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 16 December 2018 (2018-12-16), pages 1 - 14 *
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275190A (en) * | 2020-02-25 | 2020-06-12 | 北京百度网讯科技有限公司 | Neural network model compression method and device, image processing method and processor |
CN111275190B (en) * | 2020-02-25 | 2023-10-10 | 北京百度网讯科技有限公司 | Compression method and device of neural network model, image processing method and processor |
CN111488986A (en) * | 2020-04-13 | 2020-08-04 | 商汤集团有限公司 | Model compression method, image processing method and device |
CN111526054B (en) * | 2020-04-21 | 2022-08-26 | 北京百度网讯科技有限公司 | Method and device for acquiring network |
CN111526054A (en) * | 2020-04-21 | 2020-08-11 | 北京百度网讯科技有限公司 | Method and device for acquiring network |
CN113658091A (en) * | 2020-05-12 | 2021-11-16 | Tcl科技集团股份有限公司 | Image evaluation method, storage medium and terminal equipment |
CN111709516A (en) * | 2020-06-09 | 2020-09-25 | 深圳先进技术研究院 | Compression method and compression device of neural network model, storage medium and equipment |
CN111709516B (en) * | 2020-06-09 | 2023-07-28 | 深圳先进技术研究院 | Compression method and compression device, storage medium and equipment of neural network model |
CN111985644A (en) * | 2020-08-28 | 2020-11-24 | 北京市商汤科技开发有限公司 | Neural network generation method and device, electronic device and storage medium |
CN111985644B (en) * | 2020-08-28 | 2024-03-08 | 北京市商汤科技开发有限公司 | Neural network generation method and device, electronic equipment and storage medium |
US11790039B2 (en) * | 2020-10-29 | 2023-10-17 | EMC IP Holding Company LLC | Compression switching for federated learning |
US20220138498A1 (en) * | 2020-10-29 | 2022-05-05 | EMC IP Holding Company LLC | Compression switching for federated learning |
CN112434725A (en) * | 2020-10-30 | 2021-03-02 | 四川新网银行股份有限公司 | Model compression method deployed to HTML5 |
CN112434725B (en) * | 2020-10-30 | 2023-06-09 | 四川新网银行股份有限公司 | Model compression method deployed to HTML5 |
CN112465115A (en) * | 2020-11-25 | 2021-03-09 | 科大讯飞股份有限公司 | GAN network compression method, device, equipment and storage medium |
CN112465115B (en) * | 2020-11-25 | 2024-05-31 | 科大讯飞股份有限公司 | GAN network compression method, device, equipment and storage medium |
CN114692816A (en) * | 2020-12-31 | 2022-07-01 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN114692816B (en) * | 2020-12-31 | 2023-08-25 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN113673694A (en) * | 2021-05-26 | 2021-11-19 | 阿里巴巴新加坡控股有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN113657592A (en) * | 2021-07-29 | 2021-11-16 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
CN113657592B (en) * | 2021-07-29 | 2024-03-05 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
WO2023071766A1 (en) * | 2021-10-28 | 2023-05-04 | 中兴通讯股份有限公司 | Model compression method, model compression system, server, and storage medium |
CN115543945A (en) * | 2022-11-29 | 2022-12-30 | 支付宝(杭州)信息技术有限公司 | Model compression method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490323A (en) | Network model compression method, device, storage medium and computer equipment | |
US11928574B2 (en) | Neural architecture search with factorized hierarchical search space | |
US10755026B1 (en) | Circuit design including design rule violation correction utilizing patches based on deep reinforcement learning | |
CN111126668B (en) | Spark operation time prediction method and device based on graph convolution network | |
WO2020214428A1 (en) | Using hyperparameter predictors to improve accuracy of automatic machine learning model selection | |
US20210081798A1 (en) | Neural network method and apparatus | |
US20210312261A1 (en) | Neural network search method and related apparatus | |
CN113762486B (en) | Method and device for constructing fault diagnosis model of converter valve and computer equipment | |
CN109313720 (en) | Memory-augmented neural network with sparse access to external memory |
CN110443165A (en) | Neural network quantization method, image recognition method, device and computer equipment |
Ayodeji et al. | Causal augmented ConvNet: A temporal memory dilated convolution model for long-sequence time series prediction | |
CN113239168B (en) | Interpretive method and system based on knowledge graph embedded prediction model | |
CN110990135A (en) | Spark operation time prediction method and device based on deep migration learning | |
Feng et al. | Finite strain FE2 analysis with data-driven homogenization using deep neural networks | |
CN116897356A (en) | Operator scheduling run time comparison method, device and storage medium | |
EP4009239A1 (en) | Method and apparatus with neural architecture search based on hardware performance | |
JP7024881B2 (en) | Pattern recognition device and pattern recognition method | |
US11675951B2 (en) | Methods and systems for congestion prediction in logic synthesis using graph neural networks | |
US20230090720A1 (en) | Optimization for artificial neural network model and neural processing unit | |
US11875263B2 (en) | Method and apparatus for energy-aware deep neural network compression | |
CN110825903A (en) | Visual question-answering method for improving Hash fusion mechanism | |
Joshi et al. | Area efficient VLSI ASIC implementation of multilayer perceptrons | |
US20220318684A1 (en) | Sparse ensembling of unsupervised models | |
CN115204463A (en) | Residual service life uncertainty prediction method based on multi-attention machine mechanism | |
CN114707718A (en) | GAT-LSTM-based information cascade prediction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||