CN108229646A - Neural network model compression method, apparatus, storage medium and electronic device - Google Patents
Neural network model compression method, apparatus, storage medium and electronic device
- Publication number
- CN108229646A CN108229646A CN201710671900.2A CN201710671900A CN108229646A CN 108229646 A CN108229646 A CN 108229646A CN 201710671900 A CN201710671900 A CN 201710671900A CN 108229646 A CN108229646 A CN 108229646A
- Authority
- CN
- China
- Prior art keywords
- network model
- network
- neural network
- second neural network
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
Embodiments of the present invention provide a neural network model compression method and apparatus, a storage medium, and an electronic device. The method includes: obtaining a first neural network model; keeping or increasing the depth of the first neural network model while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and training the second neural network model based on a sample data set and at least according to the output of the first neural network model. A compressed neural network model with strong feature extraction capability and performance comparable to the uncompressed neural network model can thereby be obtained by training. Moreover, the training method is general-purpose and applicable to neural network models implementing any function.
Description
Technical field
Embodiments of the present invention relate to artificial intelligence technology, and in particular to a neural network model compression method and apparatus, a computer storage medium, and an electronic device.
Background technology
In recent years, deep neural networks have been applied to many tasks such as computer vision and natural language processing, bringing breakthrough improvements in performance. In deep neural network models, strong expressive capability is obtained in exchange for a large number of network weights, and thereby stronger performance. For example, the AlexNet model exceeds 200 MB in size, and the VGG-16 model exceeds 500 MB. However, because of their depth and number of network parameters, the feed-forward time of deep neural networks is usually long, which to some extent limits the application of neural networks on devices with limited computing resources.
Invention content
Embodiments of the present invention propose a neural network model compression scheme.
According to a first aspect of the embodiments of the present invention, a neural network model compression method is provided, including: obtaining a first neural network model; keeping or increasing the depth of the first neural network model while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and training the second neural network model based on a sample data set and at least according to the output of the first neural network model.
Optionally, the at least one network parameter includes: convolution kernel size and/or number of feature channels.
Optionally, increasing the depth of the first neural network model includes: in the process of constructing the second neural network, equivalently replacing a convolutional layer with a larger convolution kernel in the first neural network model with multiple convolutional layers with smaller convolution kernels, according to the receptive field of that convolutional layer.
Optionally, training the second neural network model based on the sample data and at least according to the output of the first neural network model includes: inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model; and adjusting the network parameters of the second neural network model according to the first difference.
Optionally, training the second neural network model based on the sample data and at least according to the output of the first neural network model includes: connecting the last network layer of the second neural network model to an auxiliary convolutional layer; inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining, respectively, a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and adjusting the network parameters of the second neural network model according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
Optionally, after the training of the second neural network model is completed, the auxiliary convolutional layer is removed.
Optionally, the at least one corresponding network layer includes: the last corresponding network layer of the second neural network model and the first neural network model.
Optionally, the at least one corresponding network layer includes: at least one intermediate network layer of the same depth in the second neural network model and the first neural network model.
Optionally, when the at least one corresponding network layer includes at least one intermediate network layer of the same depth, determining the first difference includes: adding at least one fitting branch to the second neural network, and determining, through the at least one fitting branch, the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model.
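A minimal sketch of the fitting-branch idea: the branch maps an intermediate output of the compressed model into the feature space of the uncompressed model so the two can be compared. The single linear map, its dimensions, and the squared-error measure are illustrative assumptions, not details fixed by the embodiment:

```python
def fit_branch(features, weights, bias):
    """A hypothetical fitting branch: one linear map projecting the second
    (compressed) model's intermediate features to the dimensionality of the
    first model's intermediate features."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def first_difference(second_mid, first_mid, weights, bias):
    """First difference between corresponding intermediate network layers,
    measured after the fitting branch aligns the feature dimensions."""
    projected = fit_branch(second_mid, weights, bias)
    return sum((p - t) ** 2 for p, t in zip(projected, first_mid)) / len(first_mid)

# 2-d compressed features projected into a 3-d teacher feature space:
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
diff = first_difference([1.0, 2.0], [1.0, 2.0, 3.0], w, b)
```

Here the projection happens to reproduce the teacher's features exactly, so the difference is zero; in practice the branch's weights are learned alongside the second model and discarded afterwards.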
Optionally, after the training of the second neural network model is completed, the at least one fitting branch is removed.
According to a second aspect of the embodiments of the present invention, a neural network model compression apparatus is also provided, including: an acquisition module for obtaining a first neural network model; a compression module for keeping or increasing the depth of the first neural network model obtained by the acquisition module while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and a training module for training the second neural network model based on a sample data set and at least according to the output of the first neural network model.
Optionally, the at least one network parameter includes: convolution kernel size and/or number of feature channels.
Optionally, the compression module is configured to, in the process of constructing the second neural network, equivalently replace a convolutional layer with a larger convolution kernel in the first neural network model with multiple convolutional layers with smaller convolution kernels, according to the receptive field of that convolutional layer.
Optionally, the training module includes: a first output result acquiring unit for inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; a second output result acquiring unit for inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; a first difference acquiring unit for determining a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model; and a first parameter adjustment unit for adjusting the network parameters of the second neural network model according to the first difference determined by the difference acquiring unit.
Optionally, the training module includes: an auxiliary layer connection unit for connecting the last network layer of the second neural network model to an auxiliary convolutional layer; a third output result acquiring unit for inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; a fourth output result acquiring unit for inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; a second difference acquiring unit for determining, respectively, a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and a second parameter adjustment unit for adjusting the network parameters of the second neural network model according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
Optionally, the training module further includes: an auxiliary layer removal unit for removing the auxiliary convolutional layer after the training of the second neural network model is completed.
Optionally, the at least one corresponding network layer includes: the last corresponding network layer of the second neural network model and the first neural network model.
Optionally, the at least one corresponding network layer includes: at least one intermediate network layer of the same depth in the second neural network model and the first neural network model.
Optionally, the first difference acquiring unit is configured to add at least one fitting branch to the second neural network, and to determine, through the at least one fitting branch, the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model.
Optionally, the training module further includes: a fitting branch removal unit for removing the at least one fitting branch after the training of the second neural network model is completed.
According to a third aspect of the embodiments of the present invention, an electronic device is also provided, including: a processor, a memory, a communication element, and a communication bus, where the processor, the memory, and the communication element communicate with one another through the communication bus; the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the following operations: obtaining a first neural network model; keeping or increasing the depth of the first neural network model while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and training the second neural network model based on a sample data set and at least according to the output of the first neural network model.
Optionally, the at least one network parameter includes: convolution kernel size and/or number of feature channels.
Optionally, the executable instruction further causes the processor to perform the following operation: in the process of constructing the second neural network, equivalently replacing a convolutional layer with a larger convolution kernel in the first neural network model with multiple convolutional layers with smaller convolution kernels, according to the receptive field of that convolutional layer.
Optionally, the executable instruction further causes the processor to perform the following operations: inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model; and adjusting the network parameters of the second neural network model according to the first difference.
Optionally, the executable instruction further causes the processor to perform the following operations: connecting the last network layer of the second neural network model to an auxiliary convolutional layer; inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining, respectively, a first difference between the output results of at least one corresponding network layer of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and adjusting the network parameters of the second neural network model according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
Optionally, the executable instruction further causes the processor to perform the following operation: removing the auxiliary convolutional layer after the training of the second neural network model is completed.
Optionally, the at least one corresponding network layer includes: the last corresponding network layer of the second neural network model and the first neural network model.
Optionally, the at least one corresponding network layer includes: at least one intermediate network layer of the same depth in the second neural network model and the first neural network model.
Optionally, the executable instruction further causes the processor to perform the following operations: adding at least one fitting branch to the second neural network, and determining, through the at least one fitting branch, the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model.
Optionally, the executable instruction further causes the processor to perform the following operation: removing the at least one fitting branch after the training of the second neural network model is completed.
According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, on which computer program instructions are stored, where the program instructions, when executed by a processor, implement the steps of any neural network model compression method provided by the embodiments of the present application.
According to a fifth aspect of the embodiments of the present invention, a computer program is also provided, including computer program instructions, where the program instructions, when executed by a processor, implement the steps of any neural network model compression method provided by the embodiments of the present application.
According to the neural network model compression scheme provided by the embodiments of the present invention, at least one network parameter of at least one network layer of the neural network model to be compressed is compressed, to obtain a compressed neural network model. During compression, the depth of the original neural network model is kept or increased, so as to reduce as far as possible the influence of compression on the expressive capability of the network. Thereafter, the compressed neural network model is trained based on a sample data set and at least according to the output of the uncompressed neural network model, so that a compressed neural network model with strong feature extraction capability and performance comparable to the uncompressed neural network model is obtained by training. Moreover, the training method is general-purpose and applicable to neural network models implementing any function.
Description of the drawings
Fig. 1 is a flowchart of a neural network model compression method according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a neural network model compression method according to Embodiment 2 of the present invention;
Fig. 3 is a flowchart of a neural network model compression method according to Embodiment 3 of the present invention;
Fig. 4 is a logic diagram of a neural network model compression apparatus according to Embodiment 4 of the present invention;
Fig. 5 is a logic diagram of a neural network model compression apparatus according to Embodiment 5 of the present invention;
Fig. 6 is a logic diagram of a neural network model compression apparatus according to Embodiment 6 of the present invention;
Fig. 7 is a structural diagram of an electronic device according to Embodiment 7 of the present invention.
Specific embodiment
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
Fig. 1 is a flowchart of a neural network model compression method according to Embodiment 1 of the present invention.
Referring to Fig. 1, in step S110, a first neural network model is obtained.
The first neural network model here may be a trained neural network model. That is, the neural network model compression method according to Embodiment 1 of the present invention is applicable to compressing any general neural network model.
The embodiments of the present invention do not restrict how the first neural network model is trained; it may be trained in advance using any conventional network training method. According to the function to be realized by the first neural network model, its characteristics, and the training requirements, the first neural network model may be pre-trained based on, for example, a supervised learning method, an unsupervised method, a reinforcement learning method, or a semi-supervised method.
For example, a supervised learning method may be used to train a first neural network model for classification, where expected classification values are labeled for the training samples and the labeled classification values supervise the training of the first neural network model. When sufficient prior knowledge is lacking, manual labeling of categories is difficult, or the cost of manual labeling is too high, an unsupervised method may be used to train, for example, a first neural network model for image feature extraction. Other applicable machine learning methods are not described one by one here.
In step S120, the depth of the first neural network model is kept or increased and at least one network parameter of at least one network layer of the first neural network model is compressed, to obtain a second neural network model.
The network layers mentioned here may include, but are not limited to, layers in a neural network such as convolutional layers, pooling layers, and fully connected layers.
Here, the compressed second neural network model may be obtained by only reducing one or more network parameters of one or more network layers of the first neural network model. Alternatively, the compressed second neural network model may be obtained by increasing the depth of the first neural network model while reducing one or more network parameters of one or more network layers of the first neural network model; increasing the depth of the first neural network model thereby reduces the influence of the reduced network parameters on the accuracy of the compressed second neural network model.
It can be seen that, in either of the aforementioned compression processes, the depth of the first neural network model is not reduced. This is because, for a comparable number of network parameters, the deeper the network, the more convolutional layers are stacked before each network layer, the stronger the expressive capability of the neural network, the larger the receptive field of the neurons in each layer, the larger the input range corresponding to high-level neurons, the stronger the ability to extract global features, and therefore the better the performance. Compressing the first neural network model without reducing its depth therefore avoids, as far as possible, affecting the expressive capability of the first neural network model and its ability to extract global features.
In step S130, the second neural network model is trained based on a sample data set and at least according to the output of the first neural network model.
Optionally, the second neural network model is trained with a sample data set that contains labeled expected/actual values, or with a sample data set that contains no labeled data. In either case, at least the output of the first neural network model supervises the training of the second neural network model.
For example, the aforementioned sample data set may be input into the first neural network model and the second neural network model respectively to perform forward computation. Thereafter, a loss value is computed according to the output data obtained from the first neural network model, and backward computation is performed according to the loss value, so that a second neural network model with the expected performance is obtained by training. Alternatively, a first loss value may be computed according to the output data obtained from the first neural network model, a second loss value may be computed according to the output data obtained from the second neural network model and the corresponding expected/actual values contained in the sample data set, and backward computation is performed according to the first loss value and the second loss value, so that a second neural network model with the expected performance is obtained by training.
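The second alternative above can be sketched as a toy loss combination. The mean-squared-error form, the equal weighting `alpha=0.5`, and the vector outputs are illustrative assumptions, since the embodiment does not fix how the two loss values are computed or combined:

```python
def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def training_loss(first_model_out, second_model_out, labels, alpha=0.5):
    """First loss: second model's output vs. the first (uncompressed)
    model's output. Second loss: second model's output vs. the labeled
    expected/actual values in the sample data set. `alpha` balances
    the two losses before backward computation."""
    first_loss = mse(second_model_out, first_model_out)
    second_loss = mse(second_model_out, labels)
    return alpha * first_loss + (1.0 - alpha) * second_loss
```

Setting `alpha=1.0` recovers the first alternative, where only the output of the uncompressed model supervises training.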
The neural network model compression method according to Embodiment 1 of the present invention compresses at least one network parameter of at least one network layer of the neural network model to be compressed, obtaining a compressed neural network model. During compression, the depth of the original neural network model is kept or increased, so as to reduce as far as possible the influence of compression on the expressive capability of the network. Thereafter, the compressed neural network model is trained based on a sample data set and at least according to the output of the uncompressed neural network model, so that a compressed neural network model with strong feature extraction capability and performance comparable to the uncompressed neural network model is obtained by training. Moreover, the training method is general-purpose and applicable to neural network models implementing any function.
Embodiment 2
Fig. 2 is a flowchart of a neural network model compression method according to Embodiment 2 of the present invention.
Referring to Fig. 2, in step S210, a first neural network model is obtained. The processing of this step is similar to that of the aforementioned step S110 and is not repeated here.
According to an optional embodiment of the present invention, in step S220, according to the receptive field of a convolutional layer with a larger convolution kernel in the first neural network model, the convolutional layer with the larger convolution kernel is equivalently replaced with multiple convolutional layers with smaller convolution kernels, to obtain a second neural network model.
For a first neural network model that generally includes convolutional layers, a convolutional layer with a large convolution kernel can be replaced by multiple convolutional layers with small convolution kernels, so that on the premise of a comparable or identical receptive field, the small convolution kernels reduce the computational complexity while the added depth improves the expressive capability of the neural network.
For example, 2 convolutional layers with a 3*3 convolution kernel may be used to replace 1 convolutional layer with a 5*5 convolution kernel in the first neural network model, 3 convolutional layers with a 3*3 convolution kernel may be used to replace 1 convolutional layer with a 7*7 convolution kernel in the first neural network model, and so on. In this way, the receptive fields of the convolutional layers concerned are identical, or as close as possible, before and after the replacement; the small convolution kernels reduce the computational complexity, and the added depth improves the expressive capability of the neural network.
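The equivalence above follows from how receptive fields compose: a stack of n stride-1 convolutions with k*k kernels sees a window of n*(k-1)+1 pixels (this formula assumes stride 1 and no dilation). A small self-contained check:

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions,
    each with a square `kernel_size` kernel and no dilation."""
    return num_layers * (kernel_size - 1) + 1

# Two stacked 3x3 layers see the same 5x5 window as one 5x5 layer;
# three stacked 3x3 layers see the same 7x7 window as one 7x7 layer.
print(stacked_receptive_field(3, 2))  # 5
print(stacked_receptive_field(3, 3))  # 7
```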
According to another optional embodiment of the present invention, in step S220, at least one network parameter of at least one network layer of the first neural network model is compressed, to obtain a second neural network model.
For example, the at least one network parameter may include, but is not limited to: convolution kernel size and/or number of feature channels, etc.
By reducing the size of the convolution kernels, that is, substituting small convolution kernels for large ones while keeping the depth of the first neural network model unchanged, the computational complexity of the convolutional layers can be reduced and the network feed-forward speed improved. For example, using multiple 3x3 convolution kernels in two convolutional layers to substitute for a 5x5 or larger convolution kernel in the first neural network model reduces the computational complexity and storage space of the two convolutional layers by (1-18/25).
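The (1-18/25) figure can be reproduced by counting per-position kernel weights: one 5x5 kernel costs 25, while two stacked 3x3 kernels cost 2*9 = 18, for the same receptive field. A quick check (channel counts are identical on both sides and cancel out, so they are ignored here):

```python
def reduction(old_kernel, new_kernel, new_layers):
    """Fractional reduction in per-position kernel cost when one
    `old_kernel` x `old_kernel` layer is replaced by `new_layers`
    stacked `new_kernel` x `new_kernel` layers."""
    old_cost = old_kernel ** 2
    new_cost = new_layers * new_kernel ** 2
    return 1 - new_cost / old_cost

print(reduction(5, 3, 2))  # ~0.28, i.e. the (1 - 18/25) reduction
```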
Reducing the number of feature channels of a convolutional layer can delete the redundant features of the layer, reduce the computation scale, and improve the network feed-forward speed. For example, for a convolutional layer to be compressed, the number of feature channels of the layer may be reduced from 512 to 256 by performing a 1x1 convolution on it.
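To see why this shrinks the model, the weights can be counted directly (biases are ignored, and the 3x3 kernel size of the following layer is an illustrative assumption): a layer consuming 512 input channels carries twice the weights of one consuming 256, and the added 1x1 layer is comparatively cheap.

```python
def conv_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a convolutional layer, biases ignored."""
    return in_channels * out_channels * kernel_size ** 2

# A following 3x3 layer producing 512 channels, before and after a
# 1x1 convolution halves its input channels from 512 to 256:
before = conv_weights(512, 512, 3)
after = conv_weights(256, 512, 3) + conv_weights(512, 256, 1)  # + the 1x1 layer itself
print(before, after)  # 2359296 1310720
```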
A convolutional layer can be compressed either by reducing the size of the convolution kernel or by reducing the number of feature channels of the layer; the two approaches can also be used in combination to compress any one or more convolutional layers of the first neural network model. For example, for a convolutional layer to be compressed, the number of feature channels of the layer may first be halved by performing a 1x1 convolution on it; thereafter, the large convolution kernel of the channel-reduced convolutional layer is substituted with small convolution kernels, to further compress the layer.
After the second neural network model is obtained through the aforementioned compression processing, the second neural network model may be initialized randomly.
On the other hand, since the compressed second neural network model has fewer network parameters, its learning capability differs considerably from that of the original first neural network model, and directly fitting the actual values of the output data of the first neural network model is difficult. When the amount of data is large, training directly on the training data set is prone to non-convergence. Furthermore, since the predicted values output by the first neural network model are smoother than the labeled actual values and thus easier to learn, good initial values for the network parameters of the second neural network model can be obtained by fitting the output of the original first neural network model.
To this end, optionally, the network parameters of the second neural network model are initialized to the corresponding network parameters in the first neural network model, so as to initially fit the output of the first neural network model.
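A minimal sketch of this initialization (the dictionary-of-named-parameters representation is an assumption for illustration only): parameters whose names and shapes still match after compression are copied from the first model, while the rest keep their random initial values.

```python
def init_from_first(second_params, first_params):
    """Copy each parameter of the first (uncompressed) model into the
    second (compressed) model when the name exists in both and the
    shapes agree; leave non-matching parameters as initialized."""
    for name, value in first_params.items():
        if name in second_params and len(second_params[name]) == len(value):
            second_params[name] = list(value)
    return second_params

second = {"conv1": [0.0, 0.0], "conv2": [0.1]}
first = {"conv1": [0.5, -0.5], "conv2": [0.3, 0.7]}  # conv2 was compressed, shapes differ
second = init_from_first(second, first)
print(second)  # {'conv1': [0.5, -0.5], 'conv2': [0.1]}
```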
Thereafter, the compressed second neural network model is trained through the processing of steps S230 to S260.
When training a neural network model with many layers, more serious gradient vanishing occurs during backward propagation because the network is deep, so the network parameters of the convolutional layers close to the network input cannot be trained, and it is difficult to train a neural network model with good performance. To this end, the outputs of the intermediate network layers of the second neural network model are obtained, and the compressed second neural network model is made to fit, at each intermediate network layer, the output features of the corresponding layer of the first neural network model, so that the network parameters of the layers close to the input end of the neural network model (commonly referred to as low network layers) are also learned in depth.
In step S230, sample data is inputted into first nerves network model, is obtained in first nerves network model at least
The output result of one network layer.
Here, forward calculation is carried out to first nerves network model by using sample data, from first nerves network mould
At least one of type network layer obtains corresponding output result.Here at least one network layer may include first nerves network
The mid-level net network layers and the last one network layer of model.
In step S240, the sample data is input into the second neural network model, and the output result of at least one network layer in the second neural network model is obtained.
Similarly, forward computation is performed on the second neural network model using the sample data, and the corresponding output result is obtained from at least one network layer of the second neural network model. The at least one network layer here may include the intermediate network layers and the last network layer of the second neural network model.
It may be noted that steps S230 and S240 may be performed in any order, or may be performed in parallel.
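A minimal sketch of the forward computation in steps S230 and S240, assuming a toy sequential model whose "layers" are simple functions (all names are ours, not the patent's): the model is run once and the outputs of selected intermediate layers and of the last layer are recorded.

```python
def forward_with_records(layers, x, record_indices):
    """Run a sequential model on input x and record the outputs of the
    layers whose indices appear in record_indices."""
    records = {}
    for i, layer in enumerate(layers):
        x = layer(x)
        if i in record_indices:
            records[i] = x
    return x, records

# Toy 3-"layer" model on scalars; record the intermediate layer (index 1)
# and the last layer, as both may enter the first difference.
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
out, recs = forward_with_records(layers, 5, {1, len(layers) - 1})
```

The same pass would be run on both the first and the second neural network model with the same sample data, yielding the paired output results used in step S250.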
In step S250, a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model is determined.
According to an optional embodiment of the present invention, the at least one pair of corresponding network layers here includes: the respective last network layers of the second neural network model and the first neural network model.
That is, the first difference is calculated according to the final output of the first neural network model and the final output of the second neural network model.
According to another optional embodiment of the present invention, the at least one pair of corresponding network layers here includes: at least one intermediate network layer of the second neural network model at the same depth as a corresponding layer of the first neural network model.
That is, the first difference is determined from the output results of at least one network layer of the second neural network model and the network layer of the first neural network model at the same depth.
For this purpose, specifically, according to an optional embodiment of the present invention, at least one fitting branch is added to the second neural network, and the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model is determined through the at least one fitting branch.
For example, assuming that the first neural network model and the second neural network model each have 16 layers, fitting branch 1 may be set at the 6th layer of each of the two neural network models, and fitting branch 2 may be set at the 12th layer of each of the two neural network models, so as to perform output feature fitting at the intermediate convolutional layers. After the sample data is input into the first neural network model and the second neural network model as described above, the first difference between the output results of the corresponding intermediate network layers of the two models can be determined through these fitting branches.
The first difference may be calculated, for example, by a loss function or a distance calculation function, so as to evaluate the detection accuracy of the second neural network model relative to the first neural network model.
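The distance calculation mentioned above can be illustrated, for instance, by a mean squared (L2) distance between the output features of a pair of corresponding layers. This is only one possible choice of distance function, and the names below are illustrative:

```python
def l2_difference(feat_a, feat_b):
    """Mean squared distance between two feature vectors produced by
    corresponding network layers of the two models."""
    assert len(feat_a) == len(feat_b), "corresponding layers must match in size"
    return sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)) / len(feat_a)

teacher_feat = [1.0, 2.0, 3.0]  # output of a layer of the first model
student_feat = [1.0, 1.0, 5.0]  # output of the corresponding layer of the second model
d1 = l2_difference(teacher_feat, student_feat)  # the "first difference"
```

A cross-entropy loss on the final outputs, or any other differentiable distance, would serve the same role in the method.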
It is understood that the above feasible scheme is only one implementation; in practical applications, a user may adjust the implementation conditions or specific parameters according to actual needs, and the above example should not be construed as the only implementation.
In step S260, the network parameters of the second neural network model are adjusted according to the first difference.
That is, in this step, the output of the fitting branch set at an intermediate convolutional layer of the second neural network model is made to fit the output of the fitting branch set at the corresponding intermediate convolutional layer of the first neural network model, so as to update the network parameters of the compressed second neural network model. Specifically, the multiple training samples and the first difference are backpropagated to the second neural network model, and the network parameters of the second neural network model are updated, for example, by gradient descent, so that the first difference converges to an allowed range, thereby training a second neural network model whose performance meets expectations.
Optionally, after the training of the second neural network model is completed, the at least one fitting branch used for network training is removed.
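The gradient-descent update of step S260 can be sketched on a one-parameter toy model: the student output y = w * x is driven toward the teacher's output by descending the squared-error "first difference". The function and learning-rate values are illustrative assumptions, not the patent's specification:

```python
def sgd_fit(x, target, w=0.0, lr=0.1, steps=100):
    """Fit y = w * x to the teacher's output `target` by gradient descent
    on the squared error (w * x - target) ** 2, the role the first
    difference plays when backpropagated in step S260."""
    for _ in range(steps):
        grad = 2.0 * (w * x - target) * x  # d/dw of (w*x - target)^2
        w -= lr * grad
    return w

# The student weight converges toward the value that reproduces the
# teacher's output (here, target 3.0 with input 1.0).
w = sgd_fit(x=1.0, target=3.0)
```

In practice the same update is applied to millions of parameters at once by an optimizer, and the fitting branches are discarded once the difference has converged.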
According to the neural network model compression method of the second embodiment of the present invention, a small-scale compressed neural network model is obtained by increasing the depth of the neural network model and/or reducing the network parameters of at least one network layer. During compression, the depth of the original neural network model is kept or increased, so as to minimize the influence of compression on the expressive ability of the network. In addition, the small-scale neural network model being trained fits the output of the original neural network model at the intermediate convolutional layers, so as to deeply learn the network parameters of the network layers close to the input end of the neural network model (commonly referred to as the lower network layers), improving the accuracy of the compressed neural network model.
Embodiment three
Fig. 3 is a flowchart showing a neural network model compression method according to the third embodiment of the present invention.
With reference to Fig. 3, in step S310, a first neural network model is obtained. The processing of this step is similar to that of the aforementioned step S110 and will not be described here.
In step S320, the depth of the first neural network model is kept or increased, and at least one network parameter of at least one network layer of the first neural network model is compressed, to obtain a second neural network model.
The processing of this step is similar to that of the aforementioned step S120 or step S220 and will not be described here.
Since the second neural network model is a network generated by compressing the first neural network model, its network parameters are reduced. In order to improve the feature representation ability of the second neural network model, the neural network model compression method according to the third embodiment of the present invention adds, during the training of the second neural network model, an auxiliary convolutional layer at the output end of the second neural network model (commonly referred to as the top of the network), increasing the network depth of the second neural network model so as to fit the output of the deep network.
Correspondingly, in step S330, the last network layer of the second neural network model is connected to the auxiliary convolutional layer. The number of auxiliary convolutional layers may be one, two or more.
In step S340, the aforementioned sample data is input into the first neural network model, and the output result of at least one network layer in the first neural network model is obtained; this output result includes the output results of the aforementioned intermediate network layers and the final output result of the first neural network model.
In step S350, the aforementioned sample data is input into the second neural network model, and the output result of at least one network layer in the second neural network model is obtained.
That is, through the processing of step S350, corresponding output results are obtained respectively from at least one intermediate network layer of the second neural network model and from the auxiliary convolutional layer.
It may be noted that steps S340 and S350 may be performed in any order, or may be performed in parallel.
In step S360, the first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model, and the second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer, are determined respectively.
The first difference may be determined with reference to the processing of step S250, and the second difference may be calculated, for example, by a loss function or a distance calculation function, so as to evaluate the detection accuracy of the second neural network model at the auxiliary convolutional layer relative to the first neural network model.
In step S370, the network parameters of the second neural network model are adjusted according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
Specifically, the first difference and the second difference are weighted respectively to obtain a comprehensive difference, in which the first difference is assigned a weight greater than that of the second difference. Since the purpose of setting the auxiliary convolutional layer is to further improve the feature expression ability of the second neural network model and to further optimize its training, the weight assigned to the second difference is smaller than that of the first difference. Thereafter, the multiple training samples and the weighted comprehensive difference are backpropagated to the second neural network model, and the network parameters of the second neural network model are updated, for example, by gradient descent, so that the comprehensive difference converges to an allowed range, thereby training a second neural network model whose performance meets expectations.
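The weighting of the two differences can be sketched as a simple weighted sum; the particular weight values below (0.7 and 0.3) are illustrative assumptions only — the patent requires merely that the first difference carry the larger weight:

```python
def comprehensive_difference(d1, d2, w1=0.7, w2=0.3):
    """Weighted sum of the first difference d1 (corresponding-layer
    outputs) and the second difference d2 (auxiliary-layer output vs.
    the teacher's final output); w1 must exceed w2."""
    assert w1 > w2, "the first difference must be weighted more heavily"
    return w1 * d1 + w2 * d2

# Example: the intermediate-layer mismatch dominates the training signal.
total = comprehensive_difference(d1=0.5, d2=1.0)
```

This comprehensive difference is the scalar that would be backpropagated through the second neural network model in step S370.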
Optionally, after the training of the second neural network model is completed, the auxiliary convolutional layer used for network training is removed.
On the basis of the previous embodiments, the neural network model compression method according to the third embodiment of the present invention adds an auxiliary convolutional layer at the output end of the compressed neural network model being trained, and fits the output of the uncompressed neural network model through the output result of the auxiliary convolutional layer, thereby further enhancing the feature extraction ability of the neural network model being trained and optimizing its performance. Likewise, this method is general-purpose and is applicable to neural network models with multiple convolutional layers that implement any function.
Embodiment four
Fig. 4 shows a logic diagram of a neural network model compression apparatus according to the fourth embodiment of the present invention.
With reference to Fig. 4, the neural network model compression apparatus according to the fourth embodiment of the present invention includes an acquisition module 410, a compression module 420 and a training module 430.
The acquisition module 410 is configured to obtain a first neural network model.
The compression module 420 is configured to keep or increase the depth of the first neural network model obtained by the acquisition module 410 and to compress at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model.
The training module 430 is configured to train the second neural network model based on a sample data set and according at least to the output of the first neural network model.
The neural network model compression apparatus of this embodiment is configured to implement the corresponding neural network model compression methods in the foregoing method embodiments and has the advantageous effects of the corresponding method embodiments, which will not be described here.
Embodiment five
Fig. 5 shows a logic diagram of a neural network model compression apparatus according to the fifth embodiment of the present invention.
According to the fifth embodiment of the present invention, the at least one network parameter compressed by the compression module 420 includes: convolution kernel size and/or number of feature channels.
Optionally, the compression module 420 is configured to, in the process of constructing the second neural network, perform an equivalent replacement of a convolutional layer with a larger convolution kernel in the first neural network model by multiple convolutional layers with smaller convolution kernels, according to the receptive field of the convolutional layer with the larger convolution kernel.
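The receptive-field equivalence behind such a replacement can be checked with the standard formula for stacked stride-1 convolutions (the function name is ours, for illustration): a stack of two 3x3 convolutions covers the same 5x5 receptive field as a single 5x5 convolution, while using fewer weights per channel pair (2 x 3 x 3 = 18 versus 5 x 5 = 25) and adding depth.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1 convolutions:
    rf = 1 + sum of (k - 1) over the stacked kernel sizes k."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# Two stacked 3x3 convolutions are receptive-field-equivalent to one 5x5.
assert stacked_receptive_field([3, 3]) == stacked_receptive_field([5])
```

The same arithmetic shows that three stacked 3x3 layers match a single 7x7 layer, which is the kind of depth-increasing, parameter-reducing substitution the compression module performs.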
The training module 430 includes:
a first output result acquiring unit 431, configured to input the sample data into the first neural network model and obtain the output result of at least one network layer in the first neural network model;
a second output result acquiring unit 432, configured to input the sample data into the second neural network model and obtain the output result of at least one network layer in the second neural network model;
a first difference acquiring unit 433, configured to determine a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model; and
a first parameter adjustment unit 434, configured to adjust the network parameters of the second neural network model according to the first difference determined by the first difference acquiring unit 433.
According to an optional embodiment of the present invention, the aforementioned at least one pair of corresponding network layers includes: the respective last network layers of the second neural network model and the first neural network model.
According to another optional embodiment of the present invention, the aforementioned at least one pair of corresponding network layers includes: at least one intermediate network layer of the second neural network model at the same depth as a corresponding layer of the first neural network model.
Optionally, the first difference acquiring unit 433 is configured to add at least one fitting branch to the second neural network, and to determine, through the at least one fitting branch, the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model.
Optionally, the training module 430 further includes: a fitting branch removal unit (not shown), configured to remove the at least one fitting branch after the training of the second neural network model is completed.
The neural network model compression apparatus of this embodiment is configured to implement the corresponding neural network model compression methods in the foregoing method embodiments and has the advantageous effects of the corresponding method embodiments, which will not be described here.
Embodiment six
Fig. 6 shows a logic diagram of a neural network model compression apparatus according to the sixth embodiment of the present invention.
According to the sixth embodiment of the present invention, the at least one network parameter compressed by the compression module 420 includes: convolution kernel size and/or number of feature channels.
Optionally, the compression module 420 is configured to, in the process of constructing the second neural network, perform an equivalent replacement of a convolutional layer with a larger convolution kernel in the first neural network model by multiple convolutional layers with smaller convolution kernels, according to the receptive field of the convolutional layer with the larger convolution kernel.
According to the sixth embodiment of the present invention, the training module 430 includes:
an auxiliary layer connection unit 435, configured to connect the last network layer of the second neural network model to an auxiliary convolutional layer;
a third output result acquiring unit 436, configured to input the sample data into the first neural network model and obtain the output result of at least one network layer in the first neural network model;
a fourth output result acquiring unit 437, configured to input the sample data into the second neural network model and obtain the output result of at least one network layer in the second neural network model;
a second difference acquiring unit 438, configured to determine respectively a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and
a second parameter adjustment unit 439, configured to adjust the network parameters of the second neural network model according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
According to an optional embodiment of the present invention, the aforementioned at least one pair of corresponding network layers includes: the respective last network layers of the second neural network model and the first neural network model.
According to another optional embodiment of the present invention, the aforementioned at least one pair of corresponding network layers includes: at least one intermediate network layer of the second neural network model at the same depth as a corresponding layer of the first neural network model.
Optionally, the training module 430 further includes: an auxiliary layer removal unit (not shown), configured to remove the auxiliary convolutional layer after the training of the second neural network model is completed.
The neural network model compression apparatus of this embodiment is configured to implement the corresponding neural network model compression methods in the foregoing method embodiments and has the advantageous effects of the corresponding method embodiments, which will not be described here.
Embodiment seven
Fig. 7 is a structural diagram showing an electronic device according to the seventh embodiment of the present invention.
The embodiment of the present invention further provides an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, a server, or the like. Referring now to Fig. 7, there is shown a structural diagram of an electronic device 700 suitable for implementing a terminal device or a server of an embodiment of the present invention.
As shown in Fig. 7, the electronic device 700 includes one or more processors, a communication device, and the like. The one or more processors are, for example, one or more central processing units (CPU) 701 and/or one or more graphics processing units (GPU) 713; a processor may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 702 or loaded from a storage section 708 into a random access memory (RAM) 703. The communication device includes a communication component 712 and a communication interface 709. The communication component 712 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 709 includes a communication interface of a network card such as a LAN card or a modem, and the communication interface 709 performs communication processing via a network such as the Internet.
The processor may communicate with the read-only memory 702 and/or the random access memory 703 to execute executable instructions, is connected to the communication component 712 through a bus 704, and communicates with other target devices through the communication component 712, thereby completing the operations corresponding to any of the methods provided by the embodiments of the present invention, for example: obtaining a first neural network model; keeping or increasing the depth of the first neural network model and compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and training the second neural network model based on a sample data set and according at least to the output of the first neural network model.
In an optional embodiment, the at least one network parameter includes: convolution kernel size and/or number of feature channels.
In an optional embodiment, the executable instructions further cause the processor to perform the following operation: in the process of constructing the second neural network, performing an equivalent replacement of a convolutional layer with a larger convolution kernel in the first neural network model by multiple convolutional layers with smaller convolution kernels, according to the receptive field of the convolutional layer with the larger convolution kernel.
In an optional embodiment, the executable instructions further cause the processor to perform the following operations: inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model; and adjusting the network parameters of the second neural network model according to the first difference.
In an optional embodiment, the executable instructions further cause the processor to perform the following operations: connecting the last network layer of the second neural network model to an auxiliary convolutional layer; inputting the sample data into the first neural network model to obtain the output result of at least one network layer in the first neural network model; inputting the sample data into the second neural network model to obtain the output result of at least one network layer in the second neural network model; determining respectively a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and adjusting the network parameters of the second neural network model according to a weighting of the first difference and the second difference, where the weight of the first difference is greater than the weight of the second difference.
In an optional embodiment, the executable instructions further cause the processor to perform the following operation: removing the auxiliary convolutional layer after the training of the second neural network model is completed.
In an optional embodiment, the at least one pair of corresponding network layers includes: the respective last network layers of the second neural network model and the first neural network model.
In another optional embodiment, the at least one pair of corresponding network layers includes: at least one intermediate network layer of the second neural network model at the same depth as a corresponding layer of the first neural network model.
In an optional embodiment, the executable instructions further cause the processor to perform the following operation: adding at least one fitting branch to the second neural network, and determining, through the at least one fitting branch, the first difference between the output results of the corresponding intermediate network layers of the second neural network model and the first neural network model.
In an optional embodiment, the executable instructions further cause the processor to perform the following operation: removing the at least one fitting branch after the training of the second neural network model is completed.
In addition, the RAM 703 may also store various programs and data required for the operation of the device. The CPU 701, the ROM 702 and the RAM 703 are connected to one another through the bus 704. Where the RAM 703 is present, the ROM 702 is an optional module. The RAM 703 stores executable instructions, or executable instructions are written into the ROM 702 at runtime, and the executable instructions cause the processor 701 to perform the operations corresponding to the above communication method. An input/output (I/O) interface 705 is also connected to the bus 704. The communication component 712 may be arranged integrally, or may be arranged with multiple sub-modules (for example, multiple IB network cards) linked on the bus.
The I/O interface 705 is connected to the following components: an input section 706 including a keyboard, a mouse, etc.; an output section 707 including a cathode ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, etc.; a storage section 708 including a hard disk, etc.; and a communication interface 709 including a network card such as a LAN card or a modem. A driver 710 is also connected to the I/O interface 705 as needed. A detachable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 710 as needed, so that a computer program read therefrom is installed into the storage section 708 as needed.
It should be noted that the architecture shown in Fig. 7 is only one optional implementation. In concrete practice, the number and types of the components in Fig. 7 may be selected, deleted, added or replaced according to actual needs. In the arrangement of components with different functions, separate or integrated arrangements and other implementations may also be used; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated on the CPU, and the communication component 712 may be arranged separately or may be integrated on the CPU or GPU, etc. These alternative embodiments all fall within the protection scope of the present invention.
In particular, according to an embodiment of the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer program product, which includes a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for executing the method shown in the flowchart, and the program code may include instructions corresponding to the execution of the method steps provided by the embodiments of the present invention, for example: executable code for obtaining a first neural network model; executable code for keeping or increasing the depth of the first neural network model and compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and executable code for training the second neural network model based on a sample data set and according at least to the output of the first neural network model. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device and/or installed from the detachable medium 711. When the computer program is executed by the central processing unit (CPU) 701, the above functions defined in the method of the embodiment of the present invention are performed.
In the electronic device provided by the seventh embodiment of the present invention, at least one network parameter of at least one network layer of the neural network model to be compressed is compressed to obtain a compressed neural network model. During compression, the depth of the original neural network model is kept or increased, so as to minimize the influence of compression on the expressive ability of the network. Thereafter, the compressed neural network model is trained based on a sample data set and according at least to the output of the uncompressed neural network model, so that a compressed neural network model with strong feature extraction ability and performance comparable to that of the uncompressed neural network model can be obtained by training; moreover, the training method is general-purpose and is applicable to neural network models of any function.
Embodiment eight
The eighth embodiment of the present invention provides a computer-readable storage medium on which computer program instructions are stored, where the program instructions, when executed by a processor, implement the steps of any of the aforementioned neural network model compression methods.
The computer-readable storage medium is configured to implement the corresponding neural network model compression methods in the foregoing method embodiments and has the advantageous effects of the corresponding method embodiments, which will not be described here.
It may be noted that, according to the needs of implementation, each component/step described in this application may be split into more components/steps, and two or more components/steps or partial operations of components/steps may also be combined into new components/steps, so as to achieve the purpose of the embodiments of the present invention.
The methods, apparatuses and devices of the present invention may be implemented in many ways. For example, the methods, apparatuses and devices of the embodiments of the present invention may be implemented by software, hardware, firmware, or any combination of software, hardware and firmware. The above order of the steps of the methods is merely for illustration, and the steps of the methods of the embodiments of the present invention are not limited to the order specifically described above, unless specifically stated otherwise. In addition, in some embodiments, the present invention may also be embodied as programs recorded in a recording medium, these programs including machine-readable instructions for implementing the methods according to the embodiments of the present invention. Thus, the present invention also covers a recording medium storing a program for executing the methods according to the present invention.
The description of the embodiments of the present invention is provided for the sake of example and description, and is not intended to be exhaustive or to limit the present invention to the disclosed forms. Many modifications and variations are obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better illustrate the principles and practical applications of the present invention, and to enable those of ordinary skill in the art to understand the present invention so as to design various embodiments with various modifications suited to particular uses.
Claims (10)
1. A neural network model compression method, characterized by comprising:
obtaining a first neural network model;
keeping or increasing the depth of the first neural network model and compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and
training the second neural network model based on a sample data set and according at least to the output of the first neural network model.
2. The method according to claim 1, characterized in that the at least one network parameter includes: convolution kernel size and/or number of feature channels.
3. The method according to claim 1, characterized in that increasing the depth of the first neural network model comprises:
in the process of constructing the second neural network, performing an equivalent replacement of a convolutional layer with a larger convolution kernel in the first neural network model by multiple convolutional layers with smaller convolution kernels, according to the receptive field of the convolutional layer with the larger convolution kernel.
4. The method according to any one of claims 1-3, characterized in that training the second neural network model based on the sample data set and according to at least the output of the first neural network model comprises:
inputting the sample data into the first neural network model to obtain an output result of at least one network layer in the first neural network model;
inputting the sample data into the second neural network model to obtain an output result of at least one network layer in the second neural network model;
determining a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model; and
adjusting a network parameter of the second neural network model according to the first difference.
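The four steps of this claim can be sketched with a toy linear map standing in for one network layer of each model. All shapes, the mean-squared form of the first difference, and the gradient-descent adjustment are illustrative assumptions; the claim does not fix a particular difference measure or update rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for one network layer of each model (a linear map, for brevity).
teacher_w = rng.normal(size=(8, 4))   # layer of the (fixed) first network model
student_w = rng.normal(size=(8, 4))   # layer of the (trainable) second network model
x = rng.normal(size=(16, 8))          # a batch of sample data

def first_difference(w_teacher, w_student, batch):
    # mean squared difference between corresponding layer outputs
    return np.mean((batch @ w_student - batch @ w_teacher) ** 2)

loss_before = first_difference(teacher_w, student_w, x)

# Adjust the second model's parameter along the gradient of the first difference.
lr = 0.05
for _ in range(50):
    diff = x @ student_w - x @ teacher_w      # first difference, per output element
    grad = 2.0 * x.T @ diff / diff.size       # gradient of the mean squared difference
    student_w -= lr * grad

loss_after = first_difference(teacher_w, student_w, x)
assert loss_after < loss_before  # the student's layer output moved toward the teacher's
```

Driving intermediate layer outputs of the small model toward those of the large model in this way is the layer-wise supervision the claim describes.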
5. The method according to any one of claims 1-3, characterized in that training the second neural network model based on the sample data set and according to at least the output of the first neural network model comprises:
connecting the last network layer of the second neural network model to an auxiliary convolutional layer;
inputting the sample data into the first neural network model to obtain an output result of at least one network layer in the first neural network model;
inputting the sample data into the second neural network model to obtain an output result of at least one network layer in the second neural network model;
determining, respectively, a first difference between the output results of at least one pair of corresponding network layers of the second neural network model and the first neural network model, and a second difference between the final output result of the first neural network model and the output result of the auxiliary convolutional layer; and
adjusting a network parameter of the second neural network model according to a weighted combination of the first difference and the second difference, wherein the weight of the first difference is greater than the weight of the second difference.
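A minimal sketch of the weighting rule in this claim, with hypothetical weights 0.7 and 0.3 chosen only to satisfy the requirement that the first difference outweighs the second:

```python
def weighted_adjustment_signal(first_diff, second_diff, w_first=0.7, w_second=0.3):
    # Claim 5 requires the first difference's weight to exceed the second's.
    if not w_first > w_second:
        raise ValueError("the first difference must carry the larger weight")
    return w_first * first_diff + w_second * second_diff

# e.g. 0.7 * 0.5 + 0.3 * 0.2 = 0.41
signal = weighted_adjustment_signal(0.5, 0.2)
```

In practice this combined scalar would drive the same kind of parameter adjustment as the first difference alone in claim 4, with the auxiliary-layer term acting as a secondary supervision signal.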
6. A neural network model compression apparatus, characterized by comprising:
an acquisition module, configured to obtain a first neural network model;
a compression module, configured to keep or increase the depth of the first neural network model obtained by the acquisition module while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and
a training module, configured to train the second neural network model based on a sample data set and according to at least an output of the first neural network model.
7. The apparatus according to claim 6, characterized in that the at least one network parameter comprises: a convolution kernel size and/or a number of feature channels.
8. The apparatus according to claim 6, characterized in that the compression module is configured to, in the process of constructing the second neural network model, equivalently replace a convolutional layer having a larger convolution kernel in the first neural network model with multiple convolutional layers having smaller convolution kernels, according to the receptive field of the convolutional layer having the larger convolution kernel.
9. An electronic device, comprising: a processor, a memory, a communication component, and a communication bus, wherein the processor, the memory, and the communication component communicate with one another via the communication bus; and
the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform the following operations:
obtaining a first neural network model;
keeping or increasing the depth of the first neural network model while compressing at least one network parameter of at least one network layer of the first neural network model, to obtain a second neural network model; and
training the second neural network model based on a sample data set and according to at least an output of the first neural network model.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the steps of the neural network model compression method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710671900.2A CN108229646A (en) | 2017-08-08 | 2017-08-08 | neural network model compression method, device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108229646A true CN108229646A (en) | 2018-06-29 |
Family
ID=62654233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710671900.2A Pending CN108229646A (en) | 2017-08-08 | 2017-08-08 | neural network model compression method, device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108229646A (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020029589A1 (en) * | 2018-08-10 | 2020-02-13 | 深圳前海微众银行股份有限公司 | Model parameter acquisition method and system based on federated learning, and readable storage medium |
CN109165515A (en) * | 2018-08-10 | 2019-01-08 | 深圳前海微众银行股份有限公司 | Model parameter acquisition method and system based on federated learning, and readable storage medium |
CN109165725A (en) * | 2018-08-10 | 2019-01-08 | 深圳前海微众银行股份有限公司 | Neural network federated modeling method, device, and storage medium based on transfer learning |
CN109255444B (en) * | 2018-08-10 | 2022-03-29 | 深圳前海微众银行股份有限公司 | Federated modeling method and device based on transfer learning, and readable storage medium |
CN109255444A (en) * | 2018-08-10 | 2019-01-22 | 深圳前海微众银行股份有限公司 | Federated modeling method, device, and readable storage medium based on transfer learning |
CN109325584A (en) * | 2018-08-10 | 2019-02-12 | 深圳前海微众银行股份有限公司 | Neural-network-based federated modeling method, device, and readable storage medium |
EP3848838A4 (en) * | 2018-08-10 | 2022-06-08 | Webank Co.,Ltd | Model parameter acquisition method and system based on federated learning, and readable storage medium |
CN109165720A (en) * | 2018-09-05 | 2019-01-08 | 深圳灵图慧视科技有限公司 | Neural network model compression method, device and computer equipment |
CN109214343B (en) * | 2018-09-14 | 2021-03-09 | 北京字节跳动网络技术有限公司 | Method and device for generating face key point detection model |
CN109214343A (en) * | 2018-09-14 | 2019-01-15 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating a face key point detection model |
CN111105029B (en) * | 2018-10-29 | 2024-04-16 | 北京地平线机器人技术研发有限公司 | Neural network generation method, generation device and electronic equipment |
CN111105029A (en) * | 2018-10-29 | 2020-05-05 | 北京地平线机器人技术研发有限公司 | Neural network generation method and device and electronic equipment |
CN109634315A (en) * | 2018-12-29 | 2019-04-16 | 福建龙净环保股份有限公司 | Method and device for controlling the pH value of slurry |
CN109634315B (en) * | 2018-12-29 | 2021-12-03 | 福建龙净环保股份有限公司 | Method and device for controlling pH value of slurry |
CN109934300A (en) * | 2019-03-21 | 2019-06-25 | 腾讯科技(深圳)有限公司 | Model compression method, apparatus, computer equipment and storage medium |
CN109934300B (en) * | 2019-03-21 | 2023-08-25 | 腾讯科技(深圳)有限公司 | Model compression method, device, computer equipment and storage medium |
CN110766131A (en) * | 2019-05-14 | 2020-02-07 | 北京嘀嘀无限科技发展有限公司 | Data processing device and method and electronic equipment |
CN112016681B (en) * | 2019-05-31 | 2024-04-30 | 苹果公司 | Decomposition of machine learning operations |
US11836635B2 (en) | 2019-05-31 | 2023-12-05 | Apple Inc. | Mutable parameters for machine learning models during runtime |
CN112016681A (en) * | 2019-05-31 | 2020-12-01 | 苹果公司 | Decomposition of machine learning operations |
CN112749782A (en) * | 2019-10-31 | 2021-05-04 | 上海商汤智能科技有限公司 | Data processing method and related product |
CN111062396B (en) * | 2019-11-29 | 2022-03-25 | 深圳云天励飞技术有限公司 | License plate number recognition method and device, electronic equipment and storage medium |
CN111062396A (en) * | 2019-11-29 | 2020-04-24 | 深圳云天励飞技术有限公司 | License plate number recognition method and device, electronic equipment and storage medium |
CN113326913A (en) * | 2020-02-28 | 2021-08-31 | 上海商汤智能科技有限公司 | Neural network model conversion method, model precision positioning method and device |
CN111626409A (en) * | 2020-07-30 | 2020-09-04 | 江西高创保安服务技术有限公司 | Data generation method for image quality detection |
CN112102281A (en) * | 2020-09-11 | 2020-12-18 | 哈尔滨市科佳通用机电股份有限公司 | Truck brake cylinder fault detection method based on improved Faster Rcnn |
CN112102281B (en) * | 2020-09-11 | 2021-07-06 | 哈尔滨市科佳通用机电股份有限公司 | Truck brake cylinder fault detection method based on improved Faster Rcnn |
CN112288829A (en) * | 2020-11-03 | 2021-01-29 | 中山大学 | Compression method and device for image restoration convolutional neural network |
CN112734007A (en) * | 2020-12-31 | 2021-04-30 | 青岛海尔科技有限公司 | Method and device for acquiring compression model, storage medium and electronic device |
CN113657592A (en) * | 2021-07-29 | 2021-11-16 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
CN113657592B (en) * | 2021-07-29 | 2024-03-05 | 中国科学院软件研究所 | Software-defined satellite self-adaptive pruning model compression method |
CN113762510A (en) * | 2021-09-09 | 2021-12-07 | 北京百度网讯科技有限公司 | Data processing method and device for target model, electronic equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108229646A (en) | neural network model compression method, device, storage medium and electronic equipment | |
CN113011499B (en) | Hyperspectral remote sensing image classification method based on a dual-attention mechanism | |
CN110532996A (en) | Video classification method, information processing method, and server | |
CN107194426A (en) | Image recognition method based on spiking neural networks | |
CN107944556A (en) | Deep neural network compression method based on block-term tensor decomposition | |
CN107609642A (en) | Computing device and method | |
CN107977704A (en) | Weighted data storage method and the neural network processor based on this method | |
CN107392255A (en) | Method, apparatus, computing device, and storage medium for generating minority-class image samples | |
CN107578014A (en) | Information processor and method | |
CN108629326A (en) | Action behavior recognition method and apparatus for a target body | |
CN108573399A (en) | Merchant recommendation method and system based on a transition probability network | |
CN111091493B (en) | Image translation model training method, image translation method and apparatus, and electronic device | |
CN107944358A (en) | Face generation method based on a deep convolutional adversarial network model | |
CN105469376A (en) | Method and device for determining picture similarity | |
CN107944545A (en) | Computing method and computing device applied to neural networks | |
CN113012811B (en) | Traditional Chinese medicine syndrome diagnosis and health evaluation method combining deep convolutional network and graph neural network | |
CN105989849A (en) | Speech enhancement method, speech recognition method, clustering method and devices | |
Kumar et al. | MobiHisNet: a lightweight CNN in mobile edge computing for histopathological image classification | |
CN109978077A (en) | Visual recognition method, apparatus, system, and storage medium | |
CN108229300A (en) | Video classification methods, device, computer readable storage medium and electronic equipment | |
CN108235003A (en) | Three-dimensional video quality evaluation method based on 3D convolutional neural networks | |
CN107169574A (en) | Using nested machine learning model come the method and system of perform prediction | |
CN110322418A (en) | Training method and apparatus for a super-resolution image generative adversarial network | |
CN111382840B (en) | HTM design method based on recurrent learning units for natural language processing | |
CN109889923A (en) | Method for summarizing video using a hierarchical self-attention network combined with video description | |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180629 |