
Network model clipping method and device, electronic equipment and readable storage medium

Info

Publication number
CN113537377B
CN113537377B (application CN202110859748.7A)
Authority
CN
China
Prior art keywords
cutting
clipping
target
network model
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110859748.7A
Other languages
Chinese (zh)
Other versions
CN113537377A (en)
Inventor
张顺
李哲暘
彭博
谭文明
任烨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110859748.7A priority Critical patent/CN113537377B/en
Publication of CN113537377A publication Critical patent/CN113537377A/en
Application granted granted Critical
Publication of CN113537377B publication Critical patent/CN113537377B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a network model clipping method and device, an electronic device, and a readable storage medium, wherein the method includes the following steps: performing sparse constraint training on an original network model; counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule; determining a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount; determining the clipping ratio of each layer according to the first target clipping parameter; and performing variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and clipping the sparse model according to the clipping ratio of each layer. The method realizes automatic clipping of the network model based on a variable-weight sparse constraint.

Description

Network model clipping method and device, electronic equipment and readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for clipping a network model, an electronic device, and a readable storage medium.
Background
With the rapid development of artificial intelligence technology, applications that perform intelligent tasks according to a network model, such as intelligent detection (e.g., vehicle detection, license plate detection, etc.), are becoming increasingly common. Since the computing resources of a terminal device are usually limited, an overly complex network model demands too many computing resources, so that the terminal device executes intelligent tasks according to the network model slowly and with poor real-time performance. Therefore, in order to improve the real-time performance of the terminal device in executing intelligent tasks, the network model deployed on the terminal device can be reasonably clipped, and network model clipping based on sparse constraints is a common approach.
A sparse constraint is a typical method for retraining model parameters: the parameters of each network layer are divided into a number of groups, and a sparsity penalty is added to the objective function so that the norms of some groups are driven down, thereby achieving sparsity.
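As an illustration of the group sparse constraint described above, the following is a minimal PyTorch-style sketch (not taken from the patent; grouping by output channel and the penalty weight lambda_sparse are assumptions) that divides each convolutional layer's parameters into per-channel groups and adds the sum of group L2 norms to the task loss:

```python
import torch
import torch.nn as nn

def group_sparsity_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of per-output-channel L2 norms over all conv layers (group lasso)."""
    terms = []
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # One group per output channel: weight reshaped to (C_out, C_in*kH*kW).
            groups = module.weight.flatten(start_dim=1)
            terms.append(groups.norm(p=2, dim=1).sum())
    return torch.stack(terms).sum()

def loss_with_sparsity(model, outputs, targets, lambda_sparse=1e-4):
    # Task loss plus the sparsity penalty; using the same lambda_sparse for every
    # group corresponds to the equal-weight scheme described below.
    task_loss = nn.functional.cross_entropy(outputs, targets)
    return task_loss + lambda_sparse * group_sparsity_penalty(model)
```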
At present, clipping a network model based on sparse constraints mainly follows one of two schemes:
1. Equal-weight sparse constraint scheme: a sparse constraint of the same strength is applied to all connections of the network model. The disadvantage of this approach is that each layer of the network cannot be sparsified precisely to a specified sparsity.
2. Variable-weight sparse constraint scheme: sparse constraints of different strengths are applied to different connections of the network model, so that each layer of the network can be sparsified precisely to a specified sparsity.
However, practice shows that the traditional variable-weight sparse constraint scheme requires the clipping ratio of each layer of the network model to be set manually, so automatic clipping cannot be achieved, and the real-time performance of the terminal device in executing intelligent tasks according to the network model remains poor.
Disclosure of Invention
In view of this, the present application provides a network model clipping method and device, an electronic device, and a readable storage medium, so as to solve the problem that the traditional variable-weight sparse constraint scheme cannot achieve automatic clipping of a network model, which in turn results in poor real-time performance when a terminal device executes intelligent tasks.
Specifically, the application is realized by the following technical scheme:
according to a first aspect of an embodiment of the present application, there is provided a network model clipping method, including:
performing sparse constraint training on the original network model;
counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule; the clipping parameter-optimization target amount curve counted from any training result represents the optimization target amount of that training result after clipping with different clipping parameters; the clipping parameter is positively correlated with the clipping ratio of each layer of the training result; N1 ≥ 2; the preset stopping rule includes that the variation of the target clipping parameter with the increasing number of training rounds does not exceed a preset value range, and the target clipping parameter is the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation;
determining a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount;
determining the clipping ratio of each layer according to the first target clipping parameter;
and performing variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and clipping the sparse model according to the clipping ratio of each layer.
According to a second aspect of embodiments of the present application, there is provided a network model clipping device, including:
a pre-training unit, configured to perform sparse constraint training on an original network model;
a statistics unit, configured to count N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule; the clipping parameter-optimization target amount curve counted from any training result represents the optimization target amount of that training result after clipping with different clipping parameters; the clipping parameter is positively correlated with the clipping ratio of each layer of the training result; N1 ≥ 2; the preset stopping rule includes that the variation of the target clipping parameter with the increasing number of training rounds does not exceed a preset value range, and the target clipping parameter is the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation;
a determining unit, configured to determine a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount;
the determining unit is further configured to determine the clipping ratio of each layer according to the first target clipping parameter;
and a processing unit, configured to perform variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and to clip the sparse model according to the clipping ratio of each layer.
According to a third aspect of embodiments of the present application, there is provided an electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor for executing the machine-executable instructions to implement the above-described network model clipping method.
According to a fourth aspect of embodiments of the present application, there is provided a machine-readable storage medium having stored therein machine-executable instructions which, when executed by a processor, implement the above-described network model clipping method.
The technical solution provided by the present application can bring at least the following beneficial effects:
Several clipping parameter-optimization target amount curves are counted according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process; a first target clipping parameter is determined according to the counted curves and a preset post-clipping computation; and the clipping ratio of each layer is determined according to the first target clipping parameter, so that the clipping ratio of each layer of the network model is determined automatically. Furthermore, variable-weight sparse constraint training can be performed on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and the sparse model is clipped according to the clipping ratio of each layer, realizing automatic clipping of the network model based on a variable-weight sparse constraint. In turn, the demand of the clipped network model for computing resources is reduced, and the real-time performance of executing intelligent tasks according to the clipped network model is improved.
Drawings
FIG. 1 is a flow diagram of a network model clipping method according to an exemplary embodiment of the present application;
FIG. 2 is a flow diagram of one implementation of a network model clipping scheme shown in an exemplary embodiment of the present application;
FIG. 3A is a schematic diagram of the curve corresponding to the L2 norms of the channels of a layer after sorting, according to an exemplary embodiment of the present application;
FIG. 3B is a schematic diagram of R-GFLOPs curves corresponding to training results obtained by training the original network model for different numbers of rounds under an equal-weight sparse constraint, according to an exemplary embodiment of the present application, where different curves represent different numbers of training rounds;
fig. 4 is a schematic structural diagram of a network model clipping device according to an exemplary embodiment of the present application;
fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In order to enable those skilled in the art to better understand the technical solutions provided in the embodiments of the present application, the following is a brief description of some technical terms related to the embodiments of the present application and application scenarios of the embodiments of the present application.
1. Technical terminology
1. A Neural Network (NN), also called an Artificial Neural Network (ANN), is an algorithmic mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. Such a network relies on the complexity of the system and processes information by adjusting the interconnections among a large number of internal nodes.
A neural network consists of a large number of nodes (or "neurons") and the interconnections among them. Each node represents a specific output function, called an excitation function or activation function. Each connection between two nodes carries a weight for the signal passing through that connection. The output of the network differs according to the connection pattern, the weight values, and the excitation functions.
2. A Deep Neural Network (DNN) is a neural network with at least one hidden layer, capable of modeling complex nonlinear systems.
3. A Convolutional Neural Network (CNN) is a class of feedforward neural networks that involve convolutional computation and have a deep structure, and is one of the representative algorithms of deep learning. Convolutional neural networks are built by imitating the biological visual perception mechanism and support both supervised and unsupervised learning. The parameter sharing of convolution kernels within the hidden layers and the sparsity of inter-layer connections allow a convolutional neural network to learn grid-like features, such as pixels and audio, with a small amount of computation, with stable performance and without additional feature engineering requirements on the data.
2. Application scenario
The technical solution provided by the embodiments of the present application can be applied to fields that require a neural network, such as target detection, target segmentation, and target recognition. In practical applications, according to the limitations of the computing and storage resources of the specific application scenario and application device (such as a camera), the network model deployed on the application device can be clipped using the technical solution provided by the embodiments of the present application, thereby reducing the model's demand for resources and improving the real-time performance of the application device in executing intelligent tasks according to the clipped network model.
The following is an example.
Application scenario 1: target detection and segmentation
Taking an autonomous driving scenario as an example: detecting and segmenting targets such as pedestrians and vehicles on the street is essential for a vehicle to make safe driving decisions.
In this application scenario, using the technical solution provided by the embodiments of the present application, network model clipping can be performed based on the computing resources of the in-vehicle device to obtain a target detection and segmentation model structure matched to those resources. A target detection and segmentation model with this structure running on the vehicle can accurately detect, locate, and segment targets in the images acquired by the in-vehicle image acquisition device, improving the real-time performance of target detection and segmentation and thereby the safety of autonomous driving.
Application scenario 2: target recognition
Taking the license plate recognition scenario of a smart parking lot as an example: the license plates of vehicles entering and leaving the parking lot are recognized to control vehicle traffic, i.e., to allow or refuse passage.
In this application scenario, using the technical solution provided by the embodiments of the present application, network model clipping can be performed based on the computing resources of the barrier gate control device to obtain a target recognition model structure matched to those resources. A target recognition model with this structure running on the barrier gate control device can accurately recognize the license plates of vehicles in the images acquired by the barrier gate image acquisition device, improving the real-time performance of vehicle recognition and thereby the efficiency of traffic control in the smart parking lot.
In order to make the above objects, features and advantages of the embodiments of the present application more comprehensible, the following describes the technical solutions of the embodiments of the present application in detail with reference to the accompanying drawings.
Referring to fig. 1, a flow chart of a network model clipping method provided in an embodiment of the present application, as shown in fig. 1, the network model clipping method may include the following steps:
Step S100: performing sparse constraint training on the original network model.
Step S110: counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule.
The clipping parameter-optimization target amount curve counted from a training result represents the optimization target amount of that training result after clipping with different clipping parameters.
For example, take the training result of M rounds of sparse constraint training performed on the original network model in the same training process: according to this training result (the network model after M rounds of sparse constraint training), the optimization target amount after clipping with different clipping parameters can be determined, and the clipping parameter-optimization target amount curve corresponding to this training result can be obtained.
By way of example, the clipping parameter may be a parameter shared by all layers of the network model, i.e., the clipping parameter is the same for each layer, and it is associated with the number of channels clipped in each layer. For any layer of the network model, the clipping parameter characterizes the ratio of the sum of the L2 norms of the channels clipped in that layer to the sum of the L2 norms of all channels of that layer.
For example, since the sparsification speeds of different layers of the network model usually differ, the number of channels to be clipped may differ from layer to layer under the same clipping parameter, i.e., the clipping ratio may differ per layer.
For any training result, the clipping parameter is positively correlated with the clipping ratio of each layer of the training result: the larger the clipping parameter, the larger the clipping ratio of each layer, i.e., the more channels each layer needs to clip; the smaller the clipping parameter, the smaller the clipping ratio of each layer, i.e., the fewer channels each layer needs to clip.
It should be noted that, for any layer of the network model, when channel clipping is performed on that layer, the channels can be sorted in ascending order of their L2 norms, and then as many of the first channels in this ordering are clipped as the number of channels to be clipped; the specific implementation is described below and not repeated here.
Illustratively, the L2 norm may be used to measure the importance of a channel, where the L2 norm of a channel is the sum of squares of all elements of the channel.
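As a concrete sketch of this channel importance measure (an illustration under the patent's definition, where the per-channel score is the sum of squares of the channel's elements; the function names are assumptions), channel scores for a convolutional layer can be computed and sorted as follows:

```python
import torch

def channel_scores(conv_weight: torch.Tensor) -> torch.Tensor:
    """Per-output-channel importance: sum of squares of all elements of the channel.

    conv_weight has shape (C_out, C_in, kH, kW); one score per output channel.
    """
    return conv_weight.pow(2).flatten(start_dim=1).sum(dim=1)

def sorted_channel_curve(conv_weight: torch.Tensor):
    """Returns f(x): channel scores sorted in ascending order, plus channel indices."""
    scores = channel_scores(conv_weight)
    sorted_scores, order = torch.sort(scores)  # ascending: least important first
    return sorted_scores, order
```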
In the embodiments of the present application, in order to determine the clipping ratio of each layer automatically, sparse constraint training can be performed on the original network model, and clipping parameter-optimization target amount curves are counted from the training results of different numbers of training rounds, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule (suppose N1 curves have been counted when the preset stopping rule is satisfied), with N1 ≥ 2.
For example, the corresponding clipping parameter-optimization target amount curve can be counted from the training result of Mi rounds of equal-weight sparse constraint training on the original network model (i.e., taking the sparse constraint training in step S100 to be equal-weight sparse constraint training as an example), i = 1, 2, …; the larger i is, the larger Mi is.
For the specific implementation of equal-weight sparse constraint training of a network model, reference can be made to the description of the conventional equal-weight sparse constraint scheme, which is not repeated here.
The preset stopping rule includes that the variation of the target clipping parameter with the increasing number of training rounds does not exceed a preset value range; that is, when the target clipping parameter stabilizes as the number of training rounds increases, it can be determined that the clipping parameter-optimization target amount curves satisfy the preset stopping rule.
The target clipping parameter is the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation.
For example, the optimization target amount may include, but is not limited to, computation, parameter count, and/or time consumption.
For ease of understanding and description, the following takes the optimization target amount to be computation as an example; the description applies equally when the optimization target amount is parameter count or time consumption.
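Since computation and parameter count are the optimization target amounts discussed below, a standard per-layer estimate (not specified in the patent; the formula below is the usual convolution cost estimate, with stride and padding handled by the caller) illustrates what is being measured:

```python
def conv_flops_and_params(c_in: int, c_out: int, k: int, out_h: int, out_w: int):
    """Rough per-layer cost of a k x k convolution: multiply-accumulates and weights."""
    params = c_out * c_in * k * k
    flops = params * out_h * out_w  # every output position reuses all weights
    return flops, params

# Clipping channels shrinks c_in/c_out of adjacent layers, which is how the shared
# clipping parameter R maps to a post-clipping computation (R-GFLOPs) curve.
```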
Step S120: determining a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount.
Step S130: determining the clipping ratio of each layer according to the first target clipping parameter.
In the embodiments of the present application, when the network model needs to be clipped, the expected computation of the clipped network model (which may be called the target post-clipping computation or the preset post-clipping computation) can be set; the network model is then clipped so that its computation reaches the preset post-clipping computation.
For example, when N1 clipping parameter-computation curves have been counted in the manner described in steps S100 to S110, the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation (referred to herein as the first target clipping parameter) can be determined according to the N1 curves and the preset post-clipping computation, and the clipping ratio of each layer can be determined according to the first target clipping parameter; that is, after clipping according to the clipping ratios of the layers, the computation of the clipped network model equals the preset post-clipping computation.
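The following is a minimal sketch of reading the clipping parameter off one clipping parameter-computation curve given the preset post-clipping computation T (the patent does not fix how R is read off the curve; linear interpolation between sampled points is an assumption):

```python
import numpy as np

def target_clipping_parameter(r_values: np.ndarray,
                              flops_values: np.ndarray,
                              target_flops: float) -> float:
    """Given a sampled R-GFLOPs curve (post-clipping computation as a function of
    the clipping parameter R), return the R whose computation equals target_flops.

    FLOPs decrease monotonically as R grows, so both arrays are reversed to give
    np.interp the increasing x-axis it requires.
    """
    return float(np.interp(target_flops, flops_values[::-1], r_values[::-1]))
```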
Step S140: performing variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and clipping the sparse model according to the clipping ratio of each layer.
In this embodiment, once the clipping ratio of each layer has been determined in the manner described in steps S100 to S130, variable-weight sparse constraint training can be performed on the original network model according to the clipping ratio of each layer, obtaining a sparse model corresponding to the original network model, i.e., a network model in which some channel parameters are 0 or close to 0.
For the obtained sparse model, the channels to be clipped in each layer can be determined according to the clipping ratios determined above, and the sparse model is clipped to obtain the clipped network model, reducing the model's demand for computing resources and improving the real-time performance of executing intelligent tasks according to the clipped network model.
It should be noted that, after the sparse model is clipped in the above manner and before the clipped model is applied, parameter fine-tuning can also be performed on the clipped network model; the embodiments of the present application do not limit this.
It can be seen that, in the method flow shown in fig. 1, several clipping parameter-optimization target amount curves are counted according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process; a first target clipping parameter is determined according to the counted curves and a preset post-clipping computation; and the clipping ratio of each layer is determined according to the first target clipping parameter, so that the clipping ratio of each layer of the network model is determined automatically. Furthermore, variable-weight sparse constraint training can be performed on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and the sparse model is clipped according to the clipping ratio of each layer, realizing automatic clipping of the network model based on a variable-weight sparse constraint; in turn, the demand of the clipped network model for computing resources is reduced, and the real-time performance of executing intelligent tasks according to the clipped network model is improved.
In some embodiments, in step S110, counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process may include:
in the same training process, counting one clipping parameter-optimization target amount curve every preset number of rounds, according to the training result of the current number of rounds of sparse constraint training performed on the original network model.
For example, in order to capture the trend of the clipping parameter-computation curve more reasonably, a clipping parameter-computation curve of the trained network model can be counted at fixed intervals of training rounds.
For example, a number of training rounds (which may be called the preset number of rounds, e.g., 5,000 or 20,000 rounds) can be set in advance, and every preset number of rounds, a clipping parameter-computation curve is counted according to the training result of the current number of rounds of sparse constraint training on the original network model.
For example, assuming that the preset number of rounds is 20,000, clipping parameter-computation curves can be counted according to the training result of 20,000 rounds of sparse constraint training on the original network model (i.e., the trained network model), the training result of 40,000 rounds, and so on.
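As a sketch of this periodic statistics step (the round interval, helper names, and loop shape are illustrative assumptions, not the patent's code; target_clipping_parameter is the earlier sketch, and count_curve is a hypothetical helper that samples the current R-GFLOPs curve):

```python
def train_with_curve_statistics(model, train_one_round, count_curve, stopping_rule,
                                target_flops, interval=20000, max_rounds=200000):
    """Count one R-GFLOPs curve every `interval` rounds until the stopping rule holds."""
    curves, second_targets = [], []
    for round_idx in range(1, max_rounds + 1):
        train_one_round(model)  # one round of equal-weight sparse constraint training
        if round_idx % interval == 0:
            r_grid, flops = count_curve(model)
            curves.append((r_grid, flops))
            # Second target clipping parameter: the R at which this curve reaches
            # the preset post-clipping computation target_flops.
            second_targets.append(target_clipping_parameter(r_grid, flops, target_flops))
            if stopping_rule(second_targets):
                break
    return curves, second_targets
```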
In some embodiments, in step S110, the counted clipping parameter-optimization target amount curves satisfying the preset stopping rule may include:
the differences between adjacent second target clipping parameters satisfy a preset stopping rule; for any counted clipping parameter-optimization target amount curve, the second target clipping parameter corresponding to that curve is the clipping parameter determined according to the preset post-clipping optimization target amount and that curve.
By way of example, when the number of rounds of equal-weight sparse constraint training on the original network model reaches a certain level in the same training process, the sparsity of the trained network model changes less and less, i.e., it gradually stabilizes. In this case, clipping parameters determined from such a trained network model ensure, as far as possible, that the clipped parts are the parts of lower importance, so the clipped network model structure is better, while the workload of sparse constraint training is kept as small as possible.
Considering that once the sparsity of the network model stabilizes, what matters is the clipping parameter that brings the computation of the clipped network model to the preset post-clipping computation, whether the clipping parameter-computation curves satisfy the preset stopping rule can be determined from the clipping parameters (which may be called second target clipping parameters) that bring the post-clipping computation to the preset value on the curves corresponding to different training results (training results of different numbers of rounds of sparse constraint training on the original network model).
For example, assuming that for clipping parameter-computation curve 1, the clipping parameter that brings the post-clipping computation to the preset value (assumed to be T) is R0, then (R0, T) is a point on curve 1.
For example, whether the counted clipping parameter-computation curves satisfy the preset stopping rule can be determined according to whether the differences between adjacent second target clipping parameters satisfy the preset stopping rule: when they do, the counted clipping parameter-computation curves are determined to satisfy the preset stopping rule.
It should be noted that as the number of training rounds increases, the sparsity of the trained network model gradually increases; correspondingly, the clipping parameter that brings the post-clipping computation to the preset value (i.e., the second target clipping parameter) becomes smaller. The second target clipping parameter corresponding to the curve counted from a training result therefore gradually decreases as the number of training rounds increases.
For example, denoting the second target clipping parameter corresponding to the i-th counted clipping parameter-computation curve by second target clipping parameter i, second target clipping parameter i and second target clipping parameter (i+1) are adjacent second target clipping parameters.
In one example, the differences between adjacent second target clipping parameters satisfying a preset stopping rule may include:
among N2 consecutive second target clipping parameters, the difference between each pair of adjacent second target clipping parameters is smaller than a preset threshold.
Illustratively, 2 ≤ N2 ≤ N1.
For example, considering that when the sparsity of the network model stabilizes, the change in the second target clipping parameter corresponding to the clipping parameter-computation curve is also small, whether the clipping parameter-computation curves satisfy the preset stopping rule can be determined according to the differences between adjacent second target clipping parameters.
For example, when among N2 consecutive second target clipping parameters the difference between each pair of adjacent ones is smaller than the preset threshold, it can be determined that the differences between adjacent second target clipping parameters satisfy the preset stopping rule.
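A minimal sketch of this stopping check (assuming the N2 most recent second target clipping parameters are examined; the names and the threshold value are illustrative, and the callable shape matches the training-loop sketch above):

```python
def stopping_rule(second_targets, n2=3, threshold=0.01):
    """True if, among the last n2 second target clipping parameters, every
    adjacent pair differs by less than `threshold`."""
    if len(second_targets) < n2:
        return False
    recent = second_targets[-n2:]
    return all(abs(a - b) < threshold for a, b in zip(recent, recent[1:]))
```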
As an example, determining the first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and the preset post-clipping optimization target amount may include:
determining the first target clipping parameter according to the N2 consecutive second target clipping parameters.
For example, when among the N2 consecutive second target clipping parameters the difference between each pair of adjacent ones is smaller than the preset threshold, it can be determined that the preset stopping rule is satisfied, and the first target clipping parameter is determined according to these N2 consecutive second target clipping parameters.
For example, any one of the N2 consecutive second target clipping parameters may be taken as the first target clipping parameter.
For example, the first, or the last, of the N2 consecutive second target clipping parameters may be taken as the first target clipping parameter.
In some embodiments, in step S130, determining the clipping ratio of each layer according to the first target clipping parameter may include:
for a layer i with A channels in the original network model, determining the clipping ratio r of layer i according to the following strategy: determine the number a of channels to be clipped such that

(f(1) + f(2) + … + f(a)) / (f(1) + f(2) + … + f(A)) = R

and let

r = a/A

where R is the first target clipping parameter, and y = f(x) is the curve of the L2 norms of the channels of layer i sorted in ascending order; for any channel, the L2 norm of the channel is the sum of squares of all elements of the channel.
For example, the clipping parameter can be set to the ratio of the sum of the L2 norms of the clipped channels in a layer to the sum of the L2 norms of all channels of that layer.
For layer i, once the first target clipping parameter has been determined, the number a of channels to be clipped in layer i can be determined from it, and then the clipping ratio r of layer i is obtained.
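A sketch of this step follows (one plausible reading; since the cumulative norm fraction is discrete, the sketch takes the largest a whose fraction does not exceed R, and this rounding rule is an assumption):

```python
import torch

def layer_clipping_ratio(conv_weight: torch.Tensor, R: float) -> float:
    """Clipping ratio r = a/A for one layer, given the shared clipping parameter R."""
    scores = conv_weight.pow(2).flatten(start_dim=1).sum(dim=1)  # per-channel L2 norm
    f, _ = torch.sort(scores)                                    # ascending: f(1..A)
    fractions = torch.cumsum(f, dim=0) / f.sum()
    a = int((fractions <= R).sum())                              # largest a with fraction <= R
    A = f.numel()
    return a / A
```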
In order to enable those skilled in the art to better understand the technical solutions provided by the embodiments of the present application, the technical solutions provided by the embodiments of the present application are described below with reference to specific examples.
Considering that the traditional variable-weight sparse constraint scheme requires manually setting the clipping ratio of each layer and thus cannot achieve automatic clipping, the embodiments of the present application provide a network model clipping scheme that combines the equal-weight and variable-weight sparse constraint schemes, realizing automatic clipping of the network model based on the variable-weight sparse constraint scheme.
The clipping ratio of each layer can be determined through the equal-weight sparse constraint scheme, and the network model is clipped through the variable-weight sparse constraint scheme according to the determined clipping ratios.
Referring to fig. 2, in this embodiment, the implementation flow of the network model clipping scheme may include a structure selection part and a constraint clipping part, as shown in fig. 2.
The structure selection part can be implemented by a structure selection module, which is based on the equal-weight sparse constraint scheme and determines the clipping ratio of each layer of the network model, yielding the structure of the clipped network model.
The constraint clipping part can be implemented by a constraint clipping module, which is based on the variable-weight sparse constraint scheme and clips the network model according to the determined clipping ratios, realizing reconstruction of the network model parameters.
The implementation flow of the structure selection part is, for example, as follows:
1.1. Perform sparse constraint training on the original model.
1.2. Every fixed number of epochs (i.e., the preset number of rounds), count R-GFLOPs curves (i.e., the clipping parameter-computation curves) from the trained network model (i.e., the training result).
1.3. When the R-GFLOPs curves satisfy the preset stopping rule, stop training and obtain the clipping ratio of each layer, i.e., determine the target sparsity of each layer (which may be denoted S), obtaining a clipping ratio list (the clipping ratio of each layer is recorded in the list).
For example, the structure of the clipped model can be determined according to the computation (GFLOPs) of the network model under different clipping parameters (R).
Illustratively, the importance of a channel can be measured by the sum of squares of all elements of the channel (i.e., its L2 norm).
For any layer, assume the layer has A channels; the L2 norms of the A channels form a vector of length A. The curve of the channels' L2 norms after sorting (in ascending order of L2 norm) is denoted y = f(x); a schematic diagram is shown in fig. 3A.
As shown in fig. 3A, the ordinate is the L2 norm of each channel after sorting, and the abscissa is the channel number (which may also be called the index) of each channel after sorting in ascending order of L2 norm.
Assuming the layer needs to clip a channels, the clipping parameter is

R = (f(1) + f(2) + … + f(a)) / (f(1) + f(2) + … + f(A))

Illustratively, all layers of the network model share the clipping parameter R, but the clipping ratio a/A differs per layer.
For example, the original model can be trained with the equal-weight sparse constraint. As the equal-weight sparse constraint training proceeds, attention is paid to how the correspondence between the computation of the trained model (i.e., GFLOPs) and the clipping parameter R changes.
For the training results of different numbers of training rounds, the corresponding R-GFLOPs curves can be counted; a schematic diagram is shown in fig. 3B.
As shown in fig. 3B, different curves represent the R-GFLOPs curves corresponding to the training results obtained by training the original network model with the equal-weight sparse constraint for different numbers of rounds (in fig. 3B, 20,000, 40,000, 60,000, 80,000 and 100,000 rounds in order, i.e., the preset number of rounds is 20,000).
Assuming the preset post-clipping computation is T, it can be seen from the curves in fig. 3B that the clipping parameter at which the post-clipping computation equals T (i.e., the second target clipping parameter) gradually decreases as the number of training rounds increases, and the speed of this decrease slows, i.e., the parameter gradually stabilizes.
For example, the clipping parameter corresponding to the stabilized R-GFLOPs curve (which may be denoted R) can be used as the final clipping parameter (i.e., the first target clipping parameter); according to R, the sorted L2-norm curve y = f(x) of each layer, and the channel number A of each layer, the number a of channels to be clipped in each layer is determined, and thereby the clipping ratio of each layer.
For example, once the clipping ratio of each layer has been determined, the network model can be clipped according to the flow of the constraint clipping part shown in fig. 2: variable-weight sparse constraint training is performed on the original network model according to the clipping ratios of the layers, the network model is clipped, and the clipped network model is fine-tuned.
Thus, the network model clipping scheme provided by the embodiments of the present application needs no neural architecture search: the clipped model structure is determined using the equal-weight sparse constraint scheme, and while the initial model is not yet sparse, the structure selection module automatically selects a suitable clipped model structure according to the sparsification speed of each layer. The constraint clipping module then sparsifies precisely according to the structure output by the structure selection module: only the parts that need to be clipped are sparsified, while the parts that should not be clipped are protected.
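As an illustration of the constraint clipping stage (a sketch under assumptions: penalizing only each layer's lowest-norm channels, selected by the clipping ratio list, is one plausible realization of the variable-weight constraint, not the patent's verbatim algorithm):

```python
import torch
import torch.nn as nn

def variable_weight_penalty(model: nn.Module, clip_ratios: dict) -> torch.Tensor:
    """Group-lasso penalty applied only to the channels selected for clipping.

    clip_ratios maps layer name -> clipping ratio r = a/A determined by the
    structure selection part; unselected channels receive no penalty (protected).
    """
    terms = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and name in clip_ratios:
            norms = module.weight.flatten(start_dim=1).norm(p=2, dim=1)
            a = int(round(clip_ratios[name] * norms.numel()))
            if a > 0:
                smallest, _ = torch.topk(norms, k=a, largest=False)
                terms.append(smallest.sum())
    return torch.stack(terms).sum() if terms else torch.zeros(())
```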
The methods provided herein are described above. The apparatus provided in this application is described below:
referring to fig. 4, a schematic structural diagram of a network model clipping device provided in an embodiment of the present application, as shown in fig. 4, the network model clipping device may include:
A pre-training unit 410, configured to perform sparse constraint training on the original network model;
a statistics unit 420, configured to count N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule; the clipping parameter-optimization target amount curve counted from any training result represents the optimization target amount of that training result after clipping with different clipping parameters; the clipping parameter is positively correlated with the clipping ratio of each layer of the training result; N1 ≥ 2; the preset stopping rule includes that the variation of the target clipping parameter with the increasing number of training rounds does not exceed a preset value range, and the target clipping parameter is the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation;
a determining unit 430, configured to determine a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount;
the determining unit 430 is further configured to determine the clipping ratio of each layer according to the first target clipping parameter;
and a processing unit 440, configured to perform variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and to clip the sparse model according to the clipping ratio of each layer.
In some embodiments, the statistics unit 420 counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process includes:
in the same training process, counting one clipping parameter-optimization target amount curve every preset number of rounds, according to the training result of the current number of rounds of sparse constraint training performed on the original network model.
In some embodiments, the counted clipping parameter-optimization target amount curves satisfying a preset stopping rule includes:
the differences between adjacent second target clipping parameters satisfy a preset stopping rule; for any counted clipping parameter-optimization target amount curve, the second target clipping parameter corresponding to that curve is the clipping parameter determined according to the preset post-clipping optimization target amount and that curve.
In some embodiments, the differences between adjacent second target clipping parameters satisfying a preset stopping rule includes:
among N2 consecutive second target clipping parameters, the difference between each pair of adjacent second target clipping parameters is smaller than a preset threshold, with 2 ≤ N2 ≤ N1.
In some embodiments, the determining unit 430 determining the first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount includes:
determining the first target clipping parameter according to the N2 consecutive second target clipping parameters.
in some embodiments, the determining unit 430 determines a clipping ratio of each layer according to the first target clipping parameter, including:
for a layer i with A channels in the original network model, determining the clipping ratio r of layer i according to the following strategy: determine the number a of channels to be clipped such that

(f(1) + f(2) + … + f(a)) / (f(1) + f(2) + … + f(A)) = R

and let

r = a/A

where R is the first target clipping parameter, and y = f(x) is the curve of the L2 norms of the channels of layer i sorted in ascending order; for any channel, the L2 norm of the channel is the sum of squares of all elements of the channel.
Fig. 5 is a schematic hardware structure of an electronic device according to an embodiment of the present application. The electronic device may include a processor 501 and a memory 502 storing machine-executable instructions. The processor 501 and the memory 502 may communicate via a system bus 503. By reading and executing, in the memory 502, the machine-executable instructions corresponding to the network model clipping control logic, the processor 501 can perform the network model clipping method described above.
The memory 502 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information, such as executable instructions or data. For example, a machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., optical disk, DVD, etc.), or a similar storage medium, or a combination thereof.
In some embodiments, a machine-readable storage medium, such as the memory 502 in fig. 5, is also provided, having stored therein machine-executable instructions that, when executed by a processor, implement the network model clipping method described above. For example, the machine-readable storage medium may be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
It is noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing description of the preferred embodiments of the present invention is not intended to limit the invention to the precise form disclosed, and any modifications, equivalents, improvements and alternatives falling within the spirit and principles of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A network model clipping method, characterized in that the method is used for adjusting a network model originally used by a terminal device for executing vehicle intelligent tasks, so that the computing resources of the terminal device fit the adjusted network model, and the adjusted network model, when used by the terminal device to execute the vehicle intelligent tasks, improves the real-time performance of the vehicle intelligent tasks; the vehicle intelligent tasks at least comprise: vehicle detection and license plate detection; the method is applied to the terminal device and comprises the following steps:
performing sparse constraint training on an original network model; counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process, until the counted clipping parameter-optimization target amount curves satisfy a preset stopping rule; the clipping parameter-optimization target amount curve counted from any training result represents the optimization target amount of that training result after clipping with different clipping parameters, the optimization target amount comprising: computation, parameter count and/or time consumption of the terminal device; the clipping parameter is positively correlated with the clipping ratio of each layer of the training result; N1 ≥ 2; the preset stopping rule comprises that the variation of the target clipping parameter with the increasing number of training rounds does not exceed a preset value range, and the target clipping parameter is the clipping parameter that makes the computation of the clipped network model equal to the preset post-clipping computation;
determining a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount;
determining the clipping ratio of each layer according to the first target clipping parameter;
performing variable-weight sparse constraint training on the original network model according to the clipping ratio of each layer to obtain a sparse model corresponding to the original network model, and clipping the sparse model according to the clipping ratio of each layer, wherein the clipped sparse model, when used for the vehicle intelligent tasks, demands fewer computing resources than the original network model;
and executing the vehicle intelligent tasks using the clipped sparse model.
2. The method according to claim 1, wherein counting N1 clipping parameter-optimization target amount curves according to the training results of different numbers of rounds of sparse constraint training performed on the original network model in the same training process comprises:
in the same training process, counting one clipping parameter-optimization target amount curve every preset number of rounds, according to the training result of the current number of rounds of sparse constraint training performed on the original network model.
3. The method according to claim 1, wherein the counted clipping parameter-optimization target amount curves satisfying a preset stopping rule comprises:
the differences between adjacent second target clipping parameters satisfy a preset stopping rule; for any counted clipping parameter-optimization target amount curve, the second target clipping parameter corresponding to that curve is the clipping parameter determined according to the preset post-clipping optimization target amount and that curve.
4. The method according to claim 3, wherein the differences between adjacent second target clipping parameters satisfying a preset stopping rule comprises:
among N2 consecutive second target clipping parameters, the difference between each pair of adjacent second target clipping parameters is smaller than a preset threshold, with 2 ≤ N2 ≤ N1.
5. The method according to claim 4, wherein determining the first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount comprises:
determining the first target clipping parameter according to the N2 consecutive second target clipping parameters.
6. The method of claim 1, wherein determining the cropping proportions of the layers according to the first target cropping parameters comprises:
for a layer i with A channels in the original network model, determining the clipping proportion r of layer i according to the following strategy:
r = a / A
where R is the first target clipping parameter, y = f(x) is the curve obtained by sorting the L2 norms of the channels of layer i in ascending order (for any channel, the L2 norm of a channel being the sum of the squares of all of its elements), and a is the number of channels to be clipped in layer i.
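The claim relates r, a, A and R through the sorted-norm curve y = f(x), but this translation does not spell out how a is read off the curve from R. A minimal sketch, assuming purely for illustration that the clipped channels are those whose norm falls below R times the largest norm:

```python
import numpy as np

def layer_clipping_proportion(weight, R):
    """Clipping proportion r = a / A for a layer whose weight tensor
    has A channels along axis 0. Per claim 6, a channel's 'L2 norm' is
    the sum of squares of its elements; the rule used below for picking
    a from R and the sorted norms is an assumption, not the patent's."""
    A = weight.shape[0]
    norms = np.sort((weight.reshape(A, -1) ** 2).sum(axis=1))  # y = f(x), ascending
    a = int(np.sum(norms < R * norms[-1]))   # assumed rule for choosing a
    return a / A                             # r = a / A
```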
7. A network model clipping apparatus, wherein the apparatus is configured to adjust a network model originally used by a terminal device to perform a vehicle intelligent task, so that the computational resources of the terminal device match the adjusted network model and the real-time performance of the vehicle intelligent task is accelerated when the adjusted network model is used by the terminal device to perform the vehicle intelligent task, the vehicle intelligent task comprising at least: vehicle detection and license plate detection; the apparatus is applied to a terminal device, and the apparatus comprises:
a pre-training unit, configured to perform sparse constraint training on the original network model;
a statistics unit, configured to statistically derive N1 clipping parameter-optimization target amount curves from the training results of performing different numbers of rounds of sparse constraint training on the original network model in the same training process, until the statistically derived clipping parameter-optimization target amount curves satisfy a preset stopping rule; wherein a clipping parameter-optimization target amount curve statistically derived from any training result is used to represent the optimization target amount of that training result after clipping according to different clipping parameters, the clipping parameter is positively correlated with the clipping proportion of each layer of the training result, N1 ≥ 2, the preset stopping rule comprises that the variation of the target clipping parameter with an increasing number of training rounds does not exceed a preset range, and the target clipping parameter is the clipping parameter that makes the computation amount of the clipped network model equal to a preset post-clipping computation amount;
a determining unit, configured to determine a first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and a preset post-clipping optimization target amount;
the determining unit being further configured to determine the clipping proportion of each layer according to the first target clipping parameter;
a processing unit, configured to perform variable-weight sparse constraint training on the original network model according to the clipping proportion of each layer to obtain a sparse model corresponding to the original network model, and to clip the sparse model according to the clipping proportion of each layer, wherein the clipped sparse model, when used for the vehicle intelligent task, demands fewer computational resources than the original network model; the clipped sparse model being used to perform the vehicle intelligent task.
8. The apparatus of claim 7, wherein the statistics unit statistically deriving the N1 clipping parameter-optimization target amount curves from the training results of performing different numbers of rounds of sparse constraint training on the original network model in the same training process comprises:
in the same training process, every preset number of rounds, statistically deriving one clipping parameter-optimization target amount curve from the training result of the sparse constraint training performed on the original network model for the current number of rounds;
and/or,
the statistically derived clipping parameter-optimization target amount curves satisfying the preset stopping rule comprises:
the differences between adjacent second target clipping parameters satisfying a preset stopping rule; wherein, for any statistically derived clipping parameter-optimization target amount curve, the second target clipping parameter corresponding to that curve is the clipping parameter determined according to the preset post-clipping optimization target amount and that curve;
wherein the differences between adjacent second target clipping parameters satisfying the preset stopping rule comprises:
among N2 consecutive second target clipping parameters, the difference between every two adjacent second target clipping parameters being smaller than a preset threshold, where 2 ≤ N2 ≤ N1;
the determining unit determining the first target clipping parameter according to the N1 clipping parameter-optimization target amount curves and the preset post-clipping optimization target amount comprises:
determining the first target clipping parameter according to the N2 consecutive second target clipping parameters;
and/or, the determining unit determining the clipping proportion of each layer according to the first target clipping parameter comprises:
for a layer i with A channels in the original network model, determining the clipping proportion r of layer i according to the following strategy:
r = a / A
where R is the first target clipping parameter, y = f(x) is the curve obtained by sorting the L2 norms of the channels of layer i in ascending order (for any channel, the L2 norm of a channel being the sum of the squares of all of its elements), and a is the number of channels to be clipped in layer i.
9. An electronic device, comprising a processor and a memory, the memory storing machine-executable instructions executable by the processor, and the processor being configured to execute the machine-executable instructions to implement the method of any one of claims 1-6.
10. A machine-readable storage medium having stored thereon machine-executable instructions which, when executed by a processor, implement the method of any one of claims 1-6.
CN202110859748.7A 2021-07-28 2021-07-28 Network model clipping method and device, electronic equipment and readable storage medium Active CN113537377B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110859748.7A CN113537377B (en) 2021-07-28 2021-07-28 Network model clipping method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN113537377A (en) 2021-10-22
CN113537377B (en) 2024-01-23

Family

ID=78089558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110859748.7A Active CN113537377B (en) 2021-07-28 2021-07-28 Network model clipping method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113537377B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132281A (en) * 2020-09-29 2020-12-25 腾讯科技(深圳)有限公司 Model training method, device, server and medium based on artificial intelligence
CN112163628A (en) * 2020-10-10 2021-01-01 北京航空航天大学 Method for improving target real-time identification network structure suitable for embedded equipment
WO2021143070A1 (en) * 2020-01-16 2021-07-22 北京智芯微电子科技有限公司 Compression method and apparatus for deep neural network model, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DeepCPU: Serving RNN-based Deep Learning Models 10x Faster; Minjia Zhang et al.; Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC '18); full text *
Research on Compression Algorithms for Convolutional Neural Networks (卷积神经网络压缩算法研究); Ju Weijian; China Master's Theses Full-text Database, Information Science and Technology series *

Also Published As

Publication number Publication date
CN113537377A (en) 2021-10-22

Similar Documents

Publication Publication Date Title
CN110175671B (en) Neural network construction method, image processing method and device
CN110097755B (en) Highway traffic flow state identification method based on deep neural network
CN113221905B (en) Semantic segmentation unsupervised domain adaptation method, device and system based on uniform clustering and storage medium
CN111738098B (en) Vehicle identification method, device, equipment and storage medium
CN113362491B (en) Vehicle track prediction and driving behavior analysis method
KR20200047307A (en) Cnn-based learning method, learning device for selecting useful training data and test method, test device using the same
KR20210030063A (en) System and method for constructing a generative adversarial network model for image classification based on semi-supervised learning
CN111027347A (en) Video identification method and device and computer equipment
CN116824862A (en) Intelligent tunnel traffic operation control method, device and medium
CN112598062A (en) Image identification method and device
CN114118345A (en) Training method for generating countermeasure network, data classification method and related equipment
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN114419005A (en) Crack automatic detection method based on improved light weight CNN and transfer learning
CN113537377B (en) Network model clipping method and device, electronic equipment and readable storage medium
CN117456732A (en) Signal lamp monitoring management method and device for intelligent urban traffic based on big data and computing equipment
CN110619255A (en) Target detection method and device
CN113554169B (en) Model optimization method, device, electronic equipment and readable storage medium
CN117150300A (en) Method and device for losing weight balance and electronic equipment
CN115984671A (en) Model online updating method and device, electronic equipment and readable storage medium
CN113887536B (en) Multi-stage efficient crowd density estimation method based on high-level semantic guidance
CN117576148A (en) Target tracking method, device, electronic equipment and storage medium
CN117671320B (en) Point cloud three-dimensional target automatic labeling method and system based on multi-model fusion
CN113313197B (en) Full-connection neural network training method
CN118658303B (en) Low-delay edge cooperative data processing method and system
CN117523832A (en) Tunnel traffic flow control method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant