CN111325309A - Model architecture adjusting method and device - Google Patents


Info

Publication number
CN111325309A
Authority
CN
China
Prior art keywords
convolution
layer
ordering
kernel
convolutional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811524843.6A
Other languages
Chinese (zh)
Inventor
张鹏国
孔露森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Uniview Technologies Co Ltd
Original Assignee
Zhejiang Uniview Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Uniview Technologies Co Ltd filed Critical Zhejiang Uniview Technologies Co Ltd
Priority to CN201811524843.6A
Publication of CN111325309A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

The application provides a model architecture adjusting method and device. The method comprises the following steps: for each convolutional layer of the CNN model to be adjusted, obtaining an influence ranking of the convolution kernels in the layer based on the ranking acquisition policy corresponding to that layer; masking the lowest-influence kernels in the ranking step by step, at a preset masking rate interval, and after each masking step testing the layer's target computation accuracy on a validation set using the remaining unmasked kernels, until the drop of the target accuracy relative to the initial accuracy exceeds a preset drop threshold; and selecting, from among the masking rates at which the drop did not exceed the threshold, one masking rate as the layer's pruning rate, and pruning away at that rate the lowest-influence kernels in the corresponding influence ranking. The model architecture is thereby slimmed and optimized without affecting computation accuracy, greatly reducing the convolution computation and the storage resources occupied.

Description

Model architecture adjusting method and device
Technical Field
The present invention relates to the technical field of Convolutional Neural Network (CNN) model adjustment, and in particular, to a method and an apparatus for adjusting a model architecture.
Background
With the continuous development of science and technology, CNN models are applied ever more widely across industries. Their architectures have grown steadily deeper, sharply increasing both the number of parameters and the amount of convolution computation, so more computing resources are needed to meet the models' computational demands. In a CNN model, the convolutional layers account for most of the total computation, so the model's total computation can be reduced by reducing the number of convolution parameters that participate in the convolutional layers' computation.
Currently, the mainstream practice in industry is to zero out the unimportant convolution parameters within each convolution kernel of a convolutional layer, so that the CNN model computes only with the non-zeroed parameters of each kernel, thereby reducing the model's total computation. However, while preserving the model's computation accuracy, this approach neither significantly reduces the model's computation nor reduces the storage resources it occupies.
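The limitation described above can be made concrete: zeroing individual weights leaves the dense weight tensor, and hence its storage, unchanged, whereas removing whole kernels genuinely shrinks it. A minimal illustration (the shapes and the 0.5 threshold are arbitrary, not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32, 3, 3))      # a conv layer with 64 kernels

# Mainstream approach: zero out "unimportant" weights in place.
sparse = w.copy()
sparse[np.abs(sparse) < 0.5] = 0.0
# The tensor keeps its dense shape, so storage (and naive compute) is unchanged.
assert sparse.shape == w.shape and sparse.nbytes == w.nbytes

# Architecture pruning: drop 16 whole kernels; the tensor genuinely shrinks.
pruned = w[:48]
assert pruned.nbytes == w.nbytes * 48 // 64
```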
Disclosure of Invention
In order to overcome the above disadvantages of the prior art, an object of the present application is to provide a model architecture adjusting method and apparatus. By pruning the model architecture of a CNN model while preserving its computation accuracy, the method can significantly reduce the model's convolution computation and the storage resources it occupies.
Regarding the method, an embodiment of the present application provides a model architecture adjustment method applied to a convolutional neural network (CNN) model debugging device. The debugging device prestores a ranking acquisition policy, which is used to compute an influence ranking among the convolution kernels of a single convolutional layer of a CNN model. The method includes:
for each convolutional layer of the CNN model to be adjusted, obtaining an influence ranking of the convolution kernels in the layer based on the ranking acquisition policy corresponding to that layer;
masking, step by step at a preset masking rate interval, at least one lowest-influence convolution kernel in the influence ranking, and after each masking step testing the layer's target computation accuracy on a validation set using the remaining unmasked kernels, until the drop of the target accuracy relative to the initial accuracy exceeds a preset drop threshold, where the initial accuracy is the layer's accuracy on the validation set with no kernels masked;
and selecting, as the layer's pruning rate, one masking rate from among all masking rates at which the drop did not exceed the threshold, and pruning away at that rate at least one lowest-influence convolution kernel in the layer's influence ranking matched to the ranking acquisition policy corresponding to the pruning rate.
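The steps above can be sketched as a per-layer mask-and-test loop. This is an illustrative reading of the claim, not the patent's implementation: `rank_metric` and `eval_accuracy` are hypothetical callables standing in for the ranking acquisition policy and the validation-set test, and the loop returns the largest masking rate whose accuracy drop stays within the threshold.

```python
import numpy as np

def prune_layer(kernels, rank_metric, eval_accuracy, step=0.1, max_drop=0.01):
    """Sketch of the claimed per-layer procedure.

    kernels       : sequence of per-kernel parameter arrays
    rank_metric   : maps one kernel -> influence score (e.g. variance)
    eval_accuracy : maps a boolean keep-mask -> accuracy on the validation set
    step          : the preset masking rate interval
    max_drop      : the preset accuracy drop threshold
    """
    n = len(kernels)
    # Ascending order: lowest-influence kernels come first.
    order = np.argsort([rank_metric(k) for k in kernels])
    base_acc = eval_accuracy(np.ones(n, dtype=bool))   # initial accuracy, nothing masked
    best_rate = 0.0
    rate = step
    while rate <= 1.0:
        keep = np.ones(n, dtype=bool)
        keep[order[: int(round(rate * n))]] = False    # mask the lowest-influence kernels
        if base_acc - eval_accuracy(keep) > max_drop:  # drop exceeds the threshold: stop
            break
        best_rate = rate                               # last rate that stayed within budget
        rate += step
    return best_rate                                   # used as the layer's pruning rate
```

The number of kernels masked at each step is `round(rate * n)`, i.e. the masking rate times the layer's total kernel count, matching the description below.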
Regarding the device, an embodiment of the present application provides a model architecture adjusting device applied to a convolutional neural network (CNN) model debugging device. The debugging device prestores a ranking acquisition policy, which is used to compute an influence ranking among the convolution kernels of a single convolutional layer of a CNN model. The device includes:
an influence ranking module, configured to obtain, for each convolutional layer of the CNN model to be adjusted, an influence ranking of the convolution kernels in the layer based on the ranking acquisition policy corresponding to that layer;
a masking test module, configured to mask, step by step at a preset masking rate interval, at least one lowest-influence convolution kernel in the influence ranking, and after each masking step to test the layer's target computation accuracy on a validation set using the remaining unmasked kernels, until the drop of the target accuracy relative to the initial accuracy exceeds a preset drop threshold, where the initial accuracy is the layer's accuracy on the validation set with no kernels masked;
and a convolution kernel pruning module, configured to select, as the layer's pruning rate, one masking rate from among all masking rates at which the drop did not exceed the threshold, and to prune away at that rate at least one lowest-influence convolution kernel in the influence ranking matched to the ranking acquisition policy corresponding to the pruning rate.
Compared with the prior art, the model architecture adjusting method and device provided by the embodiments of the application offer the following benefits: by pruning the model architecture of a CNN model while preserving its computation accuracy, the method significantly reduces the model's convolution computation and the storage resources it occupies. First, for each convolutional layer of the CNN model to be adjusted, the method obtains an influence ranking of the convolution kernels in the layer based on the ranking acquisition policy corresponding to that layer. Next, it masks, step by step at a preset masking rate interval, at least one lowest-influence kernel in the ranking, and after each masking step tests the layer's target accuracy on the validation set using the remaining unmasked kernels, until the drop of the target accuracy relative to the initial accuracy exceeds a preset drop threshold; this yields the masking rates of different values, corresponding to the ranking acquisition policy, at which the drop stays within the threshold. Finally, it selects one of those masking rates as the layer's pruning rate and prunes away, at that rate, the lowest-influence kernels in the influence ranking matched to the corresponding ranking acquisition policy. Pruning the lower-influence kernels of each layer without affecting its computation accuracy slims and optimizes the model architecture, markedly reducing the CNN model's convolution computation and storage footprint. The number of kernels masked at each step equals the product of the masking rate interval and the layer's total number of kernels; the initial accuracy is the layer's accuracy on the validation set with no kernels masked; and the number of pruned kernels equals the product of the pruning rate and the layer's total number of kernels.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed for the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be regarded as limiting the scope of the claims; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a schematic block diagram of a CNN model debugging device provided in an embodiment of the present application.
Fig. 2 is a flowchart illustrating a method for adjusting a model architecture according to an embodiment of the present disclosure.
Fig. 3 is a first schematic flowchart of the sub-steps included in step S210 shown in fig. 2.
Fig. 4 is a second schematic flowchart of the sub-steps included in step S210 shown in fig. 2.
Fig. 5 is a third schematic flowchart of the sub-steps included in step S210 shown in fig. 2.
Fig. 6 is a block diagram illustrating a model architecture adjustment apparatus according to an embodiment of the present disclosure.
Reference numerals: 10 - CNN model debugging device; 11 - memory; 12 - processor; 13 - communication unit; 100 - model architecture adjusting apparatus; 110 - influence ranking module; 120 - masking test module; 130 - convolution kernel pruning module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, it is noted that the terms "first", "second", "third", and the like are used merely for distinguishing between descriptions and are not intended to indicate or imply relative importance.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Fig. 1 is a schematic block diagram of a CNN model debugging device 10 according to an embodiment of the present application. In this embodiment, the CNN model debugging device 10 is configured to test the computation accuracy of a CNN model and, without affecting that accuracy, prune the lower-influence convolution kernels from each convolutional layer of the model, thereby slimming and optimizing the model architecture and significantly reducing the model's convolution computation and the storage resources it occupies when participating in convolution computation. The CNN model debugging device 10 includes a model architecture adjusting apparatus 100, a memory 11, a processor 12, and a communication unit 13. The memory 11, the processor 12 and the communication unit 13 are electrically connected to one another, directly or indirectly, to enable data transmission and interaction; for example, these components may be connected via one or more communication buses or signal lines.
In this embodiment, the memory 11 is a non-volatile memory that prestores a ranking acquisition policy. The policy is used to compute an influence ranking among the convolution kernels of a single convolutional layer of the CNN model; the ranking reflects how strongly each participating kernel influences the convolution computation. The influence ranking may be sorted by influence strength in ascending or descending order, and the sorting direction can be configured per requirement: for example, some convolutional layers of the CNN model may be ranked ascending while the others are ranked descending, all layers may be ranked ascending, or all layers may be ranked descending. In one implementation of this embodiment, all convolutional layers of the CNN model are ranked in ascending order of influence strength.
In this embodiment, the CNN model includes a plurality of convolutional layers; each convolutional layer has a plurality of convolution kernels, and each convolution kernel comprises a plurality of convolution parameters (for example, the number of convolution channels, the kernel width, and the kernel height). Each convolutional layer is matched with at least one ranking acquisition policy on the CNN model debugging device 10, and the policy corresponding to one layer may be the same as or different from the policies corresponding to other layers.
For example, suppose the CNN model includes convolutional layers 1 to 4. Layer 1 corresponds to policy A, which ranks influence by the variance of the parameters in each convolution kernel; layer 2 corresponds to policy B, which ranks influence by the L1 norm of the parameters in each kernel; layer 3 corresponds to policy C, which ranks influence by the L2 norm of the parameters in each kernel; and layer 4 again corresponds to policy A. It is to be understood that policies A, B and C are merely three embodiments of the ranking acquisition policy provided by the present application; the policy is not limited to these three and may include further embodiments, such as a policy D that ranks influence by the standard deviation of the parameters in a kernel, or a policy E that ranks influence by the information entropy of the parameters in a kernel.
In this embodiment, the memory 11 may also be used for storing a program, and the processor 12 may execute the program accordingly after receiving the execution instruction.
In this embodiment, the processor 12 may be an integrated circuit chip with signal processing capability. The processor 12 may be a general-purpose processor, such as a central processing unit (CPU), a graphics processing unit (GPU), or a network processor (NP); the general-purpose processor may be a microprocessor or any conventional processor that implements or executes the methods, steps and logical blocks disclosed in the embodiments of the present application.
In this embodiment, the communication unit 13 is configured to establish a network connection between the CNN model debugging device 10 and other electronic devices, and to send and receive data over the network. For example, the CNN model debugging device 10 sends the architecture-adjusted CNN model to other electronic devices through the communication unit 13.
In this embodiment, the model architecture adjusting apparatus 100 includes at least one software functional module that may be stored in the memory 11 as software or firmware, or solidified in the operating system of the CNN model debugging device 10. The processor 12 executes the executable modules stored in the memory 11, such as the software functional modules and computer programs included in the model architecture adjusting apparatus 100. Through the apparatus 100, the CNN model debugging device 10 prunes the lower-influence convolution kernels from each convolutional layer of the CNN model to be adjusted, without affecting the layers' computation accuracy, thereby slimming and optimizing the model architecture and significantly reducing its convolution computation and storage footprint during convolution computation.
It is understood that the block diagram in fig. 1 shows only one structure of the CNN model debugging device 10; the device may include more or fewer components than shown in fig. 1, or have a configuration different from that shown. The components shown in fig. 1 may be implemented in hardware, software, or a combination of both.
Fig. 2 is a schematic flow chart of a model architecture adjustment method according to an embodiment of the present application. In the embodiment of the present application, the model architecture adjusting method is applied to the CNN model debugging device 10, and the specific flow and steps of the model architecture adjusting method shown in fig. 2 are described in detail below.
Step S210: for each convolutional layer of the CNN model to be adjusted, obtain the influence ranking of the convolution kernels in the layer based on the ranking acquisition policy corresponding to that layer.
In this embodiment, before pruning convolution kernels from a convolutional layer of the CNN model to be adjusted, the CNN model debugging device 10 looks up in the memory 11 the ranking acquisition policy matched with that layer, and obtains the influence ranking of the layer's convolution kernels based on the policy found. The influence ranking may be sorted ascending or descending by influence. In one implementation of this embodiment, the influence rankings produced by all ranking acquisition policies prestored on the CNN model debugging device 10 are sorted in ascending order of influence.
Optionally, please refer to fig. 3, the first flowchart of the sub-steps included in step S210 shown in fig. 2. In a first implementation of the embodiment, if exactly one prestored ranking acquisition policy corresponds to a given convolutional layer of the CNN model to be adjusted, and that policy ranks influence by the variance of the convolution parameters in each kernel, then the step of obtaining the influence ranking of the kernels in the layer in step S210 includes sub-steps S211 and S212.
Sub-step S211: for each convolution kernel in the convolutional layer, calculate the variance of the plurality of convolution parameters included in the kernel.
In this embodiment, when the ranking acquisition policy corresponding to a convolutional layer ranks influence by the variance of the convolution parameters in each kernel, the CNN model debugging device 10 computes, for each kernel of the layer, the variance over that kernel's convolution parameters.
Sub-step S212: sort all convolution kernels in the convolutional layer by their corresponding variances to obtain the influence ranking.
In this embodiment, the CNN model debugging device 10 sorts the kernels of the layer by their variances and takes the sorted result as the ranking of the kernels' influence on the convolution computation.
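Sub-steps S211 and S212 amount to scoring each kernel by the variance of its parameters and arg-sorting in ascending order. A minimal sketch, assuming the layer's kernels are stacked into a NumPy array of shape (num_kernels, channels, kh, kw); the function name is illustrative, not from the patent:

```python
import numpy as np

def variance_ranking(kernels):
    """Return kernel indices ordered by ascending parameter variance
    (lowest-influence kernels first, per policy A)."""
    flat = kernels.reshape(len(kernels), -1)   # one row of parameters per kernel
    return np.argsort(flat.var(axis=1))
```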
Optionally, please refer to fig. 4, the second flowchart of the sub-steps included in step S210 shown in fig. 2. In a second implementation of the embodiment, if exactly one prestored ranking acquisition policy corresponds to a given convolutional layer of the CNN model to be adjusted, and that policy ranks influence by the L1 norm of the convolution parameters in each kernel, then the step of obtaining the influence ranking of the kernels in the layer in step S210 includes sub-steps S213 and S214.
Sub-step S213: for each convolution kernel in the convolutional layer, calculate the sum of the absolute values of the convolution parameters included in the kernel, to obtain the kernel's L1 norm.
In this embodiment, when the ranking acquisition policy corresponding to a convolutional layer ranks influence by the L1 norm of the convolution parameters in each kernel, the CNN model debugging device 10 computes, for each kernel of the layer, the L1 norm of that kernel's convolution parameters (i.e. the sum of their absolute values).
Sub-step S214: sort all convolution kernels in the convolutional layer by their corresponding L1 norms to obtain the influence ranking.
In this embodiment, the CNN model debugging device 10 sorts the kernels of the layer by their L1 norms and takes the sorted result as the ranking of the kernels' influence on the convolution computation.
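Sub-steps S213 and S214 follow the same pattern with the L1 norm as the score. A minimal sketch under the same assumed (num_kernels, channels, kh, kw) layout; the function name is illustrative:

```python
import numpy as np

def l1_ranking(kernels):
    """Return kernel indices ordered by ascending L1 norm, i.e. the sum of
    absolute parameter values per kernel (policy B)."""
    flat = kernels.reshape(len(kernels), -1)
    return np.argsort(np.abs(flat).sum(axis=1))
```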
Optionally, please refer to fig. 5, the third flowchart of the sub-steps included in step S210 shown in fig. 2. In a third implementation of the embodiment, if exactly one prestored ranking acquisition policy corresponds to a given convolutional layer of the CNN model to be adjusted, and that policy ranks influence by the L2 norm of the convolution parameters in each kernel, then the step of obtaining the influence ranking of the kernels in the layer in step S210 includes sub-steps S215 and S216.
Sub-step S215: for each convolution kernel in the convolutional layer, calculate the sum of the squares of the convolution parameters included in the kernel and take its square root, to obtain the kernel's L2 norm.
In this embodiment, when the ranking acquisition policy corresponding to a convolutional layer ranks influence by the L2 norm of the convolution parameters in each kernel, the CNN model debugging device 10 computes, for each kernel of the layer, the L2 norm of that kernel's convolution parameters (i.e. the square root of the sum of their squares).
Sub-step S216: sort all convolution kernels in the convolutional layer by their corresponding L2 norms to obtain the influence ranking.
In this embodiment, the CNN model debugging device 10 sorts the kernels of the layer by their L2 norms and takes the sorted result as the ranking of the kernels' influence on the convolution computation.
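Sub-steps S215 and S216 are the same pattern again with the L2 norm. A minimal sketch under the same assumed array layout; the function name is illustrative:

```python
import numpy as np

def l2_ranking(kernels):
    """Return kernel indices ordered by ascending L2 norm, i.e. the square
    root of the sum of squared parameters per kernel (policy C)."""
    flat = kernels.reshape(len(kernels), -1)
    return np.argsort(np.sqrt((flat ** 2).sum(axis=1)))
```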
Optionally, in a fourth implementation manner of the embodiment of the present application, if a certain convolutional layer of the CNN model to be adjusted corresponds to multiple pre-stored ordering acquisition strategies, the step in S210 of obtaining an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategies corresponding to the convolutional layer includes:
sorting all the convolution kernels in the convolutional layer based on each ordering acquisition strategy corresponding to the convolutional layer, so as to obtain, for each ordering acquisition strategy, an influence ordering result between the convolution kernels of the convolutional layer.
For example, when the pre-stored ordering acquisition strategies corresponding to a certain convolutional layer include a strategy A that performs influence ordering based on the variance of the parameters in each convolution kernel, a strategy B based on the L1 norm of those parameters, a strategy C based on their L2 norm, and a strategy D based on their standard deviation, the CNN model debugging device 10 sorts the convolution kernels in the convolutional layer according to strategy A, strategy B, strategy C, and strategy D, respectively, so as to obtain an influence ordering result corresponding to each of the four strategies.
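The four example strategies can be sketched as interchangeable metric functions feeding one generic ranking routine (a minimal sketch; whether the patent intends the population or sample variance is not stated, so the population forms are assumed here, and the kernel values are made up):

```python
import math
import statistics

def variance(params):      # strategy A
    return statistics.pvariance(params)

def l1_norm(params):       # strategy B
    return sum(abs(p) for p in params)

def l2_norm(params):       # strategy C
    return math.sqrt(sum(p * p for p in params))

def std_dev(params):       # strategy D
    return statistics.pstdev(params)

def rank_kernels(kernels, metric):
    """Indices of kernels sorted by ascending metric (lowest influence first)."""
    return sorted(range(len(kernels)), key=lambda i: metric(kernels[i]))

kernels = [[0.5, -0.5], [0.1, 0.1], [2.0, 0.0]]
rankings = {name: rank_kernels(kernels, m)
            for name, m in [("A", variance), ("B", l1_norm),
                            ("C", l2_norm), ("D", std_dev)]}
```

For these toy kernels the four strategies happen to agree on the order; in general each strategy can yield a different influence ordering result, which is why the device keeps one result per strategy.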
In this embodiment of the application, the sub-steps of step S210 shown in fig. 3, fig. 4, and fig. 5 are merely application examples of three different ordering acquisition strategies provided in this application; the ordering acquisition strategies pre-stored in the CNN model debugging device 10 are not limited to these three, and further strategies are possible.
Referring to fig. 2 again, in step S220, at least one convolution kernel with the lowest influence in the influence ordering result is masked according to a preset masking rate interval, and after each masking, the target calculation accuracy of the convolutional layer on the verification set, using the currently remaining unmasked convolution kernels, is tested, until the target drop of the target calculation accuracy relative to the initial calculation accuracy exceeds a preset drop threshold.
In this embodiment, the preset masking rate interval represents the difference between the masking rates used by the CNN model debugging device 10 in two adjacent masking operations on a certain convolutional layer; the intervals used by adjacent operations may be the same or different. For example, the masking rate of a convolutional layer before any kernel has been masked is 0%. If the preset masking rate interval for the first masking is 5%, the masking rate used when the CNN model debugging device 10 first masks the lowest-influence convolution kernels in the influence ordering result of that layer is 5%. If the interval for the second masking is 1%, the device masks further remaining unmasked kernels with the lowest influence according to an interval of 1%, so that the masking rate after the second masking is 6%. If the interval for the third masking is 2%, the device masks further remaining unmasked kernels according to an interval of 2%, so that the masking rate after the third masking is 8%. The number of convolution kernels masked each time equals the product of the preset masking rate interval and the total number of convolution kernels in the convolutional layer.
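The schedule arithmetic above can be written out directly (a small sketch reproducing the 5% / 1% / 2% example for a hypothetical layer of 100 kernels; the rounding rule for fractional kernel counts is an assumption, since the patent does not specify one):

```python
def masking_schedule(total_kernels, intervals):
    """For each masking step, return (cumulative masking rate,
    total kernels masked so far).  Kernels masked = rate * total,
    rounded to the nearest integer (rounding rule assumed)."""
    rate, steps = 0.0, []
    for interval in intervals:
        rate += interval
        steps.append((round(rate, 4), round(rate * total_kernels)))
    return steps

# 100 kernels, intervals 5%, 1%, 2% → cumulative rates 5%, 6%, 8%
print(masking_schedule(100, [0.05, 0.01, 0.02]))
# → [(0.05, 5), (0.06, 6), (0.08, 8)]
```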
In this embodiment, each time the CNN model debugging device 10 masks, according to the preset masking rate interval, at least one of the currently remaining unmasked convolution kernels with the lowest influence in the matching influence ordering result of a certain convolutional layer, it also masks the convolution channels corresponding to the masked kernels in the CNN model to be adjusted, and then performs convolution calculation on the verification set with the layer's remaining unmasked kernels, so as to test the target calculation accuracy of the convolutional layer on the verification set using those kernels. After each masking, the device compares the measured target calculation accuracy with the initial calculation accuracy, and stops step S220 once the drop of the target calculation accuracy relative to the initial calculation accuracy exceeds the preset drop threshold corresponding to the convolutional layer.
The verification set is used to test the calculation accuracy of the convolutional layer before and after masking convolution kernels. The target calculation accuracy represents the degree of difference between the data obtained by convolving the verification set with the currently remaining unmasked kernels and the data obtained by convolving the verification set with the layer in its fully unmasked state; the initial calculation accuracy is the calculation accuracy of the convolutional layer on the verification set when no kernel is masked. In this embodiment, the initial calculation accuracy is generally set to 1, and the target calculation accuracy gradually falls below it after each masking. For example, if the target calculation accuracy of a convolutional layer after the first masking is 0.96, the drop relative to the initial calculation accuracy at the first masking is 0.04; if the target calculation accuracy after the second masking is 0.91, the drop at the second masking is 0.09. The preset drop threshold may be, for example, 0.1, 0.05, or 0.0002; the threshold of each convolutional layer may be the same as or different from those of the other layers, and its specific value may be configured per layer as required.
Step S230: one masking rate is selected, from all the masking rates at which the target drop did not exceed the preset drop threshold, as the clipping rate of the convolutional layer, and at least one convolution kernel with the lowest influence in the influence ordering result matching the ordering acquisition strategy corresponding to that clipping rate is clipped according to the clipping rate.
In this embodiment, if a certain convolutional layer corresponds to exactly one ordering acquisition strategy, the CNN model debugging device 10, after obtaining based on that strategy every masking rate at which the layer's target drop did not exceed the preset drop threshold, selects from those masking rates one with a suitable value as the clipping rate of the convolutional layer. For example, the device may pick one of those masking rates at random, or may pick the largest one. In an implementation manner of this embodiment, the device selects the largest such masking rate as the clipping rate, so that as many low-influence kernels as possible are clipped without affecting the calculation accuracy of the convolutional layer, thereby further reducing and optimizing the model architecture of the CNN model and significantly reducing its convolution calculation amount and storage occupation.
In this embodiment, if a certain convolutional layer corresponds to multiple ordering acquisition strategies, the CNN model debugging device 10, after obtaining for each pre-stored strategy every masking rate at which the layer's target drop did not exceed the preset drop threshold, selects from all those masking rates one with a suitable value as the clipping rate of the convolutional layer. For example, when the pre-stored strategies corresponding to a certain convolutional layer include a strategy A that performs influence ordering based on the variance of the parameters in each convolution kernel, a strategy B based on their L1 norm, and a strategy C based on their L2 norm, and the device has obtained by testing, for each of strategies A, B, and C, all the masking rates at which the target drop did not exceed the preset drop threshold, the device may randomly select one of the qualifying masking rates of strategy A as the clipping rate; or randomly select one of the qualifying masking rates of strategy C; or arbitrarily select a qualifying masking rate from any one of strategies A, B, and C; or select, among all the qualifying masking rates of strategies A, B, and C, the one with the largest value as the clipping rate of the convolutional layer.
In an implementation manner of this embodiment, the CNN model debugging device 10 selects, from all the masking rates at which the target drop did not exceed the preset drop threshold under each ordering acquisition strategy corresponding to the convolutional layer, the one with the largest value as the clipping rate, so that as many low-influence kernels as possible are clipped without affecting the calculation accuracy of the convolutional layer, thereby further reducing and optimizing the model architecture of the CNN model and significantly reducing its convolution calculation amount and storage occupation.
In this case, the step of selecting one masking rate, from all the masking rates at which the target drop did not exceed the preset drop threshold, as the clipping rate of the convolutional layer includes:
screening out, for each ordering acquisition strategy of the convolutional layer, the largest masking rate at which the target drop did not exceed the preset drop threshold;
and selecting, among the largest masking rates so obtained for the strategies, the one with the largest value as the clipping rate of the convolutional layer.
For example, when the pre-stored ordering acquisition strategies corresponding to a certain convolutional layer include a strategy A that performs influence ordering based on the variance of the parameters in each convolution kernel, a strategy B based on their L1 norm, and a strategy C based on their L2 norm, and the CNN model debugging device 10 has obtained by testing all the masking rates at which the target drop for each of strategies A, B, and C did not exceed the preset drop threshold, the device selects, among the largest qualifying masking rates of strategies A, B, and C, the one with the largest value as the clipping rate of the convolutional layer.
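This max-of-maxima selection can be sketched directly (a minimal illustration; the per-strategy rate lists are made-up examples of rates that survived the drop-threshold test):

```python
def select_clipping_rate(valid_rates_by_strategy):
    """valid_rates_by_strategy: {strategy: [masking rates whose accuracy
    drop stayed within the threshold]}.  Returns (strategy, rate) for the
    largest surviving rate across all strategies."""
    best = {s: max(rates)
            for s, rates in valid_rates_by_strategy.items() if rates}
    strategy = max(best, key=best.get)   # strategy owning the overall maximum
    return strategy, best[strategy]

valid = {"A": [0.05, 0.06], "B": [0.05], "C": [0.05, 0.06, 0.08]}
print(select_clipping_rate(valid))  # → ('C', 0.08)
```

Returning the winning strategy alongside the rate matters, because the clipping step must use the influence ordering result produced by that same strategy.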
In this embodiment, after obtaining the clipping rate of the convolutional layer, the CNN model debugging device 10 clips, according to the clipping rate, at least one convolution kernel with the lowest influence in the influence ordering result matching the ordering acquisition strategy corresponding to that clipping rate, and simultaneously clips the convolution channels of the clipped kernels. In this way, the low-influence kernels of each convolutional layer in the CNN model to be adjusted are removed without affecting the layer's calculation accuracy, which reduces and optimizes the model architecture of the model and significantly reduces its convolution calculation amount and storage occupation during convolution calculation. The number of clipped kernels of the convolutional layer equals the product of the clipping rate and the total number of convolution kernels in the layer.
Fig. 6 is a block diagram of a model architecture adjustment apparatus 100 according to an embodiment of the present disclosure. In the embodiment of the present application, the model architecture adjustment apparatus 100 includes an influence ordering module 110, a masking test module 120, and a convolution kernel clipping module 130.
The influence ordering module 110 is configured to, for each convolutional layer of the CNN model to be adjusted, obtain an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer.
Optionally, if a certain convolutional layer of the CNN model to be adjusted corresponds to exactly one pre-stored ordering acquisition strategy, and that strategy performs influence ordering based on the variance of the convolution parameters in each convolution kernel, the influence ordering module 110 may perform sub-steps S211 and S212 in fig. 3.
Optionally, if that single strategy performs influence ordering based on the L1 norm of the convolution parameters in each kernel, the influence ordering module 110 may perform sub-steps S213 and S214 in fig. 4.
Optionally, if that single strategy performs influence ordering based on the L2 norm of the convolution parameters in each kernel, the influence ordering module 110 may perform sub-steps S215 and S216 in fig. 5.
Optionally, if a certain convolutional layer of the CNN model to be adjusted corresponds to multiple pre-stored ordering acquisition strategies, the influence ordering module 110 is specifically configured to sort all the convolution kernels in the convolutional layer based on each ordering acquisition strategy corresponding to the layer, so as to obtain an influence ordering result of each strategy between the kernels of the layer.
The masking test module 120 is configured to mask, according to a preset masking rate interval, at least one convolution kernel with the lowest influence in the influence ordering result, and to test after each masking the target calculation accuracy of the convolutional layer on the verification set using the currently remaining unmasked kernels, until the target drop of the target calculation accuracy relative to the initial calculation accuracy exceeds a preset drop threshold.
In this embodiment, the number of convolution kernels masked each time equals the product of the preset masking rate interval and the total number of convolution kernels in the convolutional layer, and the initial calculation accuracy is the calculation accuracy on the verification set before the layer has masked any kernel. The masking test module 120 may perform step S220 shown in fig. 2; for the specific implementation process, refer to the detailed description of step S220 above.
The convolution kernel clipping module 130 is configured to select one masking rate, from all the masking rates at which the target drop did not exceed the preset drop threshold, as the clipping rate of the convolutional layer, and to clip according to the clipping rate at least one convolution kernel with the lowest influence in the influence ordering result of the layer matching the ordering acquisition strategy corresponding to that clipping rate.
In this embodiment, if a certain convolutional layer corresponds to exactly one ordering acquisition strategy, the convolution kernel clipping module 130, after obtaining based on that strategy every masking rate at which the layer's target drop did not exceed the preset drop threshold, selects from those masking rates one with a suitable value as the clipping rate. In an implementation manner of this embodiment, the module selects the largest such masking rate as the clipping rate of the convolutional layer.
In this embodiment, if a certain convolutional layer corresponds to multiple ordering acquisition strategies, the convolution kernel clipping module 130, after obtaining for each pre-stored strategy every masking rate at which the layer's target drop did not exceed the preset drop threshold, selects from all those masking rates one with a suitable value as the clipping rate. In an implementation manner of this embodiment, the module selects, among the largest qualifying masking rates of the strategies, the one with the largest value as the clipping rate of the convolutional layer.
In this embodiment, after obtaining the clipping rate of the convolutional layer, the convolution kernel clipping module 130 clips, according to the clipping rate, at least one convolution kernel with the lowest influence in the influence ordering result matching the ordering acquisition strategy corresponding to that clipping rate, and simultaneously clips the convolution channels of the clipped kernels, so that the low-influence kernels of each layer in the CNN model to be adjusted are removed without affecting the layer's calculation accuracy, thereby reducing and optimizing the model architecture of the model and significantly reducing its convolution calculation amount and storage occupation during convolution calculation. The number of clipped kernels of the convolutional layer equals the product of the clipping rate and the total number of convolution kernels in the layer.
In summary, the model architecture adjustment method and apparatus provided in the embodiments of the present application significantly reduce the convolution calculation amount and storage occupation of a CNN model by clipping its model architecture while preserving its calculation accuracy. First, for each convolutional layer of the CNN model to be adjusted, the method obtains an influence ordering result between the convolution kernels in the layer based on the ordering acquisition strategy corresponding to the layer. Then, the method masks, according to a preset masking rate interval, at least one kernel with the lowest influence in the influence ordering result, and tests after each masking the target calculation accuracy of the layer on the verification set using the currently remaining unmasked kernels, until the target drop of the target calculation accuracy relative to the initial calculation accuracy exceeds a preset drop threshold, thereby obtaining the various masking rates of the ordering acquisition strategy at which the target drop did not exceed the threshold. Finally, the method selects one of those masking rates as the clipping rate of the layer and clips, according to the clipping rate, at least one kernel with the lowest influence in the influence ordering result matching the ordering acquisition strategy corresponding to that clipping rate, removing the low-influence kernels without affecting the calculation accuracy of each convolutional layer, so that the model architecture of the CNN model is reduced and optimized and its convolution calculation amount and storage occupation are significantly reduced. The number of kernels masked each time equals the product of the preset masking rate interval and the total number of kernels in the layer; the initial calculation accuracy is the calculation accuracy on the verification set before the layer has masked any kernel; and the number of clipped kernels equals the product of the clipping rate and the total number of kernels in the layer.
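The per-layer pipeline summarized above (rank, mask progressively, stop at the drop threshold, clip at the last acceptable rate) can be sketched end to end. This is a hypothetical illustration, not the patent's implementation: `evaluate` is an assumed callback returning verification-set accuracy given a set of masked kernel indices, `rank` is any of the ordering strategies, and nearest-integer rounding of kernel counts is assumed:

```python
def prune_layer(kernels, evaluate, intervals, drop_threshold, rank):
    """Mask progressively larger fractions of the lowest-influence kernels,
    record the last masking rate whose accuracy drop stayed within the
    threshold, then clip the layer at that rate."""
    order = rank(kernels)              # indices, lowest influence first
    initial = evaluate(set())          # accuracy with nothing masked
    rate, clip_rate = 0.0, 0.0
    for interval in intervals:
        rate += interval
        masked = set(order[:round(rate * len(kernels))])
        if initial - evaluate(masked) > drop_threshold:
            break                      # drop exceeded the threshold: stop
        clip_rate = rate               # this masking rate is still acceptable
    keep = set(order[round(clip_rate * len(kernels)):])
    return [k for i, k in enumerate(kernels) if i in keep]

# Toy run: 10 kernels, accuracy falls by 0.02 per masked kernel,
# 10% intervals, 0.05 drop threshold → clipping rate settles at 20%.
kernels = list(range(10))
toy_eval = lambda masked: 1.0 - 0.02 * len(masked)
identity_rank = lambda ks: list(range(len(ks)))
print(len(prune_layer(kernels, toy_eval, [0.1] * 5, 0.05, identity_rank)))  # → 8
```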
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A model architecture adjustment method, applied to a convolutional neural network (CNN) model debugging device, wherein ordering acquisition strategies are pre-stored in the debugging device and are used to calculate influence ordering results between convolution kernels in a same convolutional layer of the CNN model, the method comprising:
for each convolutional layer of the CNN model to be adjusted, obtaining an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer;
masking, according to a preset masking rate interval, at least one convolution kernel with the lowest influence in the influence ordering result, and testing after each masking the target calculation accuracy of the convolutional layer on a verification set using the currently remaining unmasked convolution kernels, until the target drop of the target calculation accuracy relative to an initial calculation accuracy exceeds a preset drop threshold, wherein the initial calculation accuracy is the calculation accuracy of the convolutional layer on the verification set when no convolution kernel is masked;
and selecting one masking rate, from all the masking rates at which the target drop did not exceed the preset drop threshold, as the clipping rate of the convolutional layer, and clipping according to the clipping rate at least one convolution kernel with the lowest influence in the influence ordering result of the convolutional layer matching the ordering acquisition strategy corresponding to the clipping rate.
2. The method of claim 1, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the variance of the convolution parameters in each convolution kernel, the step of obtaining an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer comprises:
for each convolution kernel in the convolutional layer, calculating the variance of the plurality of convolution parameters included in the convolution kernel;
and sorting all the convolution kernels in the convolutional layer according to the variance corresponding to each convolution kernel to obtain the influence ordering result.
3. The method of claim 1, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the L1 norm of the convolution parameters in each convolution kernel, the step of obtaining an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer comprises:
for each convolution kernel in the convolutional layer, calculating the sum of absolute values of the plurality of convolution parameters included in the convolution kernel to obtain the L1 norm corresponding to the convolution kernel;
and sorting all the convolution kernels in the convolutional layer according to the L1 norm corresponding to each convolution kernel to obtain the influence ordering result.
4. The method of claim 1, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the L2 norm of the convolution parameters in each convolution kernel, the step of obtaining the influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer comprises:
for each convolution kernel in the convolutional layer, calculating the sum of squares of the convolution parameters included in the convolution kernel, and taking the square root of that sum to obtain the L2 norm corresponding to the convolution kernel;
and sorting all the convolution kernels in the convolutional layer according to the L2 norm corresponding to each convolution kernel to obtain the influence ordering result.
5. The method of claim 1, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is plural, the step of obtaining an influence ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategies corresponding to the convolutional layer comprises:
sorting all the convolution kernels in the convolutional layer based on each ordering acquisition strategy corresponding to the convolutional layer to obtain the influence ordering result of each ordering acquisition strategy between the convolution kernels of the convolutional layer.
6. A model architecture adjusting device, applied to convolutional neural network (CNN) model debugging equipment, wherein the debugging equipment prestores ordering acquisition strategies, each ordering acquisition strategy being used for calculating an influence degree ordering result between convolution kernels in a same convolutional layer included in the CNN model, the device comprising:
an influence degree ordering module, configured to obtain, for each convolutional layer of the CNN model to be adjusted, an influence degree ordering result between the convolution kernels in the convolutional layer based on the ordering acquisition strategy corresponding to the convolutional layer;
a masking test module, configured to sequentially mask, at a preset masking rate interval, at least one convolution kernel with the lowest influence degree in the influence degree ordering result, and after each masking, test the target calculation accuracy of the convolutional layer on a verification set using the currently remaining unmasked convolution kernels, until the target drop of the target calculation accuracy relative to the initial calculation accuracy exceeds a preset drop threshold, wherein the initial calculation accuracy is the calculation accuracy on the verification set when no convolution kernel of the convolutional layer is masked;
and a convolution kernel pruning module, configured to select, from all masking rates for which the target drop does not exceed the preset drop threshold, one masking rate as the pruning rate of the convolutional layer, and to prune, according to the pruning rate, at least one convolution kernel with the lowest influence degree in the influence degree ordering result matched with the ordering acquisition strategy corresponding to the pruning rate of the convolutional layer.
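The masking test and pruning-rate selection performed by these two modules can be sketched as follows (a simplified illustration under my own naming; `eval_accuracy` stands in for a real validation-set evaluation):

```python
def choose_pruning_rate(order, eval_accuracy, rate_step=0.1, max_drop=0.02):
    """Mask the lowest-influence kernels at increasing masking rates and
    return the largest rate whose accuracy drop on the validation set
    stays within max_drop.

    order: kernel indices, lowest influence first.
    eval_accuracy: callable(masked_indices) -> accuracy on the validation set.
    """
    baseline = eval_accuracy(set())          # initial accuracy, nothing masked
    best_rate = 0.0
    rate = rate_step
    while rate < 1.0:
        masked = set(order[:int(len(order) * rate)])
        if baseline - eval_accuracy(masked) > max_drop:
            break                            # drop exceeds the threshold: stop
        best_rate = rate                     # still within the threshold
        rate += rate_step
    return best_rate

# Toy evaluation: each masked kernel costs 0.6% accuracy
def eval_accuracy(masked):
    return 0.90 - 0.006 * len(masked)

rate = choose_pruning_rate(list(range(10)), eval_accuracy)
pruned = list(range(10))[:int(10 * rate)]   # kernels that would be cut
```

Masking (zeroing the kernel's output) is cheap and reversible, so the loop can probe many rates before the irreversible pruning step is committed at the chosen rate.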
7. The apparatus of claim 6, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the variance of the convolution parameters in each convolution kernel, the influence degree ordering module is specifically configured to:
for each convolution kernel in the convolutional layer, calculate the variance of the plurality of convolution parameters included in the convolution kernel;
and order all the convolution kernels in the convolutional layer according to the variance corresponding to each convolution kernel in the convolutional layer to obtain the influence degree ordering result.
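The variance-based strategy of this claim can be sketched as follows (a hedged illustration; names are mine, not from the patent):

```python
def rank_kernels_by_variance(kernels):
    """Order convolution kernels by the variance of their parameters,
    lowest influence (smallest variance) first.

    kernels: list of flat parameter lists, one list per kernel.
    """
    def variance(params):
        mean = sum(params) / len(params)
        return sum((p - mean) ** 2 for p in params) / len(params)

    scores = [variance(k) for k in kernels]
    return sorted(range(len(kernels)), key=lambda i: scores[i])

kernels = [[0.0, 2.0, 0.0, 2.0],    # variance 1.0
           [1.0, 1.0, 1.0, 1.0],    # variance 0.0 (least influential)
           [-3.0, 3.0, -3.0, 3.0]]  # variance 9.0
print(rank_kernels_by_variance(kernels))  # → [1, 0, 2]
```

Unlike the norm-based strategies, variance flags near-constant kernels: a kernel whose weights are all equal acts like a blur/offset with little discriminative power, regardless of its magnitude.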
8. The apparatus of claim 6, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the L1 norm of the convolution parameters in each convolution kernel, the influence degree ordering module is specifically configured to:
for each convolution kernel in the convolutional layer, calculate the sum of the absolute values of the plurality of convolution parameters included in the convolution kernel to obtain the L1 norm corresponding to the convolution kernel;
and order all the convolution kernels in the convolutional layer according to the L1 norm corresponding to each convolution kernel in the convolutional layer to obtain the influence degree ordering result.
9. The apparatus of claim 6, wherein if the number of the ordering acquisition strategies corresponding to the convolutional layer is one, and the ordering acquisition strategy performs influence ordering based on the L2 norm of the convolution parameters in each convolution kernel, the influence degree ordering module is specifically configured to:
for each convolution kernel in the convolutional layer, calculate the sum of squares of the convolution parameters included in the convolution kernel, and take the square root of that sum to obtain the L2 norm corresponding to the convolution kernel;
and order all the convolution kernels in the convolutional layer according to the L2 norm corresponding to each convolution kernel in the convolutional layer to obtain the influence degree ordering result.
10. The apparatus of claim 6, wherein if there are multiple ordering acquisition strategies corresponding to the convolutional layer, the influence degree ordering module is specifically configured to:
order all the convolution kernels in the convolutional layer based on each ordering acquisition strategy corresponding to the convolutional layer to obtain, for each ordering acquisition strategy, the influence degree ordering result between the convolution kernels of the convolutional layer.
CN201811524843.6A 2018-12-13 2018-12-13 Model architecture adjusting method and device Pending CN111325309A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811524843.6A CN111325309A (en) 2018-12-13 2018-12-13 Model architecture adjusting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811524843.6A CN111325309A (en) 2018-12-13 2018-12-13 Model architecture adjusting method and device

Publications (1)

Publication Number Publication Date
CN111325309A true CN111325309A (en) 2020-06-23

Family

ID=71162847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811524843.6A Pending CN111325309A (en) 2018-12-13 2018-12-13 Model architecture adjusting method and device

Country Status (1)

Country Link
CN (1) CN111325309A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114881992A (en) * 2022-05-24 2022-08-09 北京安德医智科技有限公司 Skull fracture detection method and device and storage medium


Similar Documents

Publication Publication Date Title
CN106294120B (en) Method, apparatus and computer program product for testing code
CN111045933B (en) Regression strategy updating method and device, storage medium and terminal equipment
US11095728B2 (en) Techniques for automatically interpreting metric values to evaluate the health of a computer-based service
CN111475355B (en) High-speed link signal integrity evaluation method, system, terminal and storage medium
Pauli et al. A fault tolerant implementation of multi-level Monte Carlo methods
CN110210006A (en) A kind of data screening method and data screening device
CN109426655B (en) Data analysis method and device, electronic equipment and computer readable storage medium
CN104679634A (en) Self-adaptive random verification method for super-large-scale chip simulation
CN115145812B (en) Test case generation method and device, electronic equipment and storage medium
CN110995539A (en) Business system monitoring method and device and computer equipment
CN110362563A (en) The processing method and processing device of tables of data, storage medium, electronic device
CN114676665B (en) Time sequence adjusting method, device, equipment and storage medium
CN109697090A (en) A kind of method, terminal device and the storage medium of controlling terminal equipment
CN114676040A (en) Test coverage verification method and device and storage medium
CN110990295B (en) Verification method and device for test cases and electronic equipment
CN111325309A (en) Model architecture adjusting method and device
CN112333246B (en) ABtest experiment method and device, intelligent terminal and storage medium
CN116664335B (en) Intelligent monitoring-based operation analysis method and system for semiconductor production system
CN109726826B (en) Training method and device for random forest, storage medium and electronic equipment
CN115481594B (en) Scoreboard implementation method, scoreboard, electronic equipment and storage medium
CN110532267A (en) Determination method, apparatus, storage medium and the electronic device of field
CN107122205B (en) Installation time determining method and device for installation program
CN114048010A (en) Method, device, equipment and storage medium for controlling service timeout time
US20160063149A1 (en) Design tool apparatus, method and computer program for designing an integrated circuit
CN113343064B (en) Data processing method, apparatus, device, storage medium, and computer program product

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination