CN114943337A - Model pruning method and device and computer equipment

Model pruning method and device and computer equipment

Info

Publication number
CN114943337A
CN114943337A
Authority
CN
China
Prior art keywords
model
similarity
channels
target channels
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210679726.7A
Other languages
Chinese (zh)
Inventor
周宏扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202210679726.7A priority Critical patent/CN114943337A/en
Publication of CN114943337A publication Critical patent/CN114943337A/en
Pending legal-status Critical Current

Classifications

    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06F 17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06N 3/045 Combinations of networks


Abstract

According to the model pruning method, model pruning device, and computer equipment provided by the embodiments of the invention, for every two target channels in a first model, several single-dimensional similarities between the two channels are calculated using multiple similarity calculation modes; a unified similarity between the two target channels is then calculated from those single-dimensional similarities; and finally the target channels in the first model are channel pruned based on the unified similarity. By fusing multiple similarity calculation modes, channel pruning combines the characteristics of all the modes, and because pruning is based on the unified similarity, the similarity between channels incorporates the characteristics of each algorithm application scenario. The model pruning is therefore suitable for multiple application scenarios, and its generalization is enhanced.

Description

Model pruning method and device and computer equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a model pruning method and device and computer equipment.
Background
In the many scenarios and modes of artificial-intelligence algorithms, for algorithm directions such as face recognition, vehicle recognition, and character recognition, and for the mode of deploying applications on embedded terminals, an existing large model must be pruned to reduce its parameters without significantly lowering algorithm precision, thereby improving the algorithm's running efficiency on terminal devices.
Existing model channel-pruning techniques can use a well-suited pruning mode when pruning channels in one specific algorithm scenario, where the pruning effect is good. However, the same pruning mode gives an unsatisfactory pruning effect in other algorithm scenarios; no single mode achieves a good pruning effect in every algorithm scenario, so the generalization is poor.
Therefore, how to improve the generalization of model pruning has become an urgent technical problem to be solved.
Disclosure of Invention
The embodiment of the invention aims to provide a model pruning method, a model pruning device and computer equipment, so as to improve the generalization of model pruning. The specific technical scheme is as follows:
according to a first aspect of the embodiments of the present invention, there is provided a model pruning method applied to target identification, the method including:
aiming at every two target channels in a first model, respectively calculating according to multiple similarity calculation modes to obtain multiple single-dimensional similarities between the two channels, wherein the target channels are used for extracting data features of data input into the first model;
for every two target channels in the first model, calculating unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, wherein the unified similarity is positively correlated with each single-dimensional similarity between the two channels;
and according to the unified similarity among the target channels, channel pruning the target channels in the first model to obtain a second model, wherein the unified similarity between any two channel-pruned target channels is greater than a preset similarity threshold.
As an embodiment, the plurality of similarity calculation methods includes at least two of the following three similarity calculation methods:
the method comprises the following steps of calculating the similarity according to the Euclidean distance, calculating the similarity according to the cosine distance, and calculating the similarity according to the Mahalanobis distance.
As an embodiment, said calculating, for each two target channels in the first model, a unified similarity of the two target channels based on each of the single-dimensional similarities between the two target channels includes:
aiming at every two target channels in the first model, carrying out standard normalization on each single-dimensional similarity between the two channel vectors to obtain a plurality of normalized similarities between the two target channels;
for each two target channels in the first model, calculating a unified similarity between the two channels based on each normalized similarity between the two target channels, wherein the unified similarity is positively correlated with each normalized similarity between the two channels.
As an embodiment, the method further comprises:
judging whether the current channel number in the first model is lower than a preset channel number threshold value or not;
and if not, taking the second model as a new first model, returning to execute the step of calculating and obtaining a plurality of single-dimensional similarities between two channels according to a plurality of similarity calculation methods aiming at every two target channels in the first model.
As an embodiment, the method further comprises:
judging whether the model training times of the first model are larger than the preset model training times or not;
if not, training the first model to obtain a third model; and taking the third model as a new first model, and executing the step of calculating and obtaining a plurality of single-dimensional similarities between two target channels in the first model according to a plurality of similarity calculation modes aiming at each two target channels in the first model.
According to a second aspect of the embodiments of the present invention, there is provided a model pruning apparatus, wherein the apparatus is applied to target recognition, and the apparatus includes:
the first calculation unit is used for calculating a plurality of single-dimensional similarities between two target channels in a first model according to a plurality of similarity calculation methods, wherein the target channels are used for extracting data features of data input into the first model;
a second calculating unit, configured to calculate, for each two target channels in the first model, a unified similarity between the two target channels based on each of the single-dimensional similarities between the two target channels, where the unified similarity is positively correlated with each of the single-dimensional similarities between the two target channels;
and the channel pruning unit is used for channel pruning the target channels in the first model according to the unified similarity between the target channels to obtain a second model, wherein the unified similarity between any two target channels pruned by the channel is greater than a preset similarity threshold.
As an embodiment, the plurality of similarity calculation methods includes at least two of the following three similarity calculation methods:
the method comprises the following steps of calculating the similarity according to the Euclidean distance, calculating the similarity according to the cosine distance, and calculating the similarity according to the Mahalanobis distance.
As an implementation manner, the second calculating unit is specifically configured to, for each two target channels in the first model, perform standard normalization on each single-dimensional similarity between the two channel vectors to obtain multiple normalized similarities between the two target channels;
for each two target channels in the first model, calculating a unified similarity between the two channels based on each normalized similarity between the two target channels, wherein the unified similarity is positively correlated with each normalized similarity between the two channels.
As an implementation manner, the apparatus further includes a first determining unit, configured to determine whether a current channel number in the first model is lower than a preset channel number threshold;
and if not, taking the second model as a new first model, returning to execute the step of calculating and obtaining a plurality of single-dimensional similarities between two channels according to a plurality of similarity calculation methods aiming at every two target channels in the first model.
As an embodiment, the apparatus further includes a second determining unit, configured to determine whether the number of times of model training of the first model is greater than a preset number of times of model training;
if not, training the first model to obtain a third model; and taking the third model as a new first model, and executing the step of calculating and obtaining a plurality of single-dimensional similarities between two target channels in the first model according to a plurality of similarity calculation modes aiming at each two target channels in the first model.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor configured to implement the method steps of any one of the first aspect when executing a program stored in the memory.
According to a fourth aspect of embodiments herein, there is provided a model pruning device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: the method steps according to any of the above first aspects are carried out.
According to a fifth aspect of embodiments herein, there is provided a computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when executed by a processor, the computer program implements the method steps of any one of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a model pruning method provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a process for calculating a unified similarity between two target channels according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of channel number detection of a model according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart of detecting the number of times of model training of a model according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of another model pruning method provided in the embodiments of the present application;
fig. 6 is a schematic structural diagram of a model pruning device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
To explain the model pruning method provided in the embodiments of the present application more clearly, a possible application scenario is described below by way of example. It should be understood that this is only one possible application scenario of the model pruning method; in other possible embodiments, the method may also be applied to other application scenarios, and the following example imposes no limitation.
The implementation of an artificial-intelligence algorithm depends on an algorithm model (hereinafter referred to simply as a model), and in numerous scenarios, such as face recognition, vehicle recognition, and character recognition, it is limited by the performance of the terminal device running the model. Model pruning therefore needs to be performed on the model to reduce its parameters without significantly lowering algorithm precision, improving the algorithm's running efficiency on the terminal device.
In the related art, model pruning is often performed according to one specific application scenario, so the model obtained through pruning has low algorithm precision in other application scenarios. For example, suppose a convolutional neural network model is pruned according to the face-recognition application scenario: the new convolutional neural network obtained through pruning retains good algorithm precision in face recognition, but has lower precision in application scenarios such as vehicle recognition and character recognition. The model pruning methods in the related art therefore have poor applicability.
Based on this, the present application provides a model pruning method, as shown in fig. 1, which is applied to target recognition, where an application scenario corresponding to the target recognition may be any one of the above application scenarios of vehicle recognition, character recognition, and face recognition, and the method includes:
s101, aiming at every two target channels in the first model, respectively calculating according to multiple similarity calculation modes to obtain multiple single-dimensional similarities between the two channels, wherein the target channels are used for extracting data features of data input into the first model.
S102, aiming at every two target channels in the first model, calculating unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, wherein the unified similarity is positively correlated with each single-dimensional similarity between the two channels.
S103, channel pruning is performed on the target channels in the first model according to the unified similarity between the target channels, to obtain a second model, wherein the unified similarity between any two channel-pruned target channels is greater than a preset similarity threshold.
In the embodiment of the application, for every two target channels in the first model, several single-dimensional similarities between the two channels are calculated using multiple similarity calculation modes, the unified similarity between the two target channels is calculated from those single-dimensional similarities, and the target channels in the first model are then channel pruned based on the unified similarity. By fusing multiple similarity calculation modes, channel pruning combines the characteristics of all the modes, and because pruning is based on the unified similarity, the similarity between channels incorporates the characteristics of each algorithm application scenario; the model pruning is therefore suitable for multiple application scenarios, and its generalization is enhanced.
The foregoing steps S101-S103 will be described in detail below:
in S101, in the first model, each convolutional layer has a plurality of channels, and the target channel in the present application may be all channels in a specific convolutional layer or may be a part of channels in a specific convolutional layer. For example, there are 3 convolutional layers in the first model: the convolutional layers 1, 2, and 3, if model pruning needs to be performed on the convolutional layer 2 of the first model, model pruning may be performed based on all channels in the convolutional layer 2, or model pruning may be performed on some channels in the convolutional layer 2. The first model in the embodiment of the present application may be an original model without a pruning process, or may be a model after pruning, and the present application is not limited at all.
For the target channels to be pruned in the first model, the single-dimensional similarity of every two target channels is calculated. The embodiment of the present application does not limit the calculation mode used for the single-dimensional similarity, which is obtained from the channel vectors of the two target channels. Similarity in this application refers to the degree of similarity: the greater the similarity value, the more similar the channels.
Taking the aforementioned exemplary first model as an example, pruning may be performed on convolutional layer 2 of the first model to obtain channel vector 1, channel vector 2, channel vector 3, and channel vector 4 of channel 1, channel 2, channel 3, and channel 4. Taking the calculation of the similarity of the channels 1 and 2 as an example, the similarity between the channel vector 1 and the channel vector 2 is calculated according to three similarity calculation methods, the similarity d1 is calculated according to the first similarity calculation method, the similarity d2 is calculated according to the second similarity calculation method, and the similarity d3 is calculated according to the third similarity calculation method.
In S102, several single-dimensional similarities exist between every two target channels: however many calculation modes were used in S101, that many single-dimensional similarities are obtained for each channel pair. For example, if three similarity calculation modes are used for channel 1 and channel 2 in convolutional layer 2 of the first model, the first mode yields the single-dimensional similarity d1, the second mode yields d2, and the third mode yields d3. The unified similarity D_total1 between channel 1 and channel 2 is then calculated from d1, d2, and d3. The unified similarity of every other channel pair is calculated in the same way as for channels 1 and 2, and the details are not repeated in this embodiment. After the unified similarity between every two target channels of the model is obtained, the unified similarities and the similarity threshold can be arranged in descending order, so that channel pruning can subsequently proceed directly in that order, improving model pruning efficiency.
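The patent does not fix an exact fusion formula; it only requires that the unified similarity be positively correlated with each single-dimensional similarity. As an illustrative sketch (the function name and the input values are invented), a plain mean of the normalized similarities is one minimal combination satisfying that constraint:

```python
def unified_similarity(sims):
    # Any combination that increases with each input satisfies the
    # positive-correlation requirement; the mean is the simplest such choice.
    return sum(sims) / len(sims)

# Normalized similarities D1, D2, D3 between channels 1 and 2 (invented values)
D_total_12 = unified_similarity([0.9, 1.1, 0.7])  # approximately 0.9
```

A weighted sum with positive weights would satisfy the positive-correlation requirement equally well; the choice of weights is not specified by the text.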
In S103, a similarity threshold is preset. When the unified similarity between two channels in the first model is greater than this threshold, channel pruning is performed on the first model, and pruning those channels has little influence on the performance of the first model. The similarity threshold is set empirically by a technician and is not limited in the present application.
Still taking the foregoing exemplary application scenario as an example, it is determined whether the unified similarity between the channels in convolutional layer 2 of the first model is below the similarity threshold; if not, the channels whose similarity exceeds the threshold are channel pruned. For example, let the unified similarity between channels 1 and 2 be D_total1; between channels 1 and 3, D_total2; between channels 1 and 4, D_total3; between channels 2 and 3, D_total4; between channels 2 and 4, D_total5; between channels 3 and 4, D_total6; and let the similarity threshold be α. Channel pruning is performed on the channels whose unified similarity is greater than the threshold. Sorting the calculated unified similarities together with the threshold in descending order gives: D_total1 > D_total4 > D_total2 > α > D_total3 > D_total5 > D_total6. From this ordering, the unified similarity D_total1 between channels 1 and 2, D_total2 between channels 1 and 3, and D_total4 between channels 2 and 3 all exceed the similarity threshold. In one feasible manner, the channel pair with the largest unified similarity, channels 1 and 2, can be merged according to the sorting result.
In another feasible manner, all related channels whose pairwise unified similarities exceed the similarity threshold can be merged together: D_total1 relates channels 1 and 2, D_total4 relates channels 2 and 3, and D_total2 relates channels 1 and 3, so channels 1, 2, and 3 may be merged together. In yet another manner, any two target channels whose unified similarity is greater than the similarity threshold may be pruned, for example by merging channels 2 and 3. Specifically, which channels are merged may be determined according to the current application scenario.
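The threshold-and-sort selection described above can be sketched as follows; the channel-pair indices, the unified-similarity values, and the threshold α = 0.60 are invented for illustration:

```python
def pairs_above_threshold(unified, alpha):
    # Sort the pairwise unified similarities in descending order and keep
    # the channel pairs whose similarity exceeds the threshold alpha.
    ranked = sorted(unified.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, sim in ranked if sim > alpha]

# Unified similarities D_total for the six pairs of channels 1..4
unified = {(1, 2): 0.95, (1, 3): 0.80, (1, 4): 0.40,
           (2, 3): 0.90, (2, 4): 0.35, (3, 4): 0.20}
candidates = pairs_above_threshold(unified, alpha=0.60)
# candidates == [(1, 2), (2, 3), (1, 3)]; merging (1, 2) first mirrors the
# "largest unified similarity" strategy described above
```

Merging the first candidate pair, or the transitive closure of all candidates, corresponds to the two feasible manners described in the text.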
In one possible implementation, the similarity calculation modes are: similarity based on the Euclidean distance, similarity based on the cosine distance, and similarity based on the Mahalanobis distance. The single-dimensional similarity between channels may be calculated using any two of these modes (for example, Euclidean and cosine, or cosine and Mahalanobis), or using all three. The number of similarity calculation modes used to calculate the single-dimensional similarity between channels is not limited in the present application.
Specifically, in the embodiment of the present application, the similarity based on the Euclidean distance may be calculated according to the following formula:

d(x, y) = √( Σ_{i=1}^{n} (x_i − y_i)² )

where d(x, y) is the similarity between two channel vectors, n is the dimension of the target channel vectors in the first model, and x_i, y_i are respectively the elements in the i-th dimension of the two channel vectors.
The similarity calculation mode based on the cosine distance can be calculated according to the following formula:

cos(θ) = ( Σ_{i=1}^{n} x_i y_i ) / ( √(Σ_{i=1}^{n} x_i²) · √(Σ_{i=1}^{n} y_i²) )

where cos(θ) is the similarity between two channel vectors, n is the dimension of the target channel vectors in the first model, and x_i, y_i are respectively the elements in the i-th dimension of the two channel vectors.
The similarity calculation mode based on the Mahalanobis distance can be calculated according to the following formula:

d(x, y) = √( (x − y)ᵀ S⁻¹ (x − y) )

where d(x, y) is the similarity between two channel vectors x and y, n is the dimension of the target channel vectors in the first model, x_i, y_i are respectively the elements in the i-th dimension of the two channel vectors, S is the covariance matrix of the target channel vectors, and T denotes the transpose operation.
In the embodiment of the present application, other manners capable of calculating similarity may also be used to calculate the single-dimensional similarity between two channels; the embodiments of the present application are not limited in this respect.
Still taking the calculation of the single-dimensional similarity of the channels 1 and 2 in the convolutional layer 2 of the aforementioned exemplary first model as an example, the single-dimensional similarity between the channel vector 1 and the channel vector 2 is calculated according to three calculation methods, the single-dimensional similarity d1 is calculated according to the euclidean distance, the single-dimensional similarity d2 is calculated according to the cosine distance, and the single-dimensional similarity d3 is calculated according to the mahalanobis distance.
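As an illustrative sketch (not part of the patent text; the channel-vector values and helper names are invented), the three single-dimensional similarity modes can be computed for a pair of channel vectors as follows:

```python
import numpy as np

# Illustrative channel vectors for channels 1..4 of convolutional layer 2
# (values are invented; in practice they come from the first model's weights)
channels = np.array([[0.2, 0.5, 0.1, 0.9],
                     [0.3, 0.4, 0.2, 0.8],
                     [0.9, 0.1, 0.7, 0.2],
                     [0.1, 0.6, 0.1, 1.0]])

def euclidean(x, y):
    # single-dimensional similarity mode based on the Euclidean distance
    return np.sqrt(np.sum((x - y) ** 2))

def cosine(x, y):
    # single-dimensional similarity mode based on the cosine distance
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def mahalanobis(x, y, S_inv):
    # single-dimensional similarity mode based on the Mahalanobis distance;
    # S_inv is the (pseudo-)inverse of the covariance matrix S of the vectors
    diff = x - y
    return np.sqrt(diff @ S_inv @ diff)

# pinv guards against a singular covariance matrix (few channels, many dims)
S_inv = np.linalg.pinv(np.cov(channels, rowvar=False))

d1 = euclidean(channels[0], channels[1])           # Euclidean mode
d2 = cosine(channels[0], channels[1])              # cosine mode
d3 = mahalanobis(channels[0], channels[1], S_inv)  # Mahalanobis mode
```

Each of d1, d2, d3 is one single-dimensional similarity for the channel pair (1, 2); the remaining five channel pairs are handled identically.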
In the embodiment of the application, for every two target channels in the first model, a plurality of single-dimensional similarities between the two channels are obtained through calculation according to a plurality of similarity calculation methods, a unified similarity between the two target channels is calculated based on each single-dimensional similarity between the two channels, and then the target channels in the first model are pruned by the channels based on the unified similarity. In the embodiment of the application, the similarity between every two target channels in the model is calculated by using multiple similarity calculation modes, and channel pruning can be performed jointly by combining the characteristics of each similarity calculation mode, so that the model pruning is suitable for multiple application scenes, and the generalization of the model pruning is enhanced.
In a possible implementation manner, after the single-dimensional similarity between every two channels is calculated in the different calculation modes, the single-dimensional similarities between every two target channels may be standard normalized in order to improve the accuracy of the unified similarity. Calculating the unified similarity of two target channels based on each single-dimensional similarity between them then proceeds as shown in fig. 2:
s201, aiming at every two target channels in the first model, carrying out standard normalization on each single-dimensional similarity between the two channel vectors to obtain a plurality of second similarities between the two target channels.
In the embodiment of the application, the single-dimensional similarity between every two channel vectors is calculated in each of the similarity calculation modes. For example, taking the application scenario of the foregoing embodiment, for convolutional layer 2 of the first model the single-dimensional similarities between channels 1, 2, 3, and 4 are calculated in three modes. From the Euclidean distance: d1_(1,2) between channels 1 and 2, d1_(1,3) between channels 1 and 3, d1_(1,4) between channels 1 and 4, d1_(2,3) between channels 2 and 3, d1_(2,4) between channels 2 and 4, and d1_(3,4) between channels 3 and 4. From the cosine distance: d2_(1,2), d2_(1,3), d2_(1,4), d2_(2,3), d2_(2,4), and d2_(3,4) for the same channel pairs. From the Mahalanobis distance: d3_(1,2), d3_(1,3), d3_(1,4), d3_(2,3), d3_(2,4), and d3_(3,4).
When performing standard normalization, the following formula can be followed:

D = (x_i − μ(x)) / σ(x)

wherein x_i is one of the single-dimensional similarities between two target channels calculated by a given calculation method, for example the single-dimensional similarity d1(1,2) between channels 1 and 2 obtained according to the Euclidean distance in convolutional layer 2 of the first model; μ(x) is the mean of the single-dimensional similarities between all pairs of target channels in the convolutional layer under the same calculation method; and σ(x) is the standard deviation of the single-dimensional similarities between all pairs of target channels under the same calculation method.
For convolutional layer 2 of the first model, the similarities obtained by the three calculation methods are standard-normalized. The similarity obtained according to the Euclidean distance is normalized as:

D1(1,2) = (d1(1,2) − μ(d1)) / σ(d1)

wherein μ(d1) and σ(d1) are the mean and standard deviation of d1(1,2), d1(1,3), d1(1,4), d1(2,3), d1(2,4) and d1(3,4) calculated according to the Euclidean distance. The normalized calculations of d1(1,3), d1(1,4), d1(2,3), d1(2,4) and d1(3,4) are similar to that of d1(1,2), and are not described in detail in this application.
Similarly, the normalization of the single-dimensional similarities d2(1,2), d2(1,3), d2(1,4), d2(2,3), d2(2,4) and d2(3,4) obtained according to the cosine distance, and of d3(1,2), d3(1,3), d3(1,4), d3(2,3), d3(2,4) and d3(3,4) obtained according to the Mahalanobis distance, is similar to the normalized calculation of d1(1,2) and is not described in detail in this application.
For channel 1 and channel 2, the single-dimensional similarities calculated by the three calculation methods are normalized to obtain D1(1,2), D2(1,2) and D3(1,2); D1(1,2), D2(1,2) and D3(1,2) are the obtained second similarities.
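The standard normalization above can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np

def standard_normalize(sim_dict):
    """Apply the formula D = (x - mean) / std to one calculation method's
    single-dimensional similarities within a layer, yielding the second
    similarities for every channel pair."""
    vals = np.array(list(sim_dict.values()), dtype=float)
    mu, sigma = vals.mean(), vals.std()
    return {pair: (x - mu) / sigma for pair, x in sim_dict.items()}
```

Applied per calculation method, this puts the Euclidean, cosine and Mahalanobis similarities on a comparable scale before they are combined.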
S202, for every two target channels in the first model, the unified similarity between the two target channels is calculated based on each second similarity between the two target channels, wherein the unified similarity is positively correlated with each second similarity between the two channels.
In the embodiment of the application, the normalized similarities obtained by the various calculation methods can be weighted to obtain the unified similarity. The weight value of the normalized similarity obtained by each calculation method may be set by a technician according to empirical values, or may be calculated by the technician according to a weight calculation formula; this is not limited in the embodiment of the present application.
The unified similarity is calculated according to D_total = w1*D1 + w2*D2 + w3*D3 + ..., wherein D1, D2, D3, ... are the second similarities obtained by standard normalization of the single-dimensional similarities from the various calculation methods, and w1, w2, w3, ... are the weight values of the respective second similarities. Taking channel 1 and channel 2 in convolutional layer 2 of the first model in the above example as an example, the unified similarity between channel 1 and channel 2 is calculated from the normalized D1(1,2), D2(1,2) and D3(1,2) as D_total1 = w1×D1(1,2) + w2×D2(1,2) + w3×D3(1,2).
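The weighted combination can be sketched as follows (the method names and weight values are assumptions, chosen only for illustration):

```python
def unified_similarity(second_sims, weights):
    """D_total = w1*D1 + w2*D2 + w3*D3 + ... for each channel pair, combining
    the normalized (second) similarities of every calculation method.
    `second_sims` maps method name -> {pair: second similarity};
    `weights` maps method name -> weight value (set empirically, per the text)."""
    pairs = next(iter(second_sims.values())).keys()
    return {p: sum(weights[m] * second_sims[m][p] for m in second_sims)
            for p in pairs}
```

For example, with weights 0.5/0.3/0.2 the unified similarity of a pair is simply the corresponding weighted sum of its three normalized similarities.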
In the embodiment of the application, the similarity between every two channels is calculated by various similarity calculation methods, the calculated similarities are standard-normalized and combined into a unified similarity, and channel pruning can finally be performed based on the unified similarity. Channel pruning thus combines the characteristics of all similarity calculation methods; because the similarity between channels fuses the characteristics of each algorithm's application scenario, the model pruning is suitable for multiple application scenarios and the generalization of the model pruning is enhanced. In a possible embodiment, the number of channels of the model after channel pruning may also be detected, so that the model's channels are pruned down as far as possible. As shown in fig. 3, after the above S103, the method further includes:
S301, judging whether the current channel number in the first model is lower than a preset channel number threshold.
In this step, the current channel number of the first model is the latest channel number after the channel pruning step, and the preset channel number threshold is set in advance by a technician according to the characteristics of the model and experience. The first model here may be a model after channel pruning or a model without channel pruning: when the calculated unified similarity is greater than the similarity threshold, channel pruning is performed on the first model; when it is less than the threshold, channel pruning is not performed.
S302, if not, taking the second model as a new first model, and returning to execute S101.
In this step, for two target channels, when the unified similarity is determined to be higher than the preset similarity threshold, the two target channels are merged; that is, channel pruning is performed on the first model to obtain a second model. The second model is then taken as a new first model, and whether its current channel number is lower than the preset channel number threshold is judged. If not, the model can still be channel-pruned: the unified similarity is calculated based on the new first model and channel pruning is performed on it, until the channel number is lower than the preset channel number threshold.
Still taking convolutional layer 2 of the first model as an example, convolutional layer 2 has 4 channels in total: channel 1, channel 2, channel 3 and channel 4. The unified similarity D_total1 between channel 1 and channel 2 is calculated; if it is higher than the similarity threshold, channels 1 and 2 are merged, and convolutional layer 2 of the first model then has 3 channels: new channel 1, new channel 2 and new channel 3. Assuming the preset channel number threshold is 2, since the current channel number of convolutional layer 2 of the first model is 3 > 2, the channel-pruned first model is taken as a new first model, the data in new channel 1, new channel 2 and new channel 3 of convolutional layer 2 is re-acquired, and the unified similarity is recalculated for channel pruning, until the current channel number of convolutional layer 2 of the new first model is less than 2.
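The channel-count-bounded pruning loop of S301-S302 can be sketched as follows. The merge rule (averaging the two weight tensors) and the concrete similarity function (1/(1+distance)) are assumptions for illustration; the application does not fix them to this level of detail:

```python
import numpy as np

def prune_layer(channels, sim_threshold, channel_threshold):
    """While the layer still has at least `channel_threshold` channels,
    merge the most-similar pair whose similarity exceeds `sim_threshold`;
    stop once the channel count drops below the threshold or no pair
    is similar enough."""
    channels = [np.asarray(c, dtype=float) for c in channels]
    while len(channels) >= channel_threshold:
        best_pair, best_sim = None, -1.0
        for i in range(len(channels)):
            for j in range(i + 1, len(channels)):
                # assumed stand-in for the unified similarity: closer channels score higher
                sim = 1.0 / (1.0 + np.linalg.norm(channels[i] - channels[j]))
                if sim > best_sim:
                    best_pair, best_sim = (i, j), sim
        if best_pair is None or best_sim <= sim_threshold:
            break  # no pair similar enough: stop pruning
        i, j = best_pair
        merged = (channels[i] + channels[j]) / 2.0  # assumed merge rule
        channels = [c for k, c in enumerate(channels) if k not in (i, j)]
        channels.append(merged)
    return channels
```

With four identical channels and a channel number threshold of 2, the loop keeps merging until a single channel remains, mirroring the "prune until below the threshold" behavior described above.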
In the embodiment of the application, setting the preset channel number threshold makes it possible to judge whether the current model has been channel-pruned to the minimum channel number, so that the model is channel-pruned to as few channels as possible.
In a possible implementation, the number of training rounds of the model after channel pruning can be detected, so that the training of the model is limited and channel pruning is guaranteed on the premise of affecting model performance as little as possible. As shown in fig. 4, after S103, the method further includes:
S401, judging whether the model training count of the first model is greater than a preset model training count.
In this step, except for the first time that a plurality of single-dimensional similarities between two channels are calculated according to the plurality of similarity calculation methods for every two target channels in the first model, the model needs to be retrained to obtain a new first model before that calculation step is executed again for the new first model. In the embodiment of the application, the training count of the model can be recorded, adding 1 to the previous count each time the model is trained.
Therefore, before judging whether the model training count of the first model is greater than the preset model training count, the training count of the current first model is obtained. The training count of the first model may be recorded by a counting unit in the first model.
S402, if not, training the first model to obtain a third model; taking the third model as the new first model, the process returns to S101.
In this step, the model is continuously trained during channel pruning of the first model so as to improve the generalization of the model. However, since channel pruning may affect the performance and accuracy of the first model, a threshold on the number of training rounds needs to be set to guarantee them. This threshold is set manually by a technician according to experience, and is not limited in any way in the embodiment of the application.
Still taking convolutional layer 2 of the first model as an example: before the single-dimensional similarities of convolutional layer 2 are calculated for the first time, the training count of the first model is 0. After the first round of channel pruning the model is trained once, so the count becomes 1. Assuming the preset model training count is 1, since 1 is not greater than 1, the first model is retrained to obtain a new first model, and the training count becomes 2. Since 2 > 1, the first model is not retrained again, because its training count has reached the maximum; if training were continued, the performance and accuracy of the first model would be affected.
In the embodiment of the application, setting the model training count threshold makes it possible to judge whether the training count of the current model has reached the maximum, so that model training is performed without affecting the performance and accuracy of the first model.
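The training-count check of S401-S402 can be sketched as follows; `prune_step` and `train_step` are hypothetical callables standing in for channel pruning and one round of training:

```python
def prune_and_retrain(model, max_train_count, prune_step, train_step):
    """After each pruning pass, retrain only while the recorded training
    count does not exceed the preset maximum; otherwise keep the model."""
    train_count = 0
    while True:
        model = prune_step(model)
        if train_count > max_train_count:
            break  # training count exceeded the maximum: stop
        model = train_step(model)
        train_count += 1
    return model
```

With a maximum of 1, this reproduces the worked example above: counts 0 and 1 permit retraining, and the loop stops once the count reaches 2 > 1.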
In a possible implementation, model pruning may be performed in combination with the preset similarity threshold, the preset channel number threshold and the preset model training count threshold, as shown in fig. 5. The method specifically includes steps S501-S508, where steps S501-S504 are the same as steps S101-S103, steps S505-S506 are the same as steps S301-S302, and steps S507-S508 are the same as steps S401-S402, which are not described again here. The method comprises the following steps:
S501, for every two target channels in the first model, a plurality of single-dimensional similarities between the two channels are calculated according to multiple similarity calculation methods, wherein the target channels are used for extracting data features of data input into the first model.
S502, aiming at every two target channels in the first model, calculating unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, wherein the unified similarity is positively correlated with each single-dimensional similarity between the two channels.
S503, judging whether the unified similarity between every two target channels is lower than a similarity threshold, if so, executing S507, and if not, executing S504.
S504, channel pruning is performed on the target channels in the first model whose unified similarity is greater than the similarity threshold, and a second model is obtained.
S505, judging whether the current channel number in the second model is lower than a preset channel number threshold.
S506, if not, executing S507.
S507, judging whether the model training count is greater than the model training count threshold.
S508, if not, training the first model to obtain a third model; taking the third model as the new first model, the process returns to S501.
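The combined flow of S501-S508 with the three thresholds can be sketched as follows; all helper callables are hypothetical stand-ins, not interfaces defined by this application:

```python
def prune_pipeline(model, sim_threshold, channel_threshold, max_train_count,
                   unified_sims, prune_pairs, count_channels, train):
    """High-level sketch of S501-S508. `unified_sims` returns the unified
    similarity per channel pair, `prune_pairs` merges pairs above the
    threshold, `count_channels` and `train` do what their names say."""
    train_count = 0
    while True:
        sims = unified_sims(model)                                        # S501-S502
        prunable = {p: s for p, s in sims.items() if s >= sim_threshold}  # S503
        if prunable:
            model = prune_pairs(model, prunable)                          # S504
            if count_channels(model) < channel_threshold:                 # S505-S506
                break  # pruned down to the minimum channel count
        if train_count > max_train_count:                                 # S507
            break
        model = train(model)                                              # S508
        train_count += 1
    return model
```

A toy run with a model represented as a list of per-channel scalars shows the loop merging the near-duplicate channels while the thresholds bound how far pruning and retraining go.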
In the embodiment of the application, for every two target channels in the first model, a plurality of single-dimensional similarities between the two channels are calculated according to multiple similarity calculation methods, the unified similarity between the two target channels is calculated based on each single-dimensional similarity, and the target channels in the first model are then channel-pruned based on the unified similarity. Channel pruning thus combines the characteristics of each similarity calculation method; because the similarity between channels fuses the characteristics of each algorithm's application scenario, the model pruning is suitable for multiple application scenarios and the generalization of the model pruning is enhanced. By setting three thresholds (the similarity threshold, the channel number threshold and the training count threshold), the model can be channel-pruned as far as possible while ensuring the model's effectiveness.
Corresponding to the foregoing model pruning method, the present application also provides a model pruning device, as shown in fig. 6, including:
a first calculating unit 601, configured to calculate, according to multiple similarity calculation methods, multiple single-dimensional similarities between two target channels in a first model, where the target channels are used to extract data features of data input to the first model;
a second calculating unit 602, configured to calculate, for each two target channels in the first model, a unified similarity between the two target channels based on each of the single-dimensional similarities between the two target channels, where the unified similarity is positively correlated with each of the single-dimensional similarities between the two target channels;
a channel pruning unit 603, configured to perform channel pruning on the target channels in the first model according to the unified similarity between the target channels to obtain a second model, where the unified similarity between any two channel-pruned target channels is greater than a preset similarity threshold.
As an embodiment, the plurality of similarity calculation methods includes at least two of the following three similarity calculation methods:
the method comprises the following steps of calculating the similarity according to the Euclidean distance, calculating the similarity according to the cosine distance, and calculating the similarity according to the Mahalanobis distance.
As an embodiment, the second calculating unit 602 is specifically configured to, for every two target channels in the first model, perform standard normalization on each single-dimensional similarity between the two channel vectors, so as to obtain multiple normalized similarities between the two target channels;
for each two target channels in the first model, calculating a unified similarity between the two channels based on each normalized similarity between the two target channels, wherein the unified similarity is positively correlated with each normalized similarity between the two channels.
As an implementation manner, the apparatus further includes a first determining unit 604, configured to determine whether a current channel number in the first model is lower than a preset channel number threshold;
and if not, taking the second model as a new first model, returning to execute the step of calculating and obtaining a plurality of single-dimensional similarities between two channels according to a plurality of similarity calculation methods aiming at every two target channels in the first model.
As an embodiment, the apparatus further includes a second determining unit 605, configured to determine whether the number of times of model training of the first model is greater than a preset number of times of model training;
if not, training the first model to obtain a third model; and taking the third model as a new first model, and executing the step of calculating and obtaining a plurality of single-dimensional similarities between two target channels in the first model according to a plurality of similarity calculation modes aiming at each two target channels in the first model.
An embodiment of the present invention further provides an electronic device, as shown in fig. 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 complete mutual communication through the communication bus 704,
a memory 703 for storing a computer program;
the processor 701 is configured to implement the following steps when executing the program stored in the memory 703:
aiming at every two target channels in a first model, respectively calculating according to multiple similarity calculation modes to obtain multiple single-dimensional similarities between the two channels, wherein the target channels are used for extracting data features of data input into the first model;
for every two target channels in the first model, calculating unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, wherein the unified similarity is positively correlated with each single-dimensional similarity between the two channels;
and according to the unified similarity among the target channels, channel pruning the target channels in the first model to obtain a second model, wherein the unified similarity between any two channel-pruned target channels is greater than a preset similarity threshold.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In a further embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the model pruning methods described above.
In a further embodiment provided by the present invention, there is also provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform any of the model pruning methods of the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the invention to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the embodiments of the apparatus, the electronic device, the computer-readable storage medium, and the computer program product, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (13)

1. A model pruning method, applied to target recognition, comprising:
aiming at every two target channels in a first model, respectively calculating according to multiple similarity calculation modes to obtain multiple single-dimensional similarities between the two channels, wherein the target channels are used for extracting data features of data input into the first model;
for every two target channels in the first model, calculating a unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, wherein the unified similarity is positively correlated with each single-dimensional similarity between the two channels;
and according to the unified similarity among the target channels, channel pruning the target channels in the first model to obtain a second model, wherein the unified similarity between any two channel-pruned target channels is greater than a preset similarity threshold.
2. The method according to claim 1, wherein the plurality of similarity calculation methods includes at least two of the following three similarity calculation methods:
the method comprises the following steps of calculating the similarity according to the Euclidean distance, calculating the similarity according to the cosine distance, and calculating the similarity according to the Mahalanobis distance.
3. The method of claim 1, wherein the calculating a unified similarity between two target channels based on each of the single-dimensional similarities between the two target channels, for every two target channels in the first model, comprises:
aiming at every two target channels in the first model, carrying out standard normalization on each single-dimensional similarity between the two channel vectors to obtain a plurality of normalized similarities between the two target channels;
for each two target channels in the first model, calculating a unified similarity between the two channels based on each normalized similarity between the two target channels, wherein the unified similarity is positively correlated with each normalized similarity between the two channels.
4. The method of claim 1, further comprising:
judging whether the current channel number in the first model is lower than a preset channel number threshold value or not;
and if not, taking the second model as a new first model, returning to execute the step of calculating a plurality of single-dimensional similarities between two channels according to a plurality of similarity calculation methods aiming at each two target channels in the first model.
5. The method of claim 1, further comprising:
judging whether the model training times of the first model are larger than the preset model training times or not;
if not, training the first model to obtain a third model; and taking the third model as a new first model, and executing the step of calculating and obtaining a plurality of single-dimensional similarities between two target channels in the first model according to a plurality of similarity calculation modes aiming at each two target channels in the first model.
6. Model pruning device, characterized in that the device is applied to object recognition, the device comprises:
the first calculation unit is used for calculating a plurality of single-dimensional similarities between two target channels in a first model according to a plurality of similarity calculation methods, wherein the target channels are used for extracting data features of data input into the first model;
a second calculating unit, configured to calculate, for each two target channels in the first model, a unified similarity between the two target channels based on each single-dimensional similarity between the two target channels, where the unified similarity is positively correlated with each single-dimensional similarity between the two channels;
and the channel pruning unit is used for channel pruning the target channels in the first model according to the unified similarity between the target channels to obtain a second model, wherein the unified similarity between any two target channels pruned by the channel is greater than a preset similarity threshold.
7. The apparatus of claim 6, wherein the plurality of similarity calculation methods comprises at least two of the following three similarity calculation methods:
the method comprises the following steps of calculating the similarity according to the Euclidean distance, calculating the similarity according to the cosine distance, and calculating the similarity according to the Mahalanobis distance.
8. The apparatus according to claim 6, wherein the second computing unit is specifically configured to perform standard normalization on each one-dimensional similarity between two channel vectors for each two target channels in the first model, so as to obtain multiple normalized similarities between the two target channels;
for each two target channels in the first model, calculating a unified similarity between the two channels based on each normalized similarity between the two target channels, wherein the unified similarity is positively correlated with each normalized similarity between the two channels.
9. The apparatus according to claim 6, further comprising a first determining unit configured to determine whether a current channel number in the first model is lower than a preset channel number threshold;
and if not, taking the second model as a new first model, returning to execute the step of calculating and obtaining a plurality of single-dimensional similarities between two channels according to a plurality of similarity calculation methods aiming at every two target channels in the first model.
10. The apparatus according to claim 6, further comprising a second determining unit configured to determine whether the number of times of model training of the first model is greater than a preset number of times of model training;
if not, training the first model to obtain a third model; and taking the third model as a new first model, and executing the step of calculating and obtaining a plurality of single-dimensional similarities between two target channels in the first model according to a plurality of similarity calculation modes aiming at each two target channels in the first model.
11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.
12. A model pruning device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: carrying out the method steps of any one of claims 1 to 5.
13. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-5.
CN202210679726.7A 2022-06-15 2022-06-15 Model pruning method and device and computer equipment Pending CN114943337A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210679726.7A CN114943337A (en) 2022-06-15 2022-06-15 Model pruning method and device and computer equipment


Publications (1)

Publication Number Publication Date
CN114943337A true CN114943337A (en) 2022-08-26

Family

ID=82911474


Country Status (1)

Country Link
CN (1) CN114943337A (en)

Similar Documents

Publication Publication Date Title
CN107346448B (en) Deep neural network-based recognition device, training device and method
US20200380210A1 (en) Event Recognition Method and Apparatus, Model Training Method and Apparatus, and Storage Medium
CN111027428A (en) Training method and device of multi-task model and electronic equipment
CN111753863A (en) Image classification method and device, electronic equipment and storage medium
JP2018530093A (en) Credit score model training method, credit score calculation method, apparatus and server
WO2019201024A1 (en) Method, apparatus and device for updating model parameter, and storage medium
CN111159481B (en) Edge prediction method and device for graph data and terminal equipment
CN114662602A (en) Outlier detection method and device, electronic equipment and storage medium
CN112948612A (en) Human body cover generation method and device, electronic equipment and storage medium
CN111062440B (en) Sample selection method, device, equipment and storage medium
WO2020062803A1 (en) Abnormal traffic analysis method and apparatus based on model tree algorithm, and electronic device and non-volatile readable storage medium
CN110287361B (en) Figure picture screening method and device
CN109885831B (en) Keyword extraction method, device, equipment and computer readable storage medium
CN114519401A (en) Image classification method and device, electronic equipment and storage medium
CN112214402B (en) Code verification algorithm selection method, device and storage medium
CN114943337A (en) Model pruning method and device and computer equipment
CN112560856A (en) License plate detection and identification method, device, equipment and storage medium
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN115359575A (en) Identity recognition method and device and computer equipment
CN111708988B (en) Infringement video identification method and device, electronic equipment and storage medium
CN111428576B (en) Feature information learning method, electronic device and storage medium
CN114065858A (en) Model training method and device, electronic equipment and storage medium
CN113420699A (en) Face matching method and device and electronic equipment
CN109324716B (en) Touch screen anti-interference method based on hill climbing algorithm, touch device and mobile terminal
CN113066486B (en) Data identification method, device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination