CN113673694A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN113673694A
CN113673694A
Authority
CN
China
Prior art keywords
value
compression
model
precision
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110581028.9A
Other languages
Chinese (zh)
Inventor
涂小兵
鲁路
张伟丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Innovation Co
Original Assignee
Alibaba Singapore Holdings Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Singapore Holdings Pte Ltd filed Critical Alibaba Singapore Holdings Pte Ltd
Priority to CN202110581028.9A priority Critical patent/CN113673694A/en
Publication of CN113673694A publication Critical patent/CN113673694A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The application discloses a data processing method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises the following steps: acquiring an initial value of a compression parameter and a model precision value corresponding to the initial value, where the model precision value is the calculation precision of a target model compressed using the initial value of the compression parameter; comparing the model precision value with an original precision value and adjusting the value of the compression parameter according to the comparison result, where the original precision value is the calculation precision of the target model before compression; and outputting the adjusted value of the compression parameter. According to the embodiments of the application, the value of the compression parameter can be adjusted by judging how the calculation precision of the target model changes before and after compression, so as to obtain a compression parameter suited to the target model and achieve a better compression effect.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of artificial intelligence technology, it has been employed in an increasing number of fields to perform computing tasks, including computer vision, speech recognition, and natural language processing. In particular, deep neural networks can learn by exploiting long-term memory, which greatly improves the self-learning capability and computational performance of artificial intelligence systems, so deep neural networks have become ever more widely used. However, because of this long-term-memory characteristic, a deep neural network requires enormous computational overhead and memory during use, while more and more application scenarios are resource-constrained computing environments. In such environments, the computational efficiency of deep-neural-network-based artificial intelligence is limited, and the networks' huge demand for computing resources can even prevent their application in many scenarios of everyday life.
Given the more or less redundant parameters present in deep neural networks, the prior art has proposed compressing such networks with an appropriate model compression approach, thereby obtaining networks that are relatively lightweight while maintaining a certain degree of accuracy, so as to adapt to application scenarios with limited computational resources. In the related art, for example, ADMM (Alternating Direction Method of Multipliers) has been used to compress neural network parameters. However, compression schemes such as ADMM usually require a specific compression ratio to be specified manually, and because the sparsity sensitivity of each layer of a neural network differs, it is difficult to achieve a good compression effect with a fixed compression ratio.
Therefore, a technical solution capable of quickly determining the compression ratio parameter to improve the compression effect is needed.
Disclosure of Invention
Embodiments of the present application provide a data processing method and apparatus, an electronic device, and a computer-readable storage medium, so as to overcome the defect that a better compression effect is difficult to achieve with a fixed compression rate in the prior art.
In order to achieve the above object, an embodiment of the present application provides a data processing method, including:
acquiring an initial value of a compression parameter and a model precision value corresponding to the initial value, wherein the model precision value is the calculation precision of a target model compressed by the initial value of the compression parameter;
comparing the model precision value with an original precision value, and adjusting the numerical value of the compression parameter according to the comparison result, wherein the original precision value is the calculation precision of the target model before compression;
and outputting the adjusted numerical value of the compression parameter.
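As a rough sketch (not the patent's implementation; the function and parameter names are invented for illustration), the comparison-and-adjustment step can be expressed as a single function, using the doubling and one-third-reduction rule described in the embodiments that follow:

```python
def adjust_compression_parameter(initial_value, model_precision, original_precision):
    # Compare the post-compression precision with the original precision and
    # adjust the compression parameter accordingly (illustrative rule only).
    if model_precision >= original_precision:
        return initial_value * 2        # no precision loss: compress harder
    return initial_value * 2 / 3        # precision loss: back off by one third
```

A full search would iterate this step, recompressing the target model with each adjusted value until the precision comparison stabilizes.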
Further, according to an embodiment of the present application, in the data processing method of the present application, adjusting the value of the compression parameter may include: reducing the value of the compression parameter by one third.
Further, according to an embodiment of the present application, in the data processing method of the present application, the increase and decrease in the value of the compression parameter may be asymmetric.
Further, according to an embodiment of the present application, in the data processing method of the present application, adjusting the value of the compression parameter according to the comparison result may include: the value of the compression parameter is doubled when the model accuracy value is greater than or equal to the original accuracy value, and is reduced by one-third when the model accuracy value is less than the original accuracy value.
Further, according to the embodiment of the present application, in the data processing method of the present application, the target model is compressed using an ADMM scheme, and the compression process may further include: when the current network layer of the target model is compressed, using the remaining parameters of the layers associated with the current network layer as an additional ADMM constraint, where the remaining parameters of an associated layer refer to the parameters left after that layer is compressed, and the associated layers of the current network layer may be the preceding layer and/or the following layer of the current layer.
Thus, in the embodiment of the present application, by additionally introducing the parameters remaining after compression of the layers before and after the current network layer as an extra compression weight reference during ADMM compression, the correlation between the parameters of the current layer and those of the adjacent layers is taken into account, and the parameters of the target model to be retained are determined according to this correlation, so the compression effect can be further improved.
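The associated-layer constraint could be realized in many ways; the sketch below is one hypothetical interpretation (the scoring rule and the `neighbor_bonus` parameter are invented for illustration and are not specified by the application): magnitude-based pruning in which weights connected to parameters that survived pruning in the previous or next layer receive a score bonus.

```python
import numpy as np

def prune_with_neighbor_constraint(weights, keep_ratio, prev_mask=None, next_mask=None,
                                   neighbor_bonus=0.1):
    # weights: 2-D matrix of the current layer (inputs x outputs).
    # prev_mask / next_mask: 0/1 vectors marking which units of the adjacent
    # layers kept their parameters after compression (hypothetical inputs).
    score = np.abs(weights).astype(float)
    if prev_mask is not None:
        score *= 1.0 + neighbor_bonus * prev_mask[:, None]   # rows fed by surviving inputs
    if next_mask is not None:
        score *= 1.0 + neighbor_bonus * next_mask[None, :]   # columns feeding surviving outputs
    k = max(1, int(weights.size * keep_ratio))
    threshold = np.partition(score.ravel(), -k)[-k]          # k-th largest score
    mask = score >= threshold
    return weights * mask, mask
```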
Further, according to the embodiment of the present application, the data processing method of the present application may also provide an interactive interface, through which the initial value of the compression parameter may be displayed to a user as an initial compression suggestion, and through which the user's feedback on the displayed initial value may be received. The feedback input may be a compression parameter directly specified by the user, or an instruction from the user to adjust the initial value of the compression parameter, for example to increase or decrease it.
Further, according to the embodiment of the present application, the interactive interface may also present basic information of the target model to be processed to the user, and may receive the user's confirmation of or adjustments to this basic information. The data processing method of the present application may accordingly further include: adjusting the initial value of the compression parameter according to the received user input on the basic information of the target model.
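A minimal sketch of the described interaction, with an invented console prompt standing in for the interactive interface (the application does not prescribe any concrete UI, and the '+'/'-' adjustment factors here are illustrative):

```python
def suggest_and_confirm(initial_value, model_info, read_input=input):
    # Show the suggested initial compression parameter and basic model info,
    # then accept either a directly specified value or an up/down adjustment.
    print(f"Target model: {model_info}")
    print(f"Suggested initial compression parameter: {initial_value}")
    reply = read_input("Enter a value, '+' to increase, '-' to decrease, "
                       "or press Enter to accept: ").strip()
    if reply == "":
        return initial_value            # user accepts the suggestion
    if reply == "+":
        return initial_value * 2        # user asks to increase
    if reply == "-":
        return initial_value * 2 / 3    # user asks to decrease
    return float(reply)                 # user specifies a value directly
```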
An embodiment of the present application further provides a data processing apparatus, including:
the acquisition module is used for acquiring an initial value of a compression parameter and a model precision value corresponding to the initial value, wherein the model precision value is the calculation precision of a target model compressed using the initial value of the compression parameter;
the parameter adjusting module is used for comparing the model precision value with an original precision value and adjusting the numerical value of the compression parameter according to the comparison result, wherein the original precision value is the calculation precision of the target model before compression;
and the output module is used for outputting the adjusted numerical value of the compression parameter.
An embodiment of the present application further provides an electronic device, including:
a memory for storing a program;
and the processor is used for operating the program stored in the memory, and the program executes the data processing method provided by the embodiment of the application when running.
The embodiment of the present application also provides a computer readable storage medium, on which a computer program executable by a processor is stored, wherein the program, when executed by the processor, implements the data processing method provided by the embodiment of the present application.
According to the data processing method and apparatus, the electronic device, and the computer-readable storage medium provided by the embodiments of the present application, the value of the compression parameter is adjusted by judging the change in the calculation precision of the target model before and after compression, so that a compression parameter suited to the target model is obtained and a better compression effect is achieved.
For example, in the field of machine learning, a neural network model generally has many parameters, and a more accurate or more complex model generally has more of them. In the prior art, to obtain a higher-precision result, a higher-precision or more complex neural network model is often used, which sharply increases the number of parameters involved in calculation; this not only consumes huge computational resources but also slows down calculation. In practice, however, although more complex neural network models use more parameters, these parameters are not all equally important, and many are in fact redundant. In other words, some of the large number of parameters used in a complex neural network model have essentially little or even no effect on the precision of the calculations the model performs. Therefore, the prior art usually removes redundant parameters by setting appropriate compression parameters for different models, slimming the computational model by reducing its parameter count. The most important premise of such compression is that the calculation precision of the compressed model must not be affected. In other words, it is necessary to reduce the parameters of the compressed model as much as possible while ensuring that its usability is unchanged, so as to improve computational efficiency and reduce the demand for computational resources. How to determine suitable parameters so that the model is slimmed without reducing calculation precision is therefore a major problem in the art.
With the data processing method and apparatus, electronic device, and computer-readable medium described above, the target model is first compressed with a predetermined or specified initial value of the compression parameter, the calculation precision of the compressed model is compared with the original calculation precision, and the compression parameter is adjusted according to whether the precision of the compressed target model has dropped. In particular, because the best possible compression parameter that does not affect the calculation precision of the target model is found by repeated iterative trials, a better compression effect can be obtained than in the prior art, where only a fixed compression parameter is used. In addition, since the initial value of the compression parameter can be determined from the historical compression parameters of models of the same or a similar type, or through interaction with a user, the number of iterative trials can be further reduced and the best possible compression parameter obtained more quickly.
The foregoing description is only an overview of the technical solutions of the present application, and the present application can be implemented according to the content of the description in order to make the technical means of the present application more clearly understood, and the following detailed description of the present application is given in order to make the above and other objects, features, and advantages of the present application more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic diagram of a data processing scheme provided by an embodiment of the present application;
FIG. 2 is a flow chart of one embodiment of a data processing method provided herein;
FIG. 3 is a flow chart of another embodiment of a data processing method provided herein;
FIG. 4 is a schematic structural diagram of an embodiment of a data processing apparatus provided in the present application;
fig. 5 is a schematic structural diagram of an embodiment of an electronic device provided in the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Example one
The scheme provided by the embodiment of the application can be applied to any device or system capable of performing algorithmic operations, such as a distributed server. Fig. 1 is a schematic diagram of the architecture of a data processing scheme provided in an embodiment of the present application; the architecture shown in fig. 1 is only one example of an architecture to which the technical scheme of the present application can be applied.
With the development of artificial intelligence technology, it is being used in more and more fields to solve a wide variety of problems. In particular, deep neural networks can learn by exploiting long-term memory, greatly improving the self-learning capability and computational performance of artificial intelligence systems. However, because of this long-term-memory property, a deep neural network requires enormous computational overhead and memory during use, while more and more application scenarios are resource-constrained computing environments. In such environments, the computational efficiency of deep-neural-network-based artificial intelligence is limited, and the networks' huge demand for computing resources can even prevent their application in many scenarios of everyday life.
In particular, a neural network model generally has many parameters, and a higher-precision or more complex model generally has more of them; thus, to obtain higher-precision results, the prior art often uses higher-precision or more complex neural network models, which, owing to their large number of parameters, not only require substantial computing resources but also compute more slowly. In practice, however, many parameters of such a network model are of little importance, and some are outright redundant. In other words, some of the large number of parameters used in the network model have little influence on the precision of the calculations it performs. Therefore, for the more or less redundant parameters present in deep neural networks, the prior art has proposed compressing such network models with an appropriate compression method that removes the parameters with little influence on precision, so as to obtain a network with fewer parameters that still maintains a certain degree of accuracy, suited to application scenarios with limited computing resources or requiring higher computational efficiency.
In the prior art, compression schemes such as ADMM have been employed to compress the parameters of neural network models. ADMM is an improved form proposed to combine the decomposability of the dual ascent method with the excellent convergence properties of the method of multipliers; its goal is to decompose the original objective and the augmented function so as to facilitate parallel optimization under more general assumptions. That is, ADMM introduces auxiliary variables and then optimizes the variables alternately. Specifically, in ADMM the variables split at the outset are treated as distinct variables, and the constraint conditions are handled in the same way, so that the variables need not be merged again later; this preserves the decomposability of the optimization process and yields a sequential iteration. However, even such an optimized compression scheme usually requires a fixed compression rate to be specified manually; that is, for a given neural network model, the parameter count is reduced to meet that fixed rate. Because the sparsity sensitivity of each layer of a neural network differs, compressing a model with a fixed rate in the prior art often fails to remove all or most of the parameters that have little influence on precision, making a good compression effect difficult to achieve.
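For context, a generic scaled-form ADMM loop for weight pruning might look as follows. This is a textbook-style sketch, not the application's exact scheme; `loss_grad` and `project_sparse` are caller-supplied stand-ins for the task-loss gradient and the projection onto the sparsity constraint, and `rho` is the penalty weight that the later embodiments tune.

```python
import numpy as np

def admm_prune(W, loss_grad, project_sparse, rho=1e-3, lr=1e-2, steps=500):
    # Scaled-form ADMM for pruning (generic sketch):
    #   W-step: gradient step on loss + (rho/2)||W - Z + U||^2
    #   Z-step: projection of W + U onto the sparse constraint set
    #   U-step: dual update U += W - Z
    Z = project_sparse(W)
    U = np.zeros_like(W)
    for _ in range(steps):
        W = W - lr * (loss_grad(W) + rho * (W - Z + U))
        Z = project_sparse(W + U)
        U = U + W - Z
    return project_sparse(W)  # final hard projection yields the pruned weights
```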
Fig. 1 is a schematic diagram of a data processing scheme provided in an embodiment of the present application. As shown in fig. 1, in the processing scheme according to the embodiment of the present application, when parameter compression is performed on a neural network model by a scheme such as ADMM, a compression parameter that strongly influences the compression effect, for example the RHO parameter of the ADMM compression scheme, is selected for adjustment. In other words, the value of a parameter such as RHO typically has a large impact on the calculation precision of the neural network model compressed using it.
Therefore, based on this principle, in the embodiment of the present application the RHO parameter can be adjusted using the calculation precision of the neural network model as the criterion. For example, as shown in fig. 1, a calculation may first be performed on the target model to be compressed under the ADMM scheme, to obtain the original calculation precision of the target model. In the embodiment of the present application, the calculation performed by the neural network model to obtain the original precision may be a preset standard calculation process, or it may be an actual calculation task.
Furthermore, after the original calculation precision is obtained, an initial value may be assigned to a compression weight parameter such as the RHO parameter of the ADMM scheme. In the embodiment of the present application, the initial value may be obtained empirically, or it may be a statistical value obtained from compressing a number of similar models. The target model may then be compressed based on this initial value, that is, compressed using an ADMM scheme whose compression weight parameter takes the initial value, yielding a compressed neural network model; the above calculation is then repeated on the compressed model to obtain the post-compression calculation precision.
That is, after the original calculation precision of the target model on the calculation task has been obtained and the compression parameter has been given an initial value, ADMM may compress the target model using the compression weight parameter at that initial value, and the preset calculation task, or an actual calculation task, is performed again on the resulting compressed neural network model to obtain the post-compression calculation precision corresponding to that value of the weight parameter. The compression effect of compressing with the initial value of the compression weight can then be judged from the post-compression calculation precision.
For example, the post-compression calculation precision may be compared with the original precision value obtained above before the target model was compressed, and the value of the compression parameter may be adjusted based on the comparison. Suppose that after compression with the preset initial value of the compression weight parameter, the compressed target model repeats the above calculation and the resulting post-compression precision is not lower than the original precision. This comparison result indicates, first, that the compression performed with the initial value is to some extent successful, since it does not degrade the calculation precision of the target model: the compressed model has fewer parameters than the original yet achieves the same or similar precision when repeating the same calculation task that produced the original precision, so the compressed target model is usable. Second, it indicates that there may be room for further compression, because the precision did not degrade; that is, the parameters removed during compression had very little influence on the target calculation task performed by the target model. It is then natural to suppose that removable parameters very likely still remain in the compressed target model.
For this reason, as shown in fig. 1, the flow may return to the ADMM step, the value of the compression weight parameter RHO may be adjusted, for example from the initial value to twice the initial value, and the original target model may be compressed again with the modified value. The compressed target model obtained in this way repeats the above calculation, and the resulting post-compression precision is compared with the original precision. That is, it is determined whether the calculation precision of the target model is still not degraded when the compression weight parameter RHO is increased to raise the compression rate. If the precision of the target model compressed with twice the initial value is not degraded, the parameter-adjustment step may be repeated: for example, the current value, i.e. twice the initial value, is doubled again to four times the initial value, and the original target model is compressed once more.
Conversely, if after compression with the above preset initial value the post-compression calculation precision obtained by repeating the above calculation on the compressed target model is lower than the original precision, the comparison result may indicate that the compression performed with that value has not succeeded, i.e., that it degrades the calculation precision of the target model. In particular, although the compressed model has fewer parameters than the original, it can no longer achieve the same or similar precision when repeating the same calculation task that produced the original precision, which means the compressed target model is unusable. On the other hand, the comparison result may indicate that no further compression headroom remains at this value, and even that the last compression went too far, because the calculation precision deteriorated: among the parameters removed during compression there were parameters with a large influence on the target calculation task performed by the target model. It is natural to suppose that the compressed target model has very likely lost parameters that should not have been removed in the current compression; that is, the current setting of the compression weight parameter, such as RHO, is inappropriate.
Therefore, if the comparison shows that compression with the current RHO value has degraded the calculation precision of the target model, the current RHO value may be reduced, for example to half its current value, and the reduced RHO taken as the appropriate compression weight parameter. Of course, after the current RHO value is reduced, the compression may be performed once more and the above calculation repeated on the compressed target model so as to compare the original and post-compression precision, iterating in this manner.
Of course, in the embodiment of the present application, the way of increasing the compression weight parameter when the calculation precision is not degraded is not limited to doubling; it may instead be adjusted to three times the initial value or by another multiple. Likewise, the way of decreasing the parameter when the precision is degraded is not limited to halving; it may, for example, be reduced by one third. Further, the increase and the decrease need not correspond: the compression weight may be doubled when increased and reduced by one third when decreased. In such an asymmetric case, after a decrease the compression may be performed again with the reduced value, the above calculation repeated on the compressed target model, and the resulting post-compression precision compared with the original precision; if the calculation precision is confirmed not to be degraded, the reduced compression weight may be output as a better compression weight value.
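Putting the iteration together, including the re-verification after a decrease described above, a hypothetical search loop might read as follows (all names are illustrative; `compress` and `accuracy` stand in for the ADMM compression step and the precision evaluation):

```python
def search_compression_weight(model, rho0, compress, accuracy, max_rounds=20):
    # Double rho while the compressed model keeps the original accuracy;
    # on a drop, reduce rho by one third and re-verify before outputting.
    base = accuracy(model)                    # original calculation precision
    rho = rho0
    for _ in range(max_rounds):
        if accuracy(compress(model, rho)) >= base:
            rho *= 2                          # no degradation: raise the compression rate
        else:
            rho *= 2 / 3                      # degradation: back off by one third
            if accuracy(compress(model, rho)) >= base:
                return rho                    # verified: output the reduced value
    return rho
```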
In addition, in the embodiment of the present application, when the compression processing is performed, the compression residual parameters of the layers preceding and following the current network layer may additionally be introduced as an extra compression weight reference. In this way, the correlation between the parameters of the neural network layer currently being compressed and the parameters of its preceding and following layers is comprehensively considered, so that parameters having a strong correlation with the parameters remaining after the compression of the preceding and following layers have a higher probability of being retained during compression, and the compression effect can be further improved.
Therefore, the data processing scheme provided by the embodiment of the application can adjust the numerical value of the compression parameter by judging the change of the calculation accuracy of the target model before and after being compressed, and quickly determine the numerical value of the compression parameter suitable for the target model in an iterative manner, so that a better compression effect is achieved on the premise of ensuring that the calculation accuracy of the target model is not lost.
The above embodiments are illustrations of technical principles and exemplary application frameworks of the embodiments of the present application, and specific technical solutions of the embodiments of the present application are further described in detail below through a plurality of embodiments.
Example two
Fig. 2 is a flowchart of an embodiment of a data processing method provided in the present application, and an execution subject of the method may be various terminal or server devices with data processing capability, or may be a device or chip integrated on these devices. As shown in fig. 2, the data processing method includes the steps of:
S201, obtaining an initial value of a compression parameter and a model precision value corresponding to the initial value.
In this embodiment, the model precision value may be a calculation precision of the target model compressed by using the initial value of the compression parameter. Before the target model is subjected to parameter compression processing by a compression method such as ADMM, the calculation accuracy, i.e. the original accuracy value in the embodiment of the present application, is calculated for the uncompressed target model. Then, for the compression parameters in the compression algorithm, the target model is compressed once according to the manually set or default initial values, and the calculation accuracy of the target model at this time, that is, the model accuracy value corresponding to the initial values of the compression parameters in the embodiment of the present application, is calculated.
For example, as shown in fig. 1, a calculation may first be performed on the target model to be subjected to the compression processing of, for example, an ADMM compression scheme, so as to obtain the original calculation accuracy of the target model. In the embodiment of the present application, the calculation performed by the neural network model to obtain the original precision may be a preset standard calculation process or an actual calculation task. After the original calculation accuracy is obtained, an initial value may be given in step S201 to a compression parameter that has a large influence on the compression effect, such as the RHO parameter in the ADMM scheme. The initial value may be obtained empirically or may be a statistical value obtained by compressing a plurality of similar models. A compression process may then be performed on the target model based on the initial value; that is, the target neural network model is compressed using an ADMM compression scheme whose compression parameter takes the initial value described above, so as to obtain a compressed neural network model, and the same calculation task as that performed to obtain the original calculation accuracy is performed again on the compressed neural network model, so as to obtain the model precision value.
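As an illustration of how the original and post-compression precision values might be obtained, the sketch below runs one fixed evaluation task on a model; `evaluate` and `admm_compress` are hypothetical stand-ins for whatever standard calculation process and ADMM compressor the deployment actually uses:

```python
def evaluate(model, task):
    """Hypothetical precision measure: fraction of (input, expected)
    pairs on which the model's output matches the expectation."""
    correct = sum(1 for x, expected in task if model(x) == expected)
    return correct / len(task)

# The same task must be evaluated before and after compression so the
# two precision values are directly comparable:
#   original_precision = evaluate(target_model, task)
#   model_precision    = evaluate(admm_compress(target_model, rho=rho_init), task)
```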
S202, comparing the model precision value with the original precision value, and adjusting the value of the compression parameter according to the comparison result.
In the embodiment of the present application, the original precision value is the calculation precision of the target model before compression. When the model precision value after compression processing is compared with the original precision value before compression, if the model precision value is greater than or equal to the original precision value, namely the calculation precision of the target model is not lost, the numerical value of the compression parameter can be increased; if the model precision value is smaller than the original precision value, namely the target model generates precision loss, the numerical value of the compression parameter can be reduced. Therefore, the compression parameters are adjusted by judging the change of the calculation precision of the target model before and after being compressed, the numerical value of the compression parameters suitable for the target model can be finally obtained, and the target model is compressed according to the numerical value, so that a better compression effect can be achieved on the premise of ensuring the calculation precision of the target model.
That is, after the original calculation accuracy of the target model performing the calculation task is obtained and the initial value is given to the compression parameter in step S201, the target model may be subjected to compression processing using the compression parameter having the initial value in step S202. The same calculation task as that performed to obtain the original calculation accuracy may then be performed again on the compressed neural network model obtained after the compression processing, so as to obtain the post-compression calculation accuracy corresponding to the value of the weight parameter of the compression scheme used. The compression effect of the compression processing performed with the initial value of the compression weight coefficient can thus be judged based on the post-compression calculation accuracy.
For example, in step S202, the calculation accuracy obtained after compression is compared with the original calculation accuracy of the not-yet-compressed target model obtained as described above, and the value of the compression parameter is adjusted based on the comparison result. Suppose that, after the compression process is performed with the preset initial value of the compression parameter, the calculation accuracy obtained by repeating the above calculation on the compressed target model is not reduced compared with the original calculation accuracy. This comparison result indicates, on the one hand, that the compression process performed with the initial value is successful to some extent, since it does not deteriorate the calculation accuracy of the target model; the compressed target model is usable, because its number of parameters is reduced compared with the original model while the same or similar calculation accuracy is obtained on the same calculation task. On the other hand, the result also indicates that there may be further room for compression beyond what the initial value achieves, because the calculation accuracy of the compressed target model is not degraded, i.e., the parameters removed during the compression process have very little influence on the target model performing the target calculation task. It is then easily conceivable that removable parameters are still very likely to remain in the thus-compressed target model. Therefore, by determining the compression parameter from the change in calculation accuracy before and after compression, a compression parameter value that achieves a better compression effect can be obtained.
S203, outputting the adjusted value of the compression parameter.
Therefore, after the adjusted compression parameter value is obtained in step S202, the compression parameter value may be output in step S203, for example, for direct use in subsequent other model compression processes or as a setting reference of the initial value of the compression parameter.
Therefore, the data processing scheme provided by the embodiment of the application can adjust the value of the compression parameter by judging the change of the calculation precision of the target model before and after being compressed, so that the compression parameter suitable for the target model is obtained, and a better compression effect can be achieved by compressing the target model by adopting the compression mode of the value.
Example three
Fig. 3 is a flowchart of another embodiment of the data processing method provided in the present application, and an execution subject of the method may be various terminal or server devices with data processing capability, or may be a device or chip integrated on these devices. As shown in fig. 3, based on the embodiment shown in fig. 2, the data processing method may include the following steps:
S301, obtaining an initial value of the compression parameter and a model precision value corresponding to the initial value.
In this embodiment, the model precision value may be a calculation precision of the target model compressed by using the initial value of the compression parameter. Before the target model is subjected to parameter compression processing by a compression method such as ADMM, the calculation accuracy, i.e. the original accuracy value in the embodiment of the present application, is calculated for the uncompressed target model. Then, for the compression parameters in the compression algorithm, the target model is compressed once according to the manually set or default initial values, and the calculation accuracy of the target model at this time, that is, the model accuracy value corresponding to the initial values of the compression parameters in the embodiment of the present application, is calculated.
For example, as shown in fig. 1, a calculation may first be performed on the target model to be subjected to the compression processing of, for example, an ADMM compression scheme, so as to obtain the original calculation accuracy of the target model. In the embodiment of the present application, the calculation performed by the neural network model to obtain the original precision may be a preset standard calculation process or an actual calculation task. After the original calculation accuracy is obtained, an initial value may be given in step S301 to a compression parameter that has a large influence on the compression effect, such as the RHO parameter in the ADMM scheme. The initial value may be obtained empirically or may be a statistical value obtained by compressing a plurality of similar models. A compression process may then be performed on the target model based on the initial value; that is, the target neural network model is compressed using an ADMM compression scheme whose compression parameter takes the initial value described above, so as to obtain a compressed neural network model, and the same calculation task as that performed to obtain the original calculation accuracy is performed again on the compressed neural network model, so as to obtain the model precision value.
In this embodiment of the present application, in the process of adjusting the value of the compression parameter according to the magnitudes of the model precision value and the original precision value, the following iterative manner may be adopted:
S302, comparing the model precision value with the original precision value.
In the embodiment of the present application, the original precision value is the calculation precision of the target model before compression. When the model precision value after compression processing is compared with the original precision value before compression, if the model precision value is greater than or equal to the original precision value, namely the calculation precision of the target model is not lost, the numerical value of the compression parameter can be increased; if the model precision value is smaller than the original precision value, namely the target model generates precision loss, the numerical value of the compression parameter can be reduced. Therefore, the compression parameters are adjusted by judging the change of the calculation precision of the target model before and after being compressed, the numerical value of the compression parameters suitable for the target model can be finally obtained, and the target model is compressed according to the numerical value, so that a better compression effect can be achieved on the premise of ensuring the calculation precision of the target model.
That is, after the original calculation accuracy of the target model performing the calculation task is obtained and the initial value is given to the compression parameter in step S301, the target model may be subjected to compression processing using the compression parameter having the initial value in step S302. The same calculation task as that performed to obtain the original calculation accuracy may then be performed again on the compressed neural network model obtained after the compression processing, so as to obtain the post-compression calculation accuracy corresponding to the value of the weight parameter of the compression scheme used. The compression effect of the compression processing performed with the initial value of the compression weight coefficient can thus be judged based on the post-compression calculation accuracy.
For example, in step S302, the calculation accuracy obtained after compression is compared with the original calculation accuracy of the not-yet-compressed target model obtained as described above, and the value of the compression parameter is adjusted based on the comparison result. Suppose that, after the compression process is performed with the preset initial value of the compression parameter, the calculation accuracy obtained by repeating the above calculation on the compressed target model is not reduced compared with the original calculation accuracy. This comparison result indicates, on the one hand, that the compression process performed with the initial value is successful to some extent, since it does not deteriorate the calculation accuracy of the target model; the compressed target model is usable, because its number of parameters is reduced compared with the original model while the same or similar calculation accuracy is obtained on the same calculation task. On the other hand, the result also indicates that there may be further room for compression beyond what the initial value achieves, because the calculation accuracy of the compressed target model is not degraded, i.e., the parameters removed during the compression process have very little influence on the target model performing the target calculation task. It is then easily conceivable that removable parameters are still very likely to remain in the thus-compressed target model. Therefore, by determining the compression parameter from the change in calculation accuracy before and after compression, a compression parameter value that achieves a better compression effect can be obtained.
S303, when the model precision value is greater than or equal to the original precision value, increasing the value of the compression parameter, compressing the target model with the current value of the compression parameter, and acquiring the calculation precision of the compressed target model to update the model precision value.
S304, when the model precision value is smaller than the original precision value, reducing the numerical value of the compression parameter, and ending the iteration operation.
In the embodiment of the present application, when it is determined in step S303 that the model precision value is greater than or equal to the original precision value, that is, the calculation precision of the target model is not lost, the value of the compression parameter may be increased; for example, it may preferably be doubled. Then, the current (increased) value of the compression parameter is used to perform the compression processing on the target model again, the calculation accuracy of the target model at this time is acquired, the model precision value is updated with that calculation accuracy, and the process returns to step S302 for further adjustment. During the above iteration, when it is detected that the current model precision value is smaller than the original precision value, that is, the target model suffers a precision loss, the most recent increase has gone too far, so an operation of reducing the value of the compression parameter may be performed; for example, the value may be halved, returning it to the last value at which no precision loss occurred.
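Steps S302 to S304 can be sketched as the following loop; `compress_and_eval` is a hypothetical routine that compresses the target model with a given RHO value and returns the resulting calculation precision, and doubling/halving are just the example factors named above:

```python
def search_rho(rho_init, original_precision, compress_and_eval):
    """Iterate steps S302-S304: keep doubling RHO while the compressed
    model's precision is not below the original precision; on the first
    precision loss, halve back to the last good value and stop."""
    rho = rho_init
    precision = compress_and_eval(rho)
    while precision >= original_precision:  # S302/S303: no loss -> grow
        rho *= 2.0
        precision = compress_and_eval(rho)
    return rho / 2.0                        # S304: loss -> shrink, end iteration
```

Note the edge case: if even the initial value already loses precision, the halved initial value is returned without re-checking, matching the simpler variant in the text; re-verifying the reduced value would correspond to the iterated variant also described.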
For example, as shown in fig. 1, when it is judged in step S302 that the calculation accuracy obtained by the compressed target model performing the same calculation as that performed to obtain the original calculation accuracy is greater than or equal to the original calculation accuracy value, the value of the compression weight parameter RHO may be adjusted in step S303, for example modified from the initial value to twice the initial value. The compression process is then performed on the original target model again with the modified value, the above-described calculation is repeated on the newly compressed target model to obtain its compressed calculation accuracy, and that accuracy is compared with the original calculation accuracy obtained by the original target model. That is, it is determined whether the calculation accuracy of the target model remains undeteriorated when the compression parameter RHO is increased to further raise the compression rate. If the calculation accuracy of the target model compressed with twice the initial value is not deteriorated, the adjustment of step S303 may be repeated, for example doubling the current value again, i.e., modifying it to four times the initial value, and performing the compression on the original target model once more.
When it is determined in step S302 that the compression process performed with the compression weight parameter at its current value (for example, the preset initial value) yields a compressed calculation accuracy lower than the original calculation accuracy, the comparison result indicates, on the one hand, that the compression process performed with this value is unsuccessful, i.e., it degrades the accuracy of the calculation performed by the target model. In particular, although the number of parameters of the compressed target model is reduced compared with the original model, the same or similar calculation precision can no longer be obtained on the same calculation task, so the compressed target model cannot be used. On the other hand, the comparison result indicates that no further compression space remains at this value, and even that the last compression went too far. This is because the calculation accuracy of the compressed target model has deteriorated, meaning that among the parameters removed in the compression process there is at least one parameter having a large influence on the target calculation task performed by the target model. It is then highly probable that parameters which should not have been removed were removed in the current compression processing; that is, the setting of the compression weight parameter such as RHO was not appropriate.
Therefore, if the comparison shows that the compression process performed with the RHO parameter value determined in step S303 has degraded the calculation accuracy of the target model, the RHO parameter value may be reduced, for example to half of its current value, and the iteration may be terminated with the reduced RHO parameter taken as the appropriate compression parameter. Of course, after the current RHO parameter value is reduced, the above-described compression process may also be repeated, and the above-described calculation may be performed again on the compressed target model to compare the original calculation accuracy with the compressed calculation accuracy, iterating in this manner.
Of course, in the embodiment of the present application, the manner of increasing the compression weight parameter when the calculation accuracy does not deteriorate is not limited to doubling; the parameter may instead be adjusted to three times the initial value or some other multiple. Likewise, the manner of decreasing the compression weight parameter when the calculation accuracy deteriorates is not limited to halving; the parameter may instead be reduced by, for example, one third or some other fraction. Further, the increase and the decrease need not mirror each other; that is, the compression weight value may be adjusted in such a way that it is doubled when increased and reduced by one third when decreased. In such a non-mirrored case, for example doubling on increase and reducing by one third on decrease as described above, the compression process may be performed again with the reduced parameter value, the above-described calculation may be repeated on the compressed target model, and the resulting compressed calculation precision value may be compared with the original calculation precision value; if the calculation precision is confirmed not to have deteriorated, the reduced compression weight value may be output as a better compression weight value.
Specifically, in this embodiment of the present application, when the parameters of the target model are compressed in an ADMM manner to reduce their number, the remaining parameters of an association layer of the current network layer may be used as a new constraint of the ADMM when the current network layer is compressed. Here, the remaining parameters of the association layer refer to the parameters remaining after the association layer has been compressed, and the association layer of the current network layer may be the previous layer and/or the subsequent layer of the current layer.
In other words, in the embodiment of the present application, the compression residual parameters of the layers preceding and following the current network layer may additionally be introduced as an extra compression weight reference when the compression processing is performed. In this way, the correlation between the parameters of the neural network layer currently being compressed and the parameters of its preceding and following layers is comprehensively considered, so that parameters having a strong correlation with the parameters remaining after the compression of the preceding and following layers have a higher probability of being retained during compression, and the compression effect can be further improved.
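One way such a neighbour-layer reference could look, sketched as magnitude pruning whose scores receive a bonus for correlation with the parameters surviving in an adjacent layer. The score form, the `bonus` weight, and the `prune_layer` helper are all illustrative assumptions, not the patent's actual formulation of the ADMM constraint:

```python
import numpy as np

def prune_layer(weights, keep_ratio, neighbor_mask=None, corr=None, bonus=0.1):
    """Keep the top-scoring fraction of a layer's parameters.

    weights:       1-D array of the current layer's parameter values
    neighbor_mask: boolean mask of which neighbour-layer parameters survived
    corr:          (len(weights), len(neighbor_mask)) correlation matrix
    A parameter strongly correlated with surviving neighbour parameters
    receives a higher score and is therefore more likely to be retained.
    """
    score = np.abs(weights).astype(float)
    if neighbor_mask is not None and corr is not None:
        # bonus proportional to total correlation with surviving neighbours
        score += bonus * corr[:, neighbor_mask].sum(axis=1)
    k = max(1, int(round(len(weights) * keep_ratio)))
    keep = np.argsort(score)[-k:]          # indices of the k best scores
    mask = np.zeros(len(weights), dtype=bool)
    mask[keep] = True
    return mask
```

With no neighbour information this degenerates to plain magnitude pruning; with it, a slightly smaller weight can win out over a larger one if it is strongly correlated with what the previous or next layer kept.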
S305, outputting the adjusted value of the compression parameter.
Therefore, the data processing scheme provided by the embodiment of the application can adjust the numerical value of the compression parameter by judging the change of the calculation accuracy of the target model before and after being compressed, and quickly determine the numerical value of the compression parameter suitable for the target model in an iterative manner, so that a better compression effect is achieved on the premise of ensuring that the calculation accuracy of the target model is not lost.
Example four
Fig. 4 is a schematic structural diagram of an embodiment of a data processing apparatus provided in the present application, which can be used to execute the method steps shown in fig. 2 and fig. 3. As shown in fig. 4, the data processing apparatus may include: an acquisition module 41, a parameter adjustment module 42 and an output module 43.
The obtaining module 41 is configured to obtain an initial value of a compression parameter and a model precision value corresponding to the initial value, where the model precision value is the calculation precision of the target model compressed using the initial value of the compression parameter. The parameter adjustment module 42 is configured to compare the model precision value with an original precision value and adjust the value of the compression parameter according to the comparison result, where the original precision value is the calculation precision of the target model before compression. The output module 43 is configured to output the adjusted value of the compression parameter.
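A minimal sketch of how the three modules of Fig. 4 could map onto code; `compress_and_eval` is a hypothetical callback that compresses the target model with a given parameter value and returns its calculation precision, and the doubling/halving factors are just the examples used earlier in the text:

```python
class DataProcessingApparatus:
    """Obtaining module 41, parameter adjustment module 42 and output
    module 43, modelled as three methods of one object."""

    def __init__(self, compress_and_eval, original_precision):
        self._eval = compress_and_eval   # hypothetical compressor + evaluator
        self._orig = original_precision  # precision of the uncompressed model

    def acquire(self, initial_value):
        # module 41: initial value and its corresponding model precision
        return initial_value, self._eval(initial_value)

    def adjust(self, value, model_precision):
        # module 42: increase when there is no precision loss, else decrease
        return value * 2.0 if model_precision >= self._orig else value / 2.0

    def output(self, value):
        # module 43: hand the adjusted value to the caller
        return value
```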
In the embodiment of the present application, before the target model is subjected to parameter compression processing by a compression method such as ADMM, the calculation accuracy, that is, the original accuracy value in the embodiment of the present application, of the uncompressed target model is first calculated. Then, for the compression parameters in the compression algorithm, the target model is compressed once according to the manually set or default initial values, and the calculation accuracy of the target model at this time, that is, the model accuracy value corresponding to the initial values of the compression parameters in the embodiment of the present application, is calculated. The obtaining module 41 obtains the initial value and the model precision value corresponding to the initial value.
For example, a calculation may first be performed on the target model to be subjected to the compression processing of, for example, an ADMM compression scheme, so as to obtain the original calculation accuracy of the target model. In the embodiment of the present application, the calculation performed by the neural network model to obtain the original precision may be a preset standard calculation process or an actual calculation task. After the original calculation accuracy is obtained, the initial value of a compression parameter that has a large influence on the compression effect, such as the RHO parameter in the ADMM scheme, may be acquired by the obtaining module 41. The initial value may be obtained empirically or may be a statistical value obtained by compressing a plurality of similar models. A compression process may then be performed on the target model based on the initial value; that is, the target neural network model is compressed using an ADMM compression scheme whose compression parameter takes the initial value described above, so as to obtain a compressed neural network model, and the same calculation task as that performed to obtain the original calculation accuracy is performed again on the compressed neural network model, so as to obtain the model precision value.
Then, the parameter adjustment module 42 adjusts the parameter. In the embodiment of the present application, the original precision value is the calculation precision of the target model before compression. When the parameter adjustment module 42 compares the post-compression model precision value with the original precision value, if the model precision value is greater than or equal to the original precision value, i.e., the calculation precision of the target model is not lost, the value of the compression parameter may be increased; if the model precision value is smaller than the original precision value, i.e., the target model suffers a precision loss, the value of the compression parameter may be reduced. In this way, the compression parameter is adjusted according to the change in the calculation precision of the target model before and after compression, a value of the compression parameter suitable for the target model can finally be obtained, and compressing the target model with this value achieves a better compression effect on the premise that the calculation precision of the target model is ensured.
That is, after the original calculation accuracy of the target model performing the calculation task is obtained and the obtaining module 41 obtains the initial value of the compression parameter, the parameter adjustment module 42 may subject the target model to compression processing using the compression parameter having the initial value. The same calculation task as that performed to obtain the original calculation accuracy may then be performed again on the compressed neural network model obtained after the compression processing, so as to obtain the post-compression calculation accuracy corresponding to the value of the weight parameter of the compression scheme used. The compression effect of the compression processing performed with the initial value of the compression weight coefficient can thus be judged based on the post-compression calculation accuracy.
For example, the parameter adjustment module 42 adjusts the value of the compression parameter based on a comparison between the post-compression calculation precision and the original calculation precision obtained before the target model was compressed. Suppose that, after compression with the preset initial value, the calculation precision of the compressed model on the same calculation task is not reduced relative to the original precision. This comparison result indicates two things. First, compression with the initial value was successful to some extent and did not deteriorate the calculation precision of the target model: the compressed model uses fewer parameters than the original yet achieves the same or similar precision on the same task, so it is usable. Second, there may be room for further compression beyond the initial value: since the precision did not degrade, the parameters removed during compression had very little influence on the target model's performance of the task, and it is easy to see that the compressed model is very likely to still contain removable parameters. Therefore, by determining the compression parameter from the change in calculation precision before and after compression, a compression parameter value that achieves a better compression effect can be obtained.
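The before-and-after measurement described here can be sketched in Python. The `evaluate` and `compress` callables below are hypothetical stand-ins for the task runner and the compression routine; neither is named in this application:

```python
def measure_compression_effect(model, param_value, evaluate, compress):
    """Run the same calculation task before and after compression.

    `evaluate(model)` is an assumed helper returning the calculation
    precision of `model` on the target task; `compress(model, v)` is an
    assumed helper returning a copy of `model` compressed with the
    compression-parameter value `v`.
    """
    original_precision = evaluate(model)
    compressed_model = compress(model, param_value)
    compressed_precision = evaluate(compressed_model)
    # A non-decreased precision suggests the removed parameters had
    # little influence, i.e. there may be room for further compression.
    has_headroom = compressed_precision >= original_precision
    return original_precision, compressed_precision, has_headroom
```

The returned flag is exactly the comparison result that drives the parameter adjustment in this embodiment.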
The output module 43 then outputs the value of the compression parameter adjusted by the parameter adjustment module 42.
Further, in the embodiment of the present application, the parameter adjusting module 42 may perform the following iterative operations when performing parameter adjustment:
the model accuracy value is compared to the original accuracy value.
And when the model precision value is greater than or equal to the original precision value, increasing the numerical value of the compression parameter, compressing the target model by adopting the current value of the compression parameter, and acquiring the calculation precision of the compressed target model to update the model precision value.
And when the model precision value is smaller than the original precision value, reducing the numerical value of the compression parameter, and ending the iterative operation.
In the embodiment of the present application, when the parameter adjustment module 42 determines that the model precision value is greater than or equal to the original precision value, i.e., no calculation precision has been lost, the value of the compression parameter may be increased; for example, it may preferably be doubled. The target model is then compressed again with the current (increased) value of the compression parameter, the calculation precision of the compressed model is obtained, the model precision value is updated accordingly, and the comparison of precision values is repeated to further adjust the parameter. When, during this iteration, the parameter adjustment module 42 detects that the current model precision value is smaller than the original precision value, i.e., the target model has suffered a precision loss, the last adjusted value overshot the target value; the value of the compression parameter is therefore reduced, for example halved, the iteration ends, and the output module 43 outputs the value of the compression parameter adjusted by the parameter adjustment module 42.
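The iteration just described — double while precision holds, halve once it drops — can be sketched as follows. `compress_and_evaluate` is a hypothetical helper standing in for "compress the target model with this parameter value and return the resulting calculation precision":

```python
def tune_compression_parameter(initial_value, original_precision,
                               compress_and_evaluate):
    """Iteratively adjust the compression parameter.

    `compress_and_evaluate(value)` is an assumed helper that compresses
    the target model with the given compression-parameter value and
    returns the calculation precision of the compressed model.
    """
    value = initial_value
    precision = compress_and_evaluate(value)
    while precision >= original_precision:
        # No precision loss: there may be further compression headroom,
        # so double the compression parameter and try again.
        value *= 2
        precision = compress_and_evaluate(value)
    # Precision loss detected: the last doubling overshot, so halve
    # back to the largest value that preserved precision.
    value /= 2
    return value
```

Doubling and halving are only the example step sizes given in this embodiment; any monotone increase/decrease rule would fit the same loop.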
In addition, in this embodiment of the present application, when the target model is compressed by reducing the number of parameters it uses via the alternating direction method of multipliers (ADMM), the remaining parameter number of an associated layer of the current network layer may be used as a new constraint of the ADMM when the current network layer is compressed. Here, the remaining parameter number of the associated layer refers to the number of parameters that remain after the associated layer has been compressed, and the associated layer of the current network layer may be the previous layer and/or the subsequent layer of the current layer.
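In ADMM-based weight pruning, the projection step of each iteration keeps only the k largest-magnitude weights of a layer. A simplified sketch of that projection is below, with the associated layer's remaining parameter count used as an extra cap on k. The `assoc_remaining` argument and the min-cap rule are illustrative assumptions standing in for the "new constraint" described above, not this application's exact formulation, and the sketch omits the full ADMM iteration:

```python
def admm_projection(weights, keep_ratio, assoc_remaining=None):
    """Projection step used in ADMM-style pruning: zero all but the
    k largest-magnitude weights of a layer (given as a flat list).

    If the remaining parameter number of an associated layer is
    supplied, it further caps k. This is a simplified sketch of the
    projection only, not the full alternating-direction iteration.
    """
    k = int(len(weights) * keep_ratio)
    if assoc_remaining is not None:
        k = min(k, assoc_remaining)  # associated-layer constraint
    if k <= 0:
        return [0.0] * len(weights)
    # Magnitude threshold: the k-th largest absolute value.
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]
```

Tying k for the current layer to what its neighbours retained keeps the pruned widths of adjacent layers consistent, which is the intuition behind using the associated layer's remaining parameters as a constraint.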
Therefore, the data processing scheme provided by this embodiment of the application adjusts the value of the compression parameter according to the change in the calculation precision of the target model before and after compression, and quickly determines a compression parameter value suited to the target model through iteration, thereby achieving a better compression effect while ensuring that the calculation precision of the target model is not lost.
EXAMPLE five
The internal functions and structure of the data processing apparatus, which can be implemented as an electronic device, are described above. Fig. 5 is a schematic structural diagram of an embodiment of an electronic device provided in the present application. As shown in fig. 5, the electronic device includes a memory 51 and a processor 52.
The memory 51 stores programs. In addition to the above-described programs, the memory 51 may also be configured to store other various data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and so forth.
The memory 51 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The processor 52 is not limited to a Central Processing Unit (CPU); it may also be a processing chip such as a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), an embedded neural network processor (NPU), or an Artificial Intelligence (AI) chip. The processor 52 is coupled to the memory 51 and executes the program stored in the memory 51, which, when run, performs the data processing method of the second and third embodiments.
Further, as shown in fig. 5, the electronic device may further include: communication components 53, power components 54, audio components 55, a display 56, and other components. Only some of the components are schematically shown in fig. 5, which does not mean that the electronic device includes only the components shown in fig. 5.
The communication component 53 is configured to facilitate wired or wireless communication between the electronic device and other devices. The electronic device may access a wireless network based on a communication standard, such as WiFi, 3G, 4G, or 5G, or a combination thereof. In an exemplary embodiment, the communication component 53 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 53 further comprises a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
A power supply component 54 provides power to the various components of the electronic device. The power components 54 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for an electronic device.
The audio component 55 is configured to output and/or input audio signals. For example, the audio component 55 includes a Microphone (MIC) configured to receive external audio signals when the electronic device is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 51 or transmitted via the communication component 53. In some embodiments, audio assembly 55 also includes a speaker for outputting audio signals.
The display 56 includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium; when executed, it performs the steps of the method embodiments above. The aforementioned storage medium includes any medium that can store program code, such as a ROM, RAM, magnetic disk, or optical disk.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of data processing, comprising:
acquiring an initial value of a compression parameter and a model precision value corresponding to the initial value, wherein the model precision value is the calculation precision of a target model compressed by the initial value of the compression parameter;
comparing the model precision value with an original precision value, and adjusting the numerical value of the compression parameter according to the comparison result, wherein the original precision value is the calculation precision of the target model before compression;
and outputting the adjusted numerical value of the compression parameter.
2. The data processing method according to claim 1, wherein said comparing the magnitude of the model precision value with the original precision value and adjusting the value of the compression parameter according to the comparison result comprises performing the following iterative operations in a loop:
comparing the magnitude of the model precision value to the original precision value;
when the model precision value is larger than or equal to the original precision value, increasing the numerical value of the compression parameter, compressing the target model by adopting the current value of the compression parameter, and acquiring the calculation precision of the compressed target model to update the model precision value;
and when the model precision value is smaller than the original precision value, reducing the numerical value of the compression parameter, and ending the iterative operation.
3. The data processing method of claim 2, wherein the increasing the value of the compression parameter comprises:
multiplying the value of the compression parameter; or
The reducing the value of the compression parameter comprises:
and carrying out multiplication processing on the numerical value of the compression parameter.
4. The data processing method according to claim 2, wherein the compressing the target model with the current value of the compression parameter includes:
and reducing the number of parameters used by the target model by adopting an alternating direction multiplier method according to the current values of the compression parameters.
5. The data processing method according to claim 4, wherein, when compressing a current network layer of the target model, a remaining parameter number of an associated layer of the current network layer is used as a new constraint of the alternating direction method of multipliers, wherein the remaining parameter number of the associated layer is a number of parameters remaining after the associated layer is compressed.
6. The data processing method of claim 1, wherein the obtaining initial values of compression parameters comprises:
obtaining the type of the target model;
acquiring historical data of compression parameters corresponding to the types;
and determining the initial value of the compression parameter according to the historical data.
7. The data processing method according to claim 6, wherein after said determining an initial value of the compression parameter from the history data, the data processing method further comprises:
receiving feedback input of a user for an initial value determined according to the historical data;
determining an initial value of the compression parameter based on the feedback input and an initial value determined based on the historical data.
8. A data processing apparatus comprising:
the acquisition module is used for compressing an initial value of a parameter and a model precision value corresponding to the initial value, wherein the model precision value is the calculation precision of a target model compressed by adopting the initial value of the compression parameter;
the parameter adjusting module is used for comparing the model precision value with an original precision value and adjusting the numerical value of the compression parameter according to the comparison result, wherein the original precision value is the calculation precision of the target model before compression;
and the output module is used for outputting the adjusted numerical value of the compression parameter.
9. An electronic device, comprising:
a memory for storing a program;
a processor for executing the program stored in the memory, the program when executed performing the data processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program executable by a processor is stored, wherein the program, when executed by the processor, implements a data processing method as claimed in any one of claims 1 to 7.
CN202110581028.9A 2021-05-26 2021-05-26 Data processing method and device, electronic equipment and computer readable storage medium Pending CN113673694A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110581028.9A CN113673694A (en) 2021-05-26 2021-05-26 Data processing method and device, electronic equipment and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN113673694A true CN113673694A (en) 2021-11-19

Family

ID=78538171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110581028.9A Pending CN113673694A (en) 2021-05-26 2021-05-26 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113673694A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170147920A1 (en) * 2014-04-08 2017-05-25 Qiang Huo Deep learning using alternating direction method of multipliers
US10229356B1 (en) * 2014-12-23 2019-03-12 Amazon Technologies, Inc. Error tolerant neural network model compression
CN110163342A (en) * 2019-04-17 2019-08-23 腾讯科技(深圳)有限公司 A kind of model compression method, apparatus, equipment and storage medium
JP2019159694A (en) * 2018-03-12 2019-09-19 Kddi株式会社 Information processing device, information processing method, and program
CN110490323A (en) * 2019-08-20 2019-11-22 腾讯科技(深圳)有限公司 Network model compression method, device, storage medium and computer equipment
CN110766131A (en) * 2019-05-14 2020-02-07 北京嘀嘀无限科技发展有限公司 Data processing device and method and electronic equipment
CN111178520A (en) * 2017-06-15 2020-05-19 北京图森智途科技有限公司 Data processing method and device of low-computing-capacity processing equipment
CN111899757A (en) * 2020-09-29 2020-11-06 南京蕴智科技有限公司 Single-channel voice separation method and system for target speaker extraction
WO2020233130A1 (en) * 2019-05-23 2020-11-26 深圳先进技术研究院 Deep neural network compression method and related device



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code: ref country code: HK; ref legal event code: DE; ref document number: 40069614
TA01 Transfer of patent application right; effective date of registration: 20240304; applicant after: Alibaba Innovation Co., Singapore; applicant before: Alibaba Singapore Holdings Ltd., Room 01, 45th Floor, AXA Building, 8 Shanton Road, Singapore