CN112418393A - Model cutting method and device - Google Patents

Info

Publication number
CN112418393A
Authority
CN
China
Prior art keywords
cutting
model
parameter
clipping
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011147814.XA
Other languages
Chinese (zh)
Inventor
李远辉
王奇刚
舒红乔
邓建林
杨安荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a model clipping method and device. The method comprises: obtaining a computing power weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped comprises at least two clipping layers, the computing power weight represents the share of the model's computing power in the contribution to model clipping, and the parameter quantity weight represents the share of the model's parameter quantity in the contribution to model clipping; analyzing each clipping layer according to the computing power weight and the parameter quantity weight to determine a clipping degree parameter corresponding to that clipping layer; and clipping the model to be clipped based on the clipping degree parameter to obtain a clipped model. Applying the method yields a clipped model with a relatively good acceleration ratio and compression ratio.

Description

Model cutting method and device
Technical Field
The invention relates to the technical field of model processing, and in particular to a model clipping method and device.
Background
Complex models consume large amounts of storage space and computing resources, so they must be compressed before they can be deployed on a given hardware platform. In existing approaches, compression operations are applied directly to every layer of the whole model. The compressed model is then deployed and tested, its actual deployment results are fed back, and compression is repeated according to the feedback metrics. The whole compression process is therefore time-consuming, and it is difficult to obtain a clipped model with a relatively good acceleration ratio and compression ratio.
Disclosure of Invention
Embodiments of the invention provide a model clipping method and device, by which a clipped model with a relatively good acceleration ratio and compression ratio can be obtained.
One aspect of the embodiments of the invention provides a model clipping method, the method including: obtaining a computing power weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped includes at least two clipping layers, the computing power weight represents the share of the model's computing power in the contribution to model clipping, and the parameter quantity weight represents the share of the model's parameter quantity in the contribution to model clipping; analyzing each clipping layer according to the computing power weight and the parameter quantity weight to determine a clipping degree parameter corresponding to that clipping layer; and clipping the model to be clipped based on the clipping degree parameter to obtain a clipped model.
In an embodiment, analyzing each clipping layer according to the computing power weight and the parameter quantity weight to determine the clipping degree parameter corresponding to the clipping layer includes: determining the parameter quantity and the computing power corresponding to the model to be clipped; normalizing the parameter quantity and the computing power to obtain a parameter proportion and a computing power proportion; weighting the computing power proportion by the computing power weight and weighting the parameter proportion by the parameter quantity weight to obtain a weighted computing power and a weighted parameter quantity; and integrating the weighted computing power and the weighted parameter quantity to obtain the clipping degree parameter.
In an embodiment, clipping the model to be clipped based on the clipping degree parameter to obtain the clipped model includes: sorting the clipping degree parameters to obtain first sorting information; screening, according to the first sorting information, the clipping layers that fall within the clipping range; and clipping the clipping layers that fall within the clipping range to obtain the clipped model.
In an embodiment, the clipping is layer clipping; correspondingly, clipping the clipping layers that fall within the clipping range to obtain the clipped model includes: evaluating the importance of the clipping layers that fall within the clipping range to obtain first evaluation information; ranking those clipping layers by importance according to the first evaluation information to obtain second sorting information; and clipping those clipping layers according to the second sorting information to obtain the clipped model.
In an embodiment, the clipping is parameter clipping; correspondingly, clipping the clipping layers that fall within the clipping range to obtain the clipped model includes: obtaining the parameters to be clipped of the clipping layers that fall within the clipping range, and evaluating the importance of the parameters to be clipped to obtain second evaluation information; ranking the parameters to be clipped by importance according to the second evaluation information to obtain third sorting information; and clipping the parameters to be clipped according to the third sorting information to obtain the clipped model.
Another aspect of the embodiments of the invention provides a model clipping device, the device including: an acquisition module configured to obtain a computing power weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped includes at least two clipping layers, the computing power weight represents the share of the model's computing power in the contribution to model clipping, and the parameter quantity weight represents the share of the model's parameter quantity in the contribution to model clipping; an analysis module configured to analyze each clipping layer according to the computing power weight and the parameter quantity weight to determine a clipping degree parameter corresponding to that clipping layer; and a clipping module configured to clip the model to be clipped based on the clipping degree parameter to obtain a clipped model.
In an embodiment, the analysis module includes: a determining submodule configured to determine the parameter quantity and the computing power corresponding to the model to be clipped; a normalization submodule configured to normalize the parameter quantity and the computing power to obtain a parameter proportion and a computing power proportion; and a weighting submodule configured to weight the computing power proportion by the computing power weight and weight the parameter proportion by the parameter quantity weight to obtain the clipping degree parameter.
In an embodiment, the clipping module includes: a sorting submodule configured to sort the clipping degree parameters to obtain first sorting information; a screening submodule configured to screen, according to the first sorting information, the clipping layers that fall within the clipping range; and a clipping submodule configured to clip the clipping layers that fall within the clipping range to obtain the clipped model.
In an embodiment, the clipping is layer clipping; correspondingly, the clipping submodule includes: an evaluation unit configured to evaluate the importance of the clipping layers that fall within the clipping range to obtain first evaluation information; a sorting unit configured to rank those clipping layers by importance according to the first evaluation information to obtain second sorting information; and a clipping unit configured to clip those clipping layers according to the second sorting information to obtain the clipped model.
In an embodiment, the clipping is parameter clipping; correspondingly, the clipping submodule includes: an obtaining unit configured to obtain the parameters to be clipped of the clipping layers that fall within the clipping range, and to evaluate the importance of the parameters to be clipped to obtain second evaluation information; the sorting unit, further configured to rank the parameters to be clipped by importance according to the second evaluation information to obtain third sorting information; and the clipping unit, configured to clip the parameters to be clipped according to the third sorting information to obtain the clipped model.
With the model clipping method and device provided by the embodiments of the invention, the computing power weight and the parameter quantity weight are combined so that the computing power and the parameter quantity of each clipping layer of the model are considered together. The computing power and parameter quantity of each clipping layer are analyzed according to the two weights to determine the clipping degree parameter, and the model is clipped according to that parameter, which achieves a better clipping effect and yields a clipped model with a relatively good acceleration ratio and compression ratio.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
FIG. 1 is a schematic diagram of an implementation flow of a model clipping method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an implementation flow of determining a clipping degree parameter by a model clipping method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an implementation flow of the clipping process of a model clipping method according to an embodiment of the present invention;
FIG. 4 is a schematic view of a scene flow of a model clipping method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an implementation module of a model clipping device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic diagram of an implementation flow of a model clipping method according to an embodiment of the present invention.
Referring to FIG. 1, in one aspect, an embodiment of the present invention provides a model clipping method, the method including: operation 101, obtaining a computing power weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped includes at least two clipping layers, the computing power weight represents the share of the model's computing power in the contribution to model clipping, and the parameter quantity weight represents the share of the model's parameter quantity in the contribution to model clipping; operation 102, analyzing each clipping layer according to the computing power weight and the parameter quantity weight to determine a clipping degree parameter corresponding to that clipping layer; and operation 103, clipping the model to be clipped based on the clipping degree parameter to obtain a clipped model.
In the model clipping method provided by the embodiment of the invention, the computing power and the parameter quantity of each clipping layer are analyzed according to the computing power weight and the parameter quantity weight, so that a clipping degree parameter that takes both into account is determined for each clipping layer of the model. The model can then be clipped according to the clipping degree parameters to achieve a better clipping effect and obtain a clipped model with a relatively good acceleration ratio and compression ratio.
In operation 101, the device applying the method may obtain the computing power weight and the parameter quantity weight by collecting input information from a user. The method does not limit how the input information is obtained; it may be, for example, voice input, text input, or key input. The computing power weight represents the share of the model's computing power in the contribution to model clipping, and the parameter quantity weight represents the share of the model's parameter quantity in the contribution to model clipping. That is, when a user wishes computing power to have more influence on model clipping, the computing power weight may be set greater than the parameter quantity weight, so that during clipping layer analysis the analysis result corresponding to the model's computing power has the greater influence on how the clipping layers are clipped. Conversely, when a user wishes the parameter quantity to have more influence, the parameter quantity weight may be set greater than the computing power weight, so that the analysis result corresponding to the parameter quantity has the greater influence. It should be understood that the computing power weight and the parameter quantity weight may each be set to any value; their relative importance is adjusted, according to user requirements, by adjusting the ratio between them. For example, the ratio of the computing power weight to the parameter quantity weight may be 0.1 : 0.9, 1 : 9, or 8 : 0.1, among others, which are not enumerated here.
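Since only the ratio between the two weights matters, a device could rescale any user-supplied pair so that the weights sum to 1 before using them. The sketch below is one possible way to do this; the helper name `normalize_weights` and the rescaling step itself are illustrative assumptions and are not described in the text.

```python
from typing import Tuple


def normalize_weights(power_weight: float, param_weight: float) -> Tuple[float, float]:
    """Rescale the two user-supplied weights so that they sum to 1.

    Hypothetical helper: the text only states that the ratio between the
    weights matters, so rescaling is an assumption made for illustration.
    """
    if power_weight < 0 or param_weight < 0:
        raise ValueError("weights must be non-negative")
    total = power_weight + param_weight
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return power_weight / total, param_weight / total


# The ratios 0.1 : 0.9 and 1 : 9 express the same preference after rescaling.
ratio_a = normalize_weights(0.1, 0.9)
ratio_b = normalize_weights(1, 9)
```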
In an implementation scenario, the device collects user input and obtains a computing power weight of 0.4 and a parameter quantity weight of 0.6. During clipping layer analysis, the analysis result based on the parameter quantity is combined with the parameter quantity weight of 0.6, and the analysis result based on the computing power is combined with the computing power weight of 0.4, so that the parameter-based result influences the clipping degree parameter more than the computing-power-based result does. The clipping degree parameter is determined from the two weighted analysis results, and the subsequent clipping is performed according to it. In this way, a user can choose the computing power weight and the parameter quantity weight according to the requirements on the model, such as model accuracy, running speed, and amount of compression, so that the resulting clipped model better matches those requirements. The method is applicable to a variety of models, provided that the model to be clipped includes at least two clipping layers, where a clipping layer is a layer whose parameter quantity and computing power can be clipped, such as a convolutional layer. In some cases the model to be clipped also contains layers that the user does not wish to clip, that is, non-clipping layers, such as the output layer or convolutional layers whose parameters are not to be clipped. In such cases the non-clipping layers may be determined in advance so that they do not take part in the subsequent clipping analysis.
It is understood that the non-clipping layers may be specified by the user when the computing power weight and the parameter quantity weight are input, or may be preset by the device; for example, the input layer and the output layer may be preset as non-clipping layers.
In operation 102, it can be appreciated that, as a result of model design, convolutional layers fall mainly into four cases: layers with a large computing power value and a small parameter quantity; layers with a large parameter quantity and a small computing power value; layers in which both are large; and layers in which both are small. The larger the combined value of a layer's computing power and parameter quantity, the greater that layer's effect on the running speed of the model and on the space the model occupies. On this basis, the method analyzes the computing power and parameter quantity of each clipping layer, weights the computing power of each clipping layer by the computing power weight, weights the parameter quantity of each clipping layer by the parameter quantity weight, and integrates the weighted computing power and the weighted parameter quantity to determine the clipping degree parameter of that layer. The clipping degree parameter gives an overall measure of the effect that clipping a given layer has on model clipping: the larger the clipping degree parameter, the more pronounced the overall effect of clipping the corresponding layer.
On this basis, in operation 103 the clipping layers are selectively clipped according to the clipping degree parameter of each layer, yielding a clipped model with a relatively good acceleration ratio and compression ratio. The clipping may be performed by clipping each layer directly according to the magnitude of its clipping degree parameter; by clipping each layer according to the share of its clipping degree parameter in the total clipping degree; by clipping the parameter quantity of a clipping layer according to the share of the clipping degree parameter within that layer; or, when a layer's clipping degree parameter meets a set clipping threshold, by clipping the parameter quantity of that layer by a set proportion. Other ways of clipping the clipping layers according to the clipping degree parameter may also be used.
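Two of the options listed above can be sketched as follows: distributing a removal budget across layers in proportion to each layer's share of the total clipping degree, and applying a fixed clipping proportion only to layers whose clipping degree meets a threshold. The function names, the layer names, and the idea of expressing the budget as a number of parameters are assumptions made for illustration.

```python
from typing import Dict


def proportional_budget(scores: Dict[str, float], total_to_remove: int) -> Dict[str, int]:
    """Split a removal budget across layers by each layer's share of the total score."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    return {name: round(total_to_remove * s / total) for name, s in scores.items()}


def threshold_ratio(scores: Dict[str, float], threshold: float, ratio: float) -> Dict[str, float]:
    """Clip a fixed proportion of the layers whose score meets the threshold."""
    return {name: (ratio if s >= threshold else 0.0) for name, s in scores.items()}


# Hypothetical clipping degree parameters for four layers.
scores = {"conv1": 0.3, "conv2": 0.26, "conv3": 0.24, "conv4": 0.2}
budget = proportional_budget(scores, total_to_remove=1000)
ratios = threshold_ratio(scores, threshold=0.25, ratio=0.5)
```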
To facilitate understanding of the above embodiments, a specific implementation scenario is described below. In this scenario, the model clipping method is applied in a model clipping device that clips a model to be clipped. After obtaining the computing power weight and the parameter quantity weight input by a user, the device obtains the model to be clipped, analyzes the computing power and parameter quantity of each layer, and determines the corresponding computing power value and parameter quantity value. It weights the computing power value by the computing power weight and the parameter quantity value by the parameter quantity weight, integrates the two weighted values to determine the clipping degree parameter of each clipping layer, and clips each clipping layer according to the value of its clipping degree parameter, thereby obtaining a clipped model with a better acceleration ratio and compression ratio relative to the model to be clipped.
Fig. 2 is a schematic diagram of an implementation flow of determining a clipping degree parameter by a model clipping method according to an embodiment of the present invention.
Referring to FIG. 2, in the embodiment of the present invention, operation 102, analyzing each clipping layer according to the computing power weight and the parameter quantity weight to determine the clipping degree parameter corresponding to the clipping layer, includes: operation 1021, determining the parameter quantity and the computing power corresponding to the model to be clipped; operation 1022, normalizing the parameter quantity and the computing power to obtain a parameter proportion and a computing power proportion; operation 1023, weighting the computing power proportion by the computing power weight and weighting the parameter proportion by the parameter quantity weight to obtain a weighted computing power and a weighted parameter quantity; and operation 1024, integrating the weighted computing power and the weighted parameter quantity to obtain the clipping degree parameter.
The clipping layer analysis first determines the parameter quantity and the computing power of the model to be clipped. Computing power (FLOPs) refers to the number of floating-point operations a layer performs (as distinct from FLOPS, floating-point operations per second) and is taken here as an indicator of computing speed. The parameter quantity (Params) refers to the number of parameters in a convolutional layer, that is, the number of parameters in one convolution kernel multiplied by the number of convolution kernels in the layer. Computing power affects how long the network takes to execute, and the parameter quantity affects how much video memory is occupied. Consequently, the parameter quantity value and the computing power value obtained for each clipping layer are in different units: for example, the parameter quantity may be expressed in a storage unit such as kilobytes (KB), whereas the computing power is expressed as an operation count or in a time or speed unit.
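As an illustration of how the two quantities might be computed for a single convolutional layer, the sketch below uses one common counting convention: bias terms are ignored, and each multiply-accumulate at each output position is counted as one operation. The class `ConvLayer`, its field names, and this counting convention are assumptions; the text does not prescribe a particular formula for computing power.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of a 2-D convolutional layer."""
    name: str
    in_channels: int
    out_channels: int   # number of convolution kernels
    kernel_h: int
    kernel_w: int
    out_h: int          # height of the output feature map
    out_w: int          # width of the output feature map

    def params(self) -> int:
        # Parameters in one kernel multiplied by the number of kernels (bias ignored).
        per_kernel = self.kernel_h * self.kernel_w * self.in_channels
        return per_kernel * self.out_channels

    def flops(self) -> int:
        # Assumed convention: one operation per weight per output position.
        return self.params() * self.out_h * self.out_w


layer = ConvLayer("conv1", in_channels=3, out_channels=64,
                  kernel_h=3, kernel_w=3, out_h=112, out_w=112)
layer_params = layer.params()
layer_flops = layer.flops()
```

The example shows why the two measures can diverge: a layer with few parameters can still be expensive if its output feature map is large.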
On this basis, in operation 1022 the method normalizes the parameter quantity and the computing power so that both are evaluated in the same unit. For example, the parameter proportion and the computing power proportion may be obtained as shares of a total: the parameter quantities of all layers are added to give the total parameter quantity, and each layer's parameter quantity is divided by that total to give the layer's parameter proportion; similarly, the computing powers of all layers are added to give the total computing power, and each layer's computing power is divided by that total to give the layer's computing power proportion. The method requires only that the parameter quantity and the computing power be normalized; it does not limit how, and other normalization methods may be used to obtain each layer's parameter proportion and computing power proportion.
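The share-of-total normalization described above can be written in a few lines. The function name `to_shares` and the sample per-layer parameter quantities are hypothetical.

```python
from typing import List


def to_shares(values: List[float]) -> List[float]:
    """Divide each layer's value by the total so that the results sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("the total must be positive")
    return [v / total for v in values]


# Hypothetical per-layer parameter quantities for three clipping layers.
layer_param_counts = [1728, 36864, 73728]
param_shares = to_shares(layer_param_counts)
```

Applying the same function to the per-layer computing power values gives the computing power proportions, after which both quantities are unitless and directly comparable.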
Next, the computing power proportion is weighted by the computing power weight and the parameter proportion is weighted by the parameter quantity weight. The resulting weighted computing power and weighted parameter quantity reflect the user's clipping requirements, which makes the clipping of the model more targeted. Finally, the weighted computing power and the weighted parameter quantity are integrated to obtain the clipping degree parameter; the integration may be performed by multiplication, by addition, or by applying other coefficients. By considering the computing power and parameter quantity of each clipping layer together, the method takes advantage of both: it draws on an analysis of how much each layer's computing power contributes to the model's speed and how much each layer's parameter quantity contributes to the model's compression. This can yield a high acceleration ratio and compression ratio, reduce the time required for subsequent model compression, and improve the quality of the clipped model.
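Operations 1021 to 1024 can be sketched end to end as follows. This is a minimal illustration, assuming share-of-total normalization and assuming that each weight is applied to its own quantity (the computing power weight to the computing power proportion, the parameter quantity weight to the parameter proportion), as in the description of operation 102. The function name `clipping_degree` and the sample values are hypothetical.

```python
from typing import List


def clipping_degree(params: List[float], flops: List[float],
                    power_weight: float, param_weight: float,
                    combine: str = "add") -> List[float]:
    """Return one clipping degree parameter per clipping layer."""
    total_params, total_flops = sum(params), sum(flops)
    scores = []
    for p, f in zip(params, flops):
        weighted_params = param_weight * (p / total_params)  # weighted parameter quantity
        weighted_power = power_weight * (f / total_flops)    # weighted computing power
        if combine == "add":
            scores.append(weighted_power + weighted_params)
        elif combine == "multiply":
            scores.append(weighted_power * weighted_params)
        else:
            raise ValueError("combine must be 'add' or 'multiply'")
    return scores


# Hypothetical per-layer values, with the weights 0.4 and 0.6 from the scenario.
scores = clipping_degree(params=[100, 300, 600], flops=[500, 300, 200],
                         power_weight=0.4, param_weight=0.6)
```

With additive integration and weights that sum to 1, the scores themselves sum to 1, which makes them easy to compare across layers; multiplicative integration instead emphasizes layers that are large in both respects.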
Fig. 3 is a schematic diagram of an implementation flow of a clipping process of the model clipping method according to the embodiment of the present invention.
Referring to FIG. 3, in the embodiment of the present invention, operation 103, clipping the model to be clipped based on the clipping degree parameter to obtain the clipped model, includes: operation 1031, sorting the clipping degree parameters to obtain first sorting information; operation 1032, screening, according to the first sorting information, the clipping layers that fall within the clipping range; and operation 1033, clipping the clipping layers that fall within the clipping range to obtain the clipped model.
It can be understood that the larger a layer's clipping degree parameter, the greater the influence that clipping that layer has on the final clipped model. On this basis, the clipping degree parameters of the layers are first sorted to obtain the first sorting information, which represents the influence that clipping each clipping layer has on the clipped model. The method then screens the clipping layers according to the first sorting information to obtain those that fall within the clipping range. The screening condition may be a set number of top-ranked clipping layers; for example, once the first sorting information is obtained, the first three clipping layers in it are selected for clipping. The screening condition may instead be a clipping threshold: each clipping layer is checked against the threshold, and those that meet it are determined to fall within the clipping range. The screening condition may also require that a layer both meet the clipping threshold and be among the set number of top-ranked layers in the first sorting information. The clipping layers that fall within the clipping range are then clipped to obtain the clipped model. Specifically, the clipping referred to here is clipping of the parameter quantity. In this way, the layers that most influence the clipped model can be identified and clipped, which improves model deployment speed and the degree of compression and reduces the number of actual deployment attempts needed later.
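The three screening conditions described above (a set number of top-ranked layers, a clipping threshold, or both) can be sketched with a single function. The name `screen_layers`, the layer names, and the sample scores are hypothetical.

```python
from typing import Dict, List, Optional


def screen_layers(scores: Dict[str, float],
                  top_k: Optional[int] = None,
                  threshold: Optional[float] = None) -> List[str]:
    """Return the layers within the clipping range, highest clipping degree first."""
    # First sorting information: layer names ordered by descending clipping degree.
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    if threshold is not None:
        ranked = [name for name in ranked if scores[name] >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


scores = {"conv1": 0.26, "conv2": 0.30, "conv3": 0.39, "conv4": 0.05}
first_three = screen_layers(scores, top_k=3)
above_threshold = screen_layers(scores, threshold=0.28)
both = screen_layers(scores, top_k=1, threshold=0.28)
```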
In the embodiment of the invention, the clipping is layer clipping; correspondingly, operation 1033, clipping the clipping layers that fall within the clipping range to obtain the clipped model, includes: first, evaluating the importance of the clipping layers that fall within the clipping range to obtain first evaluation information; then, ranking those clipping layers by importance according to the first evaluation information to obtain second sorting information; and then, clipping those clipping layers according to the second sorting information to obtain the clipped model.
The clipping applied to a clipping layer may clip the parameter quantity within the layer, or it may remove whole clipping layers that fall within the clipping range. Specifically, when the clipping is layer clipping, after the first sorting information is obtained, clipping layers are removed whole, in the order given by the first sorting information, until the clipping meets the user's requirement. It should be added that the method may also set a clipping amount: for example, the memory footprint of the clipped model may be set to a given proportion of that of the model to be clipped, such as 40% or 50%; or the computing speed of the clipped model may be set to a given multiple of that of the model to be clipped, such as 2 times or 3 times. This setting may be preset in the clipping device or determined from a clipping proportion that the device collects from the user.
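Whole-layer clipping against a target footprint can be sketched as a simple loop that removes layers in ranked order until the target is reached. The sketch uses the parameter count as a stand-in for memory footprint, which is an assumption; the function name `clip_whole_layers` and the sample values are hypothetical, and a real network would also need its remaining layers to stay shape-compatible after a layer is removed.

```python
from typing import Dict, List


def clip_whole_layers(layer_params: Dict[str, int], ranked: List[str],
                      target_ratio: float) -> List[str]:
    """Remove layers in ranked order until the remaining parameter count
    is at most target_ratio times the original total."""
    total = sum(layer_params.values())
    remaining = total
    removed = []
    for name in ranked:
        if remaining <= target_ratio * total:
            break  # the requested clipping amount has been reached
        remaining -= layer_params[name]
        removed.append(name)
    return removed


layer_params = {"conv1": 100, "conv2": 300, "conv3": 600}
ranked = ["conv3", "conv2", "conv1"]  # e.g. the first sorting information
removed = clip_whole_layers(layer_params, ranked, target_ratio=0.5)
```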
Furthermore, the method can also evaluate the importance of each clipping layer, and then determine, according to the result of the importance evaluation, whether a clipping layer satisfying the clipping range needs to be clipped. That is, among the clipping layers satisfying the clipping range, those ranked lower in the importance evaluation are clipped. It should be added that the first evaluation information is the result of an overall evaluation of the importance of each clipping layer, and the evaluation result may be integrated according to the parameter quantity of each clipping layer to determine the first evaluation information. For example, according to the first sorting information and the corresponding clipping degree parameters, the first layer is 0.3, the second layer is 0.26, the third layer is 0.24, and the fourth layer is 0.2; according to the second sorting information and the importance evaluation, the first layer is 0.2, the second layer is 0.24, the third layer is 0.26, and the fourth layer is 0.3, where a higher importance corresponds to a larger value. After the first sorting information and the second sorting information are multiplied and integrated, the first layer is 0.06, the second layer is 0.0624, the third layer is 0.0624, and the fourth layer is 0.06. According to this result, the second layer and the first layer can be clipped first. Further, the first sorting information and the second sorting information can be weighted as required, according to the user's requirements on the running precision, running speed, and model size of the clipping model, to obtain the clipping layers that need to be clipped.
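The multiplication in the example above can be checked with a few lines of code. The dictionary keys and the helper name `integrate` are assumptions made for illustration; the sketch only reproduces the per-layer products and does not decide which layers are clipped first.

```python
from typing import Dict

# Values taken from the example: clipping degree parameters and importance evaluations.
clip_degree = {"layer1": 0.30, "layer2": 0.26, "layer3": 0.24, "layer4": 0.20}
importance = {"layer1": 0.20, "layer2": 0.24, "layer3": 0.26, "layer4": 0.30}


def integrate(degree: Dict[str, float], imp: Dict[str, float]) -> Dict[str, float]:
    """Multiply the two per-layer values to obtain one integrated score per layer."""
    return {name: degree[name] * imp[name] for name in degree}


scores = integrate(clip_degree, importance)
# Rounded for display, since floating-point products are not exact decimals.
rounded = {name: round(value, 4) for name, value in scores.items()}
```

The rounded products are 0.06, 0.0624, 0.0624, and 0.06 for the first to fourth layers, matching the figures given in the example.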
In an embodiment of the invention, the clipping processing is parameter clipping. Correspondingly, in operation 1031, clipping the clipping layers satisfying the clipping range to obtain the clipping model includes: first, obtaining the parameters to be clipped of the clipping layers satisfying the clipping range, and evaluating the importance of the parameters to be clipped to obtain second evaluation information; then, sorting the parameters to be clipped by importance according to the second evaluation information to obtain third sorting information; and then, clipping the parameters to be clipped according to the third sorting information to obtain the clipping model.
In another clipping manner, the parameters of each clipping layer are clipped. In this manner, importance evaluation is performed on each parameter to be clipped in the clipping layers satisfying the clipping range, to obtain second evaluation information corresponding to each parameter to be clipped. The parameters to be clipped are then sorted by importance according to the second evaluation information to obtain third sorting information; it should be added that the higher the importance is, the larger the value of the corresponding second evaluation information is. Based on the order of the third sorting information, the parameters to be clipped are clipped in sequence, starting from the least important. For example, in one case, the clipping layers satisfying the clipping range are, in sequence, the first layer, the second layer, and the third layer. If, after the parameters to be clipped are sorted by importance, the importance of the third parameter in the second layer is 0.001, that of the fourth parameter in the third layer is 0.002, and that of the fifth parameter in the first layer is 0.003, then the third parameter in the second layer, the fourth parameter in the third layer, and the fifth parameter in the first layer may be clipped first. Further, the method can also set a maximum clipping ratio for each layer: for example, when the clipped parameters of one clipping layer exceed a set ratio, such as 80%, of the total parameters to be clipped of that layer, clipping of that layer is stopped.
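A minimal sketch of this parameter clipping order is given below, assuming that each layer is represented by a list of per-parameter importance values. The function name, the data layout, and the choice to skip a layer once one more clipped parameter would exceed its maximum ratio are assumptions; the importance values other than 0.001, 0.002, and 0.003 are made up for the example.

```python
from typing import Dict, List, Tuple


def clip_parameters(importance: Dict[str, List[float]],
                    n_clip: int,
                    max_ratio: float = 0.8) -> List[Tuple[str, int]]:
    """Pick up to n_clip parameters, least important first (hypothetical helper)."""
    # Third sorting information: all candidate parameters ordered by importance.
    candidates = sorted(
        (score, layer, index)
        for layer, scores in importance.items()
        for index, score in enumerate(scores)
    )
    clipped_per_layer = {layer: 0 for layer in importance}
    clipped: List[Tuple[str, int]] = []
    for _score, layer, index in candidates:
        if len(clipped) == n_clip:
            break
        limit = max_ratio * len(importance[layer])
        if clipped_per_layer[layer] + 1 > limit:
            continue  # this layer has reached its maximum clipping ratio
        clipped_per_layer[layer] += 1
        clipped.append((layer, index))
    return clipped


# Zero-based indices: index 2 is the third parameter, index 3 the fourth, index 4 the fifth.
example = {
    "layer1": [0.9, 0.8, 0.7, 0.6, 0.003],
    "layer2": [0.5, 0.4, 0.001, 0.3, 0.2],
    "layer3": [0.9, 0.8, 0.7, 0.002, 0.6],
}
first_three = clip_parameters(example, n_clip=3)
```

With these values, the first three parameters selected are the third parameter of the second layer, the fourth parameter of the third layer, and the fifth parameter of the first layer, in that order.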
Fig. 4 is a schematic view of a scene flow of a model clipping method according to an embodiment of the present invention.
Referring to Table 1 and Fig. 4, to facilitate understanding of the above embodiments, a specific implementation scenario is described below. In this scenario, the method is applied to a model clipping device and is used to clip the YOLOv3-tiny detection model.
First, the calculation force weight (m) of 40% and the parameter quantity weight (n) of 60% set by the user are obtained. Then, the device calculates the calculation force and the parameter quantity according to the input size of the test picture; for example, when the size of the test picture is 416 × 3, the original calculation force F and the original parameter quantity P of each convolution layer are obtained. Next, the original calculation force F and the original parameter quantity P are normalized, and the normalized calculation force and the normalized parameter quantity are weighted according to the calculation force weight of 40% and the parameter quantity weight of 60% to obtain a weighted calculation force F' and a weighted parameter quantity P'. The weighted calculation force F' and the weighted parameter quantity P' are then added to obtain the clipping degree parameter R corresponding to each layer, and the layers are sorted by the magnitude of R to obtain the ordering of the layers by clipping degree parameter. Finally, a certain number of the top-ranked clipping layers are extracted according to a set proportion for layer clipping or parameter clipping; for example, the convolution layers in the top 50% of the ordering are extracted, and 50% of the parameter quantity of these convolution layers is clipped to obtain the clipping model.
Table 1 is as follows:
[Table 1 is provided as image BDA0002740230810000121 in the original publication and is not reproduced here.]
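The calculation of the clipping degree parameter R in this scenario can be sketched as follows. The sketch assumes that normalization means dividing each layer's value by the sum over all layers, and that the normalized calculation force is multiplied by m and the normalized parameter quantity by n; the function names and the per-layer numbers are hypothetical and are not taken from Table 1.

```python
import math
from typing import Dict, List


def clipping_degree(flops: Dict[str, float],
                    params: Dict[str, float],
                    m: float = 0.4,
                    n: float = 0.6) -> Dict[str, float]:
    """R = m * F_normalized + n * P_normalized for each layer (assumed formula)."""
    total_f = sum(flops.values())
    total_p = sum(params.values())
    return {
        name: m * flops[name] / total_f + n * params[name] / total_p
        for name in flops
    }


def top_fraction(scores: Dict[str, float], fraction: float = 0.5) -> List[str]:
    """Return the layers in the top fraction of the ordering, largest R first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[: math.ceil(len(ranked) * fraction)]


# Hypothetical original calculation force F and parameter quantity P per convolution layer.
F = {"conv1": 100, "conv2": 200, "conv3": 300, "conv4": 400}
P = {"conv1": 35, "conv2": 15, "conv3": 30, "conv4": 20}

R = clipping_degree(F, P)        # conv1 ~ 0.25, conv2 ~ 0.17, conv3 ~ 0.30, conv4 ~ 0.28
selected = top_fraction(R, 0.5)  # layers in the top 50% of the ordering
```

With sum normalization, the R values of all layers add up to m + n, so they can be read as each layer's share of the overall clipping contribution.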
Fig. 5 is a schematic diagram of the implementation modules of a model clipping device according to an embodiment of the present invention.
Referring to Fig. 5, another aspect of the embodiments of the present invention provides a model clipping device, including: an obtaining module 501, configured to obtain a calculation force weight and a parameter quantity weight corresponding to a model to be clipped, where the model to be clipped includes at least two clipping layers, the calculation force weight represents the ratio of the model calculation force to the model clipping contribution degree, and the parameter quantity weight represents the ratio of the model parameter quantity to the model clipping contribution degree; an analysis module 502, configured to analyze each clipping layer according to the calculation force weight and the parameter quantity weight to determine the clipping degree parameter corresponding to the clipping layer; and a clipping module 503, configured to clip the model to be clipped based on the clipping degree parameter to obtain a clipping model.
In an embodiment of the present invention, the analyzing module 502 includes: the determining submodule 5021 is used for determining the parameter quantity and the computing power corresponding to the model to be cut; the normalization submodule 5022 is used for carrying out normalization processing on the parameter quantity and the calculation force to obtain the parameter proportion and the calculation proportion; the weighting submodule 5023 is used for weighting the parameter proportion through the calculation force weight value and weighting the calculation proportion through the parameter weight value to obtain the cutting degree parameter.
In this embodiment of the present invention, the clipping module 503 includes: a sorting submodule 5031, configured to sort the clipping degree parameter to obtain first sorting information; a screening submodule 5032 configured to screen a clipping layer that meets the clipping range according to the first ordering information; a clipping submodule 5033, configured to perform clipping processing on the clipping layer that meets the clipping range, to obtain a clipping model.
In the embodiment of the invention, the cutting treatment is layer cutting; accordingly, cropping sub-module 5033 comprises: an evaluation unit 50331 configured to perform importance evaluation on the clipping layer that satisfies the clipping range to obtain first evaluation information; a sorting unit 50332, configured to perform importance sorting on the clipping layers meeting the clipping range according to the first evaluation information to obtain second sorting information; a clipping unit 50333, configured to clip the clipping layer meeting the clipping range according to the second sorting information, so as to obtain a clipping model.
In an embodiment of the invention, the clipping processing is parameter clipping; accordingly, the clipping submodule 5033 includes: an obtaining unit 50334, configured to obtain the parameters to be clipped of the clipping layers satisfying the clipping range, and to evaluate the importance of the parameters to be clipped to obtain second evaluation information; the sorting unit 50332, further configured to sort the parameters to be clipped by importance according to the second evaluation information to obtain third sorting information; and the clipping unit 50333, configured to clip the parameters to be clipped according to the third sorting information to obtain a clipping model.
In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method of model clipping, the method comprising:
obtaining a calculation force weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped comprises at least two clipping layers; the calculation force weight is used for representing the ratio of the model calculation force to the model clipping contribution degree, and the parameter quantity weight is used for representing the ratio of the model parameter quantity to the model clipping contribution degree;
analyzing each clipping layer according to the calculation force weight and the parameter quantity weight to determine a clipping degree parameter corresponding to the clipping layer;
and cutting the model to be cut based on the cutting degree parameter to obtain a cutting model.
2. The method of claim 1, wherein analyzing each clipping layer according to the calculation force weight and the parameter quantity weight to determine a clipping degree parameter corresponding to the clipping layer comprises:
determining the parameter quantity and the calculation force corresponding to the model to be cut;
carrying out normalization processing on the parameter quantity and the calculation force to obtain a parameter specific gravity and a calculation specific gravity;
weighting the parameter proportion through a calculation force weight value, and weighting the calculation proportion through the parameter weight value to obtain a weighted calculation force and a weighted parameter quantity;
and integrating the weighted calculation force and the weighted parameters to obtain a clipping degree parameter.
3. The method according to claim 1, wherein clipping the model to be clipped based on the clipping degree parameter to obtain a clipping model comprises:
sequencing the cutting degree parameters to obtain first sequencing information;
screening a cutting layer meeting a cutting range according to the first sequencing information;
and cutting the cutting layer meeting the cutting range to obtain a cutting model.
4. The method of claim 3, wherein the clipping processing is layer clipping;
correspondingly, clipping the clipping layer meeting the clipping range to obtain a clipping model comprises:
performing importance evaluation on the cutting layer meeting the cutting range to obtain first evaluation information;
according to the first evaluation information, importance ranking is carried out on the cutting layers meeting the cutting range, and second ranking information is obtained;
and cutting the cutting layer meeting the cutting range according to the second sequencing information to obtain a cutting model.
5. The method according to claim 3, wherein the clipping process is parametric clipping;
correspondingly, clipping the clipping layer meeting the clipping range to obtain a clipping model comprises:
obtaining a parameter to be cut of the cutting layer meeting the cutting range, and performing importance evaluation on the parameter to be cut to obtain second evaluation information;
according to the second evaluation information, importance ranking is carried out on the parameters to be cut, and third ranking information is obtained;
and cutting the parameters to be cut according to the third sequencing information to obtain a cutting model.
6. A model clipping device characterized in that it comprises:
the acquisition module is used for acquiring a calculation force weight and a parameter quantity weight corresponding to a model to be clipped, wherein the model to be clipped comprises at least two clipping layers; the calculation force weight is used for representing the ratio of the model calculation force to the model clipping contribution degree, and the parameter quantity weight is used for representing the ratio of the model parameter quantity to the model clipping contribution degree;
the analysis module is used for analyzing each clipping layer according to the calculation force weight and the parameter quantity weight so as to determine a clipping degree parameter corresponding to the clipping layer;
and the cutting module is used for cutting the model to be cut based on the cutting degree parameter to obtain a cutting model.
7. The apparatus of claim 6, wherein the analysis module comprises:
the determining submodule is used for determining the parameter quantity and the computing power corresponding to the model to be cut;
the normalization submodule is used for carrying out normalization processing on the parameter quantity and the calculation force to obtain a parameter specific gravity and a calculation specific gravity;
and the weighting submodule is used for weighting the parameter proportion by the calculation force weight value and weighting the calculated proportion by the parameter weight value to obtain the cutting degree parameter.
8. The apparatus of claim 6, wherein the clipping module comprises:
the sequencing submodule is used for sequencing the cutting degree parameters to obtain first sequencing information;
the screening submodule is used for screening the cutting layers meeting the cutting range according to the first sequencing information;
and the cutting submodule is used for cutting the cutting layer meeting the cutting range to obtain a cutting model.
9. The apparatus of claim 8, wherein the clipping processing is layer clipping;
correspondingly, the clipping submodule includes:
the evaluation unit is used for evaluating the importance of the cutting layer meeting the cutting range to obtain first evaluation information;
the ordering unit is used for ordering the importance of the cutting layers meeting the cutting range according to the first evaluation information to obtain second ordering information;
and the cutting unit is used for cutting the cutting layer meeting the cutting range according to the second sequencing information to obtain a cutting model.
10. The apparatus of claim 8, wherein the clipping process is parametric clipping;
correspondingly, the clipping submodule includes:
the obtaining unit is used for obtaining the parameter to be cut of the cutting layer meeting the cutting range, and performing importance evaluation on the parameter to be cut to obtain second evaluation information;
the sorting unit is further configured to perform importance sorting on the parameters to be cut according to the second evaluation information to obtain third sorting information;
and the cutting unit is used for cutting the parameters to be cut according to the third sequencing information to obtain a cutting model.
CN202011147814.XA 2020-10-23 2020-10-23 Model cutting method and device Pending CN112418393A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011147814.XA CN112418393A (en) 2020-10-23 2020-10-23 Model cutting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011147814.XA CN112418393A (en) 2020-10-23 2020-10-23 Model cutting method and device

Publications (1)

Publication Number Publication Date
CN112418393A true CN112418393A (en) 2021-02-26

Family

ID=74841076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011147814.XA Pending CN112418393A (en) 2020-10-23 2020-10-23 Model cutting method and device

Country Status (1)

Country Link
CN (1) CN112418393A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921287A (en) * 2018-07-12 2018-11-30 开放智能机器(上海)有限公司 A kind of optimization method and system of neural network model
CN110472736A (en) * 2019-08-26 2019-11-19 联想(北京)有限公司 A kind of method and electronic equipment cutting neural network model
CN110598731A (en) * 2019-07-31 2019-12-20 浙江大学 Efficient image classification method based on structured pruning
CN110766131A (en) * 2019-05-14 2020-02-07 北京嘀嘀无限科技发展有限公司 Data processing device and method and electronic equipment
CN111144561A (en) * 2018-11-05 2020-05-12 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
CN111488990A (en) * 2020-04-17 2020-08-04 苏州浪潮智能科技有限公司 Model clipping method, device, equipment and medium based on performance perception
CN111598250A (en) * 2019-02-20 2020-08-28 北京奇虎科技有限公司 Model evaluation method, model evaluation device, computer equipment and storage medium
US20200311552A1 (en) * 2019-03-25 2020-10-01 Samsung Electronics Co., Ltd. Device and method for compressing machine learning model

Similar Documents

Publication Publication Date Title
CN111031346B (en) Method and device for enhancing video image quality
CN101695141B (en) Method and device for evaluating video quality
CN110610723B (en) Method, device, equipment and storage medium for evaluating sound quality in vehicle
EP1342380A2 (en) System and method for providing a scalable dynamic objective metric for automatic video quality evaluation
CN111751119B (en) Automobile acceleration sound quality evaluation method based on sound order frequency characteristics
US20180321194A1 (en) Method for estimating a variation in preload applied to linear guideway
CN110173857A (en) Control method, air conditioner and the computer readable storage medium of air conditioner
DE102014118075A1 (en) Audio and video synchronizing perception model
CN111277274A (en) Data compression method, device, equipment and storage medium
CN114355094B (en) Product reliability weak link comprehensive evaluation method and device based on multi-source information
KR102583854B1 (en) Process monitoring apparatus for judging the defectiveness of cables produced through harness cable production process, and the operating method thereof
CN114972232A (en) No-reference image quality evaluation method based on incremental meta-learning
CN112418393A (en) Model cutting method and device
CN110377821A (en) Generate method, apparatus, computer equipment and the storage medium of interest tags
CN112000803B (en) Text classification method and device, electronic equipment and computer readable storage medium
CN113158022A (en) Service recommendation method, device, server and storage medium
Abouelaziz et al. A blind mesh visual quality assessment method based on convolutional neural network
CN111428125B (en) Ordering method, ordering device, electronic equipment and readable storage medium
CN112665810A (en) Method and system for determining chip vibration falling, storage medium and electronic equipment
CN111783843A (en) Feature selection method and device and computer system
CN115307721A (en) Method, device and equipment for evaluating quality of automobile acceleration sound and storage medium
CN115936099A (en) Weight compression and integration standard pruning method for neural network
Torija et al. Subjective dominance as a basis for selecting frequency weightings
CN106384598A (en) Noise quality determination method and device
CN111160530A (en) Compression processing method and device of model and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination