CN116468100A - Residual pruning method, residual pruning device, electronic equipment and readable storage medium - Google Patents


Info

Publication number
CN116468100A
Authority
CN
China
Prior art keywords
target
residual
calculation
pruning
operator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310213879.7A
Other languages
Chinese (zh)
Other versions
CN116468100B (en)
Inventor
蒯文啸
张法朝
唐剑
奉飞飞
刘宁
童虎庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Midea Group Co Ltd
Midea Group Shanghai Co Ltd
Original Assignee
Midea Group Co Ltd
Midea Group Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Midea Group Co Ltd, Midea Group Shanghai Co Ltd filed Critical Midea Group Co Ltd
Priority to CN202310213879.7A priority Critical patent/CN116468100B/en
Publication of CN116468100A publication Critical patent/CN116468100A/en
Application granted granted Critical
Publication of CN116468100B publication Critical patent/CN116468100B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The application provides a residual pruning method, a residual pruning device, an electronic device and a readable storage medium. The residual pruning method comprises the following steps: searching for a target node in a computation graph corresponding to a first model, where the target node corresponds to a residual connection structure in the first model; traversing the computation graph based on the target node, and determining a target convolution layer in the computation graph; and pruning the target convolution layer.

Description

Residual pruning method, residual pruning device, electronic equipment and readable storage medium
Technical Field
The application belongs to the technical field of model processing, and particularly relates to a residual pruning method, a residual pruning device, electronic equipment and a readable storage medium.
Background
Channel pruning aims to compress the redundant weights in a model, reducing the model's size while improving its inference speed at run time.
In the related art, automatic pruning schemes require the user to manually enter the names of the convolution layers to be pruned, which makes automatic pruning of a model inefficient.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the prior art or the related art.
To this end, a first aspect of the present application proposes a residual pruning method.
A second aspect of the present application proposes a residual pruning device.
A third aspect of the present application proposes a residual pruning device.
A fourth aspect of the present application proposes a computer program product.
A fifth aspect of the present application proposes a readable storage medium.
A sixth aspect of the present application proposes an electronic device.
In view of this, according to a first aspect of the present application, a residual pruning method is proposed, comprising: searching for a target node in a computation graph corresponding to a first model, where the target node corresponds to a residual connection structure in the first model; traversing the computation graph based on the target node, and determining a target convolution layer in the computation graph; and pruning the target convolution layer.
According to a second aspect of the present application, a residual pruning device is provided, comprising: a searching module for searching for a target node in the computation graph corresponding to the first model, the target node corresponding to the residual connection structure in the first model; a determining module for traversing the computation graph based on the target node and determining a target convolution layer in the computation graph; and a processing module for pruning the target convolution layer.
According to a third aspect of the present application, a residual pruning device is provided, comprising: a memory in which a program or instructions are stored; and a processor executing the program or instructions stored in the memory to implement the steps of the residual pruning method of any one of the first aspects. The device therefore has all the beneficial technical effects of that method, which are not repeated here.
According to a fourth aspect of the present application, a computer program product is provided which, when executed by a processor, implements the steps of the residual pruning method of any one of the first aspects, and therefore has all the beneficial technical effects of that method, which are not repeated here.
According to a fifth aspect of the present application, a readable storage medium is provided, on which a program or instructions are stored which, when executed by a processor, implement the steps of the residual pruning method of any one of the first aspects. The medium therefore has all the beneficial technical effects of that method, which are not repeated here.
According to a sixth aspect of the present application, an electronic device is provided, comprising the residual pruning device of the second or third aspect and/or the readable storage medium of the fifth aspect, and therefore has all the beneficial technical effects of the residual pruning device of the second or third aspect and/or the readable storage medium of the fifth aspect, which are not repeated here.
In the technical scheme of the present application, the first model is converted into a computation graph structure, the target node is located in that structure, and the target convolution layer that needs pruning is found through the target node and thereby located automatically. Automatic pruning is then performed on the located target convolution layer. In this way, the connection relations of the convolution layers in the model's residual connection structure are obtained automatically, so the user does not need to manually enter the names of the convolution layers in the model, which improves the efficiency of model pruning while ensuring its accuracy.
Additional aspects and advantages of the present application will become apparent in the following description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, wherein:
FIG. 1 illustrates one of the schematic flow diagrams of the residual pruning method provided in some embodiments of the present application;
FIG. 2 illustrates one of the schematic diagrams of residual connection structures provided in some embodiments of the present application;
FIG. 3 illustrates a second schematic flow diagram of a residual pruning method provided in some embodiments of the present application;
FIG. 4 illustrates a third schematic flow chart of a residual pruning method provided in some embodiments of the present application;
FIG. 5 illustrates a schematic flow diagram of a method of finding a target convolution layer provided in some embodiments of the present application;
FIG. 6 illustrates a fourth schematic flow diagram of a residual pruning method provided in some embodiments of the present application;
FIG. 7 illustrates a fifth schematic flow diagram of a residual pruning method provided in some embodiments of the present application;
FIG. 8 illustrates a sixth schematic flow diagram of a residual pruning method provided in some embodiments of the present application;
FIG. 9 illustrates a second schematic diagram of a residual connection provided in some embodiments of the present application;
FIG. 10 illustrates a third schematic diagram of a residual connection provided in some embodiments of the present application;
FIG. 11 shows a fourth schematic diagram of a residual connection provided in some embodiments of the present application;
FIG. 12 illustrates a block diagram of a residual pruning device provided in some embodiments of the present application;
FIG. 13 illustrates a block diagram of a residual pruning device provided in some embodiments of the present application;
FIG. 14 illustrates a block diagram of an electronic device provided in some embodiments of the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application may be more clearly understood, a more particular description of the application is rendered below with reference to the appended drawings and the detailed description. It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced otherwise than as described herein, and thus the scope of the present application is not limited by the specific embodiments disclosed below.
Residual pruning methods, apparatuses, readable storage media and electronic devices according to some embodiments of the present application are described below with reference to fig. 1 to 14.
According to one embodiment of the present application, as shown in fig. 1, a residual pruning method is provided, including:
Step 102, searching for a target node in a computation graph corresponding to a first model, where the target node corresponds to a residual connection structure in the first model;
Step 104, traversing the computation graph based on the target node, and determining a target convolution layer in the computation graph;
Step 106, pruning the target convolution layer.
The residual pruning method is used for pruning operators in the residual connection structure in the first model.
In this embodiment, the first model is a data processing model; illustratively, the first model may be an image classification model, an object detection model, or the like, for example the ResNet model in image classification (an image classification model) or the YOLOv7-tiny model in object detection (an object detection model). The computation graph is constructed based on the first model, and the operator nodes in the computation graph correspond one-to-one to the operators in the first model; expanding the first model into a computation graph makes it convenient to search for the corresponding target node. The target node is an operator node in a residual connection structure of the first model; once the target node is found, the target convolution layer can be located through it. The target convolution layer is the part of the residual network structure that needs pruning; by performing residual pruning on the target convolution layer, the redundant weights in the first model can be compressed, reducing the model's size while improving its running speed. A residual connection structure means that the input of a certain convolution layer comes from the superposition of the outputs of two convolution layers above it. It should be noted that the residual structure is introduced when building a network to alleviate the vanishing-gradient problem: the features extracted by different convolution layers are combined to enrich the extracted information, which appears in the graph as an addition, i.e. an add operator at the operator level. The add operator sums the outputs of two or more convolutions as the input of the next convolution, which is simply an addition operation during the model's forward inference.
If the add operation is given no special treatment, the numbers of output channels of its input convolution layers become inconsistent after pruning, and the summation fails. The model size is reduced by eliminating redundant parameters that have little influence on accuracy.
As shown in fig. 9, illustratively, the add operator represents a residual connection, so the add operator is taken as the target node; after the add operator is found, the corresponding convolution layers are found based on it. The input of ConvC is the sum of the outputs of ConvA and ConvB, so the structure formed by ConvA, ConvB and ConvC contains the target convolution layers that need pruning, where ConvA, ConvB, ConvC and ConvD are all convolution layers. It can be seen that the two inputs of the add operator are ConvA and ConvB, and the input of ConvC is the output of the add operator, so the add operator together with ConvA, ConvB and ConvC can be determined to be a residual connection structure.
As shown in fig. 10, the inputs of the add operator are layer1.0.conv2 and conv1, and the output of the add operator is layer1.1.conv1. Since both the inputs and the output of the add operator are convolution operators, the structure formed by the inputs and output of the add operator can be determined to be a residual connection structure.
As shown in fig. 11, the two inputs of the add_1 operator are layer1_0_relu and layer1_1_bn2; that is, the two inputs of the add_1 operator are a BN operator (normalization operator) and a relu operator (activation operator). A residual connection structure must contain convolution layers in both the input and output directions of the add operator, so if no convolution layer is found when searching along a given direction, the search continues along that direction. Since no convolution layer in the model is reached here, a complete residual connection structure has not been found, and the structure formed by the inputs and output of the add_1 operator is determined not to be a residual connection structure. Specifically, the first model is expanded into a computation graph to determine the inference structure of the first model. By traversing each operator node in the computation graph, the target node can be located. After the target node is found, the computation graph is traversed along the input and output directions with the target node as the starting point, so as to find a target convolution layer that can serve as the pruning object, and pruning is performed on that target convolution layer.
Illustratively, the target node may be an add operator node, such as add, add_1, and so on. Taking the operator node named add as an example, the "opcode" of the add operator is "call_function", and the two inputs received by the add operator are (layer1_0_bn2, relu); however, neither the BN operator (normalization operator) nor the relu operator (activation operator) is a convolution layer, and only convolution layers are pruning objects. The search therefore continues along the data stream until the corresponding convolution layer is found. For example, the PyTorch code "if isinstance(a, torch.nn.Conv2d)" determines whether the module object "a" is a target convolution layer.
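For illustration, the node search described above can be sketched on a toy graph structure. This is a simplified stand-in for a traced computation graph (such as a torch.fx graph, whose nodes carry an opcode and inputs); the Node class, its field names, and the tiny residual block below are hypothetical, not the patent's actual implementation:

```python
class Node:
    """Toy operator node: a stand-in for a traced graph node (e.g. torch.fx),
    carrying an opcode-like field and references to its input nodes."""
    def __init__(self, name, op, target=None, inputs=()):
        self.name = name        # node name, e.g. "add"
        self.op = op            # e.g. "call_module" or "call_function"
        self.target = target    # module type or function name
        self.inputs = list(inputs)

def find_add_nodes(graph):
    """Locate the target nodes: function-call nodes performing an addition."""
    return [n for n in graph
            if n.op == "call_function" and n.target == "add"]

# A minimal residual block: ConvA and ConvB feed an add, which feeds ConvC.
conv_a = Node("ConvA", "call_module", "Conv2d")
conv_b = Node("ConvB", "call_module", "Conv2d")
add_n = Node("add", "call_function", "add", inputs=[conv_a, conv_b])
conv_c = Node("ConvC", "call_module", "Conv2d", inputs=[add_n])
graph = [conv_a, conv_b, add_n, conv_c]

print([n.name for n in find_add_nodes(graph)])  # prints ['add']
```

A real implementation would instead iterate over the nodes of a symbolically traced module and test each module object with the isinstance check quoted above.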
In the embodiment of the application, the first model is converted into a computation graph structure, the target node is located in that structure, and the target convolution layer that needs pruning is found through the target node and thereby located automatically. Automatic pruning is then performed on the located target convolution layer. In this way, the connection relations of the convolution layers in the model's residual connection structure are obtained automatically, so the user does not need to manually enter the names of the convolution layers in the model, and the efficiency of model pruning is improved while its accuracy is ensured.
As shown in fig. 3, in the above embodiment, traversing the computation graph based on the target node and determining the target convolution layer in the computation graph includes:
step 302, traversing the computational graph along the input direction and the output direction of the target node until a first operator structure corresponding to the target node is determined;
step 304, determining the first operator structure as a target convolution layer under the condition that the first operator structure is matched with a preset operator structure.
In this embodiment, a specific process of traversing the computational graph by the target node to find the target convolutional layer structure is given.
In this embodiment, after the target node in the computation graph is found, since the target node is an operator node in the residual connection structure, it is taken as the starting point for traversing the computation graph. The graph is traversed along the output direction and the input direction of the operator node to find the first operator structure corresponding to the target node; when the first operator structure matches the preset operator structure, it is determined to be the target convolution layer, thereby automatically finding the target convolution layer to be pruned.
Illustratively, the target node is an add operator node; after the add operator node is located, its input direction and output direction are traversed. The names of the input nodes and output nodes corresponding to the add operator node are looked up in the network to obtain the concrete structures they correspond to in the model; these structures are the first operator structures. If a first operator structure is a convolution layer structure, it is determined to be the target convolution layer; if it is not, the search continues with the current input node or output node as the new starting point, until a first operator structure matching the preset operator structure is found.
In the process of traversing the computation graph, the graph may be traversed from the input direction first and then along the output direction, or from the output direction first and then along the input direction.
Illustratively, where the target node includes a plurality of output paths, the plurality of output paths are ordered and traversed sequentially.
In the embodiment of the application, since the target node is an operator node within the target convolution layer structure, the computation graph needs to be traversed in both the input direction and the output direction. A first operator structure containing the target node can be found by this traversal, and when the found first operator structure matches the preset operator structure, it is judged to be the target convolution layer, so that the corresponding target convolution layer is found automatically based on the target node.
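The two-direction traversal of steps 302 and 304 can be sketched as follows. The graph is a hand-built dictionary stand-in (node names loosely follow fig. 10, but the BN node and the exact adjacency are assumed for illustration); detecting a convolution mirrors the isinstance check on torch.nn.Conv2d mentioned earlier:

```python
def find_target_convs(start, op_of, neighbors_in, neighbors_out):
    """From a target (add) node, walk the input direction and the output
    direction, skipping non-convolution operators (BN, relu, ...), until
    convolution layers are reached in each direction.
    op_of: node name -> operator type; neighbors_*: adjacency dictionaries."""
    def walk(neighbors):
        found, stack = [], list(neighbors.get(start, []))
        while stack:
            n = stack.pop()
            if op_of[n] == "Conv2d":      # first operator structure matches
                found.append(n)
            else:                         # not a conv layer: keep searching
                stack.extend(neighbors.get(n, []))
        return found
    return walk(neighbors_in) + walk(neighbors_out)

op_of = {"conv1": "Conv2d", "layer1.0.conv2": "Conv2d", "add": "add",
         "bn": "BatchNorm2d", "layer1.1.conv1": "Conv2d"}
neighbors_in = {"add": ["conv1", "layer1.0.conv2"]}
neighbors_out = {"add": ["bn"], "bn": ["layer1.1.conv1"]}

print(sorted(find_target_convs("add", op_of, neighbors_in, neighbors_out)))
# prints ['conv1', 'layer1.0.conv2', 'layer1.1.conv1']
```

The bn node in the output direction is skipped, and the search continues one step further until a convolution is reached, matching the behavior described for figs. 9 to 11.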
As shown in fig. 4, in any of the above embodiments, the number of target nodes is at least two, and traversing the computation graph along the input direction and the output direction of the target nodes to determine the first operator structures corresponding to the target nodes includes:
step 402, obtaining the arrangement sequence of at least two target nodes;
step 404, traversing the computation graph by taking at least two target nodes as starting points in sequence along the arrangement order.
In this embodiment, when a plurality of target nodes are found to be included in the computation graph, a corresponding first operator structure is sequentially found for each target node according to the arrangement order, and whether each target node corresponds to a target convolution layer is sequentially found.
It should be noted that the operator structure corresponding to a target convolution layer must contain a target node, but not every target node corresponds to the operator structure of a target convolution layer.
Specifically, the computational graph comprises a plurality of target nodes, the arrangement sequence of the plurality of target nodes is obtained, and each target node is sequentially used as a starting point to traverse the computational graph so as to find a corresponding target convolution layer.
Illustratively, the computation graph includes two target nodes, node 1 and node 2, and their arrangement order is determined to be node 1 followed by node 2. First, the computation graph is traversed in the output and input directions with node 1 as the starting point, and target convolution layer 1 is found after node 1 has been traversed. Then, the computation graph is traversed in the output and input directions with node 2 as the starting point, and target convolution layer 2 is found after node 2 has been traversed.
In the embodiment of the application, when the computation graph includes a plurality of target nodes, the target nodes are traversed in sequence and the target convolution layer corresponding to each target node is found, so that all target convolution layers needing pruning in the computation graph can be found.
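The ordered traversal of steps 402 and 404 can be sketched as follows, with a hypothetical lookup table standing in for the real per-node graph traversal described in the text:

```python
def traverse_in_order(target_nodes, convs_for_node):
    """Visit the target nodes in their arrangement order and collect the
    target convolution layers found for each one. convs_for_node is a
    hypothetical stand-in for the per-node traversal of the graph."""
    found = {}
    for node in target_nodes:   # node 1 first, then node 2, ...
        found[node] = convs_for_node.get(node, [])
    return found

order = ["node1", "node2"]      # arrangement order of the target nodes
convs = {"node1": ["target conv layer 1"], "node2": ["target conv layer 2"]}
print(traverse_in_order(order, convs))
```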
As shown in fig. 5, in some possible embodiments, the method of finding a target convolution layer includes:
Step 502, locating an add operator;
in this embodiment, the add operator is the target node in the computation graph.
Step 504, traversing the operator's input/output;
in this embodiment, the computation graph is traversed along the output direction/input direction with the add operator node as the starting point, searching for the first operator structure.
Step 506, obtaining the corresponding structure in the model;
in this embodiment, the corresponding structure in the model is a first operator structure.
Step 508, judging whether the structure is a convolution layer; if yes, executing step 510, and if not, executing step 512;
in this embodiment, if the first operator structure matches the preset operator structure, the first operator structure is the target convolution layer.
Step 510, judging whether the current operator has been completely traversed; if yes, returning to step 502, and if not, returning to step 504;
in this embodiment, when the computation graph includes a plurality of target nodes, the next target node is traversed after the current operator (the current target node) has been completely traversed; if the traversal is not complete, the traversal continues in the input or output direction.
Step 512, relocating the input/output node, and returning to step 506;
in this embodiment, if the first operator structure does not match the preset operator structure, the corresponding input node or output node is taken as the new starting point and the computation graph continues to be traversed until the next first operator structure is found.
As shown in fig. 6, in any of the above embodiments, pruning is performed on the target convolutional layer, including:
step 602, determining a target calculation channel in a target convolution layer;
step 604, clipping the target computing channel.
In this embodiment, after the target convolution layer is determined, the target computation channel in the target convolution layer needs to be found. The target computation channel is a computation channel that can be clipped; that is, when the target convolution layer is pruned, the target computation channel is clipped and deleted, so that the size of the first model containing the target convolution layer is compressed and the inference efficiency of the first model is improved.
Specifically, in order to guarantee the inference quality of the first model after pruning, the computation channels that need to be invoked in each convolution layer of the residual connection structure must be retained; the computation channels that do not need to be invoked are determined to be target channels, and the target channels are clipped.
As shown in fig. 2, the residual connection structure includes three target convolution layers, ConvA, ConvB and ConvC. ConvA has 4 computation channels and ConvB has 4 computation channels; when the outputs of ConvA and ConvB are added, their channel numbers must be equal. In the computation, ConvA only needs its first two computation channels and ConvB only needs its last computation channel. Because the input channels of ConvC must be consistent with the output channels of ConvA and ConvB, the third of the four computation channels is taken as the target computation channel, and the third computation channel is clipped to complete the pruning of the residual connection structure.
In the embodiment of the application, the target computation channels in the target convolution layers of the residual connection structure are found and clipped, so that, on the premise of preserving the inference quality of the first model after pruning, the size of the first model is compressed and its inference efficiency is improved.
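Clipping a target computation channel can be sketched as filtering a convolution layer's output channels by a keep-mask. The weight here is a plain list of per-channel filters (a stand-in for a Conv2d weight tensor of shape [out_channels, ...]); the values are made up, and the mask keeps channels 1, 2 and 4 while clipping the third, as in the fig. 2 example:

```python
def clip_channels(weight, keep_mask):
    """Keep only the output channels whose mask entry is 1; a clipped
    (deleted) channel no longer contributes to the model's size."""
    return [filt for filt, keep in zip(weight, keep_mask) if keep]

# Four output channels; mask [1, 1, 0, 1] clips the third channel,
# which is the target computation channel.
weight = [[0.1], [0.2], [0.3], [0.4]]
print(clip_channels(weight, [1, 1, 0, 1]))  # prints [[0.1], [0.2], [0.4]]
```

In a real model the same index filtering would be applied to the convolution's weight and bias tensors along the output-channel dimension.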
As shown in fig. 7, in any of the foregoing embodiments, in the case where the same residual connection structure corresponds to a plurality of target convolution layers, determining the target computation channels in the target convolution layers includes:
step 702, determining a first computing channel in a target convolution layer, wherein the first computing channel is a computing channel required to be called by the target convolution layer;
step 704, determining the rest calculation channels except the first calculation channel in the target convolution layer as target calculation channels.
In this embodiment, all the remaining computation channels except the computation channel to be invoked in the target convolution layer are taken as target computation channels.
Specifically, when a residual connection structure includes a plurality of target convolution layers, the computation channels that the target convolution layers need to invoke must be counted; the computation channels that none of the target convolution layers needs to invoke are taken as target computation channels, and the target computation channels are clipped.
Illustratively, the residual connection structure includes three target convolution layers, of which two are upper convolution layers and the other is a lower convolution layer. The three target convolution layers have the same number of computation channels; the first computation channels that the three target convolution layers need to invoke are found, and all computation channels in the plurality of target convolution layers other than the first computation channels are taken as target computation channels.
In the embodiment of the application, when the residual connection structure includes a plurality of target convolution layers, the first computation channels that need to be invoked in those layers are counted, and all other computation channels in the plurality of target convolution layers are taken as target computation channels to be clipped. This further ensures the accuracy of pruning the residual connection structure, and hence the accuracy of the first model's inference after pruning.
As shown in fig. 8, in any of the above embodiments, determining a first computation channel in a plurality of target convolutional layers includes:
step 802, performing mask processing on a second calculation channel required to be called in each target convolution layer to obtain a target channel mask;
in step 804, a first computing channel is determined based on the target channel mask.
In the embodiment of the present application, the first computation channels are the computation channels that the plurality of target convolution layers in the same residual connection structure need to invoke. The second computation channels are the computation channels that each individual target convolution layer needs to invoke.
In this embodiment, the target channel mask is obtained by masking the second computation channels to be invoked in each target convolution layer; based on the target channel mask, the first computation channels matching it can be determined.
Specifically, first channel masks are obtained by masking the second computation channels in each of the plurality of target convolution layers; the target channel mask is obtained by combining the plurality of first channel masks, and the first computation channels are obtained from the target channel mask.
As shown in fig. 2, the residual connection structure includes three target convolution layers: ConvA, ConvB and ConvC, each with 4 computation channels. In the computation process, ConvA keeps only its first two computation channels, so the first channel mask corresponding to ConvA is [1, 1, 0, 0]; ConvB keeps only its last computation channel, so the first channel mask corresponding to ConvB is [0, 0, 0, 1]. Because the input channels of ConvC must stay consistent with the output channels of ConvA and ConvB, the target channel mask corresponding to ConvC is computed from these two first channel masks, giving [1, 1, 0, 1]. Among the four channels, the channel whose mask is 0 is the target computation channel, and the channels whose mask is 1 are the first computation channels.
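The ConvA/ConvB example above can be reproduced with a short sketch. The helper names are hypothetical; the combination step is the element-wise OR implied by the text (a channel survives if any layer feeding the add operator still uses it).

```python
# Deriving the target channel mask for a residual connection
# (hypothetical helpers, data taken from the ConvA/ConvB example).

def channel_mask(num_channels, kept):
    """First channel mask: 1 for channels the layer still invokes."""
    return [1 if i in kept else 0 for i in range(num_channels)]

def target_channel_mask(first_masks):
    """Element-wise OR of the first channel masks of the layers
    feeding the add operator."""
    return [int(any(bits)) for bits in zip(*first_masks)]

mask_a = channel_mask(4, kept={0, 1})  # ConvA keeps the first two channels
mask_b = channel_mask(4, kept={3})     # ConvB keeps only the last channel
print(target_channel_mask([mask_a, mask_b]))  # [1, 1, 0, 1]
```

Channel 2 is the only position where both masks are 0, so it is the target computation channel to clip.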
In the embodiment of the application, the corresponding first channel masks are obtained by performing mask processing on the second computation channels in the plurality of target convolution layers; the target channel mask corresponding to the plurality of convolution layers can then be obtained based on the first channel masks, yielding the first computation channels that the plurality of convolution layers need to invoke. This simplifies the step of finding those first computation channels.
In any of the foregoing embodiments, after clipping the target computing channel, the method further includes: and establishing a mapping relation between the target computing channel and the target convolution layer.
In this embodiment, after the target convolution layer is found and the target computation channels in it are clipped, a mapping relationship is established between the target convolution layer and the target computation channels, so that the target computation channels are clipped automatically when the same target convolution layer is found in the next traversal.
Specifically, the step of establishing the mapping relation between the target convolution layer and the target calculation channel includes: and obtaining an operator identifier of the target convolution layer and channel information of the target calculation channel, and establishing a mapping relation among the operator identifier, the channel information and the target node.
Illustratively, the operator identifier may be name information of the target convolution layer or the like, and the channel information may be a channel identifier of the target computation channel.
Illustratively, a mapping relationship between each target convolution layer participating in the residual connection structure and its target computation channels is established and stored in a dictionary. When a model is pruned next time, all convolution layers in the model are traversed; when a traversed target convolution layer is already in the dictionary, it belongs to a residual connection structure and serves as an input of an add operator. The other target convolution layer feeding the same add operator, and its corresponding target computation channels, are then looked up, thereby determining the clipping scheme for the traversed target convolution layer.
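The dictionary described above can be sketched as follows. The function names and the use of the layer name as the operator identifier are assumptions for illustration only.

```python
# Hypothetical sketch of the pruning dictionary: operator identifier
# (here simply the layer name) -> channel indices that were clipped.

pruning_map = {}

def record_clipping(operator_id, clipped_channels):
    """Store the clipping scheme after a target layer has been cut."""
    pruning_map[operator_id] = sorted(clipped_channels)

def lookup_clipping(operator_id):
    """A hit means the traversed layer already sits in a residual
    connection structure and its clipping scheme is already decided."""
    return pruning_map.get(operator_id)

record_clipping("ConvA", {2, 3})
record_clipping("ConvB", {0, 1, 2})
print(lookup_clipping("ConvA"))  # [2, 3]
print(lookup_clipping("ConvD"))  # None: not part of any residual structure
```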
In the embodiment of the application, after the target convolution layer is clipped, a mapping relationship is established between the clipped target computation channels and the target convolution layer, so that when other models are clipped later, the corresponding target computation channels can be invoked directly, which simplifies the subsequent steps of clipping those models.
In any of the foregoing embodiments, searching for a target node in the computation graph corresponding to the first model includes: constructing the computation graph corresponding to the first model; and searching for target nodes in the computation graph based on target operator information.
In this embodiment, before searching for the target node, the first model is expanded to obtain a computation graph matched with the first model, and the target node in the computation graph is searched according to the target operator information corresponding to the target operator node.
The target operator information is operator information corresponding to the target node.
Illustratively, the target operator information may be the operator node name of the target node, such as "add"; that is, an operator node named "add" is taken as the target node.
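The step of taking every operator node named "add" as a target node can be sketched with a hypothetical node format (a list of dictionaries; the real computation graph representation is not specified in this excerpt):

```python
# Minimal sketch of locating target nodes by operator node name.

graph_nodes = [
    {"name": "conv_1", "op": "Conv"},
    {"name": "conv_2", "op": "Conv"},
    {"name": "add",    "op": "Add"},
    {"name": "relu",   "op": "Relu"},
]

def find_target_nodes(nodes, target_operator_info="add"):
    """Return the names of nodes matching the target operator info."""
    return [n["name"] for n in nodes if n["name"] == target_operator_info]

print(find_target_nodes(graph_nodes))  # ['add']
```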
In the embodiment of the application, the first model is unfolded into a computation graph, which gives the inference structure of the first model, and the corresponding target nodes are searched for based on the target operator information, thereby preliminarily determining the position in the computation graph of the target convolution layers corresponding to the residual connection structure.
In any of the foregoing embodiments, constructing a computation graph corresponding to the first model includes: operator information of at least two operators in the first model and sequence information between the at least two operators are obtained; and constructing a calculation graph based on the operator information and the sequence information.
In this embodiment, a specific procedure for expanding the first model into the computation graph is given. By taking the operators in the first model as the nodes of the computation graph and the relationships between the operators as its edges, the computation graph can accurately reflect the inference structure of the first model.
Specifically, the operator information of each operator in the first model is extracted, along with the order information between the operators. The input/output relationships among the operator nodes in the computation graph are determined based on the order information among the pieces of operator information, thereby constructing a computation graph matched with the first model.
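The construction described above can be sketched as follows. The adjacency-list representation and the helper name are illustrative assumptions, not the patent's actual graph class: operator information supplies the nodes, and order information supplies the input/output edges.

```python
# Sketch of constructing a computation graph from operator information
# (nodes) and order information (edges between operators).

def build_graph(operator_info, order_info):
    graph = {name: {"info": info, "inputs": [], "outputs": []}
             for name, info in operator_info.items()}
    for src, dst in order_info:  # src's output feeds dst's input
        graph[src]["outputs"].append(dst)
        graph[dst]["inputs"].append(src)
    return graph

ops = {"conv_1": {"op": "Conv"}, "conv_2": {"op": "Conv"}, "add": {"op": "Add"}}
edges = [("conv_1", "add"), ("conv_2", "add")]
g = build_graph(ops, edges)
print(g["add"]["inputs"])  # ['conv_1', 'conv_2']
```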
In the embodiment of the application, a computation graph matched with the first model can be constructed by extracting the operator information of the operators in the first model and the order information among the operators, thereby ensuring that the computation graph matches the first model.
In any of the above embodiments, the operator information includes any of the following: operator node name, operator calculation parameters, operator calculation results.
In this embodiment, the operator information includes operator node names, which are name information corresponding to each operator in the first model. The operator information includes operator calculation parameters, which are parameters capable of reflecting the calculation process of the operator in the calculation process. The operator information comprises an operator calculation result which is a result obtained after the operator performs calculation processing on the data.
In the embodiment of the application, any one of the operator node name, the operator calculation result and the operator calculation parameters is used as operator information, so that the corresponding operator nodes can be constructed for the computation graph, which further improves how closely the constructed computation graph matches the first model.
In any of the above embodiments, the first model comprises any of: an image classification model and a target recognition model.
In one embodiment according to the present application, as shown in fig. 12, a residual pruning device 1200 is provided, including:
the searching module 1202 is configured to search a target node in the computation graph corresponding to the first model, where the target node corresponds to a residual connection structure in the first model;
a determining module 1204, configured to determine a target convolution layer in the computation graph based on traversing the computation graph by the target node;
a processing module 1206, configured to perform pruning processing on the target convolutional layer.
In the embodiment of the application, the first model is converted into a computation graph structure, the target node is found from that structure, and the target convolution layers that require pruning are found through the target node. The target convolution layers are thus located automatically and pruned automatically, so the connection relationships of the convolution layers in the residual connection structure of the model are obtained automatically, without the user manually entering the names of the corresponding convolution layers. This improves the efficiency of pruning the model while ensuring the accuracy of the pruning.
In the above embodiment, the determining module 1204 is configured to traverse the computational graph along the input direction and the output direction of the target node, and determine a first operator structure corresponding to the target node;
the determining module 1204 is configured to determine, when the first operator structure matches with a preset operator structure, that the first operator structure is a target convolutional layer.
In the embodiment of the application, since the target node is an operator node in the target convolution layer, the computation graph needs to be traversed along the input direction and the output direction. A first operator structure containing the target node can be found through this traversal; when the found first operator structure matches a preset operator structure, it is judged to be a target convolution layer, so that the corresponding target convolution layer is found automatically based on the target node.
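The traversal above can be sketched against the same hypothetical adjacency-list graph assumed earlier: starting from the target ("add") node, walk along the input direction to collect the operators feeding the residual connection, then check the collected structure against a preset structure (here, assumed to be convolutions only).

```python
# Hedged sketch of traversing from a target node along the input
# direction and matching the first operator structure against a preset.

graph = {
    "conv_1": {"op": "Conv", "inputs": [], "outputs": ["add"]},
    "conv_2": {"op": "Conv", "inputs": [], "outputs": ["add"]},
    "add":    {"op": "Add",  "inputs": ["conv_1", "conv_2"], "outputs": []},
}

def first_operator_structure(graph, target_node):
    """Operator types reached by traversing along the input direction."""
    seen, stack, ops = set(), list(graph[target_node]["inputs"]), []
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        ops.append(graph[cur]["op"])
        stack.extend(graph[cur]["inputs"])
    return ops

def matches_preset(ops, preset=frozenset({"Conv"})):
    """True when every found operator belongs to the preset structure."""
    return bool(ops) and set(ops) <= preset

structure = first_operator_structure(graph, "add")
print(matches_preset(structure))  # True: both inputs are convolution layers
```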
In any of the above embodiments, the number of target nodes is at least two;
residual pruning apparatus 1200, further comprising:
the acquisition module is used for acquiring the arrangement sequence of at least two target nodes;
the processing module 1206 is configured to traverse the computation graph sequentially with at least two target nodes as starting points along the arrangement order.
In the embodiment of the application, when the computation graph includes a plurality of target nodes, the target nodes are traversed in sequence and the target convolution layer corresponding to each target node is found, so that all the target convolution layers requiring pruning in the computation graph can be found.
In any of the above embodiments, the determining module 1204 is configured to determine a target computation channel in the target convolutional layer;
a processing module 1206, configured to perform clipping processing on the target computing channel.
In the embodiment of the application, the target calculation channel in the target convolution layer in the residual error connection structure is searched, and the target calculation channel is cut, so that the volume of the first model is compressed and the reasoning efficiency of the first model is improved on the premise of ensuring the reasoning effect of the first model after pruning.
In any of the foregoing embodiments, the determining module 1204 is configured to determine a first computation channel in the target convolution layer, where the first computation channel is a computation channel that the target convolution layer needs to invoke;
the determining module 1204 is configured to determine remaining computing channels except the first computing channel in the target convolutional layer as target computing channels.
In the embodiment of the application, the residual connection structure includes a plurality of target convolution layers. The first computation channels that need to be invoked in the plurality of target convolution layers are counted, and all computation channels other than the first computation channels are taken as target computation channels to be clipped. This ensures the accuracy of pruning the residual connection structure, and thus the inference accuracy of the first model after pruning.
In any of the foregoing embodiments, the processing module 1206 is configured to perform mask processing on the second computation channel to be invoked in each target convolutional layer, to obtain a target channel mask;
a determining module 1204 is configured to determine a first computing channel based on the target channel mask.
In the embodiment of the application, the corresponding first channel masks are obtained by performing mask processing on the second computation channels in the plurality of target convolution layers; the target channel mask corresponding to the plurality of convolution layers can then be obtained based on the first channel masks, yielding the first computation channels that the plurality of convolution layers need to invoke. This simplifies the step of finding those first computation channels.
In any of the foregoing embodiments, the processing module 1206 is configured to establish a mapping relationship between the target computation channel and the target convolutional layer.
In the embodiment of the application, after the target convolution layer is clipped, a mapping relationship is established between the clipped target computation channels and the target convolution layer, so that when other models are clipped later, the corresponding target computation channels can be invoked directly, which simplifies the subsequent steps of clipping those models.
In any of the foregoing embodiments, the processing module 1206 is configured to construct a computation graph corresponding to the first model;
a processing module 1206 is configured to find a target node in the computation graph based on the target operator information.
In the embodiment of the application, the first model is unfolded into a computation graph, which gives the inference structure of the first model, and the corresponding target nodes are searched for based on the target operator information, thereby preliminarily determining the position in the computation graph of the target convolution layers corresponding to the residual connection structure.
In any of the foregoing embodiments, the obtaining module is configured to obtain operator information of at least two operators in the first model, and order information between the at least two operators;
a processing module 1206 is configured to construct a computational graph based on the operator information and the order information.
In the embodiment of the application, a computation graph matched with the first model can be constructed by extracting the operator information of the operators in the first model and the order information among the operators, thereby ensuring that the computation graph matches the first model.
In any of the above embodiments, the operator information includes any of the following: operator node name, operator calculation parameters, operator calculation results.
In the embodiment of the application, any one of the operator node name, the operator calculation result and the operator calculation parameters is used as operator information, so that the corresponding operator nodes can be constructed for the computation graph, which further improves how closely the constructed computation graph matches the first model.
In one embodiment according to the present application, as shown in fig. 13, a residual pruning device 1300 is provided, including: a processor 1302 and a memory 1304, the memory 1304 having programs or instructions stored therein. The processor 1302 executes the programs or instructions stored in the memory 1304 to implement the steps of the residual pruning method in any of the embodiments described above, so the device has all the advantages of the residual pruning method in any of those embodiments, which will not be described in detail herein.
In an embodiment according to the present application, a computer program product is provided, which when executed by a processor, implements the steps of the residual pruning method in any of the above embodiments, so that all the beneficial technical effects of the residual pruning method in any of the above embodiments are provided, and will not be described in detail herein.
In an embodiment according to the present application, a readable storage medium is proposed, on which a program or instructions is stored which, when executed by a processor, implement the steps of the residual pruning method as in any of the embodiments described above. Therefore, the method has all the beneficial technical effects of the residual pruning method in any of the above embodiments, and will not be described in detail herein.
In one embodiment according to the present application, as shown in fig. 14, an electronic device 1400 is presented, comprising: the residual pruning device 1200 in any of the embodiments described above, and/or the computer program product 1402 in any of the embodiments described above, and/or the readable storage medium 1404 in any of the embodiments described above. The electronic device therefore has all the advantageous technical effects of the residual pruning device 1200, and/or the computer program product 1402, and/or the readable storage medium 1404 in any of the embodiments described above, which will not be repeated here.
It should be understood that, in the claims, description and drawings of the present application, the term "plurality" refers to two or more. Unless otherwise explicitly defined, orientation or positional terms such as "upper" and "lower" are based on the orientations or positional relationships shown in the drawings; they are used merely for convenience of description and do not indicate or imply that the apparatus or element in question must have a particular orientation or be constructed and operated in a particular orientation, and should therefore not be construed as limiting the present application. The terms "connected", "mounted", "secured" and the like are to be construed broadly: for example, a connection may be a fixed connection, a removable connection or an integral connection between a plurality of objects, and the objects may be connected directly or indirectly through an intermediate medium. The specific meanings of these terms in this application will be understood by those of ordinary skill in the art in light of the specific circumstances.
The description of the terms "one embodiment," "some embodiments," "particular embodiments," and the like in the claims, specification, and drawings of this application mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in the embodiment or example of the application. In the claims, specification and drawings of this application, the schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing is merely a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and variations may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.

Claims (16)

1. A residual pruning method, comprising:
searching a target node in a calculation graph corresponding to a first model, wherein the target node corresponds to a residual error connection structure in the first model;
traversing the computational graph based on the target node, and determining a target convolution layer in the computational graph;
and pruning the target convolution layer.
2. The residual pruning method of claim 1, wherein the traversing the computational graph based on the target node, and determining a target convolution layer in the computational graph, comprises:
traversing the computational graph along the input direction and the output direction of the target node until a first operator structure corresponding to the target node is determined;
and under the condition that the first operator structure is matched with a preset operator structure, determining the first operator structure as the target convolution layer.
3. The residual pruning method according to claim 2, wherein the number of target nodes is at least two;
traversing the computational graph along the input direction and the output direction of the target node, and determining a first operator structure corresponding to the target node, including:
acquiring the arrangement sequence of at least two target nodes;
and traversing the calculation graph by taking at least two target nodes as starting points in sequence along the arrangement sequence.
4. A residual pruning method according to any one of claims 1 to 3, wherein said pruning said target convolutional layer comprises:
determining a target computing channel in the target convolution layer;
and cutting the target computing channel.
5. The residual pruning method according to claim 4, wherein a same residual connection structure corresponds to a plurality of the target convolutional layers;
the determining a target computation channel in the target convolution layer comprises:
determining a first calculation channel in a plurality of the target convolutional layers, wherein the first calculation channel is a calculation channel required to be called by the target convolutional layers;
and determining the rest calculation channels except the first calculation channel in the target convolution layers as target calculation channels.
6. The residual pruning method of claim 5, wherein said determining a first computation channel of a plurality of said target convolutional layers comprises:
performing mask processing on a second calculation channel required to be called in each target convolution layer to obtain a target channel mask;
the first computing channel is determined based on the target channel mask.
7. The residual pruning method according to claim 4, further comprising, after the clipping processing of the target computing channel:
and establishing a mapping relation between the target computing channel and the target convolution layer.
8. A method of residual pruning according to any one of claims 1 to 3, wherein said finding a target node in a computational graph corresponding to a first model comprises:
constructing the calculation map corresponding to the first model;
and searching for a target node in the computational graph based on the target operator information.
9. The method of residual pruning according to claim 8, wherein said constructing the computational graph corresponding to the first model comprises:
operator information of at least two operators in the first model and sequence information between the at least two operators are obtained;
and constructing the calculation graph based on the operator information and the sequence information.
10. The residual pruning method of claim 9, wherein the operator information comprises any one of: operator node name, operator calculation parameters, operator calculation results.
11. A residual pruning method according to any one of claims 1 to 3, wherein the first model comprises any one of the following: an image classification model and a target detection model.
12. A residual pruning device, comprising:
the searching module is used for searching a target node in the calculation graph corresponding to the first model, wherein the target node corresponds to the residual error connection structure in the first model;
a determining module, configured to determine a target convolutional layer in the computation graph based on the target node traversing the computation graph;
and the processing module is used for pruning the target convolution layer.
13. A residual pruning device, comprising:
a memory having stored thereon programs or instructions;
processor for implementing the steps of the residual pruning method according to any one of claims 1 to 11 when executing said program or instructions.
14. A computer program product, characterized in that, when executed by a processor, it implements the steps of the residual pruning method according to any one of claims 1 to 11.
15. A readable storage medium having stored thereon a program or instructions, which when executed by a processor, implement the steps of the residual pruning method according to any one of claims 1 to 11.
16. An electronic device, comprising:
the residual pruning device according to claim 12 or 13; or
the computer program product of claim 14; or
the readable storage medium of claim 15.
CN202310213879.7A 2023-03-06 2023-03-06 Residual pruning method, residual pruning device, electronic equipment and readable storage medium Active CN116468100B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310213879.7A CN116468100B (en) 2023-03-06 2023-03-06 Residual pruning method, residual pruning device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN116468100A true CN116468100A (en) 2023-07-21
CN116468100B CN116468100B (en) 2024-05-10

Family

ID=87181378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310213879.7A Active CN116468100B (en) 2023-03-06 2023-03-06 Residual pruning method, residual pruning device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN116468100B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190385059A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and Apparatus for Training Neural Network and Computer Server
US20200342360A1 (en) * 2018-06-08 2020-10-29 Tencent Technology (Shenzhen) Company Limited Image processing method and apparatus, and computer-readable medium, and electronic device
CN112001483A (en) * 2020-08-14 2020-11-27 广州市百果园信息技术有限公司 Method and device for pruning neural network model
CN112836751A (en) * 2021-02-03 2021-05-25 歌尔股份有限公司 Target detection method and device
CN113222138A (en) * 2021-04-25 2021-08-06 南京大学 Convolutional neural network compression method combining layer pruning and channel pruning
US20210256385A1 (en) * 2020-02-14 2021-08-19 Northeastern University Computer-implemented methods and systems for dnn weight pruning for real-time execution on mobile devices
US20210312240A1 (en) * 2020-11-30 2021-10-07 Beijing Baidu Netcom Science And Technology Co., Ltd. Header Model For Instance Segmentation, Instance Segmentation Model, Image Segmentation Method and Apparatus
CN114429208A (en) * 2022-01-21 2022-05-03 深圳市同为数码科技股份有限公司 Model compression method, device, equipment and medium based on residual structure pruning
CN114462582A (en) * 2022-02-25 2022-05-10 腾讯科技(深圳)有限公司 Data processing method, device and equipment based on convolutional neural network model
CN115660066A (en) * 2022-10-14 2023-01-31 桂林理工大学 Convolutional neural network pruning method based on distribution difference

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAI HUANG 等: "Acceleration-Aware Fine-Grained Channel Pruning for Deep Neural Networks via Residual Gating", IEEE, pages 1902 - 1915 *
张亚平 等: "基于YOLOv3的神经网络模型压缩与实现", 微纳电子与智能制造, pages 86 - 91 *
马治楠 等: "基于深层卷积神经网络的剪枝优化", 电子技术应用, pages 125 - 128 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant