CN111860820A - Neural network operator dividing method and device and dividing equipment - Google Patents


Info

Publication number
CN111860820A
CN111860820A
Authority
CN
China
Prior art keywords
neural network
operators
computational
operator
processor
Prior art date
Legal status
Pending
Application number
CN202010757561.1A
Other languages
Chinese (zh)
Inventor
戚海涛
李涵
吴欣洋
张爱飞
丁瑞强
Current Assignee
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority to CN202010757561.1A priority Critical patent/CN111860820A/en
Publication of CN111860820A publication Critical patent/CN111860820A/en
Priority to PCT/CN2021/109499 priority patent/WO2022022670A1/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks


Abstract

Embodiments of the invention provide a method, an apparatus, and a device for partitioning neural network operators. The method includes: acquiring attribute information of all operators in a neural network computational graph; determining, according to the attribute information of each operator, the processor corresponding to that operator; dividing the neural network computational graph into one or more computational subgraphs, where each computational subgraph corresponds to one processor; and processing each computational subgraph with its corresponding processor. In the embodiments of the invention, a processor of a suitable type can be selected according to an operator's attribute information and used to process that operator. This improves the processing speed of the neural network computational graph and addresses the slowdown caused by a given type of processor being unable to process certain operators, or processing them inefficiently.

Description

Neural network operator dividing method and device and dividing equipment
Technical Field
Embodiments of the invention relate to the field of network communication technology, and in particular to a method, an apparatus, and a device for partitioning neural network operators.
Background
A neural network is composed of operators, which can be regarded as operation functions, such as convolution and pooling.
Although a Central Processing Unit (CPU) can run all operators, operators such as convolution and full connection (e.g., 1024 × 4096) are computationally heavy, so the CPU runs them slowly and inefficiently. A dedicated Accelerated Processing Unit (APU) can instead be selected to run them. However, an APU typically supports a limited instruction set, and some special operators cannot run on it.
Therefore, how to allocate neural network operators to processors is a technical problem in urgent need of a solution.
Disclosure of Invention
An object of the embodiments of the present invention is to provide a method, an apparatus, and a device for partitioning neural network operators, so as to solve the problem of how to allocate neural network operators to processors.
In a first aspect, a method for partitioning a neural network operator is provided, including:
acquiring attribute information of all operators in a neural network calculation graph;
determining a processor corresponding to each operator according to the attribute information of the operator;
dividing the neural network computational graph into one or more computational subgraphs, wherein all operators in each computational subgraph correspond to one processor;
and processing each computational subgraph through a processor corresponding to the computational subgraph.
Optionally, dividing the neural network computational graph into one or more computational subgraphs, where all operators in each computational subgraph correspond to one processor, includes:
dividing the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph, where all operators in each computational subgraph correspond to one processor.
Optionally, dividing the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph includes:
traversing each operator in the neural network computational graph in the reverse of the processing order of the operators in the neural network computational graph;
and dividing the neural network computational graph into one or more computational subgraphs according to preset division rules, based on the attribute information of the operators and the association relationships between the operators in the neural network computational graph.
Optionally, the preset partitioning rule includes one or more of the following combinations:
if an operator serving as a child node has the same attribute information as the operator serving as its parent node, dividing the two operators into the same computational subgraph;
if a computational subgraph has multiple operators serving as parent nodes and the attribute information of those parent-node operators differs, dividing the computational subgraph out separately for processing;
if the parent node of a child node has multiple child nodes and the child node is not the last child branch node of the parent node, not traversing the parent node; and if the child node is the last child branch node of the parent node, traversing the parent node.
Optionally, the processing, by the processor corresponding to the computation subgraph, each computation subgraph includes:
determining the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph;
obtaining a processing order of the computational subgraphs according to the association relationships among the computational subgraphs;
and processing each computation subgraph through a corresponding processor according to the processing sequence.
Optionally, the attribute information of an operator includes one or more of the following: computation amount, operator function, and operator parameters.
In a second aspect, an apparatus for partitioning a neural network operator is provided, including:
the acquisition module is used for acquiring attribute information of all operators in the neural network calculation graph;
the determining module is used for determining the processor corresponding to each operator according to the attribute information of the operator;
the dividing module is used for dividing the neural network computational graph into one or more computational subgraphs according to the types of the processors, and each computational subgraph corresponds to one type of processor;
and the processing module is used for processing each computational subgraph through the processor corresponding to the computational subgraph.
Optionally, the dividing module is further configured to: divide the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph, where all operators in each computational subgraph correspond to one processor.
Optionally, the processing module is further configured to: determine the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph; obtain a processing order of the computational subgraphs according to those association relationships; and process each computational subgraph through the corresponding processor in that order.
Optionally, the attribute information of an operator includes one or more of the following: computation amount, operator function, and operator parameters.
In a third aspect, a partitioning device for neural network operators is provided, including: a processor, a memory and a program stored on the memory and executable on the processor, which program, when executed by the processor, implements the steps of the method of partitioning neural network operators according to the first aspect.
In a fourth aspect, a readable storage medium is provided, on which a program is stored, which when executed by a processor implements the steps of the method for partitioning neural network operators according to the first aspect.
In the embodiments of the invention, a processor of a suitable type can be selected according to an operator's attribute information and used to process that operator. This improves the processing speed of the neural network computational graph and addresses the slowdown caused by a given type of processor being unable to process certain operators, or processing them inefficiently.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart of a method for partitioning neural network operators according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a method for partitioning neural network operators according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the partitioning of neural network operators according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an apparatus for partitioning neural network operators according to an embodiment of the present invention;
fig. 5 is a schematic diagram of a partitioning device for neural network operators according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "comprises," "comprising," or any other variation thereof, in the description and claims of this application, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Furthermore, "and/or" in the specification and claims denotes at least one of the connected objects; for example, A and/or B covers three cases: A alone, B alone, and both A and B.
In the embodiments of the present invention, words such as "exemplary" or "for example" are used to mean serving as an example, illustration, or description. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present invention is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the words "exemplary" or "for example" is intended to present related concepts in a concrete fashion.
Referring to fig. 1, an embodiment of the present invention provides a method for partitioning a neural network operator, including: step 101, step 102, step 103 and step 104.
Step 101: acquiring attribute information of all operators in a neural network computation graph (computational graph).
It is understood that the neural network may be a computational network consisting of Operators (OPs). Alternatively, the neural network is a tree-structured neural network model, but it is understood that the specific form of the neural network model is not limited.
Optionally, the attribute information of an operator includes one or more of the following: computation amount, operator function, and operator parameters.
Step 102: and determining the processor corresponding to each operator according to the attribute information of the operator.
In the embodiment of the present invention, the types of the processors (which may also be referred to as Processing units) may include a Central Processing Unit (CPU), an Accelerated Processing Unit (APU), a Graphics Processing Unit (GPU), a Tensor Processing Unit (TPU), and the like, but are not limited thereto.
For example, if the attribute information of operator 1 includes one or more of computation amount A, operator function A, and operator parameter A, then operator 1 corresponds to processor 1 (e.g., a central processing unit); if the attribute information of operator 2 includes one or more of computation amount B, operator function B, and operator parameter B, then operator 2 corresponds to processor 2 (e.g., an accelerated processing unit).
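As a concrete illustration of step 102, the attribute-driven choice of processor can be sketched as a small dispatch function. The attribute keys, the cost threshold, and the set of APU-supported functions below are illustrative assumptions, not taken from the patent:

```python
# A hypothetical sketch of step 102: choosing a processor for each operator
# from its attribute information. The attribute keys, the cost threshold,
# and the set of APU-supported functions are illustrative assumptions.
SUPPORTED_BY_APU = {"conv2d", "fully_connected", "pool"}

def assign_processor(op: dict) -> str:
    """Map one operator's attribute info to a processor type."""
    # Operators the accelerator cannot run fall back to the CPU.
    if op["function"] not in SUPPORTED_BY_APU:
        return "CPU"
    # Heavy operators go to the accelerator; light ones stay on the CPU.
    return "APU" if op["compute_cost"] > 1_000_000 else "CPU"

ops = [
    {"name": "N7", "function": "custom_sort", "compute_cost": 5_000},
    {"name": "N5", "function": "conv2d", "compute_cost": 8_000_000},
]
assignment = {op["name"]: assign_processor(op) for op in ops}
```

Any real implementation would derive the supported-operator set and the cost model from the target hardware rather than hard-coding them.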
step 103: and dividing the neural network computation graph into one or more computation subgraphs, wherein all operators in each computation subgraph correspond to one processor.
That is, each computational subgraph corresponds to one processor. For example, the neural network computational graph is divided into computational subgraph 1 and computational subgraph 2, where all operators in computational subgraph 1 correspond to the CPU, that is, all of them can be processed by the CPU, and all operators in computational subgraph 2 correspond to the APU.
It can be understood that, in the embodiment of the present invention, the number of computational subgraphs into which the computational graph is divided is not limited.
Step 104: and processing each computational subgraph through a processor corresponding to the computational subgraph.
That is, each computational subgraph is compiled, or compiled and run, by the processor corresponding to that computational subgraph.
In the embodiments of the invention, a processor of a suitable type can be selected according to an operator's attribute information and used to process that operator. This improves the processing speed of the neural network computational graph and addresses the slowdown caused by a given type of processor being unable to process certain operators, or processing them inefficiently.
Referring to fig. 2, an embodiment of the present invention provides a method for partitioning a neural network operator, including: step 201 to step 205.
Step 201: and acquiring attribute information of all operators in the neural network computation graph.
Step 202: and dividing the neural network computation graph into one or more computation subgraphs according to the attribute information of the operators and the incidence relation between the operators in the neural network computation graph, wherein all the operators in each computation subgraph correspond to one processor.
The association relationship includes: the hierarchical relation of each operator in the neural network and/or the processing sequence of each operator.
Taking the neural network as a tree neural network as an example, the association relationship may include: parent-child node relationships, sibling node relationships, and the like.
It is understood that, in a node tree, a parent node owns its child nodes; child nodes of the same parent are called sibling nodes; the top node is called the root; and every node except the root (which has no parent) has exactly one parent.
The processing sequence of the operators refers to the compiling sequence or the compiling and running sequence of each operator in the neural network computation graph.
Optionally, each operator in the neural network computational graph is traversed from bottom to top and from left to right, i.e., in the reverse of the processing order of the operators, and the neural network computational graph is divided into one or more computational subgraphs according to preset division rules, based on the attribute information of the operators and the association relationships between the operators.
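The reverse-of-processing-order traversal described above can be obtained by topologically sorting the operators and reversing the result. A sketch using Python's standard-library sorter, with an assumed chain-shaped graph loosely modelled on the Figure 3 naming:

```python
# A sketch of the reverse-of-processing-order traversal: topologically sort
# the operators (predecessors first), then reverse. The edge list is an
# assumed chain, not the patent's actual graph.
from graphlib import TopologicalSorter

# predecessors[n] = nodes that must be processed before n
predecessors = {
    "N2": {"N1"}, "N3": {"N2"}, "N4": {"N3"},
    "N5": {"N4"}, "N6": {"N5"},
}
processing_order = list(TopologicalSorter(predecessors).static_order())
reverse_traversal = list(reversed(processing_order))  # leaves first, root last
```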
Optionally, the division rules include one or more of the following:
Rule 1: if an operator serving as a child node has the same attribute information as the operator serving as its parent node, divide the two operators into the same computational subgraph;
Rule 2: if a computational subgraph has multiple operators serving as parent nodes and the attribute information of those parent-node operators is not identical, divide the computational subgraph out separately for processing;
Rule 3: if the parent node of a child node has multiple child nodes and the child node is not the last child branch node of the parent node, do not traverse the parent node; if the child node is the last child branch node of the parent node, traverse the parent node.
It is understood that, in the embodiment of the present invention, whether to divide the neural network computational graph into one computational subgraph or multiple computational subgraphs may be determined based on the above-mentioned division rule.
For example, if the neural network computational graph includes multiple branches and, on every branch, the parent-node operator and the child-node operators have the same attribute information, the neural network computational graph can be divided into a single computational subgraph, i.e., the computational subgraph is the whole neural network computational graph. If instead the attribute information of the parent-node and child-node operators on the branches is not all the same, the neural network computational graph can be divided into multiple computational subgraphs based on the above division rules.
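A much-simplified version of this division idea, merging neighbouring operators that map to the same processor into connected components, can be sketched with a union-find structure. Note that this approximates rules 1 and 2 only; the multi-source and last-child-branch handling of rule 3 is deliberately omitted, so on graphs like Figure 3 it can merge subgraphs that the patent's rules keep apart:

```python
# Simplified sketch of subgraph division: merge adjacent operators with the
# same processor type into one connected component (computational subgraph).
# This is an approximation of rules 1-2; rule 3 is intentionally not modelled.

def divide(edges, processor_of):
    """edges: list of (parent, child); processor_of: node -> processor type."""
    parent_ptr = {}  # union-find parent pointers

    def find(x):
        while parent_ptr.setdefault(x, x) != x:
            x = parent_ptr[x]
        return x

    for parent, child in edges:
        if processor_of[parent] == processor_of[child]:
            parent_ptr[find(parent)] = find(child)  # merge components

    groups = {}
    for node in processor_of:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

edges = [("N1", "N2"), ("N2", "N3"), ("N2", "N7")]
proc = {"N1": "APU", "N2": "APU", "N3": "APU", "N7": "CPU"}
subgraphs = divide(edges, proc)  # {N1, N2, N3} on the APU, {N7} on the CPU
```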
Step 203: determining the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph.
The association relationships among the computational subgraphs represent the hierarchical relationship of the subgraphs, and the compiling order, or compiling-and-running order, of each computational subgraph can be determined from them.
Step 204: and obtaining the processing sequence of each computational subgraph according to the association relation among the computational subgraphs.
Step 205: and processing each computation subgraph through a processor of a corresponding type according to the processing sequence.
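Steps 203 to 205 amount to lifting the operator-level edges to subgraph-level edges and then topologically sorting the subgraphs. A sketch under assumed names and an assumed example partition:

```python
# Steps 203-205 sketched: derive subgraph-level predecessor relations from
# the operator-level edges, then topologically sort the subgraphs to obtain
# a processing order. The edges and partition here are illustrative.
from graphlib import TopologicalSorter

op_edges = [("N1", "N2"), ("N2", "N3"), ("N2", "N7"), ("N7", "N8")]
subgraph_of = {"N1": "A", "N2": "A", "N3": "A", "N7": "B", "N8": "C"}

# Subgraph-level predecessor map, skipping edges internal to one subgraph.
preds = {sg: set() for sg in set(subgraph_of.values())}
for src, dst in op_edges:
    if subgraph_of[src] != subgraph_of[dst]:
        preds[subgraph_of[dst]].add(subgraph_of[src])

order = list(TopologicalSorter(preds).static_order())
```

Each subgraph in `order` would then be handed to its corresponding processor in turn.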
In the embodiments of the invention, operators with the same attribute information are divided into the same computational subgraph region, and the subgraphs are processed according to their association relationships, which both guarantees the correctness of operator computation in the neural network and improves processing efficiency.
Referring to fig. 3, suppose that according to steps 201 and 202 operator N7 is to be processed by the CPU and the other operators (N1 to N6, N8, and N9) by the APU; all operators in the network then need to be divided.
The attribute information (or type) of N6 is the same as that of N5, the parent node of N6 (rule 1), so N5 and N6 are drawn into the same computational subgraph, defined as computational subgraph A;
N4 is of the same type as N5 (rule 1), so N4 and (N5 + N6) are also drawn into computational subgraph A;
the type of the processor corresponding to N3 is the same as that of the processor corresponding to computational subgraph A, so N3 and (N4 + N5 + N6) are divided into computational subgraph A;
the parent node N2 of N3 has multiple child nodes and N3 is not the last child node of N2, so node N2 is not traversed;
the branch N6-N8-N7-N2 is traversed; the type of the processor corresponding to N8 is the same as that of the processor corresponding to computational subgraph A, so N8 and (N3 + N4 + N5 + N6) are divided into computational subgraph A;
the type of the processor corresponding to N7 differs from that of the processor corresponding to computational subgraph A, so N7 is divided out separately as computational subgraph B;
the parent node N2 of N7 has multiple child nodes and N7 is not the last child node of N2, so node N2 is not traversed;
the branch N8-N9-N2 is traversed;
computational subgraph A now has three sources, N2, N7, and N9, and these sources are of different types (N7 belongs to computational subgraph B); therefore, although the type of the processor corresponding to N9 is the same as that of the processor corresponding to computational subgraph A, N9 is not divided into computational subgraph A but into computational subgraph C;
since N9 is the last child node of N2, node N2 is traversed, and since N2 corresponds to the same kind of processor, N2 is drawn into computational subgraph C.
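The outcome of this worked example can be restated as plain data and sanity-checked: each computational subgraph must map to exactly one processor type. (The text does not state which subgraph N1 ends up in, so it is omitted below; the operator-to-processor mapping restates the example's assumptions.)

```python
# Figure 3 outcome restated as data: subgraphs A and C run on the APU,
# subgraph B (the unsupported operator N7) on the CPU. N1's final subgraph
# is not stated in the example, so it is left out here.
subgraphs = {
    "A": {"N3", "N4", "N5", "N6", "N8"},
    "B": {"N7"},
    "C": {"N2", "N9"},
}
processor_of = {n: ("CPU" if n == "N7" else "APU")
                for nodes in subgraphs.values() for n in nodes}

def processors_used(nodes):
    """Set of processor types appearing in one subgraph."""
    return {processor_of[n] for n in nodes}

# Each subgraph must involve exactly one processor type.
consistent = all(len(processors_used(nodes)) == 1
                 for nodes in subgraphs.values())
```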
In the embodiments of the invention, a processor of the corresponding type can be selected according to an operator's attribute information and used to process that operator. This improves the processing speed of the neural network computational graph and addresses the slowdown caused by a certain type of processor being unable to process some operators, or processing them inefficiently.
Referring to fig. 4, an embodiment of the present invention provides an apparatus for partitioning a neural network operator, where the apparatus 400 includes:
an obtaining module 401, configured to obtain attribute information of all operators in the neural network computation graph;
optionally, the attribute information of the operator includes one or more of the following combinations: the calculation amount, the operator function and the operator parameter.
A determining module 402, configured to determine, according to the attribute information of the operators, a processor corresponding to each operator;
a dividing module 403, configured to divide the neural network computational graph into one or more computational subgraphs according to the types of the processors, where each computational subgraph corresponds to one type of processor.
And the processing module 404 is configured to process each computational subgraph through a processor corresponding to the computational subgraph.
In some embodiments, the partitioning module 403 is further configured to: divide the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph, where all operators in each computational subgraph correspond to one processor.
In some embodiments, the partitioning module 403 is further configured to: traverse each operator in the neural network computational graph in the reverse of the processing order of the operators; and divide the neural network computational graph into one or more computational subgraphs according to preset division rules, based on the attribute information of the operators and the association relationships between the operators in the neural network computational graph.
Optionally, the preset division rules include one or more of the following:
if an operator serving as a child node has the same attribute information as the operator serving as its parent node, the two operators are divided into the same computational subgraph;
if a computational subgraph has multiple operators serving as parent nodes and the attribute information of those parent-node operators differs, the computational subgraph is divided out separately for processing;
if the parent node of a child node has multiple child nodes and the child node is not the last child branch node of the parent node, the parent node is not traversed; if the child node is the last child branch node of the parent node, the parent node is traversed.
In some embodiments, the processing module 404 is further configured to: determine the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph; obtain a processing order of the computational subgraphs according to those association relationships; and process each computational subgraph through the corresponding processor in that order.
The partitioning device for neural network operators provided by the embodiment of the present invention can implement each process implemented by the method embodiments shown in fig. 1 and fig. 2, and achieve the same technical effect, and is not described here again to avoid repetition.
As shown in fig. 5, an embodiment of the present application further provides a dividing apparatus for a neural network operator, where the dividing apparatus 500 includes a processor 501, a memory 502, and a program or an instruction stored in the memory 502 and capable of being executed on the processor 501, and when the program or the instruction is executed by the processor 501, the program or the instruction implements each process of the method embodiment shown in fig. 1 or fig. 2, and can achieve the same technical effect, and is not described herein again to avoid repetition.
An embodiment of the present invention further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the method embodiment shown in fig. 1 or fig. 2, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (12)

1. A method for partitioning neural network operators, comprising:
acquiring attribute information of all operators in a neural network computational graph;
determining a processor corresponding to each operator according to the attribute information of that operator;
dividing the neural network computational graph into one or more computational subgraphs, wherein all operators in each computational subgraph correspond to one processor; and
processing each computational subgraph through the processor corresponding to that computational subgraph.
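For illustration only, the four steps of claim 1 can be sketched as follows. The attribute names, processor labels, and the placement rule in `assign_processor` are hypothetical examples, not taken from the patent itself.

```python
from collections import defaultdict

def assign_processor(attrs):
    # Hypothetical placement rule: convolution operators go to an NPU,
    # all other operators stay on the CPU.
    return "NPU" if attrs.get("op_type") == "conv" else "CPU"

def partition(graph):
    """graph: dict mapping operator name -> attribute dict."""
    # Steps 1-2: acquire each operator's attributes, pick its processor.
    placement = {op: assign_processor(a) for op, a in graph.items()}
    # Step 3: group operators so each subgraph targets one processor.
    subgraphs = defaultdict(list)
    for op, proc in placement.items():
        subgraphs[proc].append(op)
    # Step 4 (dispatching each subgraph to its processor) is omitted.
    return dict(subgraphs)

graph = {
    "conv1": {"op_type": "conv"},
    "relu1": {"op_type": "relu"},
    "conv2": {"op_type": "conv"},
}
print(partition(graph))  # {'NPU': ['conv1', 'conv2'], 'CPU': ['relu1']}
```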
2. The method of claim 1, wherein dividing the neural network computational graph into one or more computational subgraphs, with all operators in each computational subgraph corresponding to one processor, comprises:
dividing the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph, wherein all operators in each computational subgraph correspond to one processor.
3. The method according to claim 2, wherein dividing the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators comprises:
traversing each operator in the neural network computational graph in the reverse of the order in which the operators are processed; and
dividing the neural network computational graph into one or more computational subgraphs according to preset partitioning rules, based on the attribute information of the operators and the association relationships between the operators in the neural network computational graph.
4. The method of claim 3, wherein the preset partitioning rules comprise one or more of the following:
if an operator serving as a child node has the same attribute information as an operator serving as its parent node, the two operators are divided into the same computational subgraph;
if a computational subgraph has a plurality of operators serving as parent nodes and the attribute information of those parent-node operators differs, the computational subgraph is partitioned off separately for processing; and
if the parent node of a child node has a plurality of child nodes, the parent node is not traversed unless the child node is the last child branch node of the parent node.
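A minimal sketch of the first rule of claim 4, combined with the reverse-order traversal of claim 3: operators are visited opposite to their processing order, and a parent joins its child's subgraph when their attribute information matches. The graph shape and attribute values below are hypothetical.

```python
def partition_by_rules(order, children, attrs):
    """order: operators in processing order; children: op -> child ops;
    attrs: op -> attribute value (e.g. target processor type)."""
    subgraph_of = {}
    next_id = 0
    for op in reversed(order):  # children are visited before parents
        merged = None
        for child in children.get(op, []):
            # Rule: a parent is merged into a child's subgraph when
            # their attribute information is the same.
            if attrs[child] == attrs[op]:
                merged = subgraph_of[child]
                break
        if merged is None:      # otherwise start a new subgraph
            merged = next_id
            next_id += 1
        subgraph_of[op] = merged
    return subgraph_of

order = ["conv1", "relu1", "conv2"]          # processing order
children = {"conv1": ["relu1"], "relu1": ["conv2"]}
attrs = {"conv1": "NPU", "relu1": "NPU", "conv2": "CPU"}
sg = partition_by_rules(order, children, attrs)
# conv1 and relu1 share a subgraph; conv2 lands in a separate one.
```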
5. The method of claim 1, wherein processing each computational subgraph through the processor corresponding to that computational subgraph comprises:
determining the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph;
obtaining a processing order of the computational subgraphs according to the association relationships among the computational subgraphs; and
processing each computational subgraph through its corresponding processor according to the processing order.
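Claim 5 derives a processing order for the subgraphs from their association relationships; a topological sort over subgraph dependencies is one natural reading of this step. The subgraph names and dependency edges below are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def subgraph_order(deps):
    """deps: subgraph -> set of subgraphs whose output it consumes."""
    return list(TopologicalSorter(deps).static_order())

# Subgraph B consumes A's output, and C consumes B's output.
order = subgraph_order({"B": {"A"}, "C": {"B"}})
print(order)  # ['A', 'B', 'C']
```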
6. The method according to any of claims 1-5, wherein the attribute information of an operator comprises one or more of: computation amount, operator function, and operator parameters.
7. An apparatus for partitioning neural network operators, comprising:
an acquisition module, configured to acquire attribute information of all operators in a neural network computational graph;
a determining module, configured to determine a processor corresponding to each operator according to the attribute information of that operator;
a dividing module, configured to divide the neural network computational graph into one or more computational subgraphs according to the types of the processors, wherein each computational subgraph corresponds to one type of processor; and
a processing module, configured to process each computational subgraph through the processor corresponding to that computational subgraph.
8. The apparatus of claim 7, wherein the dividing module is further configured to: divide the neural network computational graph into one or more computational subgraphs according to the attribute information of the operators and the association relationships between the operators in the neural network computational graph, wherein all operators in each computational subgraph correspond to one processor.
9. The apparatus of claim 7, wherein the processing module is further configured to: determine the association relationships among the computational subgraphs according to the association relationships among the operators in the neural network computational graph; obtain a processing order of the computational subgraphs according to the association relationships among the computational subgraphs; and process each computational subgraph through its corresponding processor according to the processing order.
10. The apparatus according to any of claims 7-9, wherein the attribute information of an operator comprises one or more of: computation amount, operator function, and operator parameters.
11. An apparatus for partitioning neural network operators, comprising: a processor, a memory, and a program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method for partitioning neural network operators according to any one of claims 1 to 6.
12. A readable storage medium, having stored thereon a program which, when executed by a processor, implements the steps of the method for partitioning neural network operators according to any one of claims 1 to 6.
CN202010757561.1A 2020-07-31 2020-07-31 Neural network operator dividing method and device and dividing equipment Pending CN111860820A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010757561.1A CN111860820A (en) 2020-07-31 2020-07-31 Neural network operator dividing method and device and dividing equipment
PCT/CN2021/109499 WO2022022670A1 (en) 2020-07-31 2021-07-30 Neural network computation graph processing method and apparatus, and processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010757561.1A CN111860820A (en) 2020-07-31 2020-07-31 Neural network operator dividing method and device and dividing equipment

Publications (1)

Publication Number Publication Date
CN111860820A true CN111860820A (en) 2020-10-30

Family

ID=72953458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010757561.1A Pending CN111860820A (en) 2020-07-31 2020-07-31 Neural network operator dividing method and device and dividing equipment

Country Status (2)

Country Link
CN (1) CN111860820A (en)
WO (1) WO2022022670A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112947933A (en) * 2021-02-24 2021-06-11 上海商汤智能科技有限公司 Operator execution method and device, computer equipment and storage medium
CN113051080A (en) * 2021-04-22 2021-06-29 杭州海康威视数字技术股份有限公司 Computation graph execution method and device and heterogeneous platform
WO2022022670A1 (en) * 2020-07-31 2022-02-03 北京灵汐科技有限公司 Neural network computation graph processing method and apparatus, and processing device
WO2022127603A1 (en) * 2020-12-14 2022-06-23 华为技术有限公司 Model processing method and related device
CN114691330A (en) * 2022-03-28 2022-07-01 北京百度网讯科技有限公司 Data processing method, data processing device, electronic equipment and storage medium
CN114819084A (en) * 2022-04-26 2022-07-29 北京百度网讯科技有限公司 Model reasoning method, device, equipment and storage medium
CN115358379A (en) * 2022-10-20 2022-11-18 腾讯科技(深圳)有限公司 Neural network processing method, neural network processing device, information processing method, information processing device and computer equipment
WO2023116312A1 (en) * 2021-12-24 2023-06-29 Oppo广东移动通信有限公司 Data processing method and apparatus, and computer device and storage medium
WO2023125628A1 (en) * 2021-12-31 2023-07-06 华为技术有限公司 Neural network model optimization method and apparatus, and computing device
WO2024022046A1 (en) * 2022-07-28 2024-02-01 华为技术有限公司 Deep learning system and method

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114816752A (en) * 2022-04-26 2022-07-29 山东云海国创云计算装备产业创新中心有限公司 Memory management method, system, equipment and computer readable storage medium
CN115268877B (en) * 2022-09-27 2022-12-13 之江实验室 Intermediate representation method and device for parallel execution of graph computation
US11782723B1 (en) 2022-09-27 2023-10-10 Zhejiang Lab Intermediate representation method and apparatus for parallel execution of graph computation
CN115796228B (en) * 2022-11-15 2024-04-05 北京百度网讯科技有限公司 Operator fusion method, device, equipment and storage medium
CN117576125B (en) * 2024-01-16 2024-04-16 芯瞳半导体技术(山东)有限公司 Neural network calculation graph segmentation method, device, equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108292241B (en) * 2015-10-28 2022-05-24 谷歌有限责任公司 Processing a computation graph
CN111461296B (en) * 2018-12-29 2023-09-22 中科寒武纪科技股份有限公司 Data processing method, electronic device, and readable storage medium
CN110689115B (en) * 2019-09-24 2023-03-31 安徽寒武纪信息科技有限公司 Neural network model processing method and device, computer equipment and storage medium
CN111062467B (en) * 2019-12-18 2023-05-12 开放智能机器(上海)有限公司 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler
CN111860820A (en) * 2020-07-31 2020-10-30 北京灵汐科技有限公司 Neural network operator dividing method and device and dividing equipment

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022022670A1 (en) * 2020-07-31 2022-02-03 北京灵汐科技有限公司 Neural network computation graph processing method and apparatus, and processing device
WO2022127603A1 (en) * 2020-12-14 2022-06-23 华为技术有限公司 Model processing method and related device
CN112947933A (en) * 2021-02-24 2021-06-11 上海商汤智能科技有限公司 Operator execution method and device, computer equipment and storage medium
CN113051080A (en) * 2021-04-22 2021-06-29 杭州海康威视数字技术股份有限公司 Computation graph execution method and device and heterogeneous platform
WO2023116312A1 (en) * 2021-12-24 2023-06-29 Oppo广东移动通信有限公司 Data processing method and apparatus, and computer device and storage medium
WO2023125628A1 (en) * 2021-12-31 2023-07-06 华为技术有限公司 Neural network model optimization method and apparatus, and computing device
CN114691330A (en) * 2022-03-28 2022-07-01 北京百度网讯科技有限公司 Data processing method, data processing device, electronic equipment and storage medium
CN114819084A (en) * 2022-04-26 2022-07-29 北京百度网讯科技有限公司 Model reasoning method, device, equipment and storage medium
CN114819084B (en) * 2022-04-26 2024-03-01 北京百度网讯科技有限公司 Model reasoning method, device, equipment and storage medium
WO2024022046A1 (en) * 2022-07-28 2024-02-01 华为技术有限公司 Deep learning system and method
CN115358379A (en) * 2022-10-20 2022-11-18 腾讯科技(深圳)有限公司 Neural network processing method, neural network processing device, information processing method, information processing device and computer equipment
CN115358379B (en) * 2022-10-20 2023-01-10 腾讯科技(深圳)有限公司 Neural network processing method, neural network processing device, information processing method, information processing device and computer equipment

Also Published As

Publication number Publication date
WO2022022670A1 (en) 2022-02-03

Similar Documents

Publication Publication Date Title
CN111860820A (en) Neural network operator dividing method and device and dividing equipment
CN108829610B (en) Memory management method and device in neural network forward computing process
Malapert et al. A constraint programming approach for a batch processing problem with non-identical job sizes
CN109582716A (en) Data visualization treating method and apparatus
US10671607B2 (en) Pipeline dependent tree query optimizer and scheduler
CN111104111A (en) Layout processing method and device for tree Canvas
CN116467061B (en) Task execution method and device, storage medium and electronic equipment
CN109684005A (en) Component similarity determines method and device in graphical interfaces
CN111191778B (en) Deep learning network processing method, device and compiler
CN104268243A (en) Position data processing method and device
CN113887396A (en) Image processing method and device, computer equipment and storage medium
CN105335135B (en) Data processing method and central node
Aparicio et al. A scalable parallel approach for subgraph census computation
CN111626311A (en) Heterogeneous graph data processing method and device
KR102326586B1 (en) Method and apparatus for processing large-scale distributed matrix product
CN115374914B (en) Distributed training method, parallel deep learning framework and electronic equipment
CN116382658A (en) Compiling method and device of AI model, computer equipment and storage medium
CN116933841A (en) Operator fusion method and device, electronic equipment and computer readable medium
CN112000478B (en) Method and device for distributing operation resources
CN113051080A (en) Computation graph execution method and device and heterogeneous platform
CN112783441A (en) Method and device for adjusting read-write speed limit of virtual machine disk and computing equipment
CN116484768B (en) System dynamics model construction method and device
KR102488614B1 (en) Method, apparatus and computer program for managing virtualized resources
CN109582295A (en) A kind of data processing method, device, storage medium and processor
CN114240729A (en) Point cloud clustering GPU optimization method and device based on graph structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination