CN111062467B - Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler - Google Patents


Info

Publication number
CN111062467B
CN111062467B (application CN201911312625.0A)
Authority
CN
China
Prior art keywords
node
neural network
graph
sub
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911312625.0A
Other languages
Chinese (zh)
Other versions
CN111062467A (en)
Inventor
黄明飞 (Huang Mingfei)
王海涛 (Wang Haitao)
吕春莹 (Lü Chunying)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Open Intelligent Machine Shanghai Co ltd
Original Assignee
Open Intelligent Machine Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Open Intelligent Machine Shanghai Co ltd filed Critical Open Intelligent Machine Shanghai Co ltd
Priority to CN201911312625.0A priority Critical patent/CN111062467B/en
Publication of CN111062467A publication Critical patent/CN111062467A/en
Application granted granted Critical
Publication of CN111062467B publication Critical patent/CN111062467B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Stored Programmes (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for automatically segmenting neural network subgraphs in an AI heterogeneous compiler. The method comprises: creating a node list; numbering all nodes of the neural network computation graph in advance and recording the input and output dependency relationships of each node; acquiring from the computation graph any node without input dependencies; deleting that node together with all connecting edges starting from it, and storing the deleted node's number in the node list; repeating the acquisition and deletion steps, each time storing the deleted node's number in the node list, until the computation graph is empty; and automatically segmenting the node list according to the attribute colors of its nodes to obtain each neural network computation subgraph. The beneficial effects are that the method applies to neural network computation graphs of arbitrarily complex structure, and solves the problems that such graphs are hard to split into subgraphs and that the subgraphs are hard to dispatch to different computing units.

Description

Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler
Technical Field
The invention relates to the technical field of AI (Artificial Intelligence), and in particular to a method for automatically segmenting neural network subgraphs in an AI heterogeneous compiler.
Background
With the continuous development of AI technology, neural network algorithms based on deep learning have become the mainstream mode of current AI research. In consideration of cost, power consumption, privacy and the like, more and more application scenarios migrate AI algorithm computation from the cloud to mobile embedded terminal devices.
Current embedded device chips often integrate multiple computing units: besides a general-purpose CPU (Central Processing Unit), there are GPUs (Graphics Processing Units) dedicated to accelerating AI algorithms, NPUs (Network Processing Units), FPGAs (Field-Programmable Gate Arrays) and other AI accelerators. The CPU offers good generality and programmability but poor performance and energy efficiency. AI accelerators offer high performance and energy efficiency but poor programmability: some are not programmable at all, some are programmable but complex to program with long development cycles, and some support only integer computation, which causes a large loss of network accuracy.
Combining the characteristics of the CPU and the AI accelerator, the current solution for AI heterogeneous computing is to segment the neural network into subgraphs, computing computation-intensive operators (such as the Convolution Op and the Fully-connected Op) on the AI accelerator, and computing operators with complex computation logic or operators unsuitable for integer computation (such as the pre-processing and post-processing operators of a detection network) on the CPU.
In view of this, how to automatically segment the neural network computation graph to realize heterogeneous computation is a problem to be solved at present. Existing schemes suit only simple chain-structured operator graphs, while more and more neural network computation graphs contain complex operator connection dependencies: a concat operator, for example, needs multiple inputs, and the residual module in a residual network likewise yields a non-chained computation graph. Finding a method that can segment an arbitrary neural network computation graph into subgraphs is a key challenge for an AI heterogeneous compiler.
The problem to be solved at present is therefore to find an algorithm that automatically divides subgraphs, placing operator nodes deployed to different computing units in the neural network computation graph into different subgraphs.
Disclosure of Invention
Aiming at the problems in the prior art, a method for automatically segmenting neural network subgraphs in an AI heterogeneous compiler is provided.
The specific technical scheme is as follows:
A method for automatically segmenting neural network subgraphs, applied to an AI heterogeneous compiler, is used to divide operator nodes deployed to different computing units in a neural network computation graph into different neural network computation subgraphs, and comprises the following steps:
step S1, creating a node list;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring from the neural network computation graph any node without any input dependency relationship;
step S4, deleting the node without input dependency together with all connecting edges starting from that node, and storing the number of the deleted node in the node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, each time storing the number of the deleted node in the node list;
step S6, acquiring the node list, automatically segmenting it according to the attribute colors of the nodes, and processing the input and output dependency relationships of the nodes to obtain each neural network computation subgraph.
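The steps S1 to S5 above amount to a topological sort by repeated removal of dependency-free nodes (Kahn's algorithm). A minimal Python sketch follows; the function and variable names are illustrative assumptions, not taken from the patent:

```python
from collections import deque

def build_node_list(nodes, edges):
    """Steps S1-S5: repeatedly remove a node with no remaining input
    dependency, recording the deleted node's number in the node list."""
    in_deps = {n: set() for n in nodes}   # input dependency relationships
    out_deps = {n: [] for n in nodes}     # output dependency relationships
    for src, dst in edges:                # step S2: record dependencies
        in_deps[dst].add(src)
        out_deps[src].append(dst)
    node_list = []                        # step S1: create the node list
    ready = deque(n for n in nodes if not in_deps[n])  # step S3
    while ready:
        n = ready.popleft()
        node_list.append(n)               # step S4: store the deleted number
        for m in out_deps[n]:             # delete edges starting at n
            in_deps[m].discard(n)
            if not in_deps[m]:            # m now has no input dependency
                ready.append(m)
    if len(node_list) != len(nodes):
        raise ValueError("graph contains a cycle")
    return node_list
```

On a diamond-shaped graph with edges 1-2, 1-3, 2-4 and 3-4, this yields the order [1, 2, 3, 4], respecting every dependency.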
Preferably, before the step S6, the method further includes: judging whether the neural network calculation graph is empty or not;
if not, turning to the step S3;
if yes, turning to the step S6.
Preferably, the step S6 includes the steps of:
step S60, creating a sub-graph queue warehouse to store sub-graph node queues;
step S61, acquiring a current first node from the node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, turning to step S64;
step S63, appending the current first node in the node list to that last sub-graph node queue;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, with the current first node in the node list as the first node of the new sub-graph node queue;
step S65, repeating the steps S61 to S64 until the node list is empty;
and step S66, acquiring each sub-graph node queue from the sub-graph queue warehouse, and connecting the nodes of each sub-graph node queue according to the input dependency relationship and the output dependency relationship of each node to obtain each neural network calculation sub-graph.
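Steps S60 to S66 group the ordered node list into sub-graph node queues by comparing each node's color with that of the last queue in the warehouse. A short sketch under illustrative naming assumptions (the names are not from the patent):

```python
def split_into_queues(node_list, color):
    """Steps S60-S65: group consecutive same-color nodes of the ordered
    node list into sub-graph node queues held in a queue warehouse."""
    warehouse = []                        # step S60: sub-graph queue warehouse
    for n in node_list:                   # steps S61-S65: consume the list
        if warehouse and color[warehouse[-1][-1]] == color[n]:
            warehouse[-1].append(n)       # step S63: same color, same queue
        else:
            warehouse.append([n])         # step S64: open a new queue
    return warehouse
```

Each resulting queue can then be reconnected into a computation subgraph using the recorded input and output dependency relationships (step S66).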
Preferably, before the step S61, the method further includes:
judging whether the node list is empty or not;
if not, turning to the step S61;
if yes, go to step S66.
Preferably, in the step S64, the color of the new sub-graph node queue is the same as the color of the current first node in the node list.
Preferably, in the method for automatically segmenting neural network subgraphs, the neural network computation graph is divided into a plurality of neural network computation subgraphs, which are deployed on different computing units for computation.
Preferably, the different computing units include a central processor, a graphics processor, and a digital signal processor.
The invention provides an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler, which has the beneficial effects that:
(1) According to the input and output dependency relationships of the nodes of the neural network computation graph, the nodes are automatically ordered to obtain each neural network computation subgraph. This solves the difficulty of segmenting subgraphs from graphs with complex dependency structures, preserves the execution-order dependencies of the nodes, and applies to computation graphs of arbitrarily complex structure;
(2) The neural network computation graph is divided into several neural network computation subgraphs deployed for computation on different computing units. This solves the subgraph-processing problem for different computing units, applies equally to two computing units and to more, and has good generality, so that the full computing power of the hardware chip is exploited and efficient heterogeneous computation of the AI algorithm on real hardware is achieved.
Drawings
Embodiments of the present invention will now be described more fully with reference to the accompanying drawings. The drawings, however, are for illustration and description only and are not intended as a definition of the limits of the invention.
FIG. 1 is a schematic diagram of a neural network computational graph in the prior art;
FIG. 2 is a flowchart illustrating steps of a method for automatically slicing a neural network subgraph applied to an AI heterogeneous compiler according to an embodiment of the present invention;
FIG. 3 is a flowchart of the steps of a first embodiment of the method for automatically segmenting neural network subgraphs in an AI heterogeneous compiler according to an embodiment of the present invention;
FIG. 4 is a flowchart of the steps of a second embodiment of the method according to an embodiment of the present invention;
FIG. 5 shows the division of a neural network computation graph into neural network computation subgraphs in the second embodiment of the method according to an embodiment of the present invention;
FIG. 6 is a flowchart of the steps of a third embodiment of the method according to an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
In the prior art, the neural network computation graph may be regarded as a directed acyclic graph, as shown in FIG. 1, in which each node is an operator and each arrow represents a dependency between operators. Assuming the computation graph is executed on two devices, a central processor and a graphics processor, in FIG. 1 node 2, node 3, node 5 and node 6 may be computed on the central processor, while node 4 may be computed on the graphics processor.
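Such a directed acyclic graph and its device assignment can be encoded compactly. The sketch below is an assumption for illustration: FIG. 1's exact edge set is not reproduced here, so a simple chain topology is used, and all names are hypothetical:

```python
# A FIG. 1-style computational graph: nodes are operators, edges are data
# dependencies, and each node carries the computing unit ("color") that
# will execute it. The chain topology here is illustrative only.
graph = {
    "nodes": [1, 2, 3, 4, 5, 6],
    "edges": [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "unit": {1: "cpu", 2: "cpu", 3: "cpu", 4: "gpu", 5: "cpu", 6: "cpu"},
}

def input_deps(g, n):
    """Input dependency relationship of node n (its predecessors)."""
    return [s for s, d in g["edges"] if d == n]

def output_deps(g, n):
    """Output dependency relationship of node n (its successors)."""
    return [d for s, d in g["edges"] if s == n]
```

Recording both dependency directions per node is what later allows deleted nodes' numbers to be replayed in a valid execution order.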
While such a scheme suits only simple chain-structured operator graphs, more and more neural network computation graphs contain complex operator connection dependencies: a concat operator, for example, needs multiple inputs, and the residual module in a residual network likewise yields a non-chained computation graph. Finding a method that can segment an arbitrary neural network computation graph into subgraphs is a key challenge for an AI heterogeneous compiler.
Aiming at these defects of the prior art, the problem to be solved is to find an algorithm that automatically divides subgraphs, placing operator nodes of different colors in the neural network computation graph into different subgraphs.
Therefore, the invention provides a method for automatically segmenting neural network subgraphs in an AI heterogeneous compiler, used to divide operator nodes deployed to different computing units in a neural network computation graph into different neural network computation subgraphs, comprising the following steps:
step S1, creating a node list;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring from the neural network computation graph any node without any input dependency relationship;
step S4, deleting the node without input dependency together with all connecting edges starting from that node, and storing the number of the deleted node in the node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, each time storing the number of the deleted node in the node list;
step S6, acquiring the node list, automatically segmenting it according to the attribute color of each node (i.e. the type of computing unit to which the node's operator is assigned), and processing the input and output dependency relationships of the nodes to obtain each neural network computation subgraph.
Through this technical scheme, the method applies to neural network computation graphs of arbitrarily complex structure: according to the input and output dependency relationships of the nodes, the nodes of the computation graph are automatically ordered to obtain each computation subgraph, which solves the difficulty of segmenting subgraphs from graphs with complex dependency structures and preserves the execution-order dependencies of the nodes.
Further, the neural network computation graph is divided into several neural network computation subgraphs deployed for computation on different computing units, which include a central processing unit, a graphics processor and a digital signal processor. This solves the problem of segmenting the computation graph across different computing units in the AI heterogeneous compiler; the method applies equally to two computing units and to more, and has good generality, so that the full computing power of the hardware chip is exploited and efficient heterogeneous computation of the AI algorithm on real hardware is achieved.
In the above technical solution, as a preferred embodiment, as shown in fig. 3, before step S6, the method further includes: judging whether the neural network calculation graph is empty or not;
if not, turning to step S3;
if yes, go to step S6.
In this embodiment, the purpose of judging whether the neural network computation graph is empty is to ensure that every node without remaining input dependencies is acquired and its number stored in the node list, so that the list can then be automatically segmented by node color and the input and output dependency relationships of the nodes processed to obtain each neural network computation subgraph.
Furthermore, the method for automatically segmenting neural network subgraphs applies to computation graphs of arbitrarily complex structure, solves the difficulty of segmenting subgraphs from graphs with complex dependency structures, and preserves the execution-order dependencies of the nodes.
In the above technical solution, as a preferred embodiment, as shown in fig. 4, step S6 includes the following steps:
step S60, creating a sub-graph queue warehouse to store sub-graph node queues;
step S61, acquiring a current first node from a node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, turning to step S64;
step S63, appending the current first node in the node list to that last sub-graph node queue;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, with the current first node in the node list as its first node, the color of the new sub-graph node queue being the same as the color of that node;
step S65, repeating the steps S61 to S64 until the node list is empty;
and step S66, each sub-graph node queue is obtained from the sub-graph queue warehouse, and the nodes of each sub-graph node queue are connected according to the input dependency relationship and the output dependency relationship of each node so as to obtain each neural network calculation sub-graph.
In this embodiment, taking the neural network computation graph of FIG. 1 as an example, steps S1 to S5 produce the node list shown in Table 1. Node 1, node 2, node 3, node 5 and node 6 share one color, while node 4 has another; after step S6, the node list is segmented by these colors to obtain sub-graph A, sub-graph B and sub-graph C, as shown in FIG. 5.
Table 1
1 2 3 4 5 6
Further, the nodes of the neural network computation graph are automatically ordered according to their input and output dependency relationships, and the node list is automatically divided into the neural network computation subgraphs according to the color attribute of each node.
Furthermore, the method for automatically segmenting neural network subgraphs applies to computation graphs of arbitrarily complex structure, solves the difficulty of segmenting subgraphs from graphs with complex dependency structures, and preserves the execution-order dependencies of the nodes.
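The embodiment above (FIG. 1, Table 1) can be reproduced end to end in a short sketch. Since FIG. 1 is not shown here, the edge set is an assumed chain; the unit assignment follows the text (node 1, 2, 3, 5 and 6 on the CPU, node 4 on the GPU):

```python
def segment(nodes, edges, unit):
    """Topologically order the nodes (steps S1-S5), then cut the order
    wherever the assigned computing unit, i.e. the node color, changes
    (step S6)."""
    pending = {n: {s for s, d in edges if d == n} for n in nodes}
    order = []
    while pending:
        n = next(k for k in nodes if k in pending and not pending[k])
        order.append(n)                 # delete the dependency-free node
        for deps in pending.values():
            deps.discard(n)
        del pending[n]
    subgraphs = []
    for n in order:                     # group consecutive same-color nodes
        if subgraphs and unit[subgraphs[-1][-1]] == unit[n]:
            subgraphs[-1].append(n)
        else:
            subgraphs.append([n])
    return subgraphs

units = {1: "cpu", 2: "cpu", 3: "cpu", 4: "gpu", 5: "cpu", 6: "cpu"}
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]  # assumed topology
```

With these assumptions, segment([1, 2, 3, 4, 5, 6], edges, units) returns [[1, 2, 3], [4], [5, 6]], matching sub-graph A, sub-graph B and sub-graph C of FIG. 5.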
In the above technical solution, as a preferred embodiment, as shown in fig. 6, before step S61, the method further includes:
judging whether the node list is empty or not;
if not, turning to step S61;
if yes, go to step S66.
In this embodiment, the purpose of judging whether the node list is empty is to ensure that the color of every node in the list is matched against the colors of the sub-graph node queues in order, with no node missed; each sub-graph node queue is then obtained from the sub-graph queue warehouse and its nodes are connected according to their input and output dependency relationships to obtain each neural network computation subgraph.
Furthermore, the method for automatically segmenting neural network subgraphs applies to computation graphs of arbitrarily complex structure, solves the difficulty of segmenting subgraphs from graphs with complex dependency structures, and preserves the execution-order dependencies of the nodes.
The foregoing description is only illustrative of the preferred embodiments of the present invention and is not to be construed as limiting the scope of the invention, and it will be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the description and illustrations of the present invention, and are intended to be included within the scope of the present invention.

Claims (7)

1. A method for automatically segmenting neural network subgraphs, applied to an AI heterogeneous compiler and used to divide operator nodes deployed to different computing units in a neural network computation graph into different neural network computation subgraphs, characterized by comprising the following steps:
step S1, creating a node list;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring from the neural network computation graph any node without any input dependency relationship;
step S4, deleting the node without input dependency together with all connecting edges starting from that node, and storing the number of the deleted node in the node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, each time storing the number of the deleted node in the node list;
step S6, acquiring the node list, automatically segmenting it according to the attribute colors of the nodes, and processing the input and output dependency relationships of the nodes to obtain each neural network computation subgraph.
2. The automatic slicing neural network subgraph method of claim 1 further comprising, prior to step S6: judging whether the neural network calculation graph is empty or not;
if not, turning to the step S3;
if yes, turning to the step S6.
3. The automatic slicing neural network subgraph method of claim 1 wherein step S6 includes the steps of:
step S60, creating a sub-graph queue warehouse to store sub-graph node queues;
step S61, acquiring a current first node from the node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, turning to step S64;
step S63, appending the current first node in the node list to that last sub-graph node queue;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, with the current first node in the node list as the first node of the new sub-graph node queue;
step S65, repeating the steps S61 to S64 until the node list is empty;
and step S66, acquiring each sub-graph node queue from the sub-graph queue warehouse, and connecting the nodes of each sub-graph node queue according to the input dependency relationship and the output dependency relationship of each node to obtain each neural network calculation sub-graph.
4. The automatic slicing neural network subgraph method of claim 3 further comprising, prior to step S61:
judging whether the node list is empty or not;
if not, turning to the step S61;
if yes, go to step S66.
5. The automatic slicing neural network subgraph method of claim 3 wherein in step S64 the color of the new subgraph node queue is the same as the color of the current first node in the node list.
6. The method for automatically segmenting neural network subgraphs of claim 1, wherein the neural network computation graph is divided into a plurality of neural network computation subgraphs deployed for computation on different computing units.
7. The automatic slicing neural network subgraph method of claim 6 wherein different ones of the computing units include a central processor, a graphics processor and a digital signal processor.
CN201911312625.0A 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler Active CN111062467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911312625.0A CN111062467B (en) 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911312625.0A CN111062467B (en) 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler

Publications (2)

Publication Number Publication Date
CN111062467A CN111062467A (en) 2020-04-24
CN111062467B true CN111062467B (en) 2023-05-12

Family

ID=70301038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911312625.0A Active CN111062467B (en) 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler

Country Status (1)

Country Link
CN (1) CN111062467B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738434B (en) * 2020-06-03 2023-04-07 中国科学院计算技术研究所 Method for executing deep neural network on heterogeneous processing unit
CN111860820A (en) * 2020-07-31 2020-10-30 北京灵汐科技有限公司 Neural network operator dividing method and device and dividing equipment
CN112070213A (en) * 2020-08-28 2020-12-11 Oppo广东移动通信有限公司 Neural network model optimization method, device, equipment and storage medium
CN112598121A (en) * 2020-12-21 2021-04-02 北京时代民芯科技有限公司 Efficient operator optimization method for deep learning compiler
CN112711422B (en) * 2020-12-31 2024-01-19 北京清微智能科技有限公司 Neural network compiling optimization method and system
WO2023122854A1 (en) * 2021-12-27 2023-07-06 华为技术有限公司 Data processing method and apparatus
WO2024178522A1 (en) * 2023-02-27 2024-09-06 华为技术有限公司 Computational graph splitting method and related device
CN117576125B (en) * 2024-01-16 2024-04-16 芯瞳半导体技术(山东)有限公司 Neural network calculation graph segmentation method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018206261A (en) * 2017-06-08 2018-12-27 日本電信電話株式会社 Word division estimation model learning device, word division device, method and program
CN109902819A (en) * 2019-02-12 2019-06-18 Oppo广东移动通信有限公司 Neural computing method, apparatus, mobile terminal and storage medium
WO2019177824A1 (en) * 2018-03-14 2019-09-19 Microsoft Technology Licensing, Llc Hardware accelerated neural network subgraphs


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FANG Liangda; YU Yongquan. Attribute dependency theory and its application in neural networks. Computer Applications. 2010, (04), full text. *

Also Published As

Publication number Publication date
CN111062467A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
CN111062467B (en) Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler
WO2017116924A1 (en) Neural network training performance optimization framework
WO2021190761A1 (en) Parallel computing scheme generation for neural networks
US20200184366A1 (en) Scheduling task graph operations
US11782724B2 (en) Parallel decision system and method for distributed data processing
WO2019085709A1 (en) Pooling method and system applied to convolutional neural network
US20150379158A1 (en) Systems and methods for pattern matching and relationship discovery
CN112100450A (en) Graph calculation data segmentation method, terminal device and storage medium
CN113886092A (en) Computation graph execution method and device and related equipment
Duan et al. Computation offloading scheduling for deep neural network inference in mobile computing
CN114626552A (en) Segmentation method and device of machine learning model
CN111159577A (en) Community division method and device, storage medium and electronic device
CN114341894A (en) Hyper-parameter recommendation method for machine learning method
CN114003775A (en) Graph data processing and querying method and system
US20160328490A1 (en) System and method for identifying clones
CN113691403B (en) Topology node configuration method, related device and computer program product
CN116933841A (en) Operator fusion method and device, electronic equipment and computer readable medium
CN103268614A (en) Generation method for prospect spectrogram for multi-prospect co-segmentation
CN113051080A (en) Computation graph execution method and device and heterogeneous platform
CN109388428B (en) Layer traversal method, control device and data processing system
Yazıcıoğlu et al. Decentralized degree regularization for multi-agent networks
Taoka et al. Heuristic algorithms for the marking construction problem of Petri nets
Alessio Another time-complexity analysis for maximal clique enumeration algorithm CLIQUES
CN115774800B (en) NUMA architecture-based time-varying graph processing method, electronic equipment and medium
CN111260038B (en) Implementation method and device of convolutional neural network, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant