CN111062467A - Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler


Info

Publication number
CN111062467A
CN111062467A
Authority
CN
China
Prior art keywords
node
neural network
graph
nodes
queue
Prior art date
Legal status
Granted
Application number
CN201911312625.0A
Other languages
Chinese (zh)
Other versions
CN111062467B (en)
Inventor
黄明飞
王海涛
吕春莹
Current Assignee
Open Intelligent Machine Shanghai Co ltd
Original Assignee
Open Intelligent Machine Shanghai Co ltd
Priority date
Filing date
Publication date
Application filed by Open Intelligent Machine Shanghai Co ltd filed Critical Open Intelligent Machine Shanghai Co ltd
Priority to CN201911312625.0A
Publication of CN111062467A
Application granted
Publication of CN111062467B
Status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation using electronic means
    • G06N 3/08: Learning methods
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler, which comprises the steps of: creating a node list; numbering all nodes of the neural network computation graph in advance, and recording the input and output dependency relationships of each node; acquiring any node without input dependencies from the neural network computation graph; deleting that node together with all connecting edges starting from it, and storing the number of the deleted node in the node list; repeating the acquisition and deletion steps until the neural network computation graph is empty; and automatically segmenting according to the node attribute colors in the node list, so as to obtain each neural network computation subgraph. Advantages: the method is applicable to neural network computation graphs of arbitrarily complex structure, and solves the problems that neural network computation graphs are difficult to segment into subgraphs and that subgraphs are difficult to assign to different computing units.

Description

Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler
Technical Field
The invention relates to the technical field of AI (Artificial Intelligence), and in particular to an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler.
Background
With the continuous development of AI technology, neural network algorithms based on deep learning have become the mainstream of current AI research. In view of cost, power consumption, privacy and similar concerns, more and more application scenarios migrate AI computation from the cloud to mobile and embedded terminal devices.
At present, an embedded device chip often contains several coexisting computing units: besides a general-purpose CPU (Central Processing Unit), there are AI accelerators dedicated to accelerating AI algorithms, such as the GPU (Graphics Processing Unit), NPU (Network Processing Unit) and FPGA (Field-Programmable Gate Array). The CPU offers good generality and programmability but poor performance and energy efficiency, while AI accelerators offer high performance and energy efficiency but poor programmability: some AI accelerators cannot be programmed at all; some can be programmed, but programming is complex and the development cycle long; and some only support integer computation, which causes a large loss of network precision.
Combining the characteristics of the CPU and the AI accelerator, a current solution for AI heterogeneous computation is to segment the neural network into subgraphs: compute-intensive operators (such as convolution and fully-connected operators) run on the AI accelerator, while operators with complex computation logic, or operators unsuited to integer computation (such as the pre- and post-processing operators of detection networks), run on the CPU.
In view of this, how to perform automatic subgraph segmentation on a neural network computation graph in order to implement heterogeneous computation is a problem to be solved. Existing schemes are only suitable for computation graphs whose operators are connected in a simple chain, whereas more and more computation graphs contain complex operator connection dependencies: a concat operator, for example, needs multiple inputs, and the residual modules of a residual network likewise yield non-chain computation graphs. Finding a method that can perform subgraph segmentation on an arbitrary neural network computation graph is a major challenge for an AI heterogeneous compiler.
The problem to be solved, then, is to find an algorithm that automatically segments subgraphs, assigning the operator nodes of a neural network computation graph that are deployed to different computing units to different subgraphs.
Disclosure of Invention
Aiming at the problems in the prior art, an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler is provided.
The specific technical scheme is as follows:
an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler is used for segmenting operator nodes deployed to different computing units in a neural network computing graph to different neural network computing subgraphs, and comprises the following steps:
step S1, a node list is created;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring any node without input dependency relationship from the neural network computational graph;
step S4, deleting the node without input dependency relationship and all the connecting edges taking that node as a starting point, and storing the number of the deleted node into the node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, storing the numbers of the deleted nodes into the node list;
and step S6, acquiring the node list, automatically segmenting according to the node attribute colors in the node list, and acquiring the input dependency relationship and the output dependency relationship related to the nodes to process to obtain each neural network computation subgraph.
Preferably, before the step S6, the method further includes: judging whether the neural network computation graph is empty or not;
if not, go to step S3;
if yes, the process goes to step S6.
Preferably, the step S6 includes the following steps:
step S60, creating a subgraph queue warehouse to store the subgraph node queue;
step S61, obtaining the current first node from the node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, go to step S64;
step S63, adding the current first node in the node list to the subgraph queue warehouse;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, and using the current first node in the node list as the first node of the new sub-graph node queue;
step S65, repeating the steps S61 to S64 until the node list is empty;
step S66, obtaining each sub-graph node queue from the sub-graph queue warehouse, and connecting the nodes of each sub-graph node queue according to the input dependency and the output dependency of each node to obtain each neural network computation sub-graph.
Preferably, before the step S61, the method further includes:
judging whether the node list is empty or not;
if not, go to step S61;
if yes, the process goes to step S66.
Preferably, in step S64, the color of the new sub-graph node queue is the same as the color of the current first node in the node list.
Preferably, in the method for automatically segmenting the neural network subgraph, the neural network computation graph is divided into a plurality of neural network computation subgraphs, and the neural network computation subgraphs are deployed on different computation units for computation.
Preferably, the different computing units comprise a central processing unit, a graphics processor and a digital signal processor.
The invention provides an automatic segmentation neural network subgraph method applied to an AI heterogeneous compiler, which has the advantages that:
(1) the nodes of the neural network computation graph are automatically ordered according to their input and output dependency relationships, so that each neural network computation subgraph is obtained; this solves the problem that complex dependency structures in the computation graph make subgraphs hard to segment, preserves the running order of the nodes, and applies to neural network computation graphs of arbitrarily complex structure;
(2) the neural network computation graph is divided into a plurality of computation subgraphs deployed on different computing units, which solves the problem of segmenting subgraphs for the different computing units targeted by the AI heterogeneous compiler; the method applies equally to two computing units or to many, has good universality, fully exploits the optimal computing power of the whole hardware chip, and completes efficient heterogeneous computation of the AI algorithm on actual hardware.
Drawings
Embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings. The drawings are, however, to be regarded as illustrative and explanatory only and are not restrictive of the scope of the invention.
FIG. 1 is a diagram of a neural network computational graph in the prior art;
FIG. 2 is a flowchart illustrating steps of an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating the steps of a first embodiment of the automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler according to the present invention;
FIG. 4 is a flowchart illustrating the steps of a second embodiment of the automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler according to the present invention;
FIG. 5 is a diagram of the neural network computation subgraphs into which the computation graph is segmented in the second embodiment of the method according to the present invention;
FIG. 6 is a flowchart illustrating the steps of a third embodiment of the automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
In the prior art, the neural network computation graph can be regarded as a directed acyclic graph, as shown in fig. 1: each node is an operator, and each arrow represents a dependency between operators. Assuming that the computation graph is executed using a central processing unit and a graphics processor, in fig. 1 the nodes 1, 2, 3, 5 and 6 may be processed on the central processing unit, and the node 4 on the graphics processor.
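Such a colored computation graph can be represented with a small node table. The sketch below is illustrative only: the `Node` class, its field names, and the simple chain topology standing in for fig. 1 are assumptions, since the actual edges of the figure are not reproduced in the text.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An operator node of the computation graph, colored by compute unit."""
    number: int
    color: str                                   # e.g. "CPU" or "GPU"
    inputs: list = field(default_factory=list)   # numbers of nodes this node depends on
    outputs: list = field(default_factory=list)  # numbers of nodes that depend on this node

def build_graph(edges, colors):
    """Build a node table from (src, dst) edges and a {number: color} map."""
    nodes = {n: Node(n, c) for n, c in colors.items()}
    for src, dst in edges:
        nodes[src].outputs.append(dst)
        nodes[dst].inputs.append(src)
    return nodes

# Hypothetical chain 1 -> 2 -> 3 -> 4 -> 5 -> 6 standing in for fig. 1,
# with node 4 assigned to the graphics processor as the text describes.
graph = build_graph(
    edges=[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    colors={1: "CPU", 2: "CPU", 3: "CPU", 4: "GPU", 5: "CPU", 6: "CPU"},
)
```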
Such an approach is only suitable for computation graphs whose operators are connected in a simple chain, while more and more computation graphs contain complex operator connection dependencies: a concat operator needs multiple inputs, and the residual modules in a residual network likewise yield non-chain computation graphs. Finding a method that can perform subgraph segmentation on an arbitrary neural network computation graph is a major challenge for an AI heterogeneous compiler.
Aiming at these defects of the prior art, the problem to be solved is to find an algorithm that automatically segments subgraphs, assigning operator nodes of different colors in a neural network computation graph to different subgraphs.
Therefore, the invention provides an automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler, used for segmenting the operator nodes of a neural network computation graph that are deployed to different computing units into different neural network computation subgraphs, wherein the method comprises the following steps:
step S1, a node list is created;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring any node without input dependency relationship from the neural network computational graph;
step S4, deleting the nodes without the input dependency relationship and all the connecting edges taking the nodes without the input dependency relationship as the starting points, and storing the serial numbers of the deleted nodes into a node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, storing the numbers of the deleted nodes into the node list;
and step S6, acquiring a node list, automatically segmenting according to the node attribute color (the type of the computing unit to which the node operator belongs) in the node list, and acquiring the input dependency relationship and the output dependency relationship related to the node to process to obtain each neural network computing subgraph.
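Steps S1 to S5 amount to a topological sort by repeated deletion (essentially Kahn's algorithm). The following is a minimal sketch, assuming the graph is given as a {number: color} map and a (src, dst) edge list; function and variable names are illustrative, and ties among nodes without input dependencies are broken by smallest number purely for determinism, whereas the method itself allows any such node:

```python
def topological_node_list(colors, edges):
    """Steps S1-S5: repeatedly delete a node that has no remaining input
    dependencies and record its number, until the graph is empty."""
    indegree = {n: 0 for n in colors}          # step S2: dependency bookkeeping
    outputs = {n: [] for n in colors}
    for src, dst in edges:
        indegree[dst] += 1
        outputs[src].append(dst)
    node_list = []                             # step S1: create the node list
    remaining = set(colors)
    while remaining:                           # step S5: loop until graph is empty
        n = min(m for m in remaining if indegree[m] == 0)  # step S3: any free node
        for dst in outputs[n]:                 # step S4: delete its outgoing edges
            indegree[dst] -= 1
        remaining.remove(n)
        node_list.append(n)                    # step S4: store the deleted number
    return node_list
```

Note that which free node step S3 picks when several are available changes the resulting order, and hence how many subgraphs step S6 later produces; preferring nodes of the color currently being emitted would tend to minimize the subgraph count.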
According to the above technical scheme, the automatic neural network subgraph segmentation method is applicable to neural network computation graphs of arbitrarily complex structure: the nodes are automatically ordered according to their input and output dependency relationships, so that each neural network computation subgraph is obtained. This solves the problem that complex dependency structures make subgraphs hard to segment, while preserving the dependency order in which the nodes run.
Further, the neural network computation graph is divided into a plurality of computation subgraphs deployed on different computing units, where the different computing units include a central processing unit, a graphics processor and a digital signal processor. This solves the problem of segmenting subgraphs for the different computing units targeted by the AI heterogeneous compiler, applies to two computing units as well as to many, and has good universality, fully exploiting the optimal computing power of the whole hardware chip and completing efficient heterogeneous computation of the AI algorithm on actual hardware.
In the above technical solution, as a preferred embodiment, as shown in fig. 3, before step S6, the method further includes: judging whether the neural network calculation graph is empty or not;
if not, go to step S3;
if yes, the process goes to step S6.
In this embodiment, the purpose of determining whether the neural network computation graph is empty is to ensure that every node without input dependencies is acquired from the graph, without omission, and that the numbers of all deleted nodes are stored in the node list before the list is automatically segmented by color and processed, together with the input and output dependency relationships of the nodes, into the neural network computation subgraphs.
Furthermore, the method for automatically segmenting the neural network subgraph is suitable for any neural network computational graph with a complex structure, solves the problem that the subgraph is not well segmented due to a complex dependency structure in the neural network computational graph, and also ensures the operation sequence of the nodes.
In the above technical solution, as shown in fig. 4, the step S6 preferably includes the following steps:
step S60, creating a subgraph queue warehouse to store the subgraph node queue;
step S61, obtaining the current first node from the node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, go to step S64;
step S63, adding the current first node in the node list to the subgraph queue warehouse;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, and using the current first node in the node list as the first node of the new sub-graph node queue, wherein the color of the new sub-graph node queue is the same as the color of the current first node in the node list;
step S65, repeating the steps S61 to S64 until the node list is empty;
and step S66, obtaining each sub-graph node queue from the sub-graph queue warehouse, and connecting the nodes of each sub-graph node queue according to the input dependency and the output dependency of each node to obtain each neural network computation sub-graph.
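Steps S60 to S66 group the dependency-ordered node list into maximal runs of same-colored nodes. A minimal sketch under the same illustrative assumptions as above (names are hypothetical; the warehouse is a plain list of (color, queue) pairs):

```python
def split_into_subgraph_queues(node_list, colors):
    """Steps S60-S66: walk the dependency-ordered node list and open a new
    subgraph node queue whenever the node color differs from the color of
    the last queue in the warehouse."""
    warehouse = []                                          # step S60
    for node in node_list:                                  # steps S61 / S65
        if warehouse and warehouse[-1][0] == colors[node]:  # step S62
            warehouse[-1][1].append(node)                   # step S63: same color
        else:
            warehouse.append((colors[node], [node]))        # step S64: new queue
    return warehouse

# Fig. 1 example, again assuming a simple chain with node 4 on the GPU:
colors = {1: "CPU", 2: "CPU", 3: "CPU", 4: "GPU", 5: "CPU", 6: "CPU"}
queues = split_into_subgraph_queues([1, 2, 3, 4, 5, 6], colors)
# queues == [("CPU", [1, 2, 3]), ("GPU", [4]), ("CPU", [5, 6])]
# i.e. subgraph A, subgraph B and subgraph C of fig. 5
```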
In this embodiment, taking the neural network computation graph of fig. 1 as an example, the node list obtained after steps S1-S5 is shown in Table 1. Node 1, node 2, node 3, node 5 and node 6 share one color, and node 4 has another; after step S6 segments the node list by color, subgraph A, subgraph B and subgraph C are obtained, as shown in fig. 5.
Table 1
1 2 3 4 5 6
Further, the nodes of the neural network computation graph are automatically ordered according to their input and output dependency relationships, and the node list is automatically divided into the neural network computation subgraphs according to the color attributes of the nodes.
Furthermore, the method for automatically segmenting the neural network subgraph is suitable for any neural network computational graph with a complex structure, solves the problem that the subgraph is not well segmented due to a complex dependency structure in the neural network computational graph, and also ensures the operation sequence of the nodes.
In the above technical solution, as a preferred embodiment, as shown in fig. 6, before step S61, the method further includes:
judging whether the node list is empty or not;
if not, go to step S61;
if yes, the process goes to step S66.
In this embodiment, the purpose of determining whether the node list is empty is to ensure that the color of every node in the node list is matched in turn against the colors of the subgraph node queues, without omission; each subgraph node queue is then obtained from the subgraph queue warehouse, and the nodes of each queue are connected according to their input and output dependency relationships to obtain each neural network computation subgraph.
Furthermore, the method for automatically segmenting the neural network subgraph is suitable for any neural network computational graph with a complex structure, solves the problem that the subgraph is not well segmented due to a complex dependency structure in the neural network computational graph, and also ensures the operation sequence of the nodes.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (7)

1. An automatic neural network subgraph segmentation method applied to an AI heterogeneous compiler is used for segmenting operator nodes deployed to different computing units in a neural network computing graph to different neural network computing subgraphs, and is characterized by comprising the following steps:
step S1, a node list is created;
step S2, numbering all nodes of the neural network computation graph in advance, and recording the input dependency relationship and the output dependency relationship of each node;
step S3, acquiring any node without input dependency relationship from the neural network computational graph;
step S4, deleting the node without input dependency relationship and all the connecting edges taking that node as a starting point, and storing the number of the deleted node into the node list;
step S5, repeating the step S3 and the step S4 until the neural network computation graph is empty, and storing the deleted node numbers into the node list;
and step S6, acquiring the node list, automatically segmenting according to the node attribute colors in the node list, and acquiring the input dependency relationship and the output dependency relationship related to the nodes to process to obtain each neural network computation subgraph.
2. The method for automatically segmenting neural network subgraph according to claim 1, further comprising, before the step S6: judging whether the neural network computation graph is empty or not;
if not, go to step S3;
if yes, the process goes to step S6.
3. The method for automatically segmenting neural network subgraph according to claim 1, wherein the step S6 comprises the following steps:
step S60, creating a subgraph queue warehouse to store the subgraph node queue;
step S61, obtaining the current first node from the node list;
step S62, judging whether the color of the last sub-graph node queue in the sub-graph queue warehouse is the same as the color of the current first node in the node list;
if yes, go to step S63;
if not, go to step S64;
step S63, adding the current first node in the node list to the subgraph queue warehouse;
step S64, creating a new sub-graph node queue in the sub-graph queue warehouse, and using the current first node in the node list as the first node of the new sub-graph node queue;
step S65, repeating the steps S61 to S64 until the node list is empty;
step S66, obtaining each sub-graph node queue from the sub-graph queue warehouse, and connecting the nodes of each sub-graph node queue according to the input dependency and the output dependency of each node to obtain each neural network computation sub-graph.
4. The method for automatically segmenting a neural network subgraph according to claim 3, further comprising, before said step S61:
judging whether the node list is empty or not;
if not, go to step S61;
if yes, the process goes to step S66.
5. The method according to claim 3, wherein in step S64, the color of the new sub-graph node queue is the same as the color of the current first node in the node list.
6. The method according to claim 1, wherein the neural network computational graph is divided into a plurality of neural network computational subgraphs and deployed for computation on different computational units.
7. The method according to claim 6, wherein the different computing units comprise a central processing unit, a graphics processor and a digital signal processor.
CN201911312625.0A 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler Active CN111062467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911312625.0A CN111062467B (en) 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler


Publications (2)

Publication Number Publication Date
CN111062467A true CN111062467A (en) 2020-04-24
CN111062467B CN111062467B (en) 2023-05-12

Family

ID=70301038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911312625.0A Active CN111062467B (en) 2019-12-18 2019-12-18 Automatic neural network subgraph segmentation method applied to AI heterogeneous compiler

Country Status (1)

Country Link
CN (1) CN111062467B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018206261A (en) * 2017-06-08 2018-12-27 日本電信電話株式会社 Word division estimation model learning device, word division device, method and program
CN109902819A (en) * 2019-02-12 2019-06-18 Oppo广东移动通信有限公司 Neural computing method, apparatus, mobile terminal and storage medium
WO2019177824A1 (en) * 2018-03-14 2019-09-19 Microsoft Technology Licensing, Llc Hardware accelerated neural network subgraphs


Non-Patent Citations (1)

Title
方良达 (Fang Liangda); 余永权 (Yu Yongquan): "Attribute dependency theory and its application in neural networks" *

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN111738434A (en) * 2020-06-03 2020-10-02 中国科学院计算技术研究所 Method for executing deep neural network on heterogeneous processing unit
CN111738434B (en) * 2020-06-03 2023-04-07 中国科学院计算技术研究所 Method for executing deep neural network on heterogeneous processing unit
WO2022022670A1 (en) * 2020-07-31 2022-02-03 北京灵汐科技有限公司 Neural network computation graph processing method and apparatus, and processing device
CN112070213A (en) * 2020-08-28 2020-12-11 Oppo广东移动通信有限公司 Neural network model optimization method, device, equipment and storage medium
CN112711422A (en) * 2020-12-31 2021-04-27 北京清微智能科技有限公司 Optimization method and system for neural network compiling
CN112711422B (en) * 2020-12-31 2024-01-19 北京清微智能科技有限公司 Neural network compiling optimization method and system
WO2023122854A1 (en) * 2021-12-27 2023-07-06 华为技术有限公司 Data processing method and apparatus
CN117576125A (en) * 2024-01-16 2024-02-20 芯瞳半导体技术(山东)有限公司 Neural network calculation graph segmentation method, device, equipment and storage medium
CN117576125B (en) * 2024-01-16 2024-04-16 芯瞳半导体技术(山东)有限公司 Neural network calculation graph segmentation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111062467B (en) 2023-05-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant