CN114365148A

CN114365148A - Neural network operation system and method

Info

Publication number: CN114365148A
Application number: CN201980100192.4A
Authority: CN
Inventors: 熊超; 牛昕宇; 蔡权雄
Original assignee: Shenzhen Corerain Technologies Co Ltd
Current assignee: Shenzhen Corerain Technologies Co Ltd
Priority date: 2019-10-22
Filing date: 2019-10-22
Publication date: 2022-04-15
Also published as: WO2021077284A1

Abstract

Disclosed herein is a neural network operation system and method, the system comprising: the software layer is set to construct a neural network computational graph aiming at a data flow computational architecture according to a preset network model and network model data corresponding to the preset network model and distribute a computational space corresponding to the neural network computational graph; the driving layer is connected with the software layer and is used for initializing the computing nodes according to the computing space and transmitting the node data of a plurality of computing nodes in the neural network computing graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer; and the hardware layer is connected with the driving layer and is used for sequentially acquiring the node data of the plurality of computing nodes through the data transmission pipeline and computing according to the node data.

Description

Neural network operation system and method

Technical Field

The embodiment of the application relates to the field of neural networks, for example to a neural network operation system and method.

Background

With the gradual maturity of deep learning technology, industries based on neural networks have more and more applications on the ground, including security protection, industrial monitoring, automatic driving and the like.

The neural network is composed of a plurality of repetitive computing layers (also called operators), and the computing mode of the neural network has the characteristics of high parallelism and high computing quantity. Graphics Processing Unit (GPU) devices contain a large number of small computing cores, which on the one hand meet the requirements of neural network applications; on the other hand, the framework of early neural network algorithm development was developed by relying on a GPU, and most of the neural network was deployed on the GPU. However, the GPU is designed primarily for processing applications such as image rendering, and is not dedicated to neural network computing, and the architecture efficiency and resource utilization rate of the GPU are low, usually below 30%, so the low architecture efficiency of the GPU gradually becomes the bottleneck of the development of the neural network technology.

The computational efficiency of the data flow architecture can reach more than 90%, and compared with instruction set architectures such as GPU (graphics processing unit) and the like, the data flow architecture can fully use computational resources and is more suitable for the deployment of neural network algorithms. However, although the dataflow architecture is technically advanced, the application running form of the dataflow architecture still has great unknown and uncertainty due to the limited development time of the dataflow architecture and the early application stage.

Disclosure of Invention

The embodiment of the application provides a neural network operation system and method, which are used for distinguishing the operation forms of a neural network operated based on a data flow architecture and reducing the use threshold of data flow equipment.

The embodiment of the present application provides a neural network operation system, including:

the software layer is set to construct a neural network computational graph aiming at a data flow computational architecture according to a preset network model and network model data corresponding to the preset network model and distribute a computational space corresponding to the neural network computational graph;

the driving layer is connected with the software layer and is used for initializing the computing nodes according to the computing space and transmitting the node data of a plurality of computing nodes in the neural network computing graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer;

and the hardware layer is connected with the driving layer and is used for sequentially calculating the node data of the plurality of calculation nodes through the data transmission pipeline and calculating according to the node data.

The embodiment of the application provides a neural network operation method, which comprises the following steps:

the software layer constructs a neural network computational graph aiming at a data flow computational framework according to a preset network model and network model data corresponding to the preset network model and distributes computational space corresponding to the neural network computational graph;

the driving layer initializes the calculation nodes according to the calculation space and transmits node data of a plurality of calculation nodes in the neural network calculation graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer;

and the hardware layer sequentially acquires the node data of the plurality of computing nodes through the data transmission pipeline and performs computation according to the node data.

Drawings

Fig. 1 is a schematic structural diagram of a neural network operating system according to an embodiment of the present disclosure;

fig. 2 is a schematic structural diagram of another neural network operating system according to the second embodiment of the present application;

fig. 3 is a schematic structural diagram of another neural network operating system provided in the third embodiment of the present application;

fig. 4 is a schematic flowchart of a neural network operation method according to a fourth embodiment of the present application.

Description of the symbols: 110-a software layer; 120-a drive layer; 130-hardware layer; 111-computation graph building block; 112-computation graph initialization module; 113-memory allocation module; 114-computation graph run module; 115-a data output module; 116-computation graph run management module; 121-device initialization module; 122-compute node initialization module; 123-a data input module; 124-register configuration module; 125-data write-out module; 1231-data read-in submodule; 1232-data transmission sub-module; 131-input/output (I/O) initialization module; 132-node calculation module; 1321-a data acquisition submodule; 1322-on-chip memory sub-modules; 1323-hardware node computation submodule.

Detailed Description

The present application will be described with reference to the accompanying drawings and examples. The specific embodiments described herein are merely illustrative of the application and are not intended to be limiting. For the purpose of illustration, only some, but not all, of the structures associated with the present application are shown in the drawings.

Some example embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps in this document can be performed in parallel, concurrently or simultaneously. Further, the order of the steps may be rearranged. The process may be terminated when the various step operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.

The terms "first," "second," and the like may be used herein to describe various orientations, actions, steps, or elements, but the orientations, actions, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, a first computing node may be referred to as a second computing node, and similarly, a second computing node may be referred to as a first computing node, without departing from the scope of the present application. The first compute node and the second compute node are both compute nodes, but the first compute node and the second compute node are not the same compute node. The terms "first", "second", etc. are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means at least two, e.g., two, three, etc., unless otherwise limited.

Example one

Fig. 1 is a schematic structural diagram of a neural network operation system according to an embodiment of the present disclosure, which is applicable to operation of a neural network based on a data stream computing architecture. As shown in fig. 1, a neural network operation system according to a first embodiment of the present application includes: a software layer 110, a driver layer 120, and a hardware layer 130.

The software layer 110 is configured to construct a neural network computational graph for a data flow computational architecture according to a preset network model and network model data corresponding to the preset network model, and allocate a computational space corresponding to the neural network computational graph.

In this embodiment, the preset network model is a neural network model that needs to be operated under a data flow architecture. The neural network computational graph is an expression form of actual operation of the neural network under a data flow computing architecture, and comprises a plurality of computing nodes and connection relations among the computing nodes, one computing node in the neural network computational graph can correspond to one layer or a plurality of layers in the neural network, and network model data corresponding to the neural network model is data of each computing node in the neural network model. According to the preset network model, a neural network computational graph aiming at the data flow computational architecture can be constructed, then, network model data corresponding to the neural network model are imported into each node of the neural network computational graph, and the neural network computational graph can be used for carrying out actual operation.

In one embodiment, the operation of the neural network computation graph necessarily requires a certain computation space, and therefore, the software layer 110 also allocates the computation space required by the neural network computation graph.

The driver layer 120 is connected to the software layer 110, and the driver layer 120 is configured to initialize the compute nodes according to the computation space and transmit node data of a plurality of compute nodes in the neural network computational graph to the hardware layer 130 through a data transmission channel between the driver layer 120 and the hardware layer 130.

In this embodiment, after the software layer 110 constructs the neural network computation graph, the neural network computation graph is transmitted to the driver layer 120, the driver layer 120 initiates a device initialization request to the hardware layer 130, and the hardware layer 130 initializes a corresponding I/O interface according to the device initialization request, thereby constructing a data transmission pipeline between the driver layer 120 and the hardware layer 130. The software layer 110 controls the hardware layer 130 to perform calculation by controlling the data transmission in the data transmission pipeline.

In one embodiment, after the software layer 110 allocates the computation space required by the neural network computational graph, the driver layer 120 initializes the computation nodes of the neural network computational graph to enable the computation nodes of the neural network computational graph to perform actual operations.

The hardware layer 130 is connected to the driver layer 120, and the hardware layer 130 is configured to sequentially obtain node data of a plurality of computation nodes in the neural network computation graph through the data transmission pipeline and perform computation according to the node data.

In this embodiment, when the user uses the neural network to perform actual operation, data that needs to be calculated is input to the neural network, the software layer 110 obtains the input data and introduces the input data into the neural network computational graph, and then initiates a request for operating the neural network computational graph to the driver layer 120, so that the neural network computational graph performs operation according to the input data of the user. The driver layer 120 traverses each computation node of the neural network computation graph to obtain node data, and transmits the node data to the hardware layer 130 through a data transmission pipeline. After the hardware layer 130 obtains the node data, it completes the calculation of the calculation node on the data flow engine to obtain the output data corresponding to the input data, and transmits the output data to the driver layer 120 through the data transmission pipeline. After the calculation of the whole computation graph is completed, the final output data is transmitted to the software layer 110 by the driver layer 120, and a user can obtain the output data corresponding to the input data through the software layer 110.

The utility model provides a pair of neural network operating system passes through the software layer, the operation form of drive layer and hardware layer to neural network distinguishes, the function on software layer, drive layer and hardware layer is inequality, two liang of interactions again when neural network carries out actual operation, the user only needs to input data or obtain output data through the software layer when using neural network to carry out actual computation, neural network's actual operation is controlled by the drive layer, accomplish at the hardware layer, user and bottom hardware computation have realized control isolation, the use threshold of dataflow equipment has been reduced, be favorable to the wide application of dataflow equipment.

Example two

Fig. 2 is a schematic structural diagram of another neural network operating system provided in the second embodiment of the present application, and the present embodiment is explained based on the above embodiments. As shown in fig. 2, a neural network operation system provided in the second embodiment of the present application includes: a software layer 110, a driver layer 120, and a hardware layer 130. In this embodiment, the software layer 110 includes a computation graph building module 111, a computation graph initializing module 112, a memory allocation module 113, and a computation graph running module 114, the driver layer 120 includes a device initializing module 121, a data transmission module 123, and a computation node initializing module 122, and the hardware layer 130 includes an I/O initializing module 131 and a node computing module 132.

The computation graph construction module 111 is configured to construct a neural network computation graph for the data flow computation architecture according to the preset network model.

In the present embodiment, the neural network is a complex network system formed by a large number of simple processing units, also referred to as operators, which are widely interconnected, i.e., the neural network model is formed by a large number of operators which are widely interconnected. In actual operation, one or more layers of the neural network that perform a function are generally referred to as a computation node, network model data corresponding to the neural network model is data of a plurality of computation nodes in the neural network model, and the neural network computation graph is an expression form of the neural network during actual operation, and includes a plurality of computation nodes of the neural network model and connection relations between the computation nodes. The calculation nodes and the operators can be the same in size or different in size, and the size relation between the calculation nodes and the operators is different in different neural network models. For example, the operator types included in the neural network model are four types, a1, a2, A3, and a4, the computation nodes of the neural network computation graph may be two types, i.e., a first computation node a1+ a2 and a second computation node A3+ a4, and the connection relationship between the computation nodes may be the sum of first running the first computation node a1+ a2 and the second computation node A3+ a4, and then running the first computation node a1+ a2 and the second computation node A3+ a 4: a1+ A2+ A3+ A4. The method comprises the steps of constructing a neural network computation graph aiming at a data flow computation architecture according to a preset network model, namely constructing operators of the neural network model and connection relations among the operators into computation nodes of the neural network computation graph based on the data flow and connection relations among the computation nodes.

The computation graph initialization module 112 is configured to import the network model data corresponding to the preset network model into the neural network computation graph, and transmit the neural network computation graph into the driver layer 120.

In this embodiment, the initialization of the neural network computational graph is to introduce network model data into each computational node of the neural network computational graph, so that each computational node of the neural network computational graph contains actual data and can perform actual operation. The computation graph initialization module 112 transmits the initialized neural network computation graph to the driver layer 120.

The memory allocation module 113 is configured to allocate a computation space required by the neural network computational graph and initiate a computation node initialization request to the driver layer 120, where the computation node initialization request includes the computation space.

In this embodiment, the calculation space required by the neural network calculation graph is allocated by the memory allocation module 113, and after the memory allocation module 113 allocates the calculation space, a calculation node initialization request is initiated to the driver layer 120 to provide a suitable environment for the actual operation of the calculation node.

The computation graph operation module 114 is configured to obtain input data of the preset network model, import the input data into the neural network computation graph, and initiate a request for operation of the neural network computation graph to the driver layer 120.

The device initialization module 121 is arranged to initiate a device initialization request to said hardware layer 130.

In this embodiment, the device initialization module 121 initiates a device initialization request to the hardware layer 130, so that the hardware layer 130 performs I/O interface initialization.

The data transmission module 123 is configured to transmit node data of a plurality of computing nodes in the neural network computational graph to the hardware layer 130 through a data transmission channel between the driver layer 120 and the hardware layer 130 according to the operation request of the neural network computational graph.

The compute node initialization module 122 is configured to initialize a compute node according to the compute node initialization request.

In this embodiment, after receiving the computing node initialization request sent by the memory allocation module 113, the computing node initialization module 122 initializes the computing nodes of the neural network computing graph.

The I/O initialization module 131 is configured to complete initialization of the I/O interface corresponding to the device initialization request according to the device initialization request, and establish the data transmission pipeline.

In this embodiment, after receiving the device initialization request sent by the device initialization module 121, the I/O initialization module 131 completes initialization of the I/O interface corresponding to the device initialization request, thereby establishing a data transmission pipeline between the driver layer 120 and the hardware layer 130.

And the node calculation module 132 is configured to sequentially obtain node data of a plurality of calculation nodes through the data transmission pipeline and perform calculation according to the node data.

The neural network operation system provided by the embodiment of the application completes initialization of the neural network operation environment through mutual cooperation among a plurality of modules of a software layer, a driving layer and a hardware layer, and provides a proper operation environment for operation of the neural network.

EXAMPLE III

Fig. 3 is a schematic structural diagram of another neural network operating system provided in the third embodiment of the present application, and this embodiment is explained based on the above embodiments. As shown in fig. 3, a neural network operation system provided in the third embodiment of the present application includes: a software layer 110, a driver layer 120, and a hardware layer 130. In this embodiment, the software layer 110 includes: the computation graph building module 111, the computation graph initializing module 112, the memory allocation module 113, the computation graph operating module 114, the data output module 115, and the computation graph operation management module 116, and the driver layer 120 includes: the device initialization module 121, the data transmission module 123, the compute node initialization module 122, the register configuration module 124, and the data write-out module 125, and the hardware layer 130 includes: an I/O initialization module 131 and a node calculation module 132; the data transmission module 123 includes a data reading sub-module 1231 and a data transmission sub-module 1232; the node calculation module 132 includes a data acquisition submodule 1321, an on-chip storage submodule 1322, and a hardware node calculation submodule 1323.

In the present embodiment, the neural network is a complex network system formed by a large number of simple processing units, also referred to as operators, which are widely interconnected, i.e., the neural network model is formed by a large number of operators which are widely interconnected. In actual operation, one or more layers of the neural network that perform a function are generally referred to as a computation node, network model data corresponding to the neural network model is data of each computation node in the neural network model, and the neural network computation graph is an expression form of the neural network during actual operation, and includes a plurality of computation nodes of the neural network model and connection relationships between the computation nodes. The calculation nodes and the operators can be the same in size or different in size, and the size relation between the calculation nodes and the operators is different in different neural network models. For example, the operator types included in the neural network model are four types, a1, a2, A3, and a4, the computation nodes of the neural network computation graph may be two types, i.e., a first computation node a1+ a2 and a second computation node A3+ a4, and the connection relationship between the computation nodes may be the sum of first running the first computation node a1+ a2 and the second computation node A3+ a4, and then running the first computation node a1+ a2 and the second computation node A3+ a 4: a1+ A2+ A3+ A4. The method comprises the steps of constructing a neural network computational graph aiming at a data flow computational framework according to a preset network model, namely constructing computational nodes of the neural network model and connection relations among the computational nodes into the computational nodes of the neural network computational graph based on data flow and the connection relations among the computational nodes.

The memory allocation module 113 is configured to allocate the computation space required by the neural network computational graph and initiate a compute node initialization request to the driver layer 120.

In this embodiment, the initialization process of the neural network operation system is completed through interaction among the computation graph construction module 111, the computation graph initialization module 112, the memory allocation module 113, the device initialization module 121, the computation node initialization module 122, and the I/O initialization module 131, and a good operation environment is established for actual operation of the neural network.

The computation graph operation module 114 is configured to obtain input data of a preset neural network model, import the input data into the neural network computation graph, and initiate a request for operation of the neural network computation graph to the driver layer 120.

In this embodiment, when the user uses the neural network to perform actual operation, data that needs to be calculated is input to the neural network, the computation graph operation module 114 obtains the input data of the user and imports the input data into the neural network computation graph, and then initiates an operation request of the neural network computation graph to the driver layer 120, so that the neural network computation graph is calculated according to the input data of the user.

The data reading sub-module 1231 is configured to read node data of a plurality of computing nodes in the neural network computational graph according to the operation request of the neural network computational graph.

The data transmission sub-module 1232 is configured to transmit node data of a plurality of computing nodes in the neural network computational graph to the hardware layer 130 through a data transmission channel.

In this embodiment, the data reading sub-module 1231 reads node data of a plurality of computing nodes in the neural network computation graph, and the data transmission sub-module 1232 transmits the node data of the plurality of computing nodes to the hardware layer 130 through a data transmission pipeline, so that the hardware layer 130 performs actual operations on the computing nodes.

The computation graph run management module 116 is configured to manage the timing and computation space required for the neural network computation graph run.

The register configuration module 124 is configured to control the hardware layer 130 to establish hardware nodes corresponding to a plurality of compute nodes in the neural network computational graph on a data flow engine.

In this embodiment, the register configuration module 124 controls the hardware nodes corresponding to the computation nodes of the neural network computational graph, which are constructed based on the hardware layer 130 of the dataflow architecture, so that the neural network computational graph performs operations on the dataflow engine.

The data obtaining sub-module 1321 is configured to sequentially obtain node data of the plurality of computing nodes through the data transmission pipeline.

The on-chip storage submodule 1322 is configured to store the data transmitted by the data acquisition submodule 1321 through the data transmission pipeline and the output data calculated by the hardware node calculation submodule 1323.

In this embodiment, the on-chip storage sub-module 1322 is configured to store data, and the node data of the plurality of nodes calculated by the hardware layer 130 and the calculated output data are both stored in the on-chip storage sub-module 1322.

The hardware node calculation sub-module 1323 is configured to import the node data in the on-chip storage sub-module 1322 into the hardware node, complete calculation of the hardware node on the dataflow engine, obtain the output data, and store the output data in the on-chip storage sub-module 1322.

In this embodiment, when the hardware layer 130 performs calculation on the neural network, the hardware node calculation 1323 sub-module 1323 invokes node data from the on-chip storage sub-module 1322 and introduces the node data into a corresponding hardware node, so as to complete calculation of the hardware node on the data flow engine. When all the hardware nodes on the data flow engine complete the computation, an output data finally output to the user is obtained, and the hardware node computation submodule 1323 stores the output data in the on-chip storage submodule 1322 to be transmitted to the software layer 110 through the driver layer 120.

The data writing-out module 125 is configured to transmit the output data to the software layer 110 through the data transmission pipeline.

In this embodiment, the hardware node calculation submodule 1323 stores the output data in the on-chip storage submodule 1222, and the data writing-out module 125 calls the output data in the on-chip storage submodule 1222 through the data transmission pipeline and transmits the output data to the software layer 110.

The data output module 115 is arranged to output the output data to a user.

In this embodiment, the data writing-out module 125 of the driver layer 120 transmits the output data to the data output module 115 of the software layer 110, and the data output module 115 outputs the output data to a data storage terminal or an upper computer, so that a user can obtain the output data of the input data calculated by the neural network through the data storage terminal or the upper computer.

In this embodiment, the actual operation of the neural network on the data flow engine is completed through the interaction among the computation graph operation module 114, the data output module 115, the computation graph operation management module 116, the data transmission module 123, the register configuration module 124, the data writing-out module 125, and the node computation module 132, and a user only needs to input a set of input data to the software layer 110 and obtain a corresponding set of output data through the computation of the hardware layer 130.

The third embodiment of the application provides a neural network operation system obtains user's input data and demonstrates output data to the user through the software layer, accomplish the data transmission between software layer and the hardware layer through the driver layer, accomplish the calculation of neural network on the dataflow engine through the hardware layer, the actual operation of dataflow framework equipment has been realized, through the software layer, the operation of neural network is divided into three parts through driver layer and hardware layer, the user only need in the software layer operation can, keep apart with the hardware layer, make things convenient for the application of dataflow equipment.

Example four

Fig. 4 is a schematic flowchart of a method for operating a neural network according to a fourth embodiment of the present disclosure, which is applicable to operation of a neural network based on a data stream computing architecture. The method can be realized by the neural network operation system provided by any embodiment of the application, has the beneficial effects of the corresponding functional module of the neural network operation system, and the content which is not described in the fourth embodiment of the application can refer to the description in any system embodiment of the application.

As shown in fig. 4, a neural network operation method provided in the fourth embodiment of the present application includes:

s410, the software layer constructs a neural network computational graph aiming at a data flow computational framework according to a preset network model and network model data corresponding to the preset network model, and distributes computational space corresponding to the neural network computational graph.

In this embodiment, the preset network model is a neural network model that needs to be operated under a data flow architecture. The layer/layers for completing a function in the neural network are generally called a computing node, the neural network model is composed of a plurality of computing nodes according to a specific connection relation, network model data corresponding to the neural network model is data of each computing node in the neural network model, and the neural network computational graph is an expression form of the neural network when the neural network performs actual operation under a data flow computing architecture and comprises each computing node of the neural network model and the connection relation between the computing nodes. According to the preset network model, a neural network computational graph aiming at the data flow computational architecture can be constructed, then, network model data corresponding to the neural network model are imported into each node of the neural network computational graph, and the neural network computational graph can be used for carrying out actual operation.

In one embodiment, the operation of the neural network computation graph necessarily requires a certain computation space, and therefore, the computation space required by the neural network computation graph needs to be allocated.

And S420, initializing a computing node by the driving layer according to the computing space and transmitting node data of a plurality of computing nodes in the neural network computing graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer.

In this embodiment, the data transmission pipeline is a transmission channel between node data of the neural network computational graph and a hardware node of actual computation, and data transmission of the neural network computational graph is implemented through the data transmission pipeline. The initialization of the computing nodes provides a proper operating environment for the actual operation of the computing nodes, so that the computing nodes of the neural network computing graph can perform the actual operation.

And S430, the hardware layer sequentially obtains the node data of the plurality of computing nodes through the data transmission pipeline and computes according to the node data.

In this embodiment, when a user uses a neural network to perform actual operation, data to be calculated is input to the neural network, after the data input by the user is imported into a neural network computational graph, each computational node of the neural network computational graph generates corresponding node data, and the node data of each computational node is transmitted to a corresponding hardware node on a data flow engine through a data transmission pipeline to perform actual operation. And obtaining output data used by the user after all the hardware nodes are calculated.

The method comprises the steps of constructing a neural network computational graph aiming at a data flow computational framework according to a preset network model and network model data corresponding to the preset network model, and distributing computational space corresponding to the neural network computational graph; initializing a computing node according to the computing space and transmitting node data of a plurality of computing nodes in the neural network computing graph to a hardware layer through a data transmission channel between the driving layer and the hardware layer; and sequentially acquiring node data of the plurality of computing nodes through the transmission pipeline and computing according to the node data. The fourth embodiment of the present application makes full use of the characteristics of the data stream architecture to support the actual operation of the data architecture device.

Claims

A neural network operating system, comprising:

the software layer is set to construct a neural network computational graph aiming at a data flow computational architecture according to a preset network model and network model data corresponding to the preset network model and distribute a computational space corresponding to the neural network computational graph;

the driving layer is connected with the software layer and is used for initializing the computing nodes according to the computing space and transmitting the node data of a plurality of computing nodes in the neural network computing graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer;

and the hardware layer is connected with the driving layer and is configured to sequentially acquire the node data of the plurality of computing nodes through the data transmission pipeline and perform computation according to the node data.
The system of claim 1, wherein the software layers comprise:

the calculation graph building module is arranged for building a neural network calculation graph aiming at the data flow calculation architecture according to a preset network model;

the computation graph initialization module is used for importing network model data corresponding to the preset network model into the neural network computation graph and transmitting the neural network computation graph into a driving layer;

and the memory allocation module is configured to allocate a computation space of the neural network computation graph and initiate a computation node initialization request to the driver layer, wherein the computation node initialization request includes the computation space.
The system of claim 2, wherein the software layers further comprise: and the computation graph operation module is configured to acquire input data of the preset neural network model, import the input data into the neural network computation graph, and initiate a neural network computation graph operation request to the driver layer.
The system of claim 3, wherein the drive layer comprises:

the device initialization module is set to initiate a device initialization request to the hardware layer;

the data transmission module is arranged for transmitting node data of a plurality of computing nodes in the neural network computational graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer according to the operation request of the neural network computational graph;

and the computing node initialization module is set to initialize the computing node according to the computing node initialization request.
The system of claim 4, wherein the hardware layer comprises:

the input/output I/O initialization module is used for finishing the initialization of an I/O interface corresponding to the equipment initialization request according to the equipment initialization request so as to establish the data transmission pipeline;

and the node calculation module is used for sequentially acquiring the node data of the plurality of calculation nodes through the data transmission pipeline and calculating according to the node data.
The system of claim 5, wherein the data transmission module comprises:

the data reading sub-module is used for reading node data of a plurality of computing nodes in the neural network computational graph according to the operation request of the neural network computational graph;

the data transmission sub-module is used for transmitting node data of a plurality of computing nodes in the neural network computational graph to the hardware layer through the data transmission channel;

the system further comprises: and the register configuration module is set to control the hardware layer to establish the hardware nodes corresponding to the plurality of computing nodes on the data stream engine.
The system of claim 6, wherein the node computation module comprises:

the data acquisition submodule is arranged for sequentially acquiring the node data of the plurality of computing nodes through the data transmission pipeline;

the on-chip storage submodule is used for storing the node data acquired by the data acquisition submodule and the output data calculated by the hardware node sub-calculation module;

and the hardware node calculation submodule is used for importing the node data in the on-chip storage submodule into the hardware node, finishing the calculation of the hardware node on the data flow engine to obtain the output data and storing the output data in the on-chip storage submodule.
The system of claim 7, wherein the drive layer further comprises:

and the data writing-out module is used for transmitting the output data to the software layer through the data transmission pipeline.
The system of claim 8, wherein the software layers further comprise:

a data output module configured to output the output data.
A neural network operation method, comprising:

the software layer constructs a neural network computational graph aiming at a data flow computational framework according to a preset network model and network model data corresponding to the preset network model and distributes computational space corresponding to the neural network computational graph;

the driving layer initializes the calculation nodes according to the calculation space and transmits node data of a plurality of calculation nodes in the neural network calculation graph to the hardware layer through a data transmission channel between the driving layer and the hardware layer;

and the hardware layer sequentially acquires the node data of the plurality of computing nodes through the data transmission pipeline and performs computation according to the node data.