WO2021077284A1 - Neural network operating system and method - Google Patents

Neural network operating system and method

Info

Publication number
WO2021077284A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
neural network
node
calculation
layer
Application number
PCT/CN2019/112466
Other languages
French (fr)
Chinese (zh)
Inventor
熊超
牛昕宇
蔡权雄
Original Assignee
深圳鲲云信息科技有限公司
Application filed by 深圳鲲云信息科技有限公司
Priority to CN201980100192.4A
Priority to PCT/CN2019/112466
Publication of WO2021077284A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons

Definitions

  • The embodiments of the present application relate to the field of neural networks, for example, to a neural network operation system and method.
  • A neural network is composed of multiple repetitive calculation layers (also called operators).
  • The calculation of a neural network is characterized by high parallelism and a high computational load.
  • Graphics Processing Unit (GPU) devices contain a large number of small computing cores. On the one hand, this matches the needs of neural network applications; on the other hand, the early neural network development frameworks were built on GPUs, so most neural networks are deployed on GPUs.
  • GPUs, however, were originally designed for image rendering and similar applications, not dedicated neural network computation.
  • GPU architecture efficiency and resource utilization are low, usually below 30%; this low architecture efficiency has gradually become a bottleneck in the development of neural network technology.
  • The computing efficiency of the data flow architecture can reach more than 90%. Compared with instruction-set architectures such as the GPU, it makes fuller use of computing resources and is better suited to deploying neural network algorithms.
  • Although the data flow architecture is technologically advanced, its development time has been limited and it is still in the early stage of application, so there are still great unknowns and uncertainties in how applications run on it.
  • The embodiments of the present application provide a neural network operation system and method to distinguish the operation form of a neural network running on a data flow architecture and to reduce the threshold for using data flow devices.
  • An embodiment of the application provides a neural network operation system, including:
  • a software layer, configured to construct a neural network calculation graph for a data flow computing architecture according to a preset network model and the network model data corresponding to the preset network model, and to allocate the calculation space corresponding to the neural network calculation graph;
  • a driver layer, connected to the software layer and configured to initialize computing nodes according to the calculation space and to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer through a data transmission channel between the driver layer and the hardware layer;
  • a hardware layer, connected to the driver layer and configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline and to perform calculations based on the node data.
  • An embodiment of the present application provides a neural network operation method, including:
  • the software layer constructs a neural network calculation graph for the data flow computing architecture according to the preset network model and the network model data corresponding to the preset network model, and allocates the calculation space corresponding to the neural network calculation graph;
  • the driver layer initializes computing nodes according to the computing space and transmits the node data of multiple computing nodes in the neural network calculation graph to the hardware layer through the data transmission channel between the driver layer and the hardware layer;
  • the hardware layer sequentially obtains the node data of the multiple computing nodes through the data transmission pipeline and performs calculations based on the node data.
  • FIG. 1 is a schematic structural diagram of a neural network operating system provided by Embodiment 1 of the application;
  • FIG. 2 is a schematic structural diagram of another neural network operation system provided in the second embodiment of the application.
  • FIG. 3 is a schematic structural diagram of another neural network operating system provided in the third embodiment of this application.
  • FIG. 4 is a schematic flowchart of a neural network operation method provided in the fourth embodiment of this application.
  • Some exemplary embodiments are described as processes or methods depicted in flowcharts. Although a flowchart describes steps as sequential processing, many of the steps can be implemented in parallel, concurrently, or simultaneously. In addition, the order of the steps can be rearranged. Processing may be terminated when its steps are completed, but there may also be additional steps not shown in the drawing. Processing can correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.
  • The terms “first”, “second”, etc. may be used herein to describe various directions, actions, steps or elements, but these directions, actions, steps or elements are not limited by these terms. These terms are only used to distinguish a first direction, action, step or element from another direction, action, step or element.
  • For example, the first computing node may be referred to as the second computing node and, similarly, the second computing node may be referred to as the first computing node.
  • Both the first computing node and the second computing node are computing nodes, but the first computing node and the second computing node are not the same computing node.
  • FIG. 1 is a schematic structural diagram of a neural network operation system provided in Embodiment 1 of the application, which is applicable to the operation of a neural network based on a data flow computing architecture.
  • a neural network operating system provided by Embodiment 1 of the present application includes: a software layer 110, a driver layer 120 and a hardware layer 130.
  • the software layer 110 is configured to construct a neural network calculation graph for the data flow calculation architecture according to a preset network model and network model data corresponding to the preset network model, and allocate a calculation space corresponding to the neural network calculation graph.
  • the preset network model is a neural network model that needs to be calculated under the data flow architecture.
  • The neural network calculation graph is a form of expression used when the neural network performs actual calculations under the data flow computing architecture; it includes multiple computing nodes and the connection relationships between them.
  • A computing node in the neural network calculation graph can correspond to one or more layers of the neural network.
  • The network model data corresponding to the neural network model is the data of each computing node in the neural network model.
  • According to the preset network model, a neural network calculation graph for the data flow computing architecture can be constructed; the network model data corresponding to the neural network model is then imported into each node of the calculation graph, after which the calculation graph can perform actual calculations.
  • the software layer 110 also allocates the calculation space required by the neural network calculation graph.
  • The driver layer 120 is connected to the software layer 110.
  • The driver layer 120 is configured to initialize the computing nodes according to the calculation space and to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer 130 through the data transmission channel between the driver layer 120 and the hardware layer 130.
  • After the software layer 110 constructs the neural network calculation graph, it passes the calculation graph to the driver layer 120.
  • The driver layer 120 initiates a device initialization request to the hardware layer 130, and the hardware layer 130 initializes the corresponding I/O interface according to the request, thereby constructing a data transmission pipeline between the driver layer 120 and the hardware layer 130.
  • The software layer 110 controls the data transmission in this pipeline and thereby controls the hardware layer 130 to perform calculations.
  • After the software layer 110 allocates the calculation space required by the neural network calculation graph, the driver layer 120 also initializes the computing nodes of the calculation graph, so that they can perform actual operations.
  • the hardware layer 130 is connected to the driver layer 120, and the hardware layer 130 is configured to sequentially obtain node data of multiple computing nodes in the neural network calculation graph through the data transmission pipeline and perform calculations based on the node data.
  • When the user uses the neural network to perform actual calculations, the data that needs to be calculated is input to the neural network.
  • The software layer 110 obtains the input data, imports it into the neural network calculation graph, and then initiates a neural network calculation graph running request to the driver layer 120, so that the calculation graph performs calculations based on the user's input data.
  • The driver layer 120 traverses each computing node of the neural network calculation graph to obtain node data and transmits the node data to the hardware layer 130 through the data transmission pipeline. After the hardware layer 130 obtains the node data, it completes the calculation of the computing node on the data flow engine, obtains the output data corresponding to the input data, and transmits the output data back to the driver layer 120 through the pipeline. After the calculation of the entire graph is completed, the final output data is transmitted from the driver layer 120 to the software layer 110, where the user can obtain the output data corresponding to the input data.
  • The neural network operating system provided in the first embodiment of the application divides the operating form of the neural network across the software layer, the driver layer, and the hardware layer.
  • The three layers have different functions and interact pairwise when the neural network performs actual operations.
  • When using the neural network for actual calculations, the user only needs to input data and obtain output data through the software layer; the actual calculations are controlled by the driver layer and completed at the hardware layer.
  • The user is thus isolated from the underlying hardware computation, which lowers the threshold for using data flow devices and is conducive to their wide application.
  • FIG. 2 is a schematic structural diagram of another neural network operating system provided in the second embodiment of the application. This embodiment is described on the basis of the above-mentioned embodiment.
  • a neural network operating system provided in the second embodiment of the present application includes: a software layer 110, a driver layer 120, and a hardware layer 130.
  • The software layer 110 includes a calculation graph construction module 111, a calculation graph initialization module 112, a memory allocation module 113, and a calculation graph running module 114.
  • The driver layer 120 includes a device initialization module 121, a data transmission module 123, and a computing node initialization module 122.
  • The hardware layer 130 includes an I/O initialization module 131 and a node computing module 132.
  • The calculation graph construction module 111 is configured to construct a neural network calculation graph for the data flow computing architecture according to the preset network model.
  • A neural network is a complex network system formed by a large number of simple processing units that are widely interconnected. These simple processing units are also called operators; that is, a neural network model consists of a large number of interconnected operators. In actual calculations, the one or more layers in a neural network that complete one function are usually called a computing node.
  • The network model data corresponding to the neural network model is the data of the multiple computing nodes in the neural network model.
  • The neural network calculation graph is a form of expression used when the neural network performs actual calculations; it includes the multiple computing nodes of the neural network model and the connection relationships between them.
  • A computing node and an operator can be the same size or different sizes, and the size relationship between them differs between neural network models.
  • For example, suppose the operators included in a neural network model are of four types: A1, A2, A3, and A4.
  • The computing nodes of the neural network calculation graph can then be a first computing node A1+A2 and a second computing node A3+A4.
  • The connection relationship between the nodes can be to run the first computing node A1+A2 and the second computing node A3+A4 first, and then to run their sum: A1+A2+A3+A4.
  • Constructing a neural network calculation graph for the data flow computing architecture according to the preset network model means constructing the operators of the neural network model and the connection relationships between the operators into the computing nodes of the data-flow-based calculation graph and the connection relationships between those computing nodes.
  • The calculation graph initialization module 112 is configured to import the network model data corresponding to the preset network model into the neural network calculation graph and to pass the calculation graph to the driver layer 120.
  • Initializing the neural network calculation graph means importing network model data into each of its computing nodes, so that each computing node contains actual data and can perform actual calculations.
  • The calculation graph initialization module 112 transmits the initialized neural network calculation graph to the driver layer 120.
  • The memory allocation module 113 is configured to allocate the calculation space required by the neural network calculation graph and to initiate a computing node initialization request to the driver layer 120; the request includes the calculation space.
  • The calculation space required by the neural network calculation graph is allocated by the memory allocation module 113.
  • After the memory allocation module 113 allocates the calculation space, it initiates a computing node initialization request to the driver layer 120, so as to provide a suitable environment for the actual calculations of the computing nodes.
  • The calculation graph running module 114 is configured to obtain the input data of the preset network model, import the input data into the neural network calculation graph, and initiate a neural network calculation graph running request to the driver layer 120.
  • the device initialization module 121 is configured to initiate a device initialization request to the hardware layer 130.
  • the device initialization module 121 initiates a device initialization request to the hardware layer 130, so that the hardware layer 130 performs I/O interface initialization.
  • the data transmission module 123 is configured to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer 130 through the data transmission channel between the driver layer 120 and the hardware layer 130 according to the neural network calculation graph running request.
  • the computing node initialization module 122 is configured to perform computing node initialization according to the computing node initialization request.
  • After the computing node initialization module 122 receives the computing node initialization request sent by the memory allocation module 113, it initializes the computing nodes of the neural network calculation graph.
  • the I/O initialization module 131 is configured to complete the initialization of the I/O interface corresponding to the device initialization request according to the device initialization request, and establish the data transmission pipeline.
  • After the I/O initialization module 131 receives the device initialization request sent by the device initialization module 121, it completes the initialization of the I/O interface corresponding to the request, thereby establishing the data transmission pipeline between the driver layer 120 and the hardware layer 130.
  • the node calculation module 132 is configured to sequentially obtain node data of multiple computing nodes through the data transmission pipeline and perform calculations based on the node data.
  • The neural network operating system provided in the second embodiment of the present application completes the initialization of the neural network operating environment through the cooperation of the multiple modules of the software layer, the driver layer, and the hardware layer, and provides a suitable operating environment for the operation of the neural network.
  • FIG. 3 is a schematic structural diagram of another neural network operating system provided in the third embodiment of this application. This embodiment is described on the basis of the above-mentioned embodiment.
  • a neural network operating system provided in the third embodiment of the present application includes: a software layer 110, a driver layer 120, and a hardware layer 130.
  • the software layer 110 includes: a calculation graph construction module 111, a calculation graph initialization module 112, a memory allocation module 113, a calculation graph operation module 114, a data output module 115, and a calculation graph operation management module 116.
  • The driver layer 120 includes a device initialization module 121, a data transmission module 123, a computing node initialization module 122, a register configuration module 124, and a data writing module 125.
  • The hardware layer 130 includes an I/O initialization module 131 and a node calculation module 132. The data transmission module 123 includes a data reading sub-module 1231 and a data transmission sub-module 1232; the node calculation module 132 includes a data acquisition sub-module 1321, an on-chip storage sub-module 1322, and a hardware node calculation sub-module 1323.
  • the calculation graph construction module 111 is configured to construct a neural network calculation graph for the data flow calculation architecture according to the preset network model.
  • A neural network is a complex network system formed by a large number of simple processing units that are widely interconnected. These simple processing units are also called operators; that is, a neural network model consists of a large number of interconnected operators. In actual calculations, the one or more layers in a neural network that complete one function are usually called a computing node.
  • The network model data corresponding to the neural network model is the data of each computing node in the neural network model.
  • The neural network calculation graph is a form of expression used when the neural network performs actual calculations; it includes the multiple computing nodes of the neural network model and the connection relationships between them.
  • A computing node and an operator can be the same size or different sizes, and the size relationship between them differs between neural network models.
  • For example, suppose the operators included in a neural network model are of four types: A1, A2, A3, and A4.
  • The computing nodes of the neural network calculation graph can then be a first computing node A1+A2 and a second computing node A3+A4.
  • The connection relationship between the nodes can be to run the first computing node A1+A2 and the second computing node A3+A4 first, and then to run their sum: A1+A2+A3+A4.
  • Constructing a neural network calculation graph for the data flow computing architecture according to the preset network model means constructing the computing nodes of the neural network model and the connection relationships between them into the computing nodes of the data-flow-based calculation graph and the connection relationships between those computing nodes.
  • The calculation graph initialization module 112 is configured to import the network model data corresponding to the preset network model into the neural network calculation graph and to pass the calculation graph to the driver layer 120.
  • Initializing the neural network calculation graph means importing network model data into each of its computing nodes, so that each computing node contains actual data and can perform actual calculations.
  • The calculation graph initialization module 112 transmits the initialized neural network calculation graph to the driver layer 120.
  • The memory allocation module 113 is configured to allocate the calculation space required by the neural network calculation graph and to initiate a computing node initialization request to the driver layer 120.
  • The calculation space required by the neural network calculation graph is allocated by the memory allocation module 113.
  • After the memory allocation module 113 allocates the calculation space, it initiates a computing node initialization request to the driver layer 120, so as to provide a suitable environment for the actual calculations of the computing nodes.
  • the device initialization module 121 is configured to initiate a device initialization request to the hardware layer 130.
  • the device initialization module 121 initiates a device initialization request to the hardware layer 130, so that the hardware layer 130 performs I/O interface initialization.
  • the computing node initialization module 122 is configured to perform computing node initialization according to the computing node initialization request.
  • After the computing node initialization module 122 receives the computing node initialization request sent by the memory allocation module 113, it initializes the computing nodes of the neural network calculation graph.
  • the I/O initialization module 131 is configured to complete the initialization of the I/O interface corresponding to the device initialization request according to the device initialization request, and establish the data transmission pipeline.
  • After the I/O initialization module 131 receives the device initialization request sent by the device initialization module 121, it completes the initialization of the I/O interface corresponding to the request, thereby establishing the data transmission pipeline between the driver layer 120 and the hardware layer 130.
  • The neural network operation system completes the initialization process through the interaction between the calculation graph construction module 111, the calculation graph initialization module 112, the memory allocation module 113, the device initialization module 121, the computing node initialization module 122, and the I/O initialization module 131, thereby establishing a good operating environment for the actual operation of the neural network.
  • The calculation graph running module 114 is configured to obtain the input data of the preset neural network model, import the input data into the neural network calculation graph, and initiate a neural network calculation graph running request to the driver layer 120.
  • When the user uses the neural network to perform actual calculations, the data that needs to be calculated is input to the neural network. The calculation graph running module 114 obtains the user's input data, imports it into the neural network calculation graph, and then initiates a neural network calculation graph running request to the driver layer 120, so that the calculation graph is computed based on the user's input data.
  • the data reading submodule 1231 is configured to read node data of multiple computing nodes in the neural network calculation graph according to the neural network calculation graph running request.
  • the data transmission sub-module 1232 is configured to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer 130 through the data transmission channel.
  • the data reading submodule 1231 reads the node data of multiple computing nodes in the neural network calculation graph, and the data transmission submodule 1232 transmits the node data of multiple computing nodes to the hardware layer 130 through the data transmission pipeline. This allows the hardware layer 130 to perform actual operations on the computing nodes.
  • the calculation graph operation management module 116 is configured to manage the time sequence and required calculation space when the neural network calculation graph is run.
  • the register configuration module 124 is configured to control the hardware layer 130 to establish hardware nodes corresponding to multiple computing nodes in the neural network calculation graph on the data flow engine.
  • The register configuration module 124 controls the hardware layer 130, which is built on the data flow architecture, to construct the hardware nodes corresponding to the computing nodes of the neural network calculation graph, so that the calculation graph is operated on the data flow engine.
  • the data acquisition submodule 1321 is configured to sequentially acquire node data of multiple computing nodes through the data transmission pipeline.
  • the on-chip storage submodule 1322 is configured to store the data transmitted by the data acquisition submodule 1321 through the data transmission pipeline and the output data calculated by the hardware node calculation submodule 1323.
  • The on-chip storage sub-module 1322 is configured to store data: both the node data of the multiple nodes calculated by the hardware layer 130 and the calculated output data are stored in the on-chip storage sub-module 1322.
  • The hardware node calculation sub-module 1323 is configured to import the node data in the on-chip storage sub-module 1322 into the hardware node, complete the calculation of the hardware node on the data flow engine, obtain the output data, and store the output data in the on-chip storage sub-module 1322.
  • The hardware node calculation sub-module 1323 calls the node data from the on-chip storage sub-module 1322 and imports it into the corresponding hardware node to complete the calculation of the hardware node on the data flow engine. When all the hardware nodes on the data flow engine have completed their calculations, the output data that is finally returned to the user is obtained (a minimal sketch of this flow appears at the end of this section).
  • The hardware node calculation sub-module 1323 stores the output data in the on-chip storage sub-module 1322 for transmission through the driver layer 120 to the software layer 110.
  • the data writing module 125 is configured to transmit the output data to the software layer 110 through the data transmission pipeline.
  • The hardware node calculation sub-module 1323 stores the output data in the on-chip storage sub-module 1322, and the data writing module 125 retrieves the output data from the on-chip storage sub-module 1322 and transmits it through the data transmission pipeline to the software layer 110.
  • the data output module 115 is configured to output the output data to the user.
  • The data writing module 125 of the driver layer 120 transmits the output data to the data output module 115 of the software layer 110, and the data output module 115 outputs the output data to a data storage terminal or host computer, so that the user can obtain, through the storage terminal or host computer, the output data produced by the neural network from the input data.
  • The actual running of the neural network calculation graph is completed through the interaction between the calculation graph running module 114, the data output module 115, the calculation graph operation management module 116, the data transmission module 123, the register configuration module 124, the data writing module 125, and the node calculation module 132.
  • In the actual calculation of the neural network on the data flow engine, the user only needs to input a set of input data to the software layer 110 and, after calculation by the hardware layer 130, obtains the corresponding set of output data.
  • The neural network operating system provided by the third embodiment of the application obtains the user's input data and presents output data to the user through the software layer, completes the data transmission between the software layer and the hardware layer through the driver layer, and completes the calculation of the neural network on the data flow engine in the hardware layer, thereby realizing the actual operation of a data flow architecture device.
  • The operation of the neural network is divided into three parts by the software layer, the driver layer and the hardware layer; the user only needs to operate at the software layer and is isolated from the hardware layer, which facilitates the application of data flow devices.
  • FIG. 4 is a schematic flowchart of a neural network operation method provided in the fourth embodiment of the application, which is applicable to the operation of a neural network based on a data flow computing architecture.
  • This method can be implemented by the neural network operating system provided by any embodiment of this application, and has the beneficial effects of the corresponding functional modules of that system.
  • For content not described in the fourth embodiment of this application, please refer to the description in any system embodiment of this application.
  • a neural network operation method provided in the fourth embodiment of the present application includes:
  • the software layer constructs a neural network calculation graph for the data flow calculation architecture according to the preset network model and the network model data corresponding to the preset network model, and allocates a calculation space corresponding to the neural network calculation graph.
  • the preset network model is a neural network model that needs to be calculated under the data flow architecture.
  • The one or more layers in a neural network that complete one function are usually called a computing node.
  • The neural network model is composed of multiple computing nodes connected according to a specific connection relationship.
  • The network model data corresponding to the neural network model is the data of each computing node in the neural network model.
  • the neural network calculation graph is a form of expression when the neural network performs actual operations under the data flow computing architecture, including each computing node of the neural network model and the connection relationship between the computing nodes.
  • a neural network calculation graph for the data flow calculation architecture can be constructed, and then the network model data corresponding to the neural network model can be imported into each node of the neural network calculation graph, and the neural network calculation graph can perform actual calculations.
  • the calculation of the neural network calculation graph inevitably requires a certain amount of calculation space. Therefore, the calculation space required by the neural network calculation graph needs to be allocated.
  • the driver layer initializes computing nodes according to the computing space and transmits node data of multiple computing nodes in the neural network calculation graph to the hardware layer through a data transmission channel between the driver layer and the hardware layer.
  • the data transmission pipeline is a transmission channel between the node data of the neural network calculation graph and the actual calculated hardware node, and the data transmission of the neural network calculation graph is realized through the data transmission pipeline.
  • Initializing the computing node is to provide a suitable operating environment for the actual operation of the computing node, so that the computing node of the neural network calculation graph can perform the actual operation.
  • the hardware layer sequentially obtains the node data of the multiple computing nodes through the data transmission pipeline and performs calculations based on the node data.
  • When the user uses the neural network to perform actual calculations, the data that needs to be calculated is input to the neural network. After the user's input data is imported into the neural network calculation graph, each computing node of the calculation graph generates corresponding node data; the node data of each computing node is transmitted through the data transmission pipeline to the corresponding hardware node on the data flow engine for actual calculation. When all the hardware nodes have completed their calculations, the output data for the user is obtained.
  • In the fourth embodiment of the present application, a neural network calculation graph for the data flow computing architecture is constructed according to a preset network model and the network model data corresponding to the preset network model, and the calculation space corresponding to the calculation graph is allocated; computing nodes are initialized according to the calculation space, and the node data of multiple computing nodes in the calculation graph is transmitted to the hardware layer through the data transmission channel between the driver layer and the hardware layer; the hardware layer sequentially obtains the node data of the multiple computing nodes through the transmission pipeline and performs calculations based on the node data.
  • The fourth embodiment of the present application thereby makes full use of the characteristics of the data flow architecture to support the actual operation of a data flow architecture device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Advance Control (AREA)

Abstract

Disclosed herein are a neural network operating system and method. The system comprises: a software layer that is configured to, according to a preset network model and network model data corresponding to the preset network model, construct a neural network computational graph for a data stream computing architecture and allocate a computing space corresponding to the neural network computational graph; a driver layer that is connected to the software layer and is configured to carry out computing node initialization according to the computing space and use a data transmission channel between the driver layer and a hardware layer to transmit to the hardware layer node data of multiple computing nodes in the neural network computational graph; and the hardware layer, which is connected to the driver layer and is configured to use the data transmission channel to successively obtain the node data of the multiple computing nodes and perform computing according to the node data.

Description

Neural network operation system and method
Technical Field
The embodiments of the present application relate to the field of neural networks, for example, to a neural network operation system and method.
Background
With the gradual maturing of deep learning technology, neural networks are finding more and more industrial applications, including security, industrial monitoring, and autonomous driving.
A neural network is composed of multiple repetitive calculation layers (also called operators), and its calculation is characterized by high parallelism and a high computational load. Graphics Processing Unit (GPU) devices contain a large number of small computing cores, which on the one hand matches the needs of neural network applications; on the other hand, the early neural network development frameworks were built on GPUs, so most neural networks are deployed on GPUs. However, GPUs were originally designed for image rendering and similar applications, not dedicated neural network computation; GPU architecture efficiency and resource utilization are low, usually below 30%. The low architecture efficiency of the GPU has therefore gradually become a bottleneck in the development of neural network technology.
The computing efficiency of the data flow architecture can reach more than 90%; compared with instruction-set architectures such as the GPU, it makes fuller use of computing resources and is better suited to deploying neural network algorithms. However, although the data flow architecture is technologically advanced, its development time has been limited and it is still in the early stage of application, so there are still great unknowns and uncertainties in how applications run on it.
Summary
The embodiments of the present application provide a neural network operation system and method to distinguish the operation form of a neural network running on a data flow architecture and to reduce the threshold for using data flow devices.
An embodiment of the application provides a neural network operation system, including:
a software layer, configured to construct a neural network calculation graph for a data flow computing architecture according to a preset network model and the network model data corresponding to the preset network model, and to allocate the calculation space corresponding to the neural network calculation graph;
a driver layer, connected to the software layer and configured to initialize computing nodes according to the calculation space and to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer through a data transmission channel between the driver layer and the hardware layer; and
a hardware layer, connected to the driver layer and configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline and to perform calculations based on the node data.
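Purely as an illustration of the division of labor in the claim above, the three layers can be sketched as minimal Python interfaces. All class and method names here are hypothetical, chosen for this sketch rather than taken from the patent.

```python
from abc import ABC, abstractmethod

class SoftwareLayer(ABC):
    """Builds the calculation graph from a preset model and allocates its space."""
    @abstractmethod
    def build_graph(self, model, model_data): ...
    @abstractmethod
    def allocate_space(self, graph): ...

class DriverLayer(ABC):
    """Initializes computing nodes and ships node data to the hardware layer."""
    @abstractmethod
    def init_nodes(self, space): ...
    @abstractmethod
    def transmit(self, node_data): ...

class HardwareLayer(ABC):
    """Obtains node data from the transmission pipeline and computes it."""
    @abstractmethod
    def compute(self, node_data): ...
```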
An embodiment of the present application provides a neural network operation method, including:
the software layer constructs a neural network calculation graph for the data flow computing architecture according to a preset network model and the network model data corresponding to the preset network model, and allocates the calculation space corresponding to the neural network calculation graph;
the driver layer initializes computing nodes according to the calculation space and transmits the node data of multiple computing nodes in the neural network calculation graph to the hardware layer through the data transmission channel between the driver layer and the hardware layer; and
the hardware layer sequentially obtains the node data of the multiple computing nodes through the data transmission pipeline and performs calculations based on the node data.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a neural network operation system provided in Embodiment 1 of the application;
FIG. 2 is a schematic structural diagram of another neural network operation system provided in Embodiment 2 of the application;
FIG. 3 is a schematic structural diagram of another neural network operation system provided in Embodiment 3 of the application;
FIG. 4 is a schematic flowchart of a neural network operation method provided in Embodiment 4 of the application.
Reference numerals: 110 - software layer; 120 - driver layer; 130 - hardware layer; 111 - calculation graph construction module; 112 - calculation graph initialization module; 113 - memory allocation module; 114 - calculation graph running module; 115 - data output module; 116 - calculation graph operation management module; 121 - device initialization module; 122 - computing node initialization module; 123 - data transmission module; 124 - register configuration module; 125 - data writing module; 1231 - data reading sub-module; 1232 - data transmission sub-module; 131 - input/output (I/O) initialization module; 132 - node calculation module; 1321 - data acquisition sub-module; 1322 - on-chip storage sub-module; 1323 - hardware node calculation sub-module.
Detailed Description
The application is described below with reference to the drawings and embodiments. The specific embodiments described herein are only used to explain the application, not to limit it. For ease of description, the drawings show only the parts related to the present application rather than the entire structure.
Some exemplary embodiments are described as processes or methods depicted in flowcharts. Although a flowchart describes steps as sequential processing, many of the steps can be implemented in parallel, concurrently, or simultaneously. In addition, the order of the steps can be rearranged. Processing may be terminated when its steps are completed, but there may also be additional steps not shown in the drawing. Processing can correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.
The terms “first”, “second”, etc. may be used herein to describe various directions, actions, steps or elements, but these directions, actions, steps or elements are not limited by these terms. These terms are only used to distinguish a first direction, action, step or element from another direction, action, step or element. For example, without departing from the scope of the present application, the first computing node may be referred to as the second computing node and, similarly, the second computing node may be referred to as the first computing node. Both the first computing node and the second computing node are computing nodes, but they are not the same computing node. The terms “first”, “second”, etc. should not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated; a feature defined with “first” or “second” may thus explicitly or implicitly include one or more of that feature. In the description of this application, “multiple” means at least two, for example two or three, unless otherwise defined.
Embodiment 1
FIG. 1 is a schematic structural diagram of a neural network operation system provided in Embodiment 1 of the application, which is applicable to the operation of a neural network based on a data flow computing architecture. As shown in FIG. 1, the neural network operation system provided by Embodiment 1 of the present application includes a software layer 110, a driver layer 120 and a hardware layer 130.
The software layer 110 is configured to construct a neural network calculation graph for the data flow computing architecture according to a preset network model and the network model data corresponding to the preset network model, and to allocate the calculation space corresponding to the neural network calculation graph.
In this embodiment, the preset network model is a neural network model that needs to be computed under the data flow architecture. The neural network calculation graph is a form of expression used when the neural network performs actual calculations under the data flow computing architecture; it includes multiple computing nodes and the connection relationships between them, and a computing node in the calculation graph can correspond to one or more layers of the neural network. The network model data corresponding to the neural network model is the data of each computing node in the neural network model. According to the preset network model, a neural network calculation graph for the data flow computing architecture can be constructed; the network model data corresponding to the neural network model is then imported into each node of the calculation graph, after which the calculation graph can perform actual calculations.
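As a concrete, purely illustrative picture of such a calculation graph, the sketch below stores computing nodes and their connection relationships and imports per-node model data. The names `ComputeNode`, `CalcGraph`, and `import_model_data` are assumptions made for this sketch, not the patent's data structures.

```python
class ComputeNode:
    def __init__(self, name, layers):
        self.name = name        # node identifier
        self.layers = layers    # the one or more network layers this node covers
        self.data = None        # node data, filled in when the graph is initialized

class CalcGraph:
    def __init__(self):
        self.nodes = {}         # name -> ComputeNode
        self.edges = []         # (producer, consumer) connection relationships

    def add_node(self, node):
        self.nodes[node.name] = node

    def connect(self, src, dst):
        self.edges.append((src, dst))

    def import_model_data(self, model_data):
        # Import the network model data into every computing node of the graph.
        for name, data in model_data.items():
            self.nodes[name].data = data

# Usage: a two-node graph whose nodes each cover two layers.
g = CalcGraph()
g.add_node(ComputeNode("n1", ["conv", "relu"]))
g.add_node(ComputeNode("n2", ["pool", "fc"]))
g.connect("n1", "n2")
g.import_model_data({"n1": b"weights-1", "n2": b"weights-2"})
```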
In one embodiment, since the calculation of the neural network calculation graph inevitably requires a certain amount of calculation space, the software layer 110 also allocates the calculation space required by the calculation graph.
The driver layer 120 is connected to the software layer 110. The driver layer 120 is configured to initialize the computing nodes according to the calculation space and to transmit the node data of multiple computing nodes in the neural network calculation graph to the hardware layer 130 through the data transmission channel between the driver layer 120 and the hardware layer 130.
In this embodiment, after the software layer 110 constructs the neural network calculation graph, it passes the calculation graph to the driver layer 120. The driver layer 120 initiates a device initialization request to the hardware layer 130, and the hardware layer 130 initializes the corresponding I/O interface according to the request, thereby constructing a data transmission pipeline between the driver layer 120 and the hardware layer 130. The software layer 110 controls the data transmission in this pipeline and thereby controls the hardware layer 130 to perform calculations.
In one embodiment, after the software layer 110 allocates the calculation space required by the neural network calculation graph, the driver layer 120 also initializes the computing nodes of the calculation graph, so that they can perform actual operations.
The hardware layer 130 is connected to the driver layer 120 and is configured to sequentially obtain the node data of the multiple computing nodes in the neural network calculation graph through the data transmission pipeline and to perform calculations based on the node data.
In this embodiment, when the user uses the neural network to perform actual calculations, the data that needs to be calculated is input to the neural network. The software layer 110 obtains the input data, imports it into the neural network calculation graph, and then initiates a neural network calculation graph running request to the driver layer 120, so that the calculation graph performs calculations based on the user's input data. The driver layer 120 traverses each computing node of the calculation graph to obtain node data and transmits the node data to the hardware layer 130 through the data transmission pipeline. After the hardware layer 130 obtains the node data, it completes the calculation of the computing node on the data flow engine, obtains the output data corresponding to the input data, and transmits the output data back to the driver layer 120 through the pipeline. After the calculation of the entire graph is completed, the final output data is transmitted from the driver layer 120 to the software layer 110, where the user can obtain the output data corresponding to the input data.
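The run flow just described (the software layer imports the input, the driver layer traverses the nodes and feeds the pipeline, the hardware layer computes each node on the data flow engine) can be condensed into the following toy sketch. All names are assumed, and a plain queue stands in for the data transmission pipeline.

```python
from collections import deque

class ToyEngine:
    """Stand-in for the hardware data flow engine."""
    def compute(self, node_data, feed):
        return node_data["fn"](feed)   # apply the node's operation to the data

def run_graph(node_list, input_data, engine):
    """Driver-layer traversal: ship each node's data down the pipeline,
    let the engine compute it, and return the final output to the caller."""
    pipeline = deque()
    feed = input_data                      # input imported by the software layer
    for node_data in node_list:            # traverse every computing node
        pipeline.append(node_data)         # transmit node data to the hardware
        feed = engine.compute(pipeline.popleft(), feed)   # hardware computes
    return feed                            # final output flows back to the user

# Usage: two chained nodes (scale, then offset).
nodes = [{"fn": lambda x: 2 * x}, {"fn": lambda x: x + 1}]
print(run_graph(nodes, 3, ToyEngine()))    # -> 7
```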
The neural network operation system provided by Embodiment 1 of the application divides the operating form of the neural network across the software layer, the driver layer and the hardware layer. The three layers have different functions and interact pairwise during the actual operation of the neural network. When using the neural network for actual calculations, the user only needs to input data or obtain output data through the software layer; the actual calculations are controlled by the driver layer and completed at the hardware layer. The user is thus isolated from the underlying hardware computation, which lowers the threshold for using data flow devices and is conducive to their wide application.
Embodiment 2
FIG. 2 is a schematic structural diagram of another neural network operation system provided in Embodiment 2 of the application; this embodiment is described on the basis of the above-mentioned embodiment. As shown in FIG. 2, the neural network operation system provided in Embodiment 2 of the present application includes a software layer 110, a driver layer 120 and a hardware layer 130. In this embodiment, the software layer 110 includes a calculation graph construction module 111, a calculation graph initialization module 112, a memory allocation module 113 and a calculation graph running module 114; the driver layer 120 includes a device initialization module 121, a data transmission module 123 and a computing node initialization module 122; the hardware layer 130 includes an I/O initialization module 131 and a node calculation module 132.
The calculation graph construction module 111 is configured to construct a neural network calculation graph for the data flow computing architecture according to the preset network model.
In this embodiment, a neural network is a complex network system formed by a large number of simple processing units that are widely interconnected. These simple processing units are also called operators; that is, a neural network model consists of a large number of interconnected operators. In actual calculations, the one or more layers in a neural network that complete one function are usually called a computing node. The network model data corresponding to the neural network model is the data of the multiple computing nodes in the neural network model, and the neural network calculation graph is a form of expression used when the neural network performs actual calculations, including the multiple computing nodes of the neural network model and the connection relationships between them. A computing node and an operator can be the same size or different sizes, and the size relationship between them differs between neural network models. For example, suppose the operators included in a neural network model are of four types: A1, A2, A3 and A4. The computing nodes of the neural network calculation graph can then be a first computing node A1+A2 and a second computing node A3+A4, and the connection relationship between the nodes can be to run the first computing node A1+A2 and the second computing node A3+A4 first, and then to run their sum: A1+A2+A3+A4. Constructing a neural network calculation graph for the data flow computing architecture according to the preset network model means constructing the operators of the neural network model and the connection relationships between the operators into the computing nodes of the data-flow-based calculation graph and the connection relationships between those computing nodes.
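To make the A1-A4 example concrete, the sketch below groups consecutive operators into computing nodes. The `fuse` helper and the fixed group size are assumptions made for this illustration only.

```python
operators = ["A1", "A2", "A3", "A4"]   # operator types of the preset model

def fuse(ops, group_size=2):
    """Group consecutive operators into computing nodes, e.g.
    ["A1", "A2", "A3", "A4"] -> [("A1", "A2"), ("A3", "A4")]."""
    return [tuple(ops[i:i + group_size]) for i in range(0, len(ops), group_size)]

nodes = fuse(operators)                # first node A1+A2, second node A3+A4
# Connection relationship: run the two nodes first, then combine their results.
schedule = [nodes[0], nodes[1], nodes[0] + nodes[1]]
print(nodes)                           # [('A1', 'A2'), ('A3', 'A4')]
print(schedule[-1])                    # ('A1', 'A2', 'A3', 'A4')
```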
The calculation graph initialization module 112 is configured to import the network model data corresponding to the preset network model into the neural network calculation graph and to pass the neural network calculation graph to the driver layer 120.
In this embodiment, initializing the neural network calculation graph means importing the network model data into each computing node of the neural network calculation graph, so that each computing node of the calculation graph contains actual data and can perform actual computation. The calculation graph initialization module 112 passes the initialized neural network calculation graph to the driver layer 120.
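A minimal sketch of this initialization step follows; the dictionary layouts of the graph and of the network model data are assumptions of the sketch, since real model data is format-specific.

```python
def initialize_graph(graph, model_data):
    # Import the network model data into every computing node so that each
    # node holds actual data and can perform actual computation; only an
    # initialized graph is passed on to the driver layer.
    for name, node in graph.items():
        node["data"] = model_data[name]  # per-node weights/parameters
    return graph
```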
The memory allocation module 113 is configured to allocate the computing space required by the neural network calculation graph and to initiate a computing node initialization request to the driver layer 120; the computing node initialization request includes the computing space.
In this embodiment, the computing space required by the neural network calculation graph is allocated by the memory allocation module 113. After allocating the computing space, the memory allocation module 113 initiates a computing node initialization request to the driver layer 120, so as to provide a suitable environment for the actual computation of the computing nodes.
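By way of example and not limitation, the allocation step can be sketched as follows, assuming per-node output sizes are known; the driver-layer entry point init_computing_nodes and the node fields are hypothetical names for this sketch.

```python
def allocate_and_request_init(graph, driver):
    # Allocate the computing space each computing node needs, then initiate
    # the computing node initialization request toward the driver layer;
    # as described above, the request carries the allocated space.
    space = {}
    for name, node in graph.items():
        space[name] = node["output_elems"] * node["elem_size"]  # bytes
    driver.init_computing_nodes(space)
    return space
```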
The calculation graph running module 114 is configured to obtain the input data of the preset network model, import the input data into the neural network calculation graph, and initiate a calculation graph running request to the driver layer 120.
The device initialization module 121 is configured to initiate a device initialization request to the hardware layer 130.
In this embodiment, the device initialization module 121 initiates a device initialization request to the hardware layer 130, so that the hardware layer 130 performs I/O interface initialization.
The data transmission module 123 is configured to transmit the node data of the multiple computing nodes in the neural network calculation graph to the hardware layer 130, according to the calculation graph running request, through the data transmission channel between the driver layer 120 and the hardware layer 130.
The computing node initialization module 122 is configured to perform computing node initialization according to the computing node initialization request.
In this embodiment, after receiving the computing node initialization request sent by the memory allocation module 113, the computing node initialization module 122 initializes the computing nodes of the neural network calculation graph.
The I/O initialization module 131 is configured to complete, according to the device initialization request, the initialization of the I/O interface corresponding to the request, and to establish the data transmission pipeline.
In this embodiment, after receiving the device initialization request sent by the device initialization module 121, the I/O initialization module 131 completes the initialization of the I/O interface corresponding to the request, thereby establishing the data transmission pipeline between the driver layer 120 and the hardware layer 130.
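For illustration, the pipeline established here can be modelled in software with an in-process queue; a real implementation would configure DMA or PCIe endpoints rather than a queue, and the request layout is a placeholder.

```python
from queue import Queue


class DataPipeline:
    """Software stand-in for the driver-layer/hardware-layer pipeline."""

    def __init__(self):
        self._queue = Queue()

    def send(self, item):
        self._queue.put(item)

    def receive(self):
        return self._queue.get()


def handle_device_init(request):
    # Complete the initialization of the I/O interface named by the device
    # initialization request, then return the established pipeline.
    # The {"io_id": ...} layout is a hypothetical placeholder.
    _ = request.get("io_id")
    return DataPipeline()
```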
The node calculation module 132 is configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline and to perform computation according to the node data.
The neural network operating system provided in Embodiment 2 of the present application completes the initialization of the neural network operating environment through the cooperation of the multiple modules of the software layer, the driver layer, and the hardware layer, providing a suitable operating environment for running the neural network.
Embodiment 3
FIG. 3 is a schematic structural diagram of another neural network operating system provided in Embodiment 3 of the present application; this embodiment is described on the basis of the embodiments above. As shown in FIG. 3, the neural network operating system provided in Embodiment 3 of the present application includes a software layer 110, a driver layer 120, and a hardware layer 130. In this embodiment, the software layer 110 includes a calculation graph construction module 111, a calculation graph initialization module 112, a memory allocation module 113, a calculation graph running module 114, a data output module 115, and a calculation graph running management module 116; the driver layer 120 includes a device initialization module 121, a data transmission module 123, a computing node initialization module 122, a register configuration module 124, and a data write-out module 125; the hardware layer 130 includes an I/O initialization module 131 and a node calculation module 132. The data transmission module 123 includes a data read-in submodule 1231 and a data transmission submodule 1232; the node calculation module 132 includes a data acquisition submodule 1321, an on-chip storage submodule 1322, and a hardware node calculation submodule 1323.
The calculation graph construction module 111 is configured to construct a neural network calculation graph for the data flow computing architecture according to the preset network model.
In this embodiment, a neural network is a complex network system formed by a large number of simple processing units that are widely interconnected; these simple processing units are also called operators, i.e., the neural network model is formed by a large number of interconnected operators. In actual computation, one or more layers of a neural network that together complete a function are usually called a computing node, and the network model data corresponding to the neural network model is the data of each computing node in the neural network model. The neural network calculation graph is a form in which the neural network is expressed for actual computation, and includes the multiple computing nodes of the neural network model and the connection relationships between those computing nodes. A computing node and an operator may be of the same size or of different sizes, and the size relationship between computing nodes and operators differs between neural network models. For example, if the neural network model contains four types of operators, A1, A2, A3, and A4, the computing nodes of the neural network calculation graph may be a first computing node A1+A2 and a second computing node A3+A4, and the connection relationship between the nodes may be to first run the first computing node A1+A2 and the second computing node A3+A4, and then run their sum: A1+A2+A3+A4. Constructing a neural network calculation graph for the data flow computing architecture according to the preset network model means mapping the computing nodes of the neural network model and the connection relationships between the computing nodes onto the computing nodes of the data-flow-based neural network calculation graph and the connection relationships between those computing nodes.
The calculation graph initialization module 112 is configured to import the network model data corresponding to the preset network model into the neural network calculation graph and to pass the neural network calculation graph to the driver layer 120.
In this embodiment, initializing the neural network calculation graph means importing the network model data into each computing node of the neural network calculation graph, so that each computing node of the calculation graph contains actual data and can perform actual computation. The calculation graph initialization module 112 passes the initialized neural network calculation graph to the driver layer 120.
The memory allocation module 113 is configured to allocate the computing space required by the neural network calculation graph and to initiate a computing node initialization request to the driver layer 120.
In this embodiment, the computing space required by the neural network calculation graph is allocated by the memory allocation module 113. After allocating the computing space, the memory allocation module 113 initiates a computing node initialization request to the driver layer 120, so as to provide a suitable environment for the actual computation of the computing nodes.
The device initialization module 121 is configured to initiate a device initialization request to the hardware layer 130.
In this embodiment, the device initialization module 121 initiates a device initialization request to the hardware layer 130, so that the hardware layer 130 performs I/O interface initialization.
The computing node initialization module 122 is configured to perform computing node initialization according to the computing node initialization request.
In this embodiment, after receiving the computing node initialization request sent by the memory allocation module 113, the computing node initialization module 122 initializes the computing nodes of the neural network calculation graph.
The I/O initialization module 131 is configured to complete, according to the device initialization request, the initialization of the I/O interface corresponding to the request, and to establish the data transmission pipeline.
In this embodiment, after receiving the device initialization request sent by the device initialization module 121, the I/O initialization module 131 completes the initialization of the I/O interface corresponding to the request, thereby establishing the data transmission pipeline between the driver layer 120 and the hardware layer 130.
In this embodiment, the interaction among the calculation graph construction module 111, the calculation graph initialization module 112, the memory allocation module 113, the device initialization module 121, the computing node initialization module 122, and the I/O initialization module 131 completes the initialization process of the neural network operating system, establishing a sound operating environment for the actual running of the neural network.
The calculation graph running module 114 is configured to obtain the input data of the preset neural network model, import the input data into the neural network calculation graph, and initiate a calculation graph running request to the driver layer 120.
In this embodiment, when a user employs the neural network for actual computation, the data to be computed is input to the neural network. The calculation graph running module 114 obtains the user's input data, imports it into the neural network calculation graph, and then initiates a running request for the calculation graph to the driver layer 120, so that the calculation graph computes on the user's input data.
The data read-in submodule 1231 is configured to read the node data of the multiple computing nodes in the neural network calculation graph according to the calculation graph running request.
The data transmission submodule 1232 is configured to transmit the node data of the multiple computing nodes in the neural network calculation graph to the hardware layer 130 through the data transmission channel.
In this embodiment, the data read-in submodule 1231 reads the node data of the multiple computing nodes in the neural network calculation graph, and the data transmission submodule 1232 transmits the node data of the multiple computing nodes to the hardware layer 130 through the data transmission pipeline, so that the hardware layer 130 performs the actual computation of the computing nodes.
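The cooperation of the two submodules can be sketched as follows; the node and pipeline layouts follow the earlier illustrative sketches and are not limiting.

```python
def transmit_node_data(graph_nodes, pipeline):
    # Data read-in submodule: read each computing node's node data from the
    # calculation graph; data transmission submodule: push it through the
    # pipeline so the hardware layer can compute the node.
    for node in graph_nodes:
        node_data = node["data"]
        pipeline.send((node["name"], node_data))
```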
The calculation graph running management module 116 is configured to manage the timing of the neural network calculation graph at run time and the computing space it requires.
The register configuration module 124 is configured to control the hardware layer 130 to establish, on the data flow engine, the hardware nodes corresponding to the multiple computing nodes in the neural network calculation graph.
In this embodiment, the register configuration module 124 controls the hardware nodes, corresponding to the computing nodes of the neural network calculation graph, that are built on the data-flow-architecture-based hardware layer 130, so that the neural network calculation graph is computed on the data flow engine.
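A purely illustrative sketch of this register-driven setup follows; the register addresses, type codes, and the write_register callback are placeholders and do not describe an actual register map of the data flow engine.

```python
NODE_TYPE_CODES = {"conv": 1, "pool": 2, "add": 3}  # placeholder codes


def configure_hardware_nodes(graph_nodes, write_register):
    # Program the data flow engine so that a hardware node exists for each
    # computing node of the calculation graph before its node data arrives.
    for index, node in enumerate(graph_nodes):
        code = NODE_TYPE_CODES.get(node["op"], 0)
        write_register(0x1000 + index * 0x10, code)
```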
The data acquisition submodule 1321 is configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline.
The on-chip storage submodule 1322 is configured to store the data transmitted by the data acquisition submodule 1321 through the data transmission pipeline and the output data computed by the hardware node calculation submodule 1323.
In this embodiment, the on-chip storage submodule 1322 is configured for data storage: both the node data of the multiple nodes computed by the hardware layer 130 and the computed output data are stored in the on-chip storage submodule 1322.
The hardware node calculation submodule 1323 is configured to import the node data in the on-chip storage submodule 1322 into the hardware nodes, complete the computation of the hardware nodes on the data flow engine to obtain the output data, and store the output data in the on-chip storage submodule 1322.
In this embodiment, when the hardware layer 130 computes the neural network, the hardware node calculation submodule 1323 retrieves node data from the on-chip storage submodule 1322 and imports it into the corresponding hardware node, so as to complete the computation of that hardware node on the data flow engine. When all hardware nodes on the data flow engine have completed their computation, output data to be finally delivered to the user is obtained; the hardware node calculation submodule 1323 stores this output data in the on-chip storage submodule 1322 so that it can be transmitted to the software layer 110 through the driver layer 120.
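By way of example and not limitation, the interplay of the three submodules can be sketched as below; the compute callback stands in for the data flow engine's per-node computation, and the on_chip dictionary models the on-chip storage.

```python
def run_hardware_nodes(pipeline, node_count, on_chip, compute):
    # Data acquisition submodule: fetch node data from the pipeline in order.
    # On-chip storage submodule: stage both inputs and results on chip.
    # Hardware node calculation submodule: run each hardware node in turn.
    output = None
    for _ in range(node_count):
        name, node_data = pipeline.receive()
        on_chip[name] = node_data
        output = compute(name, node_data)
        on_chip[name + "_out"] = output
    return output  # final output, kept on chip for the write-out module
```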
The data write-out module 125 is configured to transmit the output data to the software layer 110 through the data transmission pipeline.
In this embodiment, the hardware node calculation submodule 1323 stores the output data in the on-chip storage submodule 1322; the data write-out module 125 retrieves the output data from the on-chip storage submodule 1322 through the data transmission pipeline and transmits it to the software layer 110.
The data output module 115 is configured to output the output data to the user.
In this embodiment, the data write-out module 125 of the driver layer 120 transmits the output data to the data output module 115 of the software layer 110, and the data output module 115 outputs it to a data storage terminal or a host computer, so that the user can obtain, through the data storage terminal or host computer, the output data produced by the neural network from the input data.
In this embodiment, the interaction among the calculation graph running module 114, the data output module 115, the calculation graph running management module 116, the data transmission module 123, the register configuration module 124, the data write-out module 125, and the node calculation module 132 completes the actual computation of the neural network on the data flow engine: the user only needs to input a set of input data to the software layer 110 and, after computation by the hardware layer 130, obtains the corresponding set of output data.
The neural network operating system provided in Embodiment 3 of the present application obtains the user's input data and presents the output data to the user through the software layer, completes the data transmission between the software layer and the hardware layer through the driver layer, and completes the computation of the neural network on the data flow engine through the hardware layer, thereby realizing actual computation on a data flow architecture device. By dividing the running of the neural network into three parts across the software layer, the driver layer, and the hardware layer, the user only needs to operate at the software layer, isolated from the hardware layer, which facilitates the application of data flow devices.
Embodiment 4
FIG. 4 is a schematic flowchart of a neural network operating method provided in Embodiment 4 of the present application, which is applicable to running a neural network based on a data flow computing architecture. The method may be implemented by the neural network operating system provided in any embodiment of the present application and has the beneficial effects of the corresponding functional modules of that system; for content not described in Embodiment 4, reference may be made to the description in any system embodiment of the present application.
As shown in FIG. 4, the neural network operating method provided in Embodiment 4 of the present application includes the following steps.
S410: the software layer constructs a neural network calculation graph for the data flow computing architecture according to a preset network model and the network model data corresponding to the preset network model, and allocates the computing space corresponding to the neural network calculation graph.
In this embodiment, the preset network model is a neural network model that is to be computed under the data flow architecture. One or more layers of a neural network that together complete a function are usually called a computing node; the neural network model is composed of multiple computing nodes arranged in a specific connection relationship, and the network model data corresponding to the neural network model is the data of each computing node in the neural network model. The neural network calculation graph is a form in which the neural network is expressed for actual computation under the data flow computing architecture, and includes each computing node of the neural network model and the connection relationships between the computing nodes. A neural network calculation graph for the data flow computing architecture can be constructed according to the preset network model, and the network model data corresponding to the neural network model can then be imported into each node of the calculation graph, after which the calculation graph can perform actual computation.
In an embodiment, the computation of the neural network calculation graph necessarily requires a certain amount of computing space; therefore, the computing space required by the neural network calculation graph also needs to be allocated.
S420: the driver layer performs computing node initialization according to the computing space and transmits the node data of the multiple computing nodes in the neural network calculation graph to the hardware layer through the data transmission channel between the driver layer and the hardware layer.
In this embodiment, the data transmission pipeline is the transmission channel between the node data of the neural network calculation graph and the hardware nodes performing the actual computation; the data transmission of the neural network calculation graph is realized through this pipeline. Initializing the computing nodes provides a suitable running environment for their actual computation, so that the computing nodes of the neural network calculation graph can perform actual computation.
S430: the hardware layer sequentially obtains the node data of the multiple computing nodes through the data transmission pipeline and performs computation according to the node data.
In this embodiment, when a user employs the neural network for actual computation, the data to be computed is input to the neural network. After the user's input data is imported into the neural network calculation graph, each computing node of the calculation graph produces corresponding node data, and the node data of each computing node is transmitted through the data transmission pipeline to the corresponding hardware node on the data flow engine for actual computation. When all hardware nodes have finished computing, the output data for the user is obtained.
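The three steps S410 to S430 can be condensed into one illustrative end-to-end sketch; the graph layout, the compute callback, and the in-process list modelling the transmission channel are all assumptions of the sketch, not a fixed interface.

```python
def run_neural_network(graph, model_data, user_input, compute):
    # S410 (software layer): import model data and allocate computing space.
    for node in graph:
        node["data"] = model_data[node["name"]]
        node["buffer"] = bytearray(node["out_bytes"])
    # S420 (driver layer): stream every node's data through the channel.
    channel = [(node["name"], node["data"]) for node in graph]
    # S430 (hardware layer): fetch node data in order and compute, threading
    # the running result from node to node until the user's output is ready.
    output = user_input
    for name, node_data in channel:
        output = compute(name, node_data, output)
    return output
```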
In Embodiment 4 of the present application, a neural network calculation graph for the data flow computing architecture is constructed according to a preset network model and the network model data corresponding to the preset network model, and the computing space corresponding to the calculation graph is allocated; computing node initialization is performed according to the computing space, and the node data of the multiple computing nodes in the calculation graph is transmitted to the hardware layer through the data transmission channel between the driver layer and the hardware layer; the node data of the multiple computing nodes is obtained in sequence through the transmission pipeline, and computation is performed according to the node data. Embodiment 4 of the present application makes full use of the characteristics of the data flow architecture and supports the actual computation of data flow architecture devices.

Claims (10)

  1. A neural network operating system, comprising:
    a software layer, configured to construct a neural network calculation graph for a data flow computing architecture according to a preset network model and network model data corresponding to the preset network model, and to allocate computing space corresponding to the neural network calculation graph;
    a driver layer, connected to the software layer and configured to perform computing node initialization according to the computing space and to transmit node data of multiple computing nodes in the neural network calculation graph to a hardware layer through a data transmission channel between the driver layer and the hardware layer; and
    the hardware layer, connected to the driver layer and configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline and to perform computation according to the node data.
  2. The system according to claim 1, wherein the software layer comprises:
    a calculation graph construction module, configured to construct the neural network calculation graph for the data flow computing architecture according to the preset network model;
    a calculation graph initialization module, configured to import the network model data corresponding to the preset network model into the neural network calculation graph and to pass the neural network calculation graph to the driver layer; and
    a memory allocation module, configured to allocate the computing space of the neural network calculation graph and to initiate a computing node initialization request to the driver layer, wherein the computing node initialization request includes the computing space.
  3. The system according to claim 2, wherein the software layer further comprises: a calculation graph running module, configured to obtain input data of the preset neural network model, import the input data into the neural network calculation graph, and initiate a calculation graph running request to the driver layer.
  4. The system according to claim 3, wherein the driver layer comprises:
    a device initialization module, configured to initiate a device initialization request to the hardware layer;
    a data transmission module, configured to transmit the node data of the multiple computing nodes in the neural network calculation graph to the hardware layer, according to the calculation graph running request, through the data transmission channel between the driver layer and the hardware layer; and
    a computing node initialization module, configured to perform computing node initialization according to the computing node initialization request.
  5. The system according to claim 4, wherein the hardware layer comprises:
    an input/output (I/O) initialization module, configured to complete, according to the device initialization request, the initialization of the I/O interface corresponding to the device initialization request, so as to establish the data transmission pipeline; and
    a node calculation module, configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline and to perform computation according to the node data.
  6. The system according to claim 5, wherein the data transmission module comprises:
    a data read-in submodule, configured to read the node data of the multiple computing nodes in the neural network calculation graph according to the calculation graph running request; and
    a data transmission submodule, configured to transmit the node data of the multiple computing nodes in the neural network calculation graph to the hardware layer through the data transmission channel;
    wherein the system further comprises: a register configuration module, configured to control the hardware layer to establish, on a data flow engine, hardware nodes corresponding to the multiple computing nodes.
  7. The system according to claim 6, wherein the node calculation module comprises:
    a data acquisition submodule, configured to sequentially obtain the node data of the multiple computing nodes through the data transmission pipeline;
    an on-chip storage submodule, configured to store the node data obtained by the data acquisition submodule and the output data computed by the hardware node calculation submodule; and
    a hardware node calculation submodule, configured to import the node data in the on-chip storage submodule into the hardware nodes, complete the computation of the hardware nodes on the data flow engine to obtain the output data, and store the output data in the on-chip storage submodule.
  8. The system according to claim 7, wherein the driver layer further comprises:
    a data write-out module, configured to transmit the output data to the software layer through the data transmission pipeline.
  9. The system according to claim 8, wherein the software layer further comprises:
    a data output module, configured to output the output data.
  10. A neural network operating method, comprising:
    constructing, by a software layer, a neural network calculation graph for a data flow computing architecture according to a preset network model and network model data corresponding to the preset network model, and allocating computing space corresponding to the neural network calculation graph;
    performing, by a driver layer, computing node initialization according to the computing space, and transmitting node data of multiple computing nodes in the neural network calculation graph to a hardware layer through a data transmission channel between the driver layer and the hardware layer; and
    sequentially obtaining, by the hardware layer, the node data of the multiple computing nodes through the data transmission pipeline, and performing computation according to the node data.
PCT/CN2019/112466 2019-10-22 2019-10-22 Neural network operating system and method WO2021077284A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980100192.4A CN114365148A (en) 2019-10-22 2019-10-22 Neural network operation system and method
PCT/CN2019/112466 WO2021077284A1 (en) 2019-10-22 2019-10-22 Neural network operating system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/112466 WO2021077284A1 (en) 2019-10-22 2019-10-22 Neural network operating system and method

Publications (1)

Publication Number Publication Date
WO2021077284A1 true WO2021077284A1 (en) 2021-04-29

Family

ID=75619592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/112466 WO2021077284A1 (en) 2019-10-22 2019-10-22 Neural network operating system and method

Country Status (2)

Country Link
CN (1) CN114365148A (en)
WO (1) WO2021077284A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033391B (en) * 2022-08-10 2022-11-11 之江实验室 Data flow method and device for neural network calculation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140289445A1 (en) * 2013-03-22 2014-09-25 Antony Savich Hardware accelerator system and method
US20180114117A1 (en) * 2016-10-21 2018-04-26 International Business Machines Corporation Accelerate deep neural network in an fpga
CN108090560A (en) * 2018-01-05 2018-05-29 中国科学技术大学苏州研究院 The design method of LSTM recurrent neural network hardware accelerators based on FPGA
CN108154229A (en) * 2018-01-10 2018-06-12 西安电子科技大学 Accelerate the image processing method of convolutional neural networks frame based on FPGA
CN109643336A (en) * 2018-01-15 2019-04-16 深圳鲲云信息科技有限公司 Artificial intelligence process device designs a model method for building up, system, storage medium, terminal
CN109858610A (en) * 2019-01-08 2019-06-07 广东浪潮大数据研究有限公司 A kind of accelerated method of convolutional neural networks, device, equipment and storage medium
CN110096401A (en) * 2019-05-13 2019-08-06 苏州浪潮智能科技有限公司 A kind of server data process performance test method and device


Also Published As

Publication number Publication date
CN114365148A (en) 2022-04-15

Similar Documents

Publication Publication Date Title
TWI803663B (en) A computing device and computing method
WO2020029592A1 (en) Conversion method, apparatus, computer device, and storage medium
WO2019127838A1 (en) Method and apparatus for realizing convolutional neural network, terminal, and storage medium
US11348004B2 (en) Method of managing data representation for deep learning, method of processing data for deep learning and deep learning system performing the same
US20160012350A1 (en) Interoperable machine learning platform
CN110506260A (en) It is read by minimizing memory using the blob data being aligned in the processing unit of neural network environment and improves performance
JP2022511716A (en) Decentralized deep learning
CN113469355B (en) Multi-model training pipeline in distributed system
US20230023303A1 (en) Machine learning network implemented by statically scheduled instructions
JP2021504837A (en) Fully connected / regression deep network compression through enhancing spatial locality to the weight matrix and providing frequency compression
WO2014204615A2 (en) Methods and apparatus for iterative nonspecific distributed runtime architecture and its application to cloud intelligence
CN114254733A (en) Neural network weight distribution using a tree-shaped Direct Memory Access (DMA) bus
US20230251979A1 (en) Data processing method and apparatus of ai chip and computer device
KR20210125559A (en) Methods and devices for step-assisted workflows
CN113592066A (en) Hardware acceleration method, apparatus, device, computer program product and storage medium
WO2023221406A1 (en) Method and apparatus for operating deep learning compiler, and electronic device
CN109711540B (en) Computing device and board card
WO2021077284A1 (en) Neural network operating system and method
WO2022012563A1 (en) Neural network data processing method, apparatus and device, and storage medium
US20160004803A1 (en) Simulation Sequence In Chemical Process Simulation For Chemical Process Flowsheet With Strongly Connected Components
CN116909748A (en) Computing power resource allocation method and device, electronic equipment and storage medium
WO2023071566A1 (en) Data processing method and apparatus, computer device, computer-readable storage medium, and computer program product
WO2022247880A1 (en) Method for fusing operators of neural network, and related product
US11709783B1 (en) Tensor data distribution using grid direct-memory access (DMA) controller
Jeong et al. WebRTC-based resource offloading in smart home environments

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19949891

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 27/09/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19949891

Country of ref document: EP

Kind code of ref document: A1