CN111338816B - Instruction interaction method, system, equipment and storage medium based on neural network - Google Patents


Info

Publication number
CN111338816B
CN111338816B (application CN202010099596.0A)
Authority
CN
China
Prior art keywords
instruction
data
layer
graph
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010099596.0A
Other languages
Chinese (zh)
Other versions
CN111338816A (en)
Inventor
李金鹏
黄炯凯
蔡权雄
牛昕宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Corerain Technologies Co Ltd
Original Assignee
Shenzhen Corerain Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Corerain Technologies Co Ltd filed Critical Shenzhen Corerain Technologies Co Ltd
Priority to CN202010099596.0A
Publication of CN111338816A
Application granted
Publication of CN111338816B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/545 Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Debugging And Monitoring (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the invention discloses an instruction interaction method, system, device and storage medium based on a neural network. The instruction interaction method based on the neural network comprises the following steps: requesting a daemon module to check whether an idle hardware layer exists; when an idle hardware layer exists, parsing a preset neural network model to obtain structure data and parameter data, and preprocessing an input calculation graph; parsing the structure data to obtain instruction data identifiable by a driving layer; calling a first instruction so that the driving layer receives the parameter data and the instruction data and initializes according to them; calling a second instruction so that the driving layer receives the input calculation graph; and calling a third instruction so that the driving layer drives the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data, obtaining an output calculation graph. The embodiment of the invention thereby improves the multi-hardware communication efficiency of the neural network.

Description

Instruction interaction method, system, equipment and storage medium based on neural network
Technical Field
The embodiment of the invention relates to the technical field of neural networks, in particular to a method, a system, equipment and a storage medium for instruction interaction based on a neural network.
Background
With the gradual maturation of deep learning technology, industrial applications based on neural networks are becoming more and more common, including security, industrial monitoring, automatic driving and the like.
A neural network consists of many repeated computing layers (also called operators), and its computing pattern is characterized by high parallelism and a high computational load. In the related art, the software layer directly drives the hardware layer through the driving layer, and the software layer communicates directly with the driving layer.
However, this approach makes it difficult to support communication across multiple hardware layers: when multiple hardware layers need to be controlled, software layers cannot notify one another of hardware-layer occupancy, or such notification is complex, and message-based communication is inefficient.
Disclosure of Invention
The embodiment of the invention provides a method, a system, equipment and a storage medium for instruction interaction based on a neural network, so as to improve the communication efficiency of multiple hardware of the neural network.
The embodiment of the invention provides an instruction interaction method based on a neural network, which comprises the following steps:
the request daemon module checks whether an idle hardware layer exists;
under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
analyzing the structure data to obtain instruction data which can be identified by a driving layer;
calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data;
and calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and parameter data according to the instruction data so as to obtain an output calculation graph.
In one aspect, an embodiment of the present invention further provides an instruction interaction system based on a neural network, where the instruction interaction system includes:
the daemon module is used for checking whether an idle hardware layer exists, using the idle hardware layer under the condition that the idle hardware layer exists, and marking that the idle hardware layer is occupied;
the model analysis module is used for analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
the instruction generation module is used for analyzing the structural data to obtain instruction data which can be identified by the driving layer;
the instruction calling module is used for calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, enabling the driving layer to initialize according to the instruction data and the parameter data, calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to enable the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
On the other hand, the embodiment of the invention also provides an instruction interaction device based on the neural network, which comprises:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of instruction interaction as provided by any of the embodiments of the present invention.
In yet another aspect, an embodiment of the present invention further provides a computer readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing an instruction interaction method as provided in any one of the embodiments of the present invention.
The embodiment of the invention requests a daemon module to check whether an idle hardware layer exists; when an idle hardware layer exists, it parses a preset neural network model to obtain structure data and parameter data and preprocesses an input calculation graph; it parses the structure data to obtain instruction data identifiable by a driving layer; it calls a first instruction so that the driving layer receives the parameter data and the instruction data and initializes according to them; and it calls a second instruction so that the driving layer receives the input calculation graph, then a third instruction to notify the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data, obtaining an output calculation graph. The daemon module thus manages and schedules hardware-layer resources in a unified way, and the software layer controls the driving layer to complete the calculation by calling simple instructions. This solves the problem that, when multiple hardware layers need to be controlled, software layers cannot notify one another of hardware-layer occupancy or such notification is complex, and achieves the effect of improving the multi-hardware communication efficiency of the neural network.
Drawings
Fig. 1 is a schematic flow chart of an instruction interaction method based on a neural network according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of another instruction interaction method based on a neural network according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an instruction interaction system based on a neural network according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an instruction interaction device based on a neural network according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are for purposes of illustration and not of limitation. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts steps as a sequential process, many of the steps may be implemented in parallel, concurrently, or with other steps. Furthermore, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Furthermore, the terms "first," "second," and the like, may be used herein to describe various directions, acts, steps, or elements, etc., but these directions, acts, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, a first module may be referred to as a second module, and similarly, a second module may be referred to as a first module, without departing from the scope of the invention. Both the first module and the second module are modules, but they are not the same module. The terms "first," "second," and the like, are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more of the described features. In the description of the embodiments of the present invention, the meaning of "plurality" is at least two, for example, two, three, etc., unless explicitly defined otherwise.
Example 1
As shown in fig. 1, a first embodiment of the present invention provides a neural network-based instruction interaction method, where the instruction interaction method includes:
s110, the request daemon module checks whether an idle hardware layer exists.
In this embodiment, during neural network calculation the hardware layer may be called by multiple software layers, but use of a hardware layer must be exclusive. The daemon module therefore manages software-layer calls to the hardware layer. When a user operates the software layer to perform a neural network calculation, the software layer asks the daemon module whether an idle hardware layer exists. If there is no idle hardware layer, the software layer waits or abandons the operation; if there is one, the daemon module assigns the idle hardware layer to the requesting software layer, marks that hardware layer as in use, and reduces the number of idle hardware layers accordingly.
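The allocation discipline just described (exclusive use, occupancy marking, refusal when nothing is idle) can be sketched as follows. The class and method names are illustrative assumptions for this sketch, not part of the patent's implementation.

```python
import threading

class HardwareDaemon:
    """Illustrative daemon tracking idle hardware layers and handing them
    out exclusively to requesting software layers (hypothetical names)."""

    def __init__(self, num_layers):
        self._lock = threading.Lock()
        self._idle = set(range(num_layers))   # indices of idle hardware layers
        self._in_use = {}                     # layer index -> owning software layer id

    def request_layer(self, software_layer_id):
        """Return an idle hardware layer index, or None if none is idle."""
        with self._lock:
            if not self._idle:
                return None                   # caller may wait or abandon
            layer = self._idle.pop()
            self._in_use[layer] = software_layer_id  # mark as occupied
            return layer

    def release_layer(self, layer):
        """Mark a hardware layer idle again once calculation finishes."""
        with self._lock:
            self._in_use.pop(layer, None)
            self._idle.add(layer)
```

A software layer would call `request_layer` before starting a calculation and `release_layer` afterwards, so occupancy never has to be negotiated between software layers directly.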
In one embodiment, a user may control the end-to-end automatic compilation tool chain runtime of the software layer (RainBuilder Runtime, RbRuntime) to prepare a neural network calculation; RbRuntime automatically requests the daemon module to check whether an idle hardware layer exists before starting the calculation.
And S120, under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph.
In one embodiment, the neural network is a complex network system formed by a large number of simple processing units, also referred to as operators, that are widely interconnected; this is the neural network model. In actual operation, one or more layers of the neural network that perform a function are generally referred to as a computing node. A calculation graph is the form the neural network takes when actually running: it comprises a number of computing nodes and the connection relationships between them; one computing node in the calculation graph may correspond to one or more layers of the neural network, and the input calculation graph is the first-layer computing node when the neural network calculates. A computing node and an operator may be of the same or of different granularity, and this relationship differs between neural network models. For example, if the neural network model contains four operator types A1, A2, A3 and A4, the computing nodes of the neural network calculation graph may be a first computing node A1+A2 and a second computing node A3+A4, and the connection relationship between the computing nodes may be to evaluate the first computing node A1+A2 and the second computing node A3+A4 first, and then their sum A1+A2+A3+A4.
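The A1 to A4 grouping example above can be made concrete with a toy graph; the values and names here are purely illustrative.

```python
# Toy sketch of the example calculation graph: two computing nodes, each
# grouping two operators, combined after both are evaluated.
operators = {"A1": 1.0, "A2": 2.0, "A3": 3.0, "A4": 4.0}  # toy operator outputs

compute_nodes = [("A1", "A2"), ("A3", "A4")]  # each node fuses several operators

def run_node(node):
    # A computing node may correspond to one or more neural-network layers;
    # here it simply combines the outputs of its grouped operators.
    return sum(operators[op] for op in node)

# First evaluate each computing node, then combine them according to the
# connection relationship (A1+A2 and A3+A4 first, then their sum).
node_outputs = [run_node(n) for n in compute_nodes]
total = sum(node_outputs)  # corresponds to A1+A2+A3+A4 in the example
```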
In this embodiment, the preset neural network model may be written in advance by the user through the software layer, and parsing it yields structure data and parameter data, where the structure data is the user-defined neural network structure written by the user, and the parameter data is the weight data corresponding to that structure, provided by the user. Preprocessing the input calculation graph allocates, for the software layer, the calculation space in the hardware layer that the input calculation graph requires, and requests the construction of a data transmission pipeline between the driving layer and the hardware layer; the software layer then controls the hardware layer's calculation by controlling the data transmitted through this pipeline.
In one embodiment, if the daemon module detects that an idle hardware layer exists, it assigns the idle hardware layer to the RbRuntime of the software layer, marks that hardware layer as in use, and reduces the number of idle hardware layers accordingly. The user can then parse the pb file of the preset neural network model through the end-to-end automatic compilation tool chain compiler of the software layer (RainBuilder Compiler, rbcomp) to generate the pbtxt file of the structure data and the coeff file of the parameter data. After adding preprocessing code, the user calls the RbRuntime of the software layer according to the usage flow and passes the preprocessed input calculation graph, the pbtxt file and the coeff file to RbRuntime.
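The split of a model into structure data (the pbtxt file) and parameter data (the coeff file) can be mirrored with a toy parser. The dictionary layout and function name are assumptions for illustration only, not the actual rbcomp file formats.

```python
def parse_model(model):
    """Split a toy model description into structure data and parameter data,
    mirroring the pbtxt/coeff split described above (hypothetical layout)."""
    # Structure data: the user-defined network structure (layer types only).
    structure = {"layers": [layer["type"] for layer in model["layers"]]}
    # Parameter data: the weight data corresponding to that structure.
    parameters = {f"layer{i}": layer["weights"]
                  for i, layer in enumerate(model["layers"])}
    return structure, parameters

# A toy preset model standing in for the pb file.
model = {"layers": [{"type": "conv", "weights": [0.1, 0.2]},
                    {"type": "relu", "weights": []}]}
structure_data, parameter_data = parse_model(model)
```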
S130, analyzing the structural data to obtain instruction data which can be identified by a driving layer.
In this embodiment, after the RbRuntime of the software layer obtains the preprocessed input calculation graph, the pbtxt file and the coeff file, it parses the pbtxt file and converts it into instruction data identifiable by the Rbdriver of the driving layer (the end-to-end automatic compilation tool chain driver, RainBuilder Driver). The instruction data contains the instructions that Rbdriver calls to control the hardware layer's calculation, arranged in the hardware layer's actual calculation order, so after obtaining the instruction data the driving layer does not need to communicate repeatedly with the software layer, but fetches the instructions in sequence to drive the hardware layer's calculation.
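The key point of S130 is that the instruction data is emitted in the hardware layer's actual execution order, so the driver layer can consume it sequentially without further round trips to the software layer. A minimal sketch, with invented opcode names:

```python
def generate_instructions(structure_data):
    """Flatten structure data into an ordered instruction list that a driving
    layer could consume sequentially (opcode names are hypothetical)."""
    opcode_for = {"conv": "RUN_CONV", "relu": "RUN_RELU", "pool": "RUN_POOL"}
    instructions = []
    for layer in structure_data["layers"]:
        # One instruction per layer, in the hardware's actual execution order.
        instructions.append(opcode_for[layer])
    instructions.append("WRITE_BACK")  # final transfer of the output graph
    return instructions
```

Because the list is already ordered, a driver consuming it needs only a simple loop rather than a per-step negotiation with the software layer.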
S140, calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data.
S150, calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
In this embodiment, after the RbRuntime of the software layer obtains the instruction data identifiable by the Rbdriver of the driving layer, it calls an add_parameter instruction and sends the instruction data and the coeff file together to the Rbdriver of the driving layer; driven by the add_parameter instruction, the Rbdriver initializes according to the instruction data and the coeff file to prepare for calculation. Once the Rbdriver of the driving layer is ready, the RbRuntime of the software layer calls an input_engine instruction and sends the preprocessed input calculation graph to the Rbdriver of the driving layer, which receives it as directed by the input_engine instruction. The RbRuntime then calls a run_engine instruction to notify the Rbdriver that the input calculation graph can be calculated; driven by the run_engine instruction, the Rbdriver sends the preprocessed input calculation graph to the hardware layer and drives the hardware layer to calculate the input calculation graph according to the instructions in the instruction data in sequence, obtaining an output calculation graph.
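The three-instruction handshake of S140 and S150 can be summarized as a driver stub. The method names follow the add_parameter, input_engine and run_engine instructions mentioned above, but the bodies are toy stand-ins; the real work happens in the hardware layer.

```python
class DriverStub:
    """Toy stand-in for the driving layer (Rbdriver), tracking the
    three-step handshake described in S140/S150."""

    def __init__(self):
        self.state = "created"
        self.instructions = None
        self.parameters = None
        self.graph = None

    def add_parameter(self, instructions, parameters):
        # First instruction: receive instruction data and parameter data,
        # then initialize in preparation for calculation.
        self.instructions, self.parameters = instructions, parameters
        self.state = "initialized"

    def input_engine(self, input_graph):
        # Second instruction: receive the preprocessed input calculation graph.
        assert self.state == "initialized"
        self.graph = input_graph
        self.state = "loaded"

    def run_engine(self):
        # Third instruction: drive the hardware layer through the instruction
        # list in order and return the output calculation graph (toy math).
        assert self.state == "loaded"
        out = [x * w for x, w in zip(self.graph, self.parameters)]
        self.state = "done"
        return out
```

The software layer only ever issues these three simple calls in order, which is the "simple instruction" control that the summary credits with reducing communication overhead.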
The embodiment of the invention requests a daemon module to check whether an idle hardware layer exists; when an idle hardware layer exists, it parses a preset neural network model to obtain structure data and parameter data and preprocesses an input calculation graph; it parses the structure data to obtain instruction data identifiable by a driving layer; it calls a first instruction so that the driving layer receives the parameter data and the instruction data and initializes according to them; and it calls a second instruction so that the driving layer receives the input calculation graph, then a third instruction to notify the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data, obtaining an output calculation graph. The daemon module thus manages and schedules hardware-layer resources in a unified way, and the software layer controls the driving layer to complete the calculation by calling simple instructions. This solves the problem that, when multiple hardware layers need to be controlled, software layers cannot notify one another of hardware-layer occupancy or such notification is complex, and achieves the effect of improving the multi-hardware communication efficiency of the neural network.
Example two
As shown in fig. 2, a second embodiment of the present invention provides another instruction interaction method based on a neural network, where the second embodiment of the present invention is further optimized based on the first embodiment of the present invention, and the instruction interaction method includes:
s210, receiving writing and training instructions of a user to generate the preset neural network model.
In this embodiment, the user may write and train a neural network according to his or her own needs through software-layer tools such as TensorFlow, so as to obtain the preset neural network model, that is, to generate the pb file.
S220, the request daemon module checks whether an idle hardware layer exists.
And S230, under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph.
S240, analyzing the structural data to obtain instruction data which can be identified by the driving layer.
S250, calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data.
S260, calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
The implementation of S220-S260 in the embodiment of the present invention is the same as in the first embodiment of the present invention, and is not repeated here.
S270, calling a fourth instruction to read the output calculation graph, and judging whether the output calculation graph has remaining nodes to be calculated.
S280, if it is determined that the output calculation graph has no remaining nodes to be calculated, converting the output calculation graph into a calculation result and returning the calculation result to the user.
And S290, if it is determined that the output calculation graph has remaining nodes to be calculated, calculating the remaining nodes according to the structural data to obtain the remaining calculated nodes, converting the output calculation graph and the remaining calculated nodes into a calculation result, and returning the calculation result to the user.
In this embodiment, after the hardware layer calculates the input calculation graph to obtain the output calculation graph, the RbRuntime of the software layer calls the load_engine instruction to read the output calculation graph: the hardware layer outputs the output calculation graph to the Rbdriver of the driving layer, and the Rbdriver, as directed by the load_engine instruction of the RbRuntime, writes it into memory pre-allocated by the RbRuntime. The RbRuntime of the software layer reads the output calculation graph from this memory and determines whether it contains remaining nodes to be calculated; these may be nodes that the hardware layer cannot calculate, or nodes that the preset neural network model specifies are not to be calculated by the hardware layer. If there are no remaining nodes to be calculated, the RbRuntime directly converts the output calculation graph into a calculation result and returns it to the user, completing one neural network calculation. If there are remaining nodes to be calculated, the RbRuntime calculates them according to the structure data of the preset neural network model, i.e. the pbtxt file, to obtain the remaining calculated nodes; after the calculation is complete, it converts the output calculation graph and the remaining calculated nodes together into a calculation result and returns it to the user, completing one neural network calculation.
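The post-processing split of S270-S290 (the hardware layer produces the output graph; the software layer finishes any remaining nodes) can be sketched as follows. The function name, graph representation, and the "square each value" stand-in for the software-side node calculation are all illustrative assumptions.

```python
def finish_computation(output_graph, remaining_nodes):
    """If the hardware-produced output graph left nodes uncomputed, evaluate
    them in the software layer and merge; otherwise return the graph as the
    calculation result (toy representation: dicts of node values)."""
    if not remaining_nodes:
        return output_graph                      # S280: nothing left to do
    # S290: compute remaining nodes in the software layer (toy: square each
    # input value, standing in for whatever the structure data prescribes).
    extra = {name: value ** 2 for name, value in remaining_nodes.items()}
    return {**output_graph, **extra}
```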
According to the embodiment of the invention, the preset neural network model is generated by receiving the user's writing and training instructions; a fourth instruction is called to read the output calculation graph, and it is judged whether the output calculation graph has remaining nodes to be calculated. If not, the output calculation graph is converted into a calculation result and returned to the user; if so, the remaining nodes are calculated according to the structural data to obtain the remaining calculated nodes, and the output calculation graph and the remaining calculated nodes are converted into a calculation result and returned to the user. Nodes not calculated by the hardware layer are thus output directly and calculated by the software layer, which solves the inconvenience of calculating remaining nodes during neural network calculation and achieves efficient calculation of the neural network.
Example III
As shown in fig. 3, a third embodiment of the present invention provides an instruction interaction system 300 based on a neural network, where the instruction interaction system 300 provided in the third embodiment of the present invention can execute the instruction interaction method provided in any embodiment of the present invention, and has functional modules and effects corresponding to the execution method. The instruction interaction system 300 includes a daemon module 310, a model parsing module 320, an instruction generation module 330, and an instruction invoking module 340.
The daemon module 310 is configured to check whether an idle hardware layer exists, use the idle hardware layer if the idle hardware layer exists, and mark that the idle hardware layer is occupied; the model analysis module 320 is used for analyzing a preset neural network model to obtain structural data and parameter data and preprocessing an input calculation map; the instruction generating module 330 is configured to parse the structural data to obtain instruction data identifiable by a driving layer; the instruction calling module 340 is configured to call a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enable the driving layer to initialize according to the instruction data and the parameter data; for invoking a second instruction to cause the driver layer to receive the input computational graph; and the third instruction is also used for calling the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
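The four-module decomposition of the instruction interaction system 300 can be mirrored as a small composition sketch; each module is reduced to a callable placeholder, and all names and behaviors here are illustrative assumptions.

```python
class InstructionInteractionSystem:
    """Illustrative composition of the four modules (310-340), with each
    module reduced to an injected callable for clarity."""

    def __init__(self, daemon, parser, generator, caller):
        self.daemon = daemon          # 310: checks/occupies idle hardware layers
        self.parser = parser          # 320: model -> (structure, parameters, graph)
        self.generator = generator    # 330: structure data -> instruction data
        self.caller = caller          # 340: issues the first/second/third instructions

    def run(self, model, raw_input):
        layer = self.daemon()
        if layer is None:
            return None               # no idle hardware layer available
        structure, params, graph = self.parser(model, raw_input)
        instructions = self.generator(structure)
        return self.caller(layer, instructions, params, graph)

# Toy wiring: one hardware layer, one weight, a single "RUN" instruction.
system = InstructionInteractionSystem(
    daemon=lambda: 0,
    parser=lambda m, x: (m, [2.0], x),
    generator=lambda s: ["RUN"],
    caller=lambda layer, ins, p, g: [v * p[0] for v in g],
)
result = system.run({"layers": ["conv"]}, [1.0, 2.0])
```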
In this embodiment, during neural network calculation the hardware layer may be called by multiple software layers, but use of a hardware layer must be exclusive. The daemon module therefore manages software-layer calls to the hardware layer. When a user operates the software layer to perform a neural network calculation, the software layer asks the daemon module whether an idle hardware layer exists. If there is no idle hardware layer, the software layer waits or abandons the operation; if there is one, the daemon module assigns the idle hardware layer to the requesting software layer, marks that hardware layer as in use, and reduces the number of idle hardware layers accordingly.
In one embodiment, the user may control the RbRuntime of the software layer to prepare a neural network calculation, and RbRuntime automatically requests the daemon module to check whether an idle hardware layer exists before starting the calculation.
In one embodiment, the neural network is a complex network system formed by a large number of simple processing units, also referred to as operators, that are widely interconnected; this is the neural network model. In actual operation, one or more layers of the neural network that perform a function are generally referred to as a computing node. A calculation graph is the form the neural network takes when actually running: it comprises a number of computing nodes and the connection relationships between them; one computing node in the calculation graph may correspond to one or more layers of the neural network, and the input calculation graph is the first-layer computing node when the neural network calculates. A computing node and an operator may be of the same or of different granularity, and this relationship differs between neural network models. For example, if the neural network model contains four operator types A1, A2, A3 and A4, the computing nodes of the neural network calculation graph may be a first computing node A1+A2 and a second computing node A3+A4, and the connection relationship between the computing nodes may be to evaluate the first computing node A1+A2 and the second computing node A3+A4 first, and then their sum A1+A2+A3+A4.
In this embodiment, the preset neural network model may be written in advance by the user through the software layer, and structural data and parameter data are obtained by parsing the model: the structural data is the user-defined neural network structure, and the parameter data is the weight data corresponding to the structural data provided by the user. Preprocessing the input calculation graph allocates, for the software layer, the calculation space in the hardware layer required by the input calculation graph and requests construction of a data transmission pipeline between the driver layer and the hardware layer; the software layer then controls the hardware layer's calculation by controlling the data transmission in this pipeline.
In an embodiment, if the daemon module detects that an idle hardware layer exists, it assigns the idle hardware layer to the RbRuntime of the software layer, marks the hardware layer as in use, and reduces the number of idle hardware layers accordingly. The user can then parse the pb file of the preset neural network model through the rbcomp of the software layer to generate the pbtxt file containing the structural data and the coeff file containing the parameter data; after adding the preprocessing code, the user calls the RbRuntime of the software layer according to the usage flow and passes the preprocessed input calculation graph, the pbtxt file and the coeff file to the RbRuntime.
In this embodiment, after the RbRuntime of the software layer obtains the preprocessed input calculation graph, the pbtxt file and the coeff file, it parses the pbtxt file and converts it into instruction data identifiable by the Rbdriver of the driver layer. The instruction data contains all the instructions the Rbdriver calls to control the hardware layer's calculation, arranged in the hardware layer's actual calculation order, so that after obtaining the instruction data the driver layer does not need to communicate repeatedly with the software layer, but fetches the instructions in sequence to drive the hardware layer to calculate.
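The conversion described above — walking the parsed structural data once and emitting a flat, pre-ordered instruction list — can be sketched as follows. The record format and field names here are assumptions for illustration, not the patent's actual instruction encoding.

```python
def compile_instructions(structure_nodes):
    """Illustrative sketch: turn parsed structural data (a list of node
    descriptions already in actual calculation order) into a flat,
    sequentially numbered instruction list for the driver layer."""
    instructions = []
    for idx, node in enumerate(structure_nodes):
        instructions.append({
            "seq": idx,                      # position in the hardware run order
            "opcode": node["op"],            # operation the hardware should perform
            "args": node.get("params", {}),  # per-node configuration
        })
    return instructions
```

Because the list is ordered up front, a driver consuming it needs no further round-trips to the software layer mid-computation, which is the efficiency point the text makes.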
In this embodiment, after the RbRuntime of the software layer obtains the instruction data identifiable by the Rbdriver of the driver layer, it calls an add_parameter instruction and sends the instruction data and the coeff file together to the Rbdriver; driven by this add_parameter instruction, the Rbdriver initializes itself according to the instruction data and the coeff file to prepare for calculation. Once the Rbdriver is ready, the RbRuntime calls an input_engine instruction and sends the preprocessed input calculation graph to the Rbdriver; driven by the input_engine instruction, the Rbdriver receives the preprocessed input calculation graph. The RbRuntime then calls a run_engine instruction to notify the Rbdriver that the input calculation graph can be calculated; driven by the run_engine instruction, the Rbdriver sends the preprocessed input calculation graph to the hardware layer and sequentially drives the hardware layer to calculate the input calculation graph according to the instructions in the instruction data, obtaining an output calculation graph.
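The three-step call sequence above can be condensed into a small sketch. The instruction names (add_parameter, input_engine, run_engine) come from the text; the driver object, method signatures and return value are assumptions for illustration.

```python
class Driver:
    """Illustrative stand-in for the driver layer's Rbdriver."""

    def __init__(self):
        self.log = []  # records the order in which instructions arrive

    def add_parameter(self, instruction_data, coeff):
        # Initialize with the instruction list and the weight (coeff) data.
        self.log.append("add_parameter")

    def input_engine(self, input_graph):
        # Receive the preprocessed input calculation graph.
        self.log.append("input_engine")

    def run_engine(self):
        # Drive the hardware layer through the instruction list in order.
        self.log.append("run_engine")
        return "output_graph"

def run_inference(driver, instruction_data, coeff, input_graph):
    """The runtime-side call sequence described in the text."""
    driver.add_parameter(instruction_data, coeff)  # initialize driver
    driver.input_engine(input_graph)               # hand over the input graph
    return driver.run_engine()                     # trigger hardware calculation
```

The key property is the fixed ordering: initialization, then data transfer, then the run trigger, with no further software-layer involvement during the hardware computation.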
In one embodiment, the instruction interaction system further includes a model generation module 350 and a calculation output module 360,
the model generating module 350 is configured to receive writing and training instructions of a user to generate the preset neural network model. The calculation output module 360 is configured to invoke a fourth instruction to read the output calculation graph, determine whether there are remaining nodes to be calculated in the output calculation graph, and convert the output calculation graph into a calculation result and return the calculation result to the user in response to a determination that there are no remaining nodes to be calculated in the output calculation graph. The calculation output module is further configured to calculate the remaining nodes to be calculated according to the structural data in response to a determination result that the remaining nodes to be calculated exist in the output calculation graph, obtain remaining calculation nodes, convert the output calculation graph and the remaining calculation nodes into calculation results, and return the calculation results to a user.
In this embodiment, the user may write and train a neural network through the TensorFlow of the software layer according to their own requirements to obtain the preset neural network model, that is, to generate the pb file. After the hardware layer calculates the input calculation graph to obtain the output calculation graph, the RbRuntime of the software layer calls a load_engine instruction to read the output calculation graph; driven by this load_engine instruction, the Rbdriver of the driver layer has the hardware layer output the resulting output calculation graph to the Rbdriver, which then writes the output calculation graph into a memory pre-allocated by the RbRuntime. The RbRuntime reads the output calculation graph from that memory and determines whether any nodes remain to be calculated; the remaining nodes may be nodes that the hardware layer cannot calculate, or nodes that the preset neural network model specifies are not to be calculated by the hardware layer. If no nodes remain to be calculated, the RbRuntime directly converts the output calculation graph into a calculation result and returns it to the user, completing one calculation of the neural network.
If nodes remain to be calculated, the RbRuntime of the software layer calculates them according to the structural data of the preset neural network model, namely the pbtxt file, to obtain the remaining calculated nodes; after this calculation is complete, the RbRuntime converts the output calculation graph together with the remaining calculated nodes into a calculation result and returns it to the user, completing one calculation of the neural network.
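The post-processing step above — finishing in software whatever the hardware left uncomputed — can be sketched as follows. The graph representation and the callback name are illustrative assumptions, not the patent's actual data layout.

```python
def finalize(output_graph, structure_data, cpu_compute):
    """Illustrative sketch: output_graph is assumed to look like
    {"done": {node: value}, "pending": [node, ...]}, where "pending"
    holds nodes the hardware layer skipped or could not compute."""
    results = dict(output_graph["done"])
    for node in output_graph["pending"]:
        # Software fallback: compute the remaining node from the
        # structural data (the pbtxt contents, in the text's terms).
        results[node] = cpu_compute(node, structure_data)
    return results
```

When `pending` is empty this reduces to the first branch in the text: the output graph is returned directly as the result.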
In the embodiment of the invention, the daemon module 310 checks whether an idle hardware layer exists and, if one exists, uses the idle hardware layer and marks it as occupied; the model analysis module 320 parses a preset neural network model to obtain structural data and parameter data, and preprocesses an input calculation graph; the instruction generation module 330 parses the structural data to obtain instruction data identifiable by the driver layer; and the instruction calling module 340 calls a first instruction so that the driver layer receives the parameter data and the instruction data and initializes according to them, calls a second instruction so that the driver layer receives the input calculation graph, and calls a third instruction to notify the driver layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data, obtaining an output calculation graph. Because the daemon module uniformly manages and schedules hardware-layer resources, and the software layer controls the driver layer to complete the calculation through simple instruction calls, this solves the problem that software layers cannot notify each other of hardware-layer occupation, or that such notification becomes complex when multiple hardware layers must be controlled, thereby improving the communication efficiency of multi-hardware neural network calculation.
Example IV
Fig. 4 is a schematic structural diagram of a command interaction device based on a neural network according to a fourth embodiment of the present invention. Fig. 4 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 4 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in FIG. 4, the computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28 (i.e., the memory in FIG. 4), and a bus 18 that connects the various system components, including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, and a local bus using any of a variety of bus architectures. By way of example, these architectures include industry standard architecture (Industry Standard Architecture, ISA) bus, micro channel architecture (MicroChannel Architecture, MCA) bus, enhanced ISA bus, video electronics standards association (Video Electronics Standard Association, VESA) local bus, and peripheral component interconnect (Peripheral Component Interconnect, PCI) bus.
Computer device 12 includes a variety of computer system readable media. Such media can be available media that can be accessed by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (Random Access Memory, RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in FIG. 4, a disk drive for reading from and writing to a removable nonvolatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable nonvolatile optical disk (e.g., a compact disc read-only memory (Compact Disc Read Only Memory, CD-ROM), a digital versatile disc read-only memory (Digital Versatile Disc Read Only Memory, DVD-ROM), or other optical media) may be provided. In these cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The system memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the various embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, system memory 28, such program modules 42 including an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may occur via an Input/Output (I/O) interface 22. Moreover, the computer device 12 may also communicate with one or more networks, such as a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN) and/or a public network such as the Internet, via the network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. Although not shown, other hardware and/or software modules may be used in connection with computer device 12, including microcode, device drivers, redundant processing units, external disk drive arrays, redundant array of independent disks (Redundant Array of Independent Disks, RAID) systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the instruction interaction method provided by the embodiment of the present invention:
the request daemon module checks whether an idle hardware layer exists;
under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
analyzing the structure data to obtain instruction data which can be identified by a driving layer;
calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data;
and calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and parameter data according to the instruction data so as to obtain an output calculation graph.
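The five method steps above can be condensed into one illustrative control flow. Every callable passed in here is a stub standing in for a real component (daemon, model parser, preprocessor, driver); the names and signatures are assumptions for illustration only.

```python
def instruction_interaction(daemon, driver, parse_model, preprocess, model, graph):
    """Illustrative sketch of the claimed method's control flow."""
    layer = daemon()                        # step 1: request an idle hardware layer
    if layer is None:
        raise RuntimeError("no idle hardware layer available")
    structure, params = parse_model(model)  # step 2: model -> structure + parameters
    graph = preprocess(graph)               #         preprocess the input graph
    instr = [("op", n) for n in structure]  # step 3: structure -> instruction data
    driver("add_parameter", instr, params)  # step 4: driver receives data, initializes
    driver("input_engine", graph)           # step 5: driver receives the input graph...
    return driver("run_engine")             # ...and drives hardware for the output graph
```

The daemon check comes first so that no parsing or preprocessing work is wasted when all hardware layers are busy.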
Example five
The fifth embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored, where the program when executed by a processor implements the methods provided by all the embodiments of the present invention:
the request daemon module checks whether an idle hardware layer exists;
under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
analyzing the structure data to obtain instruction data which can be identified by a driving layer;
calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data;
and calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and parameter data according to the instruction data so as to obtain an output calculation graph.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination thereof. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In this document, a computer readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be a computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using a suitable medium, including wireless, wireline, optical fiber cable, radio Frequency (RF), etc., or a suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the remote-computer scenario, the remote computer may be connected to the user's computer through any kind of network, including a LAN or a WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the above embodiments, but may include many other equivalent embodiments without departing from the spirit of the invention, the scope of which is determined by the scope of the appended claims.

Claims (10)

1. A neural network-based instruction interaction method, comprising:
the request daemon module checks whether an idle hardware layer exists;
under the condition that the idle hardware layer exists, analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
analyzing the structure data to obtain instruction data which can be identified by a driving layer;
calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, and enabling the driving layer to initialize according to the instruction data and the parameter data;
and calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to inform the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
2. The instruction interaction method according to claim 1, further comprising, before the request daemon module checks whether there is an idle hardware layer:
and receiving writing and training instructions of a user to generate the preset neural network model.
3. The instruction interaction method of claim 1, further comprising, after said invoking a third instruction to inform said driver layer to drive said idle hardware layer to calculate said input computational graph and said parameter data according to said instruction data to obtain an output computational graph:
calling a fourth instruction to read the output computing graph, and judging whether the output computing graph has residual nodes to be computed or not;
and converting the output computational graph into a computational result and returning the computational result to a user in response to the judgment result that the output computational graph does not have the remaining nodes to be calculated.
4. The instruction interaction method according to claim 3, further comprising, after said determining whether there are remaining nodes to be calculated in the output computation graph:
and responding to the judging result of the remaining nodes to be calculated in the output calculation graph, calculating the remaining nodes to be calculated according to the structural data to obtain remaining calculation nodes, converting the output calculation graph and the remaining calculation nodes into calculation results, and returning the calculation results to a user.
5. A neural network-based instruction interaction system, comprising:
the daemon module is used for checking whether an idle hardware layer exists, using the idle hardware layer under the condition that the idle hardware layer exists, and marking that the idle hardware layer is occupied;
the model analysis module is used for analyzing a preset neural network model to obtain structural data and parameter data, and preprocessing an input calculation graph;
the instruction generation module is used for analyzing the structural data to obtain instruction data which can be identified by the driving layer;
the instruction calling module is used for calling a first instruction to enable the driving layer to receive the parameter data and the instruction data, enabling the driving layer to initialize according to the instruction data and the parameter data, calling a second instruction to enable the driving layer to receive the input calculation graph, and calling a third instruction to enable the driving layer to drive the idle hardware layer to calculate the input calculation graph and the parameter data according to the instruction data so as to obtain an output calculation graph.
6. The instruction interaction system of claim 5, further comprising:
and the model generation module is used for receiving writing and training instructions of a user to generate the preset neural network model.
7. The instruction interaction system of claim 5, further comprising:
the computing output module is used for calling a fourth instruction to read the output computing graph and judging whether the output computing graph has remaining nodes to be computed or not, and is also used for converting the output computing graph into a computing result and returning the computing result to a user in response to the judging result that the output computing graph does not have the remaining nodes to be computed.
8. The instruction interaction system of claim 7, wherein the computing output module is further configured to, in response to a determination that the output computing graph includes the remaining nodes to be computed, compute the remaining nodes to be computed according to the structure data, obtain remaining computing nodes, convert the output computing graph and the remaining computing nodes into computing results, and return the computing results to a user.
9. A neural network-based instruction interaction device, comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs being executed by the one or more processors to cause the one or more processors to implement the instruction interaction method of any of claims 1-4.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the instruction interaction method of any of claims 1-4.
CN202010099596.0A 2020-02-18 2020-02-18 Instruction interaction method, system, equipment and storage medium based on neural network Active CN111338816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010099596.0A CN111338816B (en) 2020-02-18 2020-02-18 Instruction interaction method, system, equipment and storage medium based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010099596.0A CN111338816B (en) 2020-02-18 2020-02-18 Instruction interaction method, system, equipment and storage medium based on neural network

Publications (2)

Publication Number Publication Date
CN111338816A CN111338816A (en) 2020-06-26
CN111338816B true CN111338816B (en) 2023-05-12

Family

ID=71181718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010099596.0A Active CN111338816B (en) 2020-02-18 2020-02-18 Instruction interaction method, system, equipment and storage medium based on neural network

Country Status (1)

Country Link
CN (1) CN111338816B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10019668B1 (en) * 2017-05-19 2018-07-10 Google Llc Scheduling neural network processing
CN109165720A (en) * 2018-09-05 2019-01-08 深圳灵图慧视科技有限公司 Neural network model compression method, device and computer equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3502975A1 (en) * 2017-12-20 2019-06-26 Fujitsu Limited Methods and apparatus for model parallelism in artificial neural networks
CN109063829B (en) * 2018-06-22 2021-03-16 泰康保险集团股份有限公司 Neural network construction method and device, computer equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10019668B1 (en) * 2017-05-19 2018-07-10 Google Llc Scheduling neural network processing
CN109165720A (en) * 2018-09-05 2019-01-08 深圳灵图慧视科技有限公司 Neural network model compression method, device and computer equipment

Also Published As

Publication number Publication date
CN111338816A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
US11315034B2 (en) Intelligent big data system, and method and apparatus for providing intelligent big data service
EP3893112A2 (en) Method and apparatus for scheduling deep learning reasoning engines, device, and medium
CN111190741B (en) Scheduling method, equipment and storage medium based on deep learning node calculation
CN110191021B (en) Protocol testing method and device, electronic equipment and storage medium
CN111428933B (en) Logistics address recommendation method, system, equipment and storage medium
CN108833510B (en) Message processing method and device
CN111930489B (en) Task scheduling method, device, equipment and storage medium
CN111145076A (en) Data parallelization processing method, system, equipment and storage medium
CN110955640B (en) Cross-system data file processing method, device, server and storage medium
CN110826706B (en) Data processing method and device for neural network
CN112346794A (en) Interface calling method, device, equipment and medium
CN116467061B (en) Task execution method and device, storage medium and electronic equipment
CN111966653A (en) Data processing method, device, server and storage medium for micro-service call link
CN111338816B (en) Instruction interaction method, system, equipment and storage medium based on neural network
CN111309382B (en) Instruction pushing method, system, equipment and storage medium based on neural network
CN111124409B (en) Sketch-based service page generation method, device, equipment and storage medium
CN116909748A (en) Computing power resource allocation method and device, electronic equipment and storage medium
KR20140059353A (en) System and method for extracting and converting cad data of ship
US7464377B2 (en) Application parallel processing system and application parallel processing method
CN112230911B (en) Model deployment method, device, computer equipment and storage medium
CN114791885A (en) Interface test method, device, equipment and medium
US11144356B2 (en) Dynamic determination of memory requirements for function as a service multi-invocation flows
CN111949259A (en) Risk decision configuration method, system, electronic equipment and storage medium
Dada et al. Design and architecture of web services for simulation of biochemical systems
CN114546530B (en) Big data loading method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant