CN112947933A - Operator execution method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN112947933A
Authority
CN
China
Prior art keywords
operator
graph
target
operators
basic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110209717.7A
Other languages
Chinese (zh)
Inventor
李懋林
李周洋
张行程
李秀红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202110209717.7A priority Critical patent/CN112947933A/en
Publication of CN112947933A publication Critical patent/CN112947933A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/40 Transformation of program code
    • G06F 8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The present disclosure provides an operator execution method, an operator execution device, a computer device, and a storage medium, wherein the method includes: acquiring a forward computation graph of a target operator, where the forward computation graph comprises a plurality of base operators; determining a first sub-graph from the forward computation graph, and compiling the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph, where the first sub-graph comprises at least part of the base operators in the forward computation graph; generating a first target computation graph of the target operator based on the first compiling operator; and executing the forward process of the target operator based on the first target computation graph. Because the first target computation graph contains fewer operators than the forward computation graph, executing the target operator based on it requires fewer hardware resource allocations and fewer computing resource scheduling operations, which improves the execution efficiency of user-defined operators.

Description

Operator execution method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of machine learning technologies, and in particular, to an operator execution method, an operator execution device, a computer device, and a storage medium.
Background
A machine learning framework is software for training neural network models: through the framework, a developer can write code that implements a neural network model and run that code to carry out the model's training process. Current machine learning frameworks, however, execute user-defined operators with low efficiency.
Disclosure of Invention
The embodiment of the disclosure at least provides an operator execution method, an operator execution device, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides an operator executing method, including:
acquiring a forward computation graph of a target operator, where the forward computation graph comprises a plurality of base operators; determining a first sub-graph from the forward computation graph, and compiling the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph, where the first sub-graph comprises at least part of the base operators in the forward computation graph; generating a first target computation graph of the target operator based on the first compiling operator; and executing the forward process of the target operator based on the first target computation graph.
In this way, the first target computation graph includes fewer operators than the forward computation graph, so that when the target operator is executed based on the first target computation graph, hardware resources are allocated and computing resources are scheduled fewer times, which improves the execution efficiency of the user-defined operator.
In one possible implementation, the obtaining a forward computation graph of a target operator includes: recording first calling information for calling a predefined basic operator in a forward process of executing the target operator; and generating a forward calculation graph of the target operator based on the first calling information.
In this way, by recording the first calling information of the base operators that already exist in the machine learning framework during the forward process of the target operator, the forward computation graph of that forward process can be obtained directly from the first calling information, which is convenient and fast.
In one possible embodiment, the first call information includes: the type of the basic operator called when the forward process of the target operator is executed, the calling sequence of the basic operator and the dependency relationship among different basic operators.
In one possible embodiment, the base operator comprises at least one of: a mathematical operation base operator, a comparison operation base operator, a logical operation base operator, a reduction operation base operator, and a threshold operation base operator.
In one possible embodiment, the determining a first sub-graph from the forward computation graph includes: determining the first subgraph from the forward computation graph based on connection relation information between the base operators constituting the forward computation graph and the type of the base operators.
In one possible embodiment, the determining the first sub-graph from the forward computation graph based on information of connection relationships between the basic operators constituting the forward computation graph and types of the basic operators includes: determining an alternative base operator from the forward computational graph based on the type of the base operator; determining a target basic operator from the alternative basic operators based on the connection relation information among the alternative basic operators; and obtaining the first subgraph based on the target basic operator.
In a possible implementation, the generating a first target computation graph of the target operator based on the first compiling operator includes: and generating a first target calculation graph of the target operator based on the first compiling operator and other basic operators in the forward calculation graph except the first subgraph.
Thus, because all of the base operators included in the first sub-graph are compiled into one first compiling operator, the number of operators is reduced, and the resulting first target computation graph contains fewer operators than the forward computation graph.
In one possible embodiment, the method further comprises: obtaining a reverse computation graph of the reverse process of the target operator; determining a second sub-graph from the reverse computation graph, and compiling the second sub-graph to obtain a second compiling operator corresponding to the second sub-graph; and determining, based on the second compiling operator, a second target computation graph of the reverse process of the target operator.
In this way, the second sub-graph, which contains part of the reverse base operators in the reverse computation graph, is compiled into a second compiling operator, reducing the number of operators: the second target computation graph generated from the second compiling operator contains fewer operators than the reverse computation graph. This effectively reduces the number of operators called during the reverse process of the target operator and improves the execution efficiency of that reverse process.
In a possible implementation, the obtaining a reverse computation graph of the target operator reverse process includes: recording second calling information for calling a predefined reverse basic operator in the reverse process of executing the target operator; and generating a reverse calculation graph of the target operator based on the second calling information.
In this way, by recording the second calling information of the reverse base operators that already exist in the machine learning framework during the reverse process of the target operator, the reverse computation graph of that reverse process can be obtained directly from the second calling information, which is convenient and fast.
In one possible embodiment, the method further comprises: based on the second target computation graph, performing a reverse process of the target operator.
In a second aspect, an embodiment of the present disclosure further provides an apparatus for executing an operator, including:
the acquisition module is used for acquiring a forward calculation graph of the target operator; wherein the forward computational graph comprises a plurality of base operators; the compiling module is used for determining a first sub-graph from the forward calculation graph and compiling the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph; the first sub-graph comprises at least part of basic operators in the forward calculation graph; a generating module, configured to generate a first target computation graph of the target operator based on the first compiling operator; and the execution module is used for executing the forward process of the target operator based on the first target calculation graph.
In a possible implementation manner, when obtaining a forward computation graph of a target operator, the obtaining module is specifically configured to record first calling information for calling a predefined basic operator in a forward process of executing the target operator; and generating a forward calculation graph of the target operator based on the first calling information.
In one possible embodiment, the first call information includes: the type of the basic operator called when the forward process of the target operator is executed, the calling sequence of the basic operator and the dependency relationship among different basic operators.
In one possible embodiment, the base operator comprises at least one of: a mathematical operation base operator, a comparison operation base operator, a logical operation base operator, a reduction operation base operator, and a threshold operation base operator.
In a possible implementation, when determining the first sub-graph from the forward computation graph, the compiling module is specifically configured to determine the first sub-graph from the forward computation graph based on information of connection relationships between the base operators constituting the forward computation graph and a type of the base operator.
In a possible implementation, when determining the first sub-graph from the forward computation graph based on information of connection relationships between the base operators constituting the forward computation graph and the types of the base operators, the compiling module is specifically configured to determine an alternative base operator from the forward computation graph based on the types of the base operators; determining a target basic operator from the alternative basic operators based on the connection relation information among the alternative basic operators; and obtaining the first subgraph based on the target basic operator.
In a possible implementation manner, when generating the first target computation graph of the target operator based on the first compiling operator, the generating module is specifically configured to generate the first target computation graph of the target operator based on the first compiling operator and a base operator in the forward computation graph except for the first sub-graph.
In a possible implementation manner, the obtaining module is further configured to obtain a reverse computation graph of the target operator reverse process; the compiling module is further configured to determine a second sub-graph from the reverse calculation graph, and perform compiling processing on the second sub-graph to obtain a second compiling operator corresponding to the second sub-graph; the generating module is further configured to determine, based on the second compiling operator, a second target calculation graph of the target operator reverse process.
In a possible implementation manner, when the reverse computation graph of the target operator reverse process is obtained, the obtaining module is specifically configured to record second calling information for calling a predefined reverse basic operator in the reverse process of executing the target operator; and generating a reverse calculation graph of the target operator based on the second calling information.
In a possible implementation, the execution module is further configured to execute a reverse process of the target operator based on the second target computation graph.
In a third aspect, an embodiment of the present disclosure further provides a computer device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor; when the machine-readable instructions are executed by the processor, the processor performs the steps in the first aspect, or in any one of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when run, performs the steps in the first aspect, or in any one of the possible implementations of the first aspect.
For the description of the effects of the operator executing apparatus, the computer device, and the computer readable storage medium, reference is made to the description of the operator executing method, and details are not repeated here.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 shows a flowchart of an execution method of an operator provided by an embodiment of the present disclosure;
FIG. 2 illustrates an example diagram of a forward computational graph of a target operator provided by an embodiment of the present disclosure;
fig. 3 illustrates an example diagram of a first sub-diagram provided by an embodiment of the present disclosure;
FIG. 4 illustrates an example diagram of a first target computation graph provided by an embodiment of the present disclosure;
FIG. 5 shows a flowchart of another operator execution method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram illustrating an apparatus for executing an operator according to an embodiment of the disclosure;
fig. 7 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The components of the embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments is not intended to limit the scope of the disclosure as claimed, but merely represents selected embodiments of the disclosure. All other embodiments that can be derived by a person skilled in the art from the embodiments of the disclosure without creative effort shall fall within the protection scope of the disclosure.
Research shows that existing, well-known operators can be defined in advance within a machine learning framework; when executing such operators, the framework typically needs to perform hardware resource allocation and computing resource scheduling for them only once, or a limited number of times. However, machine learning models are complex and varied, and as the field of artificial intelligence develops, developers continually define new models and new operators. When a new custom operator appears, it has no predefined implementation in the framework, so the developer must supply its forward process and reverse process: the forward process corresponds to a forward computation graph composed of a plurality of base operators, and the reverse process corresponds to a reverse computation graph composed of a plurality of reverse base operators. Each time the framework executes a base operator, it must perform hardware resource allocation and computing resource scheduling for that operator. As a result, executing the forward and reverse processes of a custom operator forces the framework to schedule various hardware computing resources frequently; the scheduling overhead is high, and the execution efficiency of custom operators is low.
Based on this research, the present disclosure provides an operator execution method, an operator execution device, a computer device, and a storage medium. A first sub-graph is determined from the forward computation graph of a target operator, where the first sub-graph contains at least part of the base operators in the forward computation graph. Compiling the first sub-graph turns the base operators it contains into a single first compiling operator; the first target computation graph generated from this first compiling operator therefore contains fewer operators than the forward computation graph. When the target operator is executed based on the first target computation graph, hardware resources are allocated and computing resources are scheduled fewer times, which improves the execution efficiency of the user-defined operator.
The drawbacks described above are the result of the inventor's practice and careful study; therefore, both the discovery of these problems and the solutions the present disclosure proposes for them should be regarded as the inventor's contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, an operator execution method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the operator execution method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, for example: a terminal device, which may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device; or a server or other processing device. In some possible implementations, the operator execution method may be implemented by a processor calling computer-readable instructions stored in a memory.
In the computer device, for example, a machine learning framework may be deployed, and the execution method of the operator provided by the embodiment of the present disclosure is executed by the machine learning framework.
The target operator described in the embodiments of the present disclosure includes, for example, a mapping written by a user in a high-level programming language (e.g., C, C++, C#, Java, etc.) that implements a certain function, i.e., a correspondence from one function space (e.g., a Banach space, Hilbert space, or Sobolev space) to another function space.
The base operator described in the embodiments of the present disclosure is a function mapping that pre-exists in the machine learning framework and implements a basic computation, including, for example: mathematical operation base operators (e.g., convolution, summation, etc.), comparison operation base operators, logical operation base operators, reduction operation base operators, threshold operation base operators, differential base operators, gradient base operators, divergence base operators, Laplacian base operators, Hamiltonian base operators, and the like.
The reverse basic operator described in the embodiment of the present disclosure is, for example, an operator that performs reverse operation corresponding to any one of the basic operators; for example, if the basic operator is a mathematical operation basic operator for performing integral calculation, the corresponding inverse basic operator is a mathematical operation basic operator for performing derivative calculation.
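As a hedged illustration of this forward/reverse pairing (all names and operators here are hypothetical stand-ins, not taken from the patent or any real framework), a forward base operator and its reverse counterpart might be registered together as follows:

```python
# Hypothetical sketch: pairing a base operator with its reverse (gradient)
# counterpart, as the disclosure describes for the reverse process.

def mul_forward(x, y):
    """Base operator: multiplication (forward computation)."""
    return x * y

def mul_backward(grad_out, x, y):
    """Reverse base operator: gradients of multiplication w.r.t. each input."""
    return grad_out * y, grad_out * x

# Registry mapping each base operator to its reverse counterpart.
REVERSE_OP = {mul_forward: mul_backward}
```

A framework following this scheme could look up `REVERSE_OP[op]` for each base operator recorded in the forward pass when building the reverse computation graph.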
The first compiling operator described in the embodiments of the present disclosure includes, for example, an operator obtained by compiling a plurality of basic operators and capable of realizing the same function in place of the plurality of basic operators.
The second compiling operator described in the embodiment of the present disclosure includes, for example, an operator obtained by compiling a plurality of reverse basic operators and capable of replacing the plurality of reverse basic operators to implement the same function.
The following describes an execution method of an operator provided in the embodiments of the present disclosure.
Referring to fig. 1, a flowchart of an execution method of an operator provided in an embodiment of the present disclosure is shown, where the method includes steps S101 to S104, where:
s101: and acquiring a forward calculation graph of the target operator.
The forward calculation process of the target operator from input to output is actually a process of calling a plurality of basic operators pre-existing in a machine learning framework to realize a target calculation function, and the forward calculation graph comprises the plurality of basic operators.
In another embodiment of the present disclosure, the following method may be adopted to obtain the forward computation graph of the target operator, for example: recording first calling information for calling a predefined basic operator in a forward process of executing a target operator; and acquiring a forward calculation graph of the target operator based on the first calling information.
The forward process of the target operator is the execution of the target operator's computer code according to the forward computation graph; executing this code processes the input data to be processed and produces the corresponding prediction result. The computer code of the target operator is source code, written in a high-level programming language such as C, C++, C#, or Java, that implements the target operator's function. When the machine learning framework executes this code, the code is passed from the application layer where the framework resides down to the computing engine, which must translate the source code into basic code the computer can recognize. Because the target operator is not predefined in the machine learning framework, the framework calls its own predefined base operators while executing the code. During execution of the target operator, the framework can record how the base operators are called in this process, obtaining the first calling information. The first calling information includes, for example, the kinds of base operators called when the code is executed, the calling order of the base operators, and the dependencies between different base operators.
For example, predetermined samples may be input to the machine learning framework, and the framework performs the processing corresponding to the target operator on the sample data; performing this processing is exactly executing the computer code of the target operator. While executing the code, the framework calls base operators to carry out the target operator's processing task, i.e., base operators inside the framework are invoked to produce the output corresponding to the sample data. During this process, the framework records the category of each base operator called, the order in which the base operators are called, and the dependencies between the called base operators. When the code finishes executing, the first calling information has been obtained, and the forward computation graph of the target operator is then generated from it.
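The recording described above can be sketched as a tracing wrapper around each predefined base operator. This is an illustrative assumption about one possible implementation, not the patent's actual code; the `Tensor` class, `traced` helper, and operator names are all invented:

```python
# Sketch: record "first calling information" by wrapping base operators so
# that every invocation during the forward pass is logged with its inputs.

call_log = []  # each entry: (op_name, input_node_ids, output_node_id)

class Tensor:
    """Minimal traced value; node_id identifies it in the call log."""
    _next_id = 0
    def __init__(self, value):
        self.value = value
        self.node_id = Tensor._next_id
        Tensor._next_id += 1

def traced(name, fn):
    """Wrap a base operator so each call records kind, order, and dependencies."""
    def wrapper(*args):
        deps = [a.node_id for a in args if isinstance(a, Tensor)]
        raw = [a.value if isinstance(a, Tensor) else a for a in args]
        out = Tensor(fn(*raw))
        call_log.append((name, deps, out.node_id))
        return out
    return wrapper

add = traced("add", lambda a, b: a + b)
mul = traced("mul", lambda a, b: a * b)
```

Running, say, `mul(add(x, y), y)` leaves two entries in `call_log` in calling order, and the dependency of `mul` on the output of `add` is visible in the recorded node ids; the forward graph can then be built from the log.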
Here, the forward computation Graph of the target operator may be represented as a Directed Acyclic Graph (DAG); a plurality of nodes are included in the DAG, each node representing a base operator. The base operators with dependencies between them are connected by directed edges. The DAG is used as the generated forward computation graph.
Illustratively, during execution of the computer code, base operator 1 is called first; then base operator 2 and base operator 3 are called in parallel. Base operator 4 is called after base operator 2, base operator 5 after base operator 4, base operator 6 after base operator 5, base operator 7 after base operator 6, and base operator 8 after base operator 7; base operator 9 is called after base operator 3, and base operator 8 after base operator 9. This yields the forward computation graph shown in Fig. 2. A dependency in the embodiments of the present disclosure is the association expressed by the calling order between base operators: for example, the output of base operator 1 in Fig. 2 is the input of base operator 2, so base operator 2 depends on base operator 1; the output of base operator 2 has no association with the input of base operator 3, so there is no dependency between base operator 2 and base operator 3.
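The Fig. 2 structure above can be sketched as a directed acyclic graph in plain Python, together with a topological sort that recovers one valid calling order (a minimal illustration; the node numbering follows the example, and an edge u -> v means v depends on u):

```python
# Fig. 2 forward computation graph as an adjacency list.
forward_graph = {
    1: [2, 3],                               # operator 1 feeds 2 and 3 (parallel)
    2: [4], 4: [5], 5: [6], 6: [7], 7: [8],  # chain 2 -> 4 -> 5 -> 6 -> 7 -> 8
    3: [9], 9: [8],                          # branch 3 -> 9 -> 8
    8: [],
}

def topological_order(graph):
    """Return one valid call order for the base operators (Kahn's algorithm)."""
    indeg = {n: 0 for n in graph}
    for succs in graph.values():
        for v in succs:
            indeg[v] += 1
    ready = [n for n, d in indeg.items() if d == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for v in graph[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order
```

Any order produced respects every dependency edge, which is exactly the property the DAG representation is meant to preserve from the recorded calling order.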
Following the above S101, after the forward computation graph of the target operator is obtained, the method further includes:
s102: and determining a first sub-graph from the forward calculation graph, and compiling the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph.
Wherein the first sub-graph for example comprises at least part of the base operators in the forward computational graph; at least one first sub-graph may be determined in the forward computational graph.
In another embodiment of the present disclosure, the following method may be used to determine the first sub-graph from the forward computational graph:
and determining a first subgraph from the forward calculation graph based on the connection relation information among the basic operators in the forward calculation graph and the types of the basic operators.
Here, when determining the first sub-graph from the forward computational graph based on the connection relation information and the type of the base operator, the alternative base operator may be first determined from the forward computational graph based on the type of the base operator;
then, determining a target basic operator from the alternative basic operators based on the connection relation information among the alternative basic operators; and obtaining a first subgraph based on the target basic operator.
Here, the alternative base operator includes a base operator of which the type is a preset type.
Before generating the first subgraph, a plurality of preset types may first be determined. The operators of the preset type include, for example, basic operators with simple calculation processes, such as multiplication operators, division operators, addition operators, subtraction operators, and the like. For some basic operators, such as convolution operators, the calculation process itself is complicated, so that they may not be used as the preset type.
When the first sub-graph is generated, the types of all basic operators in the forward calculation graph can be matched with the preset types in sequence; and if the type of any basic operator is successfully matched with the preset type, taking the basic operator as an alternative basic operator.
After the alternative basic operators are determined from the forward calculation graph, they can be divided into different groups based on the connection relation information among the different alternative basic operators; within each group, every alternative basic operator is connected to at least one other alternative basic operator in the same group.
If the number of alternative basic operators in any group is greater than a preset number, the alternative basic operators in that group can be taken as target basic operators, and a first subgraph is obtained based on the target basic operators in the group.
Therefore, the method ensures that a plurality of basic operators with continuous processing relations can be compiled into one first compiling operator, and reduces the number of operators in the finally formed first target calculation graph.
For example, in the forward calculation diagram shown in fig. 2, the basic operators 3, 5, 6, and 7 are determined as alternative basic operators. Based on the connection relation information of these basic operators, they can be divided into two groups: basic operator 3 forms the first group, and basic operators 5, 6, and 7 form the second group. Assuming that the preset number is 2, basic operators 5, 6, and 7 are respectively taken as target basic operators 5, 6, and 7, and the first sub-graph shown in fig. 3 is determined based on these target basic operators.
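The candidate-selection and grouping procedure described above can be sketched as follows. This is a minimal illustration only: the type strings, the threshold values, and the dictionary-based graph representation are assumptions, not details from the disclosure.

```python
# Sketch of first-sub-graph selection: pick basic operators of preset types,
# group the connected ones, and keep groups above a size threshold.
from collections import deque

PRESET_TYPES = {"add", "sub", "mul", "div"}  # assumed "simple" operator types
PRESET_NUMBER = 2                            # assumed group-size threshold

def select_target_operators(graph):
    """graph: operator id -> (operator type, list of successor ids)."""
    # Step 1: alternative basic operators are those whose type is preset.
    candidates = {op for op, (t, _) in graph.items() if t in PRESET_TYPES}

    # Undirected adjacency restricted to the candidate operators.
    adj = {op: set() for op in candidates}
    for op, (_, succs) in graph.items():
        for s in succs:
            if op in candidates and s in candidates:
                adj[op].add(s)
                adj[s].add(op)

    # Step 2: divide the candidates into connected groups.
    groups, seen = [], set()
    for start in candidates:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            cur = queue.popleft()
            if cur not in comp:
                comp.add(cur)
                queue.extend(adj[cur] - comp)
        seen |= comp
        groups.append(comp)

    # Step 3: groups larger than the preset number become target operators.
    return [g for g in groups if len(g) > PRESET_NUMBER]
```

With a graph shaped like the fig. 2 example (operators 3, 5, 6, and 7 of preset types, of which only 5, 6, and 7 are mutually connected), only the group {5, 6, 7} exceeds a preset number of 2.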
After at least one first sub-graph is determined from the forward calculation graph, compiling processing is performed on each first sub-graph, and all of the basic operators in the first sub-graph are compiled into one first compiling operator that implements the same functions as those basic operators.
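As a rough illustration of what such compilation achieves, a chain of simple base operators can be fused into a single callable that performs the same work in one invocation. The operator set, the chain encoding, and the fusion-by-composition strategy are assumptions for illustration; a real compiler would emit optimized kernel code rather than compose Python functions.

```python
import operator

# Assumed mapping from operator-type names to their implementations.
BASE_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def compile_subgraph(chain):
    """chain: ordered list of (op_type, constant) steps applied to one input."""
    steps = [(BASE_OPS[t], c) for t, c in chain]

    def fused(x):
        # A single "first compiling operator" replacing several base-operator
        # invocations, so only one operator is scheduled at execution time.
        for fn, c in steps:
            x = fn(x, c)
        return x

    return fused

# e.g. a sub-graph that multiplies by 2, adds 3, then subtracts 1
fused_op = compile_subgraph([("mul", 2), ("add", 3), ("sub", 1)])
```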
Following step S102, after the first compiling operator corresponding to the first sub-graph is obtained, the method further includes:
S103: based on the first compiling operator, a first target calculation graph of the target operator is generated.
S104: based on the first target computation graph, a forward process of a target operator is performed.
Specifically, the first target calculation graph of the target operator is generated based on the first compiling operator; for example, it is generated based on the first compiling operator and the basic operators in the forward calculation graph other than those in the first subgraph.
When the first target calculation graph is generated, the connection relationships between the first compiling operator and the other basic operators can be determined from the connection relationships between the plurality of target basic operators corresponding to the first compiling operator and the other basic operators in the forward calculation graph.
The first target calculation graph is then generated based on the other basic operators, the first compiling operator, and the connection relationships between them.
Illustratively, the basic operators 5, 6, and 7 in fig. 3 are compiled into a first compiling operator 1, and the connection relationship between the first compiling operator 1 and the other basic operators is determined based on the connection relationships between basic operators 5, 6, and 7 and the other basic operators in the forward calculation diagram shown in fig. 2; the first target calculation graph shown in fig. 4 is then determined based on the other basic operators, the first compiling operator 1, and the connection relationship between the first compiling operator 1 and the other basic operators.
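The graph rewriting in this kind of example, collapsing the target basic operators into one compiled-operator node while preserving the connections that cross the sub-graph boundary, can be sketched as follows (the edge-set representation and the node name are illustrative assumptions):

```python
def build_target_graph(edges, fused_nodes, fused_name="compiled_op_1"):
    """edges: set of (src, dst) pairs of the forward calculation graph.
    fused_nodes: operators compiled into the single first compiling operator."""
    new_edges = set()
    for src, dst in edges:
        # Redirect any edge endpoint inside the sub-graph to the fused node.
        src = fused_name if src in fused_nodes else src
        dst = fused_name if dst in fused_nodes else dst
        if src != dst:  # edges internal to the fused sub-graph disappear
            new_edges.add((src, dst))
    return new_edges
```

For instance, fusing operators 5, 6, and 7 turns the edges 4→5, 5→6, 6→7 into the single edge 4→compiled_op_1.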
Thus, because all of the basic operators included in the first sub-graph are compiled into one first compiling operator, the number of operators in the forward calculation graph is reduced, and the resulting first target calculation graph contains fewer operators than the forward calculation graph. Therefore, when the target operator is executed based on the first target calculation graph, fewer hardware resource allocations and computing resource scheduling operations are required, which improves the efficiency of executing the target operator.
The embodiment of the disclosure determines a first sub-graph from a forward calculation graph of a target operator, wherein the first sub-graph comprises at least some of the basic operators in the forward calculation graph; the basic operators included in the first sub-graph are compiled into a first compiling operator. A first target calculation graph of the target operator generated based on the first compiling operator therefore includes fewer operators than the forward calculation graph, so that when the target operator is executed based on the first target calculation graph, fewer hardware resource allocations and computing resource scheduling operations are performed, further improving the execution efficiency of the user-defined operator.
In another embodiment of the present disclosure, there is provided another method for executing an operator, as shown in fig. 5, including: S501-S503, wherein:
S501: a reverse calculation graph of the reverse process of the target operator is acquired.
In specific implementation, the machine learning framework includes not only basic operators but also a reverse basic operator corresponding to each basic operator. To obtain the reverse calculation graph of the target operator, second calling information for calling the predefined reverse basic operators during the reverse process of executing the target operator can be recorded, and the reverse calculation graph of the target operator is generated based on the second calling information. This process is similar to generating the forward calculation graph of the target operator based on the first calling information and is not described here again.
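A minimal sketch of recording such calling information follows, assuming a simple tracer that logs each operator call with its type, call order, and input dependencies; the class and method names are hypothetical, not from the disclosure.

```python
class CallRecorder:
    """Records which predefined (reverse) basic operators are called, in what
    order, and which earlier calls each one depends on."""

    def __init__(self):
        self.calls = []  # list of (call id, operator type, input call ids)

    def record(self, op_type, inputs=()):
        call_id = len(self.calls)          # call order doubles as the id
        self.calls.append((call_id, op_type, list(inputs)))
        return call_id

    def to_graph(self):
        # calculation graph: call id -> (operator type, dependency ids)
        return {cid: (t, deps) for cid, t, deps in self.calls}

rec = CallRecorder()
a = rec.record("mul_backward")
b = rec.record("add_backward", [a])  # depends on the first call's output
```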
S502: a second subgraph is determined from the reverse calculation graph, and the second subgraph is compiled to obtain a second compiling operator corresponding to the second subgraph.
S503: a second target calculation graph of the reverse process of the target operator is determined based on the second compiling operator.
In a specific implementation, the manner of determining the second sub-graph is similar to the manner of determining the first sub-graph, and the manner of determining the second target computation graph in the target operator reverse process based on the second compiling operator is similar to the manner of determining the first target computation graph in the target operator forward process based on the first compiling operator, which is not described herein again.
In another embodiment of the present disclosure, after the second target calculation graph is obtained, the reverse process of the target operator may further be executed based on the second target calculation graph. The reverse process of the target operator is: performing gradient back propagation according to the results output by each operator, and optimizing the operator parameters in each operator by means of the back-propagated gradients.
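As a toy illustration of this reverse process, the following sketch back-propagates a gradient through two multiplication operators and updates their parameters. The two-parameter model y = w2·(w1·x), the squared-error loss, and the learning rate are assumptions chosen only to show the gradient flow, not the disclosure's own computation.

```python
def backward_step(x, w1, w2, target, lr=0.1):
    # forward process through two multiplication operators
    h = w1 * x
    y = w2 * h
    # gradient back propagation from the loss 0.5 * (y - target) ** 2
    grad_y = y - target
    grad_w2 = grad_y * h   # gradient reaching the second operator's parameter
    grad_h = grad_y * w2
    grad_w1 = grad_h * x   # gradient reaching the first operator's parameter
    # optimize the operator parameters with the back-propagated gradients
    return w1 - lr * grad_w1, w2 - lr * grad_w2
```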
Because the number of operators in the second target calculation graph is less than that of operators in the reverse calculation graph, the reverse process of executing the target operators based on the second target calculation graph can reduce the scheduling of hardware resources and improve the execution efficiency of the reverse process of the target operators.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, an operator execution device corresponding to an operator execution method is further provided in the embodiments of the present disclosure, and as the principle of solving the problem of the device in the embodiments of the present disclosure is similar to the operator execution method in the embodiments of the present disclosure, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 6, a schematic diagram of an apparatus for executing an operator according to an embodiment of the present disclosure is shown, where the apparatus includes: an acquisition module 601, a compiling module 602, a generating module 603, and an executing module 604; wherein:
an obtaining module 601, configured to obtain a forward computation graph of a target operator; wherein the forward computational graph comprises a plurality of base operators;
a compiling module 602, configured to determine a first sub-graph from the forward computation graph, and perform compiling processing on the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph; the first sub-graph comprises at least part of basic operators in the forward calculation graph;
a generating module 603, configured to generate a first target computation graph of the target operator based on the first compiling operator;
an executing module 604, configured to execute a forward process of the target operator based on the first target computation graph.
In a possible implementation manner, when obtaining a forward computation graph of a target operator, the obtaining module is specifically configured to record first calling information for calling a predefined basic operator in a forward process of executing the target operator; and generating a forward calculation graph of the target operator based on the first calling information.
In one possible embodiment, the first call information includes: the type of the basic operator called when the forward process of the target operator is executed, the calling sequence of the basic operator and the dependency relationship among different basic operators.
In one possible embodiment, the base operator comprises at least one of: a mathematical operation base operator, a comparison operation base operator, a logical operation base operator, a reduction operation base operator, and a threshold operation base operator.
In a possible implementation, when determining the first sub-graph from the forward computation graph, the compiling module is specifically configured to determine the first sub-graph from the forward computation graph based on information of connection relationships between the base operators constituting the forward computation graph and a type of the base operator.
In a possible implementation, when determining the first sub-graph from the forward computation graph based on information of connection relationships between the base operators constituting the forward computation graph and the types of the base operators, the compiling module is specifically configured to determine an alternative base operator from the forward computation graph based on the types of the base operators; determining a target basic operator from the alternative basic operators based on the connection relation information among the alternative basic operators; and obtaining the first subgraph based on the target basic operator.
In a possible implementation manner, when generating the first target computation graph of the target operator based on the first compiling operator, the generating module is specifically configured to generate the first target computation graph of the target operator based on the first compiling operator and a base operator in the forward computation graph except for the first sub-graph.
In a possible implementation manner, the obtaining module is further configured to obtain a reverse computation graph of the target operator reverse process; the compiling module is further configured to determine a second sub-graph from the reverse calculation graph, and perform compiling processing on the second sub-graph to obtain a second compiling operator corresponding to the second sub-graph; the generating module is further configured to determine, based on the second compiling operator, a second target calculation graph of the target operator reverse process.
In a possible implementation manner, when the reverse computation graph of the target operator reverse process is obtained, the obtaining module is specifically configured to record second calling information for calling a predefined reverse basic operator in the reverse process of executing the target operator; and generating a reverse calculation graph of the target operator based on the second calling information.
In a possible implementation, the execution module is further configured to execute a reverse process of the target operator based on the second target computation graph.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
An embodiment of the present disclosure further provides a computer device, as shown in fig. 7, which is a schematic structural diagram of the computer device provided in the embodiment of the present disclosure, and includes:
a processor 71 and a memory 72; the memory 72 stores machine-readable instructions executable by the processor 71, and the processor 71 is configured to execute the machine-readable instructions stored in the memory 72; when the machine-readable instructions are executed by the processor 71, the processor 71 performs the following steps:
acquiring a forward calculation graph of a target operator; wherein the forward computational graph comprises a plurality of base operators;
determining a first sub-graph from the forward calculation graph, and performing compiling processing on the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph; the first sub-graph comprises at least part of basic operators in the forward calculation graph;
generating a first target calculation graph of the target operator based on the first compiling operator;
based on the first target computation graph, performing a forward process of the target operator.
The memory 72 includes a memory 721 and an external memory 722; the memory 721 is also referred to as an internal memory, and temporarily stores operation data in the processor 71 and data exchanged with an external memory 722 such as a hard disk, and the processor 71 exchanges data with the external memory 722 through the memory 721.
For the specific execution process of the instruction, reference may be made to the steps of the operator execution method described in the embodiments of the present disclosure, and details are not described here.
The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the method for executing the operator in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and an instruction included in the program code may be used to execute the step of the method for executing the operator in the foregoing method embodiments, which may be referred to specifically as the foregoing method embodiments, and is not described herein again.
The computer program product may be implemented by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

1. A method for executing an operator, comprising:
acquiring a forward calculation graph of a target operator; wherein the forward computational graph comprises a plurality of base operators;
determining a first sub-graph from the forward calculation graph, and performing compiling processing on the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph; the first sub-graph comprises at least part of basic operators in the forward calculation graph;
generating a first target calculation graph of the target operator based on the first compiling operator;
based on the first target computation graph, performing a forward process of the target operator.
2. The method for executing an operator according to claim 1, wherein said obtaining a forward computation graph of a target operator comprises:
recording first calling information for calling a predefined basic operator in a forward process of executing the target operator;
and generating a forward calculation graph of the target operator based on the first calling information.
3. The method for executing an operator according to claim 2, wherein the first calling information includes: the type of the basic operator called when the forward process of the target operator is executed, the calling sequence of the basic operator and the dependency relationship among different basic operators.
4. A method of performing an operator according to any of claims 1-3, wherein the base operator comprises at least one of:
a mathematical operation base operator, a comparison operation base operator, a logical operation base operator, a reduction operation base operator, and a threshold operation base operator.
5. The method of performing an operator according to any of claims 1-4, wherein said determining a first sub-graph from said forward computational graph comprises:
determining the first subgraph from the forward computation graph based on connection relation information between the base operators constituting the forward computation graph and the type of the base operators.
6. The method according to claim 5, wherein said determining the first sub-graph from the forward computation graph based on information of connection relationships between the base operators constituting the forward computation graph and types of the base operators comprises:
determining an alternative base operator from the forward computational graph based on the type of the base operator;
determining a target basic operator from the alternative basic operators based on the connection relation information among the alternative basic operators;
and obtaining the first subgraph based on the target basic operator.
7. The method for executing an operator according to any one of claims 1 to 6, wherein said generating a first target computation graph of said target operator based on said first compilation operator comprises:
and generating a first target calculation graph of the target operator based on the first compiling operator and other basic operators in the forward calculation graph except the first subgraph.
8. The method of executing an operator according to any of claims 1-7, further comprising:
obtaining a reverse calculation graph of the reverse process of the target operator;
determining a second sub-graph from the reverse calculation graph, and performing compiling processing on the second sub-graph to obtain a second compiling operator corresponding to the second sub-graph;
determining a second target computational graph of the target operator reverse process based on the second compiling operator.
9. The method for executing an operator according to claim 8, wherein said obtaining a reverse computation graph of a reverse process of the target operator comprises:
recording second calling information for calling a predefined reverse basic operator in the reverse process of executing the target operator;
and generating a reverse calculation graph of the target operator based on the second calling information.
10. The method for executing an operator according to claim 8 or 9, further comprising: based on the second target computation graph, performing a reverse process of the target operator.
11. An apparatus for executing an operator, comprising:
the acquisition module is used for acquiring a forward calculation graph of the target operator; wherein the forward computational graph comprises a plurality of base operators;
the compiling module is used for determining a first sub-graph from the forward calculation graph and compiling the first sub-graph to obtain a first compiling operator corresponding to the first sub-graph; the first sub-graph comprises at least part of basic operators in the forward calculation graph;
a generating module, configured to generate a first target computation graph of the target operator based on the first compiling operator;
and the execution module is used for executing the forward process of the target operator based on the first target calculation graph.
12. A computer device, comprising: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the processor being configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the processor, the processor performs the steps of the method of execution of the operator of any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when executed by a computer device, performs the steps of the method of execution of an operator according to any one of claims 1 to 10.
CN202110209717.7A 2021-02-24 2021-02-24 Operator execution method and device, computer equipment and storage medium Pending CN112947933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110209717.7A CN112947933A (en) 2021-02-24 2021-02-24 Operator execution method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112947933A true CN112947933A (en) 2021-06-11

Family

ID=76246112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110209717.7A Pending CN112947933A (en) 2021-02-24 2021-02-24 Operator execution method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112947933A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003306A (en) * 2021-10-27 2022-02-01 上海商汤科技开发有限公司 Video memory optimization method, device, equipment and storage medium
CN114880537A (en) * 2022-05-13 2022-08-09 深圳宏鹏数字供应链管理有限公司 Enterprise risk assessment method, system and storage medium
CN114897146A (en) * 2022-05-18 2022-08-12 北京百度网讯科技有限公司 Model generation method and device and electronic equipment
CN114924745A (en) * 2022-05-19 2022-08-19 北京百度网讯科技有限公司 Operation method and device of deep learning compiler and electronic equipment
CN115145965A (en) * 2022-09-01 2022-10-04 浙江大华技术股份有限公司 Data stream generation method, electronic device and computer-readable storage medium
WO2023124677A1 (en) * 2021-12-30 2023-07-06 华为技术有限公司 Data processing method and computing platform

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345937A (en) * 2017-01-06 2018-07-31 谷歌有限责任公司 Cycle is merged with library
CN110689116A (en) * 2019-09-24 2020-01-14 上海寒武纪信息科技有限公司 Neural network pruning method and device, computer equipment and storage medium
CN110908667A (en) * 2019-11-18 2020-03-24 北京迈格威科技有限公司 Method and device for joint compilation of neural network and electronic equipment
CN111160551A (en) * 2019-12-04 2020-05-15 上海寒武纪信息科技有限公司 Computation graph execution method, computer device, and storage medium
CN111222637A (en) * 2020-01-17 2020-06-02 上海商汤智能科技有限公司 Neural network model deployment method and device, electronic equipment and storage medium
CN111260019A (en) * 2020-02-18 2020-06-09 深圳鲲云信息科技有限公司 Data processing method, device and equipment of neural network model and storage medium
CN111338635A (en) * 2020-02-20 2020-06-26 腾讯科技(深圳)有限公司 Graph compiling method, device and equipment for calculation graph and storage medium
US20200334544A1 (en) * 2019-04-19 2020-10-22 EMC IP Holding Company LLC Method, device and computer program product for processing machine learning model
CN111860820A (en) * 2020-07-31 2020-10-30 北京灵汐科技有限公司 Neural network operator dividing method and device and dividing equipment
US20210034582A1 (en) * 2019-08-02 2021-02-04 EMC IP Holding Company LLC Method, electronic device and computer program product for processing machine learning model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NESTERUK, LG et al.: "On modeling neural-network protection facilities for computer-aided systems: A formal model of adaptation and operation. II", Automation and Remote Control, vol. 70, no. 2
殷崇勇; 尹首一; 刘雷波; 杨超; 朱敏; 魏少军: "Front-end design of a task compiler for a reconfigurable media processor", Journal of Beijing University of Posts and Telecommunications, no. 03
胡浩; 沈莉; 周清雷; 巩令钦: "Node fusion optimization method based on the LLVM compiler", Computer Science, no. 1

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003306A (en) * 2021-10-27 2022-02-01 上海商汤科技开发有限公司 Video memory optimization method, device, equipment and storage medium
CN114003306B (en) * 2021-10-27 2024-03-15 上海商汤科技开发有限公司 Video memory optimization method, device, equipment and storage medium
WO2023124677A1 (en) * 2021-12-30 2023-07-06 华为技术有限公司 Data processing method and computing platform
CN114880537A (en) * 2022-05-13 2022-08-09 深圳宏鹏数字供应链管理有限公司 Enterprise risk assessment method, system and storage medium
CN114897146A (en) * 2022-05-18 2022-08-12 北京百度网讯科技有限公司 Model generation method and device and electronic equipment
CN114897146B (en) * 2022-05-18 2023-11-03 北京百度网讯科技有限公司 Model generation method and device and electronic equipment
CN114924745A (en) * 2022-05-19 2022-08-19 北京百度网讯科技有限公司 Operation method and device of deep learning compiler and electronic equipment
CN115145965A (en) * 2022-09-01 2022-10-04 浙江大华技术股份有限公司 Data stream generation method, electronic device and computer-readable storage medium

Similar Documents

Publication Publication Date Title
CN112947933A (en) Operator execution method and device, computer equipment and storage medium
US8997065B2 (en) Automatic modularization of source code
CN111831287B (en) Method, apparatus and program product for determining resources required to execute a code segment
CN111104120A (en) Neural network compiling method and system and corresponding heterogeneous computing platform
CN113642659A (en) Training sample set generation method and device, electronic equipment and storage medium
Martínez-del-Amor et al. Adaptative parallel simulators for bioinspired computing models
CN114004335A (en) Data processing method and device, electronic equipment and storage medium
CN114398040A (en) Neural network reasoning method, device, computer equipment and storage medium
US20120166444A1 (en) Co-map communication operator
Hascoët et al. Source-to-source adjoint Algorithmic Differentiation of an ice sheet model written in C
CN115268936B (en) Optimization method and device for calculation chart compilation
CN114565102A (en) Method, electronic device and computer program product for deploying machine learning model
JP4870956B2 (en) Embedded program generation method, embedded program development system, and information table section
CN115328458A (en) Business application development method and device
Hoefler et al. Automatic complexity analysis of explicitly parallel programs
CN113327217A (en) Convolution processing method and device, computer equipment and storage medium
Naumann Adjoint code design patterns
CN113626035A (en) Neural network compiling method facing RISC-V equipment based on TVM
CN112001494A (en) Method for realizing support of FPGA (field programmable Gate array) back-end equipment by nGraph framework
CN111913712A (en) Method and apparatus for deploying neural network model at Web end
Siefert Model-assisted pattern search
CN114356340A (en) Neural network compiling method and device, computer equipment and storage medium
CN113986240A (en) Compiling tracking method and device
Albert et al. Quantified abstractions of distributed systems
Li Mc2For: a MATLAB to Fortran 95 complier

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination