CN115658331A - Compiling method and device of dynamic neural network, electronic equipment and storage medium - Google Patents

Compiling method and device of dynamic neural network, electronic equipment and storage medium

Info

Publication number
CN115658331A
CN115658331A (application CN202211688051.9A)
Authority
CN
China
Prior art keywords
neural network
operator
dynamic neural
tensor
shape
Prior art date
Legal status
Granted
Application number
CN202211688051.9A
Other languages
Chinese (zh)
Other versions
CN115658331B (en)
Inventor
李晓泉
胡胜新
Current Assignee
Shanghai Denglin Technology Co ltd
Hangzhou Denglin Hanhai Technology Co ltd
Original Assignee
Shanghai Denglin Technology Co ltd
Hangzhou Denglin Hanhai Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Denglin Technology Co ltd, Hangzhou Denglin Hanhai Technology Co ltd filed Critical Shanghai Denglin Technology Co ltd
Priority to CN202211688051.9A priority Critical patent/CN115658331B/en
Publication of CN115658331A publication Critical patent/CN115658331A/en
Application granted granted Critical
Publication of CN115658331B publication Critical patent/CN115658331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The application provides a compiling method and device for a dynamic neural network, an electronic device, and a storage medium, relating to the field of computer technology. At the compiling stage of the dynamic neural network, the method obtains the maximum shape of an input tensor in the dynamic neural network; determines the maximum shape of the output tensor of a first type of target operator in the dynamic neural network based on the maximum shape of the input tensor; determines the maximum shape of the output tensor of a second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator; and allocates memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator and the maximum shape of the output tensor of the second type of target operator. Memory space can thus be allocated for each operator at the compiling stage of the dynamic neural network, which in turn improves the execution speed of the dynamic neural network.

Description

Compiling method and device of dynamic neural network, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for compiling a dynamic neural network, an electronic device, and a storage medium.
Background
Compared with a static neural network, a dynamic neural network can adaptively adjust its structure and/or parameters according to different input data; its computation path during inference may not be fixed and may vary with the input.
In the prior art, the memory usage is typically computed dynamically for each layer or operator of the dynamic neural network during inference, and memory is dynamically allocated to each operator according to the computed result during execution. That is, executing an existing dynamic neural network typically involves both a dynamic memory-usage calculation and a dynamic memory allocation based on that calculation, so existing dynamic neural networks suffer from low execution speed.
Disclosure of Invention
An object of the present application is to provide a compiling method and apparatus, an electronic device, and a storage medium for a dynamic neural network, which can provide data references for memory allocation of the dynamic neural network in advance at compile time, facilitate completion of the corresponding memory allocation before the dynamic neural network begins inference execution, and improve the execution speed of the dynamic neural network.
In order to achieve the above purpose, the technical solutions adopted in the embodiments of the present application are as follows:
in a first aspect, the present invention provides a method for compiling a dynamic neural network, the method including:
in the compiling stage of the dynamic neural network, acquiring the maximum shape of an input tensor in the dynamic neural network;
determining a maximum shape of an output tensor of a first class of target operator in the dynamic neural network based on a maximum shape of an input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor;
determining the maximum shape of the output tensor of a second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of a first type of target operator in the dynamic neural network, wherein the shape of the output tensor of the second type of target operator is related to the value of the input tensor;
and allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
In a second aspect, the present invention provides a compiling apparatus for a dynamic neural network, the compiling apparatus comprising:
the acquisition module is used for acquiring the maximum shape of an input tensor in the dynamic neural network at the compiling stage of the dynamic neural network;
a first determining module, configured to determine a maximum shape of an output tensor of a first class of target operator in the dynamic neural network based on a maximum shape of an input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor;
a second determining module, configured to determine a maximum shape of an output tensor of a second type of target operator in the dynamic neural network according to a maximum shape of an output tensor of a first type of target operator in the dynamic neural network, where the shape of the output tensor of the second type of target operator is related to a value of an input tensor;
and the distribution module is used for distributing memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
In a third aspect, the present invention provides an electronic device comprising: the dynamic neural network compiling device comprises a processor, a storage medium and a bus, wherein the storage medium stores machine readable instructions executable by the processor, when an electronic device runs, the processor is communicated with the storage medium through the bus, and the processor executes the machine readable instructions to execute the steps of the dynamic neural network compiling method according to any one of the preceding embodiments.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to execute the steps of the method for compiling a dynamic neural network according to any one of the foregoing embodiments.
The beneficial effect of this application is:
in the compiling method and device for the dynamic neural network, the electronic device, and the storage medium provided by the embodiments of the application, the maximum shape of the input tensor in the dynamic neural network can be obtained at the compiling stage of the dynamic neural network; the maximum shape of the output tensor of a first type of target operator in the dynamic neural network is determined based on the maximum shape of the input tensor, wherein the shape of the output tensor of the first type of target operator is related to the shape of the input tensor; the maximum shape of the output tensor of a second type of target operator is determined according to the maximum shape of the output tensor of the first type of target operator, wherein the shape of the output tensor of the second type of target operator is related to the value of the input tensor; and memory space is allocated to each operator in the dynamic neural network, at the compiling stage, according to the maximum shapes of the output tensors of the first and second types of target operators, so that the execution speed of the dynamic neural network can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic flowchart of a compiling method of a dynamic neural network according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of another dynamic neural network compiling method according to an embodiment of the present disclosure;
fig. 3 is a flowchart illustrating a compiling method of a dynamic neural network according to an embodiment of the present application;
fig. 4 is a schematic flowchart of a compiling method of a dynamic neural network in an application scenario according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a compiling method of a dynamic neural network according to an embodiment of the present application;
fig. 6 is a partial flowchart of a compiling method of a dynamic neural network in another application scenario provided in the embodiment of the present application;
fig. 7 is a functional block diagram of a compiling device of a dynamic neural network according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
In the prior art, for a dynamic neural network, the memory required by each layer is dynamically calculated and dynamically allocated during the actual execution of the network. Because both the memory-usage calculation and the per-operator memory allocation are completed in the execution stage, the dynamic neural network executes slowly.
In view of this, embodiments of the present disclosure provide a compiling method for a dynamic neural network, which can improve an execution speed of the dynamic neural network.
Fig. 1 is a flowchart illustrating a compiling method of a dynamic neural network according to an embodiment of the present disclosure. Optionally, the execution subject of the method may be an electronic device with a dynamic-neural-network compiling function, such as a computer, a server, or a processor, and in particular may be a graph compiler deployed on the computer. It should be noted that the method provided in the embodiment of the present application may be applied to multiple fields such as image classification, text translation, speech recognition, and object recognition; it can be understood that when the method is applied to different fields, it corresponds to different dynamic neural networks.
As shown in fig. 1, the method includes: S101-S104.
S101, in a compiling stage of the dynamic neural network, acquiring the maximum shape of an input tensor in the dynamic neural network.
The dynamic neural network is a neural network in which the shape of the input tensor or the shape of the intermediate result tensor dynamically changes. The processing process of the dynamic neural network can be divided into a compiling stage and an executing stage, wherein the compiling stage can convert the dynamic neural network into an object code (specifically a binary instruction code) which is executed on a computing platform; the execution phase may execute the translated object code on the computing platform.
A tensor is a container for the data computed by operators, including input tensors and output tensors. In the following description, for convenience, the input tensor is abbreviated as input, and the output tensor is abbreviated as output.
In addition to storing data, tensors also record the spatial location of the data as it is stored. In a dynamic neural network, an operator may correspond to a computational unit, for example: convolution Layer (Convolution Layer) is an operator; the summation of weights in a Fully-connected Layer (FC Layer) is an operator. It is worth noting that different operators may be included for different dynamic neural networks.
Each operator in the neural network has an input tensor and an output tensor for the operator. In the entire neural network, the output tensors of some operators may be used as intermediate result tensors of the entire neural network, i.e., may be the input tensors of other operators in the neural network.
In the embodiment of the present application, in a compiling stage of a dynamic neural network, an input tensor of the dynamic neural network may be extracted, and a maximum shape (information) of the input tensor in the dynamic neural network may be obtained.
Optionally, in some embodiments, the maximum shape of the input tensor in the dynamic neural network may be passed into the graph compiler as a compilation parameter.
S102, determining the maximum shape of the output tensor of the first class of target operator in the dynamic neural network based on the maximum shape of the input tensor in the dynamic neural network.
The shape of the output tensor of the first class of target operators is related to the shape of the input tensor of the first class of target operators, that is, for any one of the first class of target operators, the shape of the output tensor is determined only by the shape of the input tensor of the operator and is not related to the value of the input tensor of the operator. For the first class of object operators, the maximum shape of the output tensor can be derived as long as the maximum shape of the input tensor is known.
Optionally, the first type of target operator may be: the addition operator add, various unary operators (e.g., cos, sin, log, the negation operator negative, etc.), various pooling operators (e.g., max pooling, avg pooling, etc.), and the like, which are not limited herein and may include different operators in different application scenarios. If the shape of the output tensor of an operator has a fixed derivation relationship with the shape of its input tensor, the maximum output shape can be derived as long as the maximum input shape is known; such an operator can therefore also serve as a first-type target operator in the present application. For example, the convolution operator (conv) is also a first-type target operator.
Taking the addition operator add as an example, the output shape and the input shape of the addition operator are the same, and therefore, the maximum shape of the output tensor can be obtained for the addition operator as long as the maximum shape of the input tensor is known.
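As a minimal sketch (not the patent's own implementation), the shape-only inference for first-type operators such as add and conv could look as follows; the function names and the NCHW layout are assumptions for illustration:

```python
# Hypothetical sketch: inferring the maximum output shape of first-type
# operators, whose output shape depends only on the input shape.

def infer_max_shape_add(max_shape_a, max_shape_b):
    # Elementwise add: output shape matches the input shapes. Here we
    # assume equal-rank inputs and take the elementwise maximum.
    return tuple(max(a, b) for a, b in zip(max_shape_a, max_shape_b))

def infer_max_shape_conv2d(max_input_shape, out_channels, kernel,
                           stride=1, padding=0):
    # NCHW convolution: output H/W follow the standard conv formula, so
    # feeding in the maximum input shape yields the maximum output shape.
    n, _, h, w = max_input_shape
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return (n, out_channels, out_h, out_w)
```

Because these rules are pure functions of shape, they can run entirely at compile time, before any input values exist.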
S103, determining the maximum shape of the output tensor of the second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator in the dynamic neural network.
Wherein, unlike the first type of target operator, the shape of the output tensor of the second type of target operator is related to the value of the input tensor of the second type of target operator, i.e., for any one of the second type of target operator, the shape of the output tensor is determined by the value of the input tensor, rather than being dependent only on the shape of the input tensor.
Optionally, the second type of target operator may be: arange, reshape, tile, broadcast_to, strided_slice, and the like, which are not limited herein; the second type of target operators in the dynamic neural network may include different operators in different application scenarios. Illustratively, the arange operator generates a sequence tensor, the reshape operator reconstructs (reshapes) a tensor, the broadcast_to operator broadcasts a tensor to a given shape, the strided_slice operator slices along each dimension of a tensor, and the tile operator tiles a tensor and in some cases may be used to stitch images. These examples of the second type of target operator should not be construed as limiting the present application.
Taking the arange operator as an example, the operator accepts two input tensors, called start and stop respectively, and generates an output tensor: a one-dimensional tensor of shape [stop - start] determined by the values of the two inputs. For example, when start = 0 and stop = 10, the shape of the generated output tensor is [10]; when start = 0 and stop = 20, it is [20]. It can be seen that for this operator, the maximum shape of the output tensor is determined by the values of the input tensors start and stop (rather than by the shapes of start and stop themselves).
Therefore, for a second-type target operator, knowing only the shape of its input tensors is not enough to derive the shape of its output tensor. In the arange(start, stop) example, the shapes of the input tensors start and stop are themselves known (each is a tensor of length [1]), but the maximum shape of the output tensor of the arange operator (which may also be called the maximum output shape) must still be inferred from the maximum value of the input tensor stop.
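The value-based bound described above can be sketched as follows; this is an illustrative assumption of how the worst-case arange output shape might be computed, given known bounds on the input values:

```python
# Hypothetical sketch: for a value-dependent (second-type) operator such
# as arange(start, stop), the maximum output shape follows from bounds on
# the *values* of its inputs, not from their shapes.

def infer_max_shape_arange(min_start, max_stop):
    # arange produces a 1-D tensor of shape [stop - start]; the worst case
    # pairs the smallest possible start with the largest possible stop.
    return (max(max_stop - min_start, 0),)
```

With min_start = 0 and max_stop = 20, the compile-time bound [20] covers every runtime case the patent's example mentions ([10] and [20]).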
It should be noted that, the number and the type of the first type target operator and the number and the type of the second type target operator in the dynamic neural network are not limited herein, and may be different according to the actual application scenario.
Regarding S102 and S103: the maximum output-tensor shape of some operators in the dynamic neural network can be obtained through S102 (for convenience, the maximum shape of the output tensor is also referred to simply as the maximum output shape). After S102, it may be determined whether there remain operators in the dynamic neural network whose maximum output shape is undetermined; in addition, the remaining operators may be analyzed based on the results of S102, so as to determine the maximum output shape of second-type target operators that are associated with first-type target operators.
And S104, allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
When the method and the device are applied, operators in the dynamic neural network can be divided into first-type and second-type target operators, and the maximum output-tensor shapes of both types are obtained. Based on operator semantics, the maximum memory that the corresponding output tensors can occupy can then be calculated, and memory space is allocated for the first-type and second-type target operators accordingly.
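A minimal sketch of this compile-time allocation step might look as follows; the dtype table, the 64-byte alignment, and the simple sequential arena layout are all assumptions for illustration, not details from the patent:

```python
import math

# Assumed element sizes; a real compiler would take these from the graph.
DTYPE_BYTES = {"float32": 4, "float16": 2, "int32": 4, "int8": 1}

def max_buffer_bytes(max_shape, dtype="float32", alignment=64):
    # Worst-case buffer size for one operator's output tensor, rounded up
    # to an assumed platform alignment of 64 bytes.
    raw = math.prod(max_shape) * DTYPE_BYTES[dtype]
    return ((raw + alignment - 1) // alignment) * alignment

def allocate_plan(max_output_shapes, dtype="float32"):
    # Assign each operator a static offset in a single arena at compile
    # time, so no allocation is needed during inference.
    plan, offset = {}, 0
    for op_name, shape in max_output_shapes.items():
        plan[op_name] = offset
        offset += max_buffer_bytes(shape, dtype)
    return plan, offset  # per-operator offsets, total arena size
```

Because every size is a worst-case bound, the plan is valid for any runtime input within the declared maximum shapes.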
By applying the embodiment of the application, the basis of memory allocation can be obtained in advance at the compiling stage of the dynamic neural network, so that the memory allocation can be supported in advance, namely, the memory allocation can be supported at the compiling stage without calculation and allocation at the execution stage, therefore, compared with the prior art, the execution time of the dynamic neural network can be effectively shortened, and the execution speed of the dynamic neural network is improved.
In summary, an embodiment of the present application provides a compiling method for a dynamic neural network, where the method includes: in the compiling stage of the dynamic neural network, acquiring the maximum shape of an input tensor in the dynamic neural network; determining the maximum shape of the output tensor of a first class of target operator in the dynamic neural network based on the maximum shape of the input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor; determining the maximum shape of the output tensor of a second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of a first type of target operator in the dynamic neural network, wherein the shape of the output tensor of the second type of target operator is related to the value of the input tensor; according to the maximum shape of the output tensor of the first class of target operators and the maximum shape of the output tensor of the second class of target operators in the dynamic neural network, memory space is allocated to each operator in the dynamic neural network.
Fig. 2 is a flowchart illustrating another dynamic neural network compiling method according to an embodiment of the present disclosure. After S102, since the maximum shape of the output tensor of the first class of target operator in the dynamic neural network can be determined, based on this, it can be further determined (detected) whether there is an operator in the dynamic neural network that fails to determine the maximum shape of the output tensor, and if so, in some embodiments, the inputs of these operators can be subjected to source analysis (for example, S201 is performed), and based on the analysis result, it can be known whether to continue to determine the maximum output shape of these operators based on semantic analysis.
As an embodiment, in step S103, determining the maximum shape of the output tensor of the second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator in the dynamic neural network may include: S201-S203.
S201, determining whether the numerical value of the input tensor of the second type of target operator in the dynamic neural network is derived from the shape of the output tensor of the first type of target operator.
The value of the input tensor of a second-type target operator may be obtained from the output of a first-type target operator; whether the input tensor of the second-type target operator depends on the output tensor of the first-type target operator can be determined by performing source analysis on the input of the second-type target operator.
It will be appreciated that in a dynamic neural network, there may be some correlation between operators, for example, in some embodiments, the value of the input tensor of the second type of target operator may be derived from the shape of the output tensor of the first type of target operator, in which case, if the maximum shape of the output tensor of the second type of target operator is to be determined, the maximum shape of the output tensor of the first type of target operator is required.
And S202, if so, acquiring the maximum numerical value of the input tensor of the second type of target operator according to the maximum shape of the output tensor of the first type of target operator.
And S203, determining the maximum shape of the output tensor of the second type target operator according to the maximum numerical value of the input tensor of the second type target operator.
The method includes the steps that whether the numerical value of an input tensor of each second type of target operator in the dynamic neural network is derived from the shape of an output tensor of the first type of target operator or not can be determined, if yes, the maximum shape of the output tensor of the first type of target operator can be obtained, and accordingly the maximum numerical value of the input tensor of the second type of target operator can be obtained; it will be appreciated that after the maximum value of the input tensor of the second type of target operator is obtained, the maximum shape of the output tensor of the second type of target operator can be further determined.
For example, a dynamic neural network includes partial codes of the following two operators:
%A=shape_of(%x);
%B=arange(0,%A);
where shape_of belongs to the first type of target operator and arange belongs to the second type. Semantic analysis of these operators shows that, for the arange operator, the maximum value of %A must be known before the maximum shape of the output tensor of arange can be calculated.
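The two-operator example above can be traced through a small sketch of the S201–S203 source analysis; the function names are hypothetical, and for simplicity %A is treated as a single scalar dimension:

```python
# Hypothetical sketch of the value-source analysis: when the *value* of
# arange's input comes from shape_of, its maximum value equals the maximum
# shape already inferred for the producing operator.

def max_value_from_shape_of(max_shape_of_x):
    # %A = shape_of(%x): the value of %A is the shape of %x, so the
    # maximum value of %A is exactly the maximum shape of %x.
    return list(max_shape_of_x)

def max_output_shape_of_arange(start, max_a):
    # %B = arange(start, %A): a 1-D output bounded by the maximum value
    # of %A (assumed scalar here).
    return (max(max_a - start, 0),)
```

For instance, if %x has maximum shape [32], the analysis gives %A a maximum value of 32, and arange(0, %A) a maximum output shape of [32].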
It should be noted that, in some embodiments, if the maximum shape of the output tensor of the second type of target operator does not depend on the maximum shape of the output tensor of the previous operator (for example, the foregoing first type of target operator), then the maximum shape of the output tensor of the second type of target operator may also be determined according to the calibration data set, where the specific process may refer to the relevant content of S402 below, and is not described herein again.
Through the implementation mode, the source analysis can be performed on the input of the second type of target operator, so that when the input numerical value of the second type of target operator is determined to be derived from the output of the first type of target operator, the maximum shape of the output tensor of the second type of target operator can be determined by combining operator semantics, a wider application range can be supported, the applicability is stronger, and efficient processing is facilitated under the condition of facing different types of operators.
As an implementation manner, the compiling method of the dynamic neural network provided by the embodiment of the present application may be performed through multiple compiling iterations.
Fig. 3 is a flowchart illustrating a compiling method of a dynamic neural network according to an embodiment of the present disclosure. After S102, in S103, determining the maximum shape of the output tensor of the second type of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator in the dynamic neural network may include:
S205, determining, according to the maximum shape of the output tensor of the first-type target operator in the dynamic neural network, whether there remains an operator in the dynamic neural network whose maximum output-tensor shape is undetermined.
If it is determined in S205 that no operator remains whose maximum output-tensor shape is undetermined, the maximum output shapes of all operators in the dynamic neural network have been determined, and memory allocation may be performed for each operator. If there is currently a remaining operator whose maximum output shape is undetermined, S206 is performed.
And S206, if the residual operator with the maximum shape of the output tensor undetermined exists and the residual operator comprises the second type of target operator, performing numerical source analysis on the input tensor on the second type of target operator in the residual operator.
Regarding the numerical source analysis of S206, referring to the description related to the foregoing S201, by analyzing the numerical sources of the input tensors of the second type target operators among the remaining operators, if the numerical values of the input tensors of these operators are derived from the foregoing first type target operators for which the maximum shape of the output tensor has been determined, the processing can be performed according to the ideas of the foregoing S201 to S203.
And S207, if the numerical value of the input tensor of the second type target operator in the residual operators is derived from the previous operator of the determined maximum shape of the output tensor, determining the maximum numerical value of the input tensor and the maximum shape of the output tensor for the second type target operator in the residual operators according to the previous operator of the determined maximum shape of the output tensor.
In some scenarios, the preceding operator in S207 may be a first type of target operator whose maximum output-tensor shape has currently been determined, or a second type of target operator whose maximum output-tensor shape has currently been determined. Determining the maximum value of the input tensor and the maximum shape of the output tensor for the second type of target operator among the remaining operators according to the preceding operator includes: determining the maximum value of the input tensor of the second type of target operator corresponding to the preceding operator according to the maximum shape of the output tensor of the preceding operator, and then determining the maximum shape of the output tensor of that second type of target operator according to the maximum value of its input tensor. This process can be realized based on operator semantic analysis and numerical source analysis; the specific details of the semantic analysis and of the numerical source analysis of the input tensor should not be understood as limitations of the present application.
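The shape-to-value propagation described in S207 can be illustrated with a minimal sketch. This is not the patented implementation; the `shape` and `arange` semantics and the helper names are assumptions chosen purely for illustration:

```python
# Illustrative only: a `shape` operator (first type) outputs the
# dimensions of its input, so the maximum *value* of its output tensor
# equals the maximum shape of that input.
def max_value_from_shape_op(producer_max_shape):
    return list(producer_max_shape)

# arange(0, stop) (second type) produces `stop` elements, so its maximum
# output shape follows from the maximum value of its input.
def max_output_shape_arange(max_stop_value):
    return [max_stop_value]

# Example: %s = shape(%x) with the maximum shape of %x being [8, 128];
# %d = arange(0, %s[1]) then has maximum output shape [128].
max_vals = max_value_from_shape_op([8, 128])
print(max_output_shape_arange(max_vals[1]))  # -> [128]
```

The same pattern applies when the preceding operator is itself a second type of target operator whose maximum output-tensor shape was determined in an earlier iteration.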
S207 may be followed by S208-S209.
S208, determining whether a remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network.
If it is determined in S208 that no such remaining operator exists, the maximum shapes of the output tensors of all the operators in the dynamic neural network have been determined, and memory allocation may be performed for each operator; if a remaining operator whose maximum output-tensor shape is not determined still exists, S209 is performed.
S209, if yes, jumping back to S102 when it is determined that a next round of the iterative calculation flow is currently needed, so as to iteratively determine the maximum shape of the output tensor of each remaining operator in the dynamic neural network.
In S209, if it is determined that the maximum shape of the output tensor of a new target operator was obtained in the current round of the calculation flow (including S102 and S205-S208) based on operator semantic analysis, it is determined that a next round of the iterative calculation flow is currently needed, and execution may jump back to S102 and S205, so that the maximum shape of the output tensor of each remaining operator in the dynamic neural network can be determined iteratively in the next round according to the maximum shape of the output tensor of the new target operator.
It can be understood that the descriptions of S206 and S207 may refer to the contents of S201 to S203. The difference is that, in the embodiment of the present application, the maximum shape of the output tensor of each operator can be obtained through multiple compiling iterations, where in some application scenarios each iteration may yield the maximum output-tensor shapes of only some of the operators in the dynamic neural network.
Optionally, in each iteration, it may be detected whether a remaining operator whose maximum output-tensor shape is not determined exists in the dynamic neural network. For convenience of distinction and description, if such operators are found after the maximum shapes of the output tensors of the first type of target operators have been calculated, they may be referred to as first remaining operators; if such operators are found after the maximum shapes of the output tensors of the second type of target operators have been calculated, they may be referred to as second remaining operators, which is not limited herein. If a remaining operator whose maximum output-tensor shape is not determined exists, the maximum shapes of the output tensors of the first remaining operators and/or the second remaining operators can be determined in the present iteration according to the maximum shapes of the output tensors of the known (preceding) operators, based on S206 and S207 above. It should be noted that the embodiment of the present application does not limit the specific number of iterations; the flow may iterate 3 times, 5 times, or more according to the actual application scenario, until no new result is generated (that is, until no maximum output shape of a new target operator can be obtained).
It can be understood that, for a given operator, if the maximum shape of its output tensor is obtained in the current compiling iteration, it does not need to be determined again in the next one. If it cannot be obtained in the current iteration, for example because the value of the operator's input tensor is derived from the output tensors of one or more operators whose results only become available during the current iteration, the required result cannot be obtained in a single pass; the maximum shape of the operator's output tensor can instead be determined in the next compiling iteration based on the results already obtained. That is, it is obtained through multiple compiling iterations.
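The multi-pass behavior described above amounts to a fixpoint iteration: keep sweeping over the operators until a sweep produces no new result. A hedged sketch follows; the operator model and the single inference rule are illustrative assumptions, not the patent's algorithm:

```python
from collections import namedtuple

# Minimal operator model: a name, a kind, input tensor names, one output.
Op = namedtuple("Op", ["name", "kind", "inputs", "output"])

def infer_all_max_shapes(operators, input_max_shapes, try_infer):
    known = dict(input_max_shapes)        # tensor name -> max shape
    changed = True
    while changed:                        # one compiling iteration per pass
        changed = False
        for op in operators:
            if op.output in known:
                continue                  # determined in an earlier pass
            shape = try_infer(op, known)
            if shape is not None:         # a new result in this iteration
                known[op.output] = shape
                changed = True
    return known                          # still-missing operators need S302

# Illustrative rule: an elementwise op keeps its input's maximum shape.
def elementwise_rule(op, known):
    return known.get(op.inputs[0])

ops = [Op("b", "elementwise", ["a"], "b"),   # listed before its producer,
       Op("a", "elementwise", ["x"], "a")]   # so a second pass is needed
print(infer_all_max_shapes(ops, {"x": [4, 128]}, elementwise_rule))
```

Because operator `b` consumes the output of operator `a`, the first pass only resolves `a`; the second pass resolves `b`, and the third pass produces no new result and terminates the loop, matching the "iterate until no new result" criterion above.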
Fig. 4 is a flowchart illustrating a compiling method of a dynamic neural network in an application scenario according to an embodiment of the present application. As shown in fig. 4, the above method may be implemented in the following manner. Optionally, the execution subject of the method may be a graph compiler, and the maximum shape of the input tensor in the dynamic neural network may be supplied to the graph compiler as a compiling parameter (S210).
In the first iteration, the graph compiler calculates the maximum shapes of the output tensors of the first type of target operators in the dynamic neural network (S211); in this first iteration, the maximum output-tensor shapes of possibly only some of the first type of target operators can be calculated (the maximum output-tensor shapes of the remaining first type of target operators may be obtained through subsequent compiling iterations or through statistics after the compensation calculation). Based on the result of S211, it can be determined whether a first remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network (S212). If the determination in S212 is negative, the maximum shapes of the output tensors of all the operators in the dynamic neural network have already been determined, and the calculation flow may end (S216).
If it is determined in S212 that the first remaining operators include a second type of target operator and the value of its input tensor is derived from a first type of target operator whose maximum output-tensor shape has been determined, the maximum value of the input tensor of that second type of target operator can be obtained from the determined maximum output-tensor shape of the first type of target operator, and the maximum shape of its output tensor is then determined from that maximum value (S213). At this point, it may be determined again whether a second remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network (S214); a second remaining operator may still be a first type and/or a second type of target operator, which is not limited herein.
If no second remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network, the maximum shapes of the output tensors of all the operators have been determined through the foregoing steps, and the calculation flow may end (S216). Otherwise, analysis and calculation for the second remaining operators must continue. For example, when it is determined that second remaining operators still exist, it may be analyzed whether the maximum shape of the output tensor of a new first type and/or second type of target operator was calculated in the current iteration (S215); if so, the next iteration may be performed based on those newly calculated maximum shapes (that is, execution jumps back to S211 for the loop iteration). If no new maximum output-tensor shape of a first type and/or second type of target operator was calculated in the current iteration (that is, no new result was obtained through S211 and S213), no further result can be obtained through operator semantic analysis and analysis of the operators' input sources.
It should be noted that, if no maximum output-tensor shape of a new first type and/or second type of target operator was calculated in the current iteration, operators whose maximum output-tensor shapes still need to be determined may nevertheless remain in the dynamic neural network and cannot be calculated at present. Optionally, in this case, further compensation calculation may be performed in steps S301 and S302 described below.
Fig. 5 is a flowchart illustrating a compiling method of a dynamic neural network according to an embodiment of the present disclosure. As shown in fig. 5, in S104, allocating a memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network may include: S301-S303.
S301, determining, according to the first type of target operators whose maximum output-tensor shapes have been determined and the second type of target operators whose maximum output-tensor shapes have been determined, whether a remaining operator whose maximum output-tensor shape is not determined exists in the dynamic neural network.
Considering that the dynamic neural network may further include remaining operators that need to rely on the input data to determine the maximum shapes of their output tensors, it may be determined, based on the above embodiments, whether such remaining operators exist. If not, the maximum shapes of the output tensors of all the operators are already available, and memory space may be allocated to each operator accordingly.
It should be noted that, in an actual application scenario, a remaining operator may be a first type of target operator or a second type of target operator; the relationship between the remaining operators and the two types of target operators is not limited herein.
In some embodiments, in particular in S301, the remaining operators whose maximum output-tensor shapes are not determined may be screened out from all the target operators according to the identifiers of the first type of target operators whose maximum output-tensor shapes have been determined, the identifiers of the second type of target operators whose maximum output-tensor shapes have been determined, and the identifiers of all the target operators in the dynamic neural network. The present application does not limit the form of the identifiers, as long as operators can be distinguished.
S302, if a remaining operator whose maximum output-tensor shape is not determined exists, determining the maximum shape of the output tensor of the remaining operator according to a preset calibration algorithm.
If such operators exist, some operators in the dynamic neural network have not yet had the maximum shapes of their output tensors determined; these are the remaining operators.
Optionally, the preset calibration algorithm may include a calibration data set containing a plurality of calibration samples; when the determination of S302 is performed, the maximum shape of the output tensor of each remaining operator may be determined using these calibration samples.
For example, a dynamic neural network includes the following code:
%D=arange(0,%x);
In this code, the maximum shape of the output tensor of the arange operator must be determined from the value of %x. If the value of %x cannot be obtained through theoretical calculation, the maximum shape of the output tensor of this arange operator can instead be determined using the preset calibration algorithm based on the calibration data set.
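A minimal sketch of this calibration fallback for the arange case, assuming each calibration sample supplies one concrete value of %x (the helper name is illustrative, not part of the patent):

```python
# Run the value-dependent operator on every calibration sample and keep
# the largest observed output shape as the maximum shape.
def calibrate_arange_max_shape(calibration_samples):
    max_len = 0
    for x in calibration_samples:        # each sample supplies a value of %x
        out_len = len(range(0, x))       # length of arange(0, %x) for this sample
        max_len = max(max_len, out_len)
    return [max_len]

print(calibrate_arange_max_shape([3, 17, 9]))  # -> [17]
```

The result is only an empirical bound: it is valid to the extent that the calibration samples cover the values of %x seen at run time.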
S303, allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first type of target operator, the maximum shape of the output tensor of the second type of target operator, and the maximum shapes of the output tensors of the remaining operators in the dynamic neural network.
Based on the above description, after the maximum shapes of the output tensors of the first type of target operators, the second type of target operators, and the remaining operators are obtained, memory space may be allocated to each operator accordingly. The basis for memory allocation is thus obtained in advance, which makes it convenient to complete memory allocation in advance, for example at the compiling stage of the dynamic neural network. S301 to S303 may be understood as a further supplement to the above embodiments S101 to S104.
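Given the collected maximum shapes, the compile-time allocation can be sketched as below. This is a simplified arena allocator under assumed dtype sizes; a production compiler would additionally reuse buffers whose lifetimes do not overlap:

```python
import math

# Assumed element sizes for illustration.
DTYPE_BYTES = {"float32": 4, "float16": 2, "int32": 4}

def buffer_size_bytes(max_shape, dtype):
    # The maximum shape bounds the output at run time, so a buffer of
    # this size never needs reallocation whatever the actual input is.
    return math.prod(max_shape) * DTYPE_BYTES[dtype]

def allocate_plan(max_shapes, dtypes):
    # Assign each operator output a fixed (offset, size) in one arena.
    plan, offset = {}, 0
    for name, shape in max_shapes.items():
        plan[name] = (offset, buffer_size_bytes(shape, dtypes[name]))
        offset += plan[name][1]
    return plan, offset                  # offset is the total arena size

plan, total = allocate_plan({"a": [2, 3], "b": [4]},
                            {"a": "float32", "b": "int32"})
print(plan, total)  # -> {'a': (0, 24), 'b': (24, 16)} 40
```

Because every offset and size is fixed at compile time, no shape-dependent allocation remains for the run-time stage.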
Optionally, in the step S302, determining the maximum shape of the output tensor of the remaining operator according to the preset calibration algorithm may include: S401-S403.
S401, determining all target operators in the dynamic neural network.
The dynamic neural network may be analyzed or decomposed to obtain an operator list, which may include all the target operators in the dynamic neural network. It should be noted that the application does not limit the types and number of target operators in the dynamic neural network, which may vary according to the actual application scenario.
S402, calculating the shape of the output tensor of each target operator in all the target operators based on all the calibration samples in the calibration data set, wherein the calibration data set comprises a plurality of calibration samples.
The samples in the calibration data set may serve as input data samples for the target operators. For each calibration sample, the shape of the output tensor of each target operator may be calculated. It should be noted that, across different calibration samples, the same target operator may produce multiple output-tensor shapes, which may be the same or different and are not limited herein. In other words, by running different input samples, the different output shapes that a single operator may produce for different inputs can be collected statistically.
It should be noted that, the present application does not limit the number and the type of the calibration samples in the calibration data set, and the number and the type may be different according to an actual application scenario, and different application fields may correspond to different types of calibration samples, for example, the calibration sample may specifically be an image calibration sample, a text calibration sample, a voice calibration sample, and the like.
S403, determining the maximum shapes of the output tensors of the remaining operators in the dynamic neural network according to the shapes of the output tensors of the target operators.
After the shape of the output tensor of each target operator is calculated and counted for each calibration sample, the maximum shape of the output tensors of the rest operators in the dynamic neural network can be determined according to the shape.
Optionally, as an implementation manner of S403, S403 may include: s501 and S502.
S501, acquiring the maximum shape of the output tensor corresponding to each target operator according to the recorded shapes of the output tensor of that operator.
Based on the foregoing description, it can be understood that, across different calibration samples, the same target operator may produce multiple output-tensor shapes. Therefore, the various output-tensor shapes that each target operator may produce for different input samples can be obtained through calculation and statistics, and the maximum output-tensor shape of each target operator can be screened out from them; that is, by applying the embodiment of the present application, the maximum shapes of the output tensors of all the target operators in the dynamic neural network can be obtained.
Based on the above description, it should be further noted that, if a calibration data set is preset in the actual application scenario, the maximum shape of the output tensor of each target operator in the dynamic neural network may also be obtained directly through steps S401, S402, and S501.
S502, determining the maximum shapes of the output tensors of the remaining operators in the dynamic neural network according to the maximum output-tensor shapes corresponding to the target operators respectively.
After the maximum output-tensor shapes corresponding to the target operators are obtained, the maximum shapes of the output tensors of the remaining operators (the remaining operators determined in S301) in the dynamic neural network can be screened out from them. Furthermore, as steps S101 to S104 show, the present application can still support most scenarios without introducing a calibration data set.
Optionally, in the step S502, determining the maximum shapes of the output tensors of the remaining operators in the dynamic neural network according to the maximum shapes of the output tensors corresponding to the target operators respectively includes:
S601, screening out the identifiers of the remaining operators according to the identifiers of all the target operators, the identifiers of the first type of target operators whose maximum output-tensor shapes have been determined, and the identifiers of the second type of target operators whose maximum output-tensor shapes have been determined.
That is, the identifiers of the first type of target operators whose maximum output-tensor shapes have been determined and the identifiers of the second type of target operators whose maximum output-tensor shapes have been determined are obtained, and the identifiers of the remaining operators, whose maximum output-tensor shapes are not determined, are screened out from the identifiers of all the target operators according to these two sets of identifiers.
S602, screening out, according to the identifiers of the remaining operators, the maximum shapes of the output tensors of the remaining operators from the maximum output-tensor shapes corresponding to the target operators respectively.
After the identifiers of the remaining operators are determined, the maximum shapes of the output tensors of the remaining operators can be screened out, according to those identifiers, from the maximum output-tensor shapes of all the target operators.
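S601 and S602 can be sketched as a set difference followed by a dictionary lookup; identifiers are assumed here to be plain strings, which the patent does not require:

```python
def remaining_max_shapes(all_ids, first_type_done, second_type_done,
                         max_shape_by_id):
    # S601: remaining = all target operators minus the two determined sets.
    remaining_ids = set(all_ids) - set(first_type_done) - set(second_type_done)
    # S602: pick those operators' maximum shapes out of the calibration
    # statistics gathered for all target operators.
    return {op_id: max_shape_by_id[op_id] for op_id in remaining_ids}

print(remaining_max_shapes(["a", "b", "c"], ["a"], ["b"],
                           {"a": [1], "b": [2], "c": [3]}))  # -> {'c': [3]}
```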
Fig. 6 is a partial flowchart of a compiling method of a dynamic neural network in another application scenario provided in the embodiment of the present application. As shown in fig. 6, the dynamic neural network may be decomposed into an operator list, and the shape calculation of each operator in the operator list may be performed for each calibration sample in the calibration data set, which includes: taking each calibration sample in the calibration data set as input, calculating and recording the shape of the output tensor of the first target operator in the operator list (S311); then, based on the result of the previous step (that is, taking the result of the previous operator as the input of the next operator), continuing to calculate and record the shape of the output tensor of the next target operator (S312). This process loops until the shape of the output tensor of every target operator in the operator list has been calculated for every calibration sample (S313). At this point, for each target operator, the maximum shape parameter can be screened from the corresponding records as the maximum shape parameter of the output tensor of that operator (that is, the maximum output shapes corresponding to the remaining operators are obtained statistically from the records, corresponding to S314).
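The S311 to S314 loop can be sketched as follows. Here each calibration sample is abstracted to its input shape, `run_shape` is a hypothetical per-operator shape function, and "maximum shape" is interpreted as the dimension-wise maximum over the records; these are illustrative assumptions, not the patent's definitions:

```python
class Op:
    def __init__(self, name, run_shape):
        self.name, self.run_shape = name, run_shape

def calibrate_shapes(operator_list, calibration_samples):
    records = {op.name: [] for op in operator_list}
    for sample_shape in calibration_samples:
        shape = list(sample_shape)
        for op in operator_list:          # S311-S313: propagate through the list,
            shape = op.run_shape(shape)   # feeding each result to the next operator
            records[op.name].append(shape)
    # S314: screen the maximum from each operator's record.
    return {name: [max(dims) for dims in zip(*shapes)]
            for name, shapes in records.items()}

pad = Op("pad", lambda s: [d + 1 for d in s])   # hypothetical pad-by-one operator
print(calibrate_shapes([pad], [[2, 3], [5, 1]]))  # -> {'pad': [6, 4]}
```

Sample [2, 3] yields shape [3, 4] and sample [5, 1] yields [6, 2], so the recorded maximum for this operator is [6, 4].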
Optionally, the method further includes: acquiring the identifiers of all the target operators in the dynamic neural network; and determining the first type of target operators and the second type of target operators in the dynamic neural network according to the identifiers of all the target operators and a preset mapping table, where the preset mapping table includes the identifiers of all the operators of the first type and the identifiers of all the operators of the second type.
Optionally, the execution subject of the compiling method of the present application may store the preset mapping table in advance; of course, the preset mapping table may also be pre-stored in another storage medium and acquired when needed.
After the identifiers of all the target operators in the dynamic neural network are obtained, they can be compared with the preset mapping table. If the identifier of a target operator is the same as the identifier of an operator among the first type of operators in the preset mapping table, the target operator can be regarded as a first type of target operator; if its identifier is the same as the identifier of an operator among the second type of operators in the preset mapping table, it can be regarded as a second type of target operator. By applying the embodiment of the present application, the target operators in the dynamic neural network can be divided quickly, which improves the compiling speed.
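The mapping-table lookup can be sketched as below; the operator names placed in the table are purely illustrative assumptions (the patent does not enumerate the table's contents):

```python
# Hypothetical preset mapping table: identifiers of shape-dependent
# (first type) and value-dependent (second type) operators.
PRESET_MAPPING_TABLE = {
    "first_type": {"broadcast", "pad", "slice_by_shape"},
    "second_type": {"arange", "reshape_by_value", "tile_by_value"},
}

def classify(op_ids, table=PRESET_MAPPING_TABLE):
    # Compare each target operator's identifier against the two sets.
    first = [i for i in op_ids if i in table["first_type"]]
    second = [i for i in op_ids if i in table["second_type"]]
    return first, second

print(classify(["pad", "arange", "conv"]))  # -> (['pad'], ['arange'])
```

An operator matching neither set (such as "conv" above) is not a target operator of either type and is left to the other handling described in the embodiments.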
Fig. 7 is a functional block diagram of a compiling apparatus for a dynamic neural network according to an embodiment of the present application, the basic principle and the generated technical effect of the apparatus are the same as those of the corresponding method embodiment, and for a brief description, the corresponding contents in the method embodiment may be referred to for the parts not mentioned in this embodiment. As shown in fig. 7, the compiling apparatus 100 includes:
an obtaining module 110, configured to obtain, at a compiling stage of the dynamic neural network, a maximum shape of an input tensor in the dynamic neural network;
a first determining module 120, configured to determine a maximum shape of an output tensor of a first class of target operator in the dynamic neural network based on a maximum shape of an input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor;
a second determining module 130, configured to determine a maximum shape of an output tensor of a second type of target operator in the dynamic neural network according to a maximum shape of an output tensor of a first type of target operator in the dynamic neural network, where the shape of the output tensor of the second type of target operator is related to a value of an input tensor;
the allocating module 140 is configured to allocate a memory space to each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator in the dynamic neural network and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
In an optional embodiment, the second determining module 130 is specifically configured to: determining whether a value of an input tensor of a second class of target operator in the dynamic neural network is derived from a shape of an output tensor of the first class of target operator; if so, acquiring the maximum numerical value of the input tensor of the second type of target operator according to the maximum shape of the output tensor of the first type of target operator; and determining the maximum shape of the output tensor of the second type target operator according to the maximum numerical value of the input tensor of the second type target operator.
In an optional embodiment, the allocating module 140 is specifically configured to: determining whether there is a remaining operator in the dynamic neural network for which the maximum shape of the output tensor is not determined, based on the first class of target operator for which the maximum shape of the output tensor has been determined and the second class of target operator for which the maximum shape of the output tensor has been determined; if so, determining the maximum shape of the output tensor of the residual operator according to a preset calibration algorithm; and allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator, the maximum shape of the output tensor of the second class of target operator and the maximum shape of the output tensors of the rest operators in the dynamic neural network.
In an optional embodiment, the allocating module 140 is specifically configured to: determining all target operators in the dynamic neural network; calculating a shape of an output tensor of each of the all target operators based on each calibration sample in a calibration data set, the calibration data set including a plurality of calibration samples; and determining the maximum shape of the output tensors of the rest operators in the dynamic neural network according to the shape of the output tensor of each target operator.
In an optional embodiment, the allocating module 140 is specifically configured to: acquiring the maximum shape of the output tensor corresponding to each target operator according to the shape of the output tensor of each target operator; and determining the maximum shapes of the output tensors of the rest operators in the dynamic neural network according to the maximum shapes of the output tensors corresponding to the target operators respectively.
In an optional embodiment, the allocating module 140 is specifically configured to: screening and determining the identifiers of the rest operators according to the identifiers of the target operators, the identifiers of the first type of target operators with the determined maximum shapes of the output tensors and the identifiers of the second type of target operators with the determined maximum shapes of the output tensors; and screening the maximum shape of the output tensor of the residual operator from the maximum shapes of the output tensors respectively corresponding to the target operators according to the identification of the residual operator.
In an optional embodiment, the second determining module 130 is specifically configured to: determine, according to the maximum shape of the output tensor of the first type of target operator in the dynamic neural network, whether a first remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network; if yes, and the first remaining operators include a second type of target operator, perform numerical source analysis on the input tensor of the second type of target operator among the first remaining operators; if the value of that input tensor is derived from a preceding operator whose maximum output-tensor shape has been determined, determine the maximum value of the input tensor and the maximum shape of the output tensor for that operator according to the preceding operator; determine whether a second remaining operator whose maximum output-tensor shape is not determined currently exists in the dynamic neural network; and determine whether a next round of the iterative calculation flow is currently needed. When it is determined that a next round is needed, the first determining module 120 and the second determining module 130 iteratively determine the maximum shape of the output tensor of each remaining operator in the dynamic neural network.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
These modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more microprocessors, or one or more Field Programmable Gate Arrays (FPGAs). Alternatively, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or another processor capable of calling program code. These modules may also be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, where the compiling apparatus and the electronic device may be integrated, for example, the compiling apparatus may be deployed in the electronic device. As shown in fig. 8, the electronic device may include: a processor 310, a storage medium 320 and a bus 330, wherein the storage medium 320 stores machine-readable instructions executable by the processor 310, and when the electronic device is operated, the processor 310 communicates with the storage medium 320 via the bus 330, and the processor 310 executes the machine-readable instructions to perform the steps of the above-mentioned method embodiments. The specific implementation and technical effects are similar, and are not described herein again. Of course, the electronic device may also include more functional components, which are not limited herein.
Optionally, the present application further provides a storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above method embodiments are performed. The specific implementation and technical effects are similar and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical division, and an actual implementation may use another division; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed.
Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is noted that, in this document, relational terms such as "first" and "second", and the like, are used solely to distinguish one object or action from another object or action without necessarily requiring or implying any actual such relationship or order between such objects or actions.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method for compiling a dynamic neural network, the method comprising:
in the compiling stage of the dynamic neural network, acquiring the maximum shape of an input tensor in the dynamic neural network;
determining a maximum shape of an output tensor of a first class of target operator in the dynamic neural network based on a maximum shape of an input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor;
determining a maximum shape of an output tensor of a second class of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator in the dynamic neural network, wherein the shape of the output tensor of the second class of target operator is related to a numerical value of an input tensor;
and allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
2. The method of claim 1, wherein determining the maximum shape of the output tensor for the second class of target operator in the dynamic neural network from the maximum shape of the output tensor for the first class of target operator in the dynamic neural network comprises:
determining whether a value of an input tensor of a second class of target operator in the dynamic neural network is derived from a shape of an output tensor of the first class of target operator;
if so, acquiring a maximum numerical value of the input tensor of the second class of target operator according to the maximum shape of the output tensor of the first class of target operator;
and determining the maximum shape of the output tensor of the second class of target operator according to the maximum numerical value of the input tensor of the second class of target operator.
3. The method according to claim 1 or 2, wherein the allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator comprises:
determining whether there is a remaining operator in the dynamic neural network for which the maximum shape of the output tensor is not determined, based on the first class of target operator for which the maximum shape of the output tensor has been determined and the second class of target operator for which the maximum shape of the output tensor has been determined;
if so, determining the maximum shape of the output tensor of the remaining operator according to a preset calibration algorithm;
and allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator, the maximum shape of the output tensor of the second class of target operator, and the maximum shape of the output tensor of the remaining operator in the dynamic neural network.
4. The method of claim 3, wherein the determining the maximum shape of the output tensor of the remaining operator according to a preset calibration algorithm comprises:
determining all target operators in the dynamic neural network;
calculating a shape of an output tensor of each of the target operators based on each calibration sample in a calibration data set, the calibration data set comprising a plurality of calibration samples;
and determining the maximum shape of the output tensors of the remaining operators in the dynamic neural network according to the shape of the output tensor of each target operator.
5. The method of claim 4, wherein the determining the maximum shape of the output tensors of the remaining operators in the dynamic neural network according to the shape of the output tensor of each of the target operators comprises:
acquiring the maximum shape of the output tensor corresponding to each target operator according to the shape of the output tensor of each target operator;
and determining the maximum shapes of the output tensors of the remaining operators in the dynamic neural network according to the maximum shapes of the output tensors corresponding to the target operators respectively.
6. The method of claim 5, wherein determining the maximum shape of the output tensors of the remaining operators in the dynamic neural network according to the maximum shape of the output tensor corresponding to each of the target operators comprises:
determining, by screening, the identifiers of the remaining operators according to the identifiers of the target operators, the identifiers of the first class of target operator for which the maximum shape of the output tensor has been determined, and the identifiers of the second class of target operator for which the maximum shape of the output tensor has been determined;
and screening out the maximum shapes of the output tensors of the remaining operators from the maximum shapes of the output tensors corresponding to the respective target operators according to the identifiers of the remaining operators.
7. The method of claim 1, wherein determining the maximum shape of the output tensor of the second class of target operator in the dynamic neural network from the maximum shape of the output tensor of the first class of target operator in the dynamic neural network comprises:
determining, according to the maximum shape of the output tensor of the first class of target operator in the dynamic neural network, whether a first remaining operator for which the maximum shape of the output tensor has not been determined currently exists in the dynamic neural network;
if so, performing numerical-source analysis of the input tensor on the second class of target operator in the first remaining operator, wherein the first remaining operator comprises a second class of target operator;
if it is determined that the numerical value of the input tensor of the second class of target operator in the first remaining operator is derived from a forward operator for which the maximum shape of the output tensor has been determined, determining, according to the forward operator, the maximum numerical value of the input tensor and the maximum shape of the output tensor for the second class of target operator in the first remaining operator;
determining whether a second remaining operator for which the maximum shape of the output tensor has not been determined currently exists in the dynamic neural network;
and if so, when it is determined that a next round of the iterative computation flow is currently required, skipping to the step of determining the maximum shape of the output tensor of the first class of target operator in the dynamic neural network and the step of determining, according to the maximum shape of the output tensor of the first class of target operator in the dynamic neural network, whether a first remaining operator for which the maximum shape of the output tensor has not been determined currently exists in the dynamic neural network, so as to iteratively determine the maximum shape of the output tensor of each remaining operator in the dynamic neural network.
8. A compiling apparatus of a dynamic neural network, characterized in that the compiling apparatus comprises:
the acquisition module is used for acquiring the maximum shape of an input tensor in the dynamic neural network at the compiling stage of the dynamic neural network;
a first determining module, configured to determine a maximum shape of an output tensor of a first class of target operator in the dynamic neural network based on a maximum shape of an input tensor in the dynamic neural network, wherein the shape of the output tensor of the first class of target operator is related to the shape of the input tensor;
a second determining module, configured to determine a maximum shape of an output tensor of a second class of target operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator in the dynamic neural network, wherein the shape of the output tensor of the second class of target operator is related to a numerical value of an input tensor;
and the allocation module is used for allocating memory space for each operator in the dynamic neural network according to the maximum shape of the output tensor of the first class of target operator and the maximum shape of the output tensor of the second class of target operator in the dynamic neural network.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the method for compiling a dynamic neural network as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, performs the steps of the method for compiling a dynamic neural network according to any one of claims 1 to 7.
CN202211688051.9A 2022-12-28 2022-12-28 Compiling method and device of dynamic neural network, electronic equipment and storage medium Active CN115658331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211688051.9A CN115658331B (en) 2022-12-28 2022-12-28 Compiling method and device of dynamic neural network, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN115658331A true CN115658331A (en) 2023-01-31
CN115658331B CN115658331B (en) 2023-03-21

Family

ID=85022380


Country Status (1)

Country Link
CN (1) CN115658331B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200042856A1 (en) * 2018-07-31 2020-02-06 International Business Machines Corporation Scheduler for mapping neural networks onto an array of neural cores in an inference processing unit
CN111984400A (en) * 2020-07-17 2020-11-24 深圳云天励飞技术有限公司 Memory allocation method and device of neural network
CN114004335A (en) * 2021-10-28 2022-02-01 上海商汤科技开发有限公司 Data processing method and device, electronic equipment and storage medium
CN114611697A (en) * 2022-05-11 2022-06-10 上海登临科技有限公司 Neural network quantification and deployment method, system, electronic device and storage medium
CN115033391A (en) * 2022-08-10 2022-09-09 之江实验室 Data flow method and device for neural network calculation
CN115480743A (en) * 2021-06-16 2022-12-16 中科寒武纪科技股份有限公司 Compiling method and compiler for neural network and related product


Non-Patent Citations (2)

Title
EASHAN WADHWA: "IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators" *
MENG Qiunan; ZHOU Guofang; TENG Peng: "Research on a resource allocation method for mixed steel processing operations" *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant