CN116756589B

CN116756589B - Method, computing device and computer readable storage medium for matching operators

Info

Publication number: CN116756589B
Application number: CN202311028501.6A
Authority: CN
Inventors: Name withheld on request
Original assignee: Beijing Bilin Technology Development Co ltd; Shanghai Biren Intelligent Technology Co Ltd
Current assignee: Beijing Bilin Technology Development Co ltd; Shanghai Bi Ren Technology Co ltd
Priority date: 2023-08-16
Filing date: 2023-08-16
Publication date: 2023-11-17
Anticipated expiration: 2043-08-16
Also published as: CN116756589A

Abstract

The present disclosure relates to a method, computing device, and computer-readable storage medium for matching operators. The method comprises the following steps: acquiring parameters of a target operator to be matched; comparing the parameters of the target operator with the reference parameters of a pre-compiled type static shape reference operator; and if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and the other parameters of the target operator are the same as the corresponding other reference parameters, taking the pre-compiled type static shape reference operator as a candidate operator matched with the target operator, wherein the first type parameter is a parameter related to the shape of the operator input and output data. The technical scheme provided by the disclosure generalizes well and is transparent to users.

Description

Method, computing device and computer readable storage medium for matching operators

Technical Field

The present disclosure relates generally to the field of information processing, and in particular, to a method, computing device, and computer-readable storage medium for matching operators.

Background

In the field of information processing, a calculation unit for performing a specific information processing function is called an operator (Operator, OP). For example, when a graphics processing unit (Graphics Processing Unit, GPU) performs deep learning, a convolution operation is typically performed using a convolution operator, and a matrix multiplication operation is typically performed using a matrix multiplication operator. In information processing, a processor (e.g., a GPU) typically needs to search a set of callable operators (hereinafter referred to as reference operators) according to the parameters of the operator that is desired to be invoked (hereinafter referred to as the target operator), so as to obtain a reference operator that matches the target operator. For example, the reference operator is an ahead-of-time compilation (Ahead-of-Time Compilation, AOT) type, i.e., pre-compiled type, static shape (Static Shape) operator.

Conventional operator matching schemes use complete matching (also called exact matching): a pre-compiled type static shape reference operator is hit only when all parameters of the target operator are identical to its reference parameters. Each pre-compiled type static shape reference operator can therefore match only one specific target operator, and generalization is poor. In addition, in conventional schemes the user generally has to preprocess the data to be processed (for example, padding it with 0 to a specific shape) so that the parameters of the target operator corresponding to the preprocessed data are the same as the reference parameters of a specific reference operator in the set of reference operators, which ensures that the requirement of complete matching is met. This requires user participation and is therefore not transparent to the user.

In summary, conventional operator matching schemes have the following defects: poor generalization and lack of transparency to users.

Disclosure of Invention

In view of the above problems, the present disclosure provides a method, a computing device, and a computer-readable storage medium for matching operators; the provided technical solution generalizes well and is transparent to users.

According to a first aspect of the present disclosure, there is provided a method for matching operators, the method comprising: acquiring parameters of a target operator to be matched; comparing the parameters of the target operator with the reference parameters of a pre-compiled type static shape reference operator; and if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and the other parameters of the target operator are the same as the corresponding other reference parameters, taking the pre-compiled type static shape reference operator as a candidate operator matched with the target operator, wherein the first type parameter is a parameter related to the shape of the operator input and output data.

In some embodiments, the target operators include point-wise operation type operators, convolution operators, matrix multiplication operators, and average pooling operators; the target operators do not include averaging type operators or operators of a type that pads with non-zero values.

In some embodiments, the first type of parameter among the parameters of the target operator is determined as follows: in response to the target operator being a convolution operator, the first type of parameter includes an output channel, an input channel, an output height, an output width, a filter height, a filter width, and a batch size; or in response to the target operator being a matrix multiplication operator, the first type of parameter includes a left matrix height, a left matrix width, and a right matrix height.

In some embodiments, comparing the parameters of the target operator with the reference parameters of the pre-compiled type static shape reference operator comprises: acquiring a table storing the reference parameters of the pre-compiled type static shape reference operator; and comparing the parameters of the target operator with the reference parameters of the pre-compiled type static shape reference operator stored in the table.

In some embodiments, the method further comprises: if each of the parameters of the target operator is the same as the corresponding reference parameter, acquiring the pre-compiled type static shape reference operator as the operator matched with the target operator.

In some embodiments, the method further comprises: in response to the pre-compiled type static shape reference operator being taken as a candidate operator matched with the target operator, performing the following: calculating the calculation amount of the target operator based on the parameters of the target operator; calculating the calculation amount of the pre-compiled type static shape reference operator based on its reference parameters; calculating the effective calculation duty ratio of the target operator relative to the pre-compiled type static shape reference operator based on the calculation amount of the target operator and the calculation amount of the pre-compiled type static shape reference operator; judging whether the effective calculation duty ratio meets a constraint condition; and if the effective calculation duty ratio meets the constraint condition, acquiring the pre-compiled type static shape reference operator as the operator matched with the target operator.

In some embodiments, the method further comprises: in response to none of the pre-compiled type static shape reference operators matching the target operator, judging whether a pre-compiled type dynamic shape (Dynamic Shape) reference operator exists; and if the pre-compiled type dynamic shape reference operator exists, acquiring the pre-compiled type dynamic shape reference operator as the operator matched with the target operator.

In some embodiments, the method further comprises: if no pre-compiled type dynamic shape reference operator exists, obtaining an operator matched with the target operator by just-in-time compilation (Just-In-Time Compilation, JIT).

In some embodiments, the method further comprises: reading the data to be processed based on a boundary check; invoking the pre-compiled type static shape reference operator matched with the target operator to process the read data; and storing the processed data based on the boundary check.

According to a second aspect of the present disclosure, there is also provided a computing device comprising: at least one processor; and at least one memory coupled to the at least one processor and storing instructions for execution by the at least one processor, the instructions when executed by the at least one processor cause the computing device to perform the method according to the first aspect of the present disclosure.

According to a third aspect of the present disclosure, there is also provided a computer readable storage medium having stored thereon computer program code which, when executed, performs the method according to the first aspect of the present disclosure.

The summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the disclosure, nor is it intended to be used to limit the scope of the disclosure.

Drawings

In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present disclosure, and other drawings may be obtained according to the provided drawings without inventive effort to those of ordinary skill in the art.

FIG. 1 illustrates a schematic diagram of a computing device for matching operators according to an embodiment of the present disclosure.

Fig. 2 illustrates a flow chart of a method for matching operators according to an embodiment of the present disclosure.

Fig. 3 illustrates a flow chart of a method for matching operators according to an embodiment of the present disclosure.

Fig. 4 illustrates a flow chart of a method for matching operators according to an embodiment of the present disclosure.

Fig. 5 illustrates a block diagram of an exemplary electronic device for implementing embodiments of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments, and they should not be construed as limiting the scope of the present disclosure. Based on the embodiments in this disclosure, all other embodiments that a person of ordinary skill in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.

The term "comprising" and variations thereof as used in this disclosure mean open ended, i.e., "including but not limited to". The term "or" means "and/or" unless specifically stated otherwise. The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one additional embodiment".

As described above, conventional operator matching schemes generalize poorly and are not transparent to the user.

To at least partially address one or more of the above problems, as well as other potential problems, the present disclosure proposes a method, computing device, and computer-readable storage medium for matching operators. In an embodiment of the present disclosure, if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and the other parameters of the target operator are the same as the corresponding other reference parameters, the pre-compiled type static shape reference operator is taken as a candidate operator matched with the target operator, without requiring all parameters to match completely. This at least allows the pre-compiled type static shape reference operator to match a range of target operators (rather than only one specific target operator), so generalization is strong. In addition, because complete matching of all parameters is not required, the user does not need to preprocess the data to be processed, so at least the process of matching operators is transparent to the user and the degree of automation is high.

Further, in the embodiment of the disclosure, the reference parameters of the pre-compiling type static shape reference operator are stored in the pre-configured table, so that the reference parameters of the pre-compiling type static shape reference operator can be conveniently and rapidly obtained, and the time for matching the operator is shortened.

In embodiments of the present disclosure, whether to use the pre-compiled type static shape reference operator as the operator that matches the target operator is further determined by judging whether the effective calculation duty ratio of the target operator relative to the pre-compiled type static shape reference operator satisfies a constraint condition, which at least avoids wasting resources.

Further, in an embodiment of the present disclosure, if a matched pre-compiled type static shape reference operator exists, that operator is acquired; if no matched pre-compiled type static shape reference operator exists, a pre-compiled type dynamic shape reference operator is acquired; and if no pre-compiled type dynamic shape reference operator exists, an operator matched with the target operator is acquired through just-in-time compilation. This at least balances the requirements of matching time and operator performance.

Furthermore, in the embodiment of the disclosure, data is read and stored based on the boundary check, so that the user does not need to pad the data to be processed with 0 or post-process the data output by the pre-compiled type static shape reference operator, thereby saving storage space and further improving the degree of automation of the process of matching operators.

The present disclosure is illustrated by the following several specific examples. Detailed descriptions of known functions and known components may be omitted for the sake of clarity and conciseness in the following description of the embodiments of the present disclosure. When any element of an embodiment of the present disclosure appears in more than one drawing, the element is identified by the same reference numeral in each drawing.

FIG. 1 illustrates a schematic diagram of a computing device 100 for matching operators, according to an embodiment of the present disclosure. As shown in fig. 1, computing device 100 includes a parameter acquisition module 120, an operator matching module 140, and an operator execution module 160.

Computing device 100 is configured to match operators. It should be noted that computing device 100 may include additional components not shown, the scope of the present disclosure being not limited in this respect. For example, in some embodiments, computing device 100 may have one or more processors, which may include special purpose processors such as GPUs, field programmable gate arrays (Field Programmable Gate Array, FPGAs), and application specific integrated circuits (Application Specific Integrated Circuit, ASICs), and general purpose processors such as central processing units (Central Processing Unit, CPUs).

With respect to the parameter acquisition module 120, it is configured to acquire parameters of the target operator to be matched.

With respect to the operator matching module 140, it is configured to match operators.

For example, the operator matching module 140 may be configured to compare parameters of the target operator with reference parameters of a pre-compiled type static shape reference operator; and if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and other parameters of the target operator are the same as the corresponding other reference parameters of the reference parameters, taking the pre-compiled type static shape reference operator as a candidate operator matched with the target operator, wherein the first type parameter is a parameter related to the shape of the operator input and output data.

For example, the operator matching module 140 may be configured to calculate the calculation amount of the target operator based on the parameters of the target operator; calculate the calculation amount of the pre-compiled type static shape reference operator based on its reference parameters; calculate the effective calculation duty ratio of the target operator relative to the pre-compiled type static shape reference operator based on the calculation amount of the target operator and the calculation amount of the pre-compiled type static shape reference operator; judge whether the effective calculation duty ratio meets a constraint condition; and if the effective calculation duty ratio meets the constraint condition, acquire the pre-compiled type static shape reference operator as the operator matched with the target operator.

For example, the operator matching module 140 may be configured to determine whether there is a pre-compilation type dynamic shape reference operator in response to none of the pre-compilation type static shape reference operators matching the target operator; and if the pre-compiling type dynamic shape reference operator exists, acquiring the pre-compiling type dynamic shape reference operator as an operator matched with the target operator. For another example, the operator matching module 140 may be further configured to obtain an operator matching the target operator by just-in-time compilation if there is no pre-compilation type dynamic shape reference operator.
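
The fallback order described above (static shape reference operator first, then dynamic shape reference operator, then just-in-time compilation) can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `select_operator` and the three callables passed to it are hypothetical stand-ins for whatever matching, lookup, and compilation routines a real system provides.

```python
from typing import Callable, Optional, Tuple


def select_operator(
    target_params: dict,
    match_static: Callable[[dict], Optional[str]],
    find_dynamic: Callable[[dict], Optional[str]],
    jit_compile: Callable[[dict], str],
) -> Tuple[str, str]:
    """Return (source, operator) using the order static -> dynamic -> JIT.

    All three callables are hypothetical: each takes the target operator's
    parameters and returns an operator handle (here just a string), or None
    when nothing suitable exists.
    """
    # 1. Prefer a matching pre-compiled static shape reference operator.
    op = match_static(target_params)
    if op is not None:
        return "static", op
    # 2. Otherwise fall back to a pre-compiled dynamic shape reference operator.
    op = find_dynamic(target_params)
    if op is not None:
        return "dynamic", op
    # 3. As a last resort, compile an operator just in time.
    return "jit", jit_compile(target_params)


if __name__ == "__main__":
    result = select_operator(
        {"name": "conv"},
        match_static=lambda p: None,        # no static operator matched
        find_dynamic=lambda p: "dyn_conv",  # a dynamic shape operator exists
        jit_compile=lambda p: "jit_conv",
    )
    print(result)
```

The ordering reflects the trade-off stated in the disclosure between matching time and operator performance; the JIT path is reached only when neither pre-compiled option is available.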

With respect to the operator execution module 160, it is configured to call operators that match the target operator for data processing. For example, operator execution module 160 may be configured to read data to be processed based on a boundary check; invoking a static shape reference operator of a pre-compiling type matched with the target operator to process the read data; and storing the processed data based on the boundary check. For example, the operator execution module 160 may be configured to invoke a pre-compilation type dynamic shape reference operator that matches a target operator for data processing. For another example, operator execution module 160 may also be configured to invoke just-in-time compiled operators for data processing.
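
To illustrate how boundary-checked reading and storing lets an operator compiled for a larger shape process smaller data without user-side padding, the following minimal sketch runs a point-wise ReLU written for a fixed reference length over a shorter input. It is an assumption-laden illustration: the helper names are hypothetical, and treating an out-of-range read as 0 is an assumption made here (chosen to mirror the zero padding a user would otherwise perform), not a detail confirmed by the disclosure.

```python
def load_checked(src, index):
    """Boundary-checked read: positions outside the real data read as 0.0.

    Returning 0.0 for out-of-range reads is an assumption of this sketch.
    """
    return src[index] if 0 <= index < len(src) else 0.0


def store_checked(dst, index, value):
    """Boundary-checked store: writes outside the real output are dropped."""
    if 0 <= index < len(dst):
        dst[index] = value


def run_static_relu(src, reference_length):
    """Run a ReLU loop sized for `reference_length` elements on shorter data."""
    dst = [0.0] * len(src)  # output has the target shape, not the reference shape
    for i in range(reference_length):  # the loop bound is the static reference shape
        x = load_checked(src, i)
        store_checked(dst, i, max(x, 0.0))
    return dst


if __name__ == "__main__":
    # Three real elements processed by a "kernel" sized for eight.
    print(run_static_relu([-1.0, 2.0, 3.0], reference_length=8))
```

Because neither the input nor the output buffer is enlarged to the reference shape, no padded copy and no post-processing crop is needed, which is the storage saving the disclosure refers to.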

It should be noted that, the parameter obtaining module 120, the operator matching module 140, and the operator executing module 160 may be implemented as hardware, software, or firmware, depending on the actual situation, which is not limited by the embodiments of the present disclosure.

For example, computing device 100 may perform embodiments of the methods described below in connection with fig. 2, 3, and 4.

Fig. 2 illustrates a flow chart of a method 200 for matching operators in accordance with an embodiment of the present disclosure. For example, the method 200 may be performed by the computing device 100 described in connection with fig. 1, or by the electronic device 500 described in connection with fig. 5. It should be understood that method 200 may also include additional blocks not shown and/or that the blocks shown may be omitted, the scope of the disclosure being not limited in this respect.

In step 202, the computing device 100 obtains parameters of the target operator to be matched.

With respect to the target operator, it is an operator that is desired to be invoked in order to process the data to be processed. For example, the target operators may include point-wise operation (Elementwise) type operators, convolution operators, matrix multiplication operators, and average pooling operators, where the point-wise operation type operators may include rectified linear unit (Rectified Linear Unit, ReLU) operators and addition operators (e.g., bias_add). For example, the target operators do not include averaging type operators or operators of a type that pads with non-zero values.

It should be noted that, the types of the target operators applicable to the technical solution of the present disclosure and the types of the target operators not applicable to the technical solution of the present disclosure have been enumerated in the above, but since the types of the operators are difficult to be exhausted and new operators are continuously developed, whether the non-enumerated operators are applicable to the technical solution of the present disclosure may depend on the actual situation, and the embodiments of the present disclosure are not limited thereto. For example, the applicability of the disclosed solution to an unrecited operator may be determined by means of verification prior to use.

With respect to the parameters of the target operator, they are used to indicate relevant information of the target operator. For example, the parameters of the target operator may indicate the name of the target operator, the operator input-output data shape, attribute settings, and the like. For example, the operator input-output data shape is related to the amount and format of the data to be processed. It should be noted that the parameters of the target operator depend on the target operator, and embodiments of the present disclosure are not limited thereto.

Regarding the parameters for obtaining the target operator to be matched, it may for example comprise: the computing device 100 receives input information (e.g., operator descriptors) and parses parameters of the target operator from the input information.

In step 204, the computing device 100 compares the parameters of the target operator with the reference parameters of the pre-compiled type static shape reference operator.

With respect to the reference operator, it is an operator that is callable for processing data to be processed. For example, there is a set of operators that can be invoked for processing data to be processed, and the reference operator can be any operator in the set. Similar to the target operator, the reference operator may also include a point-wise operation type operator, a convolution operator, a matrix multiplication operator, and an average pooling operator, where the point-wise operation type operator may include a ReLU operator and an addition operator (e.g., bias_add).

With respect to a pre-compiled type static shape reference operator, the pre-compiled type means that the reference operator is generated by compilation in advance, and the static shape means that the reference operator is configured to process data to be processed of only one particular shape.

With respect to the reference parameters, it is used to indicate the relevant information of the reference operator. For example, the reference parameter may indicate a name of the reference operator, an operator input-output data shape, an attribute setting, and the like.

Regarding the comparison of the parameters of the target operator with the reference parameters of the pre-compiled type static shape reference operator, it may for example comprise: acquiring a table storing the reference parameters of the pre-compiled type static shape reference operator; and comparing the parameters of the target operator with the reference parameters of the pre-compiled type static shape reference operator stored in the table. For example, the table may be preconfigured, in which case the reference parameters of the pre-compiled type static shape reference operator may be conveniently and quickly obtained, thereby shortening the time of the process of matching operators. It should be noted that, depending on the actual situation, the table may also be configured and updated in real time, which is not limited by the embodiment of the present disclosure. It should also be noted that, depending on the actual situation, the computing device 100 may also obtain the reference parameters of the pre-compiled type static shape reference operator in other manners, which are not limited by the embodiments of the present disclosure.
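
As a rough illustration of the table-based comparison, the sketch below keeps reference parameters in a preconfigured in-memory list and looks up the entries that share the target operator's name. The table layout, the key names (`name`, `dtype`, `m`, `k`, `n`, `length`), and the function names are all hypothetical; the disclosure does not specify how the table is organized.

```python
# Hypothetical preconfigured table: one dict of reference parameters per
# pre-compiled static shape reference operator.
REFERENCE_TABLE = [
    {"name": "matmul", "dtype": "fp16", "m": 128, "k": 64, "n": 128},
    {"name": "matmul", "dtype": "fp16", "m": 256, "k": 64, "n": 256},
    {"name": "relu", "dtype": "fp16", "length": 1024},
]


def reference_entries(table, op_name):
    """Return the table rows whose operator name equals `op_name`."""
    return [entry for entry in table if entry["name"] == op_name]


def exact_match(table, target):
    """Return the row whose parameters all equal the target's, else None."""
    for entry in reference_entries(table, target["name"]):
        if entry == target:
            return entry
    return None


if __name__ == "__main__":
    target = {"name": "matmul", "dtype": "fp16", "m": 128, "k": 64, "n": 128}
    print(exact_match(REFERENCE_TABLE, target))
```

A linear scan is used only for brevity; a real table could be indexed by operator name or by the non-shape parameters to shorten lookup time further.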

In step 206, if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and the other parameters of the target operator are the same as the corresponding other reference parameters, the computing device 100 takes the pre-compiled type static shape reference operator as a candidate operator that matches the target operator, wherein the first type parameter is a parameter related to the operator input-output data shape.

Regarding the first type of parameter, it is a parameter related to the operator input-output data shape. It should be noted that the first type of parameter depends on the target operator, and embodiments of the present disclosure are not limited in this regard. It should also be noted that, depending on the actual situation, the first type of parameters may be some or all of the parameters related to the operator input-output data shape, which embodiments of the present disclosure do not limit. For example, the first type of parameter among the parameters of the target operator may be determined as follows: in response to the target operator being a convolution operator, the first type of parameter includes an output channel, an input channel, an output height, an output width, a filter height, a filter width, and a batch size; or in response to the target operator being a matrix multiplication operator, the first type of parameter includes a left matrix height, a left matrix width, and a right matrix height. It should be noted that, in order to perform the matrix multiplication operation, the right matrix height needs to be equal to the left matrix width, so that in response to the target operator being a matrix multiplication operator, the first type of parameters among the parameters of the target operator include the left matrix height, the left matrix width, and the right matrix width.

For example, in one example, in response to the target operator being a convolution operator, if the output channel and the input channel among the parameters of the target operator are smaller than the corresponding output channel and input channel among the reference parameters of the pre-compiled type static shape reference operator, and the other parameters of the target operator are the same as the corresponding other reference parameters, the pre-compiled type static shape reference operator may be taken as a candidate operator that matches the target operator.
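
The candidate rule of step 206 can be sketched for a convolution operator as below. This is one possible reading rather than the definitive rule: it assumes that a first type parameter equal to its reference value counts among the parameters that are "the same", so every parameter must be either equal, or a first type parameter that is strictly smaller, with at least one strictly smaller. The key names and the `is_candidate` function are hypothetical.

```python
# Hypothetical key names for the first type (shape-related) parameters of a
# convolution operator, following the list given in the description.
CONV_FIRST_TYPE = {
    "out_channels", "in_channels", "out_height", "out_width",
    "filter_height", "filter_width", "batch",
}


def is_candidate(target, reference, first_type=CONV_FIRST_TYPE):
    """Check the partial-match rule between target and reference parameters."""
    if target.keys() != reference.keys():
        return False
    found_smaller = False
    for key, value in target.items():
        ref_value = reference[key]
        if value == ref_value:
            continue  # identical parameters are always acceptable
        if key in first_type and value < ref_value:
            found_smaller = True  # shape parameter fits inside the reference shape
        else:
            return False  # a non-shape mismatch, or a shape larger than the reference
    # An all-equal comparison is an exact match, which is handled separately.
    return found_smaller


if __name__ == "__main__":
    reference = {
        "name": "conv", "dtype": "fp16", "out_channels": 64, "in_channels": 32,
        "out_height": 56, "out_width": 56, "filter_height": 3,
        "filter_width": 3, "batch": 8,
    }
    target = dict(reference, out_channels=48, in_channels=24)
    print(is_candidate(target, reference))
```

Rejecting any parameter that exceeds its reference value keeps the target data within the shape the reference operator was compiled for.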

In an embodiment of the present disclosure, if at least one first type parameter of the parameters of the target operator is smaller than the corresponding first type parameter of the reference parameters, and the other parameters of the target operator are the same as the corresponding other reference parameters, the pre-compiled type static shape reference operator is taken as a candidate operator matched with the target operator, without requiring all parameters to match completely. This at least allows the pre-compiled type static shape reference operator to match a range of target operators (rather than only one specific target operator), so generalization is strong. In addition, because complete matching of all parameters is not required, the user does not need to preprocess the data to be processed, so at least the process of matching operators is transparent to the user and the degree of automation is high.

It should be noted that, in the embodiment of the present disclosure, the static shape reference operator of the pre-compilation type is taken as the candidate operator matched with the target operator without completely matching all parameters, but this is not a limitation of the present disclosure. For example, in one embodiment of the present disclosure, if each of the parameters of the target operator is the same as a corresponding one of the reference parameters, the pre-compilation type static shape reference operator is obtained as an operator that matches the target operator.

It should also be noted that, in embodiments of the present disclosure, in response to the pre-compiled type static shape reference operator being taken as a candidate operator matching the target operator, the computing device 100 may directly take the pre-compiled type static shape reference operator as the operator matching the target operator, although this is not a limitation of the present disclosure. For example, the computing device 100 may also decide, through a further determination, whether to use the pre-compiled type static shape reference operator as the operator that matches the target operator; for this, reference may be made to the embodiment described below in connection with FIG. 3.

Fig. 3 illustrates a flow chart of a method 300 for matching operators according to an embodiment of the present disclosure. For example, method 300 may be performed by computing device 100 described in connection with fig. 1, or by electronic device 500 described in connection with fig. 5. It should be understood that method 300 may also include additional blocks not shown and/or that the blocks shown may be omitted, the scope of the disclosure being not limited in this respect.

In step 302, the computing device 100 calculates a calculation amount of the target operator based on the parameters of the target operator.

Regarding the calculation amount of the target operator, it is represented by a calculation amount evaluation index. For example, the calculation amount evaluation index may be the number of multiply-accumulate operations (Multiply-Accumulate Operations, MACs). For example, in response to the target operator being a convolution operator, the MACs of the target operator are output channel × batch size × output height × output width × input channel × filter height × filter width; in response to the target operator being a matrix multiplication operator, the MACs of the target operator are left matrix height × left matrix width × right matrix width. It should be noted that the calculation amount evaluation index may depend on the actual situation, and the embodiment of the present disclosure is not limited thereto. For example, the calculation amount evaluation index may be the number of floating-point operations (Floating Point Operations, FLOPs).
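
The two MACs products given above translate directly into code. The sketch below is only a worked form of those products; the function names and the dictionary keys are hypothetical.

```python
def conv_macs(p):
    """MACs of a convolution: the product of the seven shape parameters."""
    return (
        p["out_channels"] * p["batch"] * p["out_height"] * p["out_width"]
        * p["in_channels"] * p["filter_height"] * p["filter_width"]
    )


def matmul_macs(left_height, left_width, right_width):
    """MACs of a matrix multiplication: left height x left width x right width."""
    return left_height * left_width * right_width


if __name__ == "__main__":
    conv = {
        "out_channels": 2, "batch": 1, "out_height": 4, "out_width": 4,
        "in_channels": 3, "filter_height": 3, "filter_width": 3,
    }
    print(conv_macs(conv))            # 2 * 1 * 4 * 4 * 3 * 3 * 3
    print(matmul_macs(128, 64, 256))  # 128 * 64 * 256
```

The same functions can be applied to the reference parameters of a reference operator, since both calculation amounts use the same evaluation index.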

In step 304, the computing device 100 calculates the calculation amount of the pre-compilation type static shape reference operator based on the reference parameters of the pre-compilation type static shape reference operator.

The calculation amount of the pre-compilation type static shape reference operator is likewise represented by a calculation amount evaluation index. For example, the calculation amount evaluation index used for the pre-compilation type static shape reference operator is the same as the one used for the target operator, and it is calculated in the manner described above for the calculation amount of the target operator, which is not repeated here.

It should be noted that the calculation of the calculation amount of the pre-compilation type static shape reference operator may occur before, after, or simultaneously with the calculation of the calculation amount of the target operator, which is not limited by the embodiments of the present disclosure. For example, the calculation amount of the pre-compilation type static shape reference operator may be calculated in advance and stored in the table described above in which the reference parameters of the pre-compilation type static shape reference operator are stored.

In step 306, the computing device 100 calculates an effective calculation duty ratio of the target operator relative to the pre-compilation type static shape reference operator based on the calculation amount of the target operator and the calculation amount of the pre-compilation type static shape reference operator.

The effective calculation duty ratio of the target operator relative to the pre-compilation type static shape reference operator is a relative calculation amount evaluation index that indicates the relative magnitude relationship between the calculation amount of the target operator and the calculation amount of the pre-compilation type static shape reference operator. For example, the effective calculation duty ratio of the target operator relative to the pre-compilation type static shape reference operator may be calculated as the ratio of the calculation amount of the target operator to the calculation amount of the pre-compilation type static shape reference operator.
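As a worked illustration of this ratio, the sketch below assumes a matrix multiplication target operator whose left matrix height (100) is smaller than that of a hypothetical reference operator (128), with the other dimensions equal. The function name and the shapes are assumptions made for the example.

```python
def effective_ratio(target_amount, reference_amount):
    """Ratio of the target's calculation amount to the reference's."""
    if reference_amount <= 0:
        raise ValueError("reference calculation amount must be positive")
    return target_amount / reference_amount


# MACs = left matrix height * left matrix width * right matrix width.
target_macs = 100 * 256 * 512      # target operator (illustrative shape)
reference_macs = 128 * 256 * 512   # reference operator (illustrative shape)

print(effective_ratio(target_macs, reference_macs))  # 0.78125
```

In this example about 78% of the work done by the reference operator is useful for the target operator; the remainder is spent on padded positions.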

It should be noted that, depending on the actual situation, other calculation manners may be used to calculate the effective calculation duty ratio of the target operator with respect to the static shape reference operator of the pre-compilation type, which is not limited by the embodiments of the present disclosure. It should be further noted that, depending on the actual situation, other relative computation amount evaluation indexes may be used to indicate the relative magnitude relationship between the computation amount of the target operator and the computation amount of the static shape reference operator of the pre-compilation type, which is not limited by the embodiments of the present disclosure.

At step 308, the computing device 100 determines whether the effective calculation duty ratio satisfies a constraint condition.

The constraint condition may be defined, for example, as a specific interval or as a threshold. For example, determining whether the effective calculation duty ratio satisfies the constraint condition may consist of determining whether the effective calculation duty ratio is greater than or equal to a threshold. The threshold may be preset, for example to 0.5. It should be noted that the threshold may also be dynamically adjusted, which is not limited by the embodiments of the present disclosure.

At step 310, if the effective calculation duty ratio satisfies the constraint condition, the computing device 100 obtains the pre-compilation type static shape reference operator as the operator that matches the target operator.

For example, if the effective calculation duty ratio satisfies the constraint condition, the computing device 100 invokes the pre-compilation type static shape reference operator.
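Steps 308 and 310 can be sketched as a single check, assuming the threshold form of the constraint condition and the example threshold of 0.5 mentioned above. The function name `confirm_candidate` and the sample shapes are hypothetical.

```python
def confirm_candidate(target_amount, reference_amount, threshold=0.5):
    """Return True if the candidate reference operator should be used.

    The candidate is accepted when the target's calculation amount is at
    least `threshold` times the reference's calculation amount.
    """
    ratio = target_amount / reference_amount
    return ratio >= threshold


reference_macs = 128 * 256 * 512            # hypothetical reference operator

# Target slightly smaller than the reference: most of the work is useful.
print(confirm_candidate(100 * 256 * 512, reference_macs))  # True

# Target much smaller than the reference: most of the work would be padding.
print(confirm_candidate(16 * 256 * 512, reference_macs))   # False
```

A rejected candidate is simply not used for this target operator; what happens next (another candidate, a dynamic shape operator, or just-in-time compilation) is outside the scope of this sketch.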

It should be noted that, in the embodiments described in connection with fig. 2 and fig. 3, the parameters of the target operator are first matched against the reference parameters of the pre-compilation type static shape reference operator (as in the embodiment described in connection with fig. 2), and the effective calculation duty ratio of the target operator relative to the pre-compilation type static shape reference operator is considered afterwards (as in the embodiment described in connection with fig. 3); this order is merely exemplary and does not limit the present disclosure. For example, in one embodiment of the present disclosure, the effective calculation duty ratio of the target operator relative to the pre-compilation type static shape reference operator may be considered before the parameters of the target operator are matched against the reference parameters of the pre-compilation type static shape reference operator.

In addition, in response to none of the pre-compilation type static shape reference operators matching the target operator, the computing device 100 may also obtain a pre-compilation type dynamic shape reference operator as the operator matching the target operator, or obtain an operator matching the target operator by just-in-time compilation. This is explained below in connection with the embodiment of fig. 4.

Fig. 4 illustrates a flow chart of a method 400 for matching operators in accordance with an embodiment of the present disclosure. For example, the method 400 may be performed by the computing device 100 described in connection with fig. 1, or by the electronic device 500 described in connection with fig. 5. It should be understood that method 400 may also include additional blocks not shown and/or that the blocks shown may be omitted, the scope of the disclosure being not limited in this respect.

In step 402, in response to all of the precompiled type static shape reference operators not matching the target operator, the computing device 100 determines whether there is a precompiled type dynamic shape reference operator.

Regarding the pre-compilation type dynamic shape reference operator, "dynamic shape" means that the reference operator is configured to be able to process data to be processed whose shape is not fixed. For example, a pre-compilation type dynamic shape reference operator can process data to be processed of arbitrary shape. For another example, a pre-compilation type dynamic shape reference operator can process data to be processed of arbitrary size in a particular dimension of the shape. It should be noted that a pre-compilation type dynamic shape reference operator generally performs worse than a pre-compilation type static shape reference operator.

At step 404, if there is a pre-compilation type dynamic shape reference operator, the computing device 100 obtains the pre-compilation type dynamic shape reference operator as an operator that matches the target operator.

For example, computing device 100 invokes a precompiled type dynamic shape reference operator.

At step 406, if there is no pre-compilation type dynamic shape reference operator, the computing device 100 obtains an operator matching the target operator by just-in-time compilation.

For example, the computing device 100 generates an operator that matches the target operator by just-in-time compilation. It should be noted that obtaining an operator matching the target operator by just-in-time compilation generally takes longer than obtaining a pre-compilation type static shape reference operator or a pre-compilation type dynamic shape reference operator.

In an embodiment of the present disclosure, the pre-compilation type static shape reference operator is obtained if a matching pre-compilation type static shape reference operator exists; a pre-compilation type dynamic shape reference operator is obtained if no matching pre-compilation type static shape reference operator exists; and an operator matching the target operator is obtained by just-in-time compilation if no pre-compilation type dynamic shape reference operator exists. In this way, at least the requirements on both matching time and operator performance can be taken into account.
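The three-level fallback described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: operators are represented by plain strings, and the names `resolve_operator` and `fake_jit` are hypothetical.

```python
from typing import Callable, Optional, Tuple


def resolve_operator(static_match: Optional[str],
                     dynamic_op: Optional[str],
                     jit_compile: Callable[[], str]) -> Tuple[str, str]:
    """Pick an operator in the order: static shape, dynamic shape, JIT."""
    if static_match is not None:
        # A matching pre-compiled static shape operator exists.
        return ("aot_static", static_match)
    if dynamic_op is not None:
        # No static match, but a pre-compiled dynamic shape operator exists.
        return ("aot_dynamic", dynamic_op)
    # Last resort: compile an operator for the target just in time.
    return ("jit", jit_compile())


calls = []


def fake_jit() -> str:
    """Stand-in for a just-in-time compiler; records each invocation."""
    calls.append("jit")
    return "jit_matmul_100x256x512"


print(resolve_operator("aot_matmul_128x256x512", "aot_matmul_dynamic", fake_jit))
print(resolve_operator(None, "aot_matmul_dynamic", fake_jit))
print(resolve_operator(None, None, fake_jit))
```

Passing the compiler as a callable means the comparatively slow just-in-time path runs only when both pre-compiled options are unavailable.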

Additionally, in one embodiment of the present disclosure, the method for matching operators may further comprise: reading the data to be processed based on a boundary check; invoking the pre-compilation type static shape reference operator matching the target operator to process the read data; and storing the processed data based on the boundary check. A boundary check verifies, before a variable is used, whether the variable is within a specific range: reading data beyond the boundary position returns 0, and storing data beyond the boundary position is ignored.

For example, in response to at least one of the parameters of the target operator being smaller than the corresponding reference parameter, the data amount of the data to be processed may be smaller than the input data amount required by the pre-compilation type static shape reference operator; reading the data to be processed based on the boundary check fills the missing positions with 0, so that the amount of data read equals the input data amount required by the pre-compilation type static shape reference operator. In this case, the data amount of the data output by the pre-compilation type static shape reference operator may be larger than the desired output data amount corresponding to the data to be processed; storing the processed data based on the boundary check ignores the redundant data, so that the data amount of the stored processed data equals the desired output data amount corresponding to the data to be processed.
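The read and store behaviour described above can be illustrated with a one-dimensional sketch in plain Python. The helper names and the element-wise `reference_scale_op`, which stands in for a static shape operator compiled for exactly 8 elements, are assumptions made for the example; on real hardware the boundary check would typically be performed by the load and store instructions rather than by explicit loops.

```python
def load_with_bounds(data, required_len):
    """Read `required_len` elements; reads past the end of `data` yield 0."""
    return [data[i] if i < len(data) else 0 for i in range(required_len)]


def store_with_bounds(result, out_len):
    """Store `result` into a buffer of `out_len`; writes past the end are ignored."""
    out = [0] * out_len
    for i, value in enumerate(result):
        if i < out_len:
            out[i] = value
    return out


def reference_scale_op(x):
    """Hypothetical static shape operator: accepts exactly 8 elements."""
    if len(x) != 8:
        raise ValueError("static shape operator expects exactly 8 elements")
    return [2 * v for v in x]


data = [1, 2, 3, 4, 5]                         # fewer elements than the operator needs
padded = load_with_bounds(data, 8)             # [1, 2, 3, 4, 5, 0, 0, 0]
result = reference_scale_op(padded)            # [2, 4, 6, 8, 10, 0, 0, 0]
stored = store_with_bounds(result, len(data))  # [2, 4, 6, 8, 10]
print(padded, result, stored)
```

The caller never pads the input or trims the output explicitly; both adjustments fall out of the boundary-checked read and store.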

In the embodiments of the present disclosure, reading and storing data based on the boundary check spares the user from padding the data to be processed with 0 and from post-processing the data output by the pre-compilation type static shape reference operator, which saves storage space and further improves the degree of automation of the operator matching process.

Additionally, in one embodiment of the present disclosure, there is also provided a computing device including: at least one processor; and at least one memory coupled to the at least one processor and storing instructions for execution by the at least one processor, the instructions when executed by the at least one processor cause the computing device to perform the method of matching operators as described above.

In addition, in one embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon computer program code which, when executed, performs a method of matching operators as described above.

Fig. 5 illustrates a block diagram of an exemplary electronic device 500 for implementing embodiments of the present disclosure. For example, computing device 100 as shown in fig. 1 may be implemented by electronic device 500. As shown, the electronic device 500 includes a Central Processing Unit (CPU) 502 that can perform various suitable actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 504 or loaded from a storage unit 516 into a Random Access Memory (RAM) 506. In the random access memory 506, various programs and data required for the operation of the electronic device 500 may also be stored. The central processing unit 502, the read only memory 504 and the random access memory 506 are connected to each other by a bus 508. An input/output (I/O) interface 510 is also connected to bus 508.

A number of components in the electronic device 500 are connected to the input/output interface 510, including: an input unit 512, such as a keyboard, mouse, microphone, etc.; an output unit 514, such as various types of displays, speakers, and the like; a storage unit 516, such as a magnetic disk, an optical disk, or the like; and a communication unit 518, such as a network card, modem, wireless communication transceiver, etc. The communication unit 518 allows the electronic device 500 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.

The various procedures and processes described above, such as methods 200, 300, and 400, may be performed by the central processing unit 502. For example, in some embodiments, the methods 200, 300, and 400 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 516. In some embodiments, some or all of the computer program may be loaded and/or installed onto the electronic device 500 via the read only memory 504 and/or the communication unit 518. One or more of the acts of the methods 200, 300, and 400 described above may be performed when the computer program is loaded into the random access memory 506 and executed by the central processing unit 502.

The present disclosure relates to methods, apparatus, systems, electronic devices, computer readable storage media, and/or computer program products. The computer program product may include computer readable program instructions for performing various aspects of the present disclosure.

The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.

The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge computing devices. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.

Computer program instructions for performing the operations of the present disclosure can be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by using state information of the computer readable program instructions to personalize electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), which can then execute the computer readable program instructions.

Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvements in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A method for matching operators, the method comprising:

acquiring parameters of a target operator to be matched;

comparing the parameters of the target operator with the reference parameters of the static shape reference operator of the pre-compiling type; and

taking the pre-compilation type static shape reference operator as a candidate operator matched with the target operator if at least one first type parameter of the parameters of the target operator is smaller than a corresponding first type parameter of the reference parameters and other parameters of the target operator are the same as corresponding other reference parameters of the reference parameters, wherein the first type parameter is a parameter related to operator input-output data shape;

in response to the pre-compilation type static shape reference operator being taken as a candidate operator matching the target operator:

calculating the calculated amount of the target operator based on the parameters of the target operator;

calculating the calculated amount of the pre-compiling type static shape reference operator based on the reference parameters of the pre-compiling type static shape reference operator;

calculating an effective calculation duty ratio of the target operator relative to the pre-compiling type static shape reference operator based on the calculation amount of the target operator and the calculation amount of the pre-compiling type static shape reference operator;

determining whether the effective calculation duty ratio satisfies a constraint condition; and

if the effective calculation duty ratio satisfies the constraint condition, acquiring the pre-compilation type static shape reference operator as the operator matching the target operator.

2. The method of claim 1, wherein the target operator comprises a point-wise operation type operator, a convolution operator, a matrix multiplication operator, and an average pooling operator; the target operator does not include an averaging type operator, a padding non-zero value type operator.

3. The method of claim 1, wherein a first type of parameter among the parameters of the target operator is determined via:

in response to the target operator being a convolution operator, the first type parameters among the parameters of the target operator include an output channel, an input channel, an output height, an output width, a filter height, a filter width, and a batch size; or

in response to the target operator being a matrix multiplication operator, the first type parameters among the parameters of the target operator include a left matrix height, a left matrix width, and a right matrix height.

4. The method of claim 1, wherein comparing parameters of the target operator with reference parameters of a pre-compiled type static shape reference operator comprises:

acquiring a table storing the reference parameters of the pre-compilation type static shape reference operator; and

comparing the parameters of the target operator with the reference parameters of the pre-compilation type static shape reference operator stored in the table.

5. The method according to claim 1, wherein the method further comprises:

if each of the parameters of the target operator is the same as the corresponding reference parameter among the reference parameters, acquiring the pre-compilation type static shape reference operator as the operator matching the target operator.

6. The method according to claim 1, wherein the method further comprises:

in response to none of the pre-compilation type static shape reference operators matching the target operator, determining whether a pre-compilation type dynamic shape reference operator exists; and

if the pre-compilation type dynamic shape reference operator exists, acquiring the pre-compilation type dynamic shape reference operator as the operator matching the target operator.

7. The method of claim 6, wherein the method further comprises:

if the pre-compilation type dynamic shape reference operator does not exist, acquiring an operator matching the target operator by just-in-time compilation.

8. The method according to claim 1, wherein the method further comprises:

reading the data to be processed based on a boundary check;

invoking the pre-compilation type static shape reference operator matching the target operator to process the read data; and

storing the processed data based on the boundary check.

9. A computing device, comprising:

at least one processor; and

at least one memory coupled to the at least one processor and storing instructions for execution by the at least one processor, wherein the instructions, when executed by the at least one processor, cause the computing device to perform the method of any one of claims 1 to 8.

10. A computer readable storage medium having stored thereon computer program code which, when executed, performs the method according to any of claims 1 to 8.