CN113469360B - Reasoning method and device - Google Patents

Reasoning method and device

Info

Publication number: CN113469360B (application number CN202010244456.8A)
Authority: CN (China)
Prior art keywords: operator, operators, target, type, neural network
Legal status: Active
Application number: CN202010244456.8A
Other languages: Chinese (zh)
Other versions: CN113469360A (en)
Inventors: 浦世亮, 叶挺群, 王鹏
Current Assignee: Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee: Hangzhou Hikvision Digital Technology Co Ltd
Filing: application CN202010244456.8A filed by Hangzhou Hikvision Digital Technology Co Ltd
Publications: CN113469360A (application publication), CN113469360B (grant publication)

Classifications

    • G06N 5/045: Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 5/046: Forward inferencing; Production systems


Abstract

The application discloses an inference method and apparatus in the field of data processing. The method determines the first type of operator in a neural network model, that is, an operator that the target hardware does not support and for which no replaceable operator exists among the operators the target hardware supports; acquires the registered operator, written in a domain-specific language, that corresponds to that first-type operator; processes the registered operator to obtain target executable code supported by the target hardware; and performs forward reasoning through the target hardware according to the target executable code and the neural network model. Because the domain-specific language is designed to be hardware-independent, a user can write a registered operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for a first-type operator that may be shared across different hardware, the user only needs to write the operator once to obtain the corresponding registered operator, which can then be applied to the different hardware, so the development workload is also low.

Description

Reasoning method and device
Technical Field
The present application relates to the field of data processing, and in particular, to a reasoning method and apparatus.
Background
An operator indicates a data processing operation and describes a way of computing; for example, a neural network typically includes basic operators such as a convolution operator, which indicates a convolution operation, and a pooling operator, which indicates a pooling operation. A user may also define a computation mode as needed, and such a user-defined computation mode may be called a custom operator. A neural network model is constructed and trained from basic operators and custom operators, and can then be deployed on a variety of devices for forward reasoning.
Because hardware conditions differ across devices, the operators that different devices can support may also differ. In general, most devices support the basic operators but not necessarily the custom operators. After the neural network model is trained, in order to adapt the custom operators in the model to devices with various hardware conditions, a user would have to manually convert the operators not supported by each device into executable code that the corresponding device supports, according to each device's hardware characteristics. The development workload is huge, and the user must understand the hardware characteristics of every device, so the development difficulty is great. There is therefore a need for a general reasoning method supporting various operators, so that custom operators can automatically adapt to the corresponding device to complete forward reasoning.
Disclosure of Invention
The application provides an inference method and apparatus that automatically adapt to hardware devices when processing operators that different hardware does not support, thereby reducing the user's development workload and difficulty. The technical scheme is as follows:
In one aspect, a reasoning method is provided, the method comprising:
determining a first type of operator from a plurality of operators included in a neural network model, where a first-type operator is an operator that is not supported by the target hardware and for which no corresponding alternative operator exists among the operators supported by the target hardware;
acquiring one or more registered operators corresponding to the first-type operators, where the one or more registered operators are determined based on a domain-specific language;
processing the one or more registered operators to obtain one or more target executable codes supported by the target hardware; and
performing forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
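The four steps above can be sketched in pseudocode form. This is only an illustrative sketch: the names (`infer`, `registry`, `compile_for_target`) are hypothetical stand-ins, not identifiers from the patent, and the DSL compilation is stubbed out as a string transformation.

```python
def infer(model_ops, supported, replacements, registry, compile_for_target):
    """Illustrative sketch of the four claimed steps (names are hypothetical)."""
    # Step 1: first-type operators are unsupported and have no alternative
    first_type = [op for op in model_ops
                  if op not in supported and op not in replacements]
    # Step 2: acquire the DSL-based registered operator for each of them
    registered = {op: registry[op] for op in first_type}
    # Step 3: process each registered operator into target executable code
    executables = {op: compile_for_target(src) for op, src in registered.items()}
    # Step 4: forward reasoning would now run the model, dispatching
    # first-type operators to their compiled executables (omitted here)
    return first_type, executables

first_type, exes = infer(
    model_ops=["conv", "featureshuffle", "softmax"],
    supported={"conv", "convfeatureshuffle"},
    replacements={"featureshuffle": "convfeatureshuffle"},
    registry={"softmax": "dsl-softmax"},
    compile_for_target=str.upper)        # stand-in for interpret + compile
# first_type == ["softmax"]; exes == {"softmax": "DSL-SOFTMAX"}
```

In this toy run, `softmax` is the only operator that is neither supported nor replaceable, so it alone is routed through the registered-operator path.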
Optionally, processing the one or more registered operators to obtain one or more target executable codes supported by the target hardware includes:
processing the one or more registered operators through an interpreter to obtain one or more target language code segments, where the one or more registered operators are in one-to-one correspondence with the one or more target language code segments; and
compiling the one or more target language code segments through a target compiler to obtain the one or more target executable codes in one-to-one correspondence with the one or more target language code segments, where the target compiler is a compiler matched with the target hardware.
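A minimal sketch of this two-stage pipeline, with the interpreter and the target compiler both reduced to string transformations (all names hypothetical; a real implementation would emit and compile actual target-language code):

```python
def interpret(registered_ops):
    """Interpreter stage: one target-language code segment per registered operator."""
    return {name: f"/* generated for {name} */ {src}"
            for name, src in registered_ops.items()}

def compile_segments(segments, target_compiler):
    """Compiler stage: one executable per segment, via the target-matched compiler."""
    return {name: target_compiler(seg) for name, seg in segments.items()}

registered = {"softmax": "dsl: exp(x) / sum(exp(x))"}
segments = interpret(registered)
binaries = compile_segments(segments, target_compiler=lambda seg: seg.encode())
# one registered operator in, one code segment and one executable out
```

The one-to-one correspondence claimed above falls out of the dictionary comprehensions: each stage maps one input entry to exactly one output entry.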
Optionally, determining the first type of operator from the plurality of operators included in the neural network model includes:
determining the first-type operators from the plurality of operators according to a support operator list and a replacement operator list corresponding to the target hardware;
where the support operator list includes the operators supported by the target hardware, and the replacement operator list is a mapping relationship between operators supported by the target hardware and their corresponding replaceable operators.
Optionally, determining the first-type operators from the plurality of operators according to the support operator list and the replacement operator list corresponding to the target hardware includes:
determining a computational graph of the neural network model, the computational graph including a plurality of computing nodes, each computing node including one or more operators, and the plurality of computing nodes being arranged in their order of execution during forward reasoning; and
sequentially determining, in the order of the plurality of computing nodes, the target operators not included in the support operator list among the one or more operators of each computing node, and taking those target operators of the computing nodes that are also not included in the replacement operator list as the first-type operators.
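The traversal described above might look like the following sketch (the node structure and all names are hypothetical):

```python
def first_type_in_graph(compute_nodes, supported, replacements):
    """Walk computing nodes in execution order and collect operators that are
    neither in the support operator list nor in the replacement operator list."""
    first_type = []
    for node in compute_nodes:              # already in forward-reasoning order
        for op in node["ops"]:
            if op in supported:
                continue                    # supported by the target hardware
            if op in replacements:
                continue                    # second type: a replacement exists
            first_type.append((node["name"], op))
    return first_type

nodes = [{"name": "n0", "ops": ["conv"]},
         {"name": "n1", "ops": ["featureshuffle", "softmax"]}]
found = first_type_in_graph(nodes,
                            supported={"conv", "convfeatureshuffle"},
                            replacements={"featureshuffle": "convfeatureshuffle"})
# found == [("n1", "softmax")]
```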
Optionally, after determining the first type of operator from the plurality of operators included in the neural network model, the method further includes:
adding a first operator label to each first-type operator in the neural network model, extracting the calculation parameters of the computing node to which the corresponding first-type operator belongs, and storing the first operator label in correspondence with the calculation parameters, where the first operator label uniquely identifies the corresponding operator and indicates that the type of the corresponding operator is the first type.
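One way to realize the labeling and storage step is a simple mapping table keyed by the label; the label format used here is a hypothetical illustration, not the patent's actual encoding.

```python
import itertools

_ids = itertools.count()

def tag_first_type_op(store, op_name, node_params):
    """Create a first operator label that both uniquely identifies the operator
    and marks its type, then store the node's calculation parameters under it."""
    label = f"FIRST_TYPE/{op_name}/{next(_ids)}"
    store[label] = node_params
    return label

store = {}
label_a = tag_first_type_op(store, "softmax", {"axis": -1})
label_b = tag_first_type_op(store, "softmax", {"axis": 1})
# The two labels differ even for the same operator name, so the stored
# calculation parameters of each computing node can be retrieved unambiguously.
```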
Optionally, after determining the first type of operator from the plurality of operators included in the neural network model, the method further includes:
determining a second type of operator from the plurality of operators included in the neural network model, where a second-type operator is an operator that is not supported by the target hardware but for which a corresponding replaceable operator does exist among the operators supported by the target hardware; and
replacing the second-type operators in the neural network model according to the replacement operator list to obtain an updated neural network model.
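Treating the model as a flat operator list purely for illustration, the second-type replacement reduces to a lookup in the replacement operator list (names hypothetical):

```python
def replace_second_type(model_ops, replacements):
    """Swap every second-type (replaceable) operator for its supported
    counterpart; supported and first-type operators pass through unchanged."""
    return [replacements.get(op, op) for op in model_ops]

updated = replace_second_type(["conv", "featureshuffle", "softmax"],
                              {"featureshuffle": "convfeatureshuffle"})
# updated == ["conv", "convfeatureshuffle", "softmax"]
```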
Optionally, performing forward reasoning through the target hardware according to the one or more target executable codes and the neural network model includes:
sequentially selecting one of the plurality of computing nodes in the order of the plurality of computing nodes included in the updated neural network model, until the following operations have been performed for each of the plurality of computing nodes:
if a first-type operator exists among the one or more operators included in the selected computing node, acquiring, according to the first operator label corresponding to that first-type operator, the calculation parameters of the selected computing node and the target executable code corresponding to that first-type operator; and
executing the forward reasoning calculation of the selected computing node according to the acquired calculation parameters and target executable code.
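The dispatch loop could be sketched as follows; the node layout, label lookup, and callables are hypothetical stand-ins for the stored labels, calculation parameters, and compiled executables described above:

```python
def run_forward(nodes, executables, param_store, run_builtin):
    """Visit computing nodes in order; for a first-type operator, fetch its
    calculation parameters and compiled code via the first operator label."""
    outputs = []
    for node in nodes:
        for op in node["ops"]:
            label = node["labels"].get(op)     # present only for first-type ops
            if label is not None:
                code = executables[label]      # target executable code
                outputs.append(code(param_store[label]))
            else:
                outputs.append(run_builtin(op))
    return outputs

nodes = [{"ops": ["conv"], "labels": {}},
         {"ops": ["softmax"], "labels": {"softmax": "L0"}}]
outs = run_forward(nodes,
                   executables={"L0": lambda p: ("softmax-exec", p["axis"])},
                   param_store={"L0": {"axis": -1}},
                   run_builtin=lambda op: (op, "builtin"))
# outs == [("conv", "builtin"), ("softmax-exec", -1)]
```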
Optionally, performing forward reasoning through the target hardware according to the one or more target executable codes and the neural network model includes:
sequentially selecting one of the plurality of computing nodes in the order of the plurality of computing nodes included in the neural network model, until the following operations have been performed for each of the plurality of computing nodes:
if it is determined according to the replacement operator list that a second-type operator exists among the one or more operators included in the selected computing node, replacing the second-type operator according to the replacement operator list to obtain an updated computing node;
if a first-type operator exists among the one or more operators included in the updated computing node, acquiring, according to the first operator label corresponding to that first-type operator, the calculation parameters of the selected computing node and the target executable code corresponding to that first-type operator; and
executing the forward reasoning calculation of the updated computing node according to the acquired calculation parameters and target executable code.
Optionally, the domain-specific language is a computer language designed for a particular application domain.
In another aspect, there is provided an inference apparatus, the apparatus comprising:
a first determining module, configured to determine a first type of operator from a plurality of operators included in a neural network model, where a first-type operator is an operator that is not supported by the target hardware and for which no corresponding replaceable operator exists among the operators supported by the target hardware;
an acquisition module, configured to acquire one or more registered operators corresponding to the first-type operators, where the one or more registered operators are determined based on a domain-specific language;
a first processing module, configured to process the one or more registered operators to obtain one or more target executable codes supported by the target hardware; and
a reasoning module, configured to perform forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
Optionally, the first processing module includes:
an interpretation unit, configured to process the one or more registered operators through an interpreter to obtain one or more target language code segments, the one or more registered operators being in one-to-one correspondence with the one or more target language code segments; and
a compiling unit, configured to compile the one or more target language code segments through a target compiler to obtain the one or more target executable codes in one-to-one correspondence with the one or more target language code segments, where the target compiler is a compiler matched with the target hardware.
Optionally, the first determining module includes:
a determining unit, configured to determine the first-type operators from the plurality of operators according to a support operator list and a replacement operator list corresponding to the target hardware;
where the support operator list includes the operators supported by the target hardware, and the replacement operator list is a mapping relationship between operators supported by the target hardware and their corresponding replaceable operators.
Optionally, the determining unit includes:
a first determining subunit, configured to determine a computational graph of the neural network model, where the computational graph includes a plurality of computing nodes, each computing node includes one or more operators, and the plurality of computing nodes are arranged in their order of execution during forward reasoning; and
a second determining subunit, configured to sequentially determine, in the order of the plurality of computing nodes, the target operators not included in the support operator list among the one or more operators of each computing node, and take those target operators of the computing nodes that are also not included in the replacement operator list as the first-type operators.
Optionally, the apparatus further includes:
a second processing module, configured to add a first operator label to each first-type operator in the neural network model, extract the calculation parameters of the computing node to which the corresponding first-type operator belongs, and store the first operator label in correspondence with the calculation parameters, where the first operator label uniquely identifies the corresponding operator and indicates that the type of the corresponding operator is the first type.
Optionally, the apparatus further includes:
a second determining module, configured to determine a second type of operator from the plurality of operators included in the neural network model, where a second-type operator is an operator that is not supported by the target hardware but for which a corresponding replaceable operator exists among the operators supported by the target hardware; and
a replacing module, configured to replace the second-type operators in the neural network model according to the replacement operator list to obtain an updated neural network model.
Optionally, the reasoning module is specifically configured to:
sequentially select one of the plurality of computing nodes in the order of the plurality of computing nodes included in the updated neural network model, until the following operations have been performed for each of the plurality of computing nodes:
if a first-type operator exists among the one or more operators included in the selected computing node, acquire, according to the first operator label corresponding to that first-type operator, the calculation parameters of the selected computing node and the target executable code corresponding to that first-type operator; and
execute the forward reasoning calculation of the selected computing node according to the acquired calculation parameters and target executable code.
Optionally, the reasoning module is specifically configured to:
sequentially select one of the plurality of computing nodes in the order of the plurality of computing nodes included in the neural network model, until the following operations have been performed for each of the plurality of computing nodes:
if it is determined according to the replacement operator list that a second-type operator exists among the one or more operators included in the selected computing node, replace the second-type operator according to the replacement operator list to obtain an updated computing node;
if a first-type operator exists among the one or more operators included in the updated computing node, acquire, according to the first operator label corresponding to that first-type operator, the calculation parameters of the selected computing node and the target executable code corresponding to that first-type operator; and
execute the forward reasoning calculation of the updated computing node according to the acquired calculation parameters and target executable code.
Optionally, the domain-specific language is a computer language designed for a particular application domain.
In another aspect, a computer device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus; the memory is used to store a computer program, and the processor is used to execute the program stored in the memory to implement the steps of the inference method described above.
In another aspect, a computer readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, implements the steps of the reasoning method described above.
In another aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the reasoning method described above.
The technical scheme provided by the application has at least the following beneficial effects:
In the application, the operator that is not supported by the target hardware and for which no corresponding replaceable operator exists among the supported operators, that is, the first-type operator in the neural network model, can be determined; the registered operator corresponding to that first-type operator, determined based on a domain-specific language, is then acquired and processed to obtain target executable code supported by the target hardware; and forward reasoning is performed through the target hardware according to the target executable code and the neural network model. Because the registered operator corresponding to a first-type operator is obtained from a domain-specific language, and the design of the domain-specific language is hardware-independent, a user can write the corresponding registered operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for the same first-type operator that different hardware may share, the user only needs to write the operator once to obtain the corresponding registered operator, which can then be applied to the different hardware, so the development workload is greatly reduced.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. Clearly, the drawings described below show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of an inference method provided by an embodiment of the present application;
FIG. 2 is a flow chart of a method for determining an updated neural network model according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for forward reasoning according to an updated neural network model, provided by an embodiment of the present application;
FIG. 4 is a flow chart of another reasoning method provided by an embodiment of the present application;
fig. 5 is a schematic structural diagram of an inference apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
An operator indicates a data processing operation and describes a computation mode. A neural network model is constructed and trained from basic operators and custom operators, and is then deployed on various devices for forward reasoning. However, the hardware conditions of different devices may differ, so the operators that different devices' hardware can support may also differ. To enable the custom operators in a neural network model to adapt to devices with various hardware conditions and execute the corresponding data processing operations, the application provides a general reasoning method by which operators not supported by the hardware of a given device can easily be converted into supported executable code to complete forward reasoning, without the user having to understand each device's hardware characteristics.
For example, when forward reasoning such as face recognition, target detection, or picture classification is needed, a neural network model can be constructed and trained from basic operators such as the convolution and pooling operators together with designed custom operators, and the model can then perform face recognition, target detection, or picture classification. After training, the neural network model must be deployed on each device to complete forward reasoning. However, the hardware of these devices may differ; for example, different devices may be configured with chips designed and manufactured by different vendors, such as an x86 CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an ARM (Advanced RISC Machine) processor, an ASIC chip designed for deep-learning computation, or an AI (Artificial Intelligence) processor. The computing cores, computing rates, supported computation precision, chip bandwidth, and so on of different chips may differ; that is, different chips have different hardware characteristics. Considering computation utilization, time consumption, resource occupation, and the like, the platform reasoning components developed for different hardware support different operators; that is, different hardware supports different operators. For the neural network model to perform forward reasoning on the various platform reasoning components developed for the various hardware, the operators unsupported on each hardware must be converted into supported executable code so that forward reasoning can be completed on that hardware.
The technical scheme provided by the application adapts easily to various hardware, so that the neural network model can complete forward reasoning such as face recognition, target detection, or picture classification on various devices.
The reasoning method provided by the embodiment of the application is explained in detail below.
Fig. 1 is a flowchart of an inference method provided in an embodiment of the present application. Referring to fig. 1, the method includes the following steps.
Step 101: an operator of a first type is determined from a plurality of operators included in the neural network model, the operator of the first type being an operator that is not supported by the target hardware and for which there is no corresponding alternative operator among the operators supported by the target hardware.
In an embodiment of the present application, one device may be equipped with one or more types of hardware, and the neural network model to be deployed on the device needs to perform forward reasoning on one of them, which may be referred to as the target hardware. The hardware characteristics of the target hardware, together with the target platform reasoning component developed for it, largely determine which operators the target hardware supports. Since the neural network model must complete forward reasoning on the target platform reasoning component, every operator in the model must be supported by that component, that is, by the target hardware. Based on this, the device may first determine the first-type operators among the plurality of operators included in the neural network model, that is, the operators that are not supported by the target hardware and for which no corresponding alternative operator exists among the supported operators.
In the embodiment of the application, the target hardware has both supported and unsupported operators, and some of the operators it supports have corresponding replaceable operators. It should be noted that a replaceable operator is an operator that is not supported by the target hardware but can be replaced by a supported one.
Illustratively, assume the featureshuffle operator is not supported by the target hardware, while the convfeatureshuffle operator is supported: the convfeatureshuffle operator is obtained by rearranging the computation data and designing the computation flow on top of the conv operator that the target hardware supports, and it performs the same data processing operation as the featureshuffle operator. The featureshuffle operator is therefore the replaceable operator corresponding to the convfeatureshuffle operator.
In the embodiment of the application, the device can determine the first-type operators from the plurality of operators included in the neural network model according to the support operator list and the replacement operator list corresponding to the target hardware. The support operator list includes the operators supported by the target hardware, and the replacement operator list records the mapping between operators supported by the target hardware and their corresponding replaceable operators.
It should be noted that the support operator list and the replacement operator list corresponding to the target hardware are determined in advance from the hardware characteristics of the target hardware and from the target platform reasoning component. The support operator list may include all operators supported by the target hardware and is used to determine whether each operator in the neural network model is supported; that is, it serves as a filtering condition to filter out unsupported operators. The replacement operator list may include the replaceable operators and their corresponding supported operators, and is used to determine whether an operator not supported by the target hardware has a replacement scheme among the supported operators.
In an embodiment of the present application, the device may first determine a computational graph of the neural network model, where the computational graph includes a plurality of computing nodes, each computing node may include one or more operators, and the computing nodes are arranged in their order of execution during forward reasoning. Then, in the order of the computing nodes, the device may sequentially determine the target operators not included in the support operator list among the one or more operators of each computing node, and take those target operators that are also not included in the replacement operator list as the first-type operators. That is, the device can filter out the first-type operators in the neural network model by traversing the computing nodes of the computational graph. It should be noted that the method for determining the computational graph of the neural network model may be selected according to the actual situation, which is not limited here.
Illustratively, suppose the support operator list includes the conv operator, pooling operator, Relu operator, sigmoid operator, convfeatureshuffle operator, and so on, and the replacement operator list pairs the convfeatureshuffle operator with its corresponding featureshuffle operator, indicating that the featureshuffle operator is the replaceable operator of the convfeatureshuffle operator. Assume one computing node of the neural network model's computational graph includes a featureshuffle operator, a conv operator, and a softmax operator. When this node is traversed, comparison against the support operator list shows that the conv operator is supported, while the featureshuffle and softmax operators are both target operators. Comparison against the replacement operator list then shows that the featureshuffle operator appears in that list, so a corresponding supported operator exists for it, whereas the softmax operator does not appear in the list; therefore, the softmax operator is determined to be a first-type operator.
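The three-way decision in this example can be written out directly (a toy sketch; the list contents simply follow the example above):

```python
def classify_node_ops(node_ops, supported, replacements):
    """Classify each operator of one computing node against the two lists."""
    result = {}
    for op in node_ops:
        if op in supported:
            result[op] = "supported"
        elif op in replacements:
            result[op] = "second type (replaceable)"
        else:
            result[op] = "first type"
    return result

supported = {"conv", "pooling", "relu", "sigmoid", "convfeatureshuffle"}
replacements = {"featureshuffle": "convfeatureshuffle"}
labels = classify_node_ops(["featureshuffle", "conv", "softmax"],
                           supported, replacements)
# labels["conv"] == "supported"
# labels["featureshuffle"] == "second type (replaceable)"
# labels["softmax"] == "first type"
```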
In the embodiment of the application, after the device determines the operators of the first type from the plurality of operators included in the neural network model, a first operator label can be added for each operator of the first type in the neural network model, the calculation parameters of the computing node to which the corresponding operator of the first type belongs are extracted, and the first operator labels and the extracted calculation parameters are stored correspondingly. The first operator label is used for uniquely identifying the corresponding operator and for indicating that the type of the corresponding operator is the first type.
It should be noted that the steps of adding the first operator label to an operator of the first type, extracting the corresponding calculation parameters and storing them may be executed during the traversal of the computing nodes included in the computational graph, that is, the steps of adding the label, extracting the calculation parameters and storing them are executed once each time an operator of the first type is determined.
Because an operator of the first type is an operator that is not supported by the target hardware and has no alternative scheme, the device marks the operator of the first type, extracts the corresponding calculation parameters, and stores each label together with the extracted calculation parameters, so that the corresponding calculation parameters can later be obtained according to the labels in order to execute the forward reasoning calculation.
The device may store the labels in correspondence with the calculation parameters in the form of a mapping table, or may store them in other forms. In addition, since multiple computing nodes may each include the same operator of the first type, in order to distinguish the first-type operators of the respective computing nodes, the device may use different first identification fields to indicate that these operators belong to different computing nodes, use the same second identification field to indicate that they are the same operator, and use the same third identification field to indicate that they are operators of the first type. The first identification field, the second identification field, and the third identification field together form the first operator label.
Illustratively, it is assumed that each of the computing node a and the computing node B includes one first type of operator and is a featuretype operator, the computing parameters of the computing node a include w1, w2, B1 and B2, the computing node a performs a data processing operation on the several computing parameters through the featuretype operator to perform a forward reasoning computation of the computing node a, the computing parameters of the computing node B include w3, w4, B3 and B4, the computing node B performs a data processing operation on the several computing parameters through the featuretype operator to perform a forward reasoning computation of the computing node B, and in addition, it is assumed that the device stores the labels and the computing parameters in a mapping table form correspondingly. The equipment firstly traverses to a computing node A according to the sequence of the computing node, determines that a featurerope operator is an operator of a first type, can add a first operator label as A-T1-S1 for the operator, and adds { A-T1-S1: [ w1, w2, b1, b2] } is added to the mapping table and stored. The device then traverses to the compute node B, determines that the featurerope operator is also an operator of the first type, may add a first operator tag to the operator as B-T1-S1, and will { B-T1-S1: [ w3, w4, b3, b4] } is added to the mapping table and stored. The first operator label of the featuretype operator included in the computing node A consists of a first identification field 'A', a second identification field 'S1' and a third identification field 'T1', and the first operator label of the featuretype operator included in the computing node B consists of a first identification field 'B', a second identification field 'S1' and a third identification field 'T1'.
From the foregoing, it can be seen that the device stores a support operator list and a replacement operator list, based on which the device can also determine, from the plurality of operators included in the neural network model, operators of a second type, where an operator of the second type is an operator that is not supported by the target hardware but for which a corresponding replaceable operator exists among the operators supported by the target hardware. The device may further replace the operators of the second type in the neural network model according to the replacement operator list to obtain an updated neural network model. That is, the device may determine both the first type and the second type of operators prior to forward reasoning.
Alternatively, the device may determine the second type of operator during the foregoing traversal of the computational graph, such that the device may determine both the first type of operator and the second type of operator with only one traversal to reduce overall time consumption. Alternatively, the device may traverse the computational graph again to determine the second type of operator.
In addition, when the device replaces the operators of the second type in the neural network model according to the replacement operator list, a replacement operation may be executed each time an operator of the second type is determined during the traversal, so that the updated neural network model is obtained when the traversal is completed. Alternatively, the device may first add a second operator label to each operator of the second type while traversing to determine the operators of the second type, and after the traversal ends, replace each operator of the second type in the neural network model according to the second operator labels and the replacement operator list to obtain the updated neural network model. Alternatively, the device may first add a second operator label to each operator of the second type during the traversal, and then replace the operators of the second type during forward reasoning. The second operator label is used for uniquely identifying the corresponding operator and for indicating that the type of the corresponding operator is the second type.
Fig. 2 is a flowchart of a method for determining an updated neural network model according to an embodiment of the present application. Referring to fig. 2, with the neural network model as input, the device may determine the computational graph of the neural network model, where the computational graph includes a plurality of computing nodes, sequentially traverse each computing node in the order of the plurality of computing nodes, and determine, according to the support operator list, whether an operator not supported by the target hardware exists among the operators included in the computing node currently traversed. If no unsupported operator exists, the device determines whether the traversal has ended; if the traversal has ended, the updated neural network model is output, and if not, the next computing node is traversed. If an unsupported operator exists, the device determines, according to the replacement operator list, whether the unsupported operator is a replaceable operator, that is, whether it is an operator of the second type. If it is a replaceable operator, the unsupported operator is replaced by the corresponding supported operator in the replacement operator list to obtain the updated neural network model, and then the step of determining whether the traversal has ended is executed. If it is not a replaceable operator, the corresponding operator is determined to be an operator of the first type, a first operator label is added for it, the corresponding calculation parameters are extracted, the first operator label and the corresponding calculation parameters are stored correspondingly, and then the step of determining whether the traversal has ended is executed.
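The traversal of Fig. 2 can be sketched end to end as follows. This is a hedged, simplified sketch under stated assumptions: nodes are plain dictionaries, each node is assumed to have at most one first-type operator (so the operator id field is fixed at "S1"), and the function name `adapt_model` is invented for illustration.

```python
def adapt_model(nodes, supported, replacements):
    """Traverse compute nodes once, replacing second-type operators and
    tagging/recording first-type operators, per the Fig. 2 flow."""
    tag_to_params = {}
    updated_nodes = []
    for node in nodes:
        new_ops = []
        for op in node["operators"]:
            if op in supported:
                new_ops.append(op)                # supported: keep as-is
            elif op in replacements:
                new_ops.append(replacements[op])  # second type: replace now
            else:
                tag = f"{node['name']}-T1-S1"     # first type: tag and record
                tag_to_params[tag] = node["params"]
                new_ops.append(op)
        updated_nodes.append({"name": node["name"], "operators": new_ops,
                              "params": node["params"]})
    return updated_nodes, tag_to_params           # updated model + label store

nodes = [{"name": "A", "operators": ["featureshape", "conv", "softmax"],
          "params": ["w1", "b1"]}]
updated, tags = adapt_model(nodes, {"conv", "convfeatureshape"},
                            {"featureshape": "convfeatureshape"})
print(updated[0]["operators"])  # ['convfeatureshape', 'conv', 'softmax']
print(tags)                     # {'A-T1-S1': ['w1', 'b1']}
```

After the single pass, featureshape has been replaced by its supported counterpart, while softmax remains in the graph with its label and parameters recorded for later use.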
Step 102: one or more registry operators corresponding to the operators of the first type are obtained, and the one or more registry operators are determined based on the domain-specific language.
In an embodiment of the present application, after determining the operators of the first type, the device may output the operators of the first type on the user interface to prompt the user to rewrite each operator of the first type in a domain-specific language.
It should be noted that a domain-specific language is a computer language focusing on a particular application domain; for example, HTML (Hyper Text Markup Language) for displaying web pages, regular expressions, SQL (Structured Query Language) for operating databases, and AWK for Linux can all be understood as domain-specific languages. The domain-specific language provided by the embodiment of the application is a hardware-independent deep learning operator computing language designed for developers and opened for users, so that users can rewrite the operators of the first type in this domain-specific language. Because the design of the domain-specific language is independent of the hardware, the development difficulty for users can be greatly reduced.
After the user writes each first type of operator according to the prompt to obtain the corresponding registry operator, the device may obtain the registry operator corresponding to each first type of operator. It should be noted that the data processing operation described by each registry operator is identical to the data processing operation described by the corresponding operator of the first type.
Step 103: the one or more registry operators are processed to obtain one or more target executable codes supported by the target hardware.
In the embodiment of the application, after obtaining the one or more registry operators, the device can process them, for example, interpret, compile and code-map the registry operators according to the hardware characteristics of the target hardware, so as to obtain one or more target executable codes supported by the target hardware. In this way, the target executable codes can implement the same data processing operations as the registry operators and complete the subsequent forward reasoning. One implementation of processing a registry operator to obtain target executable code supported by the target hardware is described next.
In the embodiment of the application, the device can process the one or more registry operators through an interpreter to obtain one or more target language code segments, where the one or more registry operators correspond one-to-one with the one or more target language code segments. The device may then compile the one or more target language code segments through a target compiler to obtain one or more target executable codes that correspond one-to-one with the one or more target language code segments, where the target compiler is a compiler matching the target hardware.
It should be noted that the target platform inference component developed based on the target hardware can run code of one programming language, that is, the target hardware supports only one programming language, and the language of the target language code segment obtained after the registry operator is processed by the interpreter is the programming language supported by the target hardware. The interpreter provided in the embodiment of the application can be configured in various ways. In one configuration, the device is provided with a single interpreter that is configured to output code of multiple programming languages, so that the interpreter can process a registry operator according to the programming language supported by the target hardware and then output a target language code segment in that language. In another configuration, multiple interpreters are configured in the device, each corresponding to only one programming language, that is, each interpreter is configured to output code of only one programming language; the device can then determine a target interpreter according to the programming language supported by the target hardware, and the target interpreter processes the registry operator to obtain the corresponding target language code segment.
In addition, in the embodiment of the application, multiple compilers are configured in the device, and each compiler can compile code of a corresponding programming language, that is, each compiler is matched with one piece of hardware. After the device obtains a target language code segment through the interpreter, the target language code segment can be input into the corresponding target compiler, and the target executable code supported by the target hardware can be obtained by compiling the target language code segment through the target compiler.
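The interpret-then-compile pipeline described above can be sketched as follows, using the second interpreter configuration (one interpreter per programming language). The hardware names, language choices, and string-producing stubs are all illustrative assumptions; a real system would emit and compile actual code rather than placeholder strings.

```python
# One interpreter per programming language (second configuration above).
INTERPRETERS = {
    "c": lambda registry_op: f"// C code for {registry_op}",
    "cuda": lambda registry_op: f"// CUDA code for {registry_op}",
}

# One compiler per piece of hardware, each tied to the language it compiles.
COMPILERS = {
    "hardware_x": ("c", lambda code: f"<hardware_x binary of: {code}>"),
    "hardware_y": ("cuda", lambda code: f"<hardware_y binary of: {code}>"),
}

def build_executable(registry_op, target_hardware):
    language, compile_fn = COMPILERS[target_hardware]
    target_language_code = INTERPRETERS[language](registry_op)  # interpret step
    return compile_fn(target_language_code)                     # compile step

print(build_executable("softmax_registry_op", "hardware_x"))
```

The target hardware selects both the target interpreter (via its supported language) and the target compiler, so the same registry operator yields different executables for different hardware without the user rewriting it.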
Because the target executable codes correspond one-to-one with the target language code segments, the target language code segments correspond one-to-one with the registry operators, and the registry operators correspond one-to-one with the operators of the first type, each operator of the first type also corresponds one-to-one with a target executable code. Based on this, after obtaining each target executable code, the device may store the first operator label of each operator of the first type in correspondence with the corresponding target executable code, that is, store the mapping relationship between first operator labels and target executable codes. Since the first operator labels of the same first-type operator in different computing nodes are different but the corresponding target executable code is the same, one target executable code may correspond to multiple different first operator labels in this mapping relationship.
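The many-to-one shape of this mapping can be shown in a few lines. The tag values reuse the earlier featureshape example; the dictionary-based store is an illustrative assumption.

```python
tag_to_code = {}  # stand-in for the stored label -> executable-code mapping

def bind_tags_to_code(first_operator_tags, executable_code):
    # The same first-type operator in different computing nodes carries
    # different labels but shares one target executable code, so several
    # tags may point at the same code object.
    for tag in first_operator_tags:
        tag_to_code[tag] = executable_code

bind_tags_to_code(["A-T1-S1", "B-T1-S1"], "<featureshape target executable>")
print(tag_to_code["A-T1-S1"] == tag_to_code["B-T1-S1"])  # True
```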
Step 104: forward reasoning is performed by the target hardware based on the one or more target executable codes and the neural network model.
In embodiments of the present application, after obtaining one or more target executable codes, the device may perform forward reasoning through a target platform reasoning component developed based on target hardware, based on the one or more target executable codes and the neural network model. Wherein the target executable code is for implementing the same data processing operations in the compute node as the corresponding first type of operator.
From the foregoing, the device may determine only the operators of the first type before performing forward reasoning; or it may determine both the first and second types of operators before forward reasoning and replace the operators of the second type according to the replacement operator list to obtain an updated neural network model; or it may only add labels to the operators of the second type. Based on this, in an embodiment of the present application, the device has three implementations of performing forward reasoning through the target hardware based on the one or more target executable codes and the neural network model, which are described below.
In a first implementation, before forward reasoning, the device determines only the operators of the first type. The device may then sequentially select one of the plurality of computing nodes in the order of the plurality of computing nodes included in the neural network model, until each of the plurality of computing nodes has performed the following operations: if it is determined according to the replacement operator list that an operator of the second type exists among the one or more operators included in the selected computing node, the operator of the second type is replaced according to the replacement operator list to obtain an updated computing node; if an operator of the first type exists among the one or more operators included in the updated computing node, the calculation parameters of the selected computing node and the target executable code corresponding to the operator of the first type are acquired according to the first operator label corresponding to that operator; and the forward reasoning calculation of the updated computing node is executed according to the acquired calculation parameters and target executable code.
In this implementation manner, in the process of performing forward reasoning, the device may first determine, when executing to one computing node, whether a second type of operator exists in the corresponding computing node according to the support operator list and the replacement operator list, and if the second type of operator exists, execute a replacement operation to obtain an updated computing node, and then perform forward reasoning calculation of the corresponding computing node. For the first type of operators in the computing nodes, the device can acquire corresponding computing parameters from the mapping relation between the stored labels and the computing parameters according to the first operator labels, and acquire corresponding target executable codes from the mapping relation between the stored labels and the target executable codes so as to perform forward reasoning computation of the corresponding computing nodes. That is, in this implementation, the device needs to determine the second type of operator according to the support operator list and the replacement operator list during the forward reasoning, and then perform the replacement operation corresponding to the corresponding computing node and the forward reasoning computation.
In a second implementation, before forward reasoning is performed, the device has determined the first type of operator and the second type of operator, and replaces the second type of operator according to the replacement operator list, so as to obtain an updated neural network model. In this way, the device may sequentially select one of the plurality of computing nodes in the order of the plurality of computing nodes included in the updated neural network model until each of the plurality of computing nodes has performed the following operations: if one or more operators included in the selected computing node exist an operator of a first type, acquiring computing parameters of the selected computing node and target executable codes corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the selected computing node; and executing forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
In this implementation manner, the updated neural network model does not include the second type of operator, but may include the first type of operator, and the device may perform the forward reasoning calculation of the corresponding computing node according to the updated neural network model and the computing parameters and the target executable codes corresponding to the first type of operator in each computing node in the forward reasoning process, and the specific implementation manner may refer to the related description in the foregoing first implementation manner and will not be repeated herein.
It should be noted that, in the second implementation manner, since the updated neural network model does not include the second type of operators, but only includes the first type of operators and the supported operators, the device only needs to process the existing first type of operators in the forward reasoning process, so that the speed of forward reasoning is increased compared with that in the first implementation manner.
Fig. 3 is a flow chart of a method for performing forward reasoning provided by an embodiment of the present application. It is assumed that, before forward reasoning is performed, the device determines the operators of the first type and the second type by traversing the computational graph once, and replaces the operators of the second type according to the replacement operator list to obtain an updated neural network model. Referring to fig. 3, the initial data for forward reasoning is input into the updated neural network model, and the forward reasoning calculations of the corresponding computing nodes are performed sequentially in the execution order of forward reasoning. When execution reaches a computing node, the device first determines whether an operator corresponding to a first operator label exists in that computing node, so as to determine whether an operator of the first type exists; if not, the forward reasoning calculation of the corresponding computing node is executed directly. If such an operator exists, the corresponding calculation parameters and the matching target executable code are acquired according to the first operator label; if the calculation parameters are not acquired or no matching target executable code is acquired, an error prompt is output to indicate that an unsupported operator exists and the forward reasoning cannot be completed, so as to remind the user to handle it. In this way, the device can output the forward reasoning result once every computing node in the neural network model has performed the corresponding forward reasoning calculation.
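The error path of Fig. 3 can be sketched as a lookup guard. The function name `fetch_for_tag` and the exception-based error prompt are illustrative assumptions; the patent only specifies that a prompt is output when either lookup fails.

```python
def fetch_for_tag(tag, tag_to_params, tag_to_code):
    """Look up the calculation parameters and target executable code for a
    first operator label; raise the error prompt if either is missing."""
    params = tag_to_params.get(tag)
    code = tag_to_code.get(tag)
    if params is None or code is None:
        # Error prompt: an unsupported operator remains, so the forward
        # reasoning cannot be completed and the user must handle it.
        raise RuntimeError(f"unsupported operator for tag {tag}: "
                           "forward reasoning cannot be completed")
    return params, code

tag_to_params = {"A-T1-S1": ["w1", "b1"]}
tag_to_code = {}  # no compiled executable was registered for this tag
try:
    fetch_for_tag("A-T1-S1", tag_to_params, tag_to_code)
except RuntimeError as e:
    print(e)  # the error prompt is surfaced to the user
```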
In a third implementation, prior to forward reasoning, the device has determined the first type of operator and the second type of operator, but has not replaced the second type of operator according to a list of replacement operators, but has added a second operator tag to the second type of operator. In this way, the apparatus may sequentially select one of the plurality of computing nodes in order of the plurality of computing nodes included in the neural network model until each of the plurality of computing nodes has performed the following operations: if one or more operators included in the selected computing node are provided with operators of a second type, replacing the operators of the second type according to a replacement operator list to obtain updated computing nodes; if one or more operators included in the selected computing node exist an operator of a first type, acquiring computing parameters of the selected computing node and target executable codes corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the selected computing node; and executing forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
In this implementation manner, the device may determine, during the forward reasoning process, whether an operator corresponding to the second operator label exists in the computing node for each execution to one computing node, so as to determine whether the second type of operator exists in the computing node, and if so, execute a replacement operation of the corresponding operator, so as to obtain an updated computing node. For the computing node with the first type of operator, the implementation manner of performing the corresponding forward reasoning computation may refer to the first implementation manner, and will not be described herein.
In the three implementations described above, the device may determine the first type of operator or the second type of operator according to the labels corresponding to the operators during forward reasoning, and in other possible implementations, whether the first type of operator and the second type of operator are determined before forward reasoning, the device may determine the two types of operators according to the support operator list and the replacement operator list during forward reasoning.
Fig. 4 is a flow chart of another reasoning method provided by an embodiment of the present application. Assuming that one or more pieces of hardware are assembled in the device, a corresponding platform reasoning component is developed based on each piece of hardware. Referring to fig. 4, a model adaptation tool, a domain-specific language compiler, and one or more platform reasoning components corresponding to the one or more hardware are configured in the device. Wherein the model adapting tool is configured with a list of support operators and a list of replacement operators corresponding to the various possible hardware (e.g. n types) to provide a filtering condition of operators supported by the various possible hardware and a replacement scheme of the replaceable operators, namely, a filtering condition of operators supported by the various possible platform reasoning components (e.g. n types) and a replacement scheme of the replaceable operators. The domain-specific language compiler is configured with one interpreter and a plurality of compilers, the interpreter being configured to output codes of various program languages.
First, the platform reasoning component on which the neural network model depends is determined as the target platform reasoning component; for example, the platform 1 reasoning component is the target platform reasoning component. The neural network model is input into the model adaptation tool, which can determine the operators of the first type and the second type in the neural network model by traversing the computational graph according to the filtering conditions and replacement schemes corresponding to the target platform reasoning component, replace the operators of the second type with operators supported by the target hardware, add first operator labels to the operators of the first type and extract the corresponding calculation parameters. Finally, the model adaptation tool outputs the updated neural network model and the operators of the first type.
Secondly, the user writes each operator of the first type using the domain-specific language according to the operators of the first type output by the model adaptation tool, obtaining the corresponding registry operators.
Then, the registry operators are used as input to the interpreter; the interpreter can process each registry operator according to the programming language supported by the target hardware to obtain the corresponding target language code segments, and each obtained target language code segment is input into the target compiler. The target compiler may compile each target language code segment to obtain each target executable code supported by the target hardware.
Finally, each target executable code and the updated neural network model are input into the target platform reasoning component to perform forward reasoning, and the forward reasoning result is output.
Alternatively, in an embodiment of the present application, the process of determining the first type of operator or the second type of operator in the neural network model before forward reasoning may be performed on a first device, which may output the updated neural network model and the first type of operator, that is, the device may be configured with a model adaptation tool. The process of writing the first type of operator by the user using the domain-specific language to obtain the registry operator may be performed on the second device. The processing of the registrar to obtain the target executable code may be performed on a third device, i.e. a device provided with a domain specific language compiler. Finally, according to the target executable code and the neural network model, the forward reasoning process through the target hardware can be completed on the fourth device, namely, the hardware assembled on the fourth device is the target hardware. In other words, in the embodiment of the present application, each process may be performed on one device, or may be performed independently on different devices, or may be performed partially on one device, and partially on another device, where each device may communicate to transmit data.
As can be seen from fig. 4, the model adaptation tool and the domain-specific language compiler provided by the embodiment of the application can serve together as a complete reasoning system, accommodating various hardware to complete forward reasoning that depends on different hardware.
In summary, in the embodiment of the present application, an operator that is not supported by the target hardware and has no corresponding replaceable operator among the operators supported by the target hardware may be determined, that is, an operator of the first type in the neural network model is determined; then the registry operator corresponding to the operator of the first type, determined based on the domain-specific language, is acquired and processed to obtain target executable code supported by the target hardware, and forward reasoning is performed through the target hardware according to the target executable code and the neural network model. Because a corresponding registry operator can be obtained for each operator of the first type based on the domain-specific language, and the design of the domain-specific language is independent of hardware, a user can write the corresponding registry operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for the same first-type operator possibly corresponding to different hardware, the user only needs to rewrite the operator once to obtain the corresponding registry operator, which can then be applied to different hardware, greatly reducing the development workload.
Fig. 5 is a schematic diagram of a structure of an inference apparatus provided in an embodiment of the present application, where the inference apparatus 500 may be implemented as part or all of a computer device by software, hardware, or a combination of both. Referring to fig. 5, the apparatus 500 includes: a first determination module 501, an acquisition module 502, a first processing module 503, and an inference module 504.
A first determining module 501, configured to determine, from a plurality of operators included in the neural network model, an operator of a first type, where the operator of the first type is not supported by the target hardware and there is no operator of a corresponding alternative operator in the operators supported by the target hardware;
an obtaining module 502, configured to obtain one or more registrants corresponding to the first type of operator, where the one or more registrants are determined based on a domain-specific language;
a first processing module 503, configured to process the one or more registry operators to obtain one or more target executable codes supported by the target hardware;
an inference module 504 for forward reasoning through the target hardware based on the one or more target executable codes and the neural network model.
Optionally, the first processing module 503 includes:
the interpretation unit is used for processing one or more registry operators through the interpreter to obtain one or more target language code segments, and the one or more registry operators are in one-to-one correspondence with the one or more target language code segments;
and a compiling unit, configured to compile the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding one-to-one with the one or more target language code segments, where the target compiler is a compiler matching the target hardware.
Optionally, the first determining module 501 includes:
a determining unit, configured to determine the operators of the first type from the plurality of operators according to a support operator list and a replacement operator list corresponding to the target hardware;
where the support operator list includes the operators supported by the target hardware, and the replacement operator list records a mapping relationship between operators and the corresponding replaceable operators supported by the target hardware.
Optionally, the determining unit includes:
a first determining subunit, configured to determine a computation graph of the neural network model, where the computation graph includes a plurality of computing nodes, each computing node includes one or more operators, and the plurality of computing nodes are arranged according to the execution order during forward reasoning;
and a second determining subunit, configured to sequentially determine, in the order of the plurality of computing nodes, target operators that are not included in the support operator list among the one or more operators included in each computing node, and to take the target operators included in the plurality of computing nodes that are not included in the replacement operator list as the operators of the first type.
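The two-list screening over the computation graph can be sketched as follows; the node layout, list contents, and operator names are illustrative assumptions, with the list order standing in for the forward-reasoning execution order.

```python
# Sketch of the two-list screening described above. An operator is of the first
# type when it is neither in the support list nor in the replacement list.

support_list = {"conv2d", "relu", "add"}
replacement_list = {"swish": "relu"}   # maps an operator to a supported substitute

graph = [                              # computing nodes in execution order
    {"node": "n0", "ops": ["conv2d", "swish"]},
    {"node": "n1", "ops": ["gelu", "add"]},
]

first_type = []
for node in graph:                     # traverse nodes in order
    for op in node["ops"]:
        if op not in support_list:     # target operator: not natively supported
            if op not in replacement_list:
                first_type.append(op)  # no substitute either -> first type
```

Here `swish` is filtered out by the replacement list, so only `gelu` survives as a first-type operator.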
Optionally, the apparatus 500 further includes:
a second processing module, configured to add a first operator label to each operator of the first type in the neural network model, extract the computation parameters of the computing node to which the corresponding operator of the first type belongs, and store the first operator label and the computation parameters correspondingly, where the first operator label uniquely identifies the corresponding operator and indicates that the type of the corresponding operator is the first type.
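The labelling step can be sketched as below. The label format, field names, and node contents are made up for illustration; the point is only that each first-type operator receives a unique label encoding its type, under which its node's computation parameters are stored.

```python
# Sketch of first-operator labelling: each first-type operator gets a unique
# label, and the computation parameters of its node are stored under it.

nodes = [
    {"name": "n0", "ops": ["gelu"], "params": {"axis": 1}},
    {"name": "n1", "ops": ["relu"], "params": {}},
    {"name": "n2", "ops": ["gelu"], "params": {"axis": 2}},
]
first_type_ops = {"gelu"}

label_store = {}
counter = 0
for node in nodes:
    for i, op in enumerate(node["ops"]):
        if op in first_type_ops:
            label = f"first:{op}:{counter}"       # unique label, encodes the type
            node["ops"][i] = (label, op)          # tag the operator in the model
            label_store[label] = node["params"]   # label <-> computation parameters
            counter += 1
```

During inference, this store lets a node's kernel and parameters be fetched by label alone (see the description of the inference module).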
Optionally, the apparatus 500 further includes:
a second determining module, configured to determine an operator of a second type from the plurality of operators included in the neural network model, where the operator of the second type is an operator that is not supported by the target hardware but has a corresponding replaceable operator among the operators supported by the target hardware;
and a replacing module, configured to replace the operators of the second type in the neural network model according to the replacement operator list to obtain an updated neural network model.
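The second-type replacement can be sketched as a single pass over the model's operators; the replacement list entries and operator names are illustrative assumptions.

```python
# Sketch of second-type replacement: operators with a supported substitute in
# the replacement operator list are swapped in place, yielding the updated model.

replacement_list = {"swish": "hard_swish", "mish": "relu"}

def replace_second_type(model_ops):
    # Keep supported and first-type operators as-is; swap second-type ones.
    return [replacement_list.get(op, op) for op in model_ops]

updated = replace_second_type(["conv2d", "swish", "mish", "gelu"])
```

After this pass, only first-type operators (here `gelu`) still need registry operators; everything else runs natively.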
Optionally, the inference module 504 is specifically configured to:
sequentially selecting one of the plurality of computing nodes according to the sequence of the plurality of computing nodes included in the updated neural network model until each of the plurality of computing nodes has performed the following operations:
if an operator of the first type exists among the one or more operators included in the selected computing node, acquiring, according to the first operator label corresponding to that operator, the computation parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node;
and performing the forward reasoning computation of the selected computing node according to the acquired computation parameters and target executable code.
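The node-by-node forward pass can be sketched as follows. Everything here is a toy stand-in: the label format, the `executables`/`param_store` tables, and the kernels themselves are hypothetical, with a compiled target executable represented by a Python callable.

```python
# Sketch of the node-by-node forward pass: for a node containing a first-type
# operator, the first operator label is used to look up both the stored
# computation parameters and the compiled target executable code.

executables = {"first:gelu:0": lambda x, scale: x * scale}   # label -> compiled kernel
param_store = {"first:gelu:0": {"scale": 3}}                 # label -> node parameters
builtin = {"relu": lambda x: max(0, x)}                      # natively supported ops

nodes = [
    {"ops": [("first:gelu:0", "gelu")]},   # node tagged with a first operator label
    {"ops": ["relu"]},
]

x = -2
for node in nodes:                          # execute nodes in execution order
    for op in node["ops"]:
        if isinstance(op, tuple):           # first type: fetch kernel + params by label
            label = op[0]
            x = executables[label](x, **param_store[label])
        else:
            x = builtin[op](x)
```

The tuple tag plays the role of the first operator label added earlier; untagged operators dispatch straight to the hardware's native implementations.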
Optionally, the inference module 504 is specifically configured to:
sequentially selecting one of the plurality of computing nodes in an order of the plurality of computing nodes included in the neural network model until each of the plurality of computing nodes has performed the following operations:
if it is determined according to the replacement operator list that an operator of the second type exists among the one or more operators included in the selected computing node, replacing the operator of the second type according to the replacement operator list to obtain an updated computing node;
if an operator of the first type exists among the one or more operators included in the updated computing node, acquiring, according to the first operator label corresponding to that operator, the computation parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the updated computing node;
and performing the forward reasoning computation of the updated computing node according to the acquired computation parameters and target executable code.
Optionally, the domain-specific language refers to a computer language applied to a specific application domain.
In summary, in the embodiments of the present application, an operator that is not supported by the target hardware and for which no corresponding replaceable operator exists among the operators supported by the target hardware may be determined; that is, an operator of a first type in the neural network model is determined. A registry operator corresponding to the operator of the first type and determined based on a domain-specific language is then acquired and processed to obtain target executable code supported by the target hardware, and forward reasoning is performed through the target hardware according to the target executable code and the neural network model. Because in this scheme a corresponding registry operator can be obtained for the operator of the first type based on the domain-specific language, and the design of the domain-specific language is independent of hardware, a user can write the corresponding registry operator without knowing the hardware characteristics of the device, which lowers the development difficulty. In addition, for the same operator of the first type that may correspond to different hardware, the user only needs to write the operator once to obtain the corresponding registry operator, which can then be applied to the different hardware, greatly reducing the development workload.
It should be noted that, in the reasoning apparatus provided in the above embodiment, the division into the above functional modules is merely used as an example for illustration. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the reasoning apparatus and the reasoning method provided in the foregoing embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, and details are not described herein again.
Fig. 6 is a block diagram of a computer device 600 according to an embodiment of the present application. The computer device 600 may be a terminal device or a server such as a desktop computer, a notebook computer, a tablet computer, a smart phone, etc.
In general, the computer device 600 includes: a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core processor or a 6-core processor. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed by the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the reasoning method provided by the method embodiments of the present application.
In some embodiments, the computer device 600 may further optionally include: a peripheral interface 603, and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 603 via buses, signal lines or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 604, a touch display 605, a camera 606, audio circuitry 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one I/O (Input/Output) related peripheral to the processor 601 and the memory 602. In some embodiments, the processor 601, the memory 602, and the peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 604 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 604 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 604 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 604 may communicate with other computer devices via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication) related circuits, which is not limited in the present application.
The display screen 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 601 as a control signal for processing. At this point, the display screen 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 605, disposed on the front panel of the computer device 600; in other embodiments, there may be at least two display screens 605, respectively disposed on different surfaces of the computer device 600 or in a folded design; in still other embodiments, the display screen 605 may be a flexible display screen disposed on a curved or folded surface of the computer device 600. Moreover, the display screen 605 may even be arranged in a non-rectangular irregular pattern, that is, a shaped screen. The display screen 605 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the computer device and the rear camera is disposed on the rear surface of the computer device. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize the background blurring function by fusing the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting functions by fusing the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, the camera assembly 606 may also include a flash. The flash may be a single-color temperature flash or a dual-color temperature flash. A dual-color temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing, or inputting the electric signals to the radio frequency circuit 604 for voice communication. The microphone may be provided in a plurality of different locations of the computer device 600 for stereo acquisition or noise reduction purposes. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 607 may also include a headphone jack.
The positioning component 608 is used to locate the current geographic location of the computer device 600 to enable navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 609 is used to power the various components in the computer device 600. The power source 609 may be alternating current, direct current, disposable battery or rechargeable battery. When the power source 609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the computer device 600 further includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyroscope sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the computer device 600. For example, the acceleration sensor 611 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 601 may control the touch display screen 605 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 611. The acceleration sensor 611 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 612 may detect the body direction and the rotation angle of the computer device 600, and the gyro sensor 612 may collect the 3D motion of the user on the computer device 600 in cooperation with the acceleration sensor 611. The processor 601 may implement the following functions based on the data collected by the gyro sensor 612: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 613 may be disposed on a side frame of the computer device 600 and/or at the lower layer of the touch display screen 605. When the pressure sensor 613 is disposed on a side frame of the computer device 600, a grip signal of the user on the computer device 600 can be detected, and the processor 601 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display screen 605, the processor 601 controls an operability control on the UI according to the user's pressure operation on the touch display screen 605. The operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 614 is used to collect the user's fingerprint, and the processor 601 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the user's identity according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 614 may be disposed on the front, back, or side of the computer device 600. When a physical button or a vendor logo is provided on the computer device 600, the fingerprint sensor 614 may be integrated with the physical button or the vendor logo.
The optical sensor 615 is used to collect ambient light intensity. In one embodiment, processor 601 may control the display brightness of touch display 605 based on the intensity of ambient light collected by optical sensor 615. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 605 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 605 is turned down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 based on the ambient light intensity collected by the optical sensor 615.
A proximity sensor 616, also referred to as a distance sensor, is typically provided on the front panel of the computer device 600. The proximity sensor 616 is used to capture the distance between the user and the front of the computer device 600. In one embodiment, when the proximity sensor 616 detects a gradual decrease in the distance between the user and the front of the computer device 600, the processor 601 controls the touch display 605 to switch from the bright screen state to the off screen state; when the proximity sensor 616 detects that the distance between the user and the front of the computer device 600 gradually increases, the touch display screen 605 is controlled by the processor 601 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is not limiting as to the computer device 600, and may include more or fewer components than shown, or may combine certain components, or employ a different arrangement of components.
In some embodiments, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the inference method of the above embodiments. For example, the computer readable storage medium may be ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
It is noted that the computer readable storage medium mentioned in the present application may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It should be understood that all or part of the steps to implement the above-described embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The computer instructions may be stored in the computer-readable storage medium described above.
That is, in some embodiments, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform the steps of the inference method described above.
The above embodiments are not intended to limit the present application, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present application should be included in the scope of the present application.

Claims (9)

1. A method of reasoning, the method comprising:
determining an operator of a first type from a plurality of operators included in a neural network model, wherein the operator of the first type refers to an operator that is not supported by target hardware and for which no corresponding replaceable operator exists among the operators supported by the target hardware;
acquiring one or more registry operators corresponding to the operator of the first type, wherein the one or more registry operators are determined based on a domain-specific language, the domain-specific language is a hardware-independent deep learning operator computing language, and each registry operator describes the same data processing operation as the corresponding operator of the first type;
processing the one or more registry operators through an interpreter to obtain one or more target language code segments, wherein the one or more registry operators are in one-to-one correspondence with the one or more target language code segments;
compiling the one or more target language code segments through a target compiler to obtain one or more target executable codes supported by the target hardware in one-to-one correspondence with the one or more target language code segments, wherein the target compiler is a compiler matched with the target hardware; and
performing forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
2. The method of claim 1, wherein determining the first type of operator from a plurality of operators included in the neural network model comprises:
determining the first type of operators from the operators according to a support operator list and a replacement operator list corresponding to the target hardware;
the support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and corresponding replaceable operators.
3. The method of claim 2, wherein determining the first type of operator from the plurality of operators according to the support operator list and the replacement operator list corresponding to the target hardware comprises:
determining a computation graph of the neural network model, wherein the computation graph includes a plurality of computing nodes, each computing node includes one or more operators, and the plurality of computing nodes are arranged according to the execution order during forward reasoning;
and sequentially determining, in the order of the plurality of computing nodes, target operators that are not included in the support operator list among the one or more operators included in each computing node, and taking the target operators included in the plurality of computing nodes that are not included in the replacement operator list as the operators of the first type.
4. The method of claim 1, wherein after determining the first type of operator from the plurality of operators included in the neural network model, further comprising:
adding a first operator label to each operator of the first type in the neural network model, extracting the computation parameters of the computing node to which the corresponding operator of the first type belongs, and storing the first operator label and the computation parameters correspondingly, wherein the first operator label is used to uniquely identify the corresponding operator and to indicate that the type of the corresponding operator is the first type.
5. The method of claim 4, wherein after determining the first type of operator from the plurality of operators included in the neural network model, further comprising:
determining a second type of operator from a plurality of operators included in the neural network model, wherein the second type of operator refers to an operator which is not supported by the target hardware and has a corresponding replaceable operator in operators supported by the target hardware;
and replacing the operators of the second type in the neural network model according to the replacement operator list to obtain an updated neural network model.
6. The method of claim 5, wherein said forward reasoning by said target hardware based on said one or more target executable codes and said neural network model comprises:
sequentially selecting one of the plurality of computing nodes according to the sequence of the plurality of computing nodes included in the updated neural network model until each of the plurality of computing nodes has performed the following operations:
if an operator of the first type exists among the one or more operators included in the selected computing node, acquiring, according to the first operator label corresponding to that operator, the computation parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node;
and performing the forward reasoning computation of the selected computing node according to the acquired computation parameters and target executable code.
7. The method of any of claims 1-4, wherein said forward reasoning by said target hardware based on said one or more target executable codes and said neural network model comprises:
sequentially selecting one of the plurality of computing nodes in an order of the plurality of computing nodes included in the neural network model until each of the plurality of computing nodes has performed the following operations:
if it is determined according to the replacement operator list that an operator of the second type exists among the one or more operators included in the selected computing node, replacing the operator of the second type according to the replacement operator list to obtain an updated computing node;
if an operator of the first type exists among the one or more operators included in the updated computing node, acquiring, according to the first operator label corresponding to that operator, the computation parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the updated computing node;
and performing the forward reasoning computation of the updated computing node according to the acquired computation parameters and target executable code.
8. The method of any of claims 1-6, wherein the domain-specific language is a computer language applied to a specific application domain.
9. An inference apparatus, the apparatus comprising:
a first determining module, configured to determine an operator of a first type from a plurality of operators included in a neural network model, where the operator of the first type is an operator that is not supported by target hardware and for which no corresponding replaceable operator exists among the operators supported by the target hardware;
an acquisition module, configured to acquire one or more registry operators corresponding to the operator of the first type, where the one or more registry operators are determined based on a domain-specific language, the domain-specific language is a hardware-independent deep learning operator computing language, and each registry operator describes the same data processing operation as the corresponding operator of the first type;
a first processing module, configured to process the one or more registry operators through an interpreter to obtain one or more target language code segments, where the one or more registry operators are in one-to-one correspondence with the one or more target language code segments, and to compile the one or more target language code segments through a target compiler to obtain one or more target executable codes supported by the target hardware in one-to-one correspondence with the one or more target language code segments, where the target compiler is a compiler matched with the target hardware;
and an inference module, configured to perform forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
CN202010244456.8A 2020-03-31 2020-03-31 Reasoning method and device Active CN113469360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010244456.8A CN113469360B (en) 2020-03-31 2020-03-31 Reasoning method and device

Publications (2)

Publication Number Publication Date
CN113469360A CN113469360A (en) 2021-10-01
CN113469360B true CN113469360B (en) 2023-10-20

Family

ID=77865445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010244456.8A Active CN113469360B (en) 2020-03-31 2020-03-31 Reasoning method and device

Country Status (1)

Country Link
CN (1) CN113469360B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115113528B (en) * 2022-07-06 2023-07-25 昆仑芯(北京)科技有限公司 Operation control method, device, equipment and medium of neural network model
CN115309407B (en) * 2022-10-12 2023-03-31 中国移动通信有限公司研究院 Method and system capable of realizing calculation power abstraction

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107786973A (en) * 2017-10-30 2018-03-09 清华大学深圳研究生院 Wireless network user method for secret protection and computer-readable recording medium
CN109919315A (en) * 2019-03-13 2019-06-21 科大讯飞股份有限公司 A kind of forward inference method, apparatus, equipment and the storage medium of neural network
CN110378413A (en) * 2019-07-17 2019-10-25 Oppo广东移动通信有限公司 Neural network model processing method, device and electronic equipment
CN110569106A (en) * 2019-08-27 2019-12-13 Oppo广东移动通信有限公司 Code loading method and device, electronic equipment and computer readable medium
CN110659070A (en) * 2018-06-29 2020-01-07 赛灵思公司 High-parallelism computing system and instruction scheduling method thereof
CN110866610A (en) * 2019-11-20 2020-03-06 苏州浪潮智能科技有限公司 Deep learning model distributed operation method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7245356B2 (en) * 2003-02-11 2007-07-17 Asml Netherlands B.V. Lithographic apparatus and method for optimizing illumination using a photolithographic simulation
EP3523751A4 (en) * 2016-10-04 2020-05-06 Magic Leap, Inc. Efficient data layouts for convolutional neural networks
US10186011B2 (en) * 2017-04-28 2019-01-22 Intel Corporation Programmable coarse grained and sparse matrix compute hardware with advanced scheduling
US10585703B2 (en) * 2017-06-03 2020-03-10 Apple Inc. Dynamic operation allocation for neural networks
US11188348B2 (en) * 2018-08-31 2021-11-30 International Business Machines Corporation Hybrid computing device selection analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Flexible Reasoning of Boolean Constraints in Recurrent Neural Networks with Dual Representation; Wonil Chang et al.; ICONIP; pp. 106-112 *
High-Performance Data Center Systems Based on Programmable NICs; Li Bojie; China Doctoral Dissertations Full-text Database, Information Science and Technology; p. I137-4 *

Also Published As

Publication number Publication date
CN113469360A (en) 2021-10-01

Similar Documents

Publication Publication Date Title
CN107885533B (en) Method and device for managing component codes
CN110262788B (en) Page configuration information determination method and device, computer equipment and storage medium
CN110490179B (en) License plate recognition method and device and storage medium
CN112256425B (en) Load balancing method and system, computer cluster, information editing method and terminal
CN110134744B (en) Method, device and system for updating geomagnetic information
CN110321126B (en) Method and device for generating page code
CN113064823B (en) Method and device for testing functions of parts of automobile and computer storage medium
CN109917988B (en) Selected content display method, device, terminal and computer readable storage medium
CN111459466B (en) Code generation method, device, equipment and storage medium
CN113469360B (en) Reasoning method and device
CN111400002B (en) Application process and processor core binding method and terminal
CN111753606A (en) Intelligent model upgrading method and device
CN107943484B (en) Method and device for executing business function
CN110377914B (en) Character recognition method, device and storage medium
CN111898535A (en) Target identification method, device and storage medium
CN113687816A (en) Method and device for generating executable code of operator
CN112925922A (en) Method, device, electronic equipment and medium for obtaining address
CN112230781A (en) Character recommendation method and device and storage medium
CN112990421B (en) Method, device and storage medium for optimizing operation process of deep learning network
CN110569064B (en) Interface identifier generation method, device, equipment and storage medium
CN111294320B (en) Data conversion method and device
CN113918252A (en) Interface display method and device, computer equipment and storage medium
CN113051485A (en) Group searching method, device, terminal and storage medium
CN109816047B (en) Method, device and equipment for providing label and readable storage medium
CN109388732B (en) Music map generating and displaying method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant