CN113469360A - Inference method and device

Info

Publication number
CN113469360A
CN113469360A
Authority
CN
China
Prior art keywords
operator, type, target, operators, neural network
Prior art date
Legal status
Granted
Application number
CN202010244456.8A
Other languages
Chinese (zh)
Other versions
CN113469360B (en)
Inventor
浦世亮 (Pu Shiliang)
叶挺群 (Ye Tingqun)
王鹏 (Wang Peng)
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010244456.8A
Publication of CN113469360A
Application granted
Publication of CN113469360B
Legal status: Active
Anticipated expiration

Classifications

    • G: Physics
    • G06: Computing; Calculating or Counting
    • G06N: Computing arrangements based on specific computational models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 5/04: Inference or reasoning models
    • G06N 5/045: Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
    • G06N 5/046: Forward inferencing; Production systems

Abstract

The application discloses an inference method and apparatus, belonging to the field of data processing. The method determines, among the operators of a neural network model, those that are not supported by the target hardware and have no corresponding replaceable operator, that is, the operators of the first type; obtains registration operators, written in a domain-specific language, that correspond to the operators of the first type; processes the registration operators to obtain target executable code supported by the target hardware; and performs forward inference through the target hardware according to the target executable code and the neural network model. Because the design of the domain-specific language is hardware-independent, a user can write registration operators without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for a first-type operator that may correspond to different hardware, the user only needs to write it once to obtain the corresponding registration operator, which can then be applied to the different hardware, so the development workload is low.

Description

Inference method and device
Technical Field
The present application relates to the field of data processing, and in particular, to an inference method and apparatus.
Background
An operator indicates a data processing operation and describes a computation; for example, neural networks typically include a basic convolution operator to indicate a convolution operation and a pooling operator to indicate a pooling operation. A user can also define a computation as needed, and such a user-defined computation may be called a custom operator. A neural network model is constructed and trained from basic operators and custom operators, and the trained model can then be deployed on various devices for forward inference.
Because hardware conditions may differ between devices, the operators that different devices can support may also differ. In general, devices support the basic operators but do not necessarily support custom operators. After a neural network model is trained, to make the custom operators in the model adapt to devices with various hardware conditions, the user has to manually convert the operators that each device does not support into executable code that the device can support, according to the hardware characteristics of that device. The development workload is huge, and the user must understand the hardware characteristics of every device, so the development difficulty is high. Therefore, a general inference method supporting various operators is needed, so that custom operators can automatically adapt to the corresponding device to complete forward inference.
Disclosure of Invention
The application provides an inference method and apparatus that can automatically adapt to hardware devices when processing operators not supported by different hardware, thereby reducing the user's development workload and difficulty. The technical scheme is as follows:
in one aspect, a method of reasoning is provided, the method comprising:
determining a first type of operator from a plurality of operators included in a neural network model, wherein the first type of operator refers to an operator which is not supported by target hardware and does not have a corresponding replaceable operator in the operators supported by the target hardware;
acquiring one or more registration operators corresponding to the first type of operator, wherein the one or more registration operators are determined and obtained based on a domain specific language;
processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware;
forward reasoning by the target hardware in accordance with the one or more target executable codes and the neural network model.
Optionally, the processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware includes:
processing the one or more registration operators through an interpreter to obtain one or more target language code segments, wherein the one or more registration operators correspond to the one or more target language code segments one to one;
compiling the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding to the one or more target language code segments one by one, wherein the target compiler is a compiler matched with the target hardware.
Optionally, the determining an operator of a first type from a plurality of operators included in the neural network model includes:
determining the operator of the first type from the plurality of operators according to a support operator list and a replacement operator list corresponding to the target hardware;
the support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and corresponding replaceable operators.
Optionally, the determining, according to the support operator list and the replacement operator list corresponding to the target hardware, the first type of operator from the plurality of operators includes:
determining a computational graph of the neural network model, wherein the computational graph comprises a plurality of computational nodes, each computational node comprises one or more operators, and the plurality of computational nodes are arranged according to an execution sequence in forward reasoning;
and sequentially determining a target operator which is not contained in the support operator list in one or more operators included in each computing node according to the sequence of the computing nodes, and taking the target operator which is not contained in the replacement operator list and included in the computing nodes as the first type of operator.
Optionally, after determining the operator of the first type from the plurality of operators included in the neural network model, the method further includes:
adding a first operator label to each operator of the first type in the neural network model, extracting a calculation parameter of a calculation node to which the corresponding operator of the first type belongs, and correspondingly storing the first operator label and the calculation parameter, wherein the first operator label is used for uniquely identifying the corresponding operator, and the first operator label is used for indicating that the type of the corresponding operator is the first type.
Optionally, after determining the operator of the first type from the plurality of operators included in the neural network model, the method further includes:
determining a second type of operator from a plurality of operators included in the neural network model, wherein the second type of operator refers to an operator which is not supported by the target hardware and has a corresponding replaceable operator in the operators supported by the target hardware;
and replacing the operator of the second type in the neural network model according to the replacement operator list to obtain an updated neural network model.
Optionally, said performing, by said target hardware, forward inference based on said one or more target executable codes and said neural network model, comprises:
according to the sequence of the plurality of computing nodes included in the updated neural network model, one of the plurality of computing nodes is sequentially selected to execute the following operations until each of the plurality of computing nodes has executed the following operations:
if the operator of the first type exists in one or more operators included in the selected computing node, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the selected computing node;
and executing the forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
Optionally, said performing, by said target hardware, forward inference based on said one or more target executable codes and said neural network model, comprises:
according to the sequence of a plurality of computing nodes included in the neural network model, one of the computing nodes is sequentially selected to execute the following operations until each of the computing nodes executes the following operations:
if it is determined that a second type of operator exists in one or more operators included in the selected computing node according to the replacement operator list, replacing the second type of operator according to the replacement operator list to obtain an updated computing node;
if the operator of the first type exists in one or more operators included in the updated computing node, acquiring the computing parameter of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the updated computing node;
and executing the forward reasoning calculation of the updated calculation node according to the acquired calculation parameters and the target executable code.
Optionally, the domain-specific language refers to a computer language focused on a particular application domain.
In another aspect, an inference apparatus is provided, the apparatus including:
a first determining module, configured to determine a first type of operator from a plurality of operators included in a neural network model, where the first type of operator refers to an operator that is not supported by the target hardware and for which no corresponding replaceable operator exists among the operators supported by the target hardware;
the acquisition module is used for acquiring one or more registration operators corresponding to the first type of operator, and the one or more registration operators are determined and obtained based on a domain specific language;
the first processing module is used for processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware;
and the reasoning module is used for carrying out forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
Optionally, the first processing module includes:
the interpretation unit is used for processing the one or more registration operators through the interpreter to obtain one or more target language code segments, and the one or more registration operators are in one-to-one correspondence with the one or more target language code segments;
and the compiling unit is used for compiling the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding to the one or more target language code segments one to one, and the target compiler is a compiler matched with the target hardware.
Optionally, the first determining module includes:
a determining unit, configured to determine, according to a support operator list and a replacement operator list corresponding to the target hardware, an operator of the first type from the multiple operators;
the support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and corresponding replaceable operators.
Optionally, the determining unit includes:
a first determining subunit, configured to determine a computational graph of the neural network model, where the computational graph includes a plurality of computational nodes, each computational node includes one or more operators, and the plurality of computational nodes are arranged according to an execution sequence in a forward inference;
and a second determining subunit, configured to determine, in sequence according to the order of the plurality of computing nodes, a target operator that is not contained in the support operator list among the one or more operators included in each computing node, and to take the target operators, included in the plurality of computing nodes but not contained in the replacement operator list, as the first type of operator.
Optionally, the apparatus further comprises:
the second processing module is configured to add a first operator tag to each operator of the first type in the neural network model, extract a calculation parameter of a calculation node to which the corresponding operator of the first type belongs, and store the first operator tag and the calculation parameter in a corresponding manner, where the first operator tag is used to uniquely identify the corresponding operator, and the first operator tag is used to indicate that the type of the corresponding operator is the first type.
Optionally, the apparatus further comprises:
a second determining module, configured to determine an operator of a second type from a plurality of operators included in the neural network model, where the operator of the second type is an operator that is not supported by the target hardware and a corresponding replaceable operator exists among the operators supported by the target hardware;
and the replacing module is used for replacing the operator of the second type in the neural network model according to the replacing operator list to obtain the updated neural network model.
Optionally, the inference module is specifically configured to:
according to the sequence of the plurality of computing nodes included in the updated neural network model, one of the plurality of computing nodes is sequentially selected to execute the following operations until each of the plurality of computing nodes has executed the following operations:
if the operator of the first type exists in one or more operators included in the selected computing node, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the selected computing node;
and executing the forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
Optionally, the inference module is specifically configured to:
according to the sequence of a plurality of computing nodes included in the neural network model, one of the computing nodes is sequentially selected to execute the following operations until each of the computing nodes executes the following operations:
if it is determined that a second type of operator exists in one or more operators included in the selected computing node according to the replacement operator list, replacing the second type of operator according to the replacement operator list to obtain an updated computing node;
if the operator of the first type exists in one or more operators included in the updated computing node, acquiring the computing parameter of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the updated computing node;
and executing the forward reasoning calculation of the updated calculation node according to the acquired calculation parameters and the target executable code.
Optionally, the domain-specific language refers to a computer language focused on a particular application domain.
In another aspect, a computer device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus; the memory is configured to store a computer program; and the processor is configured to execute the program stored in the memory to implement the steps of the inference method described above.
In another aspect, a computer-readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the inference method described above.
In another aspect, a computer program product is provided comprising instructions which, when run on a computer, cause the computer to perform the steps of the inference method described above.
The technical scheme provided by the application can at least bring the following beneficial effects:
in this application, an operator that is not supported by the target hardware and has no corresponding replaceable operator among the operators supported by the target hardware, that is, an operator of the first type in the neural network model, can be determined. A registration operator corresponding to the operator of the first type, written in a domain-specific language, is then obtained; the registration operator is processed to obtain target executable code supported by the target hardware; and forward inference is performed through the target hardware according to the target executable code and the neural network model. In this scheme, the registration operator corresponding to an operator of the first type is determined based on the domain-specific language, and because the design of the domain-specific language is hardware-independent, a user can write the corresponding registration operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for the same first-type operator that may correspond to different hardware, the user only needs to write the operator once to obtain the corresponding registration operator, which can then be applied to the different hardware, greatly reducing the development workload.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings required in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a flowchart of an inference method provided in an embodiment of the present application;
FIG. 2 is a flow chart of a method for determining an updated neural network model provided by an embodiment of the present application;
FIG. 3 is a flowchart of a method for forward reasoning based on an updated neural network model according to an embodiment of the present application;
FIG. 4 is a flow chart of another inference method provided by an embodiment of the present application;
fig. 5 is a schematic structural diagram of an inference device provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
An operator indicates a data processing operation and describes a computation. A neural network model can be constructed and trained from basic operators and custom operators and then deployed on various devices for forward inference. However, hardware conditions may differ between devices, so the operators that the hardware of different devices can support may also differ. To enable the custom operators in a neural network model to adapt to devices with various hardware conditions and execute the corresponding data processing operations, the application provides a general inference method: operators that the hardware of a device does not support can easily be converted into supported executable code to complete forward inference, without the user having to understand the hardware characteristics of each device.
For example, when forward inference such as face recognition, object detection, or image classification is required, a neural network model can be constructed and trained from basic operators such as convolution and pooling operators together with custom operators, and used for face recognition, object detection, image classification, and so on. After training, the neural network model needs to be deployed on each device to complete forward inference. However, the hardware of these devices may differ; for example, different devices may be equipped with chips designed and manufactured by different vendors, such as an x86 CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an ARM (Advanced RISC Machine) processor, an ASIC chip, or an AI (Artificial Intelligence) processor designed for deep learning computation. The computation cores, computation rates, supported computation precisions, chip bandwidths, and so on may differ between chips; that is, different hardware may have different hardware characteristics. Considering computation utilization, time consumption, resource occupation, and the like, forward inference is performed on platform inference components developed for specific hardware, and different hardware supports different operators; therefore, before a neural network model can complete forward inference on a given kind of hardware, its unsupported operators must be converted into executable code that the hardware supports. The technical solution provided by the application adapts easily to various hardware, so that the neural network model can complete forward inference such as face recognition, object detection, or image classification on various devices.
The following explains the inference method provided in the embodiments of the present application in detail.
Fig. 1 is a flowchart of an inference method according to an embodiment of the present application. Referring to fig. 1, the method includes the following steps.
Step 101: determining a first type of operator from a plurality of operators included in the neural network model, wherein the first type of operator refers to an operator which is not supported by the target hardware and does not have a corresponding replaceable operator in the operators supported by the target hardware.
In this embodiment of the present application, a device may be equipped with one or more kinds of hardware, and the neural network model to be deployed on the device must complete forward inference on one of them, referred to as the target hardware. The hardware characteristics of the target hardware, together with the target platform inference component developed for it, determine which operators the target hardware can support. Because the neural network model must complete forward inference on the target platform inference component, every operator in the model must be supported by the target platform inference component, that is, by the target hardware. Based on this, the device can determine the operator of the first type from the plurality of operators included in the neural network model, that is, the operator that is not supported by the target hardware and has no corresponding replaceable operator among the operators supported by the target hardware.
In this embodiment of the present application, the target hardware has both supported and unsupported operators, and some of the operators supported by the target hardware have corresponding replaceable operators. It should be noted that a replaceable operator refers to an operator that is not supported by the target hardware but can be replaced by a supported operator.
For example, suppose the featurescape operator is not supported by the target hardware while the convfeaturescape operator is. The convfeaturescape operator is obtained by rearranging the computation data and redesigning the computation stream based on the conv operator supported by the target hardware, and it describes the same data processing operation as the featurescape operator; that is, the featurescape operator is the replaceable operator corresponding to the convfeaturescape operator.
In this embodiment of the present application, the device may determine, according to the support operator list and the replacement operator list corresponding to the target hardware, an operator of the first type from among a plurality of operators included in the neural network model. The support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and the corresponding replaceable operators.
It should be noted that the support operator list and the replacement operator list corresponding to the target hardware are determined in advance according to the hardware characteristics of the target hardware and the target platform inference component. The support operator list may include all operators supported by the target hardware and is used to determine whether each operator included in the neural network model is supported by the target hardware; that is, the support operator list can be used as a filtering condition to screen out unsupported operators. The replacement operator list may include each replaceable operator and the corresponding supported operator, and is used to determine whether an operator in the neural network model that is not supported by the target hardware has a replacement scheme among the operators supported by the target hardware.
In an embodiment of the present application, a device may first determine a computational graph of a neural network model, where the computational graph includes a plurality of computational nodes, each computational node may include one or more operators, and the plurality of computational nodes may be arranged according to an execution order in a forward inference. Then, the device may sequentially determine, according to the order of the plurality of computing nodes, a target operator that is not included in the support operator list among the one or more operators included in each computing node, and use the target operator that is not included in the replacement operator list included in the plurality of computing nodes as the first type of operator. That is, the device may screen out the operator of the first type in the neural network model by traversing each computational node included in the computational graph. It should be noted that the method for determining the calculation graph of the neural network model may be selected according to the actual implementation, and is not limited herein.
For example, suppose the support operator list includes the conv, pooling, Relu, sigmoid, and convfeaturescape operators, and the replacement operator list contains the convfeaturescape operator and its corresponding featurescape operator, indicating that the featurescape operator is the replaceable operator of the convfeaturescape operator. Suppose a compute node of the computational graph of a neural network model includes a featurescape operator, a conv operator, and a softmax operator. When traversing this node, the device can determine from the support operator list that the conv operator is supported while the featurescape and softmax operators are not, so both are determined to be target operators. Consulting the replacement operator list, the featurescape operator is contained in it and thus has a corresponding replaceable scheme, while the softmax operator is not contained in it and has none; the softmax operator can therefore be determined to be an operator of the first type.
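As a minimal sketch of this screening rule (the list contents, operator names, and helper names below are illustrative assumptions, not the patent's implementation), the two lists and the three-way classification can be expressed as follows:

```python
SUPPORTED_OPERATORS = {"conv", "pooling", "relu", "sigmoid", "convfeaturescape"}

# Replacement operator list: supported operator -> the replaceable
# (unsupported) operator it can stand in for.
REPLACEMENT_OPERATORS = {"convfeaturescape": "featurescape"}

REPLACEABLE = set(REPLACEMENT_OPERATORS.values())


def classify_operator(name: str) -> str:
    """Classify one operator relative to the target hardware."""
    if name in SUPPORTED_OPERATORS:
        return "supported"
    if name in REPLACEABLE:
        return "second type"  # unsupported, but a replacement scheme exists
    return "first type"       # unsupported, no replacement: needs a registration operator


# The compute node from the example above: {featurescape, conv, softmax}
for op in ("featurescape", "conv", "softmax"):
    print(op, "->", classify_operator(op))
# featurescape -> second type; conv -> supported; softmax -> first type
```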
In this embodiment of the application, after the device determines the first type of operator from the multiple operators included in the neural network model, a first operator tag may be added to each first type of operator in the neural network model, the calculation parameters of the calculation node to which the corresponding first type of operator belongs are extracted, and the first operator tag and the extracted calculation parameters are stored correspondingly. The first operator label is used for uniquely identifying the corresponding operator, and the first operator label is used for indicating that the type of the corresponding operator is the first type.
It should be noted that the step of adding the first operator label to the first type of operator, extracting the corresponding calculation parameter, and storing the calculation parameter correspondingly by the device may be performed in the process of traversing each calculation node included in the calculation graph, that is, the step of adding the label, extracting the calculation parameter, and storing the calculation parameter correspondingly is performed once every time one first type of operator is determined.
Because the operator of the first type is an operator which is not supported by the target hardware and has no alternative scheme, the device can mark the operator of the first type, extract the corresponding calculation parameters, and correspondingly store each label and the extracted calculation parameters, so that the corresponding calculation parameters can be obtained according to the labels in the following process to execute the forward reasoning calculation.
It should be noted that the device may store the tag and the calculation parameters in correspondence in the form of a mapping table, or in another form. In addition, since multiple computing nodes may contain the same operator of the first type, in order to distinguish the first-type operator of each computing node, the device may use different first identification fields to indicate that the operators belong to different computing nodes, the same second identification field to indicate that they are the same operator, and the same third identification field to indicate that they are operators of the first type; the first identification field, second identification field, and third identification field together form the first operator label.
For example, suppose compute node A and compute node B each include an operator of the first type, namely the featurescape operator. The calculation parameters of node A are w1, w2, b1, and b2; node A applies the featurescape operator's data processing operation to these parameters to perform its forward inference computation. The calculation parameters of node B are w3, w4, b3, and b4, used in the same way. Suppose further that labels and calculation parameters are stored in correspondence as a mapping table. Traversing the compute nodes in order, the device first reaches node A, determines that its featurescape operator is an operator of the first type, adds the first operator label A-T1-S1 to it, and adds {A-T1-S1: [w1, w2, b1, b2]} to the mapping table. It then reaches node B, determines that its featurescape operator is also an operator of the first type, adds the first operator label B-T1-S1 to it, and adds {B-T1-S1: [w3, w4, b3, b4]} to the mapping table. The first operator label of the featurescape operator in node A consists of the first identification field 'A', the second identification field 'S1', and the third identification field 'T1'; the label in node B consists of the first identification field 'B', the second identification field 'S1', and the third identification field 'T1'.
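The label-and-parameter bookkeeping in this example can be sketched as follows; the table layout and function names are assumptions for illustration, with the A-T1-S1 field convention taken from the text:

```python
# Mapping table from first operator labels to calculation parameters.
calc_param_table: dict[str, list[str]] = {}


def first_operator_label(node_id: str, op_index: int) -> str:
    # first field: compute node; "T1": type is first type; "S<i>": which operator
    return f"{node_id}-T1-S{op_index}"


def tag_and_store(node_id: str, op_index: int, params: list[str]) -> str:
    label = first_operator_label(node_id, op_index)
    calc_param_table[label] = params  # store label -> calculation parameters
    return label


tag_and_store("A", 1, ["w1", "w2", "b1", "b2"])
tag_and_store("B", 1, ["w3", "w4", "b3", "b4"])
print(calc_param_table)
# {'A-T1-S1': ['w1', 'w2', 'b1', 'b2'], 'B-T1-S1': ['w3', 'w4', 'b3', 'b4']}
```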
As can be seen from the foregoing, the device stores a support operator list and a replacement operator list, and based on this, the device may further determine an operator of a second type from the plurality of operators included in the neural network model, where the operator of the second type is an operator that is not supported by the target hardware and has a corresponding replaceable operator among the operators supported by the target hardware. The device can also replace the operator of the second type in the neural network model according to the replacement operator list to obtain the updated neural network model. That is, the device may determine the operator of the first type and the operator of the second type before performing the forward inference.
Alternatively, the device may determine the operator of the second type in the process of traversing the computation graph, so that the device may determine both the operator of the first type and the operator of the second type through only one traversal, thereby reducing the total time consumption. Alternatively, the device may traverse the computational graph again to determine the second type of operator.
In addition, the device performs a process of replacing the second type of operator in the neural network model according to the replacement operator list, where the replacement operation is performed once every time one second type of operator is determined in the traversal process, and when the traversal is finished, the updated neural network model can be obtained. Or, the device may add a second operator label to each second type operator in the process of determining the second type operator in a traversal manner, and replace each second type operator in the neural network according to the second operator label and the replacement operator list after the traversal is completed, so as to obtain an updated neural network model. Or, the device may add a second operator label to each second type operator in the process of determining the second type operator in a traversal manner, and replace the second type operator in the process of forward inference. The second operator label is used for uniquely identifying the corresponding operator, and the second operator label is used for indicating that the type of the corresponding operator is the second type.
Fig. 2 is a flowchart of a method for determining an updated neural network model according to an embodiment of the present disclosure. Referring to Fig. 2, the device takes the neural network model as input and determines its computational graph, which includes a plurality of compute nodes. It then traverses the compute nodes in order. For the current node, it checks against the support operator list whether any of the node's operators is not supported by the target hardware. If no unsupported operator exists, it checks whether the traversal is complete; if so, it outputs the updated neural network model, and otherwise it moves to the next node. If an unsupported operator exists, the device checks against the replacement operator list whether it is a replaceable operator, that is, an operator of the second type. If it is, the device replaces it with the corresponding supported operator from the replacement operator list to obtain an updated neural network model, then proceeds to the traversal-complete check. If it is not replaceable, the operator is determined to be of the first type: the device adds a first operator label to it, extracts the corresponding calculation parameters, stores the label and parameters in correspondence, and then proceeds to the traversal-complete check.
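A hedged sketch of the Fig. 2 traversal, reusing the classify_operator helper and REPLACEMENT_OPERATORS list assumed in the earlier snippets (the ComputeNode representation is hypothetical):

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:                  # hypothetical node representation
    node_id: str
    operators: list[str]
    calc_params: list[str] = field(default_factory=list)


REPLACEABLE_TO_SUPPORTED = {v: k for k, v in REPLACEMENT_OPERATORS.items()}


def adapt_model(nodes: list[ComputeNode]):
    """One traversal of the computational graph: replace every second-type
    operator and tag every first-type operator, as in Fig. 2."""
    calc_param_table = {}                        # first operator label -> parameters
    for node in nodes:                           # nodes in forward-inference order
        for i, name in enumerate(node.operators, start=1):
            kind = classify_operator(name)       # from the earlier sketch
            if kind == "second type":
                # swap in the supported operator from the replacement list
                node.operators[i - 1] = REPLACEABLE_TO_SUPPORTED[name]
            elif kind == "first type":
                label = f"{node.node_id}-T1-S{i}"
                calc_param_table[label] = node.calc_params
    return nodes, calc_param_table               # updated model + label/param store
```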
Step 102: and acquiring one or more registration operators corresponding to the first type of operator, wherein the one or more registration operators are determined and obtained based on the domain specific language.
In an embodiment of the present application, after determining the operators of the first type, the device may output them to a user interface to prompt the user to write each operator of the first type in a domain-specific language (DSL).
It should be noted that a domain-specific language is a computer language focused on a particular application domain; for example, HTML (Hyper Text Markup Language) for displaying web pages, regular expressions, SQL (Structured Query Language) for operating on databases, and AWK on Linux can all be understood as domain-specific languages. The domain-specific language provided by this embodiment is a hardware-independent deep-learning operator computation language designed by developers and opened for users, so that users can rewrite operators of the first type in it; because the design of the language is hardware-independent, the user's development difficulty can be greatly reduced.
After the user writes each operator of the first type as prompted and obtains the corresponding registration operator, the device can acquire the registration operator corresponding to each operator of the first type. It should be noted that the data processing operation described by each registration operator is the same as that described by the corresponding operator of the first type.
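The patent does not disclose the syntax of the domain-specific language, so the following is purely illustrative: one plausible shape for a registration operator for the softmax operator, written as a Python-embedded DSL in which every imported name is hypothetical:

```python
# All names below (the dsl module, register_operator, exp, reduce_sum,
# max_along) are hypothetical; the patent does not disclose the DSL's syntax.
from dsl import register_operator, exp, reduce_sum, max_along


@register_operator("softmax")        # bind the registration operator to the
def softmax(x, axis=-1):             # first-type operator it implements
    # Same data processing operation as the softmax operator, expressed only
    # in hardware-independent DSL primitives.
    e = exp(x - max_along(x, axis))  # numerically stable exponent
    return e / reduce_sum(e, axis)
```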
Step 103: and processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware.
In this embodiment of the present application, after obtaining the one or more registration operators, the device may process them, for example by interpreting, compiling, and code-mapping the registration operators according to the hardware characteristics of the target hardware, to obtain one or more target executable codes supported by the target hardware, so that the target executable code implements the same data processing operation as the registration operator and completes the subsequent forward inference. One implementation of processing the registration operator to obtain the target executable code is described next.
In this embodiment, the device may process the one or more registration operators through the interpreter to obtain one or more target language code segments, where the one or more registration operators are in one-to-one correspondence with the one or more target language code segments. Then, the device may compile the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding to the one or more target language code segments one to one. The target compiler is a compiler matched with the target hardware.
It should be noted that the target platform inference component developed based on the target hardware may run a code of one program language, that is, the target hardware only supports one program language, and the language of the target language code segment obtained after the interpreter processes the registration operator is the program language supported by the target hardware. The interpreter provided in the embodiment of the present application may have multiple configurations, where one configuration is that a device is configured with one interpreter, and the interpreter is configured to output codes of various program languages, so that the interpreter may process a registration operator according to a program language supported by target hardware, and output a target language code segment of the program language supported by the target hardware. Another configuration is that a plurality of interpreters are configured in the device, each interpreter corresponds to only one program language, that is, each interpreter is configured to output codes of only one program language, so that the device can determine a target interpreter according to a program language supported by target hardware, and process a registration operator through the target interpreter to obtain a corresponding target language code segment.
In addition, in the embodiment of the present application, a plurality of compilers are configured in the device, and each compiler may compile a code of a corresponding program language, that is, each compiler corresponds to a kind of hardware. After the device obtains the target language code segment through the interpreter, the device can input the target language code segment into a corresponding target compiler, and the target language code segment is compiled through the target compiler to obtain a target executable code supported by target hardware.
Each target executable code corresponds one-to-one to a target language code segment, each target language code segment corresponds one-to-one to a registration operator, and each registration operator corresponds one-to-one to an operator of the first type; therefore, each operator of the first type also corresponds one-to-one to a target executable code. Based on this, after obtaining the target executable codes, the device may store the first operator label of each first-type operator in correspondence with the corresponding target executable code, that is, store a mapping between first operator labels and target executable codes. Because the first operator labels of the same first-type operator in different compute nodes differ while the corresponding target executable code is the same, one target executable code may correspond to several different first operator labels in this mapping.
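The interpret-compile-map pipeline of this step, together with the one-code-to-many-labels mapping just described, might look like the following sketch; the interpreter.emit and target_compiler.compile interfaces are assumptions, not a real API:

```python
def build_executables(registration_ops, interpreter, target_compiler,
                      target_language, labels_by_op):
    """registration operator -> target language code segment -> target
    executable code, fanned out to every first operator label that uses it."""
    executable_by_label = {}
    for op in registration_ops:
        # one registration operator -> one target language code segment
        segment = interpreter.emit(op, language=target_language)
        # one code segment -> one target executable code, via the compiler
        # matched with the target hardware
        executable = target_compiler.compile(segment)
        for label in labels_by_op[op.name]:
            executable_by_label[label] = executable  # one code, many labels
    return executable_by_label
```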
Step 104: forward reasoning is performed by the target hardware according to the one or more target executable codes and the neural network model.
In an embodiment of the application, after obtaining the one or more target executable codes, the device may perform forward reasoning through a target platform reasoning component developed based on the target hardware according to the one or more target executable codes and the neural network model. Wherein the target executable code is for implementing the same data processing operations in the compute node as the corresponding first type of operator.
As can be seen from the foregoing, the device may determine only the first type of operator before performing the forward inference, or the device may determine the first type of operator and the second type of operator before performing the forward inference, and replace the second type of operator according to the replacement operator list to obtain the updated neural network model, or add a label only to the second type of operator. Based on this, in the embodiment of the present application, the device performs forward inference through target hardware according to the one or more target executable codes and the neural network model, and three implementations thereof will be described below.
In a first implementation, before performing the forward inference, the device determines only the operator of the first type, so that the device may sequentially select one of the plurality of computing nodes to perform the following operations according to an order of the plurality of computing nodes included in the neural network model, until each of the plurality of computing nodes has performed the following operations: if the operator of the second type exists in one or more operators included in the selected computing node according to the replacement operator list, replacing the operator of the second type according to the replacement operator list to obtain an updated computing node; if the operator of the first type exists in one or more operators included in the updated computing node, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to the first operator label corresponding to the operator of the first type included in the updated computing node; and executing the forward reasoning calculation of the updated calculation node according to the acquired calculation parameters and the target executable code.
In this implementation manner, in the process of performing forward inference, when a computing node is executed, the device may first determine whether an operator of the second type exists in the corresponding computing node according to the support operator list and the replacement operator list, if an operator of the second type exists, perform a replacement operation to obtain an updated computing node, and then perform forward inference computation of the corresponding computing node. For the first type of operator existing in the computation node, the device may obtain, according to the first operator tag, a corresponding computation parameter from a mapping relationship between the stored tag and the computation parameter, and obtain, from a mapping relationship between the stored tag and the target executable code, a corresponding target executable code, so as to perform forward inference computation of the corresponding computation node. That is, in this implementation, the device needs to determine the operator of the second type according to the support operator list and the replacement operator list during the forward inference process, and then perform the replacement operation corresponding to the corresponding computing node and the forward inference computation.
In a second implementation manner, before forward inference, the device determines the first type of operator and the second type of operator, and replaces the second type of operator according to the replacement operator list, so as to obtain an updated neural network model. In this way, the apparatus may sequentially select one of the plurality of computing nodes to perform the following operations in an order of the plurality of computing nodes included in the updated neural network model until each of the plurality of computing nodes has performed the following operations: if one or more operators included in the selected computing node have the operator of the first type, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to the first operator label corresponding to the operator of the first type included in the selected computing node; and executing forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
In this implementation, the updated neural network model does not include the second type of operator, but may include the first type of operator, and the device may execute the forward inference calculation of the corresponding calculation node according to the updated neural network model and the calculation parameter and the target executable code corresponding to the first type of operator in each calculation node in the forward inference process.
It should be noted that, in the second implementation, since the updated neural network model does not include the second type of operator, but may only include the first type of operator and the supported operator, the device only needs to process the existing first type of operator in the forward inference process, which speeds up the forward inference speed compared to the first implementation.
Fig. 3 is a flowchart of a method for performing forward inference according to an embodiment of the present application. Assume that, before forward inference, the device has determined the operators of the first type and of the second type in a single traversal of the computational graph and has replaced the second-type operators according to the replacement operator list, obtaining the updated neural network model. Referring to Fig. 3, the initial data for forward inference is input to the updated neural network model, and the forward inference computations of the compute nodes are executed in order. For each compute node, the device first checks whether the node contains an operator carrying a first operator label, that is, an operator of the first type; if not, it executes the node's forward inference computation directly. If it does, the device fetches the corresponding calculation parameters and the matching target executable code according to the first operator label. If both are obtained, it executes the node's forward inference computation with them; if either the calculation parameters or the matching target executable code cannot be obtained, it outputs an error indicating that an unsupported operator exists and forward inference cannot be completed, reminding the user to handle it. Once every compute node in the neural network model has executed its forward inference computation, the device outputs the forward inference result.
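A sketch of this loop for the second implementation (the updated model no longer contains second-type operators), under the same assumed data structures as the earlier snippets; op.bind and node.run are hypothetical:

```python
def forward_inference(nodes, initial_data, calc_param_table, executable_by_label):
    data = initial_data
    for node in nodes:                            # execution order of forward inference
        for op in node.operators:
            label = getattr(op, "label", None)    # first operator label, if any
            if label is None:
                continue                          # supported operator: nothing to fetch
            params = calc_param_table.get(label)
            code = executable_by_label.get(label)
            if params is None or code is None:
                raise RuntimeError(
                    f"unsupported operator {label}: forward inference cannot complete")
            op.bind(params, code)                 # hypothetical: attach params + executable
        data = node.run(data)                     # forward inference computation of the node
    return data                                   # the forward inference result
```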
In a third implementation manner, before performing forward inference, the device determines the first type of operator and the second type of operator, but does not replace the second type of operator according to the replacement operator list, and adds a second operator tag to the second type of operator. In this way, the apparatus may sequentially select one of the plurality of computing nodes to perform the following operations in an order of the plurality of computing nodes included in the neural network model until each of the plurality of computing nodes has performed the following operations: if one or more operators included in the selected computing node have the second type of operator, replacing the second type of operator according to the replacement operator list to obtain an updated computing node; if one or more operators included in the selected computing node have the operator of the first type, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to the first operator label corresponding to the operator of the first type included in the selected computing node; and executing forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
In this implementation manner, in the forward inference process, the device may determine, every time a computing node is executed, whether the computing node has an operator corresponding to the second operator tag, to determine whether the computing node has an operator of the second type, and if so, execute a replacement operation of the corresponding operator to obtain an updated computing node. For the calculation node having the first type of operator, the implementation manner of performing the corresponding forward inference calculation may also refer to the foregoing first implementation manner, and details are not described here.
In the three implementations described above, the device identifies the first type of operator or the second type of operator during forward inference according to the label previously attached to the operator. In other possible implementations, the first type of operator and the second type of operator need not be determined before forward inference at all: the device may identify both types during forward inference directly according to the support operator list and the replacement operator list.
Fig. 4 is a flowchart of another inference method provided in an embodiment of the present application. It is assumed that the device is equipped with one or more pieces of hardware and that a corresponding platform inference component has been developed for each piece of hardware. Referring to fig. 4, a model adaptation tool, a domain-specific language compiler, and one or more platform inference components corresponding to the one or more pieces of hardware are configured in the device. The model adaptation tool is configured with the support operator lists and replacement operator lists corresponding to the various possible kinds of hardware (for example, n kinds), so as to provide the filtering conditions and replacement schemes for the operators supported by each kind of hardware, that is, by each of the possible platform inference components. The domain-specific language compiler is configured with one interpreter and a plurality of compilers, and the interpreter is configured to output code in various programming languages.
First, the platform inference component on which the neural network model will depend is determined as the target platform inference component; for example, the platform 1 inference component is the target platform inference component. The neural network model is then input into the model adaptation tool. By traversing the computation graph according to the filtering condition and replacement scheme corresponding to the target platform inference component, the model adaptation tool determines the first type of operator and the second type of operator in the neural network model, replaces the second type of operator with an operator supported by the target hardware, adds a first operator label to the first type of operator, and extracts the corresponding calculation parameters. Finally, the model adaptation tool outputs the updated neural network model and the first type of operator.
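A rough sketch of this adaptation step follows, assuming the computation graph is exposed as an ordered node list; support_ops, replacement_ops, and node.params stand in for the support operator list, the replacement operator list, and the node's calculation parameters, and are not names defined by the present application.

    def adapt_model(nodes, support_ops, replacement_ops, param_store):
        # One traversal of the computation graph, in forward-inference execution order.
        first_type = []
        for node in nodes:
            for i, op in enumerate(node.ops):
                if op.name in support_ops:
                    continue                                  # supported as-is
                if op.name in replacement_ops:                # second type: replace it
                    node.ops[i] = replacement_ops[op.name](op)
                else:                                         # first type: label + parameters
                    op.first_op_label = "first:%s:%d" % (op.name, id(op))
                    param_store[op.first_op_label] = node.params
                    first_type.append(op)
        return nodes, first_type   # updated model plus the operators the user must write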
Secondly, according to the operators of the first type output by the model adaptation tool, the user writes each operator of the first type in the domain-specific language to obtain the corresponding registration operator.
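The present application does not fix a concrete grammar for the domain-specific language, so the snippet below is purely hypothetical: it imitates a registration operator by embedding the DSL in Python through a register_op decorator, with the operator body written against abstract element-wise operations rather than any hardware primitive.

    REGISTRY = {}

    def register_op(name):
        # Associates an operator name with its hand-written, hardware-independent body.
        def wrap(fn):
            REGISTRY[name] = fn
            return fn
        return wrap

    @register_op("leaky_clip")        # an imagined first-type operator
    def leaky_clip(xs, lo, hi, slope):
        # Element-wise body written against abstract operations, not hardware primitives.
        return [lo + slope * (x - lo) if x < lo else min(x, hi) for x in xs]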
Then, the registration operators are used as the input of the interpreter. The interpreter processes each registration operator according to the programming language supported by the target hardware to obtain the corresponding target-language code segment, and inputs each target-language code segment into the target compiler. The target compiler compiles each target-language code segment to obtain the target executable codes supported by the target hardware.
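A minimal sketch of this interpreter/compiler hand-off, assuming the target hardware's toolchain is an ordinary C compiler reachable as cc; emit_c and the command line are illustrative stand-ins for the interpreter output and the target compiler, not defined by the present application.

    import pathlib
    import subprocess
    import tempfile

    def emit_c(op_name: str, body: str) -> str:
        # Interpreter step: lower a registration operator to a target-language code segment.
        return "float %s(float x) { %s }\n" % (op_name, body)

    def compile_segment(c_source: str, out_path: str) -> str:
        # Compiler step: build the target executable code (here, a shared object).
        src = pathlib.Path(tempfile.mkdtemp()) / "op.c"
        src.write_text(c_source)
        subprocess.run(["cc", "-shared", "-fPIC", "-O2", str(src), "-o", out_path], check=True)
        return out_path

    # Example: compile_segment(emit_c("relu_neg", "return x > 0 ? x : 0.1f * x;"), "op.so")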
Finally, each target executable code and the updated neural network model are input into the target platform inference component, which performs forward inference and outputs the forward inference result.
Optionally, in this embodiment of the present application, the process of determining the first type of operator or the second type of operator in the neural network model before forward inference may be completed on a first device, which outputs the updated neural network model and the first type of operator; that is, the first device is configured with the model adaptation tool. The process in which the user writes the first type of operator in the domain-specific language to obtain the registration operator may be completed on a second device. The process of processing the registration operator to obtain the target executable code may be completed on a third device; that is, the domain-specific language compiler is configured on the third device. Finally, the process of performing forward inference through the target hardware according to the target executable code and the neural network model may be completed on a fourth device; that is, the hardware assembled on the fourth device is the target hardware. In other words, in this embodiment of the present application, the above processes may all be performed on one device, may each be performed independently on different devices, or may be performed partly on one device and partly on others, and the devices may communicate with each other to transmit data.
As can be seen from fig. 4, the model adaptation tool and the domain-specific language compiler provided in the embodiments of the present application can serve as a complete inference system that accommodates various kinds of hardware and supports forward inference tasks depending on different hardware.
In summary, in the embodiment of the present application, an operator that is not supported by the target hardware and has no corresponding replaceable operator among the operators supported by the target hardware may be determined, that is, the operator of the first type in the neural network model. Then, the registration operator corresponding to the operator of the first type, determined based on the domain-specific language, is obtained and processed to obtain the target executable code supported by the target hardware, and forward inference is performed through the target hardware according to the target executable code and the neural network model. In this scheme, for the operator of the first type, the corresponding registration operator can be determined based on the domain-specific language, and because the design of the domain-specific language is independent of hardware, the user can write the corresponding registration operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for the same operator of the first type that may correspond to different hardware, the user only needs to write the operator once to obtain the corresponding registration operator, which can then be applied to different hardware, so the development workload is greatly reduced.
Fig. 5 is a schematic structural diagram of an inference apparatus provided in an embodiment of the present application, where the inference apparatus 500 may be implemented as part of or all of a computer device by software, hardware, or a combination of both. Referring to fig. 5, the apparatus 500 includes: a first determination module 501, an acquisition module 502, a first processing module 503, and an inference module 504.
A first determining module 501, configured to determine an operator of a first type from a plurality of operators included in the neural network model, where the operator of the first type is an operator that is not supported by the target hardware and a corresponding replaceable operator does not exist in the operators supported by the target hardware;
an obtaining module 502, configured to obtain one or more registration operators corresponding to the first type of operator, where the one or more registration operators are determined based on the domain specific language;
a first processing module 503, configured to process one or more registration operators to obtain one or more target executable codes supported by target hardware;
an inference module 504 for performing forward inference by the target hardware based on the one or more target executable codes and the neural network model.
Optionally, the first processing module 503 includes:
the interpreter unit is used for processing the one or more registration operators through the interpreter to obtain one or more target language code segments, and the one or more registration operators are in one-to-one correspondence with the one or more target language code segments;
and the compiling unit is used for compiling the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding to the one or more target language code segments one to one, and the target compiler is a compiler matched with the target hardware.
Optionally, the first determining module 501 includes:
the determining unit is used for determining an operator of a first type from a plurality of operators according to the support operator list and the replacement operator list corresponding to the target hardware;
the support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and the corresponding replaceable operators.
Optionally, the determining unit includes:
the first determining subunit is used for determining a computational graph of the neural network model, the computational graph comprises a plurality of computational nodes, each computational node comprises one or more operators, and the plurality of computational nodes are arranged according to an execution sequence in the forward reasoning;
and the second determining subunit is configured to determine, in the order of the plurality of computing nodes, a target operator that is not contained in the support operator list among the one or more operators included in each computing node, and to take, as the first type of operator, the target operator included in the computing nodes that is not contained in the replacement operator list.
Optionally, the apparatus 500 further comprises:
the second processing module is used for adding a first operator label to each first type of operator in the neural network model, extracting a calculation parameter of a calculation node to which the corresponding first type of operator belongs, and correspondingly storing the first operator label and the calculation parameter, wherein the first operator label is used for uniquely identifying the corresponding operator, and the first operator label is used for indicating that the type of the corresponding operator is the first type.
Optionally, the apparatus 500 further comprises:
the second determining module is used for determining a second type of operator from a plurality of operators included in the neural network model, wherein the second type of operator refers to an operator which is not supported by the target hardware and has a corresponding replaceable operator in the operators supported by the target hardware;
and the replacing module is used for replacing the operator of the second type in the neural network model according to the replacing operator list to obtain the updated neural network model.
Optionally, the inference module 504 is specifically configured to:
according to the sequence of the plurality of computing nodes included in the updated neural network model, one computing node in the plurality of computing nodes is sequentially selected to execute the following operations until each computing node in the plurality of computing nodes executes the following operations:
if one or more operators included in the selected computing node have the operator of the first type, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to the first operator label corresponding to the operator of the first type included in the selected computing node;
and executing forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
Optionally, the inference module 504 is specifically configured to:
according to the sequence of a plurality of computing nodes included in the neural network model, one computing node in the plurality of computing nodes is sequentially selected to execute the following operations until each computing node in the plurality of computing nodes executes the following operations:
if the operator of the second type exists in one or more operators included in the selected computing node according to the replacement operator list, replacing the operator of the second type according to the replacement operator list to obtain an updated computing node;
if one or more operators included in the updated computing node have the operator of the first type, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to the first operator label corresponding to the operator of the first type included in the updated computing node;
and executing the forward reasoning calculation of the updated calculation node according to the acquired calculation parameters and the target executable code.
Optionally, the domain-specific language refers to a computer language applied to a specific application domain.
In summary, in the embodiment of the present application, an operator that is not supported by the target hardware and has no corresponding replaceable operator among the operators supported by the target hardware may be determined, that is, the operator of the first type in the neural network model. Then, the registration operator corresponding to the operator of the first type, determined based on the domain-specific language, is obtained and processed to obtain the target executable code supported by the target hardware, and forward inference is performed through the target hardware according to the target executable code and the neural network model. In this scheme, for the operator of the first type, the corresponding registration operator can be determined based on the domain-specific language, and because the design of the domain-specific language is independent of hardware, the user can write the corresponding registration operator without knowing the hardware characteristics of the device, so the development difficulty is low. In addition, for the same operator of the first type that may correspond to different hardware, the user only needs to write the operator once to obtain the corresponding registration operator, which can then be applied to different hardware, so the development workload is greatly reduced.
It should be noted that the inference apparatus provided in the above embodiment is illustrated only by the division of the above functional modules when performing inference. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the inference apparatus and the inference method provided by the above embodiments belong to the same concept; for the specific implementation process, reference may be made to the method embodiments, and details are not repeated here.
Fig. 6 is a block diagram of a computer device 600 according to an embodiment of the present application. The computer device 600 may be a terminal device such as a desktop computer, a notebook computer, a tablet computer, or a smart phone, or may be a server.
Generally, the computer device 600 includes: a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core processor or a 6-core processor. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor, also called a CPU, is a processor for processing data in the awake state, and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 601 may be integrated with a GPU, which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 601 may also include an AI processor for processing computational operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the inference method provided by the method embodiments herein.
In some embodiments, the computer device 600 may further optionally include: a peripheral interface 603 and at least one peripheral. The processor 601, the memory 602, and the peripheral interface 603 may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface 603 via a bus, a signal line, or a circuit board. Specifically, the peripherals include: at least one of a radio frequency circuit 604, a touch display screen 605, a camera assembly 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 601 and the memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 604 may communicate with other computer devices via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 also has the ability to capture touch signals on or over the surface of the display screen 605. The touch signal may be input to the processor 601 as a control signal for processing. At this point, the display 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 605 may be a front panel disposed on the computer device 600; in other embodiments, the display 605 may be at least two separate displays disposed on different surfaces of the computer device 600 or in a folded design; in other embodiments, the display 605 may be a flexible display disposed on a curved surface or on a folded surface of the computer device 600. Even more, the display 605 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 606 is used to capture images or video. Optionally, camera assembly 606 includes a front camera and a rear camera. Generally, a front camera is disposed on a front panel of a computer apparatus, and a rear camera is disposed on a rear surface of the computer apparatus. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 606 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuitry 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing or inputting the electric signals to the radio frequency circuit 604 to realize voice communication. For stereo capture or noise reduction purposes, the microphones may be multiple and located at different locations on the computer device 600. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 607 may also include a headphone jack.
The positioning component 608 is used to locate the current geographic location of the computer device 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 609 is used to supply power to the various components in the computer device 600. The power supply 609 may be ac, dc, disposable or rechargeable. When the power supply 609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the computer device 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyro sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 may detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the computer apparatus 600. For example, the acceleration sensor 611 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 601 may control the touch screen display 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 612 may detect a body direction and a rotation angle of the computer apparatus 600, and the gyro sensor 612 may cooperate with the acceleration sensor 611 to acquire a 3D motion of the user on the computer apparatus 600. The processor 601 may implement the following functions according to the data collected by the gyro sensor 612: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensors 613 may be disposed on the side bezel of the computer device 600 and/or underneath the touch display screen 605. When the pressure sensor 613 is disposed on the side frame of the computer device 600, the holding signal of the user to the computer device 600 can be detected, and the processor 601 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display screen 605, the processor 601 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 605. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 614 is used for collecting a fingerprint of a user, and the processor 601 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 614 may be provided on the front, back, or side of the computer device 600. When a physical key or vendor Logo is provided on the computer device 600, the fingerprint sensor 614 may be integrated with the physical key or vendor Logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, processor 601 may control the display brightness of touch display 605 based on the ambient light intensity collected by optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 605 is turned down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
The proximity sensor 616, also known as a distance sensor, is typically disposed on the front panel of the computer device 600. The proximity sensor 616 is used to capture the distance between the user and the front of the computer device 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front of the computer device 600 gradually decreases, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance gradually increases, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in FIG. 6 does not constitute a limitation of the computer device 600, and may include more or fewer components than those shown, or combine certain components, or employ a different arrangement of components.
In some embodiments, a computer-readable storage medium is also provided, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the inference method in the above-mentioned embodiments. For example, the computer readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It is noted that the computer-readable storage medium referred to herein may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It should be understood that all or part of the steps for implementing the above embodiments may be implemented by software, hardware, firmware or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The computer instructions may be stored in the computer-readable storage medium described above.
That is, in some embodiments, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the inference method described above.
The above-mentioned embodiments are provided not to limit the present application, and any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of reasoning, the method comprising:
determining a first type of operator from a plurality of operators included in a neural network model, wherein the first type of operator refers to an operator which is not supported by target hardware and does not have a corresponding replaceable operator in the operators supported by the target hardware;
acquiring one or more registration operators corresponding to the first type of operator, wherein the one or more registration operators are determined and obtained based on a domain specific language;
processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware;
forward reasoning by the target hardware in accordance with the one or more target executable codes and the neural network model.
2. The method of claim 1, wherein said processing the one or more registration operators to obtain one or more target executables supported by the target hardware comprises:
processing the one or more registration operators through an interpreter to obtain one or more target language code segments, wherein the one or more registration operators correspond to the one or more target language code segments one to one;
compiling the one or more target language code segments through a target compiler to obtain one or more target executable codes corresponding to the one or more target language code segments one by one, wherein the target compiler is a compiler matched with the target hardware.
3. The method of claim 1, wherein determining the operator of the first type from a plurality of operators included in the neural network model comprises:
determining the operator of the first type from the plurality of operators according to a support operator list and a replacement operator list corresponding to the target hardware;
the support operator list comprises operators supported by the target hardware, and the replacement operator list is a mapping relation between the operators supported by the target hardware and corresponding replaceable operators.
4. The method of claim 3, wherein determining the first type of operator from the plurality of operators according to a list of support operators and a list of replacement operators corresponding to the target hardware comprises:
determining a computational graph of the neural network model, wherein the computational graph comprises a plurality of computational nodes, each computational node comprises one or more operators, and the plurality of computational nodes are arranged according to an execution sequence in forward reasoning;
and sequentially determining a target operator which is not contained in the support operator list in one or more operators included in each computing node according to the sequence of the computing nodes, and taking the target operator which is not contained in the replacement operator list and included in the computing nodes as the first type of operator.
5. The method of claim 1, wherein after determining the operator of the first type from the plurality of operators included in the neural network model, further comprising:
adding a first operator label to each operator of the first type in the neural network model, extracting a calculation parameter of a calculation node to which the corresponding operator of the first type belongs, and correspondingly storing the first operator label and the calculation parameter, wherein the first operator label is used for uniquely identifying the corresponding operator, and the first operator label is used for indicating that the type of the corresponding operator is the first type.
6. The method of claim 5, wherein after determining the first type of operator from the plurality of operators included in the neural network model, further comprising:
determining a second type of operator from a plurality of operators included in the neural network model, wherein the second type of operator refers to an operator which is not supported by the target hardware and has a corresponding replaceable operator in the operators supported by the target hardware;
and replacing the operator of the second type in the neural network model according to the replacement operator list to obtain an updated neural network model.
7. The method of claim 6, wherein said performing forward reasoning by said target hardware from said one or more target executable codes and said neural network model comprises:
according to the sequence of the plurality of computing nodes included in the updated neural network model, one of the plurality of computing nodes is sequentially selected to execute the following operations until each of the plurality of computing nodes has executed the following operations:
if the operator of the first type exists in one or more operators included in the selected computing node, acquiring the computing parameters of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the selected computing node;
and executing the forward reasoning calculation of the selected calculation node according to the acquired calculation parameters and the target executable code.
8. The method of any one of claims 1-5, wherein said performing, by said target hardware, forward inference based on said one or more target executables and said neural network model comprises:
according to the sequence of a plurality of computing nodes included in the neural network model, one of the computing nodes is sequentially selected to execute the following operations until each of the computing nodes executes the following operations:
if it is determined that a second type of operator exists in one or more operators included in the selected computing node according to the replacement operator list, replacing the second type of operator according to the replacement operator list to obtain an updated computing node;
if the operator of the first type exists in one or more operators included in the updated computing node, acquiring the computing parameter of the selected computing node and the target executable code corresponding to the operator of the first type included in the selected computing node according to a first operator label corresponding to the operator of the first type included in the updated computing node;
and executing the forward reasoning calculation of the updated calculation node according to the acquired calculation parameters and the target executable code.
9. The method of any one of claims 1-8, wherein the domain-specific language is a computer language that is applied to a specific application domain.
10. An inference apparatus, characterized in that the apparatus comprises:
the device comprises a first determination module, a second determination module and a third determination module, wherein the first determination module is used for determining a first type of operator from a plurality of operators included in a neural network model, and the first type of operator refers to an operator which is not supported by target hardware and does not have a corresponding replaceable operator in the operators supported by the target hardware;
the acquisition module is used for acquiring one or more registration operators corresponding to the first type of operator, and the one or more registration operators are determined and obtained based on a domain specific language;
the first processing module is used for processing the one or more registration operators to obtain one or more target executable codes supported by the target hardware;
and the reasoning module is used for carrying out forward reasoning through the target hardware according to the one or more target executable codes and the neural network model.
CN202010244456.8A 2020-03-31 2020-03-31 Reasoning method and device Active CN113469360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010244456.8A CN113469360B (en) 2020-03-31 2020-03-31 Reasoning method and device

Publications (2)

Publication Number Publication Date
CN113469360A true CN113469360A (en) 2021-10-01
CN113469360B CN113469360B (en) 2023-10-20


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115113528A (en) * 2022-07-06 2022-09-27 昆仑芯(北京)科技有限公司 Operation control method, device, equipment and medium of neural network model
CN115309407A (en) * 2022-10-12 2022-11-08 中国移动通信有限公司研究院 Method and system capable of realizing calculation power abstraction

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060126046A1 (en) * 2003-02-11 2006-06-15 Asml Netherlands B.V. Lithographic apparatus and method for optimizing illumination using a photolithographic simulation
CN107786973A (en) * 2017-10-30 2018-03-09 清华大学深圳研究生院 Wireless network user method for secret protection and computer-readable recording medium
US20180096226A1 (en) * 2016-10-04 2018-04-05 Magic Leap, Inc. Efficient data layouts for convolutional neural networks
US20180315158A1 (en) * 2017-04-28 2018-11-01 Intel Corporation Programmable coarse grained and sparse matrix compute hardware with advanced scheduling
US20180349189A1 (en) * 2017-06-03 2018-12-06 Apple Inc. Dynamic task allocation for neural networks
CN109919315A (en) * 2019-03-13 2019-06-21 科大讯飞股份有限公司 A kind of forward inference method, apparatus, equipment and the storage medium of neural network
CN110378413A (en) * 2019-07-17 2019-10-25 Oppo广东移动通信有限公司 Neural network model processing method, device and electronic equipment
CN110569106A (en) * 2019-08-27 2019-12-13 Oppo广东移动通信有限公司 Code loading method and device, electronic equipment and computer readable medium
CN110659070A (en) * 2018-06-29 2020-01-07 赛灵思公司 High-parallelism computing system and instruction scheduling method thereof
US20200073677A1 (en) * 2018-08-31 2020-03-05 International Business Machines Corporation Hybrid computing device selection analysis
CN110866610A (en) * 2019-11-20 2020-03-06 苏州浪潮智能科技有限公司 Deep learning model distributed operation method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WONIL CHANG et al., "Flexible Reasoning of Boolean Constraints in Recurrent Neural Networks with Dual Representation", ICONIP, pages 106-112
LI Bojie, "High-Performance Data Center Systems Based on Programmable NICs" (in Chinese), China Doctoral Dissertations Full-text Database, Information Science and Technology, pages 137-4


Also Published As

Publication number Publication date
CN113469360B (en) 2023-10-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant