CN115423101A - Tensor data calculation reasoning method and device based on compiler and storage medium

Info

Publication number
CN115423101A
CN115423101A
Authority
CN
China
Prior art keywords
target
file
tensor
compiler
decoding
Prior art date
Legal status
Pending
Application number
CN202211000929.5A
Other languages
Chinese (zh)
Inventor
姜汉
王臣汉
潘相瑜
吕天蕾
王岩鑫
Current Assignee
Beijing Computing Tianjin Information Technology Co ltd
Original Assignee
Beijing Computing Tianjin Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Computing Tianjin Information Technology Co ltd
Priority to CN202211000929.5A
Publication of CN115423101A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/116 Details of conversion of file system types or formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention relates to the technical field of deep learning, and discloses a compiler-based tensor data calculation inference method, device and storage medium. The method comprises the following steps: acquiring a specified input model file, and obtaining a target uniform-format intermediate representation element according to the specified input model file; generating a target self-decoding according to the target uniform-format intermediate representation element and an intermediate-layer optimization strategy; constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library; and performing inference on target tensor data according to the target device executable file and the model type. In this manner, formats are unified by the front-end access layer, data optimization is performed by the intermediate conversion layer under the intermediate-layer optimization strategy, and computational inference is performed with the target device executable file and model type constructed at the terminal execution layer, so that the computational inference of various frameworks and platforms can be accommodated and the development workload of a developer performing model inference is reduced.

Description

Tensor data calculation reasoning method and device based on compiler and storage medium
Technical Field
The invention relates to the technical field of deep learning, in particular to a tensor data calculation inference method and device based on a compiler and a storage medium.
Background
With the continuous development of artificial intelligence, deep learning techniques are widely applied across industries, and deep learning applications are inseparable from inference frameworks such as TensorFlow, PyTorch and TNN. Different inference frameworks offer different capabilities: TensorFlow and PyTorch are platform-level frameworks that can be used for both training and inference, whereas the TNN framework can only be used for inference. Whatever its capabilities, a framework must be adapted to the acceleration devices behind it, such as NVIDIA GPUs, the Apple M1 and ARM processors, and the degree of adaptation and optimization differs between inference frameworks. Because training platforms differ, each kind of framework must also be adapted to multiple model files, and model files involve many data types, operation flows and data flows. This ultimately fragments deep learning inference; such fragmentation adds extra development work when a developer deploys a model, since the developer must understand the conversion paths between the various frameworks and platforms, which makes the development workload tedious.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a compiler-based tensor data computational inference method, device and storage medium, so as to solve the technical problem that, because the prior art cannot adapt to the computational inference of various frameworks and platforms, the development workload for a developer performing model inference is onerous.
In order to achieve the above object, the present invention provides a compiler-based tensor data calculation inference method, which includes the following steps:
acquiring a specified input model file, and acquiring a target uniform format intermediate representation element according to the specified input model file;
generating a target self-decoding according to the target uniform format intermediate representation element and an intermediate layer optimization strategy;
constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library;
and reasoning the target tensor data according to the target device executable file and the model type.
Optionally, the obtaining of the specified input model file and obtaining the target uniform format intermediate representation element according to the specified input model file includes:
acquiring a specified input model file, and acquiring a corresponding numerical weight, a tensor structure and a computation graph according to the specified input model file;
respectively acquiring the numerical weight, the tensor structure and the data format of the computational graph;
and when the condition that the data formats of any two of the numerical weight, the tensor structure and the calculation graph are inconsistent is met, converting the numerical weight, the tensor structure and the calculation graph into a target uniform format intermediate representation element through a front-end input layer.
Optionally, the generating a target self-decoding according to the target uniform format intermediate representation element and the intermediate layer optimization policy includes:
acquiring the number of layers of the intermediate conversion layer, and obtaining the data optimization strategy arranged at each layer according to the number of layers of the intermediate conversion layer and the intermediate-layer sub-optimization strategies;
optimizing the intermediate representation element of the target uniform format according to the data optimization strategy of each layer to obtain a tensor calculation model file;
and generating target self-decoding according to the tensor calculation model file.
Optionally, the generating a target self-decoding from the tensor computation model file includes:
obtaining corresponding file types and tensor calculation model data according to the tensor calculation model file;
selecting a target coding strategy from a coding strategy set according to the file type;
and coding the tensor calculation model data through the target coding strategy to obtain target self-decoding.
Optionally, the building a target device executable file according to the target self-decoding, a hardware driver and a development tool library includes:
acquiring a plurality of hardware drive sets and development tool library sets;
determining a corresponding self-decoding type according to the target self-decoding;
matching the self-decoding type with the hardware drive set to obtain a hardware drive;
matching the self-decoding type with the development tool library set to obtain a development tool library;
and translating the target self-decoding according to the hardware drive and the development tool library to obtain an executable file of the target equipment.
Optionally, the translating the target self-decoding according to the hardware driver and the development tool library to obtain a target device executable file includes:
compiling the target self-decoding according to the hardware drive and the development tool library to obtain a current assembly file;
assembling the current assembly file to obtain a current binary file;
and linking the current binary file with a database to be called to obtain an executable file of the target equipment.
Optionally, the inferring the target tensor data according to the target device executable file and the model type includes:
acquiring information of equipment to be operated, and determining target operation equipment according to the information of the equipment to be operated;
executing the target equipment executable file through the target running equipment to obtain a file execution result;
determining a model type according to the executable file of the target equipment;
and reasoning the target tensor data according to the file execution result and the model type.
In order to achieve the above object, the present invention provides a compiler-based tensor data calculation and inference apparatus including:
the acquisition module is used for acquiring a specified input model file and obtaining a target uniform format intermediate representation element according to the specified input model file;
the generating module is used for generating target self-decoding according to the target uniform format intermediate representation element and the intermediate layer optimization strategy;
the construction module is used for constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library;
and the reasoning module is used for reasoning the target tensor data according to the target equipment executable file and the model type.
In addition, to achieve the above object, the present invention further provides a compiler-based tensor data calculation inference device, including: a memory, a processor, and a compiler-based tensor data computational inference program stored on the memory and executable on the processor, the compiler-based tensor data computational inference program configured to implement the compiler-based tensor data computational inference method as described above.
In addition, to achieve the above object, the present invention further provides a storage medium having stored thereon a compiler-based tensor data calculation inference program, which, when executed by a processor, implements the compiler-based tensor data calculation inference method as described above.
The compiler-based tensor data calculation inference method acquires a specified input model file and obtains a target uniform-format intermediate representation element from it; generates a target self-decoding according to the target uniform-format intermediate representation element and the intermediate-layer optimization strategy; constructs a target device executable file according to the target self-decoding, a hardware driver and a development tool library; and performs inference on the target tensor data according to the target device executable file and the model type. In this manner, formats are unified by the front-end access layer, data optimization is performed by the intermediate conversion layer under the intermediate-layer optimization strategy, and computational inference is performed with the target device executable file and model type constructed at the terminal execution layer, so that the method can adapt to the computational inference of various frameworks and platforms and the development workload of a developer performing model inference is reduced.
Drawings
Fig. 1 is a schematic structural diagram of a compiler-based tensor data computation inference apparatus in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a compiler-based tensor data calculation inference method according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of the compiler-based tensor data calculation inference method of the present invention;
fig. 4 is a functional block diagram of a tensor data calculation and inference device based on a compiler according to a first embodiment of the present invention.
The implementation, functional features and advantages of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a compiler-based tensor data calculation inference device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the compiler-based tensor data calculation inference device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a random access memory (RAM) or a non-volatile memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the architecture shown in fig. 1 does not constitute a limitation of the compiler-based tensor data computation inference device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in fig. 1, a memory 1005, which is a storage medium, may include therein an operating system, a network communication module, a user interface module, and a compiler-based tensor data calculation inference program.
In the tensor data calculation inference device based on the compiler shown in fig. 1, the network interface 1004 is mainly used for data communication with the network integration platform workstation; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the compiler-based tensor data calculation inference device of the present invention may be disposed in the compiler-based tensor data calculation inference device, which calls the compiler-based tensor data calculation inference program stored in the memory 1005 through the processor 1001 and executes the compiler-based tensor data calculation inference method provided by the embodiment of the present invention.
Based on the hardware structure, the embodiment of the tensor data calculation reasoning method based on the compiler is provided.
Referring to fig. 2, fig. 2 is a flowchart illustrating a first embodiment of a tensor data calculation inference method based on a compiler according to the present invention.
In a first embodiment, the compiler-based tensor data computational inference method includes the steps of:
and S10, acquiring a specified input model file, and obtaining a target uniform format intermediate representation element according to the specified input model file.
It should be noted that the main execution body of this embodiment is tensor data calculation inference equipment based on a compiler, and may also be other equipment that can implement the same or similar function, for example, a tensor calculation oriented compiler, and the present embodiment is not limited to this.
It should be understood that the tensor calculation-oriented compiler supports model files of various deep learning frameworks, and the design structure of the tensor calculation-oriented compiler comprises three layers, namely a front-end access layer, a middle conversion layer and a terminal execution layer, and is applied to the deep learning inference field.
It can be understood that the specified input model file refers to the file corresponding to a model that needs to be deployed on the acceleration device, and the specified input model file may be of multiple types. The target uniform-format intermediate representation element refers to an intermediate representation element constructed from each piece of data in the specified input model file, and is intended for the intermediate conversion layer, the next layer in the pipeline.
Further, step S10 includes: acquiring an appointed input model file, and acquiring a corresponding numerical weight, a tensor structure and a calculation graph according to the appointed input model file; respectively acquiring the numerical weight, the tensor structure and the data format of the computational graph; and when the condition that the data formats of any two of the numerical weight, the tensor structure and the calculation graph are inconsistent is met, converting the numerical weight, the tensor structure and the calculation graph into a target uniform format intermediate representation element through a front-end input layer.
It can be understood that, after receiving a specified input model file, the tensor-calculation-oriented compiler needs to unify its format, because the model files of the various deep learning frameworks differ in format. Specifically, the compiler obtains the corresponding numerical weight, tensor structure and computation graph from the specified input model file and determines whether their data formats are all the same. If they are, the data are passed directly to the intermediate conversion layer for processing; if not, the front-end access layer integrates the numerical weight, tensor structure and computation graph into intermediate representation elements with the same format, that is, the target uniform-format intermediate representation element; a sketch of this check-and-convert step follows.
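As a concrete illustration, the front-end check-and-convert step might look like the following Python sketch. The `ModelArtifacts` and `UnifiedIR` containers and the `convert` helper are hypothetical stand-ins invented here; the patent does not disclose a concrete data layout for the front-end access layer.

```python
from dataclasses import dataclass

@dataclass
class ModelArtifacts:
    """Hypothetical container for the three components read from an input model file."""
    weights: dict            # numerical weights, e.g. {"conv1.w": [...]}
    tensor_structure: dict   # shape/dtype per tensor
    graph: list              # computation graph as an ordered op list
    fmt: dict                # data format per component, e.g. {"weights": "onnx", ...}

@dataclass
class UnifiedIR:
    """Target uniform-format intermediate representation element."""
    weights: object
    tensor_structure: object
    graph: object

def convert(component, src_fmt):
    # Placeholder for a real per-format converter (ONNX, TensorFlow, ...).
    return {"format": "unified", "converted_from": src_fmt, "payload": component}

def to_unified_ir(model: ModelArtifacts) -> UnifiedIR:
    if len(set(model.fmt.values())) == 1:
        # All three components already share one data format:
        # hand them to the intermediate conversion layer unchanged.
        return UnifiedIR(model.weights, model.tensor_structure, model.graph)
    # Any two formats disagree: the front-end access layer converts every
    # component into the single target uniform format first.
    return UnifiedIR(
        weights=convert(model.weights, model.fmt["weights"]),
        tensor_structure=convert(model.tensor_structure, model.fmt["tensor_structure"]),
        graph=convert(model.graph, model.fmt["graph"]),
    )
```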
And S20, generating target self-decoding according to the target uniform format intermediate representation element and the intermediate layer optimization strategy.
It is to be understood that the target self-decoding refers to a deep self-decoding intended for use by the terminal execution layer, the next layer in the pipeline. The intermediate-layer optimization strategy refers to the optimization strategies arranged at each layer of the intermediate conversion layer, and each layer may be provided with one or more optimization strategies, for example an operator fusion optimization strategy or a matrix decomposition optimization strategy: operator fusion reduces the number of times data are exchanged between the video memory and the main memory, while matrix decomposition increases parallelism. After the target uniform-format intermediate representation element is obtained, it is optimized through the intermediate-layer optimization strategy to generate the target self-decoding; the operator-fusion case is sketched below.
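To make the operator-fusion example concrete, the following is a minimal, hypothetical Python sketch that fuses each `conv` node immediately followed by a `relu` into a single `conv_relu` op, so the intermediate tensor need not make a round trip between video memory and main memory. The op-list representation is invented for illustration only.

```python
def fuse_conv_relu(ops):
    """Fuse each conv immediately followed by relu into one conv_relu op,
    so the intermediate result is not written back between the two."""
    fused, i = [], 0
    while i < len(ops):
        if (i + 1 < len(ops)
                and ops[i]["type"] == "conv"
                and ops[i + 1]["type"] == "relu"
                and ops[i + 1]["input"] == ops[i]["output"]):
            fused.append({"type": "conv_relu",
                          "input": ops[i]["input"],
                          "output": ops[i + 1]["output"]})
            i += 2  # consume both fused ops
        else:
            fused.append(ops[i])
            i += 1
    return fused

graph = [
    {"type": "conv", "input": "x",  "output": "t0"},
    {"type": "relu", "input": "t0", "output": "t1"},
    {"type": "pool", "input": "t1", "output": "y"},
]
print(fuse_conv_relu(graph))
# -> conv and relu collapse into one conv_relu node, pool is kept as-is
```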
And S30, constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library.
It should be understood that the target device executable file refers to a file that supports execution through the tensor-calculation-oriented compiler and may be a binary model file. The hardware driver refers to the driver required by the hardware at run time, and the development tool library refers to the tool library attached to the hardware platform for developing different functions; the hardware driver and development tool library chosen are those whose degree of match with the target self-decoding is highest. After the target self-decoding is obtained, the target device executable file is constructed according to the target self-decoding, the hardware driver and the development tool library.
Further, step S30 includes: acquiring a plurality of hardware drive sets and development tool library sets; determining a corresponding self-decoding type according to the target self-decoding; matching the self-decoding type with the hardware drive set to obtain a hardware drive; matching the self-decoding type with the development tool library set to obtain a development tool library; and translating the target self-decoding according to the hardware drive and the development tool library to obtain a target device executable file.
It can be understood that the self-decoding type refers to the type to which the target self-decoding belongs. Since different types of self-decoding correspond to different hardware drivers and development tool libraries, the hardware driver and the development tool library must be matched separately, according to the self-decoding type, within the sets of hardware drivers and development tool libraries; the pair with the highest degree of match to the target self-decoding is selected, and the target self-decoding is then translated into the target device executable file according to that hardware driver and development tool library, for example as follows.
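A minimal sketch of the matching step, assuming the self-decoding type is a plain string key and the hardware-driver and development-tool-library sets are simple lookup tables; the registry contents are invented examples, not names taken from the patent.

```python
# Invented registries: each maps a self-decoding type to its best-matching entry.
HARDWARE_DRIVER_SET = {
    "cuda": "nvidia-gpu-driver",
    "arm":  "arm-soc-driver",
}
DEV_TOOL_LIBRARY_SET = {
    "cuda": "cuda-toolkit",
    "arm":  "arm-compute-library",
}

def match_backend(self_decoding_type: str):
    """Return the (hardware driver, development tool library) pair whose
    degree of match with the target self-decoding is highest; here the
    match is reduced to an exact key lookup for illustration."""
    try:
        return (HARDWARE_DRIVER_SET[self_decoding_type],
                DEV_TOOL_LIBRARY_SET[self_decoding_type])
    except KeyError:
        raise ValueError(f"no backend registered for type {self_decoding_type!r}")

driver, toolkit = match_backend("cuda")
```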
Further, the translating the target self-decoding according to the hardware driver and the development tool library to obtain a target device executable file includes: compiling the target self-decoding according to the hardware drive and the development tool library to obtain a current assembly file; assembling the current assembly file to obtain a current binary file; and linking the current binary file with a database to be called to obtain an executable file of the target equipment.
It should be understood that, after the target self-decoding is obtained, it is first compiled using the hardware driver and the development tool library; specifically, a compiler's -S option may be used, and after compilation a current assembly file is obtained. The current assembly file is then assembled, for which the -c option may be used, and after assembly a current binary file is obtained. This binary file does not yet support execution through the tensor-calculation-oriented compiler, so the database to be called must be linked, where the database to be called refers to the library that must be called for the file to take executable form. Specifically, the current binary file is linked with the database to be called to obtain the target device executable file, for which the -o option may be used; a concrete sketch follows.
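The three-stage translation corresponds to the stages of a conventional compiler driver such as gcc, where `-S` stops after compilation to assembly, `-c` stops after assembly to an object file, and `-o` names the linked output; the file names below and the use of gcc itself are illustrative assumptions, not details disclosed by the patent.

```python
import subprocess

# Hypothetical C lowering of the target self-decoding; assumes gcc is installed.
src = "target_self_decoding.c"

# 1. Compile: -S stops after compilation proper and emits an assembly file.
subprocess.run(["gcc", "-S", src, "-o", "model.s"], check=True)

# 2. Assemble: -c turns the assembly file into a binary object file,
#    which is not yet executable on its own.
subprocess.run(["gcc", "-c", "model.s", "-o", "model.o"], check=True)

# 3. Link: -o names the target device executable; -lm stands in for the
#    "database to be called", i.e. the library linked in at this stage.
subprocess.run(["gcc", "model.o", "-lm", "-o", "model_exec"], check=True)
```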
And S40, reasoning the target tensor data according to the target equipment executable file and the model type.
It can be understood that the model type refers to the type of the model corresponding to the specified input model file. After the target device executable file is obtained, it is executed on the target running device, and when execution completes, the inference result for the target tensor data is obtained. For example, if the model type is an image recognition model, the inference result output after execution is an image feature.
Further, step S40 includes: acquiring information of equipment to be operated, and determining target operation equipment according to the information of the equipment to be operated; executing the target equipment executable file through the target running equipment to obtain a file execution result; determining a model type according to the executable file of the target equipment; and reasoning the target tensor data according to the file execution result and the model type.
It should be understood that the device information to be operated refers to the information of the device on which the developer needs to run; the target running device is determined from this information, and the target device executable file is then imported into the target running device for execution to obtain a file execution result. The target tensor data are then inferred from the file execution result and the model type: the model type determines the type of the inference result, and the inference result of the target tensor data is obtained from the file execution result accordingly, for example as sketched below.
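A sketch of the terminal execution step, assuming for illustration that the target device executable writes its raw result to standard output and that the model type merely determines how that result is labelled; this interface is invented, not specified by the patent.

```python
import subprocess

def run_inference(executable_path: str, model_type: str) -> dict:
    """Execute the target device executable on the target running device
    and label its raw output according to the model type."""
    completed = subprocess.run([executable_path],
                               capture_output=True, text=True, check=True)
    file_execution_result = completed.stdout.strip()
    # The model type decides what the result means, e.g. an image
    # recognition model yields image features.
    return {"model_type": model_type,
            "inference_result": file_execution_result}

# Example call (assumes the executable built above exists):
# print(run_inference("./model_exec", "image_recognition"))
```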
In this embodiment, a specified input model file is acquired and a target uniform-format intermediate representation element is obtained from it; a target self-decoding is generated according to the target uniform-format intermediate representation element and the intermediate-layer optimization strategy; a target device executable file is constructed according to the target self-decoding, a hardware driver and a development tool library; and the target tensor data are inferred according to the target device executable file and the model type. In this manner, formats are unified by the front-end access layer, data optimization is performed by the intermediate conversion layer under the intermediate-layer optimization strategy, and computational inference is performed with the target device executable file and model type constructed at the terminal execution layer, so that the computational inference of various frameworks and platforms can be accommodated and the development workload of a developer performing model inference is reduced.
In an embodiment, as shown in fig. 3, a second embodiment of the tensor data calculation inference method based on compiler according to the present invention is proposed based on the first embodiment, and the step S20 includes:
step S201, acquiring the number of layers of the middle conversion layer, and obtaining a data optimization strategy arranged at each layer according to the number of layers of the middle conversion layer and the middle layer suboptimum strategy.
It should be understood that the number of layers refers to the number of all layers of the intermediate conversion layer, the intermediate layer suboptimal strategy refers to an optimization strategy arranged at the intermediate conversion layer, the number of intermediate layer suboptimal strategies may be multiple, and then the data optimization strategies arranged at the layers are obtained according to the number of layers and the intermediate layer suboptimal strategy, for example, the data optimization strategy arranged at the first layer is an operator fusion optimization strategy, and the data optimization strategies arranged at the second layer is an operator fusion optimization strategy and a matrix decomposition optimization strategy.
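The layer-to-strategy assignment can be pictured as a table keyed by layer index, as in this hypothetical Python sketch mirroring the example above (first layer: operator fusion; second layer: operator fusion plus matrix decomposition); the pass bodies are stand-ins, not the patent's actual optimizations.

```python
# Stand-in pass bodies; a real pass would rewrite the IR rather than return it.
SUB_STRATEGIES = {
    "operator_fusion":      lambda ir: ir,
    "matrix_decomposition": lambda ir: ir,
}

def plan_layers(num_layers, per_layer):
    """Build the ordered pass list for each of the intermediate
    conversion layer's layers from the sub-optimization strategies."""
    return [[SUB_STRATEGIES[name] for name in per_layer.get(layer, [])]
            for layer in range(1, num_layers + 1)]

def optimize(ir, plan):
    """Run every layer's passes in order over the uniform-format IR;
    the final result plays the role of the tensor calculation model file."""
    for layer_passes in plan:
        for p in layer_passes:
            ir = p(ir)
    return ir

# Mirrors the example in the text: layer 1 fuses operators,
# layer 2 fuses operators and decomposes matrices.
plan = plan_layers(2, {1: ["operator_fusion"],
                       2: ["operator_fusion", "matrix_decomposition"]})
```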
And S202, respectively optimizing the intermediate representation elements in the target uniform format according to the data optimization strategies of all layers to obtain a tensor calculation model file.
It can be understood that, because the model files and the target devices differ, running a model directly may fail to adapt, and even where adaptation is possible an optimal-performance scheme may not be obtained; format conversion is therefore required. Specifically, after obtaining the target uniform-format intermediate representation element handed over by the front-end access layer, the intermediate conversion layer performs multi-level optimization on it, applying each layer's data optimization strategy according to the specified target.
And S203, generating target self-decoding according to the tensor calculation model file.
Further, step S203 includes: obtaining corresponding file types and tensor calculation model data according to the tensor calculation model file; selecting a target coding strategy from a coding strategy set according to the file type; and coding the tensor calculation model data through the target coding strategy to obtain target self-decoding.
It can be understood that the file type refers to the type of the tensor calculation model file, the tensor calculation model data refer to each piece of data in the tensor calculation model file, and the target coding strategy refers to a strategy for coding the data. The target coding strategy is the coding strategy in the coding strategy set that is adapted to the tensor calculation model data, selected by file type; the tensor calculation model data are then coded through the target coding strategy to obtain the target self-decoding, for example as follows.
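The selection of a target coding strategy by file type reads as a straightforward dispatch, sketched below with invented strategy names and a byte string standing in for the target self-decoding.

```python
import json
import pickle

# Invented coding-strategy set keyed by tensor-calculation-model file type.
CODING_STRATEGY_SET = {
    "json":   lambda data: json.dumps(data).encode("utf-8"),
    "binary": lambda data: pickle.dumps(data),
}

def encode_model(file_type: str, model_data) -> bytes:
    """Select the target coding strategy by file type and code the tensor
    calculation model data; the returned byte string stands in for the
    target self-decoding in this sketch."""
    strategy = CODING_STRATEGY_SET[file_type]
    return strategy(model_data)

blob = encode_model("json", {"ops": ["conv", "relu"], "weights": [0.1, 0.2]})
```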
In this embodiment, the number of layers of the intermediate conversion layer is acquired, and the data optimization strategy arranged at each layer is obtained according to the number of layers and the intermediate-layer sub-optimization strategies; the target uniform-format intermediate representation element is optimized according to each layer's data optimization strategy to obtain a tensor calculation model file; and the target self-decoding is generated according to the tensor calculation model file. In this manner, the data optimization strategies arranged at each layer are obtained from the number of layers of the intermediate conversion layer and the intermediate-layer sub-optimization strategies, the target uniform-format intermediate representation element is then optimized through each layer's data optimization strategy according to the specified target, the tensor calculation model file is obtained once optimization completes, and the target self-decoding is then generated automatically from the tensor calculation model file, which can effectively improve the accuracy of generating the target self-decoding and further improve the adaptability between the model file and the target device.
In addition, an embodiment of the present invention further provides a storage medium, where a compiler-based tensor data calculation inference program is stored, and the compiler-based tensor data calculation inference program implements the steps of the compiler-based tensor data calculation inference method when executed by a processor.
Since the storage medium adopts all technical solutions of all the embodiments, at least all the beneficial effects brought by the technical solutions of the embodiments are achieved, and no further description is given here.
In addition, referring to fig. 4, an embodiment of the present invention further provides a tensor data calculation and inference device based on compiler, where the tensor data calculation and inference device based on compiler includes:
the obtaining module 10 is configured to obtain a specified input model file, and obtain a target uniform format intermediate representation element according to the specified input model file.
And the generating module 20 is configured to generate the target self-decoding according to the target uniform format intermediate representation element and the intermediate layer optimization strategy.
And the building module 30 is used for building the target device executable file according to the target self-decoding, a hardware driver and the development tool library.
And the reasoning module 40 is used for reasoning the target tensor data according to the target equipment executable file and the model type.
In this embodiment, a specified input model file is acquired and a target uniform-format intermediate representation element is obtained from it; a target self-decoding is generated according to the target uniform-format intermediate representation element and the intermediate-layer optimization strategy; a target device executable file is constructed according to the target self-decoding, a hardware driver and a development tool library; and the target tensor data are inferred according to the target device executable file and the model type. In this manner, formats are unified by the front-end access layer, data optimization is performed by the intermediate conversion layer under the intermediate-layer optimization strategy, and computational inference is performed with the target device executable file and model type constructed at the terminal execution layer, so that the device can adapt to the computational inference of various frameworks and platforms and the development workload of a developer performing model inference is reduced.
It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.
In addition, the technical details that are not described in detail in this embodiment can be referred to the tensor data calculation inference method based on the compiler provided in any embodiment of the present invention, and are not described herein again.
In an embodiment, the obtaining module 10 is further configured to obtain an appointed input model file, and obtain a corresponding numerical weight, a tensor structure, and a computation graph according to the appointed input model file; respectively acquiring the numerical weight, the tensor structure and the data format of the calculation graph; and when the condition that the data formats of any two of the numerical weight, the tensor structure and the calculation graph are inconsistent is met, converting the numerical weight, the tensor structure and the calculation graph into a target uniform format intermediate expression element through a front-end input layer.
In an embodiment, the generating module 20 is further configured to obtain the number of layers of the intermediate conversion layer, and obtain the data optimization strategy set at each layer according to the number of layers of the intermediate conversion layer and the intermediate-layer sub-optimization strategies; optimize the target uniform-format intermediate representation element according to each layer's data optimization strategy to obtain a tensor calculation model file; and generate the target self-decoding according to the tensor calculation model file.
In an embodiment, the generating module 20 is further configured to obtain corresponding file types and tensor calculation model data according to the tensor calculation model file; selecting a target coding strategy from a coding strategy set according to the file type; and coding the tensor calculation model data through the target coding strategy to obtain target self-decoding.
In an embodiment, the building module 30 is further configured to obtain a number of sets of hardware drivers and a number of sets of development tool libraries; determining a corresponding self-decoding type according to the target self-decoding; matching the self-decoding type with the hardware drive set to obtain a hardware drive; matching the self-decoding type with the development tool library set to obtain a development tool library; and translating the target self-decoding according to the hardware drive and the development tool library to obtain an executable file of the target equipment.
In an embodiment, the building module 30 is further configured to compile the target self-decoding according to the hardware driver and the development tool library to obtain a current assembly file; assembling the current assembly file to obtain a current binary file; and linking the current binary file with a database to be called to obtain an executable file of the target equipment.
In an embodiment, the inference module 40 is further configured to obtain information of a device to be operated, and determine a target operation device according to the information of the device to be operated; executing the executable file of the target equipment through the target running equipment to obtain a file execution result; determining a model type according to the executable file of the target equipment; and reasoning the target tensor data according to the file execution result and the model type.
For other embodiments of the compiler-based tensor data computation inference device, and the manner in which its modules operate, reference may be made to the method embodiments described above; they are not exhaustively repeated here.
Furthermore, it should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, an integrated platform workstation, or a network device, etc.) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present invention or directly or indirectly applied to other related technical fields are also included in the scope of the present invention.

Claims (10)

1. A compiler-based tensor data calculation inference method, characterized by comprising the following steps:
acquiring an appointed input model file, and acquiring a target uniform format intermediate representation element according to the appointed input model file;
generating a target self-decoding according to the target uniform format intermediate representation element and an intermediate layer optimization strategy;
constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library;
and reasoning the target tensor data according to the target device executable file and the model type.
2. The compiler-based tensor data computational inference method of claim 1, wherein the obtaining a specified input model file and deriving a target uniform-format intermediate representation element from it comprises:
acquiring an appointed input model file, and acquiring a corresponding numerical weight, a tensor structure and a calculation graph according to the appointed input model file;
respectively acquiring the numerical weight, the tensor structure and the data format of the calculation graph;
and when the condition that the data formats of any two of the numerical weight, the tensor structure and the calculation graph are inconsistent is met, converting the numerical weight, the tensor structure and the calculation graph into a target uniform format intermediate representation element through a front-end input layer.
3. The compiler-based tensor data computational inference method of claim 1, wherein the generating a target self-decode from the target uniform-format intermediate representational element and an intermediate-layer optimization strategy comprises:
acquiring the number of layers of the intermediate conversion layer, and obtaining the data optimization strategy arranged at each layer according to the number of layers of the intermediate conversion layer and the intermediate-layer sub-optimization strategies;
optimizing the intermediate representation element of the target uniform format according to the data optimization strategies of all layers to obtain a tensor calculation model file;
and generating target self-decoding according to the tensor calculation model file.
4. The compiler-based tensor data computational inference method of claim 3, wherein the generating a target self-decode from the tensor computation model file comprises:
obtaining corresponding file types and tensor calculation model data according to the tensor calculation model file;
selecting a target coding strategy from a coding strategy set according to the file type;
and coding the tensor calculation model data through the target coding strategy to obtain target self-decoding.
5. The compiler-based tensor data computational inference method of claim 1, wherein the building a target device executable file from the target self-decoding, a hardware driver, and a development tool library comprises:
acquiring a plurality of hardware drive sets and development tool library sets;
determining a corresponding self-decoding type according to the target self-decoding;
matching the self-decoding type with the hardware drive set to obtain a hardware drive;
matching the self-decoding type with the development tool library set to obtain a development tool library;
and translating the target self-decoding according to the hardware drive and the development tool library to obtain a target device executable file.
6. The compiler-based tensor data computational inference method of claim 5, wherein the translating the target self-decoding according to the hardware driver and the development tool library, resulting in a target device executable file, comprises:
compiling the target self-decoding according to the hardware drive and the development tool library to obtain a current assembly file;
assembling the current assembly file to obtain a current binary file;
and linking the current binary file with a database to be called to obtain an executable file of the target equipment.
7. The compiler-based tensor data computational inference method of any one of claims 1-6, wherein the inferring target tensor data from the target device executable and model type comprises:
acquiring information of equipment to be operated, and determining target operation equipment according to the information of the equipment to be operated;
executing the target equipment executable file through the target running equipment to obtain a file execution result;
determining a model type according to the executable file of the target equipment;
and reasoning the target tensor data according to the file execution result and the model type.
8. A compiler-based tensor data computational inference apparatus, characterized in that the compiler-based tensor data computational inference apparatus comprises:
the acquisition module is used for acquiring a specified input model file and acquiring a target uniform format intermediate representation element according to the specified input model file;
the generating module is used for generating target self-decoding according to the target uniform format intermediate representation element and the intermediate layer optimization strategy;
the construction module is used for constructing a target device executable file according to the target self-decoding, a hardware driver and a development tool library;
and the reasoning module is used for reasoning the target tensor data according to the target equipment executable file and the model type.
9. A compiler-based tensor data computation inference device, comprising: a memory, a processor, and a compiler-based tensor data computational inference program stored on the memory and executable on the processor, the compiler-based tensor data computational inference program configured with instructions to implement the compiler-based tensor data computational inference method of any one of claims 1-7.
10. A storage medium having stored thereon a compiler-based tensor data computational inference program which, when executed by a processor, implements the compiler-based tensor data computational inference method as claimed in any one of claims 1 to 7.
CN202211000929.5A 2022-08-19 2022-08-19 Tensor data calculation reasoning method and device based on compiler and storage medium Pending CN115423101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211000929.5A CN115423101A (en) 2022-08-19 2022-08-19 Tensor data calculation reasoning method and device based on compiler and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211000929.5A CN115423101A (en) 2022-08-19 2022-08-19 Tensor data calculation reasoning method and device based on compiler and storage medium

Publications (1)

Publication Number Publication Date
CN115423101A true CN115423101A (en) 2022-12-02

Family

ID=84199263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211000929.5A Pending CN115423101A (en) 2022-08-19 2022-08-19 Tensor data calculation reasoning method and device based on compiler and storage medium

Country Status (1)

Country Link
CN (1) CN115423101A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115796284A (en) * 2023-02-08 2023-03-14 苏州浪潮智能科技有限公司 Inference method, inference device, storage medium and equipment based on TVM compiler
CN116185532A (en) * 2023-04-18 2023-05-30 之江实验室 Task execution system, method, storage medium and electronic equipment
CN116185532B (en) * 2023-04-18 2023-07-21 之江实验室 Task execution system, method, storage medium and electronic equipment
CN116737175A (en) * 2023-08-16 2023-09-12 通用技术集团机床工程研究院有限公司 Decoding method, file analysis method, analyzer, system and storage medium
CN116737175B (en) * 2023-08-16 2023-12-08 通用技术集团机床工程研究院有限公司 Decoding method, file analysis method, analyzer, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination