CN113469364A - Inference platform, method and device


Info

Publication number
CN113469364A
Authority
CN
China
Prior art keywords
layer
private
private layer
inference
execution logic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010247286.9A
Other languages
Chinese (zh)
Other versions
CN113469364B (en)
Inventor
冯仁光
叶挺群
王鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010247286.9A
Publication of CN113469364A
Application granted
Publication of CN113469364B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/04 - Inference or reasoning models
    • G06N5/046 - Forward inferencing; Production systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Stored Programmes (AREA)

Abstract

Embodiments of the present application provide an inference platform, method and device. The inference platform comprises an inference library and a private layer interface. The private layer interface is used to define the execution logic of a private layer according to the layer information of the private layer input through the private layer interface. The inference library is used to execute an input deep learning algorithm model, using the support layers already covered by the inference library together with the registered private layer, so as to perform inference on an input image and obtain an inference result; whenever the layer being executed is the private layer, the private layer interface is invoked to execute the execution logic defined for that private layer. In this way, a single inference platform can complete inference for a variety of deep learning models containing user-defined private layers, no inference library needs to be developed separately for each application scenario, development efficiency is improved, development labor cost is reduced, the flexibility of application development is increased, and the device is opened up.

Description

Inference platform, method and device
Technical Field
The application relates to the technical field of deep learning, in particular to an inference platform, a method and a device.
Background
Performing forward computation with a neural network model obtained through deep-learning training is a process called inference. In practical application scenarios, a user may have certain customization requirements for the network model, for example customizing the operators and calculation methods used by some of its layers; layers customized according to user requirements are hereinafter referred to as private layers. Network models therefore differ as users' customization requirements differ.
Inference can be realized by an inference platform. The inference platform comprises an inference library in which the execution logic of various layers is stored, and the platform performs inference according to the execution logic contained in the inference library. In the related art, to enable the inference platform to complete inference of a network model containing a private layer, a corresponding inference library may be developed in advance for that private layer, so that the inference library contains the private layer's execution logic. However, customization requirements vary from user to user and from one application scenario to another. Developing a separate inference library for every customization requirement takes considerable time and makes inference inefficient; moreover, since user applications are diverse, having inference-library developers do all of this work imposes a heavy workload and makes it difficult for users to develop applications flexibly.
Disclosure of Invention
The embodiments of the present application aim to provide an inference platform, method and device, so that no corresponding inference library needs to be developed for each application scenario, the inference efficiency of the inference library is improved, labor cost is reduced, and users can develop applications more flexibly. The specific technical solutions are as follows:
In a first aspect of the embodiments of the present application, there is provided an inference platform, including an inference library and a private layer interface;
the private layer interface is used for acquiring layer information of a private layer; defining the execution logic of the private layer according to the layer information of the private layer; and registering the private layer in the inference library;
the reasoning library is used for executing an input deep learning algorithm model by utilizing a support layer covered by the reasoning library and a registered private layer so as to reason the input image and obtain a reasoning result; and calling the private layer interface to execute the defined execution logic of the executed private layer each time the executed layer is the registered private layer.
In a possible embodiment, the inference library is specifically configured to execute an input deep learning algorithm model for image recognition by using a support layer already covered by the inference library and a registered private layer, so as to recognize an input image and obtain a recognition result.
In a possible embodiment, the private layer interface is further configured to obtain a custom parameter and, when called by the inference library, to execute the execution logic defined for the private layer executed by the inference library in the form configured by the custom parameter.
In a possible embodiment, the custom parameter is used to configure a system resource occupied by executing the execution logic defined by the private layer, and the private layer interface is specifically used to call the system resource configured by the custom parameter and execute the execution logic defined by the private layer executed by the inference library.
In a possible embodiment, the custom parameter is used to configure a calculation mode for implementing the execution logic defined by the private layer, and the private layer interface is specifically used to perform calculation according to the calculation mode configured by the custom parameter, so as to execute the execution logic defined by the private layer executed by the inference library.
In a second aspect of the embodiments of the present application, there is provided an inference method applied to the private layer interface of the inference platform according to any one of the first aspect, the method including:
acquiring layer information of a private layer;
defining execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
when called by the inference library, the execution logic defined by the private layer is executed.
In a possible embodiment, the method further comprises:
obtaining a custom parameter;
the executing the execution logic defined by the private layer comprises:
and executing the execution logic defined by the private layer according to the form configured by the custom parameter.
In a possible embodiment, the custom parameter is used to configure a system resource occupied by executing a private layer defined execution logic, and the executing the private layer defined execution logic executed by the inference library according to the form configured by the custom parameter includes:
and calling the system resource configured by the custom parameter, and executing the execution logic defined by the private layer executed by the inference library.
In a possible embodiment, the custom parameter is used to configure a computing manner for implementing the execution logic defined by the private layer, and the executing the execution logic defined by the private layer executed by the inference library according to the form configured by the custom parameter includes:
and calculating according to the calculation mode configured by the custom parameters to execute the execution logic defined by the private layer executed by the inference library.
In a third aspect of embodiments of the present application, there is provided an inference apparatus, where the apparatus is applied to a private layer interface of an inference platform as described in any one of the above first aspects, the apparatus includes:
the layer information acquisition module is used for acquiring the layer information of the private layer;
a private layer definition module, configured to define an execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
and the algorithm implementation module is used for executing the execution logic defined by the private layer when the algorithm implementation module is called by the inference library.
In a possible embodiment, the apparatus further includes a parameter obtaining module, configured to obtain the custom parameter;
the algorithm implementation module is specifically configured to execute the execution logic defined by the private layer according to a form configured by the custom parameter.
In a possible embodiment, the custom parameter is used to configure a system resource occupied by executing the execution logic defined by the private layer, and the algorithm implementation module is specifically used to invoke the system resource configured by the custom parameter and execute the execution logic defined by the private layer executed by the inference library.
In a possible embodiment, the custom parameter is used to configure a calculation mode for implementing the execution logic defined by the private layer, and the algorithm implementation module is specifically used to perform calculation according to the calculation mode configured by the custom parameter, so as to execute the execution logic defined by the private layer executed by the inference library.
In a fourth aspect of embodiments of the present application, there is provided an electronic device, including:
a memory for storing a computer program;
a processor for implementing the method steps according to the second aspect when executing the program stored in the memory.
In a fifth aspect of embodiments of the present application, a computer-readable storage medium is provided, in which a computer program is stored, which, when being executed by a processor, performs the method steps of any one of the above-mentioned second aspects.
With the inference platform, method and device provided by the embodiments of the present application, by opening the private layer interface in the inference platform, user-defined models for different application scenarios can be registered in the inference library, and model algorithms containing customized private layers are realized by calling the private layer interface. A single inference platform can thus complete inference for a variety of deep learning models with user-defined private layers, and no corresponding inference library needs to be developed separately for each application scenario, which improves development efficiency, reduces development labor cost, increases the flexibility of application development, and opens up the device. Of course, it is not necessary for any product or method implementing the present application to achieve all of the advantages described above at the same time.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic flowchart of an inference method provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of an inference platform provided in an embodiment of the present application;
fig. 3 is another schematic flow chart of an inference method provided in an embodiment of the present application;
fig. 4 is a schematic flowchart of a private layer interface algorithm implementation provided in an embodiment of the present application;
fig. 5 is a schematic structural diagram of an inference device provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present application.
To explain the inference platform provided by the embodiments of the present application more clearly: the inference platform may be a single electronic device or be composed of multiple electronic devices; for example, it may be an independent server, or consist of multiple servers that establish communication connections with one another, which is not limited by this embodiment. The inference flow of an inference platform is described below with reference to fig. 1, which shows a model inference method provided by an embodiment of the present application; the method may include:
and S101, creating an inference library through an inference library interface.
The created inference library covers various layers, and which layers are covered may differ with the application scenario; however, the layers covered by an inference library are always limited, so in some application scenarios one or more layers of the deep learning model used by the user may not be covered by the inference library. For convenience of description here, it is assumed that all layers adopted by the deep learning model are covered by the inference library.
S102, inputting the deep learning model and the image to be processed into the inference library.
The deep learning model may have different functions in different application scenarios; for example, it may be a deep learning model for image recognition, or a deep learning model for extracting video optical-flow information, which is not limited by this embodiment. The image to be processed may be a single image or multiple images, for example several consecutive image frames of a video.
S103, the inference library executes the input deep learning model using the covered layers, so as to process the input image to be processed and obtain an inference result.
As in the foregoing analysis, it is assumed here for convenience of description that all layers employed by the deep learning model are covered by the inference library. Since the inference library can implement the execution logic of these covered layers, it can complete the execution of the deep learning model and obtain the inference result.
The inference library may analyze the deep learning model to determine each layer adopted by the model, look up the locally stored layer information of each already-covered layer, and implement the execution logic of the layer according to that layer information.
However, if at least one layer adopted by the deep learning model is not covered by the inference library, the inference library cannot implement the execution logic of that layer, and therefore cannot perform inference on the input image according to the execution logic represented by the model. In the related art, a new inference library can be developed through the inference library interface, but the time and labor cost of developing a new inference library are high, which makes inference inefficient and application development inflexible.
Based on this, the embodiment of the present application provides an inference platform, which may include an inference library 210 and a private layer interface 220 as shown in fig. 2.
The private layer interface 220 is configured to obtain the layer information of the private layer, define the execution logic of the private layer according to that layer information, and register the private layer in the inference library 210.
The inference library 210 is used for executing the input deep learning algorithm model using the support layers already covered by the inference library 210 and the registered private layer, so as to perform inference on the input image and obtain an inference result; whenever the layer being executed is the registered private layer, the private layer interface is called to execute the execution logic defined for that private layer. A sketch of this dispatch is given below.
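For illustration only, the dispatch just described can be pictured as follows. This is a minimal sketch under assumed names (InferenceLibrary, register_private_layer, forward, and so on); the embodiment does not disclose a concrete API, so every identifier here is a hypothesis.

```python
# Minimal sketch of the layer dispatch described above. All identifiers are
# hypothetical illustrations; the embodiment does not prescribe a concrete API.

class InferenceLibrary:
    def __init__(self, builtin_layers):
        self.builtin_layers = builtin_layers   # layer type -> covered support-layer logic
        self.private_layers = {}               # layer type -> registered private layer interface

    def register_private_layer(self, layer_type, private_iface):
        # called by the private layer interface 220 to register a private layer
        self.private_layers[layer_type] = private_iface

    def run(self, model, image):
        data = image
        for layer in model.layers:
            if layer.type in self.builtin_layers:
                data = self.builtin_layers[layer.type](layer, data)      # covered support layer
            elif layer.type in self.private_layers:
                # the executed layer is a registered private layer: call the interface
                data = self.private_layers[layer.type].forward(layer, data)
            else:
                raise ValueError(f"layer '{layer.type}' is neither covered nor registered")
        return data   # inference result
```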
With this embodiment, by opening the private layer interface in the inference platform, user-defined models for different application scenarios can be registered in the inference library, and model algorithms containing customized private layers are realized by calling the private layer interface. A single inference platform can therefore complete inference for a variety of deep learning models with user-defined private layers, and no corresponding inference library needs to be developed separately for each application scenario, which improves development efficiency, reduces development labor cost, increases the flexibility of application development, and opens up the device.
In some possible embodiments, in order to unify the representation of layer information, the layer information of the private layer acquired by the private layer interface 220 has the same representation as the layer information of the layers registered in the inference library 210 when the inference library 210 was created. In a possible embodiment, the layer information of the private layer may be obtained by reading a private layer file input by the user through a preset user terminal, such as private_layer.c (a private layer file written in the C or C++ language) or private_layer.py (a private layer file written in the Python language); depending on the application scenario, the private layer file may also be written in another computer language. A sketch of what such a file might contain is given below.
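As an illustration, a private_layer.py of the kind just mentioned might look like the following. The layer type, its parameters and its scaled-shift operator are invented for the example; the embodiment only requires that the file carry the layer information and execution logic of the private layer.

```python
# Hypothetical private_layer.py. The layer type, its parameters and its operator
# (y = alpha * x + beta) are examples only, not part of the embodiment.

import numpy as np

class ScaledShiftLayer:
    layer_type = "scaled_shift"          # layer information: type name of the private layer

    def __init__(self, alpha=1.0, beta=0.0):
        self.alpha = alpha               # layer information: operator parameters
        self.beta = beta

    def forward(self, layer, data):
        # execution logic defined for the private layer
        return self.alpha * np.asarray(data, dtype=np.float32) + self.beta

# Registration through the (hypothetical) private layer interface, after which the
# inference library calls forward() whenever it executes this layer:
#   lib.register_private_layer(ScaledShiftLayer.layer_type, ScaledShiftLayer(2.0, 0.5))
```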
When executing the algorithms adopted by the input deep learning model, the inference library 210 distinguishes two cases. If the executed algorithm was already included in the inference library 210 when the inference library 210 was created, the layer information of that algorithm is registered in the inference library 210, and the inference library 210 can implement the algorithm in the manner indicated by its layer information. If the executed algorithm was not registered when the inference library 210 was created, i.e. the algorithm belongs to a private layer registered in the inference library through the private layer interface, the inference library 210 invokes the private layer interface 220; specifically, the private layer interface 220 may be invoked when the inference library 210 parses a private layer in the deep learning model.
The private layer interface 220 acquires the layer information of the private layer in advance, and can therefore implement the private layer based on that layer information when called by the inference library 210. Specifically, when the inference library 210 parses a private layer in the deep learning model, it calls the private layer interface 220 to determine the memory occupied by the private layer model; the private layer interface 220 applies for the corresponding memory, the inference library 210 allocates the corresponding model memory for the private layer, and the private layer interface 220 is then called to create the private layer model. Similarly, the inference library 210 calls the private layer interface 220 to compute the memory required by the algorithm adopted by the private layer model, allocates the corresponding algorithm memory, and calls the private layer interface 220 to create an instance of the algorithm; this instance is executed to process the input of the private layer and obtain the output of the private layer. Depending on the position of the private layer in the deep learning model, the input of the private layer may be a feature map of the image to be processed, or the calculation result output by the layer above the private layer in the deep learning model.
In one possible embodiment, the private layer interface 220 may also be used to obtain custom parameters, which represent the manner in which the private layer is executed. The private layer interface may specifically be used to execute, when called by the inference library, the execution logic defined for the private layer based on the layer information of the private layer and in the form configured by the custom parameters. It can be understood that the same execution logic may be executed and implemented in several different manners, and these manners achieve different technical effects, which may be required in different application scenarios. This embodiment therefore lets the user deeply customize the private layer in the deep learning model through the private layer interface opened by the inference platform, so that the obtained inference result better meets the user's actual requirements.
In different application scenarios, the custom parameter may carry different meanings. For example, in one possible embodiment, the custom parameter is used to configure the system resources occupied by executing the execution logic defined for the private layer, and the private layer interface 220 is specifically used to call the system resources indicated by the custom parameter when executing the execution logic defined for the private layer executed by the inference library.
For example, the custom parameters may specifically be used to configure how much memory is occupied by the model and/or the algorithm used by the custom layer. Taking image recognition as an example, in some application scenarios the computing load of the electronic device performing image recognition may be high, so the system resources occupied by executing the execution logic defined for the private layer can be reduced by adjusting the custom parameter, to prevent the electronic device from freezing under excessive computing load.
In other application scenarios, the computing load of the electronic device performing image recognition may be low, so the system resources occupied by executing the execution logic defined for the private layer can be increased by adjusting the custom parameter, in order to make full use of the computing resources and improve the efficiency of image recognition.
For another example, in one possible embodiment, the custom parameter is used to represent the computation manner in which the execution logic defined for the private layer is implemented, and the private layer interface 220 is specifically used to compute according to the computation manner represented by the custom parameter, so as to execute the execution logic defined for the private layer executed by the inference library.
The way the computation manner is configured may differ with the application scenario; for example, the custom parameter may configure the computation manner by configuring one or more of the following items:
the model used by the custom layer, the algorithm used by the custom layer, the computational logic of the forward execution computation of the custom layer, and the size of the output of the custom layer.
Wherein the model may be represented in the form of a handle to the model, the algorithm in the form of a handle to the algorithm, and the size of the output by a reshape parameter (reshape being a function for resizing). One hypothetical way to group these items is sketched below.
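The following dataclass is one possible way to group the items above. The patent enumerates what is configurable but not how the custom parameters are represented, so every field name here is an assumption.

```python
# Hypothetical container for the custom parameters; all field names are assumptions,
# the embodiment only enumerates what can be configured.

from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class CustomParams:
    model_handle: Optional[object] = None           # handle of the model used by the custom layer
    algorithm_handle: Optional[object] = None       # handle of the algorithm used by the custom layer
    forward_fn: Optional[Callable] = None           # computation logic of the forward computation
    output_shape: Optional[Tuple[int, ...]] = None  # output size, in the spirit of a reshape parameter
    max_memory_bytes: Optional[int] = None          # cap on occupied system resources (see above)
```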
Taking image recognition as an example, in some application scenarios, to improve the accuracy of image recognition, the execution logic of the custom layer may be implemented with a computation manner of higher algorithmic complexity but higher accuracy; in other application scenarios, to improve the efficiency of image recognition, it may be implemented with a computation manner of lower algorithmic complexity. Implementing the execution logic defined for the custom layer in different computation manners thus achieves different technical effects.
Referring to fig. 3, fig. 3 is a schematic flow chart of an inference method provided in the embodiment of the present application, which may be applied to a private layer interface in an inference platform, and the method may include:
s301, layer information of the private layer is obtained.
S302, defining the execution logic of the private layer according to the private layer information, and registering the private layer in the inference library, so that the inference library calls a private layer interface when executing the private layer.
S303, when called by the inference library, executing the execution logic defined for the private layer.
With this embodiment, by opening the private layer interface in the inference platform, user-defined models for different application scenarios can be registered in the inference library, and model algorithms containing customized private layers are realized by calling the private layer interface. A single inference platform can therefore complete inference for a variety of deep learning models with user-defined private layers, and no corresponding inference library needs to be developed separately for each application scenario, which improves development efficiency, reduces development labor cost, increases the flexibility of application development, and opens up the device.
For the layer information in S301, reference may be made to the foregoing description about the private layer interface, which is not described herein again.
In a possible embodiment, the implementation flow of the private layer in S302 may be as shown in fig. 4 and includes:
s401, the private layer interface obtains the memory needed by the private layer model.
S402, the private layer interface creates the private layer model.
S403, the private layer interface obtains the memory required by the algorithm adopted by the private layer model.
S404, the private layer interface creates an algorithm adopted by the private layer model.
S405, the private layer interface obtains the custom parameters.
S406, the private layer interface configures the created algorithm according to the mode represented by the custom parameter.
S407, the private layer interface executes the configured algorithm to process the input of the private layer, so as to obtain the output of the private layer.
S408, the private layer interface releases the algorithm.
S409, the private layer interface releases the model.
It should be understood that fig. 4 is only a schematic implementation flow of the private layer provided in this embodiment of the present application. As described above, the meaning indicated by the custom parameters may differ across application scenarios, so the execution order of S405 relative to the other steps may differ from that shown in fig. 4; for example, S405 may be performed before any of steps S401 to S404, which is not limited by this embodiment.
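Under the same hypothetical API as the earlier sketches, the S401-S409 flow could look as follows; the query/create/release method names and the allocator are assumptions, not the embodiment's actual interface, and only the order of steps follows the text.

```python
# Sketch of the S401-S409 flow of fig. 4. Method names (query_model_memory, ...)
# and lib.allocate are hypothetical; only the sequence of steps follows the text.

def run_private_layer(iface, lib, layer_input, custom_params):
    model_size = iface.query_model_memory()                 # S401: memory needed by the model
    model = iface.create_model(lib.allocate(model_size))    # S402: create the private layer model

    algo_size = iface.query_algorithm_memory(model)         # S403: memory needed by the algorithm
    algo = iface.create_algorithm(model, lib.allocate(algo_size))  # S404: create the algorithm

    iface.configure(algo, custom_params)                    # S405/S406: obtain and apply custom parameters
    output = iface.execute(algo, layer_input)               # S407: process the private layer input

    iface.release_algorithm(algo)                           # S408: release the algorithm
    iface.release_model(model)                              # S409: release the model
    return output                                           # output of the private layer
```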
Referring to fig. 5, fig. 5 is a schematic structural diagram of an inference apparatus provided in an embodiment of the present application, where the apparatus is applied to a private layer interface of an inference platform as described in any one of the above, and the apparatus includes:
a layer information obtaining module 501, configured to obtain layer information of a private layer, where the layer information is used to represent an implementation manner of the private layer;
a private layer definition module 502, configured to define execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
an algorithm implementation module 503, configured to execute the execution logic defined by the private layer when called by the inference library.
In a possible embodiment, the apparatus further includes a parameter obtaining module, configured to obtain the custom parameter;
the algorithm implementation module 503 is specifically configured to execute the execution logic defined by the private layer according to the form configured by the custom parameter.
In a possible embodiment, the custom parameter is used to configure a system resource occupied by executing the execution logic defined by the private layer, and the algorithm implementation module 503 is specifically used to invoke the system resource configured by the custom parameter and execute the execution logic defined by the private layer executed by the inference library.
In a possible embodiment, the custom parameter is used to configure a calculation mode for implementing the execution logic defined by the private layer, and the algorithm implementation module 503 is specifically configured to perform calculation according to the calculation mode configured by the custom parameter, so as to execute the execution logic defined by the private layer executed by the inference library.
An embodiment of the present application further provides an electronic device, as shown in fig. 6, including:
a memory 601 for storing a computer program;
the processor 602 is configured to implement the following steps when executing the program stored in the memory 601:
acquiring layer information of a private layer;
defining execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
when called by the inference library, the execution logic defined by the private layer is executed.
In a possible embodiment, the method further comprises:
obtaining a custom parameter;
the executing the execution logic defined by the private layer comprises:
and executing the execution logic defined by the private layer according to the form configured by the custom parameter.
In a possible embodiment, the custom parameter is used to configure a system resource occupied by executing a private layer defined execution logic, and the executing the private layer defined execution logic executed by the inference library according to the form configured by the custom parameter includes:
and calling the system resource configured by the custom parameter, and executing the execution logic defined by the private layer executed by the inference library.
In a possible embodiment, the custom parameter is used to configure a computing manner for implementing the execution logic defined by the private layer, and the executing the execution logic defined by the private layer executed by the inference library according to the form configured by the custom parameter includes:
and calculating according to the calculation mode configured by the custom parameters to execute the execution logic defined by the private layer executed by the inference library.
The memory mentioned in the above electronic device may include a random access memory (RAM), and may also include a non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The inference platform may also be referred to as an inference system, comprising a private layer interface and an inference library, and it realizes model inference for different application scenarios, for example a face recognition model, a vehicle recognition model, a speech recognition model, a foreground recognition model, an action recognition model, an attribute classification model, and so on.
In yet another embodiment provided by the present application, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform any of the above-described embodiments of the inference method.
In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the inference methods of the above embodiments.
In the above embodiments, the implementation may be realized wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but possibly also other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the embodiments of the apparatus, the electronic device, the computer-readable storage medium, and the computer program product, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (10)

1. An inference platform, characterized in that the inference platform comprises an inference library and a private layer interface;
the private layer interface is used for acquiring layer information of a private layer; defining the execution logic of the private layer according to the layer information of the private layer; and registering the private layer in the inference library;
the inference library is used for executing an input deep learning algorithm model using the support layers already covered by the inference library and the registered private layer, so as to perform inference on an input image and obtain an inference result; whenever the layer being executed is the registered private layer, the private layer interface is called to execute the execution logic defined for that private layer.
2. The inference platform of claim 1, wherein the inference library is specifically configured to execute an input deep learning algorithm model for image recognition by using a support layer already covered by the inference library and a registered private layer, so as to recognize an input image and obtain a recognition result.
3. The inference platform of claim 1, wherein the private layer interface is further configured to obtain a custom parameter and, when called by the inference library, to execute the execution logic defined for the private layer executed by the inference library in the form configured by the custom parameter.
4. The inference platform of claim 3, wherein the custom parameter is configured to configure a system resource occupied by executing a private layer defined execution logic, and the private layer interface is specifically configured to invoke the system resource configured by the custom parameter and execute the private layer defined execution logic executed by the inference library.
5. The inference platform of claim 3, wherein the custom parameter is configured to configure a computing manner for implementing the private layer defined execution logic, and the private layer interface is specifically configured to perform computing according to the computing manner configured by the custom parameter to execute the private layer defined execution logic executed by the inference library.
6. A method of reasoning, applied to a private layer interface of a reasoning platform according to any of claims 1-5, the method comprising:
acquiring layer information of a private layer;
defining execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
when called by the inference library, the execution logic defined by the private layer is executed.
7. The method of claim 6, further comprising:
obtaining a custom parameter;
the executing the execution logic defined by the private layer comprises:
and executing the execution logic defined by the private layer according to the form configured by the custom parameter.
8. The method according to claim 7, wherein the custom parameter is used to configure system resources occupied by executing the private layer defined execution logic, and the executing the private layer defined execution logic executed by the inference library according to the form configured by the custom parameter includes:
and calling the system resource configured by the custom parameter, and executing the execution logic defined by the private layer executed by the inference library.
9. The method of claim 7, wherein the custom parameter is used to configure a computing manner for implementing the private layer defined execution logic, and wherein executing the private layer defined execution logic executed by the inference library in a form configured by the custom parameter comprises:
and calculating according to the calculation mode configured by the custom parameters to execute the execution logic defined by the private layer executed by the inference library.
10. An inference apparatus, characterized in that it is applied to the private layer interface of the inference platform according to any of claims 1-3, said apparatus comprising:
the layer information acquisition module is used for acquiring the layer information of the private layer;
a private layer definition module, configured to define an execution logic of the private layer according to the layer information; and registering the private layer in an inference library, such that the inference library invokes the private layer interface when executing the private layer;
and the algorithm implementation module is used for executing the execution logic defined by the private layer when the algorithm implementation module is called by the inference library.
CN202010247286.9A 2020-03-31 2020-03-31 Reasoning platform, method and device Active CN113469364B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010247286.9A CN113469364B (en) 2020-03-31 2020-03-31 Reasoning platform, method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010247286.9A CN113469364B (en) 2020-03-31 2020-03-31 Reasoning platform, method and device

Publications (2)

Publication Number Publication Date
CN113469364A true CN113469364A (en) 2021-10-01
CN113469364B CN113469364B (en) 2023-10-13

Family

ID=77865736

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010247286.9A Active CN113469364B (en) 2020-03-31 2020-03-31 Reasoning platform, method and device

Country Status (1)

Country Link
CN (1) CN113469364B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124487A1 (en) * 2015-03-20 2017-05-04 Salesforce.Com, Inc. Systems, methods, and apparatuses for implementing machine learning model training and deployment with a rollback mechanism
WO2016154440A1 (en) * 2015-03-24 2016-09-29 Hrl Laboratories, Llc Sparse inference modules for deep learning
CN106294899A (en) * 2015-05-26 2017-01-04 中国电力科学研究院 The self-defined emulation mode of power system customer based on object-oriented program framework
CN107766940A (en) * 2017-11-20 2018-03-06 北京百度网讯科技有限公司 Method and apparatus for generation model
CN108335300A (en) * 2018-06-22 2018-07-27 北京工商大学 A kind of food hyperspectral information analysis system and method based on CNN
US10210860B1 (en) * 2018-07-27 2019-02-19 Deepgram, Inc. Augmented generalized deep learning with special vocabulary
US20190171950A1 (en) * 2019-02-10 2019-06-06 Kumar Srivastava Method and system for auto learning, artificial intelligence (ai) applications development, operationalization and execution

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MARCIN HINZ等: "Framework for Customized, Machine Learning Driven Condition Monitoring System for Manufacturing", PROCEDIA MANUFACTURING, vol. 39, pages 243 - 250 *
LIU Qian et al., "Research on monitoring and discrimination for multi-environment information fusion in the agricultural Internet of Things", China Masters' Theses Full-text Database, Information Science and Technology series, no. 6, pages 136 - 437 *

Also Published As

Publication number Publication date
CN113469364B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
US10216834B2 (en) Accurate relationship extraction with word embeddings using minimal training data
US20100275186A1 (en) Segmentation for static analysis
CN110555550B (en) Online prediction service deployment method, device and equipment
US11642783B2 (en) Automated generation of robotic computer program code
US10834059B2 (en) Secure message handling of an application across deployment locations
CN114089986A (en) Method, system, device and medium for configuring low code development platform function expression
CN115964646A (en) Heterogeneous graph generation for application microservices
CN114490116B (en) Data processing method and device, electronic equipment and storage medium
US11025500B2 (en) Provisioning infrastructure from visual diagrams
US11302096B2 (en) Determining model-related bias associated with training data
JP2018169693A (en) Information processing device, information processing method, and information processing program
CN112906554B (en) Model training optimization method and device based on visual image and related equipment
CN112148276A (en) Visual programming for deep learning
CN113505895A (en) Machine learning engine service system, model training method and configuration method
CN112527416A (en) Task processing method and device, computer equipment and storage medium
CN112035092A (en) Form processing method, device, equipment and readable medium
CN113469364B (en) Reasoning platform, method and device
CN112947907A (en) Method for creating code branch
US10565470B2 (en) System, method and recording medium for user interface (UI)-level clone detection
CN116051031A (en) Project scheduling system, medium and electronic equipment
US20190188559A1 (en) System, method and recording medium for applying deep learning to mobile application testing
US20220113964A1 (en) Learning-based automation machine learning code annotation in computational notebooks
US10902046B2 (en) Breaking down a high-level business problem statement in a natural language and generating a solution from a catalog of assets
CN109150993B (en) Method for obtaining network request tangent plane, terminal device and storage medium
TWI822282B (en) Computer-implemented method, computer system and computer program product for object detection considering tendency of object location

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant