CN113608729B

CN113608729B - Method for realizing deployment client

Info

Publication number: CN113608729B
Application number: CN202110946751.2A
Authority: CN
Inventors: 王玉梁
Original assignee: Shandong New Generation Information Industry Technology Research Institute Co Ltd
Current assignee: Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority date: 2021-08-18
Filing date: 2021-08-18
Publication date: 2023-07-04
Anticipated expiration: 2041-08-18
Also published as: CN113608729A

Abstract

The invention relates to the field of deep learning, and in particular to a method for implementing a deployment client. First, an abstract class with five public methods is defined; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information. Parsing of the model information and creation of the grpc-client are carried out in the constructor of the abstract class. Finally, the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call. Compared with the prior art, the method is easy to extend to newly added models, supports deploying multiple models, is better suited to embedded devices with high performance requirements because it is implemented in C++, and produces a dynamic link library that can be called from multiple languages to build richer front-end functions.

Description

Method for realizing deployment client

Technical Field

The invention relates to the field of deep learning, and in particular provides a method for implementing a deployment client.

Background

Model training is only a small part of deep learning; model deployment is the most important step in putting deep learning technology into practice. The current mainstream deployment approaches include Python service deployment, direct model loading in Java, Docker with TF-Serving (TensorFlow Serving), and the NVIDIA Triton deployment framework.

Each deployment approach has advantages and disadvantages. Python service deployment and direct model loading in Java have low inference speed and require a Python environment, so they are not suitable for embedded environments with high performance requirements.

The latter two approaches deploy in Docker containers and expose http and grpc interfaces externally. The grpc service has an advantage in batch processing of image data, which makes it suitable for model inference on video stream input, and these frameworks support deploying multiple models. Of the four deployment approaches, NVIDIA Triton has the fastest inference speed.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides a practical method for implementing a deployment client.

The technical scheme adopted for solving the technical problems is as follows:

First, an abstract class and five public methods are defined. A newly added model inherits the abstract class to produce a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model.

Parsing of the model information and creation of the grpc-client are carried out in the constructor of the abstract class. Finally, the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call.

Further, where the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference pass, and the http-client and grpc-client pointers used to access the triton-server.

Further, where the member variables of the abstract class include the model information parameter structure, initialization of the abstract class first creates the http_client and grpc_client object pointers from the name of the model passed in and the address of the inference server. The pointers are then used to access the triton-server and obtain the model_meta and model_config data. Finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are resolved according to the image format.

Furthermore, where derived classes inherit the abstract class and implement the five public methods, each frame of image passes through the preprocessing, input, output, inference and post-processing flows in accordance with the processing logic of the deep learning model.

Furthermore, the processing in each stage differs from model to model. The model preprocessing interface crops the image, converts its color mode, normalizes it for feeding to the model, and stores the image data.

Further, the model input interface creates an input pointer of the triton-client type from the parsed model information; this pointer serves as an input parameter of the model inference interface.

The model output interface creates output pointers for the model according to the model output names defined on the triton-server side, and supports creating several different outputs.

Furthermore, the model inference interface uses a function callback: the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback.

The model post-processing interface traverses the model outputs and stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation.

The inference result is then packaged into json format data with the third-party rapidjson component and returned.

Further, when the C++ project interface is wrapped into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.

Further, the initialization interface first opens the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the triton server address and the camera address to the constructor of the abstract class, which parses the model information.

The inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing flows.

The resource release interface releases the resources created in the initialization interface and the inference output interface.

Compared with the prior art, the method for realizing the deployment client has the following outstanding beneficial effects:

The invention is easy to extend to newly added models, supports deploying multiple models, is better suited to embedded devices with high performance requirements because it is implemented in C++, and produces a dynamic link library that can be called from multiple languages to build richer front-end functions.

To support a newly added model, the Triton client only needs to implement five public interfaces, so deploying the client is simple and later maintenance costs are low. The Triton client can be built into a dynamic link library and wrapped into three C interfaces for calling from various languages, so that richer front-end functions can be built on the inference results. The method supports deploying multiple models at the same time, is better suited to batch processing of image data because it uses grpc for interaction, and is better suited to embedded devices with high performance requirements because it is implemented in C++.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a method for implementing a client deployment.

Detailed Description

In order to provide a better understanding of the aspects of the present invention, the present invention will be described in further detail with reference to specific embodiments. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

A preferred embodiment is given below:

As shown in FIG. 1, in the method for implementing a deployment client of this embodiment, an abstract class and five public methods are first defined; a newly added model inherits the abstract class to produce a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model.

1. The abstract class defines five public methods: model preprocessing, input, output, inference and post-processing

The member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference pass, the http-client and grpc-client pointers used to access the triton-server, and so on.

When the abstract class is initialized, the http_client and grpc_client object pointers are first created from the name of the model passed in and the address of the inference server. The pointers are then used to access the triton-server and obtain the model_meta and model_config data. Next, the model information is parsed from model_meta and model_config and stored in the defined model information structure. Finally, the width, height and channels of the image are resolved according to the image format.
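The class layout described in this step might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names ModelInfo, ParseShape, ModelClient and DummyDetector are hypothetical, the five method names are invented for the example, and the client pointers and the server queries are left out so that the sketch stays self-contained (the constructor receives the already-fetched layout and shape instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model-information structure (all field names are assumptions).
struct ModelInfo {
    std::string model_name;
    std::string input_name;
    std::vector<std::string> output_names;
    std::string input_format;  // layout of the input, e.g. "FORMAT_NHWC"
    int height = 0;
    int width = 0;
    int channels = 0;
    int max_batch_size = 0;    // number of images per inference pass
};

// Resolve height, width and channels from a three-element input shape.
// Anything other than "FORMAT_NHWC" is treated as channel-first here.
inline void ParseShape(const std::vector<std::int64_t>& dims, ModelInfo& info) {
    assert(dims.size() == 3);
    const bool nhwc = (info.input_format == "FORMAT_NHWC");
    info.height   = static_cast<int>(nhwc ? dims[0] : dims[1]);
    info.width    = static_cast<int>(nhwc ? dims[1] : dims[2]);
    info.channels = static_cast<int>(nhwc ? dims[2] : dims[0]);
}

// Abstract base class: one pure virtual method per pipeline stage.
class ModelClient {
public:
    // In the described design the constructor would create the clients and
    // query the server; this sketch takes the already-fetched values instead.
    ModelClient(std::string model_name, std::string input_format,
                const std::vector<std::int64_t>& input_dims) {
        info_.model_name = std::move(model_name);
        info_.input_format = std::move(input_format);
        ParseShape(input_dims, info_);
    }
    virtual ~ModelClient() = default;

    virtual void Preprocess(const std::vector<std::uint8_t>& frame) = 0;
    virtual void PrepareInputs() = 0;
    virtual void PrepareOutputs() = 0;
    virtual void Infer() = 0;
    virtual std::string Postprocess() = 0;

    const ModelInfo& info() const { return info_; }

protected:
    ModelInfo info_;
};

// A newly added model only has to override the five methods.
class DummyDetector : public ModelClient {
public:
    using ModelClient::ModelClient;
    void Preprocess(const std::vector<std::uint8_t>& frame) override { frame_ = frame; }
    void PrepareInputs() override {}
    void PrepareOutputs() override {}
    void Infer() override { bytes_seen_ = frame_.size(); }
    std::string Postprocess() override {
        return "{\"bytes\":" + std::to_string(bytes_seen_) + "}";
    }

private:
    std::vector<std::uint8_t> frame_;
    std::size_t bytes_seen_ = 0;
};
```

Keeping the parsing logic in the base class means a derived class such as the hypothetical DummyDetector contains only model-specific code, which is the extension property the description emphasizes.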

2. Derived classes inherit the abstract class and implement the five public methods

According to the processing logic of a deep learning model, each frame of image passes through the preprocessing, input, output, inference and post-processing flows; the processing in each stage differs from model to model.

The model preprocessing interface crops the image, converts its color mode, normalizes it for feeding to the model, and stores the image data (some models need no cropping and only store the image data of the original image).
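As an illustration of the preprocessing stage, the sketch below crops a region from an interleaved 8-bit image, swaps the channel order and scales the values. Several points are assumptions made only for the example: the input is taken to be BGR, the target is taken to be RGB scaled to the range 0 to 1, and the names Rect and PreprocessBgr are hypothetical. The description does not state which color modes or normalization constants a given model uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Region of interest in pixel coordinates (hypothetical helper type).
struct Rect {
    int x, y, w, h;
};

// Crop an interleaved 8-bit BGR image, convert it to RGB and scale each
// channel to [0, 1]. The result is a row-major HWC float buffer.
std::vector<float> PreprocessBgr(const std::vector<std::uint8_t>& bgr,
                                 int width, int height, Rect roi) {
    assert(width > 0 && height > 0);
    assert(bgr.size() == static_cast<std::size_t>(width) * height * 3);
    assert(roi.x >= 0 && roi.y >= 0 && roi.w > 0 && roi.h > 0);
    assert(roi.x + roi.w <= width && roi.y + roi.h <= height);

    std::vector<float> out;
    out.reserve(static_cast<std::size_t>(roi.w) * roi.h * 3);
    for (int y = roi.y; y < roi.y + roi.h; ++y) {
        for (int x = roi.x; x < roi.x + roi.w; ++x) {
            const std::size_t p = (static_cast<std::size_t>(y) * width + x) * 3;
            out.push_back(bgr[p + 2] / 255.0f);  // R
            out.push_back(bgr[p + 1] / 255.0f);  // G
            out.push_back(bgr[p + 0] / 255.0f);  // B
        }
    }
    return out;
}
```

A model that needs no cropping could pass a region covering the whole image, which matches the remark that some models only store the original image data.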

The model input interface creates an input pointer of the triton-client type from the parsed model information; this pointer serves as an input parameter of the model inference interface.

The model inference interface uses a function callback: the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback.
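The callback pattern for collecting inference results can be sketched as below. The function RunInference is a hypothetical stand-in that simply echoes its input once per requested output and invokes the callback synchronously; it does not call any real inference client. The types InferResult and OnComplete and the function InferAndCollect are likewise assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Result for one named model output (hypothetical type).
struct InferResult {
    std::string output_name;
    std::vector<float> data;
};

using OnComplete = std::function<void(const InferResult&)>;

// Stand-in for the inference request: echoes the input for every requested
// output and reports each result through the callback.
void RunInference(const std::vector<float>& input,
                  const std::vector<std::string>& output_names,
                  const OnComplete& on_complete) {
    assert(static_cast<bool>(on_complete));
    for (const std::string& name : output_names) {
        InferResult result;
        result.output_name = name;
        result.data = input;
        on_complete(result);
    }
}

// Pass inputs and outputs in, and store every result in a vector from
// inside the callback.
std::vector<InferResult> InferAndCollect(
        const std::vector<float>& input,
        const std::vector<std::string>& output_names) {
    std::vector<InferResult> results;
    RunInference(input, output_names,
                 [&results](const InferResult& r) { results.push_back(r); });
    return results;
}
```

With a client that invokes the callback from another thread, access to the shared vector would additionally need synchronization; the sketch avoids that by running the callback in the calling thread.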

The model post-processing interface traverses the model outputs and stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation (some models do not need this operation).

Finally, the inference result is packaged into json format data with the third-party rapidjson component and returned.
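The score filtering and json packaging steps might look like the sketch below. It is a simplified illustration: the Detection fields, the function names FilterByScore and ToJson, the json layout and the assumption that scores lie between 0 and 1 are all hypothetical, the DBSCAN clustering step is omitted, and plain stream formatting is used in place of rapidjson so that the example needs only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One candidate box with its confidence score (hypothetical layout).
struct Detection {
    float x, y, w, h, score;
};

// Keep only the boxes whose score reaches the threshold.
std::vector<Detection> FilterByScore(const std::vector<Detection>& dets,
                                     float min_score) {
    assert(min_score >= 0.0f && min_score <= 1.0f);
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        if (d.score >= min_score) {
            kept.push_back(d);
        }
    }
    return kept;
}

// Serialize the remaining boxes as a json object with one array.
std::string ToJson(const std::vector<Detection>& dets) {
    std::ostringstream os;
    os << "{\"detections\":[";
    for (std::size_t i = 0; i < dets.size(); ++i) {
        const Detection& d = dets[i];
        if (i > 0) {
            os << ",";
        }
        os << "{\"x\":" << d.x << ",\"y\":" << d.y
           << ",\"w\":" << d.w << ",\"h\":" << d.h
           << ",\"score\":" << d.score << "}";
    }
    os << "]}";
    return os.str();
}
```

A json library such as rapidjson would also take care of string escaping and number formatting, which this hand-written formatter only handles for plain numeric fields.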

3. Encapsulating the C++ interface into C interfaces for calling from other languages

The code is compiled into a .so dynamic link library and the processing flow is wrapped into C interfaces, comprising an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.

The initialization interface first opens the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the triton server address and the camera address to the constructor of the abstract class, which parses the model information.

The inference output interface captures one frame of image data and then obtains the result in json format through the preprocessing, input, inference and post-processing flows.

The resource release interface releases the resources created in the initialization and inference output interfaces.
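The three C interfaces could be exposed roughly as in the following sketch, which wraps a C++ object behind an opaque handle. Everything named here is an assumption for illustration: the function names client_init, client_infer and client_release, the ClientHandle type and the DummyClient class are hypothetical, and the stand-in class neither opens a camera nor contacts a server.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace {
// Stand-in for a derived model class plus camera capture (hypothetical).
class DummyClient {
public:
    DummyClient(std::string model, std::string server, std::string camera)
        : model_(std::move(model)), server_(std::move(server)),
          camera_(std::move(camera)), camera_open_(true) {}

    // Pretend to capture one frame and run the whole pipeline on it.
    const std::string& InferOnce() {
        assert(camera_open_);  // the camera is opened during initialization
        ++frame_;
        last_json_ = "{\"model\":\"" + model_ + "\",\"frame\":" +
                     std::to_string(frame_) + "}";
        return last_json_;
    }

private:
    std::string model_, server_, camera_;
    bool camera_open_;
    int frame_ = 0;
    std::string last_json_;
};
}  // namespace

extern "C" {

typedef void* ClientHandle;

// Initialization interface: returns an opaque handle, or NULL on failure.
ClientHandle client_init(const char* model, const char* server_url,
                         const char* camera_url) {
    if (!model || !server_url || !camera_url) return nullptr;
    try {
        return new DummyClient(model, server_url, camera_url);
    } catch (...) {
        return nullptr;  // exceptions must not cross the C boundary
    }
}

// Inference output interface: the returned json string stays valid until
// the next call on the same handle or until the handle is released.
const char* client_infer(ClientHandle handle) {
    if (!handle) return nullptr;
    try {
        return static_cast<DummyClient*>(handle)->InferOnce().c_str();
    } catch (...) {
        return nullptr;
    }
}

// Resource release interface.
void client_release(ClientHandle handle) {
    delete static_cast<DummyClient*>(handle);
}

}  // extern "C"
```

Because the functions use C linkage and only plain C types, a caller in another language (for example Go through cgo) can load the compiled library and invoke them without knowing about the C++ classes behind the handle.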

The above specific embodiments are merely illustrative of specific cases of the present invention, and the scope of the present invention includes, but is not limited to, the specific embodiments described above, and any suitable modification or replacement made by one of ordinary skill in the art, which meets the claims of a deployment client implementation method according to the present invention, shall fall within the scope of the present invention.

Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A method for implementing a deployment client, characterized in that an abstract class and five public methods are first defined, a newly added model inherits the abstract class to produce a derived class, and the five methods of model preprocessing, input, output, inference and post-processing are implemented according to the input and output information of each model;

parsing of the model information and creation of the grpc-client are carried out in the constructor of the abstract class, and finally the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call;

where the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model, namely the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference pass, and the http-client and grpc-client pointers used to access the triton-server;

where the member variables of the abstract class include the model information parameter structure, initialization of the abstract class first creates the http_client and grpc_client object pointers from the name of the model passed in and the address of the inference server; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; and finally the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are resolved according to the image format.

2. The method for implementing a deployment client according to claim 1, wherein, where derived classes inherit the abstract class and implement the five public methods, each frame of image passes through the preprocessing, input, output, inference and post-processing flows in accordance with the processing logic of the deep learning model.

3. The method for implementing a deployment client according to claim 2, wherein the processing in each stage differs from model to model, and the model preprocessing interface crops the image, converts its color mode, normalizes it for feeding to the model, and stores the image data.

4. The method for implementing a deployment client according to claim 3, wherein the model input interface creates an input pointer of the triton-client type from the parsed model information, and this pointer serves as an input parameter of the model inference interface;

5. The method for implementing a deployment client according to claim 4, wherein the model inference interface uses a function callback, the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback;

6. The method for implementing a deployment client according to claim 5, wherein the C++ project interface is wrapped into C interfaces comprising an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.

7. The method for implementing a deployment client according to claim 6, wherein the initialization interface first opens the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the triton server address and the camera address to the constructor of the abstract class, which parses the model information;