CN113608729A - Method for implementing client-side deployment

Info

Publication number: CN113608729A
Application number: CN202110946751.2A
Authority: CN
Inventors: 王玉梁
Original assignee: Shandong New Generation Information Industry Technology Research Institute Co Ltd
Current assignee: Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority date: 2021-08-18
Filing date: 2021-08-18
Publication date: 2021-11-05
Anticipated expiration: 2041-08-18
Also published as: CN113608729B

Abstract

The invention relates to the field of deep learning and provides a method for implementing client-side deployment. An abstract class with five public methods is defined first; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information. Parsing of the model information and creation of the grpc-client are performed in the abstract class constructor. Finally, the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call. Compared with the prior art, new models are easy to add, multiple models can be deployed, the C++ implementation is better suited to embedded devices with high performance requirements, and the generated dynamic link library can be called from multiple languages to implement richer front-end functions.

Description

Method for implementing client-side deployment

Technical Field

The invention relates to the field of deep learning, and in particular provides a method for implementing client-side deployment.

Background

Model training is only a small part of deep learning; model deployment is the most important step in putting deep learning technology into practical use. Current mainstream deployment approaches include Python service deployment, direct model loading in Java, docker with tf-serving, and the Nvidia Triton deployment framework.

Each approach has advantages and disadvantages. Inference with Python service deployment and with direct model loading in Java is slow, a Python environment is required, and these approaches are not suitable for embedded environments with high performance requirements.

The latter two approaches deploy in a docker container and expose http and grpc interfaces. The grpc service is better at processing image data in batches, so it is suitable for model inference on video stream input, and the framework supports deploying multiple models. Of the four approaches, Nvidia Triton has the fastest inference speed.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides a highly practical method for implementing client-side deployment.

The technical solution adopted by the invention to solve the technical problem is as follows:

A method for implementing client-side deployment: first, an abstract class and five public methods are defined; a newly added model inherits the abstract class to generate a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model;

the parsing of model information and the creation of the grpc-client are implemented in the abstract class constructor; finally, the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library for front-end code in other languages to call.

Further, when the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure for storing the parameters obtained by parsing the model, the input and output names, the height, width and channels of the model input, the input data format, the number of pictures per inference, and http-client and grpc-client pointers for accessing the triton-server.

Further, where the member variables of the abstract class include the model information parameter structure, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and inference server address; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.

Further, when the derived class inherits the abstract class to implement the five public methods, each image frame goes through preprocessing, input, output, inference and post-processing according to the processing logic of the deep learning model.

Further, the processing in each step differs between models; the model preprocessing interface crops the image, converts the image color mode, normalizes the data fed to the model, and stores the image data.

Further, the model input interface creates input pointers of the triton-client type from the parsed model information and uses them as the input parameters of the model inference interface;

the model output interface creates the model's output pointers according to the model output names defined on the triton-server side, and supports creating several different outputs.

Further, the model inference interface uses a function callback: the model inputs and outputs are passed to the inference interface, and the inference result obtained is stored in a vector inside the callback;

the model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the results in a structure for the DBSCAN clustering operation;

the inference results are packaged and returned as json data using the third-party rapidjson component.

Further, when the C++ project interface is wrapped into C interfaces, the C interfaces comprise an initialization interface, an inference output interface and a resource release interface, to be called from the go language.

Further, the initialization interface first starts the camera, creates the corresponding subclass according to the passed-in name of the model to be inferred, and passes the model name, the triton server address and the camera address to the abstract class constructor to parse the model information;

the inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing steps;

and the resource release interface releases the resources created in the initialization interface and the inference output interface.

Compared with the prior art, the method for implementing client-side deployment has the following outstanding beneficial effects:

new models are easy to add, multiple models can be deployed, the C++ implementation is better suited to embedded devices with high performance requirements, and the generated dynamic link library can be called from multiple languages to implement richer front-end functions.

To add a new model, the Triton client only needs to implement the five public interfaces, so client deployment is simple and later maintenance costs are low. The Triton client can be built into a dynamic link library and wrapped into three C interfaces for multiple languages to call, so that richer front-end functions can be built on the inference results. The method supports deploying several models simultaneously, uses grpc interaction, which is better suited to processing image data in batches, and, being based on C++, is better suited to embedded devices with high performance requirements.

Drawings

To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of the method for implementing client-side deployment.

Detailed Description

The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

A preferred embodiment is given below:

As shown in FIG. 1, in the method for implementing client-side deployment of this embodiment, an abstract class and five public methods are first defined; a newly added model inherits the abstract class to generate a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model.
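The class layout described above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the patent's actual code: the names `ModelClient`, `Frame`, `DummyDetector` and `Run`, and all method signatures, are hypothetical, and no Triton calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type: interleaved 3-byte pixels plus dimensions.
struct Frame {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;
};

// Hypothetical abstract base class; the five pure virtual methods mirror
// the five public methods named in the text.
class ModelClient {
public:
    ModelClient(std::string model_name, std::string server_url)
        : model_name_(std::move(model_name)), server_url_(std::move(server_url)) {}
    virtual ~ModelClient() = default;

    virtual void Preprocess(const Frame& frame) = 0;
    virtual void PrepareInputs() = 0;
    virtual void PrepareOutputs() = 0;
    virtual void Infer() = 0;
    virtual std::string Postprocess() = 0;

    // Convenience driver (an assumption, not from the source): runs the
    // five steps in order for one frame and returns the result string.
    std::string Run(const Frame& frame) {
        Preprocess(frame);
        PrepareInputs();
        PrepareOutputs();
        Infer();
        return Postprocess();
    }

protected:
    std::string model_name_;
    std::string server_url_;
};

// A newly added model only overrides the five methods.
class DummyDetector : public ModelClient {
public:
    using ModelClient::ModelClient;

    void Preprocess(const Frame& frame) override {
        // This sketch expects exactly 3 bytes per pixel.
        assert(frame.pixels.size() ==
               static_cast<std::size_t>(frame.width) * frame.height * 3);
        pixel_count_ = static_cast<std::size_t>(frame.width) * frame.height;
    }
    void PrepareInputs() override {}
    void PrepareOutputs() override {}
    void Infer() override { inferred_ = true; }  // placeholder, no server call
    std::string Postprocess() override {
        return "{\"model\":\"" + model_name_ + "\",\"pixels\":" +
               std::to_string(pixel_count_) + ",\"inferred\":" +
               (inferred_ ? "true" : "false") + "}";
    }

private:
    std::size_t pixel_count_ = 0;
    bool inferred_ = false;
};
```

With this layout, supporting another model means adding one more derived class; the calling code that drives the five steps does not change.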

1. The abstract class defines five public methods: model preprocessing, input, output, inference and post-processing

The member variables of the abstract class include a model information parameter structure for storing the parameters obtained by parsing the model, the input and output names, the height, width and channels of the model input, the input data format, the number of pictures per inference, http-client and grpc-client pointers for accessing the triton-server, and so on.

When the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and inference server address. The pointers are then used to access the triton-server and obtain the model_meta and model_config data. Finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
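The last step, reading height, width and channels according to the image format, might look like the sketch below. The `ModelInfo` fields and the `ParseImageShape` helper are hypothetical names chosen for illustration; the format strings `FORMAT_NCHW` and `FORMAT_NHWC` follow the input format names used in Triton model configurations, and the shape is assumed to exclude the batch dimension.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model information structure filled in by the constructor.
struct ModelInfo {
    std::string input_name;
    std::vector<std::string> output_names;
    std::string input_datatype;  // e.g. "FP32"
    std::string input_format;    // "FORMAT_NCHW" or "FORMAT_NHWC"
    int height = 0;
    int width = 0;
    int channels = 0;
    int max_batch_size = 0;      // number of pictures per inference
};

// Fills height/width/channels from a 3-element input shape (batch
// dimension excluded), interpreting the axes according to input_format.
void ParseImageShape(const std::vector<int64_t>& shape, ModelInfo& info) {
    assert(shape.size() == 3);  // this sketch handles image inputs only
    if (info.input_format == "FORMAT_NCHW") {
        info.channels = static_cast<int>(shape[0]);
        info.height = static_cast<int>(shape[1]);
        info.width = static_cast<int>(shape[2]);
    } else if (info.input_format == "FORMAT_NHWC") {
        info.height = static_cast<int>(shape[0]);
        info.width = static_cast<int>(shape[1]);
        info.channels = static_cast<int>(shape[2]);
    } else {
        throw std::invalid_argument("unsupported input format: " + info.input_format);
    }
}
```

In a real client the shape and format would come from the model_meta and model_config responses rather than being passed in by hand.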

2. The derived class inherits the abstract class and implements the five public methods

According to the processing logic of a deep learning model, each image frame goes through preprocessing, input, output, inference and post-processing; the processing in each step differs between models.

The model preprocessing interface crops the image, converts the image color mode, normalizes the data fed to the model, and stores the image data (some models only need to store the image data of the original image, without cropping).
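A minimal sketch of these preprocessing steps, written without any image library so that it stays self-contained. The function names are hypothetical, and the choices of BGR input, planar RGB output and scaling to [0, 1] are assumptions; the source does not specify the color modes or the normalization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Crops a w x h region at (x, y) from an interleaved 3-channel image.
std::vector<uint8_t> CropInterleaved(const std::vector<uint8_t>& src,
                                     int src_w, int src_h,
                                     int x, int y, int w, int h) {
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= src_w && y + h <= src_h);
    assert(src.size() == static_cast<std::size_t>(src_w) * src_h * 3);
    std::vector<uint8_t> dst;
    dst.reserve(static_cast<std::size_t>(w) * h * 3);
    for (int row = y; row < y + h; ++row) {
        const std::size_t offset = (static_cast<std::size_t>(row) * src_w + x) * 3;
        dst.insert(dst.end(), src.begin() + offset,
                   src.begin() + offset + static_cast<std::size_t>(w) * 3);
    }
    return dst;
}

// Converts interleaved 8-bit BGR (HWC) into planar RGB floats (CHW)
// scaled to [0, 1], combining color conversion and normalization.
std::vector<float> BgrToNormalizedRgbChw(const std::vector<uint8_t>& bgr,
                                         int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t plane = static_cast<std::size_t>(width) * height;
    assert(bgr.size() == plane * 3);
    std::vector<float> out(plane * 3);
    for (std::size_t i = 0; i < plane; ++i) {
        out[0 * plane + i] = bgr[i * 3 + 2] / 255.0f;  // R
        out[1 * plane + i] = bgr[i * 3 + 1] / 255.0f;  // G
        out[2 * plane + i] = bgr[i * 3 + 0] / 255.0f;  // B
    }
    return out;
}
```

A model that takes the full frame would skip the crop and call only the conversion step, which matches the note that some models store the original image data unchanged.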

The model input interface creates input pointers of the triton-client type from the parsed model information and uses them as the input parameters of the model inference interface.
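As a rough illustration of what building an input from the parsed model information involves, the sketch below describes an input tensor by name, datatype and shape and computes the size of the buffer that would be attached to it. `TensorSpec`, `MakeNchwInput`, `ElementSize` and `ByteSize` are hypothetical stand-ins, not triton-client types; the datatype strings follow the names Triton uses for tensor types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical description of one model input or output.
struct TensorSpec {
    std::string name;
    std::string datatype;
    std::vector<int64_t> shape;
};

// Size in bytes of one element for a few common datatype names.
std::size_t ElementSize(const std::string& datatype) {
    if (datatype == "UINT8" || datatype == "INT8") return 1;
    if (datatype == "FP16" || datatype == "INT16") return 2;
    if (datatype == "FP32" || datatype == "INT32") return 4;
    if (datatype == "FP64" || datatype == "INT64") return 8;
    throw std::invalid_argument("unsupported datatype: " + datatype);
}

// Total buffer size for a tensor whose dimensions are all fixed.
std::size_t ByteSize(const TensorSpec& spec) {
    std::size_t bytes = ElementSize(spec.datatype);
    for (int64_t dim : spec.shape) {
        assert(dim > 0);  // dynamic (-1) dimensions are not handled here
        bytes *= static_cast<std::size_t>(dim);
    }
    return bytes;
}

// Builds an NCHW input description from parsed model parameters.
TensorSpec MakeNchwInput(const std::string& name, const std::string& datatype,
                         int batch, int channels, int height, int width) {
    return TensorSpec{name, datatype, {batch, channels, height, width}};
}
```

In the method described here, the corresponding triton-client input object would be created from the same name, shape and datatype and then handed to the inference interface.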

The model inference interface uses a function callback: the model inputs and outputs are passed to the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback.
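The callback pattern can be sketched as follows. `FakeInfer` is a stand-in that performs no network call and simply echoes the input once per requested output; it is not the triton-client API, and `InferResult`, `OnComplete` and `CallbackCollector` are hypothetical names. The point is only to show results being appended to a vector inside the callback.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result for one named model output.
struct InferResult {
    std::string output_name;
    std::vector<float> data;
};

using OnComplete = std::function<void(const InferResult&)>;

// Stand-in for an inference call: invokes the callback once per
// requested output, echoing the input data as the "result".
void FakeInfer(const std::vector<float>& input,
               const std::vector<std::string>& output_names,
               const OnComplete& on_complete) {
    assert(on_complete != nullptr);
    for (const std::string& name : output_names) {
        on_complete(InferResult{name, input});
    }
}

// Collects results into a vector from inside the callback.
class CallbackCollector {
public:
    void Run(const std::vector<float>& input,
             const std::vector<std::string>& output_names) {
        FakeInfer(input, output_names,
                  [this](const InferResult& r) { results_.push_back(r); });
    }
    const std::vector<InferResult>& results() const { return results_; }

private:
    std::vector<InferResult> results_;
};
```

If the real callback is invoked from another thread, access to the result vector would also need synchronization, which this sketch omits.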

The model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the results in a structure for the DBSCAN clustering operation (some models do not need this operation).

Finally, the inference results are packaged and returned as json data using the third-party rapidjson component.
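The post-processing steps can be illustrated with the sketch below: a score filter, a plain DBSCAN over box centers, and JSON output. Several things here are assumptions: the `Box` layout, clustering on box centers, the parameter values, and the JSON field names. To stay self-contained the JSON is written with `std::ostringstream` instead of rapidjson.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Box { float x1, y1, x2, y2, score; };  // hypothetical detection box

// Keeps only boxes whose score is at least min_score.
std::vector<Box> FilterByScore(const std::vector<Box>& boxes, float min_score) {
    std::vector<Box> kept;
    for (const Box& b : boxes) {
        if (b.score >= min_score) kept.push_back(b);
    }
    return kept;
}

// DBSCAN on box centers. Returns one label per box: -1 for noise,
// otherwise a cluster index starting at 0.
std::vector<int> DbscanCenters(const std::vector<Box>& boxes, float eps,
                               std::size_t min_pts) {
    const int kUnvisited = -2, kNoise = -1;
    std::vector<int> label(boxes.size(), kUnvisited);
    auto neighbors = [&](std::size_t i) {
        std::vector<std::size_t> out;  // includes i itself
        const float cx = (boxes[i].x1 + boxes[i].x2) / 2, cy = (boxes[i].y1 + boxes[i].y2) / 2;
        for (std::size_t j = 0; j < boxes.size(); ++j) {
            const float dx = cx - (boxes[j].x1 + boxes[j].x2) / 2;
            const float dy = cy - (boxes[j].y1 + boxes[j].y2) / 2;
            if (dx * dx + dy * dy <= eps * eps) out.push_back(j);
        }
        return out;
    };
    int cluster = 0;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (label[i] != kUnvisited) continue;
        std::vector<std::size_t> seeds = neighbors(i);
        if (seeds.size() < min_pts) { label[i] = kNoise; continue; }
        label[i] = cluster;
        for (std::size_t k = 0; k < seeds.size(); ++k) {
            const std::size_t q = seeds[k];
            if (label[q] == kNoise) label[q] = cluster;  // border point
            if (label[q] != kUnvisited) continue;
            label[q] = cluster;
            const std::vector<std::size_t> qn = neighbors(q);
            if (qn.size() >= min_pts) seeds.insert(seeds.end(), qn.begin(), qn.end());
        }
        ++cluster;
    }
    return label;
}

// Serializes boxes and their cluster labels as a JSON string.
std::string ToJson(const std::vector<Box>& boxes, const std::vector<int>& labels) {
    assert(boxes.size() == labels.size());
    std::ostringstream os;
    os << "{\"boxes\":[";
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (i > 0) os << ',';
        os << "{\"x1\":" << boxes[i].x1 << ",\"y1\":" << boxes[i].y1
           << ",\"x2\":" << boxes[i].x2 << ",\"y2\":" << boxes[i].y2
           << ",\"score\":" << boxes[i].score << ",\"cluster\":" << labels[i] << '}';
    }
    os << "]}";
    return os.str();
}
```

The neighbor search is quadratic in the number of boxes, which is usually acceptable for the handful of detections left after score filtering.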

3. Wrapping the C++ interface into a C interface for other languages to call

The code is compiled into a .so dynamic link library, and the processing flow is wrapped into C interfaces comprising an initialization interface, an inference output interface and a resource release interface, to be called from the go language.

The initialization interface starts the camera, creates the corresponding subclass according to the passed-in name of the model to be inferred, and passes the model name, the triton server address and the camera address to the abstract class constructor to parse the model information.

The inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing steps.

The resource release interface releases the resources created in the initialization interface and the inference output interface.
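The three C interfaces could be exposed roughly as in the sketch below. The function names `client_init`, `client_infer` and `client_release`, the return conventions, and the `Pipeline` class standing in for the camera and the derived model class are all assumptions for illustration; no camera or server is actually opened.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace {

// Hypothetical stand-in for the camera plus the derived model class.
class Pipeline {
public:
    Pipeline(std::string model, std::string server, std::string camera)
        : model_(std::move(model)), server_(std::move(server)),
          camera_(std::move(camera)) {
        assert(!model_.empty());  // a model name is needed to pick the subclass
    }
    // Placeholder for capture, preprocess, input, infer and post-process.
    std::string InferOnce() {
        ++frame_;
        return "{\"model\":\"" + model_ + "\",\"frame\":" + std::to_string(frame_) + "}";
    }

private:
    std::string model_, server_, camera_;
    int frame_ = 0;
};

std::unique_ptr<Pipeline> g_pipeline;
std::string g_last_result;  // keeps the returned C string alive

}  // namespace

extern "C" {

// Initialization interface: returns 0 on success, -1 on invalid arguments.
int client_init(const char* model_name, const char* server_url,
                const char* camera_url) {
    if (!model_name || !server_url || !camera_url) return -1;
    g_pipeline = std::make_unique<Pipeline>(model_name, server_url, camera_url);
    return 0;
}

// Inference output interface: returns a JSON string that stays valid
// until the next call or release, or nullptr if not initialized.
const char* client_infer(void) {
    if (!g_pipeline) return nullptr;
    g_last_result = g_pipeline->InferOnce();
    return g_last_result.c_str();
}

// Resource release interface: frees everything created by the other two.
void client_release(void) {
    g_pipeline.reset();
    g_last_result.clear();
}

}  // extern "C"
```

Keeping the exported signatures limited to plain C types (`int`, `const char*`) is what lets a foreign-function layer such as cgo call into the library; the global state used here keeps the sketch short but makes the interface single-instance and not thread-safe.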

The above embodiments are only specific cases of the invention. The protection scope of the invention includes but is not limited to these embodiments; any suitable change or substitution made by a person of ordinary skill in the art that is consistent with the claims of the method for implementing client-side deployment of the invention shall fall within the protection scope of the invention.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A method for implementing client-side deployment, characterized in that an abstract class and five public methods are first defined, a newly added model inherits the abstract class to generate a derived class, and the five methods of model preprocessing, input, output, inference and post-processing are implemented according to the input and output information of each model;

2. The method for implementing client-side deployment according to claim 1, wherein when the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure for storing the parameters obtained by parsing the model, the input and output names, the height, width and channels of the model input, the input data format, the number of pictures per inference, and http-client and grpc-client pointers for accessing the triton-server.

3. The method for implementing client-side deployment according to claim 2, wherein, where the member variables of the abstract class include the model information parameter structure, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and inference server address; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.

4. The method for implementing client-side deployment according to claim 3, wherein when the derived class inherits the abstract class to implement the five public methods, each image frame goes through preprocessing, input, output, inference and post-processing according to the processing logic of the deep learning model.

5. The method for implementing client-side deployment according to claim 4, wherein the processing in each step differs between models, and the model preprocessing interface crops the image, converts the image color mode, normalizes the data fed to the model, and stores the image data.

6. The method for implementing client-side deployment according to claim 5, wherein the model input interface creates input pointers of the triton-client type from the parsed model information and uses them as the input parameters of the model inference interface;

7. The method for implementing client-side deployment according to claim 6, wherein the model inference interface uses a function callback, the model inputs and outputs are passed to the inference interface, and the inference result obtained is stored in a vector inside the callback;

8. The method for implementing client-side deployment according to claim 7, wherein the C++ project interface is wrapped into C interfaces comprising an initialization interface, an inference output interface and a resource release interface, to be called from the go language.

9. The method for implementing client-side deployment according to claim 8, wherein the initialization interface first starts the camera, creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the triton server address and the camera address to the abstract class constructor to parse the model information;