CN113608729B - Method for realizing deployment client - Google Patents
Method for realizing deployment client
- Publication number
- CN113608729B (application CN202110946751.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- interface
- reasoning
- client
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/31—Programming languages or programming paradigms
- G06F8/315—Object-oriented languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention relates to the field of deep learning, and in particular to a method for realizing a deployment client. First, an abstract class with five public methods is defined; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information. Parsing of the model information and creation of the grpc-client are implemented in the constructor of the abstract class. Finally, the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code written in other languages can call. Compared with the prior art, the method is easy to extend to newly added models and supports deployment of multiple models; because it is implemented in C++, it is better suited to embedded devices with high performance requirements; and the generated dynamic link library can be called from multiple languages to realize richer front-end functions.
Description
Technical Field
The invention relates to the field of deep learning, and in particular provides a method for realizing a deployment client.
Background
Model training is only a small part of deep learning; model deployment is the most important step in putting deep learning technology into production. The mainstream deployment approaches currently include Python service deployment, direct model loading in Java, Docker with tf-serving, and the NVIDIA Triton deployment framework.
Each approach has advantages and disadvantages. Python service deployment and direct loading in Java have low inference speed, a Python environment is needed, and neither is suited to embedded environments with high performance requirements.
The latter two approaches are deployed in Docker containers and expose http and grpc interfaces. The grpc service has an advantage when processing image data in batches, which makes it suitable for model inference on video-stream input, and the framework supports deployment of multiple models. Of the four approaches, NVIDIA Triton has the fastest inference speed.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a practical method for realizing a deployment client.
The technical solution adopted to solve the technical problem is as follows:
First, an abstract class with five public methods is defined; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information.
Parsing of the model information and creation of the grpc-client are implemented in the constructor of the abstract class. Finally, the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code written in other languages can call.
Further, where the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference call, and the http-client and grpc-client pointers used to access the triton-server.
Further, where the member variables of the abstract class include the model information parameter structure, initialization of the abstract class proceeds as follows: http_client and grpc_client object pointers are first created from the passed-in model name and the address of the inference server; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
Further, where the derived class inherits the abstract class and implements the five public methods, each frame of image passes through the preprocessing, input, output, inference and post-processing steps, following the processing logic of the deep learning model.
Further, the processing in each step differs between models. The model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data.
Further, the model input interface creates an input pointer of a triton-client type from the parsed model information, to be used as an input parameter of the model inference interface;
the model output interface creates an output pointer for the model from the model output name defined on the triton-server side, and supports creating several different outputs.
Further, the model inference interface uses a function callback: the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container according to the model output name, filters out candidate boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation;
the inference result is then packaged as JSON data using the third-party rapidjson component and returned.
Further, where the C++ project interface is wrapped into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, intended to be called from the Go language.
Further, the initialization interface first starts the camera, then creates the subclass corresponding to the name of the model to be run, and passes the model name, the triton server address and the camera address to the abstract class constructor, which parses the model information;
the inference output interface captures one frame of image data and obtains the inference result in JSON format through the preprocessing, input, inference and post-processing steps;
the resource release interface releases the resources created in the initialization interface and the inference output interface.
Compared with the prior art, the method for realizing a deployment client according to the invention has the following outstanding beneficial effects:
The invention is easy to extend to newly added models and supports deployment of multiple models; because it is implemented in C++, it is better suited to embedded devices with high performance requirements; and it can be called from multiple languages to realize richer front-end functions.
To support a newly added model, the Triton client only needs implementations of the five public interfaces, so the client is simple to deploy and operate and later maintenance costs are low. The Triton client can be built as a dynamic link library and wrapped into three C interfaces that many languages can call, so that richer front-end functions can be built on the inference results. The method supports deploying several models at the same time; its grpc interaction mode is better suited to batch processing of image data; and its C++ implementation is better suited to embedded devices with high performance requirements.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the method for realizing a deployment client.
Detailed Description
In order to provide a better understanding of the aspects of the present invention, the present invention will be described in further detail with reference to specific embodiments. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A preferred embodiment is given below:
As shown in FIG. 1, in the method for realizing a deployment client of this embodiment, an abstract class with five public methods is first defined; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information.
Parsing of the model information and creation of the grpc-client are implemented in the constructor of the abstract class. Finally, the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code written in other languages can call.
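The class layout described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names ModelClient, DummyModel, Preprocess, PrepareInput, PrepareOutput, Infer, Postprocess and Run are all hypothetical, and the constructor only stores its arguments instead of contacting a server.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// All identifiers here are hypothetical; the patent names no classes or methods.
class ModelClient {
public:
    ModelClient(const std::string& model_name, const std::string& server_url)
        : model_name_(model_name), server_url_(server_url) {
        // The described design parses model information and creates the
        // grpc-client here. This sketch only stores the two arguments.
    }
    virtual ~ModelClient() = default;

    // The five public methods that every newly added model implements.
    virtual void Preprocess(const std::vector<unsigned char>& frame) = 0;
    virtual void PrepareInput() = 0;
    virtual void PrepareOutput() = 0;
    virtual void Infer() = 0;
    virtual std::string Postprocess() = 0;

    // Runs the five steps in the order the text lists them.
    std::string Run(const std::vector<unsigned char>& frame) {
        Preprocess(frame);
        PrepareInput();
        PrepareOutput();
        Infer();
        return Postprocess();
    }

protected:
    std::string model_name_;
    std::string server_url_;
};

// A newly added model derives from the base class and fills in the five steps.
// This dummy model only reports how many bytes the frame contained.
class DummyModel : public ModelClient {
public:
    DummyModel(const std::string& model_name, const std::string& server_url)
        : ModelClient(model_name, server_url), byte_count_(0) {}

    void Preprocess(const std::vector<unsigned char>& frame) override {
        byte_count_ = frame.size();
    }
    void PrepareInput() override {}
    void PrepareOutput() override {}
    void Infer() override {}
    std::string Postprocess() override {
        return "{\"model\":\"" + model_name_ + "\",\"bytes\":" +
               std::to_string(byte_count_) + "}";
    }

private:
    std::size_t byte_count_;
};
```

Under this layout, supporting another model means adding one more derived class; code that holds a ModelClient pointer and calls Run does not change.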
1. The abstract class defines five public methods: model preprocessing, input, output, inference and post-processing
The member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference call, the http-client and grpc-client pointers used to access the triton-server, and so on.
When the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and the address of the inference server. The pointers are then used to access the triton-server and obtain the model_meta and model_config data. Finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
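The last step, deriving height, width and channels from the image format, might look like the sketch below. It assumes the input shape arrives as three dimensions without the batch dimension and that the format is either channels-first (NCHW) or channels-last (NHWC). The ModelInfo fields, the InputFormat enum and ParseImageShape are hypothetical names, and fetching model_meta and model_config from the server is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical layout tag for the model input.
enum class InputFormat { NCHW, NHWC };

// Hypothetical model information structure mirroring the fields listed above.
struct ModelInfo {
    std::string input_name;
    std::vector<std::string> output_names;
    std::int64_t height = 0;
    std::int64_t width = 0;
    std::int64_t channels = 0;
    InputFormat format = InputFormat::NCHW;
    std::string datatype;
    int batch_size = 1;  // number of images per inference call
};

// Fills height, width and channels from a three-element shape (batch
// dimension already removed). Returns false if the shape is not usable.
bool ParseImageShape(const std::vector<std::int64_t>& dims, InputFormat format,
                     ModelInfo* info) {
    if (info == nullptr || dims.size() != 3) {
        return false;
    }
    info->format = format;
    if (format == InputFormat::NHWC) {
        info->height = dims[0];
        info->width = dims[1];
        info->channels = dims[2];
    } else {  // NCHW: channels come first
        info->channels = dims[0];
        info->height = dims[1];
        info->width = dims[2];
    }
    return true;
}
```

Keeping the parsed values in one structure lets the input, preprocessing and inference steps read the same dimensions without querying the server again.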
2. The derived class inherits the abstract class and implements the five public methods
Following the processing logic of the deep learning model, each frame of image passes through the preprocessing, input, output, inference and post-processing steps. The processing in each step differs between models.
The model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data (some models need no cropping and only store the data of the original image).
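A preprocessing step of this kind can be sketched without any imaging library. The example below assumes an interleaved 8-bit BGR frame, converts the cropped region to RGB, and scales each value to the range [0, 1]. The Image struct and CropToRgbFloat are hypothetical, and a real model may expect a different color order, layout or normalization.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical frame type: interleaved 8-bit BGR, three bytes per pixel.
struct Image {
    int width;
    int height;
    std::vector<unsigned char> bgr;
};

// Crops the rectangle (x, y, w, h), swaps BGR to RGB, and scales to [0, 1].
// The result is stored row by row in interleaved RGB order.
std::vector<float> CropToRgbFloat(const Image& img, int x, int y, int w, int h) {
    const std::size_t expected =
        static_cast<std::size_t>(img.width) * static_cast<std::size_t>(img.height) * 3;
    if (img.width <= 0 || img.height <= 0 || img.bgr.size() != expected) {
        throw std::invalid_argument("image buffer does not match its dimensions");
    }
    if (x < 0 || y < 0 || w <= 0 || h <= 0 || x + w > img.width || y + h > img.height) {
        throw std::invalid_argument("crop rectangle is outside the image");
    }
    std::vector<float> out;
    out.reserve(static_cast<std::size_t>(w) * static_cast<std::size_t>(h) * 3);
    for (int row = y; row < y + h; ++row) {
        for (int col = x; col < x + w; ++col) {
            const std::size_t i =
                (static_cast<std::size_t>(row) * img.width + col) * 3;
            out.push_back(img.bgr[i + 2] / 255.0f);  // R
            out.push_back(img.bgr[i + 1] / 255.0f);  // G
            out.push_back(img.bgr[i] / 255.0f);      // B
        }
    }
    return out;
}
```

A model that needs no cropping can pass the full image rectangle, which corresponds to the case noted above where only the original image data is stored.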
The model input interface creates an input pointer of a triton-client type from the parsed model information, to be used as an input parameter of the model inference interface.
The model output interface creates an output pointer for the model from the model output name defined on the triton-server side, and supports creating several different outputs.
The model inference interface uses a function callback: the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback.
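The callback pattern can be illustrated with stand-in types. In the sketch below, FakeResult, OnComplete, FakeInfer and RunInference are hypothetical and do not belong to the Triton client library; FakeInfer simply doubles every input value and invokes the callback synchronously in place of a real server round trip.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in for one named model output returned by the server.
struct FakeResult {
    std::string output_name;
    std::vector<float> data;
};

using OnComplete = std::function<void(const FakeResult&)>;

// Stand-in for the inference call: doubles each input value and hands the
// result to the callback instead of returning it.
void FakeInfer(const std::vector<float>& input, const std::string& output_name,
               const OnComplete& on_complete) {
    FakeResult result;
    result.output_name = output_name;
    for (float v : input) {
        result.data.push_back(v * 2.0f);
    }
    on_complete(result);
}

// Requests every named output and collects the results in a vector from
// inside the callback, as the text describes.
std::vector<FakeResult> RunInference(const std::vector<float>& input,
                                     const std::vector<std::string>& output_names) {
    std::vector<FakeResult> results;
    for (const std::string& name : output_names) {
        FakeInfer(input, name,
                  [&results](const FakeResult& r) { results.push_back(r); });
    }
    return results;
}
```

Because the stand-in calls back on the calling thread, no locking is needed here. If the callback ran on another thread, the vector would need a mutex and the caller would have to wait for completion before reading it.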
The model post-processing interface traverses the model outputs, stores the inference results in a vector container according to the model output name, filters out candidate boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation (some models do not need this operation).
Finally, the inference result is packaged as JSON data using the third-party rapidjson component and returned.
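The score filter and the JSON packaging can be sketched as below. Two simplifications are assumed: the DBSCAN clustering step is omitted, and the JSON is assembled with a plain string stream rather than rapidjson so that the example has no external dependency. The Box fields, FilterByScore and ToJson are hypothetical names, and the model name is written without escaping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical detection result: a rectangle and its confidence score.
struct Box {
    int x;
    int y;
    int w;
    int h;
    float score;
};

// Keeps only the boxes whose score is at least min_score.
std::vector<Box> FilterByScore(const std::vector<Box>& boxes, float min_score) {
    std::vector<Box> kept;
    std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(kept),
                 [min_score](const Box& b) { return b.score >= min_score; });
    return kept;
}

// Serializes the boxes as a small JSON object. Hand-rolled for illustration;
// the text describes rapidjson for this step.
std::string ToJson(const std::string& model_name, const std::vector<Box>& boxes) {
    std::ostringstream os;
    os << "{\"model\":\"" << model_name << "\",\"boxes\":[";
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const Box& b = boxes[i];
        if (i > 0) {
            os << ",";
        }
        os << "{\"x\":" << b.x << ",\"y\":" << b.y << ",\"w\":" << b.w
           << ",\"h\":" << b.h << ",\"score\":" << b.score << "}";
    }
    os << "]}";
    return os.str();
}
```

Returning a single JSON string keeps the boundary to the calling language simple, since only one character buffer has to cross it.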
3. The C++ interface is wrapped into C interfaces for other languages to call
The code is compiled into a .so dynamic link library, and the processing flow is wrapped into C interfaces: an initialization interface, an inference output interface and a resource release interface, intended to be called from the Go language.
The initialization interface first starts the camera, then creates the subclass corresponding to the name of the model to be run, and passes the model name, the triton server address and the camera address to the abstract class constructor, which parses the model information.
The inference output interface captures one frame of image data and obtains the result in JSON format through the preprocessing, input, inference and post-processing steps.
The resource release interface releases the resources created in the initialization and inference output interfaces.
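The three C interfaces can be sketched as extern "C" functions around an opaque handle. The names dc_init, dc_infer and dc_release and the Session struct are hypothetical, and the camera and the model are replaced by a frame counter so that the example stays self-contained.

```cpp
#include <cassert>
#include <string>

namespace {
// Hypothetical state hidden behind the opaque handle.
struct Session {
    std::string model_name;
    std::string server_url;
    std::string camera_url;
    int frames = 0;
    std::string last_json;  // owns the buffer returned by dc_infer
};
}  // namespace

extern "C" {

// Initialization interface: returns an opaque handle, or nullptr on failure.
void* dc_init(const char* model_name, const char* server_url,
              const char* camera_url) {
    if (!model_name || !server_url || !camera_url) {
        return nullptr;
    }
    try {
        Session* s = new Session();
        s->model_name = model_name;
        s->server_url = server_url;
        s->camera_url = camera_url;
        return s;
    } catch (...) {
        return nullptr;  // exceptions must not cross the C boundary
    }
}

// Inference output interface: returns a JSON string that stays valid until
// the next dc_infer or dc_release call on the same handle.
const char* dc_infer(void* handle) {
    if (!handle) {
        return nullptr;
    }
    Session* s = static_cast<Session*>(handle);
    try {
        ++s->frames;  // stands in for capturing and processing one frame
        s->last_json = "{\"model\":\"" + s->model_name + "\",\"frame\":" +
                       std::to_string(s->frames) + "}";
        return s->last_json.c_str();
    } catch (...) {
        return nullptr;
    }
}

// Resource release interface: frees everything created by dc_init.
void dc_release(void* handle) {
    delete static_cast<Session*>(handle);
}

}  // extern "C"
```

The extern "C" linkage keeps the symbol names unmangled, which is what allows a foreign-function mechanism such as Go's cgo to bind to them after the file is built as a shared library, for example with a compiler's -shared and -fPIC options.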
The specific embodiments above merely illustrate particular cases of the invention. The scope of protection of the invention includes, but is not limited to, these embodiments; any suitable change or substitution made by a person of ordinary skill in the art that is consistent with the claims of the method for realizing a deployment client falls within the scope of protection of the invention.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (7)
1. A method for realizing a deployment client, characterized in that an abstract class with five public methods is first defined, and each newly added model inherits the abstract class to produce a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of that model;
parsing of the model information and creation of the grpc-client are implemented in the constructor of the abstract class, and finally the C++ project interface is wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code written in other languages can call;
where the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure that stores the parameters obtained by parsing the model: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference call, and the http-client and grpc-client pointers used to access the triton-server;
where the member variables of the abstract class include the model information parameter structure, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and the address of the inference server; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; and finally the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
2. The method for realizing a deployment client according to claim 1, wherein, where the derived class inherits the abstract class and implements the five public methods, each frame of image passes through the preprocessing, input, output, inference and post-processing steps, following the processing logic of the deep learning model.
3. The method for realizing a deployment client according to claim 2, wherein the processing in each step differs between models, and the model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data.
4. The method for realizing a deployment client according to claim 3, wherein the model input interface creates an input pointer of a triton-client type from the parsed model information, to be used as an input parameter of the model inference interface;
the model output interface creates an output pointer for the model from the model output name defined on the triton-server side, and supports creating several different outputs.
5. The method for realizing a deployment client according to claim 4, wherein the model inference interface uses a function callback, the model's inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container according to the model output name, filters out candidate boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation;
the inference result is packaged as JSON data using the third-party rapidjson component and returned.
6. The method for realizing a deployment client according to claim 5, wherein, where the C++ project interface is wrapped into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, intended to be called from the Go language.
7. The method for realizing a deployment client according to claim 6, wherein the initialization interface first starts the camera, then creates the subclass corresponding to the name of the model to be run, and passes the model name, the triton server address and the camera address to the abstract class constructor, which parses the model information;
the inference output interface captures one frame of image data and obtains the inference result in JSON format through the preprocessing, input, inference and post-processing steps;
the resource release interface releases the resources created in the initialization interface and the inference output interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110946751.2A CN113608729B (en) | 2021-08-18 | 2021-08-18 | Method for realizing deployment client |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110946751.2A CN113608729B (en) | 2021-08-18 | 2021-08-18 | Method for realizing deployment client |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113608729A CN113608729A (en) | 2021-11-05 |
CN113608729B CN113608729B (en) | 2023-07-04 |
Family
ID=78341095
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110946751.2A Active CN113608729B (en) | 2021-08-18 | 2021-08-18 | Method for realizing deployment client |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113608729B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018001065A1 (en) * | 2016-06-27 | 2018-01-04 | 中兴通讯股份有限公司 | Method, device and system for managing application |
CN109165024A (en) * | 2018-07-26 | 2019-01-08 | 天讯瑞达通信技术有限公司 | A kind of method of operation platform automatic deployment and monitoring server system |
CN111488197A (en) * | 2020-04-14 | 2020-08-04 | 浙江新再灵科技股份有限公司 | Deep learning model deployment method and system based on cloud server |
CN112418427A (en) * | 2020-11-25 | 2021-02-26 | 广州虎牙科技有限公司 | Method, device, system and equipment for providing deep learning unified reasoning service |
Non-Patent Citations (2)
Title |
---|
HUNTING PERNICIOUS ATTACKS IN WEB APPLICATIONS WITH XPROBER;R. Suguna et al.;《American Journal of Applied Sciences》;1164-1171 * |
适用于Docker环境的DevOps平台的设计与实践;韩东;《知网》;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN113608729A (en) | 2021-11-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107301098A (en) | A kind of remote procedure call device based on Thrift agreements, method and system | |
CN110275700A (en) | A kind of cross-platform multipad Development Framework and method based on electron | |
CN112764875B (en) | Intelligent calculation-oriented lightweight portal container microservice system and method | |
CN104932905A (en) | Automatic code generation method from AADL to C language | |
CN105955731B (en) | Method and system for quickly compiling mobile game | |
CN109902274A (en) | A kind of method and system converting json character string to thrift binary stream | |
CN105975261B (en) | A kind of runtime system and operation method called towards unified interface | |
CN105743955B (en) | A kind of extension JavaScript object method | |
CN111666572A (en) | Automatic change infiltration test frame | |
CN111158690A (en) | Desktop application framework, construction method, desktop application running method and storage medium | |
CN112711423A (en) | Engine construction method, intrusion detection method, electronic device and readable storage medium | |
CN113608729B (en) | Method for realizing deployment client | |
CN106919511A (en) | The analogy method of application, simulation application and its operation method and simulation system | |
CN114218052B (en) | Service interaction diagram generation method, device, equipment and storage medium | |
CN104423932B (en) | The method that Binary Element is called in Javascript | |
CN111596905A (en) | Method, device, storage medium and terminal for generating java object | |
CN111104122B (en) | Method for mapping xml service logic to java service logic | |
CN117519877A (en) | Rendering method and device of quick application card, storage medium and electronic equipment | |
CN111602115A (en) | Model driving method for application program development based on ontology | |
WO2023124657A1 (en) | Micro-application running method and apparatus, device, storage medium, and program product | |
CN113792704B (en) | Cloud deployment method and device of face recognition model | |
CN115292074A (en) | gPC protocol-based track analysis algorithm service calling method and device | |
CN114661402A (en) | Interface rendering method and device, electronic equipment and computer readable medium | |
CN109117207B (en) | Data processing method of business process model | |
Ivanović et al. | Transforming service compositions into cloud-friendly actor networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||