CN113608729B - Method for realizing deployment client - Google Patents

Info

Publication number
CN113608729B
CN113608729B CN202110946751.2A CN202110946751A CN113608729B CN 113608729 B CN113608729 B CN 113608729B CN 202110946751 A CN202110946751 A CN 202110946751A CN 113608729 B CN113608729 B CN 113608729B
Authority
CN
China
Prior art keywords
model
interface
reasoning
client
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110946751.2A
Other languages
Chinese (zh)
Other versions
CN113608729A (en)
Inventor
王玉梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Original Assignee
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority to CN202110946751.2A
Publication of CN113608729A
Application granted
Publication of CN113608729B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention relates to the field of deep learning, and in particular to a method for implementing a deployment client. An abstract class with five public methods is defined first; each newly added model inherits the abstract class to produce a derived class and implements the five methods (model preprocessing, input, output, inference and post-processing) according to that model's input and output information. Parsing of the model information and creation of the gRPC client are implemented in the constructor of the abstract class. Finally, the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call. Compared with the prior art, the method makes it easy to add new models, supports deploying multiple models, is better suited to embedded devices with high performance requirements because it is implemented in C++, and produces a dynamic link library that can be called from multiple languages to build richer front-end functions.

Description

Method for realizing deployment client
Technical Field
The invention relates to the field of deep learning, and particularly provides a client deployment implementation method.
Background
Model training is only a small part of deep learning; model deployment is the most important step in putting deep learning technology into production. The mainstream deployment approaches at present are Python service deployment, deployment by loading the model directly in Java, Docker with tf-serving (TensorFlow Serving), and the Nvidia Triton deployment framework.
Each approach has advantages and disadvantages. Inference with Python service deployment and with direct loading in Java is slow, a Python environment is required, and neither suits embedded environments with high performance requirements.
The latter two approaches are deployed in Docker containers and expose HTTP and gRPC interfaces. gRPC has an advantage when processing image data in batches, which makes it suitable for model inference on video stream input, and both frameworks support deploying multiple models. Of the four approaches, Nvidia Triton has the fastest inference.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a practical method for implementing a deployment client.
The technical solution adopted by the invention to solve the technical problem is as follows:
first, an abstract class and five public methods are defined; a newly added model inherits the abstract class to generate a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model;
parsing of the model information and creation of the gRPC client are implemented in the constructor of the abstract class, and finally the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call.
Further, when the abstract class defines the five public methods (model preprocessing, input, output, inference and post-processing), the member variables of the abstract class include a model information parameter structure, which stores the parsed model parameters (the input and output names, the height, width and channels of the model input, the input data format and the number of images per inference), as well as http-client and grpc-client pointers for accessing triton-server.
Further, with the model information parameter structure among the member variables, initializing the abstract class first creates the http_client and grpc_client object pointers from the incoming model name and the inference server address; the pointers are then used to access triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
Further, when the derived class inherits the abstract class and implements the five public methods, each frame of image must go through the preprocessing, input, output, inference and post-processing steps in accordance with the processing logic of the deep learning model.
Further, the processing in each step differs from model to model. The model preprocessing interface crops the image, converts its color mode, normalizes it for feeding into the model, and stores the image data.
Further, the model input interface creates a triton-client input pointer from the parsed model information and uses it as an input parameter of the model inference interface;
the model output interface creates the model's output pointers according to the model output names defined on the triton-server side, and supports creating multiple different outputs.
Further, the model inference interface uses a function callback: the model inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation;
the inference result is then packaged into JSON data with the third-party rapidjson component and returned.
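The DBSCAN step mentioned above can be illustrated with a minimal sketch. The patent does not say which quantities are clustered; the example assumes 2D points (for instance box centers), and the names `Point`, `Neighbors`, `Dbscan`, `kNoise` and `kUnvisited` are hypothetical. It is a plain O(n²) version of the textbook algorithm, not the patented implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D point, e.g. the center of a detection box.
struct Point { float x, y; };

constexpr int kUnvisited = -2;
constexpr int kNoise = -1;

// Indices of all points within eps of pts[i], including i itself.
inline std::vector<std::size_t> Neighbors(const std::vector<Point>& pts,
                                          std::size_t i, float eps) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < pts.size(); ++j) {
        const float dx = pts[i].x - pts[j].x;
        const float dy = pts[i].y - pts[j].y;
        if (dx * dx + dy * dy <= eps * eps) out.push_back(j);
    }
    return out;
}

// Returns one label per point: a cluster id >= 0, or kNoise.
inline std::vector<int> Dbscan(const std::vector<Point>& pts, float eps,
                               std::size_t min_pts) {
    assert(eps > 0.0f && min_pts >= 1);
    std::vector<int> labels(pts.size(), kUnvisited);
    int cluster = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (labels[i] != kUnvisited) continue;
        std::vector<std::size_t> seeds = Neighbors(pts, i, eps);
        if (seeds.size() < min_pts) {  // not a core point
            labels[i] = kNoise;
            continue;
        }
        labels[i] = cluster;
        // Expand the cluster; seeds grows while we iterate by index.
        for (std::size_t k = 0; k < seeds.size(); ++k) {
            const std::size_t q = seeds[k];
            if (labels[q] == kNoise) labels[q] = cluster;  // border point
            if (labels[q] != kUnvisited) continue;
            labels[q] = cluster;
            const std::vector<std::size_t> more = Neighbors(pts, q, eps);
            if (more.size() >= min_pts) {
                seeds.insert(seeds.end(), more.begin(), more.end());
            }
        }
        ++cluster;
    }
    return labels;
}
```

Points that end up labeled `kNoise` belong to no cluster; whether such points are dropped or reported separately is a per-model decision that the text leaves open.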
Further, when the C++ project interface is wrapped into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.
Further, the initialization interface first starts the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the Triton server address and the camera address into the abstract class constructor to parse the model information;
the inference output interface captures one frame of image data and obtains the inference result in JSON format through the preprocessing, input, inference and post-processing steps;
the resource release interface releases the resources created in the initialization interface and the inference output interface.
Compared with the prior art, the method for realizing the deployment client has the following outstanding beneficial effects:
the invention makes it easy to add new models, supports deploying multiple models, is better suited to embedded devices with high performance requirements because it is implemented in C++, and can be called from multiple languages to build richer front-end functions.
For a newly added model, the Triton client only needs to implement the five public interfaces, so the client is simple to deploy and cheap to maintain. The Triton client can be built as a dynamic link library wrapped in three C interfaces for calling from various languages, so that richer front-end functions can be built on the inference results. The method supports deploying multiple models at the same time, its gRPC interaction mode is better suited to batch processing of image data, and its C++ implementation is better suited to embedded devices with high performance requirements.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for implementing a client deployment.
Detailed Description
In order to provide a better understanding of the aspects of the present invention, the present invention will be described in further detail with reference to specific embodiments. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A preferred embodiment is given below:
As shown in FIG. 1, in the method for implementing a deployment client of this embodiment, an abstract class and five public methods are first defined; a newly added model inherits the abstract class to generate a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of each model.
Parsing of the model information and creation of the gRPC client are implemented in the constructor of the abstract class, and finally the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library that front-end code in other languages can call.
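The class layout described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented code: all names (`Frame`, `ModelClient`, `MeanBrightnessModel`, `Run`) are hypothetical, the constructor only stores its arguments where the patent parses model information and creates the gRPC client, and the "inference" is a local stand-in for the remote call.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type: interleaved 8-bit pixel data.
struct Frame {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;
};

// Abstract base class with the five public methods named in the text.
class ModelClient {
public:
    ModelClient(std::string model_name, std::string server_url)
        : model_name_(std::move(model_name)), server_url_(std::move(server_url)) {
        // In the patent, model information is parsed and the gRPC client
        // is created here; this sketch only stores the arguments.
    }
    virtual ~ModelClient() = default;

    virtual void Preprocess(const Frame& frame) = 0;
    virtual void PrepareInputs() = 0;
    virtual void PrepareOutputs() = 0;
    virtual void Infer() = 0;
    virtual std::string Postprocess() = 0;

    // Fixed per-frame pipeline built from the five steps.
    std::string Run(const Frame& frame) {
        Preprocess(frame);
        PrepareInputs();
        PrepareOutputs();
        Infer();
        return Postprocess();
    }

protected:
    std::string model_name_;
    std::string server_url_;
};

// Toy derived class standing in for a newly added model.
class MeanBrightnessModel : public ModelClient {
public:
    using ModelClient::ModelClient;

    void Preprocess(const Frame& frame) override {
        assert(!frame.pixels.empty());
        input_.assign(frame.pixels.begin(), frame.pixels.end());
    }
    void PrepareInputs() override { /* would wrap input_ for the client */ }
    void PrepareOutputs() override { result_ = 0.0; }
    void Infer() override {
        // Stand-in for the remote call: mean pixel value, computed locally.
        double sum = 0.0;
        for (float v : input_) sum += v;
        result_ = sum / static_cast<double>(input_.size());
    }
    std::string Postprocess() override {
        return "{\"model\":\"" + model_name_ + "\",\"mean\":" +
               std::to_string(static_cast<int>(result_)) + "}";
    }

private:
    std::vector<float> input_;
    double result_ = 0.0;
};
```

Adding a model then means writing one more subclass; the `Run` pipeline and everything that calls it stay unchanged.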
1. The abstract class defines the five public methods: model preprocessing, input, output, inference and post-processing
The member variables of the abstract class include a model information parameter structure, which stores the parsed model parameters (the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference), as well as the http-client and grpc-client pointers for accessing triton-server, and so on.
When the abstract class is initialized, the http_client and grpc_client object pointers are first created from the incoming model name and the inference server address; the pointers are then used to access triton-server and obtain the model_meta and model_config data; the model information is then parsed from model_meta and model_config and stored in the defined model information structure; finally, the width, height and channels of the image are parsed according to the image format.
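The last step, reading width, height and channels according to the image format, might look like the sketch below. It does not call the Triton client library; `InputFormat`, `ModelInfo` and `ParseImageShape` are hypothetical names, and the caller is assumed to have already removed any batch dimension from the dims taken from the server's metadata or configuration.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Layout of the image tensor (hypothetical enum for this sketch).
enum class InputFormat { kNCHW, kNHWC };

// Hypothetical model information structure mirroring the fields in the text.
struct ModelInfo {
    std::string input_name;
    std::vector<std::string> output_names;
    InputFormat format = InputFormat::kNCHW;
    int channels = 0;
    int height = 0;
    int width = 0;
    int max_batch_size = 0;  // number of images per inference
};

// dims must hold exactly three entries (no batch dimension).
inline void ParseImageShape(const std::vector<int64_t>& dims,
                            InputFormat format, ModelInfo* info) {
    assert(info != nullptr);
    if (dims.size() != 3) {
        throw std::invalid_argument("expected 3 dims without batch");
    }
    info->format = format;
    if (format == InputFormat::kNHWC) {
        info->height = static_cast<int>(dims[0]);
        info->width = static_cast<int>(dims[1]);
        info->channels = static_cast<int>(dims[2]);
    } else {  // kNCHW
        info->channels = static_cast<int>(dims[0]);
        info->height = static_cast<int>(dims[1]);
        info->width = static_cast<int>(dims[2]);
    }
}
```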
2. Derived classes inherit the abstract class and implement the five public methods
In accordance with the processing logic of the deep learning model, each frame of image must go through the preprocessing, input, output, inference and post-processing steps; the processing in each step differs from model to model.
The model preprocessing interface crops the image, converts its color mode, normalizes it for feeding into the model, and stores the image data (some models need no cropping and only store the data of the original image).
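A preprocessing step of this kind can be sketched without any imaging library. The example assumes an interleaved 8-bit BGR source, an RGB model input in planar CHW order, and scaling to [0, 1]; the patent does not fix any of these choices, and `Rect` and `PreprocessBgrToChw` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Crop rectangle in source pixel coordinates (hypothetical helper type).
struct Rect { int x, y, w, h; };

// src: interleaved 8-bit BGR, row-major, src_w x src_h pixels.
// Returns the cropped region as float RGB in planar CHW order, scaled to [0, 1].
inline std::vector<float> PreprocessBgrToChw(const std::vector<uint8_t>& src,
                                             int src_w, int src_h, Rect roi) {
    assert(src.size() == static_cast<std::size_t>(src_w) * src_h * 3);
    assert(roi.x >= 0 && roi.y >= 0 && roi.w > 0 && roi.h > 0);
    assert(roi.x + roi.w <= src_w && roi.y + roi.h <= src_h);

    const std::size_t plane = static_cast<std::size_t>(roi.w) * roi.h;
    std::vector<float> dst(plane * 3);
    for (int y = 0; y < roi.h; ++y) {
        for (int x = 0; x < roi.w; ++x) {
            // Offset of the source pixel (crop) and of the destination pixel.
            const std::size_t s =
                (static_cast<std::size_t>(roi.y + y) * src_w + (roi.x + x)) * 3;
            const std::size_t d = static_cast<std::size_t>(y) * roi.w + x;
            dst[0 * plane + d] = src[s + 2] / 255.0f;  // R (color conversion)
            dst[1 * plane + d] = src[s + 1] / 255.0f;  // G
            dst[2 * plane + d] = src[s + 0] / 255.0f;  // B
        }
    }
    return dst;  // stored image data, ready to hand to the input step
}
```

A model that needs no cropping would pass a rectangle covering the whole frame.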
The model input interface creates a triton-client input pointer from the parsed model information and uses it as an input parameter of the model inference interface.
The model output interface creates the model's output pointers according to the model output names defined on the triton-server side, and supports creating multiple different outputs.
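What the input and output steps have to derive from the parsed model information can be shown with plain structs standing in for the client library's input and output objects. The sketch assumes FP32 input data; `InputSpec`, `OutputSpec`, `MakeInputSpec` and `MakeOutputSpecs` are hypothetical names, not Triton API calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class InputFormat { kNCHW, kNHWC };

// Hypothetical parsed model information (see the fields listed in the text).
struct ModelInfo {
    std::string input_name;
    std::vector<std::string> output_names;
    InputFormat format = InputFormat::kNCHW;
    int channels = 0;
    int height = 0;
    int width = 0;
    int max_batch_size = 0;
};

// Stand-ins for the client library's input and requested-output objects.
struct InputSpec {
    std::string name;
    std::vector<int64_t> shape;
    std::size_t byte_size = 0;
};
struct OutputSpec { std::string name; };

// Builds the input description for a batch of FP32 images.
inline InputSpec MakeInputSpec(const ModelInfo& info, int batch) {
    assert(batch >= 1 && batch <= info.max_batch_size);
    InputSpec spec;
    spec.name = info.input_name;
    if (info.format == InputFormat::kNHWC) {
        spec.shape = {batch, info.height, info.width, info.channels};
    } else {
        spec.shape = {batch, info.channels, info.height, info.width};
    }
    spec.byte_size = sizeof(float);
    for (int64_t d : spec.shape) spec.byte_size *= static_cast<std::size_t>(d);
    return spec;
}

// One requested output per output name defined on the server side.
inline std::vector<OutputSpec> MakeOutputSpecs(const ModelInfo& info) {
    std::vector<OutputSpec> specs;
    for (const auto& name : info.output_names) specs.push_back(OutputSpec{name});
    return specs;
}
```

For a 3x224x224 NCHW model and a batch of two images, the input buffer holds 2 · 3 · 224 · 224 · 4 = 1,204,224 bytes.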
The model inference interface uses a function callback: the model inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback.
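The callback pattern can be sketched with `std::function`. `FakeAsyncInfer` is a hypothetical local stand-in for the remote inference call (it doubles each input value and invokes the callback synchronously), and `InferResult`, `OnComplete` and `ResultCollector` are likewise assumptions for illustration. The mutex is included because a real asynchronous client may invoke the callback on another thread.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: one named output tensor.
struct InferResult {
    std::string output_name;
    std::vector<float> data;
};
using OnComplete = std::function<void(InferResult)>;

// Stand-in for the remote call: reports one result per requested output.
inline void FakeAsyncInfer(const std::vector<float>& input,
                           const std::vector<std::string>& output_names,
                           const OnComplete& on_complete) {
    assert(static_cast<bool>(on_complete));
    for (const auto& name : output_names) {
        InferResult r;
        r.output_name = name;
        for (float v : input) r.data.push_back(v * 2.0f);
        on_complete(std::move(r));
    }
}

// Stores every result delivered to its callback in a vector.
class ResultCollector {
public:
    OnComplete Callback() {
        return [this](InferResult r) {
            std::lock_guard<std::mutex> lock(mu_);
            results_.push_back(std::move(r));
        };
    }
    // Hands the collected results to the post-processing step and resets.
    std::vector<InferResult> Take() {
        std::lock_guard<std::mutex> lock(mu_);
        std::vector<InferResult> out;
        out.swap(results_);
        return out;
    }

private:
    std::mutex mu_;
    std::vector<InferResult> results_;
};
```

The collector must outlive every callback it hands out, since the lambda captures `this`.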
The model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation (some models do not need this operation);
finally, the inference result is packaged into JSON data with the third-party rapidjson component and returned.
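The filtering and packaging steps can be sketched as below. To keep the example self-contained, the JSON is written by hand with `std::ostringstream` rather than with rapidjson, which the patent uses; the `Detection` fields, the `detections` key and the function names are assumptions for illustration, and labels are assumed to contain no characters that need JSON escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical detection record: label, confidence and box in pixels.
struct Detection {
    std::string label;
    float score;
    int x, y, w, h;
};

// Keeps only detections whose score reaches the threshold.
inline std::vector<Detection> FilterByScore(const std::vector<Detection>& in,
                                            float min_score) {
    assert(min_score >= 0.0f && min_score <= 1.0f);
    std::vector<Detection> out;
    for (const auto& d : in) {
        if (d.score >= min_score) out.push_back(d);
    }
    return out;
}

// Serializes detections to JSON text (labels are not escaped in this sketch).
inline std::string ToJson(const std::vector<Detection>& dets) {
    std::ostringstream os;
    os << "{\"detections\":[";
    for (std::size_t i = 0; i < dets.size(); ++i) {
        const Detection& d = dets[i];
        if (i != 0) os << ',';
        os << "{\"label\":\"" << d.label << "\",\"score\":" << d.score
           << ",\"box\":[" << d.x << ',' << d.y << ',' << d.w << ',' << d.h
           << "]}";
    }
    os << "]}";
    return os.str();
}
```

In production code a JSON library is the safer choice, because it handles string escaping and number formatting consistently.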
3. Wrapping the C++ interfaces into C interfaces for other languages to call
The code is compiled into a .so dynamic link library and the processing flow is wrapped into C interfaces, comprising an initialization interface, an inference output interface and a resource release interface, which are called from the Go language;
the initialization interface first starts the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the Triton server address and the camera address into the abstract class constructor to parse the model information.
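Creating the subclass that matches a model name is commonly done with a small factory. The sketch below is one way to do it, assuming a name-to-constructor table; the model names `detector` and `classifier`, the classes and `CreateModel` are hypothetical, and the patent does not say how the lookup is implemented.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Reduced abstract base class; only what the factory needs.
class ModelClient {
public:
    explicit ModelClient(std::string server_url)
        : server_url_(std::move(server_url)) {}
    virtual ~ModelClient() = default;
    virtual std::string Name() const = 0;

protected:
    std::string server_url_;
};

class DetectorClient : public ModelClient {
public:
    using ModelClient::ModelClient;
    std::string Name() const override { return "detector"; }
};

class ClassifierClient : public ModelClient {
public:
    using ModelClient::ModelClient;
    std::string Name() const override { return "classifier"; }
};

using Factory = std::function<std::unique_ptr<ModelClient>(const std::string&)>;

// Table mapping model names to subclass constructors.
inline const std::map<std::string, Factory>& Registry() {
    static const std::map<std::string, Factory> registry = {
        {"detector", [](const std::string& url) {
             return std::unique_ptr<ModelClient>(new DetectorClient(url));
         }},
        {"classifier", [](const std::string& url) {
             return std::unique_ptr<ModelClient>(new ClassifierClient(url));
         }},
    };
    return registry;
}

// Returns the subclass registered for model_name, or nullptr if unknown.
inline std::unique_ptr<ModelClient> CreateModel(const std::string& model_name,
                                                const std::string& server_url) {
    assert(!server_url.empty());
    const auto it = Registry().find(model_name);
    if (it == Registry().end()) return nullptr;
    return it->second(server_url);
}
```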
The inference output interface captures one frame of image data and then obtains the result in JSON format through the preprocessing, input, inference and post-processing steps.
The resource release interface releases the resources created in the initialization and inference output interfaces.
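The three C interfaces can be sketched with an opaque handle and `extern "C"` linkage. The function names `dc_init`, `dc_infer` and `dc_release`, their signatures and the `Pipeline` class are assumptions for illustration; the camera and the remote inference are replaced by a stub that returns a frame counter. Exceptions are caught inside each function so that none crosses the C boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

namespace {
// Hypothetical C++ object hidden behind the C handle.
class Pipeline {
public:
    Pipeline(std::string model, std::string server, std::string camera)
        : model_(std::move(model)),
          server_(std::move(server)),
          camera_(std::move(camera)) {}

    // Stub for capture, preprocessing, input, inference and post-processing.
    std::string InferOnce() {
        ++frame_;
        return "{\"model\":\"" + model_ + "\",\"frame\":" +
               std::to_string(frame_) + "}";
    }

private:
    std::string model_, server_, camera_;
    int frame_ = 0;
};
}  // namespace

extern "C" {

// Initialization interface: returns an opaque handle, or nullptr on failure.
void* dc_init(const char* model_name, const char* server_url,
              const char* camera_url) {
    if (!model_name || !server_url || !camera_url) return nullptr;
    try {
        return new Pipeline(model_name, server_url, camera_url);
    } catch (...) {
        return nullptr;
    }
}

// Inference output interface: writes NUL-terminated JSON into out.
// Returns the JSON length, or -1 on error or if out_cap is too small.
int dc_infer(void* handle, char* out, std::size_t out_cap) {
    if (!handle || !out) return -1;
    try {
        const std::string json = static_cast<Pipeline*>(handle)->InferOnce();
        if (json.size() + 1 > out_cap) return -1;
        std::memcpy(out, json.c_str(), json.size() + 1);
        return static_cast<int>(json.size());
    } catch (...) {
        return -1;
    }
}

// Resource release interface: destroys the handle (nullptr is allowed).
void dc_release(void* handle) { delete static_cast<Pipeline*>(handle); }

}  // extern "C"
```

Built as a shared library, functions with this shape can be declared in a C header and called from Go through cgo; the caller owns the output buffer, so no memory allocated on the C++ side has to be freed by the other language.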
The above specific embodiments are merely illustrative of specific cases of the present invention, and the scope of the present invention includes, but is not limited to, the specific embodiments described above, and any suitable modification or replacement made by one of ordinary skill in the art, which meets the claims of a deployment client implementation method according to the present invention, shall fall within the scope of the present invention.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (7)

1. A method for implementing a deployment client, characterized in that an abstract class and five public methods are first defined, a newly added model inherits the abstract class to generate a derived class, and the five methods of model preprocessing, input, output, inference and post-processing are implemented according to the input and output information of each model;
parsing of the model information and creation of the gRPC client are implemented in the constructor of the abstract class, and finally the C++ project interfaces are wrapped into three C interfaces and compiled into a .so dynamic link library provided for front-end code in other languages to call;
when the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter structure for storing the parsed model parameters, the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference, and http-client and grpc-client pointers for accessing triton-server;
with the model information parameter structure among the member variables of the abstract class, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the incoming model name and the inference server address; the pointers are then used to access triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information structure, and the width, height and channels of the image are parsed according to the image format.
2. The method for implementing a deployment client according to claim 1, wherein, when the derived class inherits the abstract class and implements the five public methods, each frame of image goes through the preprocessing, input, output, inference and post-processing steps in accordance with the processing logic of the deep learning model.
3. The method for implementing a deployment client according to claim 2, wherein the processing in each step differs from model to model, and the model preprocessing interface crops the image, converts its color mode, normalizes it for feeding into the model, and stores the image data.
4. The method for implementing a deployment client according to claim 3, wherein the model input interface creates a triton-client input pointer from the parsed model information and uses it as an input parameter of the model inference interface;
the model output interface creates the model's output pointers according to the model output names defined on the triton-server side, and supports creating multiple different outputs.
5. The method for implementing a deployment client according to claim 4, wherein the model inference interface uses a function callback, the model inputs and outputs parameters are passed into the inference interface, the inference result is obtained, and the result is stored in a vector inside the callback;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a structure for the DBSCAN clustering operation;
the inference result is packaged into JSON data with the third-party rapidjson component and returned.
6. The method for implementing a deployment client according to claim 5, wherein the C++ project interface is wrapped into C interfaces comprising an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.
7. The method for implementing a deployment client according to claim 6, wherein the initialization interface first starts the camera, then creates the corresponding subclass according to the name of the model to be inferred, and passes the model name, the Triton server address and the camera address into the abstract class constructor to parse the model information;
the inference output interface captures one frame of image data and obtains the inference result in JSON format through the preprocessing, input, inference and post-processing steps;
the resource release interface releases the resources created in the initialization interface and the inference output interface.
CN202110946751.2A 2021-08-18 2021-08-18 Method for realizing deployment client Active CN113608729B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110946751.2A CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110946751.2A CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Publications (2)

Publication Number Publication Date
CN113608729A CN113608729A (en) 2021-11-05
CN113608729B CN113608729B (en) 2023-07-04

Family

ID=78341095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110946751.2A Active CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Country Status (1)

Country Link
CN (1) CN113608729B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018001065A1 (en) * 2016-06-27 2018-01-04 中兴通讯股份有限公司 Method, device and system for managing application
CN109165024A (en) * 2018-07-26 2019-01-08 天讯瑞达通信技术有限公司 A kind of method of operation platform automatic deployment and monitoring server system
CN111488197A (en) * 2020-04-14 2020-08-04 浙江新再灵科技股份有限公司 Deep learning model deployment method and system based on cloud server
CN112418427A (en) * 2020-11-25 2021-02-26 广州虎牙科技有限公司 Method, device, system and equipment for providing deep learning unified reasoning service

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUNTING PERNICIOUS ATTACKS IN WEB APPLICATIONS WITH XPROBER;R. Suguna et al.;《American Journal of Applied Sciences》;1164-1171 *
适用于Docker环境的DevOps平台的设计与实践;韩东;《知网》;全文 *

Also Published As

Publication number Publication date
CN113608729A (en) 2021-11-05

Similar Documents

Publication Publication Date Title
CN107301098A (en) A kind of remote procedure call device based on Thrift agreements, method and system
CN110275700A (en) A kind of cross-platform multipad Development Framework and method based on electron
CN112764875B (en) Intelligent calculation-oriented lightweight portal container microservice system and method
CN104932905A (en) Automatic code generation method from AADL to C language
CN105955731B (en) Method and system for quickly compiling mobile game
CN109902274A (en) A kind of method and system converting json character string to thrift binary stream
CN105975261B (en) A kind of runtime system and operation method called towards unified interface
CN105743955B (en) A kind of extension JavaScript object method
CN111666572A (en) Automatic change infiltration test frame
CN111158690A (en) Desktop application framework, construction method, desktop application running method and storage medium
CN112711423A (en) Engine construction method, intrusion detection method, electronic device and readable storage medium
CN113608729B (en) Method for realizing deployment client
CN106919511A (en) The analogy method of application, simulation application and its operation method and simulation system
CN114218052B (en) Service interaction diagram generation method, device, equipment and storage medium
CN104423932B (en) The method that Binary Element is called in Javascript
CN111596905A (en) Method, device, storage medium and terminal for generating java object
CN111104122B (en) Method for mapping xml service logic to java service logic
CN117519877A (en) Rendering method and device of quick application card, storage medium and electronic equipment
CN111602115A (en) Model driving method for application program development based on ontology
WO2023124657A1 (en) Micro-application running method and apparatus, device, storage medium, and program product
CN113792704B (en) Cloud deployment method and device of face recognition model
CN115292074A (en) gPC protocol-based track analysis algorithm service calling method and device
CN114661402A (en) Interface rendering method and device, electronic equipment and computer readable medium
CN109117207B (en) Data processing method of business process model
Ivanović et al. Transforming service compositions into cloud-friendly actor networks

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant