CN113608729A - Method for realizing client end deployment - Google Patents

Info

Publication number
CN113608729A
Authority
CN
China
Prior art keywords
model
interface
client
input
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110946751.2A
Other languages
Chinese (zh)
Other versions
CN113608729B (en)
Inventor
王玉梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Original Assignee
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority to CN202110946751.2A
Publication of CN113608729A
Application granted
Publication of CN113608729B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention relates to the field of deep learning and provides a method for implementing a deployment client. An abstract class with five public methods is first defined; each newly added model generates a derived class by inheriting the abstract class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of that model. The parsing of model information and the creation of the grpc-client are implemented in the constructor of the abstract class. Finally, the C++ project interface is encapsulated into three C interfaces and compiled into a .so dynamic link library that front-end programs in other languages can call. Compared with the prior art, newly added models are easy to integrate, multiple models can be deployed, the C++ basis makes the method better suited to embedded devices with high performance requirements, and the generated dynamic link library can be called from multiple languages to build richer front-end functions.

Description

Method for realizing client end deployment
Technical Field
The invention relates to the field of deep learning, and in particular provides a method for implementing a deployment client.
Background
Model training is only a small part of deep learning; model deployment is the most important step in putting deep learning technology into practical use. The mainstream deployment approaches currently include Python service deployment, direct model loading in Java, Docker with tf-serving, and the NVIDIA Triton deployment framework.
Each approach has advantages and disadvantages. Python service deployment and Java direct-loading deployment have slow inference speed, a Python environment is required, and they are not suitable for embedded environments with high performance requirements.
The latter two approaches are deployed in Docker containers and expose http and grpc interfaces. The grpc service has an advantage when processing image data in batches, which makes it suitable for model inference on video stream input, and these frameworks support deploying multiple models. Of the four deployment approaches, NVIDIA Triton has the fastest inference speed.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a highly practical method for implementing a deployment client.
The technical solution adopted by the invention to solve the technical problem is as follows:
A method for implementing a deployment client, in which an abstract class with five public methods is first defined; each newly added model generates a derived class by inheriting the abstract class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of that model;
the parsing of model information and the creation of the grpc-client are implemented in the constructor of the abstract class; finally, the C++ project interface is encapsulated into three C interfaces and compiled into a .so dynamic link library that front-end programs in other languages can call.
Further, when the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter struct that stores the parsed model parameters: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference, and the http-client and grpc-client pointers used to access the triton-server.
Further, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and the address of the inference server; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information struct, and the width, height and channels of the image are parsed according to the image format.
Further, when the derived class inherits the abstract class and implements the five public methods, each image frame goes through preprocessing, input, output, inference and post-processing in accordance with the processing logic of the deep learning model.
Further, the processing at each stage differs between models; the model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data.
Further, the model input interface creates triton-client input pointers according to the parsed model information and uses them as input parameters of the model inference interface;
the model output interface creates the output pointers of the model according to the model output names defined on the triton-server side, and supports creating several different outputs.
Further, the model inference interface uses a function callback: the model inputs and outputs parameters are passed into the inference interface, and the inference result is obtained and stored in a vector at the callback site;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a struct for the DBSCAN clustering operation;
the inference result is packaged as json data and returned using the third-party rapidjson component.
Further, when the C++ project interface is encapsulated into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.
Further, the initialization interface first starts the camera, creates the corresponding subclass according to the passed-in name of the model to be run, and passes the model name, the triton server address and the camera address into the constructor of the abstract class to parse the model information;
the inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing steps;
and the resource release interface releases the resources created in the initialization interface and the inference output interface.
Compared with the prior art, the method for implementing a deployment client has the following notable advantages:
Newly added models are easy to integrate, multiple models can be deployed, the C++ basis makes the method better suited to embedded devices with high performance requirements, and the generated dynamic link library can be called from multiple languages to build richer front-end functions.
To add a new model, the Triton client only needs to implement the five public interfaces, so client deployment is simple and later maintenance costs are low. The Triton client can be built as a dynamic link library and packaged into three C interfaces that multiple languages can call, so that richer front-end functions can be built on the inference results. The method supports deploying several models at the same time, uses grpc interaction, which is better suited to processing image data in batches, and, being based on C++, is better suited to embedded devices with high performance requirements.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of the method for implementing a deployment client.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A preferred embodiment is given below:
As shown in FIG. 1, in the method for implementing a deployment client of this embodiment, an abstract class with five public methods is first defined; each newly added model inherits the abstract class to generate a derived class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of that model.
The parsing of model information and the creation of the grpc-client are implemented in the constructor of the abstract class. Finally, the C++ project interface is encapsulated into three C interfaces and compiled into a .so dynamic link library that front-end programs in other languages can call.
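The class structure described above can be sketched as follows. This is only an illustration under stated assumptions: the names `ModelClient`, `DummyDetector`, `Frame` and `Run`, and the method names, are hypothetical and do not come from the patent, and the constructor merely stores its arguments where the described method would parse model information and create the grpc-client.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type; the patent does not name one.
struct Frame {
    std::vector<unsigned char> pixels;
};

// Abstract base class: one pure virtual method per pipeline stage.
class ModelClient {
public:
    ModelClient(std::string model_name, std::string server_url)
        : model_name_(std::move(model_name)), server_url_(std::move(server_url)) {
        // In the described method, model information is parsed and the
        // grpc-client is created here. This sketch only stores the arguments.
    }
    virtual ~ModelClient() = default;

    virtual void Preprocess(const Frame& frame) = 0;
    virtual void PrepareInputs() = 0;
    virtual void PrepareOutputs() = 0;
    virtual void Infer() = 0;
    virtual std::string Postprocess() = 0;

    // Runs the five stages in order for one frame and returns the result text.
    std::string Run(const Frame& frame) {
        Preprocess(frame);
        PrepareInputs();
        PrepareOutputs();
        Infer();
        return Postprocess();
    }

protected:
    std::string model_name_;
    std::string server_url_;
};

// A newly added model derives from the base class and fills in the five methods.
class DummyDetector : public ModelClient {
public:
    using ModelClient::ModelClient;

    void Preprocess(const Frame& frame) override { pixel_count_ = frame.pixels.size(); }
    void PrepareInputs() override {}
    void PrepareOutputs() override {}
    void Infer() override {}  // a real model would call the inference server here
    std::string Postprocess() override {
        return "{\"model\":\"" + model_name_ + "\",\"pixels\":" +
               std::to_string(pixel_count_) + "}";
    }

private:
    std::size_t pixel_count_ = 0;
};
```

With this layout, adding a model means writing one more derived class; the calling code keeps using `Run` through the base class.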
1. The abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing
The member variables of the abstract class include a model information parameter struct that stores the parsed model parameters: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference, the http-client and grpc-client pointers used to access the triton-server, and so on.
When the abstract class is initialized, an http_client object pointer and a grpc_client object pointer are first created from the passed-in model name and the address of the inference server. The pointers are then used to access the triton-server and obtain the model_meta and model_config data. The model information is parsed from model_meta and model_config and stored in the defined model information struct, and finally the width, height and channels of the image are parsed according to the image format.
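As an illustration of the last step, the sketch below fills a model information struct from the non-batch input dimensions, depending on whether the layout is channels-first or channels-last. The struct fields, the `TensorLayout` enum and the function `FillImageShape` are hypothetical names chosen for this sketch; the server query that would supply the dimensions is omitted.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layout tag for the image input of a model.
enum class TensorLayout { NCHW, NHWC };

// Hypothetical model information struct holding the parsed parameters.
struct ModelInfo {
    std::string input_name;
    std::vector<std::string> output_names;
    std::string input_datatype;
    TensorLayout layout = TensorLayout::NCHW;
    std::int64_t height = 0;
    std::int64_t width = 0;
    std::int64_t channels = 0;
    int max_batch_size = 0;  // number of images per inference
};

// Derives height, width and channels from the three non-batch dimensions.
void FillImageShape(const std::vector<std::int64_t>& dims, TensorLayout layout,
                    ModelInfo* info) {
    if (info == nullptr || dims.size() != 3) {
        throw std::invalid_argument("expected three non-batch dimensions");
    }
    info->layout = layout;
    if (layout == TensorLayout::NHWC) {
        info->height = dims[0];
        info->width = dims[1];
        info->channels = dims[2];
    } else {  // NCHW
        info->channels = dims[0];
        info->height = dims[1];
        info->width = dims[2];
    }
}
```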
2. The derived class inherits the abstract class and implements the five public methods
In accordance with the processing logic of the deep learning model, each image frame goes through preprocessing, input, output, inference and post-processing; the processing at each stage differs between models.
The model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data (some models need no cropping and only store the original image data).
The model input interface creates triton-client input pointers according to the parsed model information and uses them as input parameters of the model inference interface.
The model output interface creates the output pointers of the model according to the model output names defined on the triton-server side, and supports creating several different outputs.
The model inference interface uses a function callback: the model inputs and outputs parameters are passed into the inference interface, and the inference result is obtained and stored in a vector at the callback site.
The model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a struct for the DBSCAN clustering operation (some models do not need this operation);
finally, the inference result is packaged as json data and returned using the third-party rapidjson component.
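The input, output and callback-based inference steps can be sketched as below. The sketch deliberately avoids the real Triton client library: `InputTensor`, `RequestedOutput`, `InferWithCallback` and `RunOnce` are hypothetical stand-ins, the tensor names are made up, and the "server" is a local function that invokes the callback synchronously and answers each requested output with the sum of the input values.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the client library's input and output handles.
struct InputTensor {
    std::string name;
    std::vector<std::int64_t> shape;
    std::vector<float> data;
};
struct RequestedOutput {
    std::string name;
};

using InferResult = std::map<std::string, std::vector<float>>;
using OnComplete = std::function<void(const InferResult&)>;

// Stand-in for the remote call. A real client would send the request over grpc;
// here every requested output is answered with the sum of all input values.
void InferWithCallback(const std::vector<InputTensor>& inputs,
                       const std::vector<RequestedOutput>& outputs,
                       const OnComplete& on_complete) {
    float sum = 0.0f;
    for (const auto& in : inputs) {
        for (float v : in.data) sum += v;
    }
    InferResult result;
    for (const auto& out : outputs) result[out.name] = {sum};
    on_complete(result);
}

// Builds inputs and outputs, runs inference, and collects results in a vector
// at the callback site.
std::vector<InferResult> RunOnce(const std::vector<float>& image) {
    InputTensor input;
    input.name = "input_0";  // hypothetical tensor name
    input.shape = {1, static_cast<std::int64_t>(image.size())};
    input.data = image;

    std::vector<InputTensor> inputs;
    inputs.push_back(input);

    std::vector<RequestedOutput> outputs;
    outputs.push_back(RequestedOutput{"boxes"});   // several outputs may be requested
    outputs.push_back(RequestedOutput{"scores"});

    std::vector<InferResult> collected;
    InferWithCallback(inputs, outputs,
                      [&collected](const InferResult& r) { collected.push_back(r); });
    return collected;
}
```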
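The score filter and the clustering step can be sketched as follows. The `Detection` struct, the choice to cluster on box centres with Euclidean distance, and the parameter names are assumptions made for this illustration; the patent does not state what is clustered or which parameters are used. JSON packaging with rapidjson is not shown.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical detection record: box centre, confidence, and cluster label.
struct Detection {
    float cx;
    float cy;
    float score;
    int cluster;  // -1 means noise
};

// Keeps only detections whose score reaches the threshold.
std::vector<Detection> FilterByScore(const std::vector<Detection>& in, float min_score) {
    std::vector<Detection> kept;
    for (const auto& d : in) {
        if (d.score >= min_score) kept.push_back(d);
    }
    return kept;
}

// Indices of all points within eps of point i, including i itself.
static std::vector<std::size_t> Neighbours(const std::vector<Detection>& pts,
                                           std::size_t i, float eps) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < pts.size(); ++j) {
        if (std::hypot(pts[i].cx - pts[j].cx, pts[i].cy - pts[j].cy) <= eps) {
            out.push_back(j);
        }
    }
    return out;
}

// Plain DBSCAN over box centres. Writes a label into each detection and
// returns the number of clusters found.
int Dbscan(std::vector<Detection>& pts, float eps, std::size_t min_pts) {
    const int kUnvisited = -2;
    const int kNoise = -1;
    for (auto& p : pts) p.cluster = kUnvisited;
    int cluster = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (pts[i].cluster != kUnvisited) continue;
        std::vector<std::size_t> seeds = Neighbours(pts, i, eps);
        if (seeds.size() < min_pts) {
            pts[i].cluster = kNoise;  // may later become a border point
            continue;
        }
        pts[i].cluster = cluster;
        for (std::size_t k = 0; k < seeds.size(); ++k) {
            const std::size_t q = seeds[k];
            if (pts[q].cluster == kNoise) pts[q].cluster = cluster;  // border point
            if (pts[q].cluster != kUnvisited) continue;
            pts[q].cluster = cluster;
            const std::vector<std::size_t> more = Neighbours(pts, q, eps);
            if (more.size() >= min_pts) {
                seeds.insert(seeds.end(), more.begin(), more.end());  // q is a core point
            }
        }
        ++cluster;
    }
    return cluster;
}
```

Filtering before clustering keeps low-confidence boxes from linking otherwise separate groups.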
3. Encapsulating the C++ interface into C interfaces for other languages to call
The code is compiled into a .so dynamic link library, and the processing flow is encapsulated into C interfaces: an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.
The initialization interface starts the camera, creates the corresponding subclass according to the name of the model to be run, and passes the model name, the triton server address and the camera address into the constructor of the abstract class to parse the model information.
The inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing steps.
The resource release interface releases the resources created in the initialization and inference output interfaces.
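One way to create the subclass that matches a model name is a small factory keyed by name, sketched below. The model names "face" and "vehicle", the class names and the function `CreateClient` are hypothetical, and the base class is reduced to a single method to keep the example short.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Reduced stand-in for the abstract class described in the text.
class ModelClient {
public:
    virtual ~ModelClient() = default;
    virtual std::string Name() const = 0;
};

class FaceDetector : public ModelClient {
public:
    std::string Name() const override { return "face"; }
};

class VehicleDetector : public ModelClient {
public:
    std::string Name() const override { return "vehicle"; }
};

// Returns the subclass registered for model_name, or nullptr if none exists.
std::unique_ptr<ModelClient> CreateClient(const std::string& model_name) {
    using Creator = std::function<std::unique_ptr<ModelClient>()>;
    static const std::map<std::string, Creator> registry = {
        {"face", [] { return std::unique_ptr<ModelClient>(new FaceDetector()); }},
        {"vehicle", [] { return std::unique_ptr<ModelClient>(new VehicleDetector()); }},
    };
    const auto it = registry.find(model_name);
    if (it == registry.end()) return nullptr;
    return it->second();
}
```

Supporting another model then only requires one more entry in the registry alongside the new derived class.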
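The three C interfaces can be sketched with `extern "C"` functions around an opaque handle. The function names `client_init`, `client_infer` and `client_release`, the `Pipeline` struct and the returned JSON fields are hypothetical; camera capture and inference are replaced by a frame counter. In this sketch the result string is allocated with `malloc`, so the caller is expected to release it with `free`.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

namespace {
// Hypothetical pipeline state; a real one would own the camera and model client.
struct Pipeline {
    std::string model_name;
    std::string server_url;
    std::string camera_url;
    int frames;
};
}  // namespace

extern "C" {

// Initialization interface: returns an opaque handle, or a null pointer on bad input.
void* client_init(const char* model_name, const char* server_url, const char* camera_url) {
    if (model_name == nullptr || server_url == nullptr || camera_url == nullptr) {
        return nullptr;
    }
    Pipeline* p = new Pipeline();
    p->model_name = model_name;
    p->server_url = server_url;
    p->camera_url = camera_url;
    p->frames = 0;
    return p;
}

// Inference output interface: processes one frame and returns a JSON string
// allocated with malloc. The caller releases it with free.
char* client_infer(void* handle) {
    Pipeline* p = static_cast<Pipeline*>(handle);
    if (p == nullptr) return nullptr;
    ++p->frames;  // stands in for capture, preprocessing, inference, post-processing
    const std::string json = "{\"model\":\"" + p->model_name +
                             "\",\"frame\":" + std::to_string(p->frames) + "}";
    char* out = static_cast<char*>(std::malloc(json.size() + 1));
    if (out != nullptr) std::memcpy(out, json.c_str(), json.size() + 1);
    return out;
}

// Resource release interface: destroys everything created by client_init.
void client_release(void* handle) {
    delete static_cast<Pipeline*>(handle);
}

}  // extern "C"
```

Because the exported functions use only C types, a library built from this file can be loaded from languages that have a C foreign function interface.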
The above embodiments are only specific cases of the present invention. The protection scope of the present invention includes but is not limited to the above embodiments, and any suitable change or substitution that is consistent with the claims of the method for implementing a deployment client and is made by a person of ordinary skill in the art shall fall within the protection scope of the present invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A method for implementing a deployment client, characterized in that an abstract class with five public methods is first defined; each newly added model generates a derived class by inheriting the abstract class and implements the five methods of model preprocessing, input, output, inference and post-processing according to the input and output information of that model;
the parsing of model information and the creation of the grpc-client are implemented in the constructor of the abstract class; finally, the C++ project interface is encapsulated into three C interfaces and compiled into a .so dynamic link library that front-end programs in other languages can call.
2. The method for implementing a deployment client according to claim 1, characterized in that, when the abstract class defines the five public methods of model preprocessing, input, output, inference and post-processing, the member variables of the abstract class include a model information parameter struct that stores the parsed model parameters: the input and output names, the height, width and channels of the model input, the input data format, the number of images per inference, and the http-client and grpc-client pointers used to access the triton-server.
3. The method for implementing a deployment client according to claim 2, characterized in that, when the abstract class is initialized, http_client and grpc_client object pointers are first created from the passed-in model name and the address of the inference server; the pointers are then used to access the triton-server and obtain the model_meta and model_config data; finally, the model information is parsed from model_meta and model_config and stored in the defined model information struct, and the width, height and channels of the image are parsed according to the image format.
4. The method for implementing a deployment client according to claim 3, characterized in that, when the derived class inherits the abstract class and implements the five public methods, each image frame goes through preprocessing, input, output, inference and post-processing in accordance with the processing logic of the deep learning model.
5. The method for implementing a deployment client according to claim 4, characterized in that the processing at each stage differs between models, and the model preprocessing interface crops the image, converts its color mode, normalizes it for model input and stores the image data.
6. The method for implementing a deployment client according to claim 5, characterized in that the model input interface creates triton-client input pointers according to the parsed model information and uses them as input parameters of the model inference interface;
the model output interface creates the output pointers of the model according to the model output names defined on the triton-server side, and supports creating several different outputs.
7. The method for implementing a deployment client according to claim 6, characterized in that the model inference interface uses a function callback: the model inputs and outputs parameters are passed into the inference interface, and the inference result is obtained and stored in a vector at the callback site;
the model post-processing interface traverses the model outputs, stores the inference results in a vector container by model output name, filters out boxes with low scores, and stores the result in a struct for the DBSCAN clustering operation;
the inference result is packaged as json data and returned using the third-party rapidjson component.
8. The method for implementing a deployment client according to claim 7, characterized in that, when the C++ project interface is encapsulated into C interfaces, these comprise an initialization interface, an inference output interface and a resource release interface, which are called from the Go language.
9. The method for implementing a deployment client according to claim 8, characterized in that the initialization interface first starts the camera, creates the corresponding subclass according to the name of the model to be run, and passes the model name, the triton server address and the camera address into the constructor of the abstract class to parse the model information;
the inference output interface captures one frame of image data and obtains the inference result in json format through the preprocessing, input, inference and post-processing steps;
and the resource release interface releases the resources created in the initialization interface and the inference output interface.
CN202110946751.2A 2021-08-18 2021-08-18 Method for realizing deployment client Active CN113608729B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110946751.2A CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110946751.2A CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Publications (2)

Publication Number Publication Date
CN113608729A (en) 2021-11-05
CN113608729B (en) 2023-07-04

Family

ID=78341095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110946751.2A Active CN113608729B (en) 2021-08-18 2021-08-18 Method for realizing deployment client

Country Status (1)

Country Link
CN (1) CN113608729B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018001065A1 (en) * 2016-06-27 2018-01-04 中兴通讯股份有限公司 Method, device and system for managing application
CN109165024A (en) * 2018-07-26 2019-01-08 天讯瑞达通信技术有限公司 A kind of method of operation platform automatic deployment and monitoring server system
CN111488197A (en) * 2020-04-14 2020-08-04 浙江新再灵科技股份有限公司 Deep learning model deployment method and system based on cloud server
CN112418427A (en) * 2020-11-25 2021-02-26 广州虎牙科技有限公司 Method, device, system and equipment for providing deep learning unified reasoning service

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
R. SUGUNA ET AL.: "HUNTING PERNICIOUS ATTACKS IN WEB APPLICATIONS WITH XPROBER", 《AMERICAN JOURNAL OF APPLIED SCIENCES》 *
韩东: "Design and Practice of a DevOps Platform for Docker Environments" (适用于Docker环境的DevOps平台的设计与实践), 《知网》 (CNKI) *

Also Published As

Publication number Publication date
CN113608729B (en) 2023-07-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant