CN117149413A - Cloud service integrated deployment system and method for universal AI algorithm model


Info

Publication number
CN117149413A
Authority
CN
China
Prior art keywords
model
reasoning
configuration
request
universal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311030986.2A
Other languages
Chinese (zh)
Inventor
王军德
李诒雯
周明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Kotei Informatics Co Ltd
Original Assignee
Wuhan Kotei Informatics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Kotei Informatics Co Ltd
Priority to CN202311030986.2A
Publication of CN117149413A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083: Techniques for rebalancing the load in a distributed system
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/04: Inference or reasoning models
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a cloud service integrated deployment system and a cloud service integrated deployment method for a general AI algorithm model, wherein the system comprises the following components: the universal model module is used for decoupling each AI model and dividing each AI model into a plurality of modules so as to realize the integration of the AI models; the configuration management module is used for extracting configuration parameters from the AI model to be deployed and transmitting the parameters to the model management module and the application module; the model management module is used for responding to the reasoning request of the application module or the transferred configuration parameters, calling or updating the AI model in the universal model module and returning the reasoning result of the AI model to the application module; and the application module is used for responding to the AI model calling request, processing the AI model calling request into a reasoning request and transmitting the reasoning request to the model management module. The invention combines AI model decoupling with the NACOS service to realize flexible deployment of AI models, significantly reducing the application landing cost and development time.

Description

Cloud service integrated deployment system and method for universal AI algorithm model
Technical Field
The invention belongs to the technical field of AI algorithm cloud service integration deployment, and particularly relates to a universal AI algorithm model cloud service integration deployment system and method.
Background
The rapid integration and deployment of AI algorithms has become an important battleground in the increasingly fierce competition among platforms and manufacturers. How a large number of cutting-edge research results can quickly become new functions, new highlights and improvements in platform performance has become increasingly important. This is especially true for cloud service platform products such as data labeling, where the appetite for cutting-edge technology is particularly strong: each new research result may be a key technology capable of improving labeling efficiency, so there is a natural demand for rapidly integrating and deploying emerging algorithm models.
For AI algorithm integration and deployment, the industry commonly adopts either a simple web interface deployment mode or a C++ deployment mode. The simple web interface deployment mode does not integrate the model code well, cannot manage the model effectively and reduces model utilization, so it is more suitable for common small-scale test and experiment scenarios. C++ deployment allows a model to run efficiently, but deployment is complex and difficult to maintain: the model must be deeply modified, a large amount of code reconstruction must be performed on the original Python-trained model, data preprocessing and the like must be rewritten, and models implemented with a specific framework are very laborious to port, difficult to integrate and have an unacceptable deployment period, so this mode is only suitable for some mature industrial application scenarios.
Disclosure of Invention
In order to improve the flexibility and expansibility of AI model management, the invention provides a universal AI algorithm model cloud service integrated deployment system, which performs unified integrated deployment and scheduling of models implementing various cutting-edge research results and reduces the cost of model transformation. The system comprises: the universal model module is used for analyzing and decoupling each AI model and dividing each AI model into a plurality of modules so as to realize integration of data input, model loading, model reasoning and model output of the AI models; the configuration management module is used for extracting configuration parameters from the AI model to be deployed and transmitting the configuration parameters to the model management module and the application module through the NACOS service; the model management module is used for responding to the reasoning request of the application module or the configuration parameters transmitted by the configuration management module, calling or updating the AI model in the universal model module and returning the reasoning result of the AI model to the application module; and the application module is used for responding to the AI model calling request, processing the AI model calling request into a reasoning request and transmitting the reasoning request to the model management module.
In some embodiments of the invention, the generic model module comprises: the analysis unit is used for carrying out file analysis on each AI model and positioning the loading, data input, output and model reasoning parts of each AI model; the decoupling unit is used for dividing each AI model into a plurality of decoupled modules according to the positioning result of the analysis unit; and the model realization unit is used for realizing data input, model loading, model reasoning and model output of the model according to the plurality of decoupled modules.
In some embodiments of the invention, the configuration management module comprises: the extraction unit is used for extracting the configuration parameters of each AI model; the configuration unit, configured to set the key-value pairs of the configuration parameters based on the NACOS service; the synchronization unit is used for acquiring the configuration file of the online server and synchronizing the configuration file to the local server; and the parsing unit is used for parsing the configuration file into configuration parameters which can be directly called by each AI model.
In some embodiments of the invention, the model management module comprises: the registration unit, used for registering the integrated AI model; the transmission unit, used for transmitting the configuration parameters to the configuration management module according to the reasoning request; the calling unit, used for responding to the reasoning request, calling a registered AI model in the universal model module and returning the reasoning result of the registered AI model to the application module; and the updating unit, used for updating the registered AI model in response to the configuration parameters transmitted by the configuration management module.
In some embodiments of the invention, further comprising: and the service discovery module is used for responding to the batch requests and distributing the requests to a plurality of AI model services by utilizing load balancing of the NACOS.
Further, the service discovery module includes: the discovery unit is used for automatically registering the deployed AI model with the NACOS server through the NACOS service so as to be accessed by users; the configuration unit is used for configuring the registered AI model through the NACOS service; and the load balancing unit is used for automatically distributing the request load through Nacos load balancing so as to support concurrent batch AI model call requests.
The second aspect of the invention provides a deployment method based on the universal AI algorithm model cloud service integrated deployment system provided in the first aspect, comprising the following steps: analyzing and decoupling each AI model, and dividing each AI model into a plurality of modules to realize integration of data input, model loading, model reasoning and model output of the AI models; extracting configuration parameters from the AI model to be deployed, and transmitting the configuration parameters to the model management module and the application module through the NACOS service; responding to the reasoning request of the application module or the configuration parameters transmitted by the configuration management module, calling or updating the AI model in the universal model module, and returning the reasoning result of the AI model to the application module; and responding to the AI model call request, processing the AI model call request into a reasoning request, and transmitting the reasoning request to the model management module.
In a third aspect of the present invention, there is provided an electronic apparatus comprising: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the deployment method of the universal AI algorithm model cloud service integrated deployment system provided by the second aspect of the invention.
In a fourth aspect of the present invention, there is provided a computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the deployment method of the universal AI algorithm model cloud service integration deployment system provided in the second aspect of the present invention.
The beneficial effects of the invention are as follows:
After adopting the method, a labeling platform can efficiently and rapidly integrate and deploy emerging cutting-edge algorithm research results in a web platform, realizing the rapid conversion and landing application of these results and verifying their effect and effectiveness. Compared with traditional Python non-invasive deployment and simple transformation deployment, the method can manage models more flexibly and effectively, exploit the performance of the model algorithm more fully, and reduce latency and resource consumption; compared with C++ transformation deployment, the method can greatly reduce the workload and the development and deployment period, significantly reducing the landing cost of algorithm application tests and shortening development time.
Drawings
FIG. 1 is a schematic diagram of the basic structure of a general AI algorithm model cloud service integration deployment system in accordance with some embodiments of the invention;
FIG. 2 is a schematic diagram of a general AI algorithm model cloud service integration deployment system in accordance with some embodiments of the invention;
FIG. 3 is a schematic diagram of a specific architecture of a general AI algorithm model cloud service integration deployment system in accordance with some embodiments of the invention;
FIG. 4 is a flow chart of a deployment method of a general AI algorithm model cloud service integration deployment system in accordance with some embodiments of the invention;
FIG. 5 is a schematic structural diagram of an electronic device according to some embodiments of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings; the examples are provided to illustrate the invention and are not to be construed as limiting its scope.
Referring to fig. 1 and 2, in a first aspect of the present invention, there is provided a general AI algorithm model cloud service integration deployment system 1, comprising: the universal model module 11, configured to analyze and decouple each AI model and divide each AI model into a plurality of modules, so as to implement integration of data input, model loading, model reasoning and model output of the AI models; a configuration management module 12, configured to extract configuration parameters from the AI model to be deployed and transmit the configuration parameters to the model management module 13 and the application module 14 through the NACOS service; the model management module 13, configured to respond to the reasoning request of the application module or the configuration parameters transferred by the configuration management module 12, call or update the AI model in the universal model module, and return the reasoning result of the AI model to the application module; and an application module 14, configured to respond to the AI model call request, process the AI model call request into a reasoning request, and transmit the reasoning request to the model management module 13.
Referring to fig. 2 and 3, in some embodiments of the present invention, the generic model module 11 includes: the analysis unit is used for carrying out file analysis on each AI model and positioning the loading, data input, output and model reasoning parts of each AI model; the decoupling unit is used for dividing each AI model into a plurality of decoupled modules according to the positioning result of the analysis unit; and the model realization unit is used for realizing data input, model loading, model reasoning and model output of the model according to the plurality of decoupled modules.
Specifically, for almost all current AI algorithm models, the running process can be simply divided into four parts: data input (preprocessing), model loading, model reasoning and data output (postprocessing). A generally applicable model integration framework can therefore be realized by reasonably allocating the functions of these four parts. The specific integration steps are as follows:
(1) Analyzing the code of the algorithm model to be integrated and deployed, mainly locating the four parts of model loading, data input, data output and model reasoning;
(2) Splitting the code into four decoupled modules through simple reconstruction;
(3) Completing the basic integration work of the algorithm model by constructing a new subclass InferenceModel that inherits from the CommonModel parent class, and implementing the abstract methods of the parent class for model loading, reasoning, input preprocessing and output postprocessing. Illustratively, the CommonModel parent class is constructed as follows:
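(The original listing is not reproduced in this text.) A minimal sketch of such a parent class is given below, assuming Python; the method names and signatures are illustrative rather than taken from the source:

from abc import ABC, abstractmethod

class CommonModel(ABC):
    """Generic parent class for integrating AI algorithm models.

    Subclasses (e.g. InferenceModel) implement the four decoupled
    parts: data input (preprocessing), model loading, model reasoning
    and data output (postprocessing).
    """

    def __init__(self, config):
        self.config = config   # parsed configuration (PredictionConfig)
        self.model = None      # populated by load_model()

    @abstractmethod
    def load_model(self):
        """Load model weights, e.g. from self.config.weight_path."""

    @abstractmethod
    def preprocess(self, raw_input):
        """Convert raw request data into model input."""

    @abstractmethod
    def infer(self, model_input):
        """Run the forward pass and return raw model output."""

    @abstractmethod
    def postprocess(self, model_output):
        """Convert raw model output into a serializable result."""

    def predict(self, raw_input):
        # Template method tying the four decoupled parts together.
        if self.model is None:
            self.load_model()
        return self.postprocess(self.infer(self.preprocess(raw_input)))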
referring to fig. 2 and 3, in some embodiments of the invention, the configuration management module 12 comprises: the extraction unit is used for extracting the configuration parameters of each AI model; the configuration unit, configured to set the key-value pairs of the configuration parameters based on the NACOS service; the synchronization unit is used for acquiring the configuration file of the online server and synchronizing the configuration file to the local server; and the parsing unit is used for parsing the configuration file into configuration parameters which can be directly called by each AI model.
Specifically, some configuration parameters of the model are extracted, such as the weight file path, the recognition result confidence threshold, the IOU threshold and the input data size, forming a configuration class PredictionConfig and configuration parameter key-value pairs, so that subsequent model starting and reasoning can obtain the relevant parameters directly through the configuration class. Specifically, the method comprises the following steps:
(1) Extracting the model configuration parameters, and establishing the correspondence between the PredictionConfig class and the parameters;
(2) Creating a new configuration file in the Nacos service, and setting the configuration parameter key-value pairs from step (1);
(3) Registering a Nacos configuration listener for the model, configuring the Nacos service address and the configuration item name;
(4) After the Flask service is started, periodically acquiring the online configuration through a timing task and synchronizing it to the local server;
(5) Through configuration parsing, instantiating the configuration parameter content into a PredictionConfig object for model loading, reasoning and the like.
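A minimal sketch of steps (1) and (5) is given below, assuming Python with the nacos-sdk-python and PyYAML packages; the configuration fields, server address, data id and group name are illustrative placeholders rather than values from the source:

from dataclasses import dataclass

import nacos  # nacos-sdk-python
import yaml

@dataclass
class PredictionConfig:
    weight_path: str        # weight file path
    conf_threshold: float   # recognition result confidence threshold
    iou_threshold: float    # IOU threshold
    input_size: int         # input data size

# Connect to the Nacos server and pull the model's configuration item.
client = nacos.NacosClient("127.0.0.1:8848", namespace="public")
raw = client.get_config("detection-model-config", "DEFAULT_GROUP")

# Parse the key-value configuration into a PredictionConfig instance
# that the model can call directly.
cfg = PredictionConfig(**yaml.safe_load(raw))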
With continued reference to fig. 2 and 3, in some embodiments of the invention, the model management module 13 comprises: the registration unit, used for registering the integrated AI model; the transmission unit, used for transmitting the configuration parameters to the configuration management module 12 according to the reasoning request; the calling unit, used for responding to the reasoning request, calling a registered AI model in the universal model module and returning the reasoning result of the registered AI model to the application module 14; and the updating unit, used for updating the registered AI model in response to the configuration parameters transmitted by the configuration management module 12.
Specifically, model scheduling management: once model integration is completed, the calling process of the model is managed by the model scheduling manager ModelManager. The scheduler decides, according to user needs and configuration, when to load the model, release the model, call the model for reasoning, update the model configuration and restart the model, and so on. Model scheduling management comprises the following processes:
(1) Registering the integrated model InferenceModel in the ModelManager;
(2) The ModelManager transmits the configuration information to the model class;
(3) When a user calls an interface, only the ModelManager reasoning interface needs to be called; the ModelManager automatically checks, according to the registered models, whether the model is available and whether it needs to be started, then calls model reasoning and obtains the returned result;
(4) After the user reconfigures the Nacos parameters, the local configuration is updated and the ModelManager is informed that the model parameters need to be updated; the ModelManager obtains the parameters, transmits them to the model and restarts the model with the new parameters.
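A minimal sketch of such a scheduling manager follows; it builds on the CommonModel sketch above, and the method names are illustrative rather than taken from the source:

class ModelManager:
    """Schedules registered models: load, call, configure, restart."""

    def __init__(self):
        self._models = {}  # model name -> CommonModel subclass instance

    def register(self, name, model):
        # (1) Register an integrated InferenceModel under a name.
        self._models[name] = model

    def infer(self, name, raw_input):
        # (3) Single reasoning entry point: check that the model is
        # registered, start it lazily if needed, then run reasoning.
        model = self._models.get(name)
        if model is None:
            raise KeyError(f"model '{name}' is not registered")
        return model.predict(raw_input)

    def update_config(self, name, new_config):
        # (2)/(4) Transmit new configuration parameters to the model
        # and hot-restart it with the new parameters.
        model = self._models[name]
        model.config = new_config
        model.model = None  # force a reload on the next call
        model.load_model()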
in some embodiments of the present invention, the application module 14 is configured to respond to the AI model call request, process the response AI model call request into an inference request, and transmit the inference request to the model management module 13.
Specifically, a REST-style application programming interface (API) is rapidly constructed using Python's Flask lightweight web framework, and data requests and responses are exchanged in JSON format. It can be understood that the above procedure constitutes the API access process of the AI model.
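A minimal sketch of such an interface is shown below, building on the ModelManager sketch above; the route path and response fields are illustrative:

from flask import Flask, jsonify, request

app = Flask(__name__)
manager = ModelManager()  # models are registered at service start-up

@app.route("/api/v1/infer/<model_name>", methods=["POST"])
def infer(model_name):
    # JSON request in, JSON reasoning result out.
    payload = request.get_json(force=True)
    try:
        result = manager.infer(model_name, payload)
    except KeyError:
        return jsonify({"code": 404, "msg": "unknown model"}), 404
    return jsonify({"code": 0, "data": result})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)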
In some embodiments of the invention, the system further comprises: the service discovery module 15, configured to respond to batch requests and distribute the requests to a plurality of AI model services using Nacos load balancing.
Further, the service discovery module 15 includes: the discovery unit is used for automatically registering the deployed AI model with the NACOS server through the NACOS service so as to be accessed by users; the configuration unit is used for configuring the registered AI model through the NACOS service; and the load balancing unit is used for automatically distributing the request load through Nacos load balancing so as to support concurrent batch AI model call requests.
Specifically, Nacos distributed service discovery: the most important functions of the Nacos component are service discovery and configuration management.
(1) Service discovery: the algorithm deployment end does not need to announce the specific location of the deployed service in advance; instead, it automatically registers with the Nacos server through the Nacos component, and the user side directly obtains the required service call address by accessing the Nacos service;
(2) Configuration management: the dynamic configuration items related to the algorithm model and the service can be placed directly on Nacos for management; when the service is deployed, the relevant configuration is obtained through Nacos registration and access in order to start the service and the algorithm;
(3) Load balancing: when large amounts of model computing power are needed, many instances of the same algorithm microservice can be deployed, hardware resources permitting, and Nacos load balancing automatically distributes the request load, so that a large number of concurrent algorithm requests can be supported.
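As an illustration of the registration and discovery steps, a minimal sketch using the nacos-sdk-python client is given below; the server address, service name, IP and port are assumed placeholders, and heartbeat and error handling are omitted:

import nacos

# Illustrative placeholders, not values from the source.
SERVER = "127.0.0.1:8848"
SERVICE = "detection-model-service"
IP, PORT = "10.0.0.5", 5000

client = nacos.NacosClient(SERVER, namespace="public")

# (1) The deployment end registers its instance with the Nacos server;
# identical instances registered under the same service name share the
# request load under Nacos load balancing.
client.add_naming_instance(SERVICE, IP, PORT)

# The user side resolves an available instance before calling the API.
instances = client.list_naming_instance(SERVICE)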
Example 2
Referring to fig. 4, in a second aspect of the present invention, there is provided a deployment method of the universal AI algorithm model cloud service integrated deployment system provided in the first aspect, comprising: S100, analyzing and decoupling each AI model, and dividing each AI model into a plurality of modules so as to realize integration of data input, model loading, model reasoning and model output of the AI models; S200, extracting configuration parameters from the AI model to be deployed, and transmitting the configuration parameters to the model management module and the application module through the NACOS service; S300, responding to the reasoning request of the application module or the configuration parameters transmitted by the configuration management module, calling or updating the AI model in the universal model module, and returning the reasoning result of the AI model to the application module; S400, responding to the AI model calling request, processing the AI model calling request into a reasoning request, and transmitting the reasoning request to the model management module.
Further, in step S100, analyzing and decoupling each AI model and dividing each AI model into a plurality of modules to realize the integration of data input, model loading, model reasoning and model output of the AI models comprises: performing file analysis on each AI model and locating the loading, data input, output and model reasoning parts of each AI model; dividing each AI model into a plurality of decoupled modules according to the analysis result; and realizing data input, model loading, model reasoning and model output of the model according to the plurality of decoupled modules.
In one embodiment of the present invention, the deployment method includes the processes of general AI algorithm model integration, model scheduling management, data input and output processing, and Nacos synchronization and distributed deployment of the model configuration. The function of each process is described as follows:
1. AI algorithm model integration: in order to be compatible with the code integration of various AI algorithm models, the patent designs a universal parent module; algorithms of different types can be integrated uniformly by inheriting the corresponding reasoning methods of this module.
2. Model scheduling management: with model integration completed in the previous step, the patent designs a general model scheduling manager, which completes the calling of the algorithm modules so as to decouple model scheduling from the interface service.
3. Algorithm API access: data input and output are performed through the web interface, user requests are distributed through Nacos load balancing, and the algorithm API is called to perform reasoning and return results.
4. Nacos synchronization of the model configuration: the model and service related configuration is managed uniformly using Nacos, and a configuration hot-update module is constructed to realize hot restarting of the model (see the sketch after this list).
5. Nacos distributed service discovery: utilizing Nacos load balancing, the model service can be deployed at multiple distributed points when request demand is large, balancing the load of a large number of call requests across the model services.
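As an illustration of the configuration hot-update module mentioned in item 4, the following is a minimal sketch assuming the Nacos client, PredictionConfig and ModelManager sketches above; the polling interval and identifiers are illustrative:

import threading

import yaml

def start_config_sync(client, manager, model_name,
                      data_id="detection-model-config",
                      group="DEFAULT_GROUP", interval=30.0):
    """Periodically pull the online Nacos configuration and
    hot-restart the model when it changes."""
    state = {"last": None}

    def poll():
        raw = client.get_config(data_id, group)
        if raw != state["last"]:
            state["last"] = raw
            manager.update_config(
                model_name, PredictionConfig(**yaml.safe_load(raw)))
        # Re-arm the timer: a simple stand-in for a timing task.
        threading.Timer(interval, poll).start()

    poll()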
Example 3
Referring to fig. 5, a third aspect of the present invention provides an electronic device, including: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the deployment method of the universal AI algorithm model cloud service integrated deployment system in the second aspect.
The electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with programs stored in a Read Only Memory (ROM) 502 or loaded from a storage 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; a storage device 508 including, for example, a hard disk; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 500 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided; more or fewer means may be implemented or provided instead. Each block shown in fig. 5 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or from the storage means 508, or from the ROM 502. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 501. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device, or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more computer programs which, when executed by the electronic device, cause the electronic device to: analyze and decouple each AI model and divide each AI model into a plurality of modules; extract configuration parameters from the AI model to be deployed and transmit them through the NACOS service; respond to the reasoning request or the transmitted configuration parameters, call or update the AI model and return the reasoning result; and respond to the AI model call request, process it into a reasoning request and transmit the reasoning request to the model management module.
computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++, python and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention to the precise form disclosed; any modifications, equivalents and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.

Claims (10)

1. A universal AI algorithm model cloud service integration deployment system, comprising:
the universal model module is used for analyzing and decoupling each AI model and dividing each AI model into a plurality of modules so as to realize integration of data input, model loading, model reasoning and model output of the AI models;
the configuration management module is used for extracting configuration parameters from the AI model to be deployed and transmitting the configuration parameters to the model management module and the application module through the NACOS service;
the model management module is used for responding to the reasoning request of the application module or the configuration parameters transmitted by the configuration management module, calling or updating the AI model in the universal model and returning the reasoning result of the AI model to the application module;
and the application module is used for responding to the AI model calling request, processing the AI model calling request into a reasoning request and transmitting the reasoning request to the model management module.
2. The generic AI algorithm model cloud service integration deployment system of claim 1, wherein the generic model module comprises:
the analysis unit is used for carrying out file analysis on each AI model and positioning the loading, data input, output and model reasoning parts of each AI model;
the decoupling unit is used for dividing each AI model into a plurality of decoupled modules according to the positioning result of the analysis unit;
and the model realization unit is used for realizing data input, model loading, model reasoning and model output of the model according to the plurality of decoupled modules.
3. The universal AI algorithm model cloud service integration deployment system of claim 1, wherein the configuration management module comprises:
the extraction unit is used for extracting the configuration parameters of each AI model;
the configuration unit, configured to set the key-value pairs of the configuration parameters based on the NACOS service;
the synchronization unit is used for acquiring the configuration file of the online server and synchronizing the configuration file to the local server;
and the parsing unit is used for parsing the configuration file into configuration parameters which can be directly called by each AI model.
4. The universal AI algorithm model cloud service integration deployment system of claim 1, wherein the model management module comprises:
a registration unit for registering the integrated AI model;
the transmission unit is used for transmitting the configuration parameters to the configuration management module according to the reasoning request;
the calling unit is used for responding to the reasoning request, calling a registered AI model in the universal model and returning the reasoning result of the registered AI model to the application module;
and the updating unit is used for updating the registered AI model in response to the configuration parameters transmitted by the configuration management module.
5. The universal AI algorithm model cloud service integration deployment system of claim 1, further comprising: and the service discovery module is used for responding to the batch requests and distributing the requests to a plurality of AI model services by utilizing load balancing of the NACOS.
6. The universal AI algorithm model cloud service integration deployment system of claim 5, wherein the service discovery module comprises:
the discovery unit is used for automatically registering the deployed AI model to the NACOS server through NACOS service so as to be accessed by a user;
a configuration unit configured to configure a registered AI model through a NACOS service;
and the load balancing unit is used for automatically distributing the request load through Nacos load balancing so as to support concurrent batch AI model call requests.
7. A deployment method of the universal AI algorithm model cloud service integration deployment system of claim 1, comprising:
analyzing and decoupling each AI model, and dividing each AI model into a plurality of modules to realize integration of data input, model loading, model reasoning and model output of the AI model;
extracting configuration parameters from the AI model to be deployed, and transmitting the configuration parameters to the model management module and the application module through the NACOS service;
responding to the reasoning request of the application module or the configuration parameters transmitted by the configuration management module, calling or updating the AI model in the universal model, and returning the reasoning result of the AI model to the application module;
and responding to the AI model call request, processing the AI model call request into a reasoning request, and transmitting the reasoning request to the model management module.
8. The universal AI algorithm model cloud service integration deployment method of claim 7, wherein analyzing and decoupling each AI model and dividing each AI model into a plurality of modules to realize the integration of data input, model loading, model reasoning and model output of the AI models comprises the following steps:
performing file analysis on each AI model and locating the loading, data input, output and model reasoning parts of each AI model;
dividing each AI model into a plurality of decoupled modules according to the analysis result;
and according to the plurality of decoupled modules, realizing data input, model loading, model reasoning and model output of the model.
9. An electronic device, comprising: one or more processors; storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the generic AI algorithm model cloud service integration deployment method of any of claims 7-8.
10. A computer readable medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the general AI algorithm model cloud service integration deployment method of any of claims 7 to 8.
CN202311030986.2A 2023-08-14 2023-08-14 Cloud service integrated deployment system and method for universal AI algorithm model Pending CN117149413A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311030986.2A CN117149413A (en) 2023-08-14 2023-08-14 Cloud service integrated deployment system and method for universal AI algorithm model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311030986.2A CN117149413A (en) 2023-08-14 2023-08-14 Cloud service integrated deployment system and method for universal AI algorithm model

Publications (1)

Publication Number Publication Date
CN117149413A true CN117149413A (en) 2023-12-01

Family

ID=88883438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311030986.2A Pending CN117149413A (en) 2023-08-14 2023-08-14 Cloud service integrated deployment system and method for universal AI algorithm model

Country Status (1)

Country Link
CN (1) CN117149413A (en)

Similar Documents

Publication Publication Date Title
EP3764220B1 (en) Automatic application updates
US8819683B2 (en) Scalable distributed compute based on business rules
US10019298B2 (en) Middleware interface and middleware interface generator
CN109117252B (en) Method and system for task processing based on container and container cluster management system
JP7012689B2 (en) Command execution method and device
CN113934464A (en) Method and device for starting android application in Linux system and electronic equipment
WO2024077885A1 (en) Management method, apparatus and device for container cluster, and non-volatile readable storage medium
CN112787999B (en) Cross-chain calling method, device, system and computer readable storage medium
CN112256414A (en) Method and system for connecting multiple computing storage engines
CN106873970A (en) The installation method and device of a kind of operating system
CN112988223A (en) Frame integration method and device, electronic equipment and storage medium
CN111831461A (en) Method and device for processing business process
CN111078516A (en) Distributed performance test method and device and electronic equipment
CN115686805A (en) GPU resource sharing method and device, and GPU resource sharing scheduling method and device
CN113448650A (en) Live broadcast function plug-in loading method, device, equipment and storage medium
CN110717992B (en) Method, apparatus, computer system and readable storage medium for scheduling model
CN116257320B (en) DPU-based virtualization configuration management method, device, equipment and medium
CN115361382B (en) Data processing method, device, equipment and storage medium based on data group
CN111414154A (en) Method and device for front-end development, electronic equipment and storage medium
CN116302271A (en) Page display method and device and electronic equipment
CN117149413A (en) Cloud service integrated deployment system and method for universal AI algorithm model
CN111488268A (en) Dispatching method and dispatching device for automatic test
CN114564249A (en) Recommendation scheduling engine, recommendation scheduling method, and computer-readable storage medium
CN115604333B (en) Distributed big data analysis service scheduling method and system based on dubbo
CN112817573B (en) Method, apparatus, computer system, and medium for building a streaming computing application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination