WO2023168875A1 - Method and apparatus for starting model service, and device, medium and product - Google Patents


Info

Publication number
WO2023168875A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2022/105180
Other languages
French (fr)
Chinese (zh)
Inventor
罗阳
钱正宇
胡鸣人
施恩
袁正雄
褚振方
黄悦
王国彬
李金麒
Original Assignee
北京百度网讯科技有限公司
Application filed by 北京百度网讯科技有限公司
Publication of WO2023168875A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/445: Program loading or initiating
    • G06F9/44505: Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/46: Multiprogramming arrangements
    • G06F9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806: Task transfer initiation or dispatching
    • G06F9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504: Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508: Runtime interpretation or emulation, e.g. emulator loops, bytecode interpretation
    • G06F9/45512: Command shells
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present disclosure relates to the field of computer technology, in particular to AI platform technology, and specifically to a method, apparatus, device, medium and product for starting a model service.
  • AI (Artificial Intelligence) services provide services to users through an AI service platform, enabling the deployment and launch of customer-customized AI models. As the number of customers increases, the number of different models grows and the memory resources occupied become increasingly large.
  • the present disclosure provides a method, device, equipment, storage medium and program product for starting a model service.
  • a method of starting a model service, including: in response to a service model being triggered to start, obtaining an image file corresponding to the service model, where the image file includes meta-information of the service model and context information of the service process run by the service model; and loading the image file to start the service model for service.
  • a device for starting a model service, including: an acquisition module, configured to acquire an image file corresponding to the service model when the service model is triggered to start, where the image file includes meta-information of the service model and context information of the service process run by the service model; and a startup module, configured to load the image file to start the service model for service.
  • an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method for starting a model service according to any one of the present disclosure.
  • a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions are used to cause the computer to execute the method for starting a model service according to any one of the present disclosure.
  • a computer program product including a computer program that, when executed by a processor, implements the method for starting a model service described in any one of the present disclosure.
  • Figure 1 is a schematic flowchart of a method of starting a model service according to the present disclosure
  • Figure 2 is a schematic flowchart of a method of creating an image file according to the present disclosure
  • Figure 3 is a schematic flowchart of a method for running a service process according to the control service model of the present disclosure
  • Figure 4 is a schematic flowchart of a method for obtaining context information of a service process running a service model and meta-information of the service model according to the present disclosure
  • Figure 5 is a schematic flowchart of a method for obtaining context information of a service process running a service model and meta-information of the service model according to the present disclosure
  • Figure 6 is a schematic flowchart of a method of loading the image file to start the service model according to the present disclosure
  • Figure 7 is a schematic system structure diagram for implementing a method of starting a model service according to the present disclosure
  • Figure 8 is a block diagram of a device for launching a model service according to the present disclosure.
  • Figure 9 is a block diagram of a device for launching a model service according to the present disclosure.
  • FIG. 10 is a block diagram of an electronic device used to implement a method for starting a model service according to an embodiment of the present disclosure.
  • AI simulates responses in a manner similar to human intelligence.
  • Research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems. Since the birth of AI technology, the theory and technology have become increasingly mature, and it has been widely used in various industries. By combining business in this field with AI technology, costs can be reduced and efficiency increased.
  • In the related art, in order to reduce the time consumed by AI model loading, the model can be pruned, quantized, or otherwise reduced in size to speed up loading of the AI model.
  • However, pruning and quantizing the model reduces the accuracy of the model service, and also increases costs.
  • Alternatively, lazy loading can be used when starting the AI model to improve startup speed, for example by not adopting an inter-layer fusion strategy or not using instruction-set optimization.
  • However, this only improves AI model loading speed to a limited extent, and it slows down inference with the model, causing traffic loss.
  • embodiments of the present disclosure provide a method for starting the model service.
  • When the service is triggered, the image file corresponding to the AI service model is obtained and loaded.
  • The AI model is thus loaded quickly and the service starts promptly, thereby improving user experience.
  • Figure 1 is a schematic flowchart of a method of starting a model service according to the present disclosure. As shown in Figure 1, the method of the present disclosure includes the following steps.
  • step S101 in response to the service model being triggered to start, an image file corresponding to the service model is obtained, where the image file includes meta-information of the service model and context information of the service process run by the service model.
  • the service model is an AI service model as an example for description.
  • the image file corresponding to the AI service model is obtained to realize the normal application of the AI service model.
  • the image file includes meta-information of the AI service model.
  • The meta-information is used to record the functions of the AI service model and the conditions for implementing model inference services and other self-dependencies. It can include the service process identifier, the service Uniform Resource Locator (URL), service logs, etc.
  • the image file corresponding to the AI service model also includes context information of the service process run by the service model, such as the process of loading the model into memory, the status of the AI service model process, etc.
  • step S102 the image file is loaded to start the service model for service.
  • When starting the AI service model, the model itself is not loaded; instead, the image file corresponding to the AI service model is obtained, and the AI service model is started to provide services through the obtained image file.
  • The image file corresponding to the AI service model is obtained, and the AI service model is started by loading the image file, thereby reducing model startup time and improving model service startup speed, which improves user experience.
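  • The two steps above (S101: obtain the image file when the start is triggered; S102: load it to start the service) could be sketched as follows. This is a minimal illustration only, not the patented implementation; the file naming, `image_dir` layout, and dictionary keys are assumptions.

```python
import os
import pickle


def start_model_service(model_name, image_dir):
    """Start a model service, preferring restore-from-image over a cold start.

    Hypothetical layout: one "<model_name>.img" file per service model.
    """
    image_path = os.path.join(image_dir, model_name + ".img")
    if os.path.exists(image_path):  # S101: an image exists for this model
        with open(image_path, "rb") as f:
            image = pickle.load(f)
        # S102: restore directly from the saved meta-information and
        # service-process context instead of reloading model weights.
        return {"started_from": "image",
                "meta": image["meta"],
                "context": image["context"]}
    # No image yet: fall back to a conventional cold start.
    return {"started_from": "cold_start"}
```
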
  • Figure 2 is a schematic flowchart of a method of creating an image file according to the present disclosure. As shown in Figure 2, the method of the present disclosure includes the following steps.
  • step S201 the service model is controlled to run the service process.
  • The image file corresponding to the AI service model may be created in advance: it may be created when the AI service model service is first started, or before the AI service model provides services.
  • Controlling the AI service model to run the service process includes installing the environment required to run the AI service model, for example configuring the underlying hardware drivers and installing the dependent libraries required at runtime. After the running environment is installed, the AI service model is started to provide the corresponding services. Understandably, the AI service model can provide services based on various protocols, such as Hypertext Transfer Protocol (HTTP) services, Remote Procedure Call (RPC) services, Transmission Control Protocol (TCP) services, etc. The embodiments of the present disclosure do not limit this.
  • step S202 the context information of the service process run by the service model and the meta-information of the service model are obtained.
  • the context information of the service process run by the service model and the meta-information of the service model can be obtained through checkpoint operations.
  • Through the system-call tracing (ptrace) mechanism, special code is injected into the service process run by the service model and executed to collect the context information of the service process.
  • With ptrace, the parent process can monitor and control other processes, change the registers and kernel image of a child process, and implement breakpoint debugging and system-call tracing.
  • During a system call, if the current process is in a "traced" state, control is handed to the tracing process so that the tracing process can view or modify the registers of the traced process.
  • The context information of the service process run by the AI service model and the meta-information of the service model are dumped at the checkpoint.
  • step S203 an image file including meta information and context information is created.
  • The obtained context information of the service process run by the AI service model and the meta-information of the AI service model are dumped to create an image file that includes both the meta-information and the context information of the AI service model.
  • the backend can support local storage, distributed storage, object storage and other methods to realize the storage of image files.
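  • The dump described above can be sketched as bundling the two pieces of information into one file. This is an illustrative sketch with an assumed JSON layout and assumed keys (`meta`, `context`, `service_name`); a real checkpoint would dump raw process memory rather than JSON, and the backend could equally be distributed or object storage.

```python
import json
import os
import time


def create_image_file(meta, context, image_dir):
    """Dump service-model meta-information and service-process context
    into a single image file (hypothetical JSON layout)."""
    os.makedirs(image_dir, exist_ok=True)
    image = {
        "meta": meta,          # e.g. process identifier, service URL, logs
        "context": context,    # e.g. process state, memory snapshot handle
        "created_at": time.time(),
    }
    path = os.path.join(image_dir, meta["service_name"] + ".img")
    with open(path, "w") as f:
        json.dump(image, f)    # local storage; backend is pluggable
    return path
```
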
  • The AI service model runs the service process to provide services, and an image file including the context information of the running service process and the meta-information of the service model is created, so that when the AI service model is triggered again, it is started by loading the image file, improving model service startup speed and reducing model service startup time.
  • Figure 3 is a schematic flowchart of a method for running a service process according to the control service model of the present disclosure. As shown in Figure 3, the method of the present disclosure includes the following steps.
  • step S301 the running service model is called.
  • step S302 the working instance included in the service model is triggered to load the service model to wake up the service model and run the service process.
  • The AI service model is started, and the AI service model runs a service process to provide the corresponding services. The running AI service model can be called through a script, triggering the work instances included in the AI service model to load the service model.
  • The AI service model can include multiple work instances, and the multiple work instances use the service model in parallel to perform inference to provide services.
  • The running AI service model is called, and the working instances in the AI service model are triggered to load the AI service model, waking the service model to run the service process. The context information of the service process run by the service model and the meta-information of the service model are then obtained, so that the image file corresponding to the service model includes the information of the AI service model, providing a guarantee for starting the AI service model by loading the image file.
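  • The warm-up described above could be sketched as follows: every worker is asked to load the model and then verified with one trial inference before checkpointing. The callbacks `load_model` and `run_trial_inference` are hypothetical names, not APIs from the disclosure.

```python
def warm_up(workers, load_model, run_trial_inference):
    """Wake the service: have every worker instance load the model,
    then verify each can complete one trial inference.

    workers: list of dicts describing instances (keys are assumptions);
    load_model / run_trial_inference: callbacks returning True on success.
    """
    for worker in workers:
        worker["model_loaded"] = load_model(worker)
        # Only attempt inference on workers that loaded successfully.
        worker["inference_ok"] = (worker["model_loaded"]
                                  and run_trial_inference(worker))
    # The service is fully awake when every worker passed its trial.
    return all(w["inference_ok"] for w in workers)
```
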
  • Figure 4 is a schematic flowchart of a method for obtaining context information of a service process running a service model and meta-information of the service model according to the present disclosure. As shown in Figure 4, the method of the present disclosure includes the following steps.
  • step S401 in response to determining that the service model is successfully awakened, context information of the service process run by the service model is monitored.
  • step S402 context information is stored and meta-information of the service model is recorded.
  • the AI service model starts and runs the service process to provide corresponding services.
  • the AI service model includes multiple working instances. The multiple working instances respond to the received request to call the AI service, load the AI service model, and perform inference work based on the loaded AI service model.
  • By polling the multiple work instances included in the AI service model, it is determined whether the AI service model has been awakened successfully.
  • a Checkpoint operation is performed on the AI service model to store the context information of the service process run by the model and record the meta-information of the AI service model.
  • the running AI service model is called, and the AI service model running service process is awakened.
  • The context information of the service process run by the AI service model is stored, and the meta-information of the AI service model is recorded.
  • Figure 5 is a schematic flowchart of a method for obtaining context information of a service process run by the service model and meta-information of the service model according to the present disclosure. As shown in Figure 5, the method of the present disclosure includes the following steps.
  • step S501 if it is determined that more working instances than the number threshold have completed loading of the service model, and inference based on the service model has been completed successfully, it is determined that the service model has been awakened successfully.
  • step S502 the context information of the service process run by the service model is monitored.
  • step S503 context information is stored and meta-information of the service model is recorded.
  • An image file is created that includes the meta-information of the AI service model and the context information of the service process run by the AI service model, so that when the AI service model is started, it can be loaded from the image file instead of reloading the model.
  • the running AI service model is called through a script, triggering multiple working instances included in the AI service model to load the service model.
  • The context information of the service process run by the service model is stored, and the meta-information of the service model is recorded, thereby creating an image file containing the meta-information and context information.
  • The multiple working instances included in the AI service model load the AI service model in response to a received request to call the AI service, and perform inference work based on the loaded AI service model. If it is determined that more working instances than the number threshold have completed loading of the service model, and inference based on the service model has been completed successfully, it is determined that the service model has been awakened successfully.
  • The running AI service model is called, and the AI service model is awakened to run the service process. When it is determined that more working instances than the number threshold have completed loading of the AI service model, and inference based on the AI service model has been completed successfully, the context information of the service process run by the AI service model is stored and its meta-information is recorded.
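  • The threshold test described in this section could be sketched as a simple count over the worker states. The dictionary keys and the strict-greater-than comparison are assumptions made for illustration.

```python
def wake_up_succeeded(workers, count_threshold):
    """The model counts as successfully awakened once the number of
    workers that both loaded the model and completed a trial inference
    exceeds count_threshold (assumed strict comparison)."""
    ready = sum(1 for w in workers
                if w.get("model_loaded") and w.get("inference_ok"))
    return ready > count_threshold
```
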
  • Figure 6 is a schematic flowchart of a method of loading the image file to start the service model according to the present disclosure. As shown in Figure 6, the method of the present disclosure includes the following steps.
  • step S601 the image file is parsed to obtain the context information of the service process and the meta-information of the service model.
  • the AI service model is started and run to provide services.
  • Based on the image file including the meta-information of the AI service model and the context information of the service process run by the AI service model, the AI service model is loaded to restore its service instead of reloading the model. The image file is copied locally, loaded into memory, and parsed to obtain the context information of the AI service process and the meta-information of the AI service model.
  • step S602 based on the meta-information of the service model and the context information of the service process, the target running state of the service process run by the service model is determined.
  • step S603 the service process run by the service model is controlled to provide services in the target operating state.
  • The target operating state of the AI service model is determined from the parsed meta-information of the AI service model and the context information of the service process; that is, the operating state of the AI service model is restored to the state captured in the image file.
  • The AI service model then continues to provide services in the target operating state.
  • When the AI service model needs to be started, there is no need to load the model.
  • The AI service model is restored by reading and parsing the pre-created image file, and the running state of the service process run by the model is restored based on the meta-information of the service model and the context information of the service process.
  • This ensures that the model can effectively provide services, reduces model startup time, and improves model service startup speed.
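  • The restore flow of steps S601-S603 could be sketched as follows. The JSON layout and key names mirror the earlier illustrative sketch and are assumptions; a real restore would resume the actual process, which is only represented here by returning the recovered state.

```python
import json


def restore_from_image(path):
    """Sketch of S601-S603: parse the image file, derive the target
    running state from the saved context, and report the resumed service."""
    with open(path) as f:
        image = json.load(f)
    meta = image["meta"]                      # S601: meta-information
    context = image["context"]                # S601: process context
    target_state = context["process_state"]   # S602: target running state
    # S603: a real system would resume the service process here; this
    # sketch only returns the recovered state.
    return {"service": meta["service_name"], "state": target_state}
```
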
  • Figure 7 is a schematic structural diagram of a system for implementing a method of starting a model service according to the present disclosure.
  • The method of starting model services in the embodiments of the present disclosure can be implemented based on Checkpoint/Restore In Userspace (CRIU).
  • CRIU implements checkpoint and restore functions in user space: it checkpoints a process to files, then applies the backed-up process files to restore the process.
  • the system can include a launch module, a warm up module, a dump module, a restore module, a storage module, etc.
  • the launch module is responsible for installing the environmental dependencies required for the operation of AI services, including underlying hardware drivers, dependency libraries required for runtime, etc., and starting the AI services normally, and providing external services based on HTTP services, RPC services, TCP services, etc.
  • The launch module uses multiple algorithms to launch multiple AI services.
  • the warmup module calls the AI service run by the AI service model through a script, triggering multiple working instances inside the AI service to receive requests to call the running service process to wake up the AI service.
  • The dump module polls to check whether the warm-up module has completed the awakening of the AI service model, that is, whether the multiple instances within the AI service model have completed loading the service model and have completed inference work based on the service model; if so, it performs the checkpoint operation that generates the image file.
  • the storage module stores the image files generated by the checkpoint operation of the dump module, and provides a unified application interface to shield the specific storage details of the backend.
  • the backend can be implemented in various ways such as local storage, distributed storage, and object storage.
  • The recovery module copies the image file corresponding to the AI service model from the storage module to the local machine, loads it into memory, parses it, restores the running state of the service process run by the AI service model, and continues to provide external services.
  • the recovery module can also implement multiple loading of image files corresponding to the AI service model through the above method.
  • The image file corresponding to the AI service model is read instead of loading the model, thereby reducing model startup time and improving model service startup speed, which improves user experience.
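  • Concretely, CRIU exposes dump and restore operations through its command-line front end. The helper below only constructs the argument lists; actually running them requires root privileges and a CRIU installation, and the example PID and directory are placeholders. The flags shown (`-t`, `-D`, `--shell-job`, `--leave-running`) are standard CRIU options.

```python
def criu_dump_cmd(pid, images_dir, leave_running=True):
    """Build a `criu dump` command: checkpoint the process tree rooted
    at `pid` into `images_dir`, optionally leaving it running
    (--leave-running) so the service is not interrupted."""
    cmd = ["criu", "dump", "-t", str(pid), "-D", images_dir, "--shell-job"]
    if leave_running:
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir):
    """Build a `criu restore` command that revives the process from the
    image files previously written to `images_dir`."""
    return ["criu", "restore", "-D", images_dir, "--shell-job"]
```
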
  • embodiments of the present disclosure also provide a device for starting a model service.
  • the device provided by the embodiments of the present disclosure includes hardware structures and/or software modules corresponding to each function.
  • the embodiments of the present disclosure can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving the hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the technical solutions of the embodiments of the present disclosure.
  • FIG. 8 is a block diagram of a device for launching a model service according to the present disclosure.
  • the device 700 for starting a model service in the embodiment of the present disclosure includes: an acquisition module 701 and a startup module 702.
  • the acquisition module 701 is used to obtain the image file corresponding to the service model when the service model is triggered to start, where the image file includes meta-information of the service model and context information of the service process run by the service model;
  • the startup module 702 is used to load the image file to start the service model for service.
  • Figure 9 is a block diagram of a device for launching a model service according to the present disclosure.
  • the device 700 for starting a model service in the embodiment of the present disclosure also includes: a creation module 703.
  • the creation module 703 is used to control the service model to run the service process, obtain the context information of the service process run by the service model and the meta-information of the service model, and create an image file including meta-information and context information.
  • the creation module 703 is also used to: call the running service model; trigger the work instance included in the service model to load the service model to wake up the service model to run the service process.
  • the acquisition module 701 is further configured to: in response to determining that the service model is successfully awakened, monitor the context information of the service process run by the service model; store the context information, and record the meta-information of the service model.
  • the acquisition module 701 is also configured to determine that the service model is successfully awakened if it is determined that the working instances greater than the number threshold complete loading of the service model and inference is successfully completed based on the service model.
  • The startup module 702 is also used to: parse the image file to obtain the context information of the service process and the meta-information of the service model; determine, based on the meta-information of the service model and the context information of the service process, the target running state of the service process run by the service model; and control the service process run by the service model to provide services in the target running state.
  • With the device for starting model services, when the AI service model starts and provides services, the image file corresponding to the AI service model is obtained, and the AI service model is started by loading the image file, thereby reducing model startup time and improving model service startup speed, which improves user experience.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 10 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • The device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800.
  • Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804.
  • An input/output (I/O) interface 805 is also connected to bus 804.
  • The I/O interface 805 includes: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, an optical disk, etc.; and a communication unit 809, such as a network card, a modem, a wireless communication transceiver, etc.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • Computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the computing unit 801 performs various methods and processes described above, such as a method of starting a model service.
  • the method of launching the model service may be implemented as a computer software program that is tangibly embodied in a machine-readable medium, such as the storage unit 808.
  • part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809.
  • the computer program When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method of starting the model service described above may be performed.
  • the computing unit 801 may be configured to perform the method of starting the model service in any other suitable manner (eg, by means of firmware).
  • Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor
  • the processor which may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions and operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media may include electrical connections based on one or more wires, portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LAN), wide area networks (WAN), and the Internet.
  • Computer systems may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact over a communications network.
  • the relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • the server can be a cloud server, a distributed system server, or a server combined with a blockchain.
  • through the present disclosure, when the AI service model is started to provide services, the image file corresponding to the AI service model can be obtained, and the AI service model is started by loading the image file, thereby reducing model startup time and increasing the startup speed of the model service, which improves the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

A method and apparatus for starting a model service, and a device, a medium and a product, which relate to the technical field of computers, and in particular to the technical field of AI platforms. The method comprises: in response to a service model being triggered to start, acquiring an image file corresponding to the service model, wherein the image file comprises meta-information of the service model and context information of a service process run by the service model (S101); and loading the image file to start the service model for a service (S102). By means of the method, an image file corresponding to a service model can be acquired, and the service model is started by loading the image file, such that the time for starting the model is shortened and the speed of starting the model service is increased, thereby improving the user experience.

Description

Method, Apparatus, Device, Medium and Product for Starting a Model Service

Technical Field

The present disclosure relates to the field of computer technology, in particular to the field of AI platform technology, and specifically to a method, apparatus, device, medium and product for starting a model service.

Background

With the advancement of science and technology, artificial intelligence (AI) technology has developed rapidly and is applied in many fields of human life. For example, smart homes, smart wearable devices, virtual assistants, autonomous driving, drones, robots, smart healthcare and smart customer service all fall within the scope of AI technology.

AI services are provided to users through an AI service platform, enabling the deployment and launch of customer-customized AI models. As the number of customers increases, the number of distinct models grows sharply, and the memory resources they occupy become increasingly large.
Summary

The present disclosure provides a method, apparatus, device, storage medium and program product for starting a model service.

According to one aspect of the present disclosure, a method for starting a model service is provided, including: in response to a service model being triggered to start, acquiring an image file corresponding to the service model, where the image file includes meta-information of the service model and context information of the service process run by the service model; and loading the image file to start the service model for service.

According to another aspect of the present disclosure, an apparatus for starting a model service is provided, including: an acquisition module configured to acquire, when a service model is triggered to start, an image file corresponding to the service model, where the image file includes meta-information of the service model and context information of the service process run by the service model; and a startup module configured to load the image file to start the service model.

According to yet another aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method for starting a model service according to any one of the aspects of the present disclosure.

According to yet another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions are used to cause a computer to perform the method for starting a model service according to any one of the aspects of the present disclosure.

According to yet another aspect of the present disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements the method for starting a model service according to any one of the aspects of the present disclosure.

It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.
Brief Description of the Drawings

The accompanying drawings are used for a better understanding of the present solution and do not constitute a limitation of the present disclosure. In the drawings:

Figure 1 is a schematic flowchart of a method for starting a model service according to the present disclosure;

Figure 2 is a schematic flowchart of a method for creating an image file according to the present disclosure;

Figure 3 is a schematic flowchart of a method for controlling a service model to run a service process according to the present disclosure;

Figure 4 is a schematic flowchart of a method for obtaining the context information of the service process run by a service model and the meta-information of the service model according to the present disclosure;

Figure 5 is a schematic flowchart of a method for obtaining the context information of the service process run by a service model and the meta-information of the service model according to the present disclosure;

Figure 6 is a schematic flowchart of a method for loading the image file to start the service model according to the present disclosure;

Figure 7 is a schematic structural diagram of a system implementing the method for starting a model service according to the present disclosure;

Figure 8 is a block diagram of an apparatus for starting a model service according to the present disclosure;

Figure 9 is a block diagram of an apparatus for starting a model service according to the present disclosure;

Figure 10 is a block diagram of an electronic device used to implement the method for starting a model service according to an embodiment of the present disclosure.
Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, including various details of the embodiments of the present disclosure to facilitate understanding; they should be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.

As a branch of computer science, AI seeks to respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing and expert systems. Since the birth of AI technology, its theory and techniques have become increasingly mature, and it has been widely applied in various industries, combining domain business with AI technology to reduce costs and increase efficiency.

To meet broad market demand, customer-customized AI models are deployed and launched through the AI service platform, and the resources occupied by the large number of models are becoming increasingly large. When a model service starts, the model needs to be reloaded, which results in a long startup time.

In current technology, to solve the time-consuming problem of AI model loading, the model can be pruned, quantized and so on to reduce its size and speed up loading. However, pruning and quantizing a model reduces the accuracy of the model service, and pruning and quantization also increase costs. Alternatively, lazy loading can be used when starting the AI model to improve startup speed, for example by not adopting an inter-layer fusion strategy or not using instruction-set optimization. In this way, however, the loading speed of the AI model can only be improved to a limited extent, and the speed of inference with the model is affected, causing traffic loss.

In view of this, embodiments of the present disclosure provide a method for starting a model service: when the AI service is started, the image file corresponding to the AI service model is obtained and loaded, and by loading the image file, the AI model is loaded quickly so that the service starts fast, thereby improving the user experience.
Figure 1 is a schematic flowchart of a method for starting a model service according to the present disclosure. As shown in Figure 1, the method of the present disclosure includes the following steps.

In step S101, in response to a service model being triggered to start, an image file corresponding to the service model is obtained, where the image file includes meta-information of the service model and context information of the service process run by the service model.

In the embodiments of the present disclosure, the service model is described by taking an AI service model as an example. When the AI service model is started, the image file corresponding to the AI service model is obtained, so that the AI service model can be applied normally. The image file includes meta-information of the AI service model; the meta-information records the functions of the AI service model and the conditions it depends on to implement the model inference service, and may include the service process identifier, the service uniform resource locator (URL), and other service logs. The image file corresponding to the AI service model also includes context information of the service process run by the service model, for example, the process of loading the model into memory and the state of the AI service model process.
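As a concrete illustration, the two parts of the image file described above — the service model's meta-information (process identifier, service URL, logs) and the service process context — can be sketched as a single serializable structure. This is a minimal sketch assuming a JSON encoding; the class and field names are illustrative, not the patent's actual on-disk format:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ServiceImage:
    """Hypothetical layout of an image file for a service model:
    meta-information plus the context of its running service process."""
    process_id: int                                       # service process identifier
    service_url: str                                      # service URL (meta-information)
    logs: list = field(default_factory=list)              # other service logs
    process_context: dict = field(default_factory=dict)   # e.g. process/memory state

    def dump(self) -> str:
        # Serialize the image so it can be stored and later reloaded.
        return json.dumps(asdict(self))

    @staticmethod
    def load(blob: str) -> "ServiceImage":
        # Rebuild the image structure from its serialized form.
        return ServiceImage(**json.loads(blob))

image = ServiceImage(1234, "http://localhost:8080/predict",
                     process_context={"state": "running"})
restored = ServiceImage.load(image.dump())
```

Round-tripping through `dump`/`load` mirrors the patent's idea that starting the service only requires reloading this saved state, not the model itself.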
In step S102, the image file is loaded to start the service model for service.

In the embodiments of the present disclosure, when the AI service model is started, the model itself is not loaded; instead, the image file corresponding to the AI service model is obtained, and the AI service model is started to provide services through the obtained image file.

According to the embodiments of the present disclosure, when the AI service model is started to provide services, the image file corresponding to the AI service model is obtained and the AI service model is started by loading the image file, which reduces the model startup time and increases the startup speed of the model service, thereby improving the user experience.
Figure 2 is a schematic flowchart of a method for creating an image file according to the present disclosure. As shown in Figure 2, the method of the present disclosure includes the following steps.

In step S201, the service model is controlled to run a service process.

In the embodiments of the present disclosure, the image file corresponding to the AI service model may be created in advance; it may be created when the AI service model service is started for the first time, or creating the image file may be a prerequisite step before the AI service model provides services, so that the image file can be obtained when the AI service is used. Controlling the AI service model to run a service process includes installing the environment required to run the AI service model, for example configuring the underlying hardware drivers and installing the dependency libraries needed to run the AI service model. After the runtime environment is installed, the AI service model is started to provide the corresponding services. Understandably, the AI service model may provide services based on various protocols, for example Hypertext Transfer Protocol (HTTP) services, Remote Procedure Call (RPC) services, Transmission Control Protocol (TCP) services and so on, which is not limited by the embodiments of the present disclosure.

In step S202, the context information of the service process run by the service model and the meta-information of the service model are obtained.

In the embodiments of the present disclosure, when the AI service model provides services, the context information of the service process run by the service model and the meta-information of the service model can be obtained through a checkpoint operation. Through the ptrace system-call mechanism, a special piece of code is injected into the service process run by the service model and executed, so as to collect the context information of the service process. Understandably, under the ptrace mechanism, a parent process can monitor and control other processes; it can change the registers and kernel image in a child process, enabling breakpoint debugging and system-call tracing. Before a system call is executed, if the current process is in a "traced" state, control is handed over to the tracing process, so that the tracing process can inspect or modify the registers of the traced process. When the checkpoint operation is performed, the context information of the service process run by the AI service model and the meta-information of the service model are dumped for that checkpoint.
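A user-space checkpoint of this kind is what the CRIU tool's `criu dump` command performs: it freezes the process tree rooted at a given PID and writes its memory and kernel state into an images directory. A hedged sketch of assembling such an invocation follows; the wrapper function, paths and flag selection are illustrative choices, though `-t`, `-D`, `--shell-job` and `--leave-running` are real CRIU options:

```python
def build_criu_dump_cmd(pid: int, images_dir: str) -> list:
    """Build a CRIU command line that checkpoints the process tree
    rooted at `pid`, dumping its state into `images_dir`."""
    return ["criu", "dump",
            "-t", str(pid),        # root of the process tree to dump
            "-D", images_dir,      # directory where image files are written
            "--shell-job",         # allow dumping a process attached to a terminal
            "--leave-running"]     # keep the service alive after the checkpoint

# The PID and directory here are placeholders for illustration.
cmd = build_criu_dump_cmd(4321, "/var/lib/model-images/svc-a")
```

In a real deployment the command would be run with root privileges (e.g. via `subprocess.run(cmd, check=True)`); only the command construction is shown here.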
In step S203, an image file including the meta-information and the context information is created.

When the AI service model runs the service, the obtained context information of the service process run by the AI service model and the meta-information of the AI service model are dumped, creating an image file that includes the meta-information and the context information of the AI service model. When storing the image file including the meta-information and the context information, a unified application programming interface can be provided to hide the specific storage details of the backend. The backend may support local storage, distributed storage, object storage and other methods to store the image file.
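The unified storage interface described above can be sketched as a small abstraction over interchangeable backends; only a local-filesystem backend is shown, and all class and method names are assumptions made for illustration:

```python
import os
import tempfile
from abc import ABC, abstractmethod

class ImageStore(ABC):
    """Unified API for storing image files: callers never see whether the
    backend is local storage, distributed storage, or object storage."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...

class LocalImageStore(ImageStore):
    """Local-filesystem backend; other backends would expose the same API."""

    def __init__(self, root: str):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, name: str, data: bytes) -> None:
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(data)

    def get(self, name: str) -> bytes:
        with open(os.path.join(self.root, name), "rb") as f:
            return f.read()

store = LocalImageStore(tempfile.mkdtemp())
store.put("svc-a.img", b"checkpoint-bytes")
```

Swapping in a distributed or object-storage backend would only mean providing another `ImageStore` subclass, which is the point of hiding the backend behind one interface.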
According to the embodiments of the present disclosure, the AI service model runs a service process to provide services, and an image file including the context information of the running service process and the meta-information of the service model is created, so that when the AI service model is triggered to start again, the AI service model is started by loading the image file, which increases the startup speed of the model service and reduces the startup time.

Figure 3 is a schematic flowchart of a method for controlling a service model to run a service process according to the present disclosure. As shown in Figure 3, the method of the present disclosure includes the following steps.

In step S301, the running service model is called.

In step S302, the worker instances included in the service model are triggered to load the service model, so as to wake up the service model to run the service process.

In the embodiments of the present disclosure, the AI service model is started, and the AI service model runs a service process to provide the corresponding services. The running AI service model can be called through a script, triggering the worker instances included in the AI service model to load the service model. The AI service model may include multiple worker instances, which use the service model in parallel for inference in order to provide services.

According to the embodiments of the present disclosure, the running AI service model is called and the worker instances in the AI service model are triggered to load the AI service model, so as to wake up the service model to run the service process, and the context information of the service process run by the service model and the meta-information of the service model are obtained, so that the image file corresponding to the service model includes the information of the AI service model, providing a guarantee for starting the AI service model by loading the image file.
Figure 4 is a schematic flowchart of a method for obtaining the context information of the service process run by a service model and the meta-information of the service model according to the present disclosure. As shown in Figure 4, the method of the present disclosure includes the following steps.

In step S401, in response to determining that the service model has been successfully woken up, the context information of the service process run by the service model is monitored.

In step S402, the context information is stored, and the meta-information of the service model is recorded.

In the embodiments of the present disclosure, the AI service model starts and runs a service process to provide the corresponding services. The running AI service model is called through a script so as to wake it up. The AI service model includes multiple worker instances, which, in response to a received request to call the AI service, load the AI service model and perform inference based on the loaded model. The multiple worker instances included in the AI service model are polled to determine whether the AI service model has been successfully woken up. When it is determined that the AI service model has been successfully woken up, a checkpoint operation is performed on it: the context information of the service process run by the model is stored, and the meta-information of the AI service model is recorded.

According to the embodiments of the present disclosure, the running AI service model is called and woken up to run the service process; when it is determined that the AI service model has been successfully woken up, the context information of the service process run by the AI service model is stored and the meta-information of the AI service model is recorded, so as to create the image file corresponding to the AI service model, providing a guarantee for starting the AI service model by loading the image file.
Figure 5 is a schematic flowchart of a method for obtaining the context information of the service process run by the service model and the meta-information of the service model according to the present disclosure. As shown in Figure 5, the method of the present disclosure includes the following steps.

In step S501, if it is determined that more than a threshold number of worker instances have completed loading of the service model and have successfully completed inference based on the service model, it is determined that the service model has been successfully woken up.

In step S502, the context information of the service process run by the service model is monitored.

In step S503, the context information is stored, and the meta-information of the service model is recorded.

In the embodiments of the present disclosure, an image file including the meta-information of the AI service model and the context information of the service process run by the AI service model is created, so that when the AI service model is started, the image file can be loaded instead of the AI service model itself, enabling services to be provided through the AI service model. When the image file is created, the running AI service model is called through a script, triggering the multiple worker instances included in the AI service model to load the service model. When polling determines that the AI service model has been successfully woken up, the context information of the service process run by the service model is stored and the meta-information of the service model is recorded, thereby creating an image file of the meta-information and the context information. The multiple worker instances included in the AI service model, in response to a received request to call the AI service, load the AI service model and perform inference based on the loaded model. When it is determined that more than a threshold number of worker instances have completed loading of the service model and have successfully completed inference based on the service model, it is determined that the service model has been successfully woken up.
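The readiness check in step S501 — poll the worker instances and treat the model as successfully woken once more than a threshold number have both loaded the model and completed a trial inference — can be sketched as follows; the `Worker` stub and the threshold value are illustrative assumptions:

```python
class Worker:
    """Stub standing in for one worker instance inside the service model."""
    def __init__(self, loaded: bool, inferred: bool):
        self.loaded = loaded       # has this instance finished loading the model?
        self.inferred = inferred   # did inference based on the model succeed?

def model_woken(workers, threshold: int) -> bool:
    # Count instances that completed loading AND a successful inference;
    # the model counts as woken when that count exceeds the threshold.
    ready = sum(1 for w in workers if w.loaded and w.inferred)
    return ready > threshold

workers = [Worker(True, True), Worker(True, True), Worker(True, False)]
ok = model_woken(workers, threshold=1)   # 2 ready instances > 1
```

In the patented flow this predicate would gate the checkpoint: the dump is only taken once `model_woken` holds, so the image captures a fully warmed-up service.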
According to the embodiments of the present disclosure, the running AI service model is called and woken up to run the service process; when more than a threshold number of worker instances have completed loading of the AI service model and have successfully completed inference based on it, the context information of the service process run by the AI service model is stored and the meta-information of the AI service model is recorded, so as to fully record the service process information of all worker instances of the AI service model and create the image file corresponding to the AI service model, providing a guarantee for starting the AI service model by loading the image file.
Figure 6 is a schematic flowchart of a method for loading the image file to start the service model according to the present disclosure. As shown in Figure 6, the method of the present disclosure includes the following steps.

In step S601, the image file is parsed to obtain the context information of the service process and the meta-information of the service model.

In the embodiments of the present disclosure, the AI service model is started and run to provide services. By obtaining and loading the image file, which includes the meta-information of the AI service model and the context information of the service process run by the AI service model, instead of loading the AI service model itself, the service of the AI service model is restored. The image file is copied locally, loaded into memory and parsed, yielding the context information of the AI service process and the meta-information of the AI service model.
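Step S601 — load the locally copied image into memory and split it into its two parts — can be sketched as follows; the JSON layout and the `meta`/`context` key names are assumptions made for illustration, not the patent's actual image format:

```python
import json

def parse_image(blob: bytes):
    """Parse a (hypothetical, JSON-encoded) image file into the service
    model's meta-information and the service process's context information."""
    doc = json.loads(blob)
    return doc["meta"], doc["context"]

# A toy image standing in for the file copied from the storage backend.
blob = json.dumps({
    "meta": {"process_id": 1234, "service_url": "http://localhost:8080"},
    "context": {"state": "running"},
}).encode()

meta, context = parse_image(blob)
```

The two returned pieces feed directly into steps S602/S603: the context determines the target running state, and the meta-information identifies the service being restored.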
在步骤S602中,基于服务模型的元信息以及服务进程的上下文信息,确定服务模型所运行服务进程的目标运行状态。In step S602, based on the meta-information of the service model and the context information of the service process, the target running state of the service process run by the service model is determined.
在步骤S603中,控制服务模型所运行的服务进程,在目标运行状态下提供服务。In step S603, the service process run by the service model is controlled to provide services in the target operating state.
在本公开实施例中,通过解析得到的AI服务模型的元信息,以及服务进程的上下文信息,确定AI服务模型的目标运行状态,即将AI服务模型的运行状态恢复至镜像文件中包括的AI服务模型运行服务进程的运行状态。AI服务模型基于目标运行状态,继续提供服务。In this embodiment of the present disclosure, the target operating state of the AI service model is determined by parsing the obtained meta-information of the AI service model and the context information of the service process, that is, restoring the operating state of the AI service model to the AI service included in the image file. The running status of the model running service process. The AI service model continues to provide services based on the target operating status.
根据本公开实施例,在需要启动AI服务模型时,无需加载模型,通过读取、解析预先创建的镜像文件,恢复AI服务模型,并基于服务模型的元信息以及服务进程的上下文信息,恢复服务模型所运行服务进程的运行状态,确保模型能够有效地提供服务,减少模型启动时间,提高模型服务启动速度。According to the embodiments of the present disclosure, when the AI service model needs to be started, there is no need to load the model. The AI service model is restored by reading and parsing the pre-created image file, and the service is restored based on the meta-information of the service model and the context information of the service process. The running status of the service process run by the model ensures that the model can effectively provide services, reduce model startup time, and improve model service startup speed.
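A minimal sketch of steps S601 and S602 — splitting the stored record into model meta-information and process context, then deriving the target running state — might look like the following. The JSON layout and every field name (`model_meta`, `process_context`, `pid`, `state`) are assumptions made for illustration only; the disclosure does not specify a storage format.

```python
import json


def parse_image_record(raw):
    """Step S601 (sketch): split the stored record into the model's
    meta-information and the per-process context information."""
    record = json.loads(raw)
    return record["model_meta"], record["process_context"]


def target_state(model_meta, process_context):
    """Step S602 (sketch): the target running state is simply the state
    the service processes were in when the image file was created."""
    return {
        "service_url": model_meta["service_url"],
        "processes": [(p["pid"], p["state"]) for p in process_context],
    }


raw = json.dumps({
    "model_meta": {"service_url": "http://localhost:8080/predict",
                   "log": "/var/log/svc.log"},
    "process_context": [{"pid": 101, "state": "running"},
                        {"pid": 102, "state": "running"}],
})

meta, ctx = parse_image_record(raw)
state = target_state(meta, ctx)
print(state["processes"])  # [(101, 'running'), (102, 'running')]
```

Step S603 would then hand this target state to the restore mechanism so the service processes resume exactly where the checkpoint left them.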
Figure 7 is a schematic structural diagram of a system implementing the method of starting a model service according to the present disclosure. As shown in Figure 7, the method of starting a model service in the embodiment of the present disclosure can be implemented based on Checkpoint and Restore in Userspace (CRIU). CRIU implements checkpoint and restore functionality in user space: it checkpoints a process to files, and restores the process by applying the process files obtained from the backup. The system can include a launch module, a warm-up module, a dump module, a restore module, a storage module, and so on. The launch module is responsible for installing the environment dependencies required for the AI service to run, including the underlying hardware drivers and the dependency libraries needed at runtime, starting the AI service normally, and providing external service over HTTP, RPC, TCP, and so on. The launch module can start multiple AI services, one for each of multiple algorithms.

When the launch module is running normally, the warm-up module calls, via a script, the AI service run by the AI service model, triggering the multiple working instances inside the AI service to receive requests that invoke the running service processes, thereby waking up the AI service. The dump module polls whether the warm-up module has finished awakening the AI service model, i.e., whether the multiple instances inside the AI service model have all completed loading of the service model and completed inference based on it. After confirming that the AI service model is fully awakened, the dump module performs a checkpoint operation on the AI service: the context information of the service processes in the AI service process tree run by the AI service model is stored in the storage module, and the meta-information of the AI service model is recorded (the meta-information may include the service process ID, the service URL, the service log, and so on) to create an image file. The storage module stores the image files generated by the dump module's checkpoint operation; by providing a unified application programming interface, it hides the specific storage details of the backend, which can be implemented in various ways such as local storage, distributed storage, or object storage. The restore module copies the image file corresponding to the AI service model from the storage module to the local machine, loads it into memory, parses it, restores the running state of the service processes run by the AI service model, and continues to provide external service. The restore module can also load the image file corresponding to the AI service model multiple times in this way.
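As a concrete illustration of how the dump and restore modules might drive CRIU, the sketch below builds the corresponding `criu` command lines. It only constructs the argument lists; actually running them requires CRIU to be installed with sufficient privileges, and the exact flag set chosen here is an assumption of this sketch, not something specified by the disclosure.

```python
def criu_dump_cmd(pid, images_dir):
    # Checkpoint the process tree rooted at `pid` into `images_dir`;
    # --leave-running keeps the AI service alive after checkpointing,
    # matching the dump module's behaviour described above.
    return ["criu", "dump", "--tree", str(pid),
            "--images-dir", images_dir,
            "--shell-job", "--leave-running"]


def criu_restore_cmd(images_dir):
    # Restore the service processes from the stored image files and
    # detach, so they keep serving in the background.
    return ["criu", "restore", "--images-dir", images_dir,
            "--shell-job", "--restore-detached"]


print(" ".join(criu_dump_cmd(4321, "/var/lib/ai-svc/images")))
# criu dump --tree 4321 --images-dir /var/lib/ai-svc/images --shell-job --leave-running
```

In the system of Figure 7, a `subprocess` call wrapping these commands would sit inside the dump and restore modules, while the storage module moves the contents of the images directory between storage backends.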
According to the method of starting a model service of the embodiment of the present disclosure, when the AI service model is started, the image file corresponding to the AI service model is read instead of loading the model, which reduces model startup time and speeds up model service startup, thereby improving the user experience.

Based on the same concept, embodiments of the present disclosure also provide a device for starting a model service.

It can be understood that, in order to implement the above functions, the device provided by the embodiments of the present disclosure includes the hardware structures and/or software modules corresponding to each function. Combined with the units and algorithm steps of the examples disclosed in the embodiments of the present disclosure, the embodiments of the present disclosure can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the technical solutions of the embodiments of the present disclosure.
Figure 8 is a block diagram of a device for starting a model service according to the present disclosure.

As shown in Figure 8, the device 700 for starting a model service in the embodiment of the present disclosure includes: an acquisition module 701 and a startup module 702.

The acquisition module 701 is configured to, when the service model is triggered to start, obtain the image file corresponding to the service model, where the image file includes the meta-information of the service model and the context information of the service process run by the service model.

The startup module 702 is configured to load the image file to start the service model to provide services.
Figure 9 is a block diagram of a device for starting a model service according to the present disclosure.

As shown in Figure 9, the device 700 for starting a model service in the embodiment of the present disclosure further includes: a creation module 703.

The creation module 703 is configured to control the service model to run a service process, obtain the context information of the service process run by the service model and the meta-information of the service model, and create an image file including the meta-information and the context information.

In an exemplary embodiment of the present disclosure, the creation module 703 is further configured to: invoke the running service model; and trigger a working instance included in the service model to load the service model, so as to wake up the service model to run a service process.

In an exemplary embodiment of the present disclosure, the acquisition module 701 is further configured to: in response to determining that the service model is successfully awakened, monitor the context information of the service process run by the service model; and store the context information and record the meta-information of the service model.

In an exemplary embodiment of the present disclosure, the acquisition module 701 is further configured to: if it is determined that more working instances than a number threshold have completed loading of the service model, and inference has been successfully completed based on the service model, determine that the service model has been successfully awakened.

In an exemplary embodiment of the present disclosure, the startup module 702 is further configured to: parse the image file to obtain the context information of the service process and the meta-information of the service model; determine, based on the meta-information of the service model and the context information of the service process, the target running state of the service process run by the service model; and control the service process run by the service model to provide services in the target running state.

In summary, according to the device for starting a model service of the embodiment of the present disclosure, when the AI service model is started to provide services, the image file corresponding to the AI service model is obtained, and the AI service model is started by loading the image file, which reduces model startup time and speeds up model service startup, thereby improving the user experience.
In the technical solution of the present disclosure, the acquisition, storage, and application of the user personal information involved comply with the relevant laws and regulations and do not violate public order and good morals.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.

Figure 10 shows a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only, and are not intended to limit the implementations of the disclosure described and/or claimed herein.

As shown in Figure 10, the device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disc; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver. The communication unit 809 allows the device 800 to exchange information/data with other devices over computer networks such as the Internet and/or various telecommunications networks.

The computing unit 801 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, and so on. The computing unit 801 performs the methods and processes described above, such as the method of starting a model service. For example, in some embodiments, the method of starting a model service may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method of starting a model service described above can be performed. Alternatively, in other embodiments, the computing unit 801 may be configured, in any other suitable manner (for example, by means of firmware), to perform the method of starting a model service.
Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing device, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, partly on the machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system can include clients and servers. A client and a server are generally remote from each other and typically interact over a communication network. The client-server relationship arises by virtue of computer programs that run on the respective computers and have a client-server relationship with each other. The server can be a cloud server, a server of a distributed system, or a server combined with a blockchain.
According to the technical solutions provided by the embodiments of the present disclosure, when the AI service model is started to provide services, the image file corresponding to the AI service model is obtained, and the AI service model is started by loading the image file, which reduces model startup time and speeds up model service startup, thereby improving the user experience.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order; as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, no limitation is imposed herein.

The above specific embodiments do not constitute a limitation on the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims (15)

  1. A method of starting a model service, comprising:
    in response to a service model being triggered to start, obtaining an image file corresponding to the service model, wherein the image file includes meta-information of the service model and context information of a service process run by the service model; and
    loading the image file to start the service model to provide a service.
  2. The method according to claim 1, wherein the image file is created in the following manner:
    controlling the service model to run a service process;
    obtaining context information of the service process run by the service model and meta-information of the service model; and
    creating an image file including the meta-information and the context information.
  3. The method according to claim 2, wherein controlling the service model to run a service process comprises:
    invoking the running service model; and
    triggering a working instance included in the service model to load the service model, so as to wake up the service model to run a service process.
  4. The method according to claim 2 or 3, wherein obtaining the context information of the service process run by the service model and the meta-information of the service model comprises:
    in response to determining that the service model is successfully awakened, monitoring the context information of the service process run by the service model; and
    storing the context information, and recording the meta-information of the service model.
  5. The method according to claim 4, wherein determining that the service model is successfully awakened comprises:
    if it is determined that more working instances than a number threshold have completed loading of the service model, and inference has been successfully completed based on the service model, determining that the service model has been successfully awakened.
  6. The method according to claim 5, wherein loading the image file to start the service model comprises:
    parsing the image file to obtain the context information of the service process and the meta-information of the service model;
    determining, based on the meta-information of the service model and the context information of the service process, a target running state of the service process run by the service model; and
    controlling the service process run by the service model to provide a service in the target running state.
  7. A device for starting a model service, comprising:
    an acquisition module, configured to, when a service model is triggered to start, obtain an image file corresponding to the service model, wherein the image file includes meta-information of the service model and context information of a service process run by the service model; and
    a startup module, configured to load the image file to start the service model to provide a service.
  8. The device according to claim 7, wherein the device further comprises:
    a creation module, configured to control the service model to run a service process, obtain context information of the service process run by the service model and meta-information of the service model, and create an image file including the meta-information and the context information.
  9. The device according to claim 8, wherein the creation module is further configured to:
    invoke the running service model; and
    trigger a working instance included in the service model to load the service model, so as to wake up the service model to run a service process.
  10. The device according to claim 8 or 9, wherein the acquisition module is further configured to:
    in response to determining that the service model is successfully awakened, monitor the context information of the service process run by the service model; and
    store the context information, and record the meta-information of the service model.
  11. The device according to claim 10, wherein the acquisition module is further configured to:
    if it is determined that more working instances than a number threshold have completed loading of the service model, and inference has been successfully completed based on the service model, determine that the service model has been successfully awakened.
  12. The device according to claim 11, wherein the startup module is further configured to:
    parse the image file to obtain the context information of the service process and the meta-information of the service model;
    determine, based on the meta-information of the service model and the context information of the service process, a target running state of the service process run by the service model; and
    control the service process run by the service model to provide a service in the target running state.
  13. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of starting a model service according to any one of claims 1-6.
  14. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method of starting a model service according to any one of claims 1-6.
  15. A computer program product, comprising a computer program that, when executed by a processor, implements the method of starting a model service according to any one of claims 1-6.
PCT/CN2022/105180 2022-03-10 2022-07-12 Method and apparatus for starting model service, and device, medium and product WO2023168875A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210238004.8A CN114706622B (en) 2022-03-10 2022-03-10 Method, device, equipment, medium and product for starting model service
CN202210238004.8 2022-03-10

Publications (1)

Publication Number Publication Date
WO2023168875A1 true WO2023168875A1 (en) 2023-09-14

Family

ID=82169401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/105180 WO2023168875A1 (en) 2022-03-10 2022-07-12 Method and apparatus for starting model service, and device, medium and product

Country Status (2)

Country Link
CN (1) CN114706622B (en)
WO (1) WO2023168875A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114706622B (en) * 2022-03-10 2023-08-18 北京百度网讯科技有限公司 Method, device, equipment, medium and product for starting model service

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102506A (en) * 2014-04-25 2014-10-15 华南理工大学 ARM (Advanced RISC Machine) platform based Android startup acceleration method
CN104216776A (en) * 2014-08-25 2014-12-17 成都三零凯天通信实业有限公司 Quick starting method for Android operating system based on BLCR (Berkeley lab checkpoint restart) technology
CN109189480A (en) * 2018-07-02 2019-01-11 新华三技术有限公司成都分公司 File system starts method and device
CN110716758A (en) * 2018-06-26 2020-01-21 阿里巴巴集团控股有限公司 Program running method, device, equipment and storage medium
CN111930429A (en) * 2020-07-07 2020-11-13 上海商米科技集团股份有限公司 Method for quickly starting Android operating system and electronic equipment
CN112416467A (en) * 2020-12-10 2021-02-26 上海维宏电子科技股份有限公司 Control system, method, device, processor and computer readable storage medium for realizing rapid starting and initialization of numerical control system
CN113885967A (en) * 2021-10-22 2022-01-04 北京字跳网络技术有限公司 Method, device, equipment and medium for starting small program
CN114064190A (en) * 2020-07-30 2022-02-18 华为技术有限公司 Container starting method and device
CN114706622A (en) * 2022-03-10 2022-07-05 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for starting model service

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US8423604B2 (en) * 2008-08-29 2013-04-16 R. Brent Johnson Secure virtual tape management system with balanced storage and multi-mirror options
GB2525409B (en) * 2014-04-24 2016-11-02 Ibm Enabling an external operating system to access encrypted data units of a data storage system
CN105446750B (en) * 2014-05-30 2019-12-03 阿里巴巴集团控股有限公司 Method and apparatus for starting and running a WebApp and generating an image file
CN105763602B (en) * 2016-01-29 2017-12-01 腾讯科技(深圳)有限公司 Data request processing method, server and system
US10860393B2 (en) * 2017-04-22 2020-12-08 Nicira, Inc. Tracking driver load and unload on windows OS
CN107979493B (en) * 2017-11-21 2019-10-29 平安科技(深圳)有限公司 Construction method, server and storage medium for a platform-as-a-service (PAAS) container platform
CN112035228B (en) * 2020-08-28 2024-04-12 光大科技有限公司 Resource scheduling method and device
CN113900721A (en) * 2021-10-15 2022-01-07 北京字节跳动网络技术有限公司 Operating system starting method and device and electronic equipment


Also Published As

Publication number Publication date
CN114706622A (en) 2022-07-05
CN114706622B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
KR102493449B1 (en) Edge computing test methods, devices, electronic devices and computer-readable media
CN107491329B (en) Docker mirror image construction method, device, storage medium and electronic device
US9928050B2 (en) Automatic recognition of web application
CN110058922B (en) Method and device for extracting metadata of machine learning task
CN105765534A (en) Virtual computing systems and methods
US10417011B2 (en) Thread-agile execution of dynamic programming language programs
EP4209894A1 (en) Cloud code development system, method, and apparatus, device, and storage medium
WO2023168875A1 (en) Method and apparatus for starting model service, and device, medium and product
CN112506854A (en) Method, device, equipment and medium for storing page template file and generating page
CN111078269A (en) Version management method, version recovery method, editor, system and storage medium
US11163651B1 (en) Automated data restore
CN113051055A (en) Task processing method and device
US9384120B2 (en) Testing of transaction tracking software
US20230023290A1 (en) Method for managing function based on engine, electronic device and medium
US10621163B2 (en) Tracking and reusing function results
US10558528B2 (en) Reconstruction of system definitional and state information
EP3995973A2 (en) File moving method, apparatus, and medium
US11797770B2 (en) Self-improving document classification and splitting for document processing in robotic process automation
CN112486460A (en) Method, system, device and medium for automatically importing interface document
US11836472B2 (en) System and method for browser based polling
WO2024000917A1 (en) Non-intrusive, pluggable front-end build transfer acceleration method and apparatus
US20210286680A1 (en) Method, electronic device and computer program product for storage management
US20240168868A1 (en) Method and system for optimizing testing and deployment of an application
US20230012881A1 (en) Method and apparatus for reading data, electronic device and storage medium
CN112491940B (en) Request forwarding method and device of proxy server, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22930504

Country of ref document: EP

Kind code of ref document: A1