CN114327530A - Method and device for updating model file and computing equipment

Publication number: CN114327530A
Application number: CN202011073718.5A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 周新中, 蔡永锦
Assignee: Huawei Technologies Co Ltd (application filed by Huawei Technologies Co Ltd)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classification: Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract

The application discloses a method, an apparatus, and a computing device for updating a model file, relating to the field of cloud computing, and addresses the problem of how to avoid updating the container image, and thereby shorten the update time, when a model file is updated. In the method, while an artificial intelligence application runs in a container environment, after the system obtains a first model file, it replaces the second model file with the first model file; that is, the second model file associated with the container image is replaced by the first model file, and the artificial intelligence application then runs in the container environment using the first model file. The parameters of the artificial intelligence model described by the first model file and the second model file are different.

Description

Method and device for updating model file and computing equipment
Technical Field
The embodiments of the application relate to the field of cloud computing, and in particular to a method, an apparatus, and a computing device for updating a model file.
Background
Currently, cloud nodes or edge nodes may run artificial intelligence (AI) applications in a container environment. A cloud node stores container images and runs containers from them; an edge node may obtain a container image from a cloud node to run a container. The container image contains the model file, the application, and the associated files needed to run the container. The model file is the key content of the AI application: it describes information such as the graph structure, parameters, and weights of the artificial intelligence model required to run the AI application. After many rounds of AI inference, the artificial intelligence model can be retrained with the data obtained during inference to produce a new model file, which describes the graph structure, parameters, weights, and other information of the optimized model. When a stored model file is updated with a new model file, the other files in the container image are also updated, resulting in a long update time. How to avoid updating the container image when updating the model file is therefore an urgent problem to be solved.
Disclosure of Invention
The application provides a method and an apparatus for updating a model file, addressing the problem of how to avoid updating the container image, and thereby shorten the update time, when a model file is updated.
In a first aspect, the application provides a method for updating a model file. The method is applicable to a service device and specifically includes the following steps: when the container image is associated with a second model file and the service device runs the AI application in a container environment, the service device acquires a first model file, replaces the second model file with the first model file so that the container image becomes associated with the first model file, and runs the AI application with the first model file in the container environment.
In this way, the service device stores the model file independently, so the model file is separated from the container image; when the model file needs to be updated, only the model file is updated, and the whole container image no longer needs to be updated. The time for updating the model file is therefore effectively shortened, and the update efficiency is improved.
The first model file and the second model file are different. They may describe different values of the parameters of the same artificial intelligence model, or they may describe the parameters of different artificial intelligence models.
In one possible implementation, replacing the second model file with the first model file includes: the service device replaces the second model file with the first model file according to a replacement message. The replacement message contains the name of the first model file, the storage path of the second model file, and the name of the second model file. After the replacement, the first model file is associated with the container image; it should be understood that the container image itself does not contain the first model file. Because the first model file is associated with the container image, the service device can read it when running the container from the container image, so the container does not need to be restarted: the model file is updated without the user perceiving any interruption, which effectively improves the user experience.
In another possible implementation manner, the service device acquires the first model file according to the location, indicated by the first address, where the first model file is stored.
In another possible implementation, before obtaining the first model file, the method further includes: after acquiring the second model file from the location indicated by a second address, the service device receives an application deployment message, associates the storage path of the second model file with a target directory in the container image, acquires the container image from the location indicated by a third address, runs the container using the container image, and runs the AI application with the second model file in the container environment. It should be understood that the container image does not contain the second model file. The application deployment message includes the name of the second model file, the storage path of the second model file, the target directory in the container image, and the third address.
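As a minimal sketch of this deployment step (the field names and the `fetch_image` callback below are illustrative assumptions, not defined by the application), the service device's handling of the application deployment message might look like:

```python
# Hypothetical layout of the application deployment message; the field
# names are illustrative assumptions.
app_deployment_message = {
    "model_name": "model.om",                  # name of the second model file
    "model_storage_path": "/opt/models/uid/",  # storage path on the service device
    "target_dir": "/etc/models/",              # target directory in the container image
    "third_address": "http://example.invalid/images/ai-app",  # where the image is stored
}

def handle_deployment(msg, fetch_image):
    """Sketch: fetch the container image from the third address, then
    associate the model storage path with the target directory by
    declaring a bind mount in the container specification."""
    image = fetch_image(msg["third_address"])   # obtain the container image
    bind_mount = (msg["model_storage_path"], msg["target_dir"])
    # The returned spec would be used to run the container; note that the
    # model file itself is not part of the image.
    return {"image": image, "mounts": [bind_mount]}
```

The point of the sketch is that the image reference and the model location travel in the same message but remain separate artifacts, so either can later be refreshed on its own.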
In a second aspect, the application provides a method for updating a model file. The method is applicable to a management device and specifically includes the following steps: after the management device obtains the first model file sent by the terminal device, it sends the first model file to the service device so that the service device can obtain the model file to be updated. After the management device receives the replacement message sent by the terminal device, it sends the replacement message to the service device so that the service device can put the first model file into effect. The replacement message indicates that the second model file is to be replaced with the first model file. The first model file describes parameters of the artificial intelligence model, and the parameters described by the first model file and the second model file are different.
In this way, the management device sends the model file to the service device separately, so the service device stores the model file independently and the model file is separated from the container image; when the model file needs to be updated, only the model file is updated, and the whole container image does not need to be updated. The time for updating the model file is therefore effectively shortened, and the update efficiency is improved.
In one possible implementation, the replacement message is used to indicate that the first model file is associated with the container image. The replacement message contains the name of the first model file, the storage path of the second model file, and the name of the second model file. It should be understood that the container image does not contain the first model file.
In another possible implementation, before sending the first model file, the method further includes: the management device acquires and stores the first model file, and sends a first address to the service device according to an update message sent by the terminal device, where the first address indicates the location where the first model file is stored. The service device then obtains the first model file according to the first address.
In another possible implementation, the method further includes: after the management device acquires the second model file and the container image, if it receives a first application deployment message sent by the terminal device, it sends a second address, a second application deployment message, the second model file, and the container image to the service device. The first application deployment message includes the name of the second model file and the target directory in the container image. The second address indicates the location where the second model file is stored. The second application deployment message includes the name of the second model file, the storage path of the second model file, the target directory in the container image, and a third address, which indicates the location where the container image is stored. It should be understood that the container image does not contain the second model file.
In another possible implementation, the first application deployment message and the replacement message are messages indicated by the user. The user operates the display interface of the terminal device, and the terminal device, in response to the user's operation, sends the first application deployment message and the replacement message to the management device, which receives them.
In a third aspect, a service apparatus is provided, the apparatus comprising means for performing the method of updating a model file in the first aspect or any one of the possible designs of the first aspect.
In a fourth aspect, there is provided a management apparatus comprising means for performing the method of updating a model file in the second aspect or any one of the possible designs of the second aspect.
In a fifth aspect, a computing device is provided, comprising at least one processor and a memory for storing a set of computer instructions; when the set of computer instructions is executed by the processor, the operational steps of the method of updating a model file in the first aspect or any one of its possible implementations are performed.
In a sixth aspect, a computing device is provided, comprising at least one processor and a memory for storing a set of computer instructions; when the set of computer instructions is executed by the processor, the operational steps of the method of updating a model file in the second aspect or any one of its possible implementations are performed.
In a seventh aspect, a computer-readable storage medium is provided, comprising: computer software instructions; the computer software instructions, when executed in a service device, cause the service device to perform the operational steps of the method as described in the first aspect or any one of the possible implementations of the first aspect.
In an eighth aspect, there is provided a computer-readable storage medium comprising: computer software instructions; the computer software instructions, when executed in the management apparatus, cause the management apparatus to perform the operational steps of the method as described in the second aspect or any one of the possible implementations of the second aspect.
A ninth aspect provides a computer program product which, when run on a computer, causes the computer to perform the operational steps of the method described in the first aspect or any one of its possible implementations.
A tenth aspect provides a computer program product which, when run on a computer, causes the computer to perform the operational steps of the method described in the second aspect or any one of its possible implementations.
On the basis of the implementations provided by the above aspects, the present application can be further combined to provide more implementations.
Drawings
Fig. 1 is a schematic diagram of a cloud data center provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a cloud environment provided by an embodiment of the present application;
FIG. 3 is a flowchart of a method for deploying a model file according to an embodiment of the present application;
FIG. 4 is a flowchart of updating a model file according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an interface for deploying a model file according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an interface for updating a model file according to an embodiment of the present application;
fig. 7 is a schematic composition diagram of a service device according to an embodiment of the present application;
fig. 8 is a schematic composition diagram of a management device according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a computing device according to an embodiment of the present application;
fig. 10 is a schematic composition diagram of another computing device provided in the embodiments of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence research mainly includes robotics, speech recognition, image recognition, natural language processing, and the like.
In general, a cloud service provider may abstract the functions of an artificial intelligence model into a cloud service and deploy the cloud service in a cloud data center. Users can consult and purchase the cloud service through the cloud service platform. After purchasing the cloud service, a user can upload data to the cloud data center through a terminal device, and the cloud data center runs the artificial intelligence model to obtain an inference result.
Optionally, a software provider may also package the functions of an artificial intelligence model as a software package. After purchasing the software package, the user deploys it on the user's own server or on a cloud server. For example, a tenant purchases a cloud service for computing resources provided by a cloud service provider through the cloud service platform, deploys the artificial intelligence model in the rented computing resources (for example, a virtual machine) of the cloud data center, runs the model in the purchased computing resources, and performs inference operations to obtain inference results.
The cloud data center is an entity that provides cloud services to users by using basic resources in a cloud computing mode. Fig. 1 is a schematic diagram of a cloud data center provided in an embodiment of the present application. Cloud data center 100 includes a pool of device resources owned by a cloud service provider (including computing resources 110, storage resources 120, and network resources 130). The computing resources 110 may be computing devices (e.g., servers). A business module 140, which implements the functions of an artificial intelligence model, is deployed on a server or a virtual machine in the cloud data center 100. The business module may be distributed across multiple servers, multiple virtual machines, or a combination of virtual machines and servers.
A cloud service platform 150 may also be deployed on a server or a virtual machine in the cloud data center 100. A client 210 (e.g., a browser or an application) is deployed on the terminal device 200. A client, which corresponds to a server, is a program that provides local services for the user. The user can access the cloud service platform 150 through the client 210 to consult and purchase cloud services. For example, suppose the cloud service is AI inference, with an AI inference module deployed on a server or a virtual machine in the cloud data center 100. After the user accesses the cloud service platform 150 through the terminal device 200 and purchases the AI inference cloud service, data is uploaded to the cloud service platform 150 through the terminal device 200; the cloud data center 100 runs the business module 140 to obtain an AI inference result and feeds the result back to the terminal device 200.
Cloud data center 100 also includes a management module 160, which implements lifecycle management of artificial intelligence applications, including but not limited to deployment, upgrade, operation and maintenance, and monitoring.
Fig. 2 is a schematic diagram of a cloud environment according to an embodiment of the present application. The difference from Fig. 1 is that the management module 160 can deploy the business module 140 on one or more edge devices in the edge environment 300, so that the user can obtain inference results with lower latency. An edge environment is a data processing environment closer to the terminal devices that provide the data; it includes one or more edge devices. For example, an edge device may be a roadside device with computing capability installed beside a traffic road, and the terminal device may be a camera.
The management module 160 is further configured to implement management and operation and maintenance of the edge devices and of the edge applications, and to establish a data channel between the edge nodes and the cloud data center.
Optionally, the management module 160 may deploy the business module 140 to a plurality of edge devices in the edge environment 300 and manage the lifecycle of the business modules 140 on those edge devices.
It should be understood that the business module 140, the cloud service platform 150, and the management module 160 may be functional modules running on different servers. For example, the device on which the business module 140 is deployed may be referred to as a service server, and the device on which the management module 160 is deployed may be referred to as a management server. Alternatively, the business module 140 and the management module 160 may be deployed on one server or on two different servers, without limitation.
Since many applications are deployed in the cloud data center 100, if the cloud data center 100 runs multiple applications at the same time, their processes contend for resources. Generally, different applications can be run in isolation from one another using container technology, which improves resource utilization and service deployment efficiency in the cloud data center 100.
A container is instantiated from a container image and may be understood as a process. One or more applications run in each container, and different containers are isolated from each other. A container image is the file system from which a container runs: the processes in the container depend on the files in the container image, which include the application and the executable files, dependency files, library files, configuration files, and so on required to run the container. For example, assume the business module 140 implements an AI inference function; conventionally, its container image comprises the AI application, the model file of the AI application, and the executable files, dependency files, library files, configuration files, etc. required to run the container.
Generally, because the container image contains the model file, the other files in the container image (such as the AI application) are also updated when the model file is updated, resulting in a long update time. To solve this problem, an embodiment of the present application provides a method for updating a model file in which the model file is stripped from the container image: when the AI application is deployed, the model file and the container image are deployed separately, and when the model file is updated, only the model file is downloaded and the model file associated with the container image is replaced with the new one. The parameters of the artificial intelligence model described by the first model file differ from those described by the second model file. Since the whole container image does not need to be updated, the time for updating the model file is effectively shortened and the update efficiency is improved.
In addition, in the conventional technique, updating the container image requires restarting the container, which interrupts the AI application service the container is running. Moreover, model files are updated very frequently, so service interruptions are numerous, degrading the user experience. The method for updating a model file provided here only updates the model file and does not restart the container, thereby avoiding service interruptions and effectively improving the user experience.
Next, the methods for deploying and updating a model file provided in this embodiment are described in detail with reference to the drawings. Fig. 3 is a flowchart of deploying a model file according to this embodiment. The description uses a service device and a management device as an example. The service device may be a service server or an edge device on which the business module 140 implementing the AI inference function is deployed; the service device may also include a model management module for managing model files. The management device is a management server on which the management module 160 is deployed.
S310, the management device acquires the container image.
S320, the management device acquires the first model file.
In some embodiments, the first model file may be trained in advance by the user on another device. The user accesses the cloud service platform 150 through the terminal device 200 to upload the first model file and the container image, and the management apparatus stores them after receiving them from the terminal device 200. It should be understood that the container image contains the AI application and the executable files, dependency files, library files, configuration files, etc. needed to run the container, but does not include the first model file. The first model file describes parameters of the artificial intelligence model, such as the batch size, the graph structure, and the weights of the model.
In other embodiments, the service device trains on a dataset to obtain the first model file and may prompt the user through the cloud service platform 150 that training is complete. The user may then send an upload message through the cloud service platform 150 instructing the service device to upload the first model file and the container image to the management device.
S330, the management device receives a first application deployment message indicated by the user, where the first application deployment message includes the name of the first model file and the target directory in the container image.
The first application deployment message indicates that a container is to be run from the container image, and the target directory in the container image is the directory through which the first model file is associated with the container image. This makes it convenient for the service device to run the AI application using the container image together with the first model file.
S340, the management device sends the first address to the service device.
And S350, the service device acquires the first model file from the management device according to the first address.
The first address indicates a location where the first model file is stored. The service device may send a first download message to the management device, the first download message including the first address. And after receiving the first download message, the management device reads the first model file from the position indicated by the first address and transmits the first model file to the service device.
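A minimal sketch of this download step, assuming the first address is a plain URL readable with `urllib` (the application does not specify the transfer protocol, so this is an illustrative choice):

```python
import urllib.request

def fetch_model_file(first_address: str, dest_path: str) -> None:
    """Download the first model file from the location indicated by the
    first address and store it at the designated location on the
    service device."""
    with urllib.request.urlopen(first_address) as resp:
        data = resp.read()
    with open(dest_path, "wb") as out:
        out.write(data)
```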
After receiving the first model file sent by the management device, the service device stores it at a first designated location, which may be pre-configured by the user.
S360, the management device sends a second application deployment message to the service device, where the second application deployment message includes the name of the first model file, the storage path of the first model file, the target directory in the container image, and the second address.
It should be understood that, after receiving the name of the first model file and the target directory in the container image, the management device autonomously determines the storage path on the service device under which the first model file is to be stored.
S370, the service device obtains the container image from the management device according to the second address.
The second address indicates the location where the container image is stored. The service device may send a second download message containing the second address to the management device; after receiving it, the management device reads the container image from the location indicated by the second address and transmits it to the service device.
After receiving the container image sent by the management device, the service device stores it at a second designated location, which may be pre-configured by the user.
S380, the service device associates the storage path of the first model file with the target directory in the container image.
The service device may read the first model file from the first designated location according to its name, store it under the storage path indicated by the second application deployment message, and associate that storage path with the target directory in the container image according to the second application deployment message.
In some embodiments, the command with which the service device associates the storage path of the first model file with the target directory in the container image may be of the form `docker run -it -v … /bin/bash`. Illustratively, the target directory in the container image is /etc/models/, the storage path of the first model file is /opt/models/uid/ (i.e., the first model file is stored under /opt/models/uid/), and the name of the first model file is modelv1.om. The service device may use the command `docker run -it -v /opt/models/uid:/etc/models … /bin/bash` (the container image name is omitted here) to associate the storage path of the first model file with the target directory in the container image.
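The mount association above can also be assembled programmatically. The sketch below builds the same `docker run` argument list; the image name `ai-app:latest` is a placeholder assumption, since the patent elides it:

```python
def build_run_command(host_model_dir: str, target_dir: str, image: str) -> list:
    """Build the docker run arguments that bind-mount the model storage
    path on the host onto the target directory inside the container."""
    # Docker's -v syntax is host_path:container_path; trailing slashes
    # are stripped for a canonical form.
    volume = f"{host_model_dir.rstrip('/')}:{target_dir.rstrip('/')}"
    return ["docker", "run", "-it", "-v", volume, image, "/bin/bash"]

cmd = build_run_command("/opt/models/uid/", "/etc/models/", "ai-app:latest")
# cmd[3:5] == ["-v", "/opt/models/uid:/etc/models"]
```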
S390, when the service device runs the container from the container image, it runs the AI application with the first model file in the container environment.
Because the storage path of the first model file is associated with the target directory in the container image, when the service device runs the container from the container image and a process accesses the target directory /etc/models/ inside the container, the model file modelv1.om is read from the /opt/models/uid/ directory of the service device, and the AI application runs with the first model file in the container environment.
In this way, the service device stores the model file and the container image separately, and when the container runs from the container image, the model file is loaded from elsewhere on the host, achieving separation from the container image. If the service device updates the model file, only the model file is updated and the whole container image no longer needs to be updated, which shortens the update time and improves the update efficiency.
Further, fig. 4 is a flowchart of updating a model file according to this embodiment.
S410, the management device acquires a second model file.
In some embodiments, the second model file may be trained in advance by the user on another device. The user accesses the cloud service platform 150 through the terminal device 200 to upload the second model file, and the management apparatus stores it after receiving it from the terminal device 200.
In other embodiments, the service device trains on a dataset to obtain the second model file and may prompt the user through the cloud service platform 150 that training is complete. The user may then send an upload message through the cloud service platform 150 instructing the service device to upload the second model file to the management device.
It should be understood that the container image does not contain the second model file. The second model file describes parameters of the artificial intelligence model and differs from the first model file: the two may describe different values of the parameters of the same artificial intelligence model, or they may describe the parameters of different artificial intelligence models. Running the AI application with the second model file in the container environment can produce better AI results.
S420, the management device receives an update message indicated by the user, where the update message indicates that the second model file is to replace the first model file.
S430, the management device sends the third address to the service device.
S440, the service device acquires the second model file from the management device according to the third address.
The third address indicates the location where the second model file is stored. The service device may send a third download message containing the third address to the management device. After receiving the third download message, the management device reads the second model file from the location indicated by the third address and sends it to the service device.
After receiving the second model file from the management device, the service device stores it at a first designated location, which may be preconfigured by the user.
S450, the management device receives a replacement message indicated by the user, where the replacement message indicates that the second model file is to replace the first model file.
In some embodiments, the replacement message is used to indicate that the second model file is to be associated with the container image. The replacement message contains the name of the second model file, the storage path of the first model file, and the name of the first model file. It should be understood that the container image does not contain the second model file.
S460, the management device sends the replacement message to the service device.
S470, the service device replaces the first model file with the second model file.
After receiving the replacement message, the service device may read the second model file from the first designated location according to the name of the second model file, and replace the first model file, located at the storage path of the first model file, with the second model file. Because the directory storing the first model file is already associated with the target directory in the container image, once the service device performs this replacement, the second model file becomes associated with the target directory in the container image. For example, if the storage path of the first model file is /opt/models/uid/, then after the replacement the second model file is stored under /opt/models/uid/. The name of the second model file is modelv2.om.
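The replacement step can be sketched as follows; a temp directory stands in for the example storage path /opt/models/uid/, and the file names and contents are placeholders. Copying the new file in under a temporary name and then renaming it over the old one keeps the swap atomic, so a container reading the file never observes a partial write:

```shell
# Sketch only: temp dirs stand in for the host storage path bound to the
# container's target directory; contents are placeholders.
host_dir=$(mktemp -d)
staging=$(mktemp -d)

printf 'v1-weights' > "$host_dir/model.om"      # first (old) model file
printf 'v2-weights' > "$staging/modelv2.om"     # second (new) model file

# Copy the new file in under a temporary name, then rename it over the
# old one; rename is atomic on the same filesystem, so the container
# never reads a half-written model.
cp "$staging/modelv2.om" "$host_dir/.model.om.tmp"
mv "$host_dir/.model.om.tmp" "$host_dir/model.om"

cat "$host_dir/model.om"
```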
It should be noted that S450 and S460 are optional steps. After the service device acquires the second model file, it may directly perform S470, that is, replace the first model file with the second model file.
Optionally, since the first model file and the second model file implement the same AI inference function, they may share the same name. If the service device stores multiple versions of the model file under that name, it can read the second model file from the designated location according to the model file name and the version number of the second model file. In that case, the replacement message also includes the version number of the second model file.
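One possible (assumed, not specified by the patent) layout for keeping multiple versions of a model under the same logical name, resolved by the name plus a version number carried in the replacement message:

```shell
# Sketch only: version subdirectories under one logical model name;
# the layout, names, and contents are placeholders.
store=$(mktemp -d)
mkdir -p "$store/model.om/v1" "$store/model.om/v2"
printf 'weights-v1' > "$store/model.om/v1/model.om"
printf 'weights-v2' > "$store/model.om/v2/model.om"

name=model.om
version=v2                       # taken from the replacement message
cat "$store/$name/$version/$name"
```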
S480, the service device runs the AI application using the second model file in the container environment.
At this point the service device has replaced the first model file with the second model file. Because the storage path of the model file is associated with the target directory in the container image, when the service device runs the container from the container image and accesses the target directory /etc/models/ inside the container, it actually reads the model file modelv2.om from the /opt/models/uid/ directory on the service device, and thus runs the AI application using the second model file in the container environment.
Optionally, the service device may periodically check the target directory in the container; if the model file associated with the target directory has been updated, the service device runs the AI application in the container environment using the updated model file.
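The periodic check described above can be sketched as a simple content-based poll. The paths are placeholders, and a real agent would trigger a model reload in the running AI application instead of setting a variable:

```shell
# Sketch only: detect a model file change by comparing checksums.
model=$(mktemp)                 # stands in for the file under the target dir
printf 'v1-weights' > "$model"

last_sum=$(cksum < "$model")    # checksum recorded on the previous check

printf 'v2-weights' > "$model"  # the service device swaps in the new model

new_sum=$(cksum < "$model")     # checksum seen on the current check
if [ "$new_sum" != "$last_sum" ]; then
  action=reload                 # here the AI application would reload
else
  action=none
fi
echo "$action"
```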
In this way, the service device stores the model file separately from the container image. After obtaining a new model file, it only needs to replace the model file; the whole container image no longer needs to be updated. This effectively shortens the time needed to update the model file and improves update efficiency.
In some embodiments, the user may operate the display interface of the terminal device 200, and the terminal device 200 sends an instruction to the management device instructing it to deploy or update the model file. Fig. 5 is a schematic diagram of deploying a model file and updating a model file according to this embodiment.
As shown in fig. 5 (a), the user accesses the cloud service platform 150 through a browser on the terminal device 200. The interface of the cloud service platform 150 displays options such as model training and model inference. The user clicks the "model inference" option 220. As shown in fig. 5 (b), the cloud service platform 150 displays a model inference interface in response to the user's click. This interface displays an "upload data" button 230, which the user clicks. As shown in fig. 5 (c), the user selects the data to be inferred from a local file, and the terminal device 200 uploads it to the cloud data center 100. As shown in fig. 5 (d), the model inference interface may also display options for a plurality of model files and an option for a container image. The model files may be pre-configured in the storage resource 120 of the cloud data center 100, so that the user can select a model file and a container image as needed. The interface may also display a "start inference" button 240, which the user clicks. After the cloud service platform 150 receives the data to be inferred, the first model file, and the container image uploaded by the terminal device 200, the management module 160 determines a storage path for the first model file in the service device and a target directory in the container image, and sends an application deployment message to at least one service device.
According to the application deployment message, the service device associates the storage path of the first model file with the target directory in the container image, runs the container using the container image, and runs AI inference using the first model file in the container environment. Optionally, as shown in fig. 5 (e), the terminal device 200 may further display the inference result fed back by the cloud data center 100.
Further, if the service device updates the model file, as shown in fig. 6, the terminal device 200 may prompt the user that the model file has been updated and that the stored model file needs to be replaced. The terminal device 200 may also display a "yes" button 610 and a "no" button 620. If the user clicks the "yes" button 610, the terminal device 200 sends an update message to the management device in response to the click, and the management device instructs the service device to replace the first model file with the second model file.
For the specific process of deploying the model file and updating the model file, reference may be made to the descriptions of the foregoing embodiments, which are not repeated here.
It is understood that, to implement the functions of the above embodiments, the server includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art will readily appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or as combinations of hardware and computer software. Whether a function is performed by hardware or by computer-software-driven hardware depends on the particular application scenario and the design constraints imposed on the solution.
Fig. 7 is a schematic structural diagram of a possible service device provided in an embodiment of the present application. The service device can be used to implement the functions of the service device in the above method embodiments, and can therefore also achieve the beneficial effects of those embodiments. In this embodiment of the present application, the service device may be an application server that deploys the service module 140 in the cloud data center 100 shown in fig. 1, an edge device that deploys the service module 140 in the edge environment 300 shown in fig. 2, or a module (e.g., a chip) applied to such a server or edge device.
As shown in fig. 7, the service device 700 includes a model management module 710 and an execution module 720. The model management module 710 is also used to store model files. The service device 700 is used to implement the functions of the service device in the method embodiments shown in fig. 3 or fig. 4.
When the service device 700 is used to implement the functions of the service device in the method embodiment shown in fig. 3: the model management module 710 is configured to perform S350 and S380; the execution module 720 is configured to execute S360, S370, and S390.
When the service device 700 is used to implement the functions of the service device in the method embodiment shown in fig. 4: model management module 710 is to perform S440 and S470; the execution module 720 is configured to execute S480.
The service device 700 further includes a storage module 730, and the storage module 730 is used to store the model file and the container image.
The service device 700 may further include an agent module, where the agent module is configured to receive messages sent by the management device and execute the related commands, and to report the state information of the containers in the service device to the management device.
More detailed descriptions of the model management module 710 and the execution module 720 can be obtained directly from the related descriptions in the method embodiment shown in fig. 3 or fig. 4, and are not repeated here.
Fig. 8 is a schematic structural diagram of a possible management device according to an embodiment of the present application. The management device can be used to implement the functions of the management device in the above method embodiments, and can therefore also achieve the beneficial effects of those embodiments. In this embodiment of the present application, the management device may be a management server that deploys the management module 160 in the cloud data center 100 shown in fig. 1, or a module (e.g., a chip) applied to the management server.
As shown in fig. 8, the management apparatus 800 includes a communication module 810. The management device 800 is used to implement the functions of the management device in the method embodiments shown in fig. 3 or fig. 4 described above.
When the management apparatus 800 is used to implement the functions of the management apparatus in the method embodiment shown in fig. 3: the communication module 810 is configured to perform S310 to S360.
When the management apparatus 800 is used to implement the functions of the management apparatus in the method embodiment shown in fig. 4: the communication module 810 is configured to perform S410 to S460.
The management device 800 further includes a storage module 820, and the storage module 820 is used for storing the model file and the container image.
More detailed descriptions about the communication module 810 can be directly obtained by referring to the related descriptions in the method embodiment shown in fig. 3 or fig. 4, which are not repeated herein.
Figs. 9 and 10 each provide a computing device. The computing device 900 shown in fig. 9 may be configured to implement the functions of the service device 700 in the embodiment shown in fig. 7, and the computing device 1000 shown in fig. 10 may be configured to implement the functions of the management device 800 in the embodiment shown in fig. 8.
As shown in fig. 9, computing device 900 includes a bus 901, a processor 902, a communication interface 903, and a memory 904. The processor 902, memory 904, and communication interface 903 communicate over a bus 901. The bus 901 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 9, but this does not indicate only one bus or one type of bus. The communication interface 903 is used for communication with the outside, such as receiving model files and container images.
The processor 902 may be a central processing unit (CPU). The memory 904 may include a volatile memory, such as a random access memory (RAM). The memory 904 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The memory 904 has stored therein executable code that the processor 902 executes to perform the aforementioned methods of updating the model file and methods of deploying the model file.
Specifically, when the embodiment shown in fig. 7 is implemented and the modules described in that embodiment are implemented by software, the memory 904 stores the software or program code required to perform the functions of the model management module 710 and the execution module 720 in fig. 7, the memory 904 implements the function of the storage module 730, and the processor 902 executes the instructions in the memory 904 to perform the method of updating the model file and the method of deploying the model file applied to the service device 700.
As shown in fig. 10, computing device 1000 includes a bus 1001, a processor 1002, a communication interface 1003, and a memory 1004. The processor 1002, the memory 1004, and the communication interface 1003 communicate with each other via a bus 1001. The memory 1004 stores model files and container images. The communication interface 1003 implements the functions of the communication module 810, the memory 1004 implements the functions of the storage module 820, and the processor 1002 is configured to execute the instructions in the memory 1004, perform the method of updating the model file and the method of deploying the model file applied to the management apparatus 800.
It is understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but may be any conventional processor.
Embodiments of the present application also provide a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the above-mentioned method for updating a model file and the method for deploying a model file, which are applied to the business apparatus 700.
Embodiments of the present application also provide a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the above-described method for updating a model file and the method for deploying a model file, which are applied to the management apparatus 800.
The embodiment of the application also provides a computer program product. When the computer program product is executed by a computer, the computer performs any one of the above methods. The computer program product may be a software installation package, which may be downloaded and executed on a computer when any of the above methods is needed.
The method steps in the embodiments of the present application may be implemented by hardware, or by software instructions executed by a processor. The software instructions may consist of corresponding software modules stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically EPROM (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a network device or a terminal device. Of course, the processor and the storage medium may also reside as discrete components in a network device or a terminal device.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network appliance, a user device, or another programmable apparatus. The computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire or wirelessly. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium, such as a floppy disk, a hard disk, or a magnetic tape; an optical medium, such as a digital video disc (DVD); or a semiconductor medium, such as a solid state drive (SSD).
In the embodiments of the present application, unless otherwise specified or conflicting with respect to logic, the terms and/or descriptions in different embodiments have consistency and may be mutually cited, and technical features in different embodiments may be combined to form a new embodiment according to their inherent logic relationship.
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In the description of the text of the present application, the character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula of the present application, the character "/" indicates that the preceding and following related objects are in a relationship of "division".
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. The sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic.

Claims (22)

1. A method of updating a model file, comprising:
when an artificial intelligence (AI) application is run based on a container environment, acquiring a first model file, wherein a container image required for running a container is associated with a second model file, and the first model file and the second model file are different;
replacing the second model file with the first model file;
running the AI application using the first model file in the container environment.
2. The method of claim 1, wherein replacing the second model file with the first model file comprises:
and replacing the second model file with the first model file according to a replacement message, wherein the replacement message comprises the name of the first model file, the storage path of the second model file and the name of the second model file.
3. The method of claim 2, wherein replacing the second model file with the first model file comprises:
associating the first model file with the container image, the container image not containing the first model file.
4. The method according to any one of claims 1-3, wherein obtaining a first model file comprises:
receiving a first address indicating a location where the first model file is stored;
and acquiring the first model file according to the first address.
5. The method according to any of claims 1-4, wherein prior to obtaining the first model file, the method further comprises:
acquiring the second model file according to a second address, wherein the second address indicates a position for storing the second model file;
receiving an application deployment message, wherein the application deployment message comprises a name of the second model file, a storage path of the second model file, a target directory in the container image, and a third address, and the third address indicates a location where the container image is stored;
associating the storage path of the second model file with the target directory in the container image;
acquiring the container image according to the third address, wherein the container image does not contain the second model file;
and running the container using the container image, and running the AI application using the second model file in the container environment.
6. A method of updating a model file, comprising:
receiving a replacement message sent by a user, wherein the replacement message is used for indicating that a first model file is used for replacing a second model file, and the first model file and the second model file are different;
and sending the replacement message.
7. The method of claim 6, wherein the replacement message is used to indicate that the first model file is associated with a container image, wherein the replacement message contains a name of the first model file, a storage path of the second model file, and a name of the second model file, and wherein the container image does not contain the first model file;
the method further comprises the following steps: and sending the first model file.
8. The method according to claim 6 or 7, wherein prior to said sending the first model file, the method further comprises:
acquiring the first model file and storing the first model file;
and sending a first address according to an updating message sent by the terminal equipment, wherein the first address indicates the position for storing the first model file.
9. The method according to any one of claims 6-8, further comprising:
receiving the second model file and a container image uploaded by the terminal device, wherein the container image does not contain the second model file;
receiving a first application deployment message sent by the terminal device, wherein the first application deployment message comprises the name of the second model file and a target directory in the container image;
sending a second address, wherein the second address indicates a location where the second model file is stored;
sending a second application deployment message, wherein the second application deployment message comprises a name of the second model file, a storage path of the second model file, a target directory in the container image, and a third address, and the third address indicates a location where the container image is stored;
and sending the second model file and the container image.
10. The method of claim 9, wherein the first application deployment message and the replacement message are user indicated messages.
11. A business apparatus, comprising:
a model management module, configured to acquire a first model file when an artificial intelligence (AI) application is run based on a container environment, wherein a container image required for running a container is associated with a second model file, and the first model file is different from the second model file;
the model management module is also used for replacing the second model file with the first model file;
and an execution module, configured to run the AI application using the first model file in the container environment.
12. The apparatus of claim 11, wherein the model management module, when replacing the second model file with the first model file, is specifically configured to:
and replacing the second model file with the first model file according to a replacement message, wherein the replacement message comprises the name of the first model file, the storage path of the second model file and the name of the second model file.
13. The apparatus of claim 12, wherein the model management module, when replacing the second model file with the first model file, is specifically configured to:
associating the first model file with the container image, the container image not containing the first model file.
14. The apparatus according to any one of claims 11 to 13, wherein the model management module, when acquiring the first model file, is specifically configured to:
receiving a first address indicating a location where the first model file is stored;
and acquiring the first model file according to the first address.
15. The apparatus according to any one of claims 11-14,
the model management module is further configured to obtain the second model file according to a second address, where the second address indicates a location where the second model file is stored;
the execution module is further configured to receive an application deployment message, wherein the application deployment message includes a name of the second model file, a storage path of the second model file, a target directory in the container image, and a third address, and the third address indicates a location where the container image is stored;
the model management module is further configured to associate the storage path of the second model file with the target directory in the container image;
the execution module is further configured to acquire the container image according to the third address, wherein the container image does not include the second model file;
the execution module is further configured to run the container using the container image, and run the AI application using the second model file in the container environment.
16. A management device, comprising:
a communication module, configured to receive a replacement message sent by a terminal device, wherein the replacement message is used to indicate that a first model file replaces a second model file, and the first model file is different from the second model file;
the communication module is further configured to send the replacement message.
17. The apparatus of claim 16, wherein the replacement message is used to indicate that the first model file is associated with a container image, wherein the replacement message contains a name of the first model file, a storage path of the second model file, and a name of the second model file, and wherein the container image does not contain the first model file;
the communication module is further used for sending the first model file.
18. The apparatus of claim 16 or 17,
the communication module is further used for acquiring the first model file and storing the first model file;
the communication module is further configured to send a first address according to an update message sent by the terminal device, where the first address indicates a location where the first model file is stored.
19. The apparatus of any one of claims 16-18,
the communication module is further configured to receive the second model file and a container image uploaded by the terminal device, wherein the container image does not include the second model file;
the communication module is further configured to receive a first application deployment message sent by the terminal device, wherein the first application deployment message includes the name of the second model file and a target directory in the container image;
the communication module is further configured to send a second address, wherein the second address indicates a location where the second model file is stored;
the communication module is further configured to send a second application deployment message, wherein the second application deployment message includes a name of the second model file, a storage path of the second model file, a target directory in the container image, and a third address, and the third address indicates a location where the container image is stored;
the communication module is further configured to send the second model file and the container image.
20. The apparatus of claim 19, wherein the first application deployment message and the replacement message are user indicated messages.
21. A computing device comprising a memory and a processor, the memory for storing a set of computer instructions; when executed by the processor, perform the operational steps of the method of any of claims 1 to 5, or the operational steps of the method of any of claims 6 to 10.
22. A computer-readable storage medium having stored therein a computer program or instructions which, when executed by a computer, implement the operational steps of the method of any one of claims 1 to 5, or the operational steps of the method of any one of claims 6 to 10.
CN202011073718.5A 2020-09-30 2020-09-30 Method and device for updating model file and computing equipment Pending CN114327530A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011073718.5A CN114327530A (en) 2020-09-30 2020-09-30 Method and device for updating model file and computing equipment

Publications (1)

Publication Number Publication Date
CN114327530A true CN114327530A (en) 2022-04-12


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024073969A1 (en) * 2023-01-12 2024-04-11 Lenovo (Beijing) Limited Methods and apparatuses for ai model management

Similar Documents

Publication Publication Date Title
CN107391114B (en) Page visual rendering method and device
CN107766126B (en) Container mirror image construction method, system and device and storage medium
US20210406079A1 (en) Persistent Non-Homogeneous Worker Pools
CN108830720B (en) Intelligent contract running method, device, system and computer readable storage medium
US9529613B2 (en) Methods and apparatus to reclaim resources in virtual computing environments
US10599423B2 (en) Source code management for a multi-tenant platform-as-a-service (PaaS) system
US9430213B2 (en) Apparatus, systems and methods for cross-cloud software migration and deployment
CN109155743A (en) Meet the system and method for SLA requirement using machine learning algorithm
US11106492B2 (en) Workflow service for a cloud foundry platform
CN103778178A (en) Method and system for reconfiguring snapshot of virtual machine (VM)
CN103608773A (en) Deployment system for multi-node applications
CN104160374A (en) Cloud bursting and management of cloud-bursted applications
CN112036577B (en) Method and device for applying machine learning based on data form and electronic equipment
CN109753300B (en) Algorithm upgrading method, calculation task sending method and related device
US11301226B2 (en) Enterprise deployment framework with artificial intelligence/machine learning
US11907708B2 (en) Application and infrastructure template management to easily create secure applications for enterprises
US10608907B2 (en) Open-loop control assistant to guide human-machine interaction
US9336021B2 (en) Configuring applications at runtime
CN114327530A (en) Method and device for updating model file and computing equipment
CN114546588A (en) Task deployment method and device, storage medium and electronic device
US11675578B2 (en) Meta-operators for managing operator groups
US10599544B2 (en) Determining reboot times of computing nodes
CN110109684A (en) Block chain node administration agent services installation method, electronic device and storage medium
CN112559124A (en) Model management system and target operation instruction processing method and device
CN112631759A (en) Data processing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination