CN110297640B - Model deployment method and device, storage medium and electronic equipment - Google Patents

Model deployment method and device, storage medium and electronic equipment

Info

Publication number
CN110297640B
Authority
CN
China
Prior art keywords
model
target
node
deployed
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910507186.2A
Other languages
Chinese (zh)
Other versions
CN110297640A (en
Inventor
后永波
尹非凡
宋斌
王建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201910507186.2A
Publication of CN110297640A
Application granted
Publication of CN110297640B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management

Abstract

The present disclosure relates to a method, an apparatus, a storage medium, and an electronic device for model deployment. The method includes: receiving model deployment request messages sent by a plurality of target nodes in a node set to be deployed, where the model deployment request messages are used to request deployment of a target model; acquiring the target model in response to the received model deployment request messages; sending the target model to each of the target nodes; after determining that each target node in the node set to be deployed has successfully received the target model, sending a model loading instruction to each target node so that each target node loads the target model according to the model loading instruction; and after determining that each target node in the node set to be deployed has successfully loaded the target model, controlling each target node to take the target model as its current application model.

Description

Model deployment method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of model deployment, and in particular, to a method, an apparatus, a storage medium, and an electronic device for model deployment.
Background
With the continuous development of machine learning technology, machine learning models (such as decision tree models and deep neural network models) are widely applied in business scenarios such as takeaway delivery. The machine learning process is divided into an offline training stage and an online prediction stage. Before online prediction can be performed, the offline-trained model must be deployed into the memory of each server node that executes model calculation; features are then passed in to perform model calculation and real-time prediction.
In the related art, each service node that serves a given model runs its own model deployment task on a schedule. Specifically, the node first performs version detection; when a newer version of the model is detected, the node downloads the model, loads it after the download succeeds, and switches versions after loading succeeds, which completes the model deployment.
Disclosure of Invention
The purpose of the present disclosure is to provide a method, an apparatus, a storage medium, and an electronic device for model deployment.
In a first aspect, a method for model deployment is provided, which is applied to a model management server, and the method includes: receiving model deployment request messages sent by a plurality of target nodes in a node set to be deployed, wherein the model deployment request messages are used for requesting to deploy a target model; responding to the received model deployment request message, and acquiring the target model; sending the target model to each of the target nodes; after each target node in the node set to be deployed is determined to successfully receive the target model, sending a model loading instruction to each target node so that each target node loads the target model according to the model loading instruction; and after each target node in the node set to be deployed is determined to be loaded with the target model successfully, controlling each target node to take the target model as a current application model.
Optionally, the method further comprises: if a processing success message for a target processing item is received from each target node, determining that each target node has successfully completed the target processing item, wherein the target processing item comprises receiving the target model or loading the target model; or, if a processing failure message for the target processing item is received from at least one target node, controlling the at least one target node to execute the target processing item again until a processing success message for the target processing item is received from the at least one target node.
Optionally, if the target processing item is receiving the target model, the controlling the at least one target node to execute the target processing item again comprises: resending the target model to the at least one target node; if the target processing item is loading the target model, the controlling the at least one target node to execute the target processing item again comprises: resending the model loading instruction to the at least one target node.
Optionally, the method further comprises: if the processing failure message sent by at least one target node is received a preset number of consecutive times, determining that model deployment of all target nodes in the node set to be deployed has failed; or, if the processing failure message sent by at least one target node is continuously received within a first preset time, determining that model deployment of all target nodes in the node set to be deployed has failed; or, if no processing notification message has been received from at least one target node when a second preset time is reached, deleting the at least one target node from the node set to be deployed, wherein the processing notification message comprises the processing success message or the processing failure message.
Optionally, before receiving a model deployment request message sent by a plurality of target nodes in a node set to be deployed, the method further includes: determining whether a target model to be deployed exists; when the target model is determined to exist, determining a plurality of target nodes forming the node set to be deployed from model deployment nodes; and sending model deployment information to the target nodes, wherein the model deployment information comprises identification information of the target model, so that the target nodes can send the model deployment request message to the model management server according to the identification information.
Optionally, the determining whether there is a target model to be deployed includes: determining whether a newly added model exists in a preset file directory at present, and determining that the target model exists when the newly added model exists in the preset file directory at present; or, determining whether a user-selected model exists, and when it is determined that the user-selected model exists, determining that the target model exists.
Optionally, the determining, from the model deployment nodes, a plurality of target nodes constituting the set of nodes to be deployed includes: acquiring a current application model of each model deployment node; and determining the model deployment nodes of which the current application models are inconsistent with the target models as the target nodes forming the node set to be deployed.
In a second aspect, an apparatus for model deployment is provided, which is applied to a model management server and includes: a receiving module, configured to receive model deployment request messages sent by a plurality of target nodes in a node set to be deployed, wherein the model deployment request messages are used to request deployment of a target model; an obtaining module, configured to obtain the target model in response to the received model deployment request messages; a first sending module, configured to send the target model to each of the target nodes; a second sending module, configured to send a model loading instruction to each target node after it is determined that each target node in the node set to be deployed has successfully received the target model, so that each target node loads the target model according to the model loading instruction; and a control module, configured to control each target node to take the target model as its current application model after it is determined that each target node in the node set to be deployed has successfully loaded the target model.
In a third aspect, a computer readable storage medium is provided, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the method according to the first aspect of the disclosure.
In a fourth aspect, an electronic device is provided, comprising: a memory having a computer program stored thereon; a processor for executing the computer program in the memory to implement the steps of the method of the first aspect of the disclosure.
According to the above technical solution, model deployment request messages sent by a plurality of target nodes in a node set to be deployed are received, the model deployment request messages being used to request deployment of a target model; the target model is acquired in response to the received model deployment request messages; the target model is sent to each of the target nodes; after it is determined that each target node in the node set to be deployed has successfully received the target model, a model loading instruction is sent to each target node so that each target node loads the target model according to the model loading instruction; and after it is determined that each target node in the node set to be deployed has successfully loaded the target model, each target node is controlled to take the target model as its current application model. In this way, when every link of model deployment succeeds on every target node in the node set to be deployed, all target nodes can uniformly take the target model as the current application model. When any target node fails in one link of model deployment, none of the target nodes in the set proceeds to the next link; model deployment is considered to have failed for all target nodes in the set, and the models on all target nodes remain at the old version. The version consistency of model deployment across the target nodes in the node set to be deployed is thereby ensured.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow chart illustrating a first method of model deployment in accordance with an exemplary embodiment;
FIG. 2 is a flow chart illustrating a second method of model deployment in accordance with an exemplary embodiment;
FIG. 3 is a block diagram illustrating a model deployment apparatus in accordance with an exemplary embodiment;
FIG. 4 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
The present disclosure mainly applies to model deployment in a distributed environment. Suppose that, with models stored in a distributed file system such as the Hadoop Distributed File System (HDFS), a service (e.g., a takeaway delivery duration prediction service) needs to load a target model (e.g., a deep neural network model), and that the target model is retrained at a preset period to generate new versions. If the service has 100 service nodes, the latest version of the model must be deployed into the memory of all 100 nodes before real-time prediction can proceed. In existing model deployment schemes, each of the 100 service nodes runs its model deployment task on its own schedule; that is, the deployment flows of the 100 nodes are independent tasks whose synchronous execution cannot be guaranteed. When any link of the deployment task fails on a node, the model version deployed on that node may differ from the version on other nodes. Inconsistent model versions can affect the results of online real-time prediction and make the business effect harder to analyze.
To solve the above problems, the present disclosure provides a method, an apparatus, a storage medium, and an electronic device for model deployment. The method may be applied to a model management server. The present disclosure treats the plurality of target nodes in a node set to be deployed as a single controlled unit for model deployment: the target nodes in the set perform model downloading, loading, and switching synchronously, which ensures that model deployment is consistent across them. Specifically, the model management server first receives model deployment request messages, sent by the plurality of target nodes in the node set to be deployed, that request deployment of a target model. In response, the model management server obtains the target model and sends it to each of the target nodes. From the reception notification message (a reception success message or a reception failure message) returned by each target node, the model management server determines whether each target node in the set has successfully received the target model. After determining that every target node has successfully received the target model, the server sends a model loading instruction to each target node so that each target node loads the target model according to the instruction. From the loading notification message (a loading success message or a loading failure message) returned by each target node, the server then determines whether each target node in the set has successfully loaded the target model, and after determining that every target node has done so, it controls each target node to take the target model as its current application model.
In this way, when every link of model deployment succeeds on every target node in the node set to be deployed, all target nodes in the set can uniformly take the target model as the current application model. When any target node in the set fails in one link of model deployment, none of the target nodes in the set proceeds to the next link; that is, model deployment is regarded as having failed for all target nodes in the set, and the model on each target node remains at the old version. The version consistency of model deployment across the target nodes in the node set to be deployed is thereby ensured.
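The gated flow described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: `deploy_to_set`, `Outcome`, and the three callbacks (`send_model`, `send_load`, `switch`) are hypothetical names, and each callback is assumed to return the success or failure that the node would report back.

```python
from enum import Enum
from typing import Callable, Iterable


class Outcome(Enum):
    SWITCHED = "switched"  # every node now uses the target model
    KEPT_OLD = "kept_old"  # some link failed; no node switched


def deploy_to_set(
    nodes: Iterable[str],
    send_model: Callable[[str], bool],
    send_load: Callable[[str], bool],
    switch: Callable[[str], None],
) -> Outcome:
    """Run receive, load, and switch as set-wide gated links."""
    nodes = list(nodes)
    # Link 1: every node must report that it received the model.
    # A list (not a generator) is used so every node is attempted.
    if not all([send_model(n) for n in nodes]):
        return Outcome.KEPT_OLD
    # Link 2: every node must report that it loaded the model.
    if not all([send_load(n) for n in nodes]):
        return Outcome.KEPT_OLD
    # Link 3: only now is any node told to switch.
    for n in nodes:
        switch(n)
    return Outcome.SWITCHED
```

Because the switch step is reached only after both earlier links succeed on every node, a single failing node leaves the whole set on the old version, which is the consistency property the text describes.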
The following description of the embodiments of the present disclosure will be made with reference to the accompanying drawings.
FIG. 1 is a flow chart illustrating a method of model deployment according to an exemplary embodiment. The method may be applied to a model management server and, as shown in FIG. 1, includes the following steps:
In step 101, model deployment request messages sent by a plurality of target nodes in a node set to be deployed are received.
The model deployment request message is used to request deployment of a target model. The target model may be the latest version of a model produced by new training (typically obtained through Spark/MR training) or any version of the model selected by a user (for example, the user rolls back the model version so that the current application model of the target nodes is restored to a previous version). The target model may be a machine learning model such as a decision tree model or a neural network model. The node set to be deployed is the collection of target nodes on which the target model needs to be deployed.
The model deployment includes model downloading, model loading, and model switching (or model using), and in this step, the model deployment request message may include model downloading request information, and further, the model downloading request information may include model identification information of the target model requested to be downloaded.
It should be noted that, before executing this step, it is generally necessary to determine whether the target model to be deployed exists, and if it is determined that the target model exists, determine a plurality of target nodes forming the set of nodes to be deployed from model deployment nodes, and then send model deployment information to the plurality of target nodes, where the model deployment information includes identification information of the target model, so that the plurality of target nodes send the model deployment request message to the model management server according to the identification information.
The model deployment node may include all nodes that need to rely on the target model to complete model calculation, and the target node is a model deployment node that has not deployed the target model.
In the present disclosure, whether the target model to be deployed exists may be determined in either of the following two ways:
Method one: determine whether a newly added model currently exists in the preset file directory; if so, determine that the target model exists.
The preset file directory is a file directory preset by a user in the distributed file system, and the newly added model can be understood as a model of the latest version generated by retraining.
In the distributed file system, before a model of a latest version is obtained through offline training, model registration is usually required on a model management server, for example, registration information such as a model name, a model type, model version information, and a storage path of the model in a preset file directory of the HDFS is defined, so that after the model of the latest version is generated through training, the model obtained through the latest training can be marked according to the registration information, and the model is uploaded to the corresponding storage path for storage.
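As a rough illustration of registration and of detecting a newly added model, the sketch below keeps a registration record and scans the registered storage path for a version that has not been seen before. It rests on several assumptions that are not in the source: a local directory stands in for the HDFS path, each version is stored as a subdirectory, version names sort lexicographically with the newest last, and `Registration` and `find_new_version` are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class Registration:
    name: str          # model name
    model_type: str    # e.g. "dnn" or "decision_tree"
    storage_dir: Path  # preset directory holding this model's versions


def find_new_version(reg: Registration, known: set[str]) -> Optional[str]:
    """Return a version directory name not seen before, or None."""
    if not reg.storage_dir.is_dir():
        return None
    present = {p.name for p in reg.storage_dir.iterdir() if p.is_dir()}
    added = sorted(present - known)
    # Assumption: version names sort lexicographically, newest last.
    return added[-1] if added else None
```

A server could call `find_new_version` periodically and treat a non-`None` result as the signal that a target model exists.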
Method two: determine whether a user-selected model exists; if so, determine that the target model exists.
In method two, to accommodate different business requirements in actual application scenarios, a user may select any one of the generated versions of a model as the target model to deploy. For example, the user may roll back the model version so that the current application model of the target nodes is restored to a previous version. Therefore, in method two, if a user-selected model is obtained, it may be determined that a target model to be deployed exists.
In addition, in the process of determining the target nodes forming the node set to be deployed from the model deployment nodes, the current application model of each model deployment node may be obtained, and the model deployment node of which the current application model is inconsistent with the target model is determined as the target node forming the node set to be deployed.
Before the target node deploys the target model, the current application model of the target node may include a model with the same name as the target model but different version, or may include a model with a different name from the target model.
In an actual model deployment scenario, each node may, after completing model deployment, report the model identification information (such as the model name and version number) of its newly deployed model to the model management server, so that the server always knows which model each node is currently using. The model management server can therefore determine the current application model of each model deployment node from the reported identification information. If the model name or the model version number of a node's current application model differs from that of the target model, the current application model is considered inconsistent with the target model.
It should be noted that, before deploying the target model to the target node, if the model name of the current application model of the target node is different from the target model, it is indicated that any version of the target model is not deployed in the target node; if the model name of the current application model of the target node is the same as that of the target model but the version number of the current application model is different from that of the target model, it is indicated that the model of other versions of the target model is deployed in the target node, and the current model deployment task is used for switching the versions of the models.
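The selection rule and the two cases above can be sketched as follows. The names `ModelId` and `select_targets` and the task labels are hypothetical; the `reported` mapping is assumed to hold the identification information each node last reported, with `None` for a node that has reported nothing.

```python
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    name: str
    version: str


def select_targets(
    reported: dict[str, Optional[ModelId]], target: ModelId
) -> dict[str, str]:
    """Map each node that needs deployment to the kind of task it faces."""
    tasks = {}
    for node, current in reported.items():
        if current == target:
            continue  # already uses the target model: not a target node
        if current is not None and current.name == target.name:
            tasks[node] = "version_switch"    # same name, different version
        else:
            tasks[node] = "first_deployment"  # no version of this model yet
    return tasks
```

The keys of the returned mapping form the node set to be deployed; the values only record which of the two cases applies to each node.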
In step 102, the target model is obtained in response to the received model deployment request message.
In a possible implementation, the model management server obtains the model identification information of the target model to be deployed from the received model deployment request message. To obtain the target model more quickly, the server may first search its local storage space for the target model based on this identification information. If the target model is not found locally, the server may forward the model deployment request message to a remote server in order to download the target model from it and, once the download completes, store the target model in the local storage space. Other target nodes can then obtain the target model directly from the local storage space, which increases the model acquisition rate and further improves model deployment efficiency.
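The local-first lookup can be illustrated with a small cache wrapper. This is a sketch under assumptions: `ModelStore` is a hypothetical name, an in-memory dictionary stands in for the server's local storage space, and `fetch_remote` stands in for whatever download mechanism the remote server offers.

```python
from typing import Callable


class ModelStore:
    """Local-first model lookup with a remote fallback."""

    def __init__(self, fetch_remote: Callable[[str], bytes]):
        self._fetch_remote = fetch_remote   # hypothetical remote download
        self._local: dict[str, bytes] = {}  # stands in for local storage

    def get(self, model_id: str) -> bytes:
        blob = self._local.get(model_id)
        if blob is None:
            # Local miss: download once, then keep a local copy so that
            # later requests for the same model skip the remote server.
            blob = self._fetch_remote(model_id)
            self._local[model_id] = blob
        return blob
```

With 100 target nodes requesting the same model, only the first request would reach the remote server under this scheme.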
In step 103, the target model is sent to each of the target nodes.
In step 104, after it is determined that each target node in the set of nodes to be deployed successfully receives the target model, a model loading instruction is sent to each target node, so that each target node loads the target model according to the model loading instruction.
In step 105, after it is determined that each target node in the node set to be deployed successfully loads the target model, each target node is controlled to use the target model as a current application model.
The following describes specific embodiments of step 104 and step 105:
In the control process of model deployment of the present disclosure, heartbeat reporting may be used to control each link of deploying the target model. That is, each target node sends the model management server a processing notification message for a target processing item, where the target processing item is either receiving the target model or loading the target model, and the processing notification message is either a processing success message or a processing failure message (for example, 1 may indicate success and 0 may indicate failure). Accordingly, in step 104, if a reception success message (that is, a download success message) is received from every target node, it is determined that every target node in the node set to be deployed has successfully received, i.e., downloaded, the target model. In step 105, if a loading success message is received from every target node, it is determined that every target node in the node set to be deployed has successfully loaded the target model.
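A minimal sketch of how the server side might aggregate these heartbeat reports is shown below. The 1/0 encoding follows the example in the text; the function names and the `reports` mapping (node to latest reported value for the current processing item) are assumptions made for illustration.

```python
SUCCESS, FAILURE = 1, 0  # encoding given as an example in the text


def stage_complete(targets: set[str], reports: dict[str, int]) -> bool:
    """True only when every target node has reported success."""
    return all(reports.get(node) == SUCCESS for node in targets)


def failed_nodes(targets: set[str], reports: dict[str, int]) -> set[str]:
    """Nodes that explicitly reported failure; silent nodes are excluded."""
    return {n for n in targets if reports.get(n) == FAILURE}
```

Note that a node with no report makes `stage_complete` return `False` without appearing in `failed_nodes`, which keeps "reported failure" and "reported nothing" distinguishable.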
In addition, if a processing failure message for the target processing item is received from at least one target node, that node is controlled to execute the target processing item again until a processing success message for the item is received from it. Specifically, if the target processing item is receiving the target model, the target model may be resent to the at least one target node so that the node can receive it again, which improves the success rate of model reception. If the target processing item is loading the target model, the model loading instruction may be resent to the at least one target node so that the node can reload the target model according to the instruction, which improves the success rate of model loading.
In an actual application scenario, a network or device problem on one or more target nodes in the node set to be deployed may cause the model management server to keep receiving processing failure messages from those nodes over a period of time. To ensure consistent model deployment across the target nodes in this case, in a possible implementation, if the processing failure message sent by at least one target node is received a preset number of consecutive times, or is received continuously within a first preset time, it may be determined that model deployment has failed for all target nodes in the node set to be deployed. That is, when model deployment fails on at least one target node in the set, it is regarded as having failed on all target nodes in the set; each target node then keeps the old version of the model, which ensures that the model versions of the target nodes remain consistent.
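Re-executing a processing item with a cap on consecutive failures might look like the sketch below. It covers only the "preset number of consecutive times" variant; `run_with_retries` and its parameters are hypothetical, and `action` is assumed to resend the model or the loading instruction and return the node's reported result.

```python
from typing import Callable


def run_with_retries(
    node: str, action: Callable[[str], bool], max_failures: int
) -> bool:
    """Repeat `action` for `node` until it succeeds or has failed
    `max_failures` times in a row."""
    failures = 0
    while failures < max_failures:
        if action(node):
            return True
        failures += 1
    # The caller would treat this as deployment failure for the whole set.
    return False
```

A time-based limit (the first preset time) could be added in the same loop by comparing a monotonic clock against a deadline instead of counting attempts.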
It should be noted that, in an actual model deployment scenario, model deployment cannot proceed normally on a node that is down. A node that is down may therefore be deleted from the node set to be deployed so that the other nodes can deploy the target model normally. Specifically, if no processing notification message has been received from at least one target node by the time the second preset time is reached (in which case that node may be considered down), the at least one target node may be deleted from the node set to be deployed, and the target nodes remaining in the set after the deletion are controlled to continue model deployment.
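Dropping silent nodes from the set can be sketched as a simple filter. The function and parameter names are hypothetical, and the sketch assumes the second preset time is measured from the start of the current link and that `report_times` records when each node's first notification (success or failure) arrived.

```python
def remaining_targets(
    targets: set[str],
    report_times: dict[str, float],
    stage_start: float,
    second_preset_time: float,
) -> set[str]:
    """Keep only nodes that sent some notification before the deadline."""
    deadline = stage_start + second_preset_time
    # A node with no recorded notification is treated as never reporting.
    return {
        n for n in targets
        if report_times.get(n, float("inf")) <= deadline
    }
```

A node that reported failure in time stays in the set under this rule; only nodes that sent nothing at all are removed, matching the distinction the text draws between a failing node and a node that is down.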
It should be further noted that two situations exist in an actual model deployment scenario. In the first, another version of the target model has already been deployed on the target node, and the current model deployment task is to switch the version of the node's current application model to the target model. In the second, no version of the target model has been deployed on the target node, and the current model deployment task is to deploy the target model to the node. Therefore, in step 105, if the current application model of a target node before deployment has the same name as the target model but a different version (i.e., another version of the target model has already been deployed on the node), then controlling the node to take the target model as its current application model means controlling the node to replace the model version, so that the version of the target model becomes the deployed version of the current application model. If no version of the target model was deployed on the target node before deployment, the target model needs to be deployed to the node first, after which it can be used as the node's current application model.
With the above method, when every link of model deployment succeeds on every target node in the node set to be deployed, all target nodes in the set can uniformly take the target model as the current application model. When any target node in the set fails in one link of model deployment, none of the target nodes in the set proceeds to the next link; that is, model deployment is regarded as having failed for all target nodes in the set, and the model on each target node remains at the old version. The version consistency of model deployment across the target nodes in the node set to be deployed is thereby ensured.
Fig. 2 is a flowchart illustrating a method of model deployment applied to a model management server, according to an exemplary embodiment, the method comprising the steps of:
in step 201, it is determined whether there is a target model to be deployed.
The target model may be the latest version of a newly trained model (which is generally obtained through Spark/MR training) or any version of the model selected by the user (for example, the user performs a model version downgrade so that the current application model of the target node is restored to a previous version). The target model may be a decision tree model, a neural network model, or another machine learning model.
In this step, whether the target model to be deployed exists can be determined by any one of the following two ways:
The first way: determine whether a newly added model currently exists in the preset file directory, and determine that the target model exists when a newly added model currently exists in the preset file directory.
The preset file directory is a file directory preset by a user in the distributed file system, and the newly added model can be understood as a model of the latest version generated by retraining.
In the distributed file system, before a model of a latest version is obtained through offline training, model registration is usually required on a model management server, for example, registration information such as a model name, a model type, model version information, and a storage path of the model in a preset file directory of the HDFS is defined, so that after the model of the latest version is generated through training, the model obtained through the latest training can be marked according to the registration information, and the model is uploaded to the corresponding storage path for storage.
The second way: determine whether a user-selected model exists, and determine that the target model exists when a user-selected model exists.
In the second way, considering the different business requirements of actual application scenarios, a user may also select any one of the multiple generated versions of the same model as the target model to deploy; for example, the user may perform a model version downgrade to restore the current application model of the target node to a previous version. Therefore, in the second way, if a model selected by the user is obtained, it may be determined that a target model to be deployed exists.
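The two ways of detecting a target model described above can be sketched as follows. This is only an illustrative sketch and not the disclosed implementation: the names `ModelRef` and `find_target_model` are hypothetical, the preset file directory is represented by a plain list instead of a real distributed file system listing, and giving the user-selected model precedence over a newly added model is an assumption made here, since the text only states that either way may be used.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class ModelRef:
    """Identification information of a model (hypothetical structure)."""
    name: str
    version: str


def find_target_model(
    known: Set[ModelRef],
    directory_listing: Iterable[ModelRef],
    user_selected: Optional[ModelRef] = None,
) -> Optional[ModelRef]:
    """Return the target model to deploy, or None if there is none.

    Second way: a model explicitly selected by the user (e.g. a downgrade).
    First way: a model in the preset directory that has not been seen before.
    """
    if user_selected is not None:
        return user_selected
    for model in directory_listing:
        if model not in known:
            return model
    return None
```

A caller would pass the set of models already known to the model management server together with the current contents of the preset directory; a return value of `None` means that there is nothing to deploy.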
In this embodiment, after determining that the target model exists, a plurality of target nodes forming a set of nodes to be deployed may be determined from model deployment nodes by performing step 202 and step 203, where the model deployment nodes may include all nodes that need to rely on the target model to complete model calculation, the target nodes are model deployment nodes that have not yet deployed the target model, and the set of nodes to be deployed is a set formed by a plurality of target nodes that need to deploy the target model.
In step 202, the current application model of each model deployment node is obtained.
Before the target node deploys the target model, the current application model of the target node may include a model with the same name as the target model but different version, or may include a model with a different name from the target model.
In an actual model deployment scenario, after model deployment is completed by each node, model identification information (such as a model name, a version number, and the like) of a newly deployed model of the node may be reported to a model management server, so that the model management server may timely grasp the current application model being used by each node.
In step 203, the model deployment node where the current application model is inconsistent with the target model is determined as the target node forming the node set to be deployed.
In this step, if the model names or version numbers of the current application model and the target model are different, it can be considered that the current application model is inconsistent with the target model.
For example, assume that the target model is model 1 with version number 2.0, and that four model deployment nodes, namely node 1, node 2, node 3 and node 4, rely on the target model to complete model calculation. If the current application model obtained for each of node 1, node 2 and node 3 is model 1 with version number 1.0, and the current application model of node 4 is model 1 with version number 2.0, it can be determined that the current application models of node 1, node 2 and node 3 are inconsistent with the target model while the current application model of node 4 is consistent with it. Node 1, node 2 and node 3 are therefore determined as the target nodes forming the set of nodes to be deployed. The above is only an example, and the present disclosure is not limited thereto.
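The selection in steps 202 and 203 amounts to a simple comparison of model name and version number. The following minimal sketch reproduces the four-node example; the function name `select_target_nodes` and the representation of a model as a `(name, version)` tuple are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Model = Tuple[str, str]  # (model name, version number)


def select_target_nodes(current: Dict[str, Model], target: Model) -> List[str]:
    """Return the nodes whose current application model differs from the
    target model in name or in version number."""
    return [node for node, model in current.items() if model != target]


# Current application models as reported by the model deployment nodes.
current_models = {
    "node1": ("model1", "1.0"),
    "node2": ("model1", "1.0"),
    "node3": ("model1", "1.0"),
    "node4": ("model1", "2.0"),
}
to_deploy = select_target_nodes(current_models, ("model1", "2.0"))
print(to_deploy)  # ['node1', 'node2', 'node3']
```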
In step 204, model deployment information is sent to a plurality of the target nodes.
Wherein, the model deployment information may include identification information of the target model, so that the plurality of target nodes send a model deployment request message to the model management server according to the identification information.
In this step, the model deployment request message may include model download request information, and further, the model download request information may include model identification information of the target model requested to be downloaded.
In step 205, a model deployment request message sent by a plurality of target nodes in a node set to be deployed is received.
Wherein the plurality of target nodes may include all target nodes in the set of nodes to be deployed.
In step 206, the target model is obtained in response to the received model deployment request message.
In a possible implementation manner, after receiving the model deployment request message, the model management server can obtain the model identification information of the target model to be deployed. In this step, in order to obtain the target model more quickly, the target model may first be searched for in the local storage space of the model management server based on the obtained model identification information. If the target model is not found in the local storage space, the model deployment request message may be forwarded to a remote server so that the target model is downloaded from the remote server, and after the download is complete the target model is stored in the local storage space. Other target nodes can then obtain the target model directly from the local storage space, which increases the model acquisition rate and further improves model deployment efficiency.
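The local-first lookup of step 206 can be illustrated with a small cache. This is a sketch under stated assumptions: `ModelStore`, its in-memory dictionary, and the `fetch_remote` callback are hypothetical stand-ins for the local storage space and the remote server, and the `remote_calls` counter exists only to make the behaviour observable.

```python
from typing import Callable, Dict


class ModelStore:
    """Local-first model lookup with a remote fallback (illustrative only)."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._local: Dict[str, bytes] = {}
        self._fetch_remote = fetch_remote
        self.remote_calls = 0

    def get(self, model_id: str) -> bytes:
        """Return the model, downloading it only if it is not stored locally."""
        if model_id not in self._local:
            # Not found locally: download once and keep a local copy so that
            # requests from other target nodes are served from local storage.
            self._local[model_id] = self._fetch_remote(model_id)
            self.remote_calls += 1
        return self._local[model_id]
```

With this arrangement, only the first request for a given model identifier reaches the remote server; later requests for the same identifier are answered from the local copy.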
In step 207, the target model is sent to each of the target nodes.
In step 208, it is determined whether each of the target nodes in the set of nodes to be deployed successfully receives the target model.
In an actual model deployment scenario, each target node ends in one of two final states. In the first, the current application model of the target node is switched to the target model (such as the latest version of the model), the model deployment succeeds, and the target node starts using the new version. In the second, a link of the model deployment task still fails after several retries (or is still failing when a first preset time is reached), so that the whole model deployment task fails, and the target node does not perform the model switch and continues to use the original model version. Each model deployment link of each target node therefore needs to be controlled to ensure deployment consistency of the models across the target nodes in the node set to be deployed.
In the control process of model deployment of the present disclosure, a heartbeat reporting manner may be adopted to control each link of deploying the target model. That is, the target node sends a processing notification message for a target processing item to the model management server, where the processing notification message is a processing success message or a processing failure message (for example, 1 may indicate processing success and 0 processing failure), and the target processing item may be receiving the target model or loading the target model. In steps 208 to 212, the model management server can thus control model deployment according to the processing notification messages reported by each target node.
In this step, if a reception success message (i.e., a download success message) sent by each target node is received, it is determined that each target node successfully receives the target model, and it can be further understood that each target node successfully downloads the target model.
Illustratively, assume that the set of nodes to be deployed includes three target nodes, namely node 1, node 2 and node 3. Before this step is executed, the model management server sends the obtained target model to the three target nodes, and each target node that receives the target model successfully may return a reception success message to the model management server. Assume that, after a third preset time (the third preset time is less than the first preset time), reception success messages have been received from node 1 and node 3 and a reception failure message has been received from node 2. It may then be determined that, of the three target nodes, node 2 has not yet successfully received the target model. After receiving the reception failure message sent by node 2, the model management server may resend the target model to node 2 so that node 2 can receive it again. If node 2 successfully receives the resent target model, the model management server receives a reception success message returned by node 2, at which point it may be determined that all three target nodes in the set of nodes to be deployed have successfully received the target model.
If it is determined that each of the target nodes successfully receives the target model, performing steps 210 to 211;
if it is determined that at least one of the target nodes did not successfully receive the target model, step 209 is performed.
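The decision made in step 208 can be sketched as a check over the latest processing notification message of each target node. The function name `nodes_not_succeeded` and the dictionary of reports are assumptions made for illustration; the 1/0 encoding follows the example given in the text.

```python
from typing import Dict, Iterable, List

SUCCESS, FAILURE = 1, 0  # encoding suggested in the text


def nodes_not_succeeded(targets: Iterable[str], reports: Dict[str, int]) -> List[str]:
    """Return the target nodes whose latest report is not a success.

    A node that has not reported yet is also returned, because it has not
    confirmed that it completed the target processing item.
    """
    return [node for node in targets if reports.get(node) != SUCCESS]


targets = ["node1", "node2", "node3"]
reports = {"node1": SUCCESS, "node2": FAILURE, "node3": SUCCESS}
retry = nodes_not_succeeded(targets, reports)   # node2 must receive again
reports["node2"] = SUCCESS                      # node2 reports success after the resend
all_received = not nodes_not_succeeded(targets, reports)
```

An empty result corresponds to proceeding to step 210, and a non-empty result corresponds to step 209.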
In step 209, if a reception failure message sent by at least one target node in the set of nodes to be deployed is received, the target model is sent to the at least one target node again until the reception success message sent by the at least one target node is received.
In this step, if a reception failure message (i.e., a download failure message) sent by at least one target node in the node set to be deployed is received, then to ensure deployment consistency of the models across the target nodes in the node set to be deployed, the at least one target node may be controlled to re-receive (i.e., re-download) the target model until a reception success message sent by the at least one target node is received. Specifically, the target model may be sent again to each target node that did not successfully receive it, so that the at least one target node can receive the target model again, which improves the success rate of model reception.
In an actual application scenario, one or more target nodes in the node set to be deployed may keep failing to receive the target model for a period of time because of a network or device problem, so that the model management server continuously receives reception failure messages from them. To ensure consistency of model deployment across the target nodes in this case, in a possible implementation manner, if the reception failure message sent by at least one target node is received a preset number of consecutive times, or is continuously received within a first preset time, it may be determined that model deployment of all target nodes in the node set to be deployed fails. That is, when model deployment of at least one target node in the node set to be deployed fails, the model deployment of all target nodes in the set may be regarded as failed, and each target node keeps its old model version, which ensures consistency of the model versions of the target nodes.
For example, continuing with the node set to be deployed that includes node 1, node 2 and node 3, assume that after the model management server sends the target model to the three target nodes, reception success messages returned by node 1 and node 3 are received after the third preset time elapses, but the reception failure message sent by node 2 is received 3 consecutive times (i.e., the preset number of times), or is continuously received within 10 minutes (i.e., the first preset time). This indicates that only node 1 and node 3 have successfully received the target model, that node 2 has not, and that the model deployment of node 2 has failed. To ensure deployment consistency of the models of the three target nodes, the model deployment of all three target nodes, namely node 1, node 2 and node 3, may be regarded as failed, so that node 1, node 2 and node 3 all stop deploying the target model and all keep their old model versions, thereby achieving consistency of the model versions across the target nodes in the node set to be deployed.
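The failure rule above can be sketched as a small per-node tracker. This is one possible reading and not the disclosed implementation: the class `FailureTracker` and its fields are hypothetical, and "continuously received within a first preset time" is interpreted here as failures persisting, without an intervening success, for at least the first preset time. The defaults of 3 failures and 600 seconds mirror the example values in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    max_consecutive: int = 3            # the "preset number of times"
    max_failure_seconds: float = 600.0  # the "first preset time" (10 minutes)
    consecutive: int = 0
    first_failure_at: Optional[float] = None

    def record(self, success: bool, now: float) -> bool:
        """Record one report from a node.

        Returns True if deployment of the whole node set should be failed.
        """
        if success:
            # A success ends the run of failures.
            self.consecutive = 0
            self.first_failure_at = None
            return False
        self.consecutive += 1
        if self.first_failure_at is None:
            self.first_failure_at = now
        return (
            self.consecutive >= self.max_consecutive
            or now - self.first_failure_at >= self.max_failure_seconds
        )
```

The timestamp is passed in explicitly so that the rule can be exercised without waiting; a server would typically supply a monotonic clock reading.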
It should be further noted that, in an actual model deployment scenario, model deployment cannot proceed normally when a node device on which the target model is to be deployed is down. A node that is down may therefore be deleted from the set of nodes to be deployed so that the other nodes can deploy the target model normally. Specifically, if the reception notification message sent by at least one target node has not been received when a second preset time is reached (in which case the at least one target node may be considered to be down), the at least one target node may be deleted from the set of nodes to be deployed, and the target nodes remaining in the set may be controlled to continue model deployment. The reception notification message is a reception success message or a reception failure message.
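The removal of silent nodes can be sketched as follows. The function `prune_silent_nodes` and its parameters are assumptions made for illustration; in particular, measuring the second preset time from the moment the current link started is a choice made here, since the text does not state the reference point.

```python
from typing import Dict, Set


def prune_silent_nodes(
    targets: Set[str],
    last_report_at: Dict[str, float],
    started_at: float,
    now: float,
    second_preset_time: float,
) -> Set[str]:
    """Return the target set without the nodes that have sent no notification
    (neither success nor failure) once the second preset time has elapsed."""
    if now - started_at < second_preset_time:
        return set(targets)  # too early to treat any node as down
    return {node for node in targets if node in last_report_at}
```

Note that a node that reports failure is kept in the set, because it is still alive; only nodes that report nothing at all are treated as down.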
In step 210, a model load instruction is sent to each of the target nodes.
After step 208 is executed, if it is determined that each target node in the node set to be deployed has successfully received the target model, the model management server may control all target nodes in the set to enter the next link of model deployment, namely model loading.
In step 211, it is determined whether each of the target nodes in the set of nodes to be deployed successfully loads the target model.
In this step, similar to the specific implementation manner of step 208, if the loading success message sent by each target node is received, it is determined that each target node successfully loads the target model.
If it is determined that each target node successfully loads the target model, go to step 213;
if it is determined that at least one of the target nodes did not successfully load the target model, step 212 is performed.
In step 212, if a loading failure message sent by at least one target node in the node set to be deployed is received, the model loading instruction is sent to the at least one target node again until a loading success message sent by the at least one target node is received.
Similarly, the specific implementation manner of this step is analogous to the related description of step 209. To ensure deployment consistency of the models across the target nodes in the node set to be deployed, in a possible implementation manner, if the loading failure message sent by at least one target node is received a preset number of consecutive times, or is continuously received within the first preset time, it may also be determined that model deployment of all target nodes in the node set to be deployed fails. That is, when model deployment of at least one target node in the set fails, the model deployment of all target nodes in the set may be regarded as failed, and each target node keeps its old model version, which ensures consistency of the model versions of the target nodes.
In the process of controlling each target node to load the target model, if the load notification message sent by at least one target node has not been received when the second preset time is reached, the at least one target node may likewise be considered to be down and may be deleted from the set of nodes to be deployed, so that the target nodes remaining in the set may be controlled to continue model deployment. The load notification message is a load success message or a load failure message.
In step 213, each of the target nodes is controlled to use the target model as the current application model.
In an actual model deployment scenario, two situations generally arise. In the first, another version of the target model has already been deployed in the target node, and the current model deployment task is to switch the version of the current application model of the target node to the target model. In the second, no version of the target model has been deployed in the target node. Corresponding to the first situation, in this step, if the current application model before deployment is a model with the same name as the target model but a different version (that is, another version of the target model has already been deployed in the target node), then when each target node is controlled to use the target model as the current application model, the target node may be controlled to replace the model version, that is, the version of the target model is used as the deployed version of the current application model. For example, if the target model is model 1 of version 2.0 and, before switching, the current application model of each target node is model 1 of version 1.0, then after this step is executed the current application model of each target node is uniformly switched from model 1 of version 1.0 to model 1 of version 2.0, which upgrades the model version. After the version switch, the current application models of all target nodes in the node set to be deployed are the target model, which achieves deployment consistency of the models across the target nodes in the node set to be deployed.
If no version of the target model has been deployed in the target node before deployment, then when each target node is controlled to use the target model as the current application model, the target model first needs to be deployed to the target node, after which it can be used as the current application model of the target node.
By adopting the method, when every link of model deployment succeeds on all target nodes in the node set to be deployed, all of those target nodes uniformly take the target model as the current application model. When any target node in the node set to be deployed fails at one link of model deployment, none of the target nodes in the set proceeds to the next link; that is, the model deployment of all target nodes in the set is regarded as failed, and each target node continues to use its old model version. This ensures version consistency of model deployment across the target nodes in the node set to be deployed.
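The all-or-nothing behaviour of steps 207 to 213 can be summarized in a short simulation. This is a simplified sketch and not the disclosed implementation: `run_phase` and `deploy` are hypothetical names, node behaviour is modelled by callbacks that return `True` on success, messaging between server and nodes is omitted, and only the retry-count rule (not the time-based rules) is modelled.

```python
from typing import Callable, Dict, List

# A node action returns True on success and False on failure.
Action = Callable[[str], bool]


def run_phase(nodes: List[str], action: Action, max_attempts: int = 3) -> bool:
    """Run one link on every node, retrying only the nodes that failed,
    for at most max_attempts rounds."""
    pending = list(nodes)
    for _ in range(max_attempts):
        pending = [node for node in pending if not action(node)]
        if not pending:
            return True
    return False


def deploy(nodes: List[str], receive: Action, load: Action,
           current: Dict[str, str], target: str) -> bool:
    """Switch all nodes to the target version only if every link succeeds
    on every node; otherwise leave every node on its old version."""
    for phase in (receive, load):
        if not run_phase(nodes, phase):
            return False  # no node switches; old versions stay in use
    for node in nodes:
        current[node] = target
    return True
```

For example, if loading keeps failing on one of three nodes, `deploy` returns `False` and the version recorded for all three nodes is unchanged, which is the consistency property the method aims for.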
Fig. 3 is a block diagram illustrating an apparatus for model deployment applied to a model management server according to an exemplary embodiment, and the apparatus includes:
a receiving module 301, configured to receive a model deployment request message sent by a plurality of target nodes in a node set to be deployed, where the model deployment request message is used to request to deploy a target model;
an obtaining module 302, configured to obtain the target model in response to the received model deployment request message;
a first sending module 303, configured to send the target model to each of the target nodes;
a second sending module 304, configured to send a model loading instruction to each target node after determining that each target node in the set of nodes to be deployed successfully receives the target model, so that each target node loads the target model according to the model loading instruction;
the control module 305 is configured to, after it is determined that each target node in the set of nodes to be deployed successfully loads the target model, control each target node to use the target model as a current application model.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
By adopting the device, when every link of model deployment succeeds on all target nodes in the node set to be deployed, all of those target nodes uniformly take the target model as the current application model. When any target node in the node set to be deployed fails at one link of model deployment, none of the target nodes in the set proceeds to the next link; that is, the model deployment of all target nodes in the set is regarded as failed, and each target node continues to use its old model version. This ensures version consistency of model deployment across the target nodes in the node set to be deployed.
Fig. 4 is a block diagram illustrating an electronic device 400 according to an example embodiment. For example, the electronic device 400 may be provided as a server. Referring to fig. 4, the electronic device 400 comprises a processor 422, which may be one or more in number, and a memory 432 for storing computer programs executable by the processor 422. The computer program stored in memory 432 may include one or more modules that each correspond to a set of instructions. Further, the processor 422 may be configured to execute the computer program to perform the model deployment method described above.
Additionally, the electronic device 400 may also include a power component 426 and a communication component 450. The power component 426 may be configured to perform power management of the electronic device 400, and the communication component 450 may be configured to enable communication, e.g., wired or wireless communication, of the electronic device 400. The electronic device 400 may also include input/output (I/O) interfaces 458. The electronic device 400 may operate based on an operating system stored in the memory 432, such as Windows Server, Mac OS X™, Unix™, Linux™, and the like.
In another exemplary embodiment, a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the model deployment method described above is also provided. For example, the computer readable storage medium may be the memory 432 described above that includes program instructions executable by the processor 422 of the electronic device 400 to perform the model deployment method described above.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the model deployment method described above when executed by the programmable apparatus.
The preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that, in the foregoing embodiments, various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various combinations that are possible in the present disclosure are not described again.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.

Claims (9)

1. A method for model deployment, applied to a model management server, the method comprising:
receiving model deployment request messages sent by a plurality of target nodes in a node set to be deployed, wherein the model deployment request messages are used for requesting deployment of a target model, and the target model comprises a decision tree model and a neural network model;
responding to the received model deployment request message, and acquiring the target model;
sending the target model to each of the target nodes;
after each target node in the node set to be deployed is determined to successfully receive the target model, sending a model loading instruction to each target node so that each target node loads the target model according to the model loading instruction;
after each target node in the node set to be deployed is determined to be loaded with the target model successfully, each target node is controlled to take the target model as a current application model;
before receiving a model deployment request message sent by a plurality of target nodes in a node set to be deployed, the method further includes:
determining whether the target model to be deployed exists;
when the target model is determined to exist, determining a plurality of target nodes forming the node set to be deployed from model deployment nodes;
and sending model deployment information to the target nodes, wherein the model deployment information comprises identification information of the target model, so that the target nodes can send the model deployment request message to the model management server according to the identification information.
2. The method of claim 1, further comprising:
if a processing success message of a target processing item sent by each target node is received, determining that each target node successfully completes the target processing item, wherein the target processing item comprises receiving the target model or loading the target model; or,
if a processing failure message of the target processing item sent by at least one target node is received, controlling the at least one target node to execute the target processing item again until a processing success message of the target processing item sent by the at least one target node is received.
3. The method of claim 2, wherein if the target processing item is receiving the target model, the controlling at least one of the target nodes to execute the target processing item again comprises: resending the target model to the at least one target node;
if the target processing item is loading the target model, the controlling at least one of the target nodes to execute the target processing item again comprises: resending the model loading instruction to the at least one target node.
4. The method of claim 2, further comprising:
if the processing failure message sent by at least one target node is received a preset number of consecutive times, determining that model deployment of all target nodes in the node set to be deployed fails; or,
if the processing failure message sent by at least one target node is continuously received within a first preset time, determining that model deployment of all target nodes in the node set to be deployed fails; or,
if a processing notification message sent by at least one target node has not been received when a second preset time is reached, deleting the at least one target node from the node set to be deployed, wherein the processing notification message comprises the processing success message and the processing failure message.
5. The method of claim 1, wherein the determining whether the target model exists to be deployed comprises:
determining whether a newly added model currently exists in a preset file directory, and determining that the target model exists when the newly added model currently exists in the preset file directory; or,
determining whether a user-selected model exists, and when it is determined that the user-selected model exists, determining that the target model exists.
6. The method of claim 1, wherein the determining the plurality of target nodes from the model deployment nodes that compose the set of nodes to be deployed comprises:
acquiring a current application model of each model deployment node;
and determining the model deployment nodes of which the current application models are inconsistent with the target models as the target nodes forming the node set to be deployed.
7. An apparatus for model deployment, applied to a model management server, the apparatus comprising:
a receiving module, configured to receive a model deployment request message sent by a plurality of target nodes in a node set to be deployed, wherein the model deployment request message is used for requesting deployment of a target model, and the target model comprises a decision tree model and a neural network model;
an obtaining module, configured to obtain the target model in response to the received model deployment request message;
a first sending module, configured to send the target model to each of the target nodes;
a second sending module, configured to send a model loading instruction to each target node after it is determined that each target node in the to-be-deployed node set successfully receives the target model, so that each target node loads the target model according to the model loading instruction;
the control module is used for controlling each target node in the node set to be deployed to take the target model as a current application model after determining that each target node is successfully loaded with the target model;
the apparatus is further configured to:
before receiving the model deployment request messages sent by the plurality of target nodes in the node set to be deployed, determine whether the target model to be deployed exists;
when the target model is determined to exist, determine, from the model deployment nodes, the plurality of target nodes forming the node set to be deployed;
and send model deployment information to the target nodes, wherein the model deployment information comprises identification information of the target model, so that the target nodes can send the model deployment request message to the model management server according to the identification information.
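The phased flow recited in claim 7 can be sketched in memory as follows. This is an illustration under stated assumptions, not the patented implementation: Node, deploy and model_store are hypothetical names, network messages are replaced by direct method calls, and the request messages that the target nodes send back to the model management server are folded into the server-side loop.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a model deployment node."""
    node_id: str
    current_model: str
    received: dict = field(default_factory=dict)  # model_id -> model payload
    loaded: set = field(default_factory=set)      # model_ids loaded into memory

    def receive(self, model_id, payload):
        self.received[model_id] = payload
        return True

    def load(self, model_id):
        if model_id not in self.received:
            return False  # a model that was never received cannot be loaded
        self.loaded.add(model_id)
        return True

    def switch(self, model_id):
        self.current_model = model_id


def deploy(model_store, nodes, model_id):
    """Switch all target nodes to model_id only if every phase succeeds on all of them."""
    if model_id not in model_store:  # pre-check: does the target model exist?
        return False
    targets = [n for n in nodes if n.current_model != model_id]
    payload = model_store[model_id]
    # Phase 1: send the model to every target node; continue only if all received it.
    if not all([n.receive(model_id, payload) for n in targets]):
        return False
    # Phase 2: send the load instruction; continue only if all loaded it.
    if not all([n.load(model_id) for n in targets]):
        return False
    # Phase 3: every target node takes the target model as its current model.
    for n in targets:
        n.switch(model_id)
    return True


store = {"model-v2": b"weights"}
cluster = [Node("node-a", "model-v1"), Node("node-b", "model-v1"), Node("node-c", "model-v2")]
ok = deploy(store, cluster, "model-v2")
current = [n.current_model for n in cluster]
missing = deploy(store, cluster, "model-v3")
```

Gating each phase on the success of every target node is what keeps the nodes from applying different models at the same time: no node switches until all of them have both received and loaded the target model.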
8. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
9. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910507186.2A CN110297640B (en) 2019-06-12 2019-06-12 Model deployment method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910507186.2A CN110297640B (en) 2019-06-12 2019-06-12 Model deployment method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN110297640A CN110297640A (en) 2019-10-01
CN110297640B CN110297640B (en) 2020-10-16

Family

ID=68027915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910507186.2A Active CN110297640B (en) 2019-06-12 2019-06-12 Model deployment method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN110297640B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023004806A1 (en) * 2021-07-30 2023-02-02 西门子股份公司 Device deployment method for ai model, system, and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210117859A1 (en) * 2019-10-20 2021-04-22 Nvidia Corporation Live updating of machine learning models
CN112860286A (en) * 2019-11-27 2021-05-28 佛山市云米电器科技有限公司 Application model updating method, intelligent household equipment, system and storage medium
CN111026436B (en) * 2019-12-09 2021-04-02 支付宝(杭州)信息技术有限公司 Model joint training method and device
CN111240698A (en) * 2020-01-14 2020-06-05 北京三快在线科技有限公司 Model deployment method and device, storage medium and electronic equipment
CN111966382A (en) * 2020-08-28 2020-11-20 上海寻梦信息技术有限公司 Online deployment method and device of machine learning model and related equipment
CN115250485A (en) * 2021-04-27 2022-10-28 华为技术有限公司 Method and device for distributing models

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103858122A (en) * 2011-08-03 2014-06-11 艾玛迪斯简易股份公司 Method and system to maintain strong consistency of distributed replicated contents in a client/server system
CN105635216A (en) * 2014-11-03 2016-06-01 华为软件技术有限公司 Distributed application upgrade method, device and distributed system
CN106855819A (en) * 2017-01-01 2017-06-16 国云科技股份有限公司 A kind of method of automatic deployment operating system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729147B (en) * 2014-03-06 2021-09-21 华为技术有限公司 Data processing method in stream computing system, control node and stream computing system

Also Published As

Publication number Publication date
CN110297640A (en) 2019-10-01

Similar Documents

Publication Publication Date Title
CN110297640B (en) Model deployment method and device, storage medium and electronic equipment
CN107357571B (en) Maintenance method and system for equipment component program
CN107547245B (en) Version upgrading method and device
KR20120066116A (en) Web service information processing method and web service compositing method and apparatus using the same
US9934018B2 (en) Artifact deployment
CN106980565B (en) Upgrading process monitoring method and device
JP7345921B2 (en) OTA differential update method and system for master-slave architecture
US11223522B1 (en) Context-based intelligent re-initiation of microservices
US10216593B2 (en) Distributed processing system for use in application migration
CN112702195A (en) Gateway configuration method, electronic device and computer readable storage medium
CN113419818B (en) Basic component deployment method, device, server and storage medium
CN111158751A (en) Windows environment deployment method, electronic equipment and storage medium
CN112039710B (en) Service fault processing method, terminal equipment and readable storage medium
CN111240698A (en) Model deployment method and device, storage medium and electronic equipment
CN114168252A (en) Information processing system and method, network scheme recommendation component and method
CN112395072A (en) Model deployment method and device, storage medium and electronic equipment
CN105872106A (en) Over-the-air upgrade method, over-the-air server and terminal
CN111381932B (en) Method, device, electronic equipment and storage medium for triggering application program change
CN111708558B (en) High concurrency terminal firmware updating method and updating system
CN112148420B (en) Abnormal task processing method based on container technology, server and cloud platform
CN110874238B (en) Online service updating method and device
CN113010266A (en) Cloud service restarting method and device
CN116204197A (en) Plug-in deployment method and device, storage medium and electronic equipment
CN105009097A (en) Message transmission device, message transmission method, and message transmission program
CN116107603B (en) Firmware upgrading method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant