CN114879953A - Model deployment method, device, equipment and medium

Model deployment method, device, equipment and medium

Info

Publication number
CN114879953A
Authority
CN
China
Prior art keywords
model; service; identifier; identification; model service
Prior art date
Legal status (the legal status is an assumption and is not a legal conclusion)
Pending
Application number
CN202210498334.0A
Other languages
Chinese (zh)
Inventor
卢凌云
张晨
王全礼
李昱
Current Assignee (the listed assignees may be inaccurate)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202210498334.0A
Publication of CN114879953A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/30: Creation or generation of source code
    • G06F 8/35: Creation or generation of source code, model driven
    • G06F 8/60: Software deployment
    • G06F 8/61: Installation
    • G06F 8/63: Image based installation; cloning; build to order
    • G06F 8/70: Software maintenance or management
    • G06F 8/71: Version control; configuration management
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Abstract

The present application relates to the field of data processing technologies, and in particular to a model deployment method, apparatus, device, and medium. The method solves the problems in the prior art that deploying a model requires a professional and is constrained by the production release window. In the embodiment of the application, the correspondence between model identifiers and algorithms is stored in the electronic device, so that after the first identifier of a model is received, the first configuration algorithm corresponding to the model with the first identifier can be determined and the model deployed with that algorithm. Because deployment no longer takes place on the application device, it is not constrained by the release window, and a professional is not required to write deployment code for each model before it goes live, which facilitates the application of the model.

Description

Model deployment method, device, equipment and medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a model deployment method, apparatus, device, and medium.
Background
With the growing and deepening application of deep learning in the field of artificial intelligence, emerging technologies represented by computer vision and natural language processing have spread into various industries. The faster a model can be deployed, the sooner it can provide benefits for an enterprise.
However, most models used by enterprises today are deployed during production release windows: a professional deploys the models onto the application devices that need them at a fixed time, for example on the first day of each month. To deploy a model, the professional writes code that supports running the model on the enterprise's application device. Deployment therefore requires a professional and is constrained by the release window, which is not conducive to the enterprise's application of the model.
Disclosure of Invention
The embodiments of the application provide a model deployment method, apparatus, device, and medium, which are used to solve the problems in the prior art that deploying a model requires a professional and is constrained by the production release window.
In a first aspect, an embodiment of the present application provides a model deployment method, where the method includes:
receiving a trained model and a first identification of the model;
determining a first configuration algorithm corresponding to the first identifier according to a pre-stored correspondence between model identifiers and algorithms, and deploying the model through the first configuration algorithm;
and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
Further, the method further comprises:
receiving an application request sent by application equipment, wherein the application request carries a second identifier of a model to be used;
and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
Further, the method further comprises:
if a combined service generation request is received, acquiring the inference graph carried in the combined service generation request, the third identifier of a model to be combined recorded in each node of the inference graph, and the fourth identifier corresponding to the combined service;
determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service;
combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service;
and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
Further, after determining the first configuration algorithm corresponding to the first identifier and before deploying the model through the first configuration algorithm, the method further includes:
building an image of the first configuration algorithm;
the deploying the model through the first configuration algorithm includes:
running the image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
Further, the method further comprises:
and if the fourth model service corresponding to the fifth identifier currently meets a preset deactivation condition, deactivating the fourth model service and releasing resources occupied by the fourth model service.
Further, the fourth model service corresponding to the fifth identifier currently meeting the preset deactivation condition includes:
the number of received application requests carrying the fifth identifier of the model to be used reaching a set number; or
the current moment being the preset deactivation moment of the fourth model service corresponding to the fifth identifier; or
the time length for which the fourth model service corresponding to the fifth identifier has been deployed reaching a first preset time length.
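The deactivation check reduces to a disjunction of the three conditions. A minimal sketch follows; the function name, parameter names, and the idea of passing the request counter and times as plain numbers are illustrative assumptions, not part of the patent:

```python
def meets_deactivation_condition(request_count, set_number,
                                 now, deactivation_moment,
                                 deployed_length, first_preset_length):
    """True if the fourth model service corresponding to the fifth identifier should
    be deactivated: enough application requests carrying the fifth identifier have
    been received, the preset deactivation moment has arrived, or the deployed time
    length has reached the first preset time length."""
    return (request_count >= set_number
            or now >= deactivation_moment
            or deployed_length >= first_preset_length)
```

When the check returns true, the service is deactivated and its resources released; the symmetric redeployment conditions later in the text would be checked the same way.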
Further, the method further comprises:
if the fourth model service corresponding to the deactivated fifth identifier currently meets the preset deployment condition, determining a second configuration algorithm corresponding to the fifth identifier according to the corresponding relation between the identifier of the model stored in advance and the algorithm, and deploying the model of the fifth identifier through the second configuration algorithm;
and generating a fourth model service corresponding to the deployed fifth identified model.
Further, the fourth model service corresponding to the deactivated fifth identifier currently meeting the preset deployment condition includes:
the current moment being the preset deployment moment of the fourth model service corresponding to the fifth identifier; or
the time length for which the fourth model service corresponding to the fifth identifier has been deactivated reaching a second preset time length.
Further, after the generating of the first model service corresponding to the deployed model and before the saving of the correspondence between the first identifier and the first model service, the method further includes:
judging whether the corresponding relation between the stored model identification and the model service contains the first identification or not;
the saving the correspondence between the first identifier and the first model service includes:
if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service;
and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
In a second aspect, an embodiment of the present application further provides a model deployment apparatus, where the apparatus includes:
the receiving module is used for receiving the trained model and the first identification of the model;
the processing module is used for determining a first configuration algorithm corresponding to the first identifier according to the corresponding relation between the identifier of the model and the algorithm which are stored in advance, and deploying the model through the first configuration algorithm; and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
Further, the processing module is further configured to receive an application request sent by an application device, where the application request carries a second identifier of a model to be used; and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
Further, the processing module is further configured to, if a combined service generation request is received, obtain an inference graph carried in the combined service generation request, a third identifier of a model to be combined recorded in each node in the inference graph, and a fourth identifier corresponding to the combined service; determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service; combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service; and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
Further, the processing module is further configured to build an image of the first configuration algorithm; and run the image to generate a corresponding container, and deploy the model in the container through the first configuration algorithm.
Further, the processing module is further configured to, if a fourth model service corresponding to the fifth identifier currently meets a preset disabling condition, disable the fourth model service, and release resources occupied by the fourth model service.
Further, the processing module is further configured to determine, according to a correspondence between identifiers of pre-stored models and algorithms, a second configuration algorithm corresponding to the fifth identifier if a fourth model service corresponding to the deactivated fifth identifier currently meets a preset deployment condition, and deploy the model of the fifth identifier through the second configuration algorithm; and generating a fourth model service corresponding to the deployed fifth identified model.
Further, the processing module is further configured to determine whether the stored correspondence between the identifier of the model and the model service includes the first identifier; if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service; and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes at least a processor and a memory, and the processor is configured to implement the steps of the model deployment method according to any one of the above when executing the computer program stored in the memory.
In a fourth aspect, the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the model deployment method according to any one of the above.
In a fifth aspect, an embodiment of the present application further provides a computer program product, where the computer program product includes: computer program code for causing a computer to perform the steps of the model deployment method as described in any one of the above when said computer program code is run on a computer.
In the embodiment of the application, the electronic device receives a trained model and the first identifier of the model, determines the first configuration algorithm corresponding to the model with the first identifier according to the pre-stored correspondence between model identifiers and algorithms, deploys the model through the first configuration algorithm, generates a first model service corresponding to the deployed model, and saves the correspondence between the first identifier of the model and the first model service. Because the correspondence between model identifiers and algorithms is stored in the electronic device, after the first identifier of a model is received, the first configuration algorithm corresponding to that model can be determined and the model deployed with it. Deployment therefore no longer takes place on the application device and is not constrained by the release window, and a professional is not required to write deployment code for each model before it goes live, which facilitates the application of the model.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present application; other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic diagram of a model deployment process provided in an embodiment of the present application;
fig. 2 is a schematic diagram of an inference graph provided in an embodiment of the present application;
FIG. 3 is a schematic process diagram of a service for generating a model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a process for generating a composite service according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a model deployment apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will now be described in further detail with reference to the accompanying drawings, wherein like reference numerals refer to like elements throughout. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
In the embodiment of the application, the electronic device receives a trained model and the first identifier of the model, determines the first configuration algorithm corresponding to the model with the first identifier according to the pre-stored correspondence between model identifiers and algorithms, deploys the model through the first configuration algorithm, generates a first model service corresponding to the deployed model, and saves the correspondence between the first identifier of the model and the first model service.
In order to complete deployment of a model quickly and conveniently, embodiments of the present application provide a model deployment method, apparatus, device, and medium.
Example 1:
fig. 1 is a schematic diagram of a model deployment process provided in an embodiment of the present application, where the process includes the following steps:
s101: a trained model and a first identification of the model are received.
The model deployment method provided by the embodiment of the application is applied to an electronic device that has a connection with the application device; the electronic device may be an intelligent device such as a PC (personal computer) or a server.
In the embodiment of the application, when a user needs to deploy a model, the trained model and the first identifier of the model can be sent to the electronic device through the device used by the user. The user may select the trained model and the first identifier of the model on a preset page of that device and click a preset button, for example a "submit" button, so that the electronic device receives the trained model and its first identifier. Specifically, the device used by the user transmits the code of the model to the electronic device. The device used by the user has a connection with the electronic device.
In addition, in this embodiment of the application, the electronic device may also send a model acquisition request to a preset device at a preset time interval. After receiving the request, if the preset device holds a trained model that has not yet been sent to the electronic device, it sends that model and its first identifier to the electronic device.
S102: determining a first configuration algorithm corresponding to the model of the first identifier according to the corresponding relation between the identifier of the model and the algorithm which is stored in advance, deploying the model through the first configuration algorithm, generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
In the embodiment of the application, in order to implement deployment of a model, the correspondence between model identifiers and algorithms is pre-stored in the electronic device. After the electronic device receives the pre-trained model and the first identifier of the model, it looks up the pre-stored correspondence, determines that the algorithm corresponding to the first identifier is the first configuration algorithm, and then deploys the model through the first configuration algorithm.
In the embodiment of the present application, the correspondence between the identifier of a model and its algorithm is one-to-one and is of the form:

ALG_i → MODEL_i

where MODEL_i is the identifier of a model and ALG_i is the algorithm corresponding to the model with that identifier.
In the embodiment of the present application, the algorithm corresponding to an identifier is configured by a professional in advance, and the professional may configure a preset number of algorithms, for example ALG_1, ALG_2, ALG_3, ..., ALG_i, ..., ALG_n.
In this embodiment, an algorithm includes the code corresponding to the algorithm and the resources depended on when a model is deployed through the algorithm, where the resources include the number of central processing unit (CPU) cores, the size of CPU memory, the size of graphics processing unit (GPU) video memory, and the like.
In the embodiment of the present application, the information included in one algorithm is as follows:

ALG_i = {Code_i, Resource_i}
Code_i = "/home/ap/text_cnn"
Resource_i = {CpuCore: 1, CpuMemory: 2G, GpuMemory: 2G}

where ALG_i refers to the configuration algorithm, Code_i indicates the location where the code corresponding to the configuration algorithm is stored, and Resource_i represents the resources depended on when deploying the model. "/home/ap/text_cnn" is the specific location where the code corresponding to configuration algorithm ALG_i is stored; CpuCore: 1 indicates that the number of CPU cores depended on is 1; CpuMemory: 2G indicates that the size of the CPU memory depended on is 2 GB; GpuMemory: 2G indicates that the size of the GPU video memory depended on is 2 GB.
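The configuration record above can be sketched as a small data structure. The field names mirror Code_i and Resource_i from the text; the class name and field spellings are illustrative assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class AlgorithmConfig:
    """Configuration algorithm ALG_i: code location plus dependent resources."""
    code_path: str      # Code_i: where the algorithm's code is stored
    cpu_cores: int      # CpuCore: number of CPU cores depended on
    cpu_memory_gb: int  # CpuMemory: CPU memory depended on, in GB
    gpu_memory_gb: int  # GpuMemory: GPU video memory depended on, in GB

# The example values given in the text:
alg_i = AlgorithmConfig(
    code_path="/home/ap/text_cnn",
    cpu_cores=1,
    cpu_memory_gb=2,
    gpu_memory_gb=2,
)
```

A registry of such records, keyed by model identifier, would realize the pre-stored correspondence between identifiers and algorithms.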
In this embodiment of the application, after the electronic device deploys a model through the first configuration algorithm, it may generate a first model service corresponding to the deployed model. The first model service is a runnable model, that is, a model with the capability of processing data, and it includes a link for receiving data to be processed. After an application device acquires the first model service, it inputs the data to be processed at the link in the first model service, and the first model service performs the corresponding processing on the data.
After the electronic device generates the first model service corresponding to the model, it saves the correspondence between the first identifier and the first model service. Because the algorithm corresponding to each identifier is configured in advance, a professional does not need to write deployment code each time, and an ordinary user can deploy a model. In addition, the electronic device is not the application device that applies the model, so it is not constrained by the release window. Because the electronic device is connected to the application device, whenever the application device needs to obtain a model service, the electronic device can send the corresponding model service to it; from the application device's perspective, the corresponding model service can be received at any time, unconstrained by the release window.
In the embodiment of the application, the correspondence between model identifiers and algorithms is stored in the electronic device, so that after the first identifier of a model is received, the first configuration algorithm corresponding to the model with the first identifier can be determined and the model deployed with that algorithm. Because deployment no longer takes place on the application device, it is not constrained by the release window, and a professional is not required to write deployment code for each model before it goes live, which facilitates the application of the model.
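Steps S101 and S102 can be sketched as follows. The dictionary names and the string stand-in for an actual deployed service are illustrative assumptions; in practice the lookup would yield a configuration record like ALG_i and the deployment step would run its code with its resources:

```python
# Pre-stored correspondence between model identifiers and configuration algorithms.
ID_TO_ALGORITHM = {
    "MODEL_1": "ALG_1",
    "MODEL_2": "ALG_2",
}

# Correspondence between model identifiers and generated model services.
ID_TO_SERVICE = {}

def deploy_model(first_identifier, trained_model):
    """S101: receive the trained model and its first identifier.
    S102: determine the first configuration algorithm, deploy the model through it,
    generate the first model service, and save the identifier -> service mapping."""
    first_algorithm = ID_TO_ALGORITHM[first_identifier]
    first_model_service = f"service({first_algorithm}, {trained_model})"
    ID_TO_SERVICE[first_identifier] = first_model_service
    return first_model_service

svc = deploy_model("MODEL_1", "text_cnn_v1")
```

Because only the identifier-to-algorithm table is consulted, no per-model deployment code has to be written at go-live time, which is the point of the scheme.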
Example 2:
in order to implement the application of the model, on the basis of the foregoing embodiment, in an embodiment of the present application, the method further includes:
receiving an application request sent by application equipment, wherein the application request carries a second identifier of a model to be used;
and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
In this embodiment of the application, in order to implement the application of a model, the electronic device may receive an application request sent by an application device. An application device is a device in an enterprise that needs to apply the model. When an employee of the enterprise needs to apply a model, the employee may select the second identifier of the model to be used through the application device and click a preset button; the electronic device then receives the application request sent by the application device, which carries the second identifier of the model the application device wants to obtain, that is, the second identifier of the model to be used.
After receiving the application request sent by the application device, the electronic device may obtain the second identifier of the model to be used carried in the request, look up the model service corresponding to the second identifier in the pre-stored correspondence between identifiers and model services, and determine that this model service is the second model service. Once the second model service is determined, the electronic device may send it to the application device. After receiving the second model service, the application device can apply the model directly through it: the second model service is a model capable of processing data and includes a link for the data to be processed; when that data is input at the link, the second model service processes it accordingly. In this embodiment, the electronic device may provide an application program interface (API) and send the second model service to the application device through the API.
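The request handling described above reduces to a lookup in the stored correspondence; the names below are illustrative assumptions, and the string value stands in for an actual model service:

```python
# Stored correspondence between model identifiers and model services.
ID_TO_SERVICE = {"MODEL_2": "second_model_service"}

def handle_application_request(second_identifier):
    """Return the model service corresponding to the second identifier carried in
    the application request, or None if no service is stored under that identifier."""
    return ID_TO_SERVICE.get(second_identifier)

result = handle_application_request("MODEL_2")
```

In a real system the returned service would be delivered to the application device through the API mentioned above.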
Example 3:
in order to generate a model service suitable for a complex application scenario, on the basis of the foregoing embodiments, in an embodiment of the present application, the method further includes:
if a combined service generation request is received, acquiring an inference graph carried in the combined service generation request, a third identifier of a model to be combined and a fourth identifier corresponding to the combined service, wherein the third identifier is recorded in each node in the inference graph;
determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifiers and the model services;
combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service;
and adding the corresponding relation between the fourth identifier and the combined service in the stored corresponding relation between the identifiers and the model service.
In the embodiment of the application, the model service corresponding to a single trained model is only applicable to simpler scenarios, for example only identifying the objects present in an image. Some complex scenarios cannot be handled by one model service alone: for example, in a natural language processing scenario, the case of the text content is first normalized, and then the category corresponding to the text content is obtained, where the categories include entertainment, economy, and the like. In order to support more complex application scenarios, the embodiments of the present application provide a method for generating a combined service.
In the embodiment of the application, when a combined service needs to be generated, a professional may input the inference graph through a preset page of the device the professional uses. Specifically, the professional first selects the nodes included in the inference graph, then selects arrows to establish the connections between the selected nodes so as to obtain the inference graph; the third identifier of a model to be combined may be input in each selected node, and the fourth identifier corresponding to the combined service may also be input. After the professional clicks a preset button, the electronic device receives the combined service generation request. The inference graph comprises a plurality of connected nodes, and each node records the third identifier of a model to be combined. The electronic device may be the device used by the professional, or a device connected to it. The inference graph is a directed acyclic graph.
After receiving the combined service generation request, the electronic device may first obtain the inference graph carried in the request, the third identifier of the model to be combined recorded in each node of the inference graph, and the fourth identifier corresponding to the combined service. It may then determine the third model service corresponding to each third identifier according to the pre-stored correspondence between identifiers and model services, and combine the third model services corresponding to the nodes of the inference graph according to the connections between the nodes, thereby obtaining the combined service corresponding to the inference graph.
Specifically, if the nodes in the inference graph are connected in sequence, and there is no case where a plurality of nodes are connected to one node or one node is connected to a plurality of nodes, that is, each node points to at most one other node, then when the electronic device combines the third model services corresponding to the nodes according to the connection relationships between the nodes in the inference graph, the output of the third model service corresponding to a preceding node is used as the input of the third model service corresponding to the node it points to. For example, if node A points to node B, the output of the third model service corresponding to node A is used as the input of the third model service corresponding to node B.
If there is a case in the inference graph where a plurality of nodes are connected to one node, that is, at least two nodes point to one node, the electronic device, according to the corresponding connection relationships in the inference graph, that is, the pointing relationships of the nodes, takes the outputs of the third model services corresponding to the at least two nodes as the inputs of the third model service corresponding to the node they point to. For example, if node C points to node D and node E also points to node D, the outputs of the third model services corresponding to node C and node E are used as the inputs of the third model service corresponding to node D.

If there is a case in the inference graph where one node is connected to a plurality of nodes, that is, one node points to at least two nodes, the electronic device, according to the corresponding connection relationships in the inference graph, takes the output of the third model service corresponding to that node as the input of the third model service corresponding to each of the at least two nodes. For example, if node F points to node G and node F also points to node H, the output of the third model service corresponding to node F is used as the input of the third model services corresponding to node G and node H, respectively.
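The three connection cases above amount to a single topological traversal of the inference graph. The following is a minimal illustrative sketch, not the patent's implementation; the node names, the choice to represent services as plain functions, and the convention of passing multiple upstream outputs as a list are all assumptions.

```python
from collections import defaultdict, deque

def combine_services(edges, services):
    """Combine per-node model services into one callable that follows the
    inference graph (a DAG). `edges` is a list of (src, dst) node pairs;
    `services` maps each node to a function. A node with several incoming
    edges receives a list of upstream outputs; a node with several outgoing
    edges fans its single output out to each successor.
    """
    succ, indeg = defaultdict(list), defaultdict(int)
    nodes = set(services)
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1

    def combined(data):
        inputs = defaultdict(list)
        remaining = dict(indeg)
        # Start from source nodes (no predecessors); they receive the raw data.
        queue = deque(n for n in nodes if remaining.get(n, 0) == 0)
        outputs = {}
        while queue:
            node = queue.popleft()
            args = inputs[node] if inputs[node] else [data]
            # One upstream output is passed directly; several are passed as a list.
            arg = args[0] if len(args) == 1 else args
            outputs[node] = services[node](arg)
            for nxt in succ[node]:
                inputs[nxt].append(outputs[node])
                remaining[nxt] -= 1
                if remaining[nxt] == 0:
                    queue.append(nxt)
        return outputs
    return combined
```

For example, with edges `[("A", "B")]` the output of node A's service feeds node B's service, matching the one-to-one case described above.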
In order to facilitate obtaining the corresponding combined service, after generating the combined service, the electronic device adds the correspondence between the fourth identifier and the combined service to the stored correspondence between identifiers and model services. In the embodiment of the application, the electronic device may thus generate the corresponding combined service according to the inference graph carried in the combined service generation request and the third identifier of the model to be combined recorded in each node of the inference graph. The combined service is a runnable model that includes a link for receiving data to be processed: after the application device acquires the combined service, it inputs the data to be processed at the link in the combined service, and the combined service processes the data accordingly. Since the combined service comprises a plurality of model services, it can meet complex application scenarios.
In the embodiment of the present application, in order to make the obtained processing result more accurate, the data may be preprocessed before being processed by the model service, and may be further reprocessed after being processed by the model service. Specifically, the preprocessing may be performed by a data processing service; a correspondence between identifiers of data processing and data processing services may be pre-stored in the electronic device, and an identifier corresponding to a data processing method may likewise be recorded in a node of the inference graph. The preprocessing may be, for example, word segmentation, stop-word removal, case conversion, specific-entity replacement, or simplified/traditional Chinese conversion in natural language processing; the data reprocessing may be, for example, data field merging, data splicing, or data weighted averaging. Thus, in the embodiment of the present application, each node of the inference graph records an identifier, which may be either an identifier of a data processing or an identifier of a model.
Specifically, when generating the combined service, the electronic device acquires the service corresponding to the identifier recorded in each node and combines these services according to the connection relationships between the nodes in the inference graph. If an identifier of a data processing to be combined is recorded in a node, the corresponding data processing service is acquired; if an identifier of a model to be combined is recorded in a node, the corresponding model service is acquired. The electronic device then combines, according to the connection relationships between the nodes in the inference graph, the data processing service or model service corresponding to the identifier recorded in each node, thereby further meeting complex application scenarios.
In the embodiment of the application, the method can meet complex application scenarios: for example, it can first remove adjectives from the text content, then call two different model services to determine the category corresponding to the text content, and finally take a weighted average of the categories determined by the two different model services, thereby accurately determining the category corresponding to the text content.
Taking data weighted averaging as an example of a data reprocessing algorithm, assume that the classification results and corresponding probabilities obtained by two different classification model services are respectively:
m_SERVICE_m = {label_1: score_m1, label_2: score_m2}

m_SERVICE_n = {label_1: score_n1, label_2: score_n2}

where m_SERVICE_m is one of the two classification model services and m_SERVICE_n is the other; label_1: score_m1 indicates that the classification result determined by classification model service m_SERVICE_m is label_1 with probability score_m1, and label_2: score_m2 indicates that the classification result determined by m_SERVICE_m is label_2 with probability score_m2. Likewise, label_1: score_n1 and label_2: score_n2 indicate that the classification result determined by classification model service m_SERVICE_n is label_1 with probability score_n1 and label_2 with probability score_n2, respectively.
The final result is:

result = {label_1: w_m * score_m1 + w_n * score_n1, label_2: w_m * score_m2 + w_n * score_n2}

where result denotes the final result obtained; w_m denotes the weight corresponding to the result of classification model service m_SERVICE_m, and w_n denotes the weight corresponding to the result of classification model service m_SERVICE_n; the weight values may be recorded in the corresponding nodes of the inference graph. label_1 denotes one classification result and label_2 denotes the other. w_m * score_m1 + w_n * score_n1 is the probability that the classification result is label_1, where score_m1 and score_n1 are the probabilities with which m_SERVICE_m and m_SERVICE_n, respectively, determine the classification result to be label_1; w_m * score_m2 + w_n * score_n2 is the probability that the classification result is label_2, where score_m2 and score_n2 are the probabilities with which m_SERVICE_m and m_SERVICE_n, respectively, determine the classification result to be label_2.
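The weighted-average reprocessing step can be sketched as follows; the function name and example label values are illustrative assumptions, not from the patent.

```python
def weighted_average(results_m, results_n, w_m=0.5, w_n=0.5):
    """Merge two classifiers' label->probability dicts with weights w_m, w_n.
    Each combined probability is w_m * score_m + w_n * score_n, as in the
    formula above; a label missing from one service contributes 0.
    """
    labels = set(results_m) | set(results_n)
    return {lab: w_m * results_m.get(lab, 0.0) + w_n * results_n.get(lab, 0.0)
            for lab in labels}
```

With equal weights, two services reporting {label_1: 0.8, label_2: 0.2} and {label_1: 0.6, label_2: 0.4} combine to {label_1: 0.7, label_2: 0.3}.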
Table 1 gives examples of inference-graph flows provided in an embodiment of the present application.

    Inference graph | Preprocessing | Model                                    | Data reprocessing
    g_SERVICE_1     | p_SERVICE_1   | m_SERVICE_1                              | p_SERVICE_2
    g_SERVICE_2     | p_SERVICE_1   | m_SERVICE_1 -> m_SERVICE_2               | p_SERVICE_2
    g_SERVICE_3     | p_SERVICE_1   | m_SERVICE_1 and m_SERVICE_2 (in parallel)| p_SERVICE_2
    g_SERVICE_4     | p_SERVICE_1   | m_SERVICE_1 -> p_SERVICE_2 -> m_SERVICE_3| p_SERVICE_3

TABLE 1

In Table 1, the "inference graph" column contains the identifier of the combined service carried in the combined service generation request; the "preprocessing" column contains the identifier of the data processing performed before the model service processes the data; the "model" column contains the identifiers corresponding to the model services; and the "data reprocessing" column contains the identifier of the data processing performed after the model service processes the data. The combined service identified by g_SERVICE_1 in the second row takes the output of p_SERVICE_1 as the input of m_SERVICE_1, and the output of m_SERVICE_1 as the input of p_SERVICE_2. The combined service g_SERVICE_2 in the third row takes the output of p_SERVICE_1 as the input of m_SERVICE_1, the output of m_SERVICE_1 as the input of m_SERVICE_2, and the output of m_SERVICE_2 as the input of p_SERVICE_2. The combined service g_SERVICE_3 in the fourth row takes the output of p_SERVICE_1 as the input of both m_SERVICE_1 and m_SERVICE_2, and the outputs of m_SERVICE_1 and m_SERVICE_2 as the inputs of p_SERVICE_2. The combined service g_SERVICE_4 in the fifth row takes the output of p_SERVICE_1 as the input of m_SERVICE_1, the output of m_SERVICE_1 as the input of p_SERVICE_2, the output of p_SERVICE_2 as the input of m_SERVICE_3, and the output of m_SERVICE_3 as the input of p_SERVICE_3.
Fig. 2 is a schematic flowchart of an inference graph provided in an embodiment of the present application.
As can be seen from fig. 2, data processing service 1 and data processing service 2 are called first; model service 1 and model service 2 are called after data processing service 2; and data processing service 3 is called after model service 1 and model service 2. That is, the output of data processing service 1 is used as the input of data processing service 2, the output of data processing service 2 is used as the input of model service 1 and model service 2, and the outputs of model service 1 and model service 2 are used as the inputs of data processing service 3.
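The flow of fig. 2 can be sketched with hypothetical stand-in services; every service body below is an invented placeholder, and only the call order mirrors the figure.

```python
def make_pipeline():
    """Chain stand-ins for the services of fig. 2: two preprocessing steps,
    two model services called on the same input, and one reprocessing step
    that merges their outputs."""
    dp1 = lambda text: text.lower()                                # data processing service 1
    dp2 = lambda text: text.split()                                # data processing service 2
    m1 = lambda toks: {"entertainment": 0.7, "economy": 0.3}       # model service 1 (dummy scores)
    m2 = lambda toks: {"entertainment": 0.5, "economy": 0.5}       # model service 2 (dummy scores)
    dp3 = lambda pair: {k: (pair[0][k] + pair[1][k]) / 2           # data processing service 3:
                        for k in pair[0]}                          # average the two score dicts

    def run(text):
        x = dp2(dp1(text))          # dp1 -> dp2
        return dp3((m1(x), m2(x)))  # fan out to m1, m2; merge in dp3
    return run
```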
Example 4:
in order to accurately complete deployment of a model, on the basis of the foregoing embodiments, in an embodiment of the present application, after determining a first configuration algorithm corresponding to the first identifier and before deploying the model by using the first configuration algorithm, the method further includes:
constructing a mirror image of the first configuration algorithm;
said deploying said model by said first configuration algorithm comprises:
and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
In the embodiment of the application, in order to provide the running environment corresponding to the first configuration algorithm and thereby implement deployment of the model, after the electronic device obtains the first configuration algorithm corresponding to the first identifier, it may construct a mirror image corresponding to the first configuration algorithm. In this embodiment, the electronic device may use kaniko, a tool for constructing mirror images, to do so; how the mirror image corresponding to the first configuration algorithm is constructed is prior art and is not described here again. In this embodiment of the present application, the mirror image may be referred to as an inference mirror image. It can be understood as a miniature running environment for the first configuration algorithm, containing the algorithm logic that provides the corresponding running environment and the Python base packages the first configuration algorithm depends on at runtime, and it is a basic resource required for model deployment.
After the mirror image corresponding to the first configuration algorithm is constructed, the electronic device can operate the mirror image to generate a container corresponding to the mirror image, a corresponding operation environment is provided for the first configuration algorithm through the container, and after the electronic device obtains the container corresponding to the mirror image, the trained model is deployed in the container through the first configuration algorithm.
In this embodiment of the application, after the electronic device completes the mirror image construction, the mirror image may be uploaded to a port (Harbor) mirror image warehouse, and when there is a subsequent need to use the mirror image corresponding to the first configuration algorithm, the mirror image may be directly obtained in the Harbor mirror image warehouse, specifically, how to upload the mirror image to the Harbor mirror image warehouse, and how to obtain the mirror image in the Harbor mirror image warehouse are prior art, and details are not described herein.
Specifically, the method for constructing the mirror image corresponding to the first configuration algorithm in the embodiment of the present application may be as follows: the electronic device uses a Kubernetes Job to build the prediction code corresponding to the first configuration algorithm, where a Job is suited to executing a one-off task and does not run again after the task completes. The electronic device acquires the kaniko build base image, constructs the mirror image corresponding to the first configuration algorithm on the basis of that base image, and finally uploads the successfully constructed mirror image to the Harbor mirror warehouse.
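A minimal sketch of such a build Job, expressed as a Python dict mirroring a Kubernetes manifest. The registry address, Job name, and build-context path are assumptions; only standard kaniko executor flags (--context, --dockerfile, --destination) are used.

```python
def kaniko_job_manifest(algorithm_id, registry="harbor.example.com/models"):
    """Sketch of a Kubernetes Job (batch/v1) that runs the kaniko executor
    to build the inference image for one configuration algorithm and push
    it to a Harbor registry. Names and paths are illustrative assumptions.
    """
    image_ref = f"{registry}/{algorithm_id}:latest"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"build-{algorithm_id}"},
        "spec": {
            "backoffLimit": 0,  # a one-off build: do not retry on failure
            "template": {"spec": {
                "restartPolicy": "Never",  # Jobs require Never/OnFailure
                "containers": [{
                    "name": "kaniko",
                    "image": "gcr.io/kaniko-project/executor:latest",
                    "args": [
                        f"--context=/workspace/{algorithm_id}",  # assumed context dir
                        "--dockerfile=Dockerfile",
                        f"--destination={image_ref}",            # push target in Harbor
                    ],
                }],
            }},
        },
    }
```

In practice such a manifest would be submitted via `kubectl apply` or a Kubernetes client library; the dict form here just makes the structure explicit.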
The mirror image constructed by the electronic device for the first configuration algorithm may be denoted IMAGE_i, and the first configuration algorithm may be denoted ALG_i.
In addition, in the embodiment of the application, another device having a connection relationship with the electronic device may construct the mirror image corresponding to the first configuration algorithm and then send it to the electronic device.
Fig. 3 is a schematic process diagram of generating a model service according to an embodiment of the present disclosure.
After receiving the corresponding relation between the trained model and the identification of the model, the electronic equipment acquires the configuration algorithm corresponding to the identification of the model, constructs a mirror image corresponding to the configuration algorithm, runs the mirror image to generate a corresponding container, deploys the model in the container through the configuration algorithm to generate a model service, and the model service can process data. The configuration algorithm includes corresponding codes and resources dependent on the deployment model.
Fig. 4 is a schematic process diagram of generating a composite service according to an embodiment of the present application.
As can be seen from fig. 4, the professional is configured with the algorithm corresponding to the identifier of the model in advance, and the algorithm includes the corresponding code and the resource that is relied on when the model is deployed. After receiving the trained model and the identification of the model, the electronic equipment acquires a configuration algorithm corresponding to the identification of the model, constructs a mirror image corresponding to the configuration algorithm, runs the mirror image to generate a corresponding container, deploys the model in the container through the configuration algorithm, and generates a model service.
The electronic equipment acquires an inference graph carried in the combined service generation request and a third identifier of a model to be combined recorded in each node in the inference graph when receiving the combined service generation request; determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service; and combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain the combined service.
Example 5:
in order to save resources, on the basis of the foregoing embodiments, in an embodiment of the present application, the method further includes:
and if the fourth model service corresponding to the fifth identifier currently meets a preset stop condition, stopping the fourth model service and releasing resources occupied by the fourth model service.
Since many model services may be stored in the electronic device in the embodiment of the present application, but the current application device may not have a requirement for acquiring some model services, in the embodiment of the present application, the electronic device may store a corresponding deactivation condition for the model service corresponding to each identifier in advance, and if a fourth model service corresponding to a fifth identifier currently meets the preset deactivation condition, the electronic device deactivates the fourth model service and releases resources occupied by the fourth model service. Specifically, the released resources are CPU resources and GPU resources.
In order to save resources, on the basis of the foregoing embodiments, in this embodiment, the step of enabling the fourth model service corresponding to the fifth identifier to currently meet a preset deactivation condition includes:
the received carried models to be used reach a set number for the application requests of the fifth identifier; or
The current moment is a preset deactivation moment of a fourth model service corresponding to the fifth identifier; or
The deployed time length of the fourth model service corresponding to the fifth identifier reaches a first preset time length.
In this embodiment of the application, after an application device has obtained a model service, it may have no subsequent need to obtain that model service again. Therefore, the preset deactivation condition currently met by the fourth model service corresponding to the fifth identifier may be that the number of received application requests carrying the fifth identifier as the identifier of the model to be used reaches a set number, where the set number may be any number, for example, 1.
Since most application devices acquire a model service at a fixed time, and after acquiring it may have no further need for it, the electronic device may pre-store a deactivation time for the model service corresponding to each identifier. The preset deactivation condition currently met by the fourth model service corresponding to the fifth identifier may then be that the current time is the preset deactivation time of the fourth model service corresponding to the fifth identifier. The preset deactivation time may be any time, for example, ten o'clock every day.
In this embodiment of the present application, because some application devices may acquire the model service at a preset time interval and, within that interval, have no further need to acquire it, the preset deactivation condition currently met by the fourth model service corresponding to the fifth identifier may be that the deployed time length of the fourth model service reaches a first preset time length, where the first preset time length may be the same as or different from the time interval at which the application device acquires the fourth model service.
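The three deactivation conditions above can be sketched as one predicate over a hypothetical service record; the field names and default thresholds are illustrative assumptions.

```python
from datetime import datetime, timedelta

def should_stop(service, now, request_limit=1, max_deployed=timedelta(hours=24)):
    """Return True when any of the three deactivation conditions holds:
    enough application requests served, the preset deactivation time reached,
    or the deployed time length exceeding the first preset time length.
    `service` is an assumed record dict, not an API from the patent."""
    if service["requests_served"] >= request_limit:
        return True                                    # set number of requests reached
    stop_at = service.get("stop_at")
    if stop_at is not None and now >= stop_at:
        return True                                    # preset deactivation time reached
    if now - service["deployed_at"] >= max_deployed:
        return True                                    # first preset time length reached
    return False
```

When the predicate fires, the electronic device would stop the service and release its CPU and GPU resources, as described above.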
Example 6:
in order to accurately generate a corresponding model service, on the basis of the foregoing embodiments, in an embodiment of the present application, the method further includes:
if the fourth model service corresponding to the deactivated fifth identifier currently meets the preset deployment condition, determining a second configuration algorithm corresponding to the fifth identifier according to the corresponding relation between the identifier of the model stored in advance and the algorithm, and deploying the model of the fifth identifier through the second configuration algorithm;
and generating a fourth model service corresponding to the deployed fifth identified model.
In this embodiment of the present application, after a certain model service is deactivated, the application device may further have a requirement for obtaining and applying the model service, and therefore in this embodiment of the present application, the electronic device may store a corresponding deployment condition for the model service corresponding to each identifier in advance, and if a fourth model service corresponding to the deactivated fifth identifier currently meets the preset deployment condition, the electronic device regenerates the deployed fourth model service. Specifically, the electronic device determines an algorithm corresponding to a fifth identifier in the correspondence between the identifier of the model and the algorithm according to the correspondence between the identifier of the model and the algorithm stored in advance, and uses the algorithm as a second configuration algorithm, and after the electronic device obtains the second configuration algorithm, the electronic device deploys the model of the fifth identifier through the second configuration algorithm, so that a fourth model service corresponding to the deployed model of the fifth identifier can be generated.
The fourth model service corresponding to the deactivated fifth identifier, which currently meets the preset deployment condition, may be that a redeployment instruction carrying the fifth identifier is received.
In order to accurately generate the corresponding model service, on the basis of the foregoing embodiments, in an embodiment of the present application, the step of currently satisfying a preset deployment condition by the fourth model service corresponding to the deactivated fifth identifier includes:
the current moment is a preset deployment moment of the fourth model service corresponding to the fifth identifier; or
The time length of the deactivated fourth model service corresponding to the deactivated fifth identification reaches a second preset time length.
In this embodiment of the application, since the application device usually has a requirement for obtaining the model service at a fixed time, in this embodiment of the application, a deployment time may be saved in the electronic device in advance for the model service corresponding to each identifier, and a fourth model service corresponding to a fifth identifier currently meets a preset deployment condition, which may be a preset deployment time that is the fourth model service corresponding to the fifth identifier at the current time. That is, if the fourth model service corresponding to the fifth identifier is deactivated and the current time is the preset deployment time of the fourth model service corresponding to the fifth identifier, the electronic device may deploy the fourth model service corresponding to the fifth identifier.
In this embodiment of the application, because some application devices may acquire the model service at a preset time interval, the preset deployment condition currently met by the fourth model service corresponding to the fifth identifier may be that the time length for which the deactivated fourth model service corresponding to the fifth identifier has been deactivated reaches a second preset time length, where the second preset time length may be the same as or different from the time interval at which the application device acquires the fourth model service.
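The two redeployment conditions can likewise be sketched as a predicate over a hypothetical service record; field names and the default idle limit are assumptions.

```python
from datetime import datetime, timedelta

def should_redeploy(service, now, idle_limit=timedelta(hours=1)):
    """Return True when a deactivated service meets either deployment
    condition: the preset deployment time has arrived, or the service has
    been deactivated for the second preset time length. `service` is an
    assumed record dict."""
    if not service["stopped"]:
        return False                                   # only stopped services are redeployed
    deploy_at = service.get("deploy_at")
    if deploy_at is not None and now >= deploy_at:
        return True                                    # preset deployment time reached
    return now - service["stopped_at"] >= idle_limit   # second preset time length reached
```

A scheduler on the electronic device could evaluate this predicate periodically and, when it fires, look up the second configuration algorithm by the fifth identifier and regenerate the fourth model service.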
Example 7:
in order to accurately generate a model service corresponding to an identifier of a stored model, on the basis of the foregoing embodiments, in an embodiment of the present application, after generating a first model service corresponding to a deployed model and before storing a corresponding relationship between the first identifier and the first model service, the method further includes:
judging whether the corresponding relation between the stored model identification and the model service contains the first identification or not;
the saving the correspondence between the first identifier and the first model service includes:
if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service;
and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
In the embodiment of the present application, a model may be optimized continuously. When a certain model has been optimized and the electronic device receives the optimized model and the identifier of the model, the electronic device generates a model service corresponding to the optimized model based on the optimized model and its identifier. The identifier of the model does not change, so the electronic device needs to store the correspondence between the identifier of the model and the model service corresponding to the optimized model. Since the correspondence between the identifier of the model and the pre-optimization model service is already stored in the electronic device, the electronic device updates the stored correspondence by replacing the model service corresponding to that identifier with the optimized model service.
After the electronic device generates the first model service corresponding to the model of the first identifier, whether the first model service corresponds to an optimized model can be determined by whether the first identifier is already stored in the electronic device; accordingly, the electronic device may determine whether the stored correspondence between model identifiers and model services contains the first identifier.
If the stored correspondence between model identifiers and model services does not contain the first identifier, the first model service corresponding to the first identifier is not an optimized model service, and the electronic device may add the correspondence between the first identifier and the first model service to the stored correspondence between model identifiers and model services, without needing to replace any other model service with the first model service.
If the stored correspondence between model identifiers and model services contains the first identifier, the first model service corresponding to the first identifier is an optimized model service. The performance of the optimized model service is superior to that of the model service before optimization, and generally, when the application device applies the model service corresponding to the first identifier, it should apply the optimized one. Therefore, in the correspondence stored in the electronic device, the model service corresponding to the first identifier is updated to the newly optimized model service, and the processing result is more accurate when the application device processes data using the acquired model service information.
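The save-or-update logic of this embodiment can be sketched as a dictionary upsert; representing the stored correspondence as a plain dict is an assumption for illustration.

```python
def save_model_service(registry, identifier, service):
    """Upsert into the identifier -> model-service correspondence: replace
    the stored service when the identifier already exists (the model was
    optimized), otherwise add a new entry. Returns True if an existing
    entry was replaced."""
    replaced = identifier in registry
    registry[identifier] = service
    return replaced
```

The return value distinguishes the two branches described above: False for a newly added correspondence, True for a replacement of the pre-optimization model service.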
In the embodiment of the application, after the new version model service with better inference performance, namely the optimized model service, is generated, the stored old version model service, namely the model service before optimization, can be quickly replaced, so that the inference capability of the inference service is increased.
Assume the original model service is:

ALG_i + IMAGE_i + MODEL_{i-1} → m_SERVICE_i

where ALG_i denotes the corresponding configuration algorithm, IMAGE_i denotes the mirror image corresponding to the configuration algorithm, MODEL_{i-1} denotes the model before optimization, and m_SERVICE_i denotes the model service before optimization. That is, the mirror image IMAGE_i corresponding to the configuration algorithm ALG_i is constructed, the mirror image provides the running environment for the configuration algorithm, and the model MODEL_{i-1} is deployed through the configuration algorithm to generate the corresponding model service m_SERVICE_i.

The optimized model service is:

ALG_i + IMAGE_i + MODEL_{i-2} → m_SERVICE_i

where ALG_i denotes the same configuration algorithm, IMAGE_i denotes the same mirror image, MODEL_{i-2} denotes the optimized model, and m_SERVICE_i denotes the optimized model service. That is, the mirror image IMAGE_i corresponding to the configuration algorithm ALG_i is constructed, the mirror image provides the running environment for the configuration algorithm, and the model MODEL_{i-2} is deployed through the configuration algorithm to generate the corresponding model service m_SERVICE_i. MODEL_{i-2} is an updated model compared with MODEL_{i-1}, so the model service m_SERVICE_i obtains stronger inference capability.
In the embodiment of the application, the newly generated model service can also undergo online testing and receive problem feedback, so as to provide optimization suggestions for the next model. To verify the performance of the model service, the embodiment of the application mainly comprises two aspects of verification: verification of inference ability and verification of inference performance. Inference ability verification mainly focuses on verifying model accuracy, such as text classification accuracy. Inference performance verification mainly focuses on testing high-concurrency, high-timeliness inference scenarios, since the key characteristic of an online inference service is quickly responding to a large number of online inference requests, that is, quickly and accurately producing a large amount of results; inference performance verification mainly tests the inference capability in such scenarios.
Example 8:
Fig. 5 is a schematic structural diagram of a model deployment apparatus provided in an embodiment of the present application, where the apparatus includes:
a receiving module 501, configured to receive a trained model and a first identifier of the model;
a processing module 502, configured to determine, according to a correspondence between a pre-stored identifier of a model and an algorithm, a first configuration algorithm corresponding to the first identifier, and deploy the model through the first configuration algorithm; and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
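As a hedged sketch (the registries and the one-line "algorithm" below are simplified placeholders, not the embodiment's actual configuration algorithms), the receive-deploy-save flow above can be written as:

```python
# Pre-stored correspondence between model identifiers and configuration algorithms.
ALGORITHM_REGISTRY = {
    "text-cls-v1": lambda model: f"service({model})",  # hypothetical config algorithm
}

# Correspondence between model identifiers and generated model services.
SERVICE_REGISTRY = {}

def deploy_model(model, first_identifier):
    """Determine the configuration algorithm for the first identifier, deploy
    the model through it, and save identifier -> model service."""
    algorithm = ALGORITHM_REGISTRY[first_identifier]
    model_service = algorithm(model)               # deploy the model
    SERVICE_REGISTRY[first_identifier] = model_service
    return model_service

svc = deploy_model("trained-model-bytes", "text-cls-v1")
```

A real implementation would store these correspondences persistently and return a network endpoint rather than a string, but the lookup-deploy-save shape is the same.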
In a possible implementation manner, the processing module 502 is further configured to receive an application request sent by an application device, where the application request carries a second identifier of a model to be used; and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
In a possible implementation manner, the processing module 502 is further configured to, if a combined service generation request is received, obtain an inference graph carried in the combined service generation request, a third identifier of a model to be combined recorded in each node in the inference graph, and a fourth identifier corresponding to the combined service; determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service; combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service; and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
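The combination step above can be sketched as a topological walk over the inference graph that chains the model service at each node; the graph, node names, and toy services below are hypothetical:

```python
from graphlib import TopologicalSorter

def build_combined_service(inference_graph, services):
    """inference_graph maps node -> set of predecessor nodes; services maps
    each node's model identifier to a callable model service. Returns one
    callable that runs the services in dependency order."""
    order = list(TopologicalSorter(inference_graph).static_order())

    def combined(x):
        for node in order:
            x = services[node](x)  # each node's output feeds the next node
        return x

    return combined

# Hypothetical two-node pipeline: tokenize -> classify
graph = {"classify": {"tokenize"}, "tokenize": set()}
services = {
    "tokenize": lambda s: s.split(),
    "classify": lambda toks: "long" if len(toks) > 2 else "short",
}
combined_service = build_combined_service(graph, services)
print(combined_service("deploy the trained model"))
```

This linear chaining is a simplification; a general inference graph could fan out and merge, in which case each node would gather the outputs of all its predecessors.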
In a possible implementation, the processing module 502 is further configured to construct a mirror image of the first configuration algorithm; and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
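The image-and-container step could, for example, correspond to commands like the following; this sketch only assembles hypothetical Docker CLI strings (image name, paths, and port are illustrative) rather than executing them:

```python
def container_commands(identifier, algorithm_dir, model_path, port=8080):
    """Assemble (not execute) the shell commands that would build the
    configuration algorithm's image and run the model inside a container."""
    image = f"model-algo-{identifier}:latest"
    return [
        f"docker build -t {image} {algorithm_dir}",
        f"docker run -d -p {port}:{port} -v {model_path}:/models/{identifier} {image}",
    ]

cmds = container_commands("text-cls-v1", "./algo", "/tmp/model.bin")
```

The first command builds the image that provides the running environment for the configuration algorithm; the second starts a container from it with the trained model mounted in, after which the container exposes the model service.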
In a possible implementation manner, the processing module 502 is further configured to deactivate the fourth model service and release resources occupied by the fourth model service if the fourth model service corresponding to the fifth identifier currently meets a preset deactivation condition.
In a possible implementation manner, the processing module 502 is further configured to, if the deactivated fourth model service corresponding to the fifth identifier currently meets a preset deployment condition, determine a second configuration algorithm corresponding to the fifth identifier according to the correspondence between pre-stored identifiers of models and algorithms, and deploy the model of the fifth identifier through the second configuration algorithm; and generate a fourth model service corresponding to the deployed model of the fifth identifier.
In a possible implementation manner, the processing module 502 is further configured to determine whether the stored correspondence between the identifier of the model and the model service includes the first identifier; if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service; and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
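The judge-then-update-or-add logic above amounts to an upsert on the identifier-to-model-service correspondence; a minimal sketch with the correspondence as an in-memory dict:

```python
def save_service(registry, identifier, model_service):
    """If the identifier already exists in the correspondence, update its
    model service; otherwise add the new identifier -> service pair."""
    updated = identifier in registry
    registry[identifier] = model_service  # one statement covers both cases
    return "updated" if updated else "added"

registry = {"m1": "service-v1"}
r1 = save_service(registry, "m1", "service-v2")  # existing identifier
r2 = save_service(registry, "m2", "service-v1")  # new identifier
```

The explicit membership check only matters when the two branches must behave differently, e.g. when updating should also retire the superseded model service.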
Example 9:
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. On the basis of the foregoing embodiments, an embodiment of the present application further provides an electronic device, as shown in Fig. 6, including: a processor 601, a communication interface 602, a memory 603 and a communication bus 604, wherein the processor 601, the communication interface 602 and the memory 603 communicate with each other through the communication bus 604;
the memory 603 has stored therein a computer program which, when executed by the processor 601, causes the processor 601 to perform the steps of:
receiving a trained model and a first identification of the model;
determining a first configuration algorithm corresponding to the first identifier according to a corresponding relation between the identifier of a pre-stored model and the algorithm, and deploying the model through the first configuration algorithm;
and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
Further, the processor 601 is further configured to receive an application request sent by an application device, where the application request carries a second identifier of a model to be used;
and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
Further, the processor 601 is further configured to, if a composite service generation request is received, obtain an inference graph carried in the composite service generation request, a third identifier of a model to be combined recorded in each node in the inference graph, and a fourth identifier corresponding to the composite service;
determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service;
combining the third model service corresponding to each node in the inference graph according to the connection relationship between the nodes in the inference graph to obtain combined service;
and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
Further, the processor 601 is further configured to construct a mirror image of the first configuration algorithm; and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
Further, the processor 601 is further configured to deactivate the fourth model service and release the resource occupied by the fourth model service if the fourth model service corresponding to the fifth identifier currently meets a preset deactivation condition.
Further, the processor 601 is further configured to, if the deactivated fourth model service corresponding to the fifth identifier currently meets a preset deployment condition, determine a second configuration algorithm corresponding to the fifth identifier according to the correspondence between pre-stored identifiers of models and algorithms, and deploy the model of the fifth identifier through the second configuration algorithm;
and generate a fourth model service corresponding to the deployed model of the fifth identifier.
Further, the processor 601 is further configured to determine whether the stored correspondence between the identifier of the model and the model service includes the first identifier; if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service;
and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
The communication bus mentioned in the above server may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the aforementioned processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
Example 10:
on the basis of the foregoing embodiments, an embodiment of the present application further provides a computer-readable storage medium, in which a computer program executable by an electronic device is stored, and when the program is run on the electronic device, the electronic device is caused to perform the following steps:
the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of:
receiving a trained model and a first identification of the model;
determining a first configuration algorithm corresponding to the first identifier according to a corresponding relation between the identifier of a pre-stored model and the algorithm, and deploying the model through the first configuration algorithm;
and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
In one possible embodiment, the method further comprises:
receiving an application request sent by application equipment, wherein the application request carries a second identifier of a model to be used;
and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
In one possible embodiment, the method further comprises:
if a combined service generation request is received, acquiring an inference graph carried in the combined service generation request, a third identifier of a model to be combined recorded in each node in the inference graph and a fourth identifier corresponding to the combined service;
determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service;
combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service;
and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
In a possible implementation, after determining the first configuration algorithm corresponding to the first identifier and before deploying the model by the first configuration algorithm, the method further includes:
constructing a mirror image of the first configuration algorithm;
said deploying said model by said first configuration algorithm comprises:
and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
In one possible embodiment, the method further comprises:
and if the fourth model service corresponding to the fifth identifier currently meets a preset stop condition, stopping the fourth model service and releasing resources occupied by the fourth model service.
In a possible embodiment, the step of the fourth model service corresponding to the fifth identifier currently meeting the preset deactivation condition includes:
the number of received application requests that carry the fifth identifier of the model to be used reaches a set number; or
The current moment is a preset deactivation moment of a fourth model service corresponding to the fifth identifier; or
The deployed time length of the fourth model service corresponding to the fifth identifier reaches a first preset time length.
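The three deactivation conditions above reduce to a single predicate; a minimal sketch in which all thresholds and timestamps are hypothetical:

```python
def should_deactivate(request_count, set_number,
                      now, preset_deactivation_time,
                      deployed_duration, first_preset_duration):
    """Returns True if any of the three preset deactivation conditions
    on the fourth model service is currently met."""
    return (request_count >= set_number                 # enough requests served
            or now == preset_deactivation_time          # scheduled deactivation moment
            or deployed_duration >= first_preset_duration)  # deployed long enough

hit = should_deactivate(1000, 500, 10, 99, 3600, 7200)   # request count exceeded
miss = should_deactivate(0, 500, 10, 99, 3600, 7200)     # no condition met
```

A production check would compare real clock times rather than exact equality, e.g. `now >= preset_deactivation_time`, but the or-of-conditions structure is the point here.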
In one possible embodiment, the method further comprises:
if the deactivated fourth model service corresponding to the fifth identifier currently meets a preset deployment condition, determining a second configuration algorithm corresponding to the fifth identifier according to the correspondence between pre-stored identifiers of models and algorithms, and deploying the model of the fifth identifier through the second configuration algorithm;
and generating a fourth model service corresponding to the deployed model of the fifth identifier.
In a possible embodiment, the condition that the deactivated fourth model service corresponding to the fifth identifier currently meets the preset deployment condition includes:
the current moment is a preset deployment moment of the fourth model service corresponding to the fifth identifier; or
The time length for which the fourth model service corresponding to the fifth identifier has been deactivated reaches a second preset time length.
In a possible embodiment, after the generating of the first model service corresponding to the deployed model and before the saving of the correspondence between the first identifier and the first model service, the method further includes:
judging whether the corresponding relation between the stored model identification and the model service contains the first identification or not;
the saving the corresponding relation between the first identification and the first model service comprises:
if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service;
and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
Example 11:
The embodiment of the present application further provides a computer program product which, when executed by a computer, implements the model deployment method described in any of the above method embodiments applied to an electronic device.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof, and may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions which, when loaded and executed on a computer, cause a process or function according to an embodiment of the application to be performed, in whole or in part.
In the embodiment of the present application, the correspondence between model identifiers and algorithms is stored in the electronic device, so that after the first identifier of a model is received, the first configuration algorithm corresponding to the model of the first identifier can be determined and the model deployed with the first configuration algorithm. This avoids deployment being performed on the application device and being limited by production release windows, and removes the need for a professional to separately write deployment code each time a model goes into production, thereby facilitating the application of the model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (19)

1. A method of model deployment, the method comprising:
receiving a trained model and a first identification of the model;
determining a first configuration algorithm corresponding to the first identifier according to a corresponding relation between the identifier of a pre-stored model and the algorithm, and deploying the model through the first configuration algorithm;
and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
2. The method of claim 1, further comprising:
receiving an application request sent by application equipment, wherein the application request carries a second identifier of a model to be used;
and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
3. The method of claim 1, further comprising:
if a combined service generation request is received, acquiring an inference graph carried in the combined service generation request, a third identifier of a model to be combined and a fourth identifier corresponding to the combined service, wherein the third identifier is recorded in each node in the inference graph;
determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service;
combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service;
and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
4. The method of claim 1, wherein after determining the first configuration algorithm corresponding to the first identifier and before deploying the model by the first configuration algorithm, the method further comprises:
constructing a mirror image of the first configuration algorithm;
said deploying said model by said first configuration algorithm comprises:
and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
5. The method of claim 1, further comprising:
and if the fourth model service corresponding to the fifth identifier currently meets a preset stop condition, stopping the fourth model service and releasing resources occupied by the fourth model service.
6. The method of claim 5, wherein the fourth model service corresponding to the fifth identifier currently satisfying the preset deactivation condition comprises:
the number of received application requests that carry the fifth identifier of the model to be used reaches a set number; or
The current moment is a preset deactivation moment of a fourth model service corresponding to the fifth identifier; or
The deployed time length of the fourth model service corresponding to the fifth identifier reaches a first preset time length.
7. The method of claim 5 or 6, further comprising:
if the deactivated fourth model service corresponding to the fifth identifier currently meets a preset deployment condition, determining a second configuration algorithm corresponding to the fifth identifier according to the correspondence between pre-stored identifiers of models and algorithms, and deploying the model of the fifth identifier through the second configuration algorithm;
and generating a fourth model service corresponding to the deployed model of the fifth identifier.
8. The method according to claim 7, wherein the condition that the deactivated fourth model service corresponding to the fifth identifier currently meets the preset deployment condition includes:
the current moment is a preset deployment moment of the fourth model service corresponding to the fifth identifier; or
The time length for which the fourth model service corresponding to the fifth identifier has been deactivated reaches a second preset time length.
9. The method of claim 1, wherein after the generating of the first model service corresponding to the deployed model and before the saving of the correspondence between the first identifier and the first model service, the method further comprises:
judging whether the corresponding relation between the stored model identification and the model service contains the first identification or not;
the saving the correspondence between the first identifier and the first model service includes:
if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service;
and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
10. A model deployment apparatus, the apparatus comprising:
the receiving module is used for receiving the trained model and the first identification of the model;
the processing module is used for determining a first configuration algorithm corresponding to the first identifier according to the corresponding relation between the identifier of the model and the algorithm which are stored in advance, and deploying the model through the first configuration algorithm; and generating a first model service corresponding to the deployed model, and storing the corresponding relation between the first identifier and the first model service.
11. The apparatus according to claim 10, wherein the processing module is further configured to receive an application request sent by an application device, where the application request carries a second identifier of a model to be used; and determining a second model service corresponding to the second identifier according to the corresponding relation between the stored identifier of the model and the model service, and sending the second model service to the application equipment.
12. The apparatus according to claim 10, wherein the processing module is further configured to, if a composite service generation request is received, obtain an inference graph carried in the composite service generation request, a third identifier of a model to be combined recorded in each node in the inference graph, and a fourth identifier corresponding to the composite service; determining a third model service corresponding to the third identifier according to the corresponding relation between the stored identifier of the model and the model service; combining the third model service corresponding to each node in the inference graph according to the connection relation between the nodes in the inference graph to obtain combined service; and adding the corresponding relation between the fourth identification and the combined service in the corresponding relation between the stored identification of the model and the model service.
13. The apparatus of claim 10, wherein the processing module is further configured to construct a mirror image of the first configuration algorithm; and running the mirror image to generate a corresponding container, and deploying the model in the container through the first configuration algorithm.
14. The apparatus of claim 10, wherein the processing module is further configured to deactivate the fourth model service if a fourth model service corresponding to the fifth identifier currently meets a preset deactivation condition, and release resources occupied by the fourth model service.
15. The apparatus according to claim 14, wherein the processing module is further configured to, if the deactivated fourth model service corresponding to the fifth identifier currently meets a preset deployment condition, determine a second configuration algorithm corresponding to the fifth identifier according to the correspondence between pre-stored identifiers of models and algorithms, and deploy the model of the fifth identifier through the second configuration algorithm; and generate a fourth model service corresponding to the deployed model of the fifth identifier.
16. The apparatus according to claim 10, wherein the processing module is further configured to determine whether the first identifier is included in the correspondence between the saved identifier of the model and the model service; if the corresponding relation between the stored model identification and the model service comprises the first identification, updating the model service corresponding to the stored first identification by adopting the first identification and the corresponding first model service; and if the corresponding relation between the stored model identification and the model service does not contain the first identification, adding the corresponding relation between the first identification and the first model service into the corresponding relation between the stored model identification and the model service.
17. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to implement the steps of the model deployment method according to any of the claims 1-9 when executing a computer program stored in the memory.
18. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the model deployment method according to any one of claims 1 to 9.
19. A computer program product, characterized in that the computer program product comprises: computer program code for causing a computer to perform the steps of the model deployment method as claimed in any one of the preceding claims 1-9 when said computer program code is run on a computer.
CN202210498334.0A 2022-05-09 2022-05-09 Model deployment method, device, equipment and medium Pending CN114879953A (en)


Publications (1)

Publication Number Publication Date
CN114879953A true CN114879953A (en) 2022-08-09



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination