CN111340232A - Online prediction service deployment method and device, electronic equipment and storage medium



Publication number
CN111340232A
Authority
CN
China
Prior art keywords
model
machine learning
prediction
prediction service
configuration information
Legal status
Pending
Application number
CN202010096889.3A
Other languages
Chinese (zh)
Inventor
乔彦辉 (Qiao Yanhui)
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010096889.3A
Publication of CN111340232A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 - Administration; Management
    • G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This specification provides a method for deploying an online prediction service, including: acquiring sub-model configuration information of a prediction service to be deployed; deploying, according to the sub-model configuration information, at least two machine learning models on the machine cluster corresponding to the prediction service, and determining the registration address corresponding to each of the at least two machine learning models; acquiring multi-model combinational logic configuration information of the prediction service; and sending the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway, which completes the deployment of the prediction service. This specification also provides an apparatus, an electronic device, and a storage medium for implementing the online prediction service deployment.

Description

Online prediction service deployment method and device, electronic equipment and storage medium
Technical Field
One or more embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for deploying an online prediction service, an electronic device, and a computer-readable storage medium.
Background
With the widespread adoption of machine learning, its range of applications keeps expanding; examples include recommendation systems, voice assistants, and precision advertising systems. Applying machine learning typically involves two phases: training and prediction (also called inference). Training is the process of building a model from data; prediction is the process of producing an output from a given input using the trained model. It should be noted that a trained model usually needs to be deployed as a prediction service before it can provide its prediction capability.
In conventional technology, a deployed prediction service usually contains only a single model and cannot support combinations of multiple models, so the prediction capability it provides is limited and cannot meet the prediction requirements of diverse applications.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a method for deploying an online prediction service that can deploy a combination of multiple machine learning models as a single prediction service, meet the needs of applications that perform prediction with complex models, and improve both the accuracy and the generalization capability of online prediction.
In an embodiment of the present specification, the deployment method of the online prediction service may include:
acquiring sub-model configuration information of a prediction service to be deployed; wherein the prediction service corresponds to at least two machine learning models that have completed training; the sub-model configuration information comprises configuration information of each of the at least two machine learning models; the configuration information of the machine learning model includes: feature extraction logic and scoring logic of the machine learning model;
according to the sub-model configuration information, deploying the at least two machine learning models on a machine cluster corresponding to the prediction service respectively, and determining registration addresses corresponding to the at least two machine learning models;
acquiring multi-model combinational logic configuration information of the prediction service; wherein the multi-model combinational logic configuration information includes combinational logic between the at least two machine learning models;
sending the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway, so that the prediction service gateway loads the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service; wherein the deployed prediction service is configured to predict future behavior of the user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
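As a rough illustration of the four steps above, the platform-side flow could be sketched as follows. All class, method, and address names (Cluster, Gateway, deploy_service, "registry://") are hypothetical stand-ins; the patent does not specify an implementation or API.

```python
# Hypothetical sketch of the platform-side deployment flow; not the
# patent's implementation. Cluster and Gateway are illustrative stand-ins.

class Cluster:
    """Stand-in for the machine cluster that hosts the sub-models."""
    def __init__(self):
        self.deployed = {}

    def distribute(self, model_id, config):
        # distribute the model's configuration to the cluster machines
        self.deployed[model_id] = config

    def register(self, model_id):
        # allocate a uniform registration address for the model
        return f"registry://{model_id}"

class Gateway:
    """Stand-in for the prediction service gateway."""
    def __init__(self):
        self.memory = {}

    def load(self, combo_logic, addresses):
        # load the combinational logic and model addresses into memory
        self.memory["combo_logic"] = combo_logic
        self.memory["addresses"] = addresses

def deploy_service(cluster, gateway, submodel_configs, combo_logic):
    addresses = {}
    # steps 1-2: deploy each sub-model and collect its registration address
    for model_id, config in submodel_configs.items():
        cluster.distribute(model_id, config)
        addresses[model_id] = cluster.register(model_id)
    # steps 3-4: hand the combinational logic plus addresses to the gateway
    gateway.load(combo_logic, addresses)
    return addresses
```

Deploying one prediction service thus amounts to one pass over the sub-model configurations followed by a single push to the gateway.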
Wherein deploying the at least two machine learning models on the machine cluster corresponding to the prediction service comprises performing the following steps for the configuration information of each machine learning model in the sub-model configuration information:
determining a machine cluster corresponding to the machine learning model; the machine cluster comprises a plurality of machines, and each machine runs with a plurality of prediction engines; each prediction engine is used for loading and executing a machine learning model of a corresponding configuration form;
distributing configuration information of the machine learning model to each machine in the machine cluster;
for any first machine of the machines, after receiving the configuration information of the machine learning model, the first machine determines the configuration form of the machine learning model based on that configuration information, selects a target prediction engine from the plurality of prediction engines based on the determined configuration form, and loads, by the target prediction engine, the configuration information of the machine learning model into a memory to complete the deployment of the machine learning model on the first machine;
receiving a registration request sent by each machine after the prediction service is deployed;
and responding to the registration request, registering the machines and distributing a uniform registration address for the machines.
The configuration form of the prediction service comprises one of a file configuration form, an autonomous coding form, and a visualization configuration form:
when the configuration form of the prediction service is the file configuration form, the target prediction engine selected from the multiple prediction engines is the C++ prediction engine (CMPS);
when the configuration form of the prediction service is the autonomous coding form, the target prediction engine selected from the multiple prediction engines is the Python prediction engine (PyMPS);
and when the configuration form of the prediction service is the visualization configuration form, the target prediction engine selected from the multiple prediction engines is the Java prediction engine (JMPS).
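The engine-selection rule above amounts to a simple dispatch on the configuration form. A minimal sketch follows; the string keys and the function name are assumptions for illustration, while the engine names CMPS, PyMPS, and JMPS come from the text.

```python
# Dispatch table mapping configuration form -> target prediction engine,
# per the rule above. The string keys are illustrative labels.
ENGINE_BY_FORM = {
    "file":   "CMPS",   # file configuration form          -> C++ engine
    "code":   "PyMPS",  # autonomous coding form           -> Python engine
    "visual": "JMPS",   # visualization configuration form -> Java engine
}

def select_engine(configuration_form):
    engine = ENGINE_BY_FORM.get(configuration_form)
    if engine is None:
        raise ValueError(f"unsupported configuration form: {configuration_form}")
    return engine
```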
Wherein obtaining the multi-model combinational logic configuration information of the prediction service comprises: acquiring a multi-model combinational logic expression configured by a user through a file configuration form, an autonomous coding form, or a visualization configuration form.
Wherein the combinatorial logic between the at least two machine learning models comprises: one of model segmented combinational logic, model integrated combinational logic, model chained combinational logic, and model weighted combinational logic.
In an embodiment of the present specification, the deployment method of the online prediction service may further include:
obtaining predefined metadata for the prediction service; wherein the metadata includes: identification information of the prediction service;
and sending the metadata to the prediction service gateway so that the prediction service gateway establishes a corresponding relation between the identification information of the prediction service and the multi-model combinational logic of at least two machine learning models.
One or more embodiments of the present specification further provide a method for deploying an online prediction service, including:
receiving multi-model combinational logic configuration information of a prediction service to be deployed, which is sent by a model prediction platform, and registration addresses of at least two machine learning models corresponding to the prediction service;
loading the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service; wherein the deployed prediction service is configured to predict future behavior of the user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
Loading the configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models into a memory comprises: loading the identification information of the at least two machine learning models and the registration addresses corresponding to the at least two machine learning models into a memory warehouse; and analyzing the multi-model combinational logic configuration information, determining a multi-model combinational logic expression of the at least two machine learning models, instantiating the multi-model combinational logic expression, and loading the multi-model combinational logic expression into a memory warehouse.
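One way the gateway-side loading step could look is sketched below, assuming, purely for illustration, that the combinational logic is a weighted-sum expression such as "0.6*A + 0.4*B". The parsing scheme and the dictionary-backed memory store are sketches, not the patent's mechanism.

```python
# Sketch of the gateway-side loading step: store the model registration
# addresses, then parse and instantiate the combinational logic expression.
# The weighted-sum expression format is an assumption for illustration.

def parse_combo_expression(expr):
    """Turn e.g. '0.6*A + 0.4*B' into a callable over per-model scores."""
    terms = []
    for term in expr.split("+"):
        weight, model_id = term.strip().split("*")
        terms.append((float(weight), model_id.strip()))
    # instantiated combinational logic: maps {model_id: score} -> result
    return lambda scores: sum(w * scores[m] for w, m in terms)

def load_into_memory(memory, model_addresses, combo_expr):
    memory["addresses"] = dict(model_addresses)              # model id -> address
    memory["combo_fn"] = parse_combo_expression(combo_expr)  # instantiated logic
```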
In an embodiment of the present specification, the deployment method of the online prediction service may further include:
receiving metadata for the predictive service; wherein the metadata includes: identification information of the prediction service;
and establishing a corresponding relation between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models.
In an embodiment of the present specification, the deployment method of the online prediction service may further include:
receiving an access request of the prediction service sent by an application program, wherein the access request at least comprises identification information of the prediction service;
determining the multi-model combinational logic expression of the at least two machine learning models corresponding to the identification information of the prediction service according to the corresponding relation between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models;
running the multi-model combinational logic expression;
initializing context parameters according to the multi-model combinational logic expression when a machine learning model needs to be called, determining a registration address of the machine learning model according to an identifier of the machine learning model, determining a target machine from all machines corresponding to the registration address, and sending a model calling request to the target machine, wherein the model calling request is used for indicating the target machine to execute the machine learning model through the target prediction engine so as to obtain an intermediate result;
performing combined calculation on intermediate results output by the at least two machine learning models according to the multi-model combinational logic expression to obtain a prediction result; and
and returning the prediction result to the application program.
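The request-handling steps above might be sketched as follows. The data layout and the call_model hook are assumptions standing in for the real service-routing call to a target machine.

```python
# Hypothetical sketch of gateway request handling: resolve the service by
# its identification information, invoke each sub-model for an intermediate
# result, then apply the combinational logic. call_model stands in for the
# real model invocation sent to a target machine.

def handle_request(services, request, call_model):
    service = services[request["service_id"]]        # id -> deployed service
    scores = {}
    for model_id in service["models"]:
        address = service["addresses"][model_id]     # registration address
        scores[model_id] = call_model(address, request["input"])
    return service["combine"](scores)                # combined prediction result
```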
Wherein determining a target machine from the machines corresponding to the registration address comprises: determining a target machine from the machines corresponding to the registration address according to a load balancing algorithm.
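The patent leaves the load balancing algorithm unspecified; round-robin is one common choice, sketched here for illustration only.

```python
import itertools

# Minimal round-robin balancer: cycles through the machines registered
# under one registration address. Purely illustrative; any load balancing
# algorithm could be substituted.
class RoundRobinBalancer:
    def __init__(self, machines):
        self._cycle = itertools.cycle(machines)

    def pick(self):
        return next(self._cycle)
```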
One or more embodiments of the present specification provide an online prediction service deployment apparatus, including:
a configuration information acquisition module, used for acquiring the sub-model configuration information of a prediction service to be deployed and the multi-model combinational logic configuration information of the prediction service; wherein the prediction service corresponds to at least two machine learning models that have completed training; the sub-model configuration information comprises the configuration information of each of the at least two machine learning models; the configuration information of a machine learning model includes the feature extraction logic and the scoring logic of that machine learning model; and the multi-model combinational logic configuration information includes the combinational logic between the at least two machine learning models;
the submodel deployment module is used for respectively deploying the at least two machine learning models on the machine cluster corresponding to the prediction service according to the submodel configuration information and respectively determining the registration addresses corresponding to the at least two machine learning models;
a multi-model combinational logic deployment module, configured to send the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway, so that the prediction service gateway loads the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory, so as to complete the deployment of the prediction service on the prediction service gateway; wherein the deployed prediction service is configured to predict future behavior of the user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
Wherein the configuration information acquisition module includes:
the sub-model configuration information acquisition module, used for acquiring the configuration information of each of the at least two machine learning models corresponding to the prediction service;
and the multi-model combinational logic configuration information acquisition module, used for acquiring the combinational logic configuration information between the at least two machine learning models corresponding to the prediction service.
Wherein, the submodel deployment module comprises:
the sub-model distribution module is used for respectively packaging the configuration information of the at least two machine learning models and respectively distributing the packaged configuration information to each machine of a designated machine cluster so as to complete the deployment of the plurality of machine learning models on each machine;
and the sub-model registration module is used for receiving the registration request sent by each machine after the prediction service is deployed, registering each machine, and allocating a uniform registration address to the machines corresponding to each of the at least two machine learning models.
The deployment apparatus of the online prediction service in one or more embodiments of the present specification further includes: a metadata definition module for defining metadata of the prediction service, wherein the metadata includes identification information of the prediction service.
One or more embodiments of the present specification further provide an online prediction service deployment apparatus, including:
the multi-model prediction service configuration information receiving module is used for receiving multi-model combinational logic configuration information of the prediction service to be deployed, which is sent by a model prediction platform, and registration addresses of at least two machine learning models corresponding to the prediction service;
a multi-model prediction service configuration loading module, configured to load the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory, so as to complete the deployment of the prediction service; wherein the deployed prediction service is configured to predict future behavior of the user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
The multi-model prediction service configuration loading module comprises:
the sub-model loading unit is used for loading the identification information of the at least two machine learning models and the registration addresses corresponding to the at least two machine learning models into the memory warehouse;
and the multi-model combinational logic loading unit is used for analyzing the multi-model combinational logic configuration information, determining a multi-model combinational logic expression of the at least two machine learning models, instantiating the multi-model combinational logic expression and loading the multi-model combinational logic expression into the memory warehouse.
The multi-model prediction service configuration information receiving module is further used for receiving metadata of the prediction service; wherein the metadata includes: identification information of the prediction service;
the multi-model prediction service configuration loading module is further used for establishing a corresponding relation between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models.
The deployment apparatus of the online prediction service in one or more embodiments of the present specification further includes:
the access module is used for receiving an access request which is sent by an application program and aims at a prediction service, and the access request at least comprises identification information of the prediction service;
the combinational logic determining module is used for determining a multi-model combinational logic expression corresponding to the prediction service according to the identification information of the prediction service; operating the multi-model combinational logic expression, initializing context parameters when calling of the machine learning model is needed, and triggering the service routing module to call the corresponding machine learning model;
the service routing module is used for determining the registration address of the machine learning model to be called according to the identifier of the machine learning model to be called; determining a target machine from the machines corresponding to the registered address; sending a model calling request to the target machine; wherein the model invocation request is used to instruct the target machine to execute the machine learning model via the target prediction engine to obtain an intermediate result;
and the multi-model combined calculation module is used for performing combined calculation on intermediate results output by the at least two machine learning models according to the multi-model combined logic expression of the at least two machine learning models to obtain a prediction result and returning the prediction result to the application program.
One or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the deployment method of the online prediction service when executing the program.
One or more embodiments of the present specification also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the above-described deployment method of the online prediction service.
It can be seen from the foregoing technical solutions that the online prediction service deployment method and apparatus provided in one or more embodiments of this specification can deploy different machine learning models for an online prediction service according to service requirements, and can likewise deploy the combinational logic between those machine learning models, thereby obtaining various complex combined models that satisfy the service's need for complex models and improving prediction accuracy or prediction generalization capability.
Drawings
To describe the technical solutions in one or more embodiments of this specification or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely one or more embodiments of this specification, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic view of an application scenario of an online prediction service deployment method according to one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of an internal structure of a model prediction platform 10 according to one or more embodiments of the present disclosure;
fig. 3 is a schematic diagram illustrating an internal structure of the configuration information obtaining module 102 according to one or more embodiments of the present disclosure;
FIG. 4 is a schematic diagram of an internal structure of the sub-model deployment module 104 according to one or more embodiments of the present disclosure;
fig. 5 is a schematic diagram illustrating an internal structure of the prediction service gateway 30 according to one or more embodiments of the present disclosure;
FIG. 6 is a flow diagram of a method for online forecast service deployment in accordance with one or more embodiments of the subject specification;
FIG. 7 is a flow diagram of a machine learning model deployment method provided by one or more embodiments of the present description;
FIG. 8 is a flow diagram of a method for providing online forecast service deployment in accordance with one or more embodiments of the subject specification;
FIG. 9 is a flow diagram of an online prediction method provided by one or more embodiments of the present disclosure; and
fig. 10 is a schematic structural diagram of an electronic device provided in one or more embodiments of the present specification.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
Fig. 1 is a schematic application scenario diagram of an online prediction service deployment method according to one or more embodiments of the present disclosure.
In fig. 1, a plurality of prediction services may be deployed on the prediction service gateway 30 through the model prediction platform 10, and each prediction service may correspond to a combined model obtained by combining at least two machine learning models, thereby implementing prediction services based on multi-model combination.
For example, any first prediction service of the plurality of prediction services may correspond to a combined model, and the combined model may be obtained by combining at least two machine learning models. Thus, when the model prediction platform 10 deploys one prediction service at the prediction service gateway 30, it is equivalent to performing a combined deployment of multiple machine learning models at once. The combined deployment described herein comprises two parts: the sub-model deployment of the multiple machine learning models involved in the prediction service, and the deployment of the combinational logic among those machine learning models.
In the embodiment of the present disclosure, the plurality of machine learning models may be obtained after performing steps such as data analysis, feature engineering, model training, and model evaluation based on the feedback data and/or the business data of the business decision system 20.
In an embodiment of the present specification, the combined model obtained by performing combined deployment on the multiple machine learning models is used to extract user features based on the feature extraction logic of each machine learning model, score multiple predetermined behaviors of the user respectively by using the scoring logic of each machine learning model based on the extracted user features, perform combined calculation on the scores of each model according to the combined logic among the multiple machine learning models, and finally predict future behaviors of the user according to a combined calculation result.
For example, for a prediction service that predicts whether a user's comment involves violent, horror, political, or advertising content, the model corresponding to the prediction service can be designed as a combined model. In this example, the combined model may include two sub-models: a predictor sub-model A that predicts whether the comment involves violent, horror, or political content, and a predictor sub-model B that predicts whether the comment involves advertising content. The input of each sub-model is a character string (for example, the user's comment), and the output is a score for that comment. Finally, the prediction service performs a combined calculation on the outputs of the two sub-models according to their combinational logic to obtain the final prediction of whether the user's comment involves violent, horror, political, advertising, or similar content.
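A toy version of this two-sub-model example might look as follows. The keyword-based scorers and the 0.5 threshold are placeholders; real sub-models would be trained machine learning models.

```python
# Toy sketch of the comment-moderation combined model described above.
# The scoring rules are placeholders, not the patent's actual models.

def submodel_a(comment):
    """Scores violent / horror / political content (placeholder rule)."""
    return 1.0 if any(w in comment for w in ("violence", "horror")) else 0.0

def submodel_b(comment):
    """Scores advertising content (placeholder rule)."""
    return 1.0 if "buy now" in comment else 0.0

def combined_model(comment, threshold=0.5):
    # Combinational logic: flag the comment if either score exceeds threshold.
    return max(submodel_a(comment), submodel_b(comment)) > threshold
```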
In an embodiment of the present specification, the feature extraction logic and the scoring logic of each of the at least two machine learning models are recorded in the sub-model configuration information of the prediction service; combinatorial logic between multiple machine learning models can be recorded in multi-model combinatorial logic configuration information for the predictive service. Specifically, the sub-model configuration information may include: configuration information of the at least two machine learning models; wherein, the configuration information of a machine learning model at least comprises: feature extraction logic and scoring logic of the machine learning model. The configuration information may be identified by identification information of the machine learning model. The multi-model combinational logic configuration information may include an identification of the at least two machine learning models and combinational logic between the at least two machine learning models. In an embodiment of the present specification, the sub-model configuration information of the prediction service and the multi-model combinational logic configuration information of the prediction service may be carried in the form of one file or a plurality of files.
For example, assume that a prediction service corresponds to a combined model C involving two machine learning models identified as A and B, and that the combinational logic between the two models can be expressed as the combinational logic expression X1 × Model A + X2 × Model B, where X1 and X2 are the weights of the models. In this example, the sub-model configuration information of the prediction service includes the configuration information of model A (its feature extraction logic and scoring logic, identified by the identification information of model A) and the configuration information of model B (its feature extraction logic and scoring logic, identified by the identification information of model B); the multi-model combinational logic configuration information of the prediction service comprises the multi-model combinational logic expression X1 × Model A + X2 × Model B, which contains the identification of model A and the identification of model B.
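One hypothetical way to lay out the two kinds of configuration information for combined model C — the field names and file references are illustrative, not taken from the specification:

```python
# Hypothetical layouts: sub-model configuration keyed by model identification,
# plus the multi-model combinational logic configuration for the service.

sub_model_config = {
    "A": {"feature_extraction": "extract_text_features_v1",
          "scoring": "model_a_weights.bin"},
    "B": {"feature_extraction": "extract_text_features_v1",
          "scoring": "model_b_weights.bin"},
}

multi_model_logic_config = {
    "service_id": "comment_moderation",
    # combinational logic expression: X1 * Model A + X2 * Model B
    "expression": "0.6 * A + 0.4 * B",
    "model_ids": ["A", "B"],
}

def referenced_models_configured(logic_config, model_config):
    """Check that every model the expression references has configuration."""
    return all(m in model_config for m in logic_config["model_ids"])
```

Because the two pieces of configuration reference models only by identification, they can travel in one file or several, as the text notes.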
In the embodiment of the present specification, when all the machine learning models related to implementing one online prediction service are deployed in the machine cluster related to the online prediction service and the combinational logic of the machine learning models is also deployed in the prediction service gateway 30, the deployment of the combinational model may be considered to be completed. At this time, the prediction service gateway 30 records the model identifier and the registration address of the machine learning model related to the online prediction service and the combinational logic among these machine learning models, so that the corresponding machine learning model can be called according to the request of the service party, and the output result can be combined to perform online prediction. Since the prediction service gateway 30 may be deployed with a plurality of prediction services, identification information of one prediction service may be set for each prediction service, and a relationship between the identification information and the multi-model combinational logic may be established. On the other hand, to enable access to any of the predicted services, it is necessary that the prediction service gateway 30 provide service routing call capability, such as vipserver or RPC services. In addition, the forecast service gateway 30 may provide a uniform access interface for multiple forecast services. It should be noted that the prediction service gateway 30 may also be a machine cluster including a plurality of machines. In the case that the prediction service gateway 30 is a machine cluster, the deployment operation of the multi-model combinational logic needs to be completed on each machine of the machine cluster.
The above process of online prediction using a combined model may include: the business decision system 20 may send a forecast request to the forecast service gateway 30. After receiving the prediction request, the prediction service gateway 30 may first determine the multi-model combinational logic corresponding to the prediction service through the combinational model corresponding to the pre-deployed prediction service, and then determine the registration address of each machine learning model to be called. Then, the prediction service gateway 30 may call each machine learning model according to the above combination logic and the registration address of each machine learning model, and perform combination calculation on the outputs of each machine learning model according to the combination logic to obtain a prediction result. Finally, the prediction service gateway 30 may return the prediction result to the business decision system 20 after obtaining the prediction result.
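The request-handling flow just described can be sketched as follows; the registry layouts and the stub model call are assumptions standing in for the gateway's real lookup and remote invocation:

```python
# Sketch of the gateway flow: look up the deployed combinational logic and the
# registration address of each sub-model, call each model, combine the outputs.

def handle_prediction_request(service_id, request_input, logic_registry,
                              model_registry, call_model):
    weights, model_ids = logic_registry[service_id]   # e.g. ([0.6, 0.4], ["A", "B"])
    outputs = []
    for model_id in model_ids:
        address = model_registry[model_id]            # registration address
        outputs.append(call_model(address, request_input))
    # weighted combined calculation over the sub-model outputs
    return sum(w * o for w, o in zip(weights, outputs))

# Stub standing in for a remote model invocation via the registration address.
def fake_call(address, x):
    return {"addr-A": 0.5, "addr-B": 1.0}[address]

result = handle_prediction_request(
    "svc-1", "some comment",
    {"svc-1": ([0.6, 0.4], ["A", "B"])},
    {"A": "addr-A", "B": "addr-B"},
    fake_call,
)
```

The result (here a weighted sum) is what the gateway would return to the business decision system 20.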
Fig. 2 is a schematic diagram of an internal structure of the model prediction platform 10 according to one or more embodiments of the present disclosure. The specific implementation of the model prediction platform 10 will be described in detail below with reference to fig. 2.
As shown in fig. 2, the model prediction platform 10 may include: a configuration information acquisition module 102, a sub-model deployment module 104, and a multi-model combinational logic deployment module 106.
The configuration information obtaining module 102 may be configured to obtain sub-model configuration information and multi-model combinational logic configuration information of the prediction service to be deployed.
In embodiments of the present specification, the prediction service to be deployed involves multiple machine learning models. As described above, the sub-model configuration information of the prediction service may include the configuration information of each machine learning model involved in the prediction service (including the feature extraction logic and scoring logic of that machine learning model). The multi-model combinational logic configuration information of the prediction service may include the multi-model combinational logic among the machine learning models involved in the prediction service, and may be presented in the form of a multi-model combinational logic expression.
The submodel deployment module 104 may be configured to deploy the at least two machine learning models on the machine cluster corresponding to the prediction service according to the submodel configuration information of the prediction service, and determine registration addresses corresponding to the at least two machine learning models respectively.
In an embodiment of the present specification, the deploying of a machine learning model may specifically refer to distributing configuration information of the machine learning model to each machine of a designated machine cluster, and each machine in the machine cluster loads received configuration information of the machine learning model into a memory through a prediction engine installed in the machine cluster, so as to complete instantiation of the deployed machine learning model. And after instantiation, the sub-model deployment module 104 will also assign a uniform registration address (also called domain name) to each machine.
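A minimal sketch of this deploy-then-register step, assuming a toy `Machine` class whose `load` stands in for the prediction engine instantiating the model, and a made-up domain-name scheme for the uniform registration address:

```python
# Sketch: distribute a model's configuration to every machine in the cluster,
# have each machine load it into "memory", then assign one uniform
# registration address (domain name). All names are illustrative.

class Machine:
    def __init__(self, ip):
        self.ip = ip
        self.loaded = {}          # model_id -> configuration held in memory

    def load(self, model_id, config):
        # stands in for the prediction engine instantiating the model
        self.loaded[model_id] = config

def deploy_model(model_id, config, cluster):
    for machine in cluster:
        machine.load(model_id, config)
    # a uniform registration address for the whole cluster (assumed scheme)
    return f"{model_id}.predict.example.internal"

cluster = [Machine("10.0.0.1"), Machine("10.0.0.2")]
address = deploy_model("A", {"scoring": "a.bin"}, cluster)
```

The key property is that callers later address the model by the single registration address, not by individual machine IPs.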
The multi-model combinational logic deployment module 106 may be configured to push the multi-model combinational logic configuration information of the prediction service to be deployed and the registration addresses corresponding to the multiple machine learning models to the prediction service gateway 30, so that the prediction service gateway 30 completes the deployment of the multi-model combinational logic.
In an embodiment of the present specification, the multi-model combinational logic configuration information of the prediction service to be deployed may include: and predicting a multi-model combinational logic expression among a plurality of machine learning models involved by the service, wherein the multi-model combinational logic expression contains identification information of the plurality of machine learning models.
At this time, after receiving the configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models, the prediction service gateway 30 loads the received configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models into the memory to complete the deployment of the prediction service. The deployed prediction service is used for predicting the future behavior of the user based on the feature extraction logic and the scoring logic of the at least two machine learning models and the combination logic between the at least two machine learning models.
In an embodiment of the present specification, the combination logic between the plurality of machine learning models may include one of the following combination modes:
Model segmentation combination: before a machine learning model is called, the class of the input data is determined, and the machine learning model to be called is selected according to that class; that is, different types of input data are predicted with different machine learning models. For example, for a combined model segmented by the gender feature, model A may be called to predict for boys, and model B may be called to predict for girls.
Model ensemble combination: a plurality of machine learning models are called simultaneously, one or more machine learning models with higher weights are selected from among them, and their outputs are combined to obtain the final output result.
Model chain combination: a plurality of machine learning models are arranged in a chain structure, where the output of each machine learning model serves as the input of the next, and the output of the last machine learning model serves as the final output result. For example, for three machine learning models A, B, and C arranged in a chain: the input of the combined model is the input of model A, the output of model A is the input of model B, the output of model B is the input of model C, and the output of model C is the output of the combined model.
Model weighted combination: the outputs of a plurality of machine learning models are combined according to an arithmetic logic, and the calculation result serves as the final output of the combined model. For example, suppose combined model C is a combination of model A and model B with the combinational logic X1 × Model A + X2 × Model B. This combinational logic means that weights X1 and X2 are set for model A and model B respectively, the outputs of model A and model B are summed with these weights, and the weighted sum is the final output of combined model C.
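Assuming each sub-model is simply a callable from input to output, the four combination modes can be sketched as follows (the ensemble variant here keeps the output of the single highest-weight model, which is one reading of the description above):

```python
# Sketches of the four combination modes over callable sub-models.

def segmented(models_by_class, classify, x):
    """Model segmentation: pick the sub-model by the class of the input."""
    return models_by_class[classify(x)](x)

def ensemble(models, weights, x):
    """Model ensemble: call all sub-models, keep the output of the
    highest-weight one (one simple reading of the text)."""
    outputs = [m(x) for m in models]
    return outputs[max(range(len(weights)), key=weights.__getitem__)]

def chained(models, x):
    """Model chain: the output of each model feeds the next."""
    for m in models:
        x = m(x)
    return x

def weighted(models, weights, x):
    """Model weighting: weighted sum of all sub-model outputs."""
    return sum(w * m(x) for m, w in zip(models, weights))

# Two trivial stand-in "models" for demonstration.
double = lambda v: v * 2
inc = lambda v: v + 1
```

Each mode only constrains how outputs are routed and merged; the sub-models themselves stay independent, which is what lets the platform deploy them separately.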
It can be seen that the combination modes among the above models may be various, that is, the model prediction platform 10 may deploy a plurality of different machine learning models for an online prediction service according to actual service requirements, and may also deploy combination logic among the plurality of machine learning models, so as to obtain various complex combination models, so as to meet the requirements of the service decision system 20 on the complex models, thereby improving the accuracy of the prediction service and the generalization capability of the prediction service.
Furthermore, the model prediction platform 10 may further include a metadata definition module 108, which is used to define metadata for the prediction service. The metadata here may include, but is not limited to, identification information of the prediction service, and may also include the name, input parameters, output parameters, and the like of the prediction service. In this specification, the identification information of a prediction service may refer to a service number (serviceID) plus a version number (version).
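A hypothetical shape for such metadata, with identification information composed of serviceID plus version as described; the field names are illustrative:

```python
# Illustrative metadata record for a prediction service.
from dataclasses import dataclass, field

@dataclass
class PredictionServiceMetadata:
    service_id: str
    version: str
    name: str = ""
    input_params: dict = field(default_factory=dict)
    output_params: dict = field(default_factory=dict)

    @property
    def identification(self) -> str:
        # identification information = service number + version number
        return f"{self.service_id}:{self.version}"

meta = PredictionServiceMetadata("comment_moderation", "v3",
                                 name="Comment moderation service")
```

Keying deployments by serviceID plus version lets several versions of one service coexist behind the gateway.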
It should be noted that the metadata definition module 108 may support defining the metadata in a plurality of languages, such as C++, Python, and Java.
It can be seen that, by using the model prediction platform 10, a combination of multiple machine learning models can be deployed as a prediction service, so as to meet the demand of an application program for performing prediction by using a complex model, and improve the accuracy of online prediction and the generalization capability of online prediction.
Furthermore, the model prediction platform 10 according to the embodiment of the present disclosure can provide unified metadata management for prediction services in different configuration forms, provide unified building and deployment capabilities for prediction services in different configuration forms, and build prediction services in different configuration forms into configurations that can be loaded by different prediction engines, so as to facilitate production analysis.
Fig. 3 is a schematic diagram of an internal structure of the configuration information obtaining module 102 according to one or more embodiments of the present disclosure. A specific implementation of the configuration information obtaining module 102 will be described in detail below with reference to fig. 3.
As shown in fig. 3, the configuration information obtaining module 102 may include: a sub-model configuration information obtaining module 302 and a multi-model combinational logic configuration information obtaining module 304.
The sub-model configuration information obtaining module 302 is configured to configure configuration information of each machine learning model related to the prediction service.
In an embodiment of the present specification, the sub-model configuration information obtaining module 302 may support the following configuration forms: a file configuration form, an autonomous coding form, and a visualization configuration form.
In the embodiments of the present specification, the file configuration form is suitable for users with some coding ability. The configuration information obtained through this form may include two result files: a feature extraction file and a model resource file. The feature extraction file describes the feature extraction logic, including but not limited to feature mapping and feature selection; the model resource file describes the scoring logic.
In the embodiment of the present specification, the autonomous coding form is suitable for users with strong coding ability, for example, users who can write python code themselves. It will be appreciated that in this form, the feature extraction logic and the scoring logic are described by the code that is written.
In the embodiments of the present specification, the visualization configuration form is suitable for users with weak coding ability. The configuration information obtained through this form may be a Directed Acyclic Graph (DAG), also referred to as a DAG flow. The DAG flow may include rule components, condition components, custom script components, feature operator components, algorithm components, and so on. It is to be understood that in this form, the feature extraction logic and the scoring logic are described by combinations of these components.
It should be noted that, for the file configuration form, the sub-model configuration information obtaining module 302 is supported at the bottom layer by the C++ language; for the autonomous coding form, it is supported at the bottom layer by the Python language, whose open-source machine learning frameworks are rich and diverse (for example, TensorFlow, scikit-learn, XGBoost, LightGBM, and Caffe); and for the visualization configuration form, it is supported at the bottom layer by the Java language. The hot-deployment capability of the Java language allows a user's configuration to be automatically compiled into bytecode and loaded within seconds, enabling rapid service availability.
The multi-model combinational logic configuration information obtaining module 304 is configured to configure combinational logic configuration information between at least two machine learning models involved in the prediction service.
In the embodiment of the present specification, similar to the sub-model configuration information obtaining module 302, the multi-model combinational logic configuration information obtaining module 304 may also support the file configuration form, the autonomous coding form, and the visualization configuration form, through any of which the user can configure the multi-model combinational logic configuration information. The combinational logic configuration information between the at least two machine learning models may be embodied as a multi-model combinational logic expression, for example, the expression X1 × Model A + X2 × Model B described above.
It can be seen that the configuration information obtaining module 102 according to the embodiment of the present disclosure can provide different configuration forms for users with different coding abilities, and can also provide a plurality of coding languages to accommodate users with different machine learning framework backgrounds and capabilities.
Fig. 4 is a schematic internal structural diagram of the sub-model deployment module 104 according to one or more embodiments of the present disclosure. The specific implementation of the sub-model deployment module 104 will be described in detail below with reference to fig. 4.
As shown in fig. 4, the submodel deployment module 104 may include: a sub-model distribution module 402 and a sub-model registration module 404.
The sub-model distribution module 402 is configured to respectively package configuration information of each machine learning model related to the prediction service, and distribute the packaged configuration information of each machine learning model to each machine of the designated machine cluster, so as to complete deployment of the at least two machine learning models on each machine.
In the embodiments of the present specification, each machine in the machine cluster may run a plurality of prediction engines. A prediction engine completes the instantiation of a deployed machine learning model, that is, it loads the model configuration information of the prediction service into memory; when an application program accesses the deployed machine learning model, the engine calls the model based on the preloaded configuration information to obtain its scoring result. Different machine learning frameworks can be packaged at the bottom layer of the prediction engine.
Additionally, the various prediction engines referred to herein may include, but are not limited to, a C++ prediction engine (CMPS for short), a python prediction engine (PyMPS for short), and a java prediction engine (JMPS for short). CMPS provides high-performance low-level prediction capability; its bottom layer integrates feature acquisition, feature extraction, model node orchestration, and deep learning models such as TensorFlow, pssmart, and Caffe, and also integrates a Field Programmable Gate Array (FPGA) to provide heterogeneous computing capability. PyMPS provides autonomous python prediction-service coding capability. JMPS provides flexible, visual prediction-service orchestration capability.
Based on the underlying support languages of the various configuration forms and the capabilities provided by the various prediction engines, it follows that: CMPS loads, parses, and executes the feature extraction file and model resource file obtained through the file configuration form; PyMPS loads, parses, and executes python code written through the autonomous coding form; and JMPS loads, parses, and executes the DAG flow assembled through the visualization configuration form.
The sub-model registration module 404 is configured to register, for one machine learning model, each machine that has completed deployment of that model, for example, by recording the IP address of each such machine and assigning the machines a uniform registration address (also called a domain name).
Specifically, in the deployment process of a machine learning model, each machine sends a registration request to the sub-model registration module 404 after completing the deployment of the machine learning model, and the sub-model registration module 404 allocates a uniform registration address to each machine after receiving the registration request of each machine.
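The registration flow can be sketched as below; the class name, the IP bookkeeping, and the domain-name scheme are illustrative assumptions:

```python
# Sketch: machines report in after loading a model; the registration module
# records their IPs and hands back one uniform registration address.

class SubModelRegistry:
    def __init__(self):
        self.machines_by_model = {}   # model_id -> list of machine IPs

    def register(self, model_id, machine_ip):
        self.machines_by_model.setdefault(model_id, []).append(machine_ip)
        # the uniform registration address (domain name) for this model
        return f"{model_id}.predict.example.internal"

registry = SubModelRegistry()
addr1 = registry.register("A", "10.0.0.1")
addr2 = registry.register("A", "10.0.0.2")
```

Every machine serving the same model receives the same address, so the gateway never needs to know individual machine IPs.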
It can be seen that the sub-model deployment module 104 also provides a plurality of prediction engines, so that it can adapt to the configuration information of different configuration forms.
Fig. 5 is a schematic diagram of an internal structure of the prediction service gateway 30 according to one or more embodiments of the present disclosure. The specific implementation of the prediction service gateway 30 will be described in detail below with reference to fig. 5.
As shown in fig. 5, the prediction service gateway 30 may include: a multi-model prediction service configuration information receiving module 502 and a multi-model prediction service configuration loading module 504.
In an embodiment of the present specification, the multi-model prediction service configuration information receiving module 502 may be configured to receive multi-model combinational logic configuration information of a prediction service to be deployed, which is sent by the model prediction platform 10, and registration addresses of at least two machine learning models corresponding to the prediction service.
In an embodiment of the present specification, the multi-model combinational logic configuration information of the prediction service to be deployed and the registration addresses of the at least two machine learning models corresponding to the prediction service may be carried in the form of one file or multiple files.
The multi-model prediction service configuration loading module 504 may be configured to load the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory, so as to complete instantiation of the prediction service, that is, complete deployment of the prediction service; wherein the post-deployment prediction service is configured to predict future behavior of the user based on feature extraction logic and scoring logic of the at least two machine learning models and combinatorial logic between the at least two machine learning models.
In an embodiment of the present specification, the multi-model prediction service configuration loading module 504 may specifically include two parts, a sub-model loading unit and a multi-model combinational logic loading unit, which together realize the deployment of the prediction service.
Specifically, the sub-model loading unit may be configured to load the identification information of the at least two machine learning models and the registration addresses corresponding to the at least two machine learning models into a memory repository. For example, in an embodiment of the present specification, a mapping Map<String1, String2> modelRegistryMap may be loaded in the memory repository, where String1 represents the identification information of a machine learning model and String2 represents the registration address corresponding to that machine learning model. After the mapping is loaded, the registration address corresponding to a machine learning model can be looked up in modelRegistryMap according to the identification information of that machine learning model.
The multi-model combinational logic loading unit may be configured to parse the multi-model combinational logic configuration information to obtain the multi-model combinational logic expression, instantiate that expression, and load it into the memory repository. For example, in an embodiment of the present specification, a mapping Map<String, Expression> multiModelRegistryMap may be loaded in the memory repository, where String represents the identification information of a prediction service and Expression represents its multi-model combinational logic expression. After the mapping is loaded, the multi-model combinational logic expression corresponding to a prediction service can be looked up in multiModelRegistryMap according to the identification information of that prediction service.
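Rendered in Python rather than the Java-style Map notation above, the two in-memory mappings and a combined lookup might look like this (all values are illustrative):

```python
# model identification -> registration address
model_registry_map = {
    "A": "A.predict.example.internal",
    "B": "B.predict.example.internal",
}

# prediction-service identification -> parsed combinational logic expression,
# represented here as (weights, model identifications)
multi_model_registry_map = {
    "comment_moderation:v3": ([0.6, 0.4], ["A", "B"]),
}

def lookup(service_id):
    """Resolve a service to its weights and sub-model registration addresses."""
    weights, model_ids = multi_model_registry_map[service_id]
    addresses = [model_registry_map[m] for m in model_ids]
    return weights, addresses
```

At request time the gateway performs exactly this two-step resolution: service identification to expression, then model identifications to registration addresses.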
The multi-model combinational logic loading unit may further parse the multi-model combinational logic configuration information to obtain the identification information of the at least two machine learning models involved in the prediction service, and verify whether a mapping corresponding to that identification information has already been loaded into the memory repository.
The multi-model combinational logic loading unit may further verify whether the parsed multi-model combinational logic expression conforms to the relevant rules; if so, the subsequent instantiation and loading operations are performed; otherwise, a corresponding error prompt may be output to the model prediction platform 10.
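One possible sketch of this parse-and-verify step, assuming the expression takes the weighted-sum form "w1 * M1 + w2 * M2 + …"; the grammar and the regular expression are assumptions for illustration:

```python
import re

# one term of the assumed expression grammar: "<weight> * <model id>"
TERM = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\*\s*(\w+)\s*$")

def parse_expression(expression):
    """Return (weights, model_ids); raise ValueError if a term is malformed."""
    weights, model_ids = [], []
    for term in expression.split("+"):
        m = TERM.match(term)
        if not m:
            raise ValueError(f"malformed term: {term!r}")
        weights.append(float(m.group(1)))
        model_ids.append(m.group(2))
    return weights, model_ids

def verify(expression, registered_models):
    """True iff the expression parses and only references registered models."""
    _, model_ids = parse_expression(expression)
    return all(m in registered_models for m in model_ids)
```

A failed verification here corresponds to the error prompt the unit would return to the model prediction platform 10.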
As described above, in the embodiment of the present disclosure, the prediction service gateway 30 may also be a machine cluster including a plurality of machines. In this case, it is necessary to complete the deployment of the multi-model combinational logic on each machine of the machine cluster, that is, to load the configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models in the memory of each machine by the method.
In an embodiment of the present specification, the multi-model prediction service configuration information receiving module 502 may be further configured to receive metadata of the prediction service; wherein the metadata includes at least: name information, identification information, input and output parameters of the prediction service. At this time, the multi-model prediction service configuration loading module 504 may be further configured to establish a correspondence between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models, so as to access the prediction service.
Since the prediction service gateway 30 may be deployed with a plurality of prediction services, in order to implement access to any one of the prediction services, the prediction service gateway 30 needs to provide a service routing call capability, such as a vipserver or RPC service. In addition, the forecast service gateway 30 may provide a uniform access interface for multiple forecast services. Therefore, the prediction service gateway 30 may further include: an access module 506, a combinational logic determination module 508, a service routing module 510, and a multi-model combination calculation module 512.
The access module 506 may be configured to receive an access request for a predictive service from an application.
In an embodiment of the present specification, the access request includes at least identification information of the prediction service.
The combinational logic determining module 508 may be configured to determine, based on the identification information of the prediction service, that the prediction service corresponds to a combined model, and to determine the multi-model combinational logic expression corresponding to that identification information according to the established correspondence; when evaluation of the multi-model combinational logic expression requires a machine learning model to be called, the module initializes the context parameters and triggers the service routing module 510 to call the corresponding machine learning model.
The service routing module 510 may be configured to determine a registration address of the machine learning model to be called according to the identifier of the machine learning model to be called; determining a target machine from the machines corresponding to the registered address; sending a model calling request to the target machine; the model call request is used for instructing the target machine to execute the machine learning model through the target prediction engine so as to obtain an intermediate result.
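The target-machine selection inside this routing step could be sketched as a simple round-robin over the machines registered behind one address; all details here are illustrative assumptions, not the patent's routing policy:

```python
import itertools

class ServiceRouter:
    """Resolve a registration address to one concrete target machine."""

    def __init__(self, machines_by_address):
        # round-robin iterator per registration address
        self._machines = {addr: itertools.cycle(machines)
                          for addr, machines in machines_by_address.items()}

    def pick_target(self, address):
        return next(self._machines[address])

router = ServiceRouter({"A.predict.example.internal": ["10.0.0.1", "10.0.0.2"]})
first = router.pick_target("A.predict.example.internal")
second = router.pick_target("A.predict.example.internal")
```

A production router (for example, vipserver- or RPC-based, as the text mentions) would also handle health checks and failover; the sketch shows only the address-to-machine resolution.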
The multi-model combination calculation module 512 may be configured to perform combination calculation on intermediate results output by the at least two machine learning models according to a multi-model combination logic expression between the at least two machine learning models to obtain a prediction result, and return the prediction result to the application program.
The above is a description of the structure of the model prediction platform 10 and the prediction service gateway 30. It is understood that the model prediction platform 10 can complete the deployment of the prediction service on the prediction service gateway 30, and the prediction service can be implemented by a combination of at least two machine learning models, and the following description will be made with reference to the accompanying drawings.
Fig. 6 is a flowchart of a deployment method of an online prediction service according to an embodiment of the present disclosure. The execution subject of the method may be any device with processing capability, such as a server, system, or platform, for example the model prediction platform 10 of fig. 1.
As shown in fig. 6, the deployment method of the online prediction service may specifically include:
step 602, obtaining sub-model configuration information of the prediction service to be deployed.
Optionally, before obtaining the sub-model configuration information of the prediction service to be deployed, predefined metadata of the prediction service may be obtained. As previously described, the predefined metadata may include at least identification information of the predicted service (e.g., serviceID + version). The metadata may also include name information, input and output parameters, etc. of the prediction service.
In an embodiment of the present specification, the prediction service corresponds to at least two machine learning models that have completed training. The sub-model configuration information at least includes configuration information of each of the at least two machine learning models. Wherein, the configuration information of the machine learning model at least comprises: feature extraction logic and scoring logic of the machine learning model.
The machine learning model herein may include, but is not limited to, deep learning models (e.g., Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), text classification models such as TextCNN, Deep Neural Networks (DNN), Wide & Deep, and the like) and Natural Language Processing (NLP) models.
In some embodiments of the present disclosure, the sub-model configuration information of the prediction service may be configured based on any one of the three configuration modes (i.e., a file configuration mode, an autonomous coding mode, and a visualization configuration mode). For the three configuration modes, reference may be made to the above description, which is not repeated herein. It should be noted that, the step 602 may be executed by the configuration information obtaining module 102 in fig. 2.
Step 604, deploying the at least two machine learning models on the machine cluster corresponding to the prediction service respectively according to the sub-model configuration information, and determining registration addresses corresponding to the at least two machine learning models respectively.
Step 606, obtaining multi-model combinational logic configuration information of the prediction service; wherein the configuration information of the multi-model combinational logic at least includes combinational logic between the at least two machine learning models.
In an embodiment of the present specification, multi-model combinational logic configuration information configured by a user through a file configuration form, an autonomous coding form, or a visualization configuration form may be acquired.
Step 608, sending the configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway.
In an embodiment of the present specification, after receiving the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models, the prediction service gateway loads them into memory to complete the deployment of the prediction service on the gateway. The deployed prediction service is configured to predict a user's future behavior based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between them.
In the embodiment of the present specification, for the configuration information of each machine learning model in the sub-model configuration information, the deployment of that machine learning model may be implemented by the method shown in fig. 7. As shown in fig. 7, the machine learning model deployment method may include:
Step 702, determining a machine cluster corresponding to the machine learning model, wherein the machine cluster comprises a plurality of machines and each machine runs a plurality of prediction engines; each prediction engine is configured to load and execute machine learning models of the corresponding configuration form.
Here, the machine cluster corresponding to each machine learning model involved in the prediction service may be preset. One machine cluster may include a plurality of machines, each of which may run a plurality of prediction engines, and each prediction engine is used to load and execute configuration information of the corresponding configuration form. It is understood that the prediction engines here may include, but are not limited to, CMPS, PyMPS, JMPS, etc.; the function of each engine is as described above and is not repeated here. In addition, each machine may be pre-installed with other runtime dependencies, such as external dependency libraries.
Step 704, distributing the configuration information of the machine learning model to each machine in the machine cluster.
Specifically, the configuration information of a machine learning model may first be packaged, and the packaged configuration information is then distributed to each machine in the machine cluster corresponding to that model. After receiving the distributed configuration information, any machine may determine the configuration form by analyzing the configuration information, select a target prediction engine from its multiple prediction engines based on the determined configuration form, and load the configuration information into memory through the target prediction engine to complete the deployment of the machine learning model. The deployed machine learning model participates in predicting the user's future behavior based on its feature extraction logic and scoring logic.
The configuration form of the configuration information may be determined from the language in which the configuration information is written, which may be any one of C++, Python, Java, etc. In one example, when the configuration information is written in C++, the configuration form of the prediction service may be the file configuration form; when written in Python, the autonomous coding form; and when written in Java, the visualization configuration form.
In addition, the target prediction engine may be selected based on the configuration form as follows: when the configuration form of the prediction service is the file configuration form, CMPS is selected from the multiple prediction engines as the target prediction engine; when it is the autonomous coding form, PyMPS is selected; and when it is the visualization configuration form, JMPS is selected.
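The language-to-form and form-to-engine mappings above can be sketched as a simple dispatch table. This is a minimal illustrative sketch; the dictionary keys and the function name `select_engine` are assumptions, while the engine names (CMPS, PyMPS, JMPS) come from the description above.

```python
# Hypothetical sketch: dispatch a model's configuration to the matching
# prediction engine based on the language its configuration is written in.

# Writing language of the configuration -> configuration form
FORM_BY_LANGUAGE = {
    "cpp": "file",          # C++ configs use the file configuration form
    "python": "autonomous", # Python configs use the autonomous coding form
    "java": "visual",       # Java configs use the visualization configuration form
}

# Configuration form -> target prediction engine
ENGINE_BY_FORM = {
    "file": "CMPS",
    "autonomous": "PyMPS",
    "visual": "JMPS",
}

def select_engine(config_language: str) -> str:
    """Return the prediction engine that should load a model whose
    configuration is written in `config_language`."""
    form = FORM_BY_LANGUAGE[config_language]
    return ENGINE_BY_FORM[form]
```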
Finally, the process of loading the configuration information into memory through the target prediction engine described in this specification is the instantiation process of the machine learning model. It will be appreciated that instantiating a machine learning model yields an instantiated object; when the machine learning model is later invoked, the instantiated object can be executed directly to complete the prediction.
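The instantiation step above can be sketched as parsing a configuration once into a callable object held in an in-memory store, so that a later invocation executes the object directly. This is an illustrative assumption of what "loading into memory" amounts to; the class name, the weighted-sum scoring logic, and the store layout are all hypothetical.

```python
# Hypothetical sketch: a prediction engine instantiates a model from its
# configuration and keeps the instantiated object in memory for direct calls.

class ModelInstance:
    """Instantiated object produced by loading a model's configuration."""
    def __init__(self, weights):
        self.weights = weights

    def predict(self, features):
        # Toy scoring logic: weighted sum over already-extracted features.
        return sum(w * x for w, x in zip(self.weights, features))

MODEL_STORE = {}  # in-memory store: model id -> instantiated object

def load_model(model_id, config):
    """Parse the configuration once and keep the instantiated object."""
    MODEL_STORE[model_id] = ModelInstance(config["weights"])

def invoke(model_id, features):
    """A later invocation executes the instantiated object directly."""
    return MODEL_STORE[model_id].predict(features)
```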
It should be noted that, the above steps 702 and 704 may be performed by the sub-model distribution module 402 in fig. 4.
It will be appreciated that after step 704 is performed, the deployment of the machine learning model is complete. After that, in order to facilitate access to the prediction service by an external application (e.g., the business decision system 20 in fig. 1), the model prediction platform 10 may further perform the following steps:
step 706, receiving the registration request sent by each machine after completing the deployment of the prediction service.
Step 708, responding to the registration request, registering the machines, and assigning a uniform registration address to the machines.
In an embodiment of the present specification, the registering may specifically include: recording the IP address of each machine and assigning a uniform registration address (also called a domain name) to the machines; this address may also be called the registration address of the machine learning model. A correspondence between the registration address and the identifier of the machine learning model can then be established, so that the machine learning model can subsequently be accessed through the registration address.
In an embodiment of the present specification, the registration addresses assigned to the machines deploying the same machine learning model may be the same, while the registration addresses assigned to the machines deploying different machine learning models are different.
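The registration behavior described above (one shared address per model, recorded machine IPs) can be sketched as follows. The address format `model-<id>.predict.internal` and the function name `register` are illustrative assumptions, not part of the specification.

```python
# Hypothetical sketch: machines deploying the same machine learning model
# share one registration address; different models get different addresses.

REGISTRY = {}  # model id -> {"address": domain name, "machines": [IPs]}

def register(model_id, machine_ip):
    """Record a machine's IP and return the model's uniform registration
    address, creating the address on the model's first registration."""
    entry = REGISTRY.setdefault(
        model_id,
        {"address": f"model-{model_id}.predict.internal", "machines": []},
    )
    entry["machines"].append(machine_ip)
    return entry["address"]
```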
It should be noted that the above steps 706 and 708 may be performed by the sub-model registration module 404 in fig. 4.
In summary, the machine learning model deployment method provided in this specification offers multiple configuration forms to users, addressing the usage-cost problem across machine learning scenarios: visual drag-and-drop configuration for users with weak coding ability, autonomous coding for users with strong programming ability, and file configuration for users in between; the encapsulation dimensions differ between scenarios, and only the corresponding capability is exposed to each user. In addition, the scheme provides multiple prediction engines supporting different languages, meeting the multi-language demands of the machine learning development process while supporting horizontal scaling. Finally, the C++ prediction engine provided by the scheme is a dedicated high-performance engine for deep learning: it supports chained execution, ships with a feature-operator computation library and a model parsing/loading library, provides mini-batch execution to address the invocation problem of large-scale personalized recommendation scenarios, offers flexible external-library loading, supports NLP and machine learning algorithms, allows algorithm components to be assembled by drag-and-drop, and has been rapidly applied in the fields of risk identification, credit scoring, and NLP.
In an embodiment of the present specification, the method for deploying a prediction service may further include:
firstly, acquiring predefined metadata of the prediction service; wherein the metadata may include: identification information of the prediction service.
In an embodiment of the present specification, the metadata may further include: name information, input, and output parameters of the prediction service.
Secondly, the metadata is sent to the prediction service gateway so that the prediction service gateway establishes correspondence between the identification information of the prediction service and the combinational logic between the at least two machine learning models.
Fig. 8 is a flowchart of a deployment method of an online prediction service according to an embodiment of the present disclosure. The execution subject of the method may be a device with processing capability, such as a server, system, or platform, for example the prediction service gateway 30 in fig. 1.
As shown in fig. 8, the deployment method of the online prediction service may specifically include:
step 802, receiving multi-model combinational logic configuration information of the prediction service to be deployed sent by the model prediction platform 10 and registration addresses of at least two machine learning models corresponding to the prediction service.
Step 804, loading the configuration information of the multi-model combinational logic and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service.
In an embodiment of the present specification, the step 804 may include:
firstly, loading the identification information of the at least two machine learning models and the registration addresses corresponding to the at least two machine learning models into a memory warehouse;
and secondly, analyzing the multi-model combinational logic configuration information, determining a multi-model combinational logic expression of the at least two machine learning models, instantiating the multi-model combinational logic expression, and loading the multi-model combinational logic expression into a memory warehouse.
Specifically, as mentioned above, the loading may refer to establishing a mapping and adding the mapping to the memory store.
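Steps 802 and 804 above (loading the registration addresses and the instantiated combinational expression into the gateway's memory store) can be sketched as establishing two mappings. This is a minimal sketch under assumptions: the store layout, the function name `deploy_service`, and the weighted-sum combination closure are all hypothetical.

```python
# Hypothetical sketch of the gateway's in-memory warehouse: model
# registration addresses and an instantiated combinational logic expression
# are both stored as mappings, completing the prediction service deployment.

MEMORY_STORE = {"addresses": {}, "expressions": {}}

def deploy_service(service_key, model_addresses, weights):
    """Load registration addresses and instantiate the combinational
    expression for one prediction service."""
    # Mapping 1: model id -> registration address.
    MEMORY_STORE["addresses"].update(model_addresses)
    # Mapping 2: service key -> instantiated expression. "Instantiating"
    # here means closing over the weights so running the service later
    # is a direct call on the stored object.
    MEMORY_STORE["expressions"][service_key] = (
        lambda scores: sum(w * scores[m] for m, w in weights.items())
    )
```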
In embodiments of the present description, the post-deployment prediction service may be configured to predict future behavior of the user based on feature extraction logic and scoring logic of the at least two machine learning models and combinatorial logic between the at least two machine learning models.
In some embodiments of the present specification, the deployment method of the online prediction service may further include:
step 806, receiving metadata of the prediction service; wherein the metadata may include: identification information of the predicted service.
In an embodiment of the present specification, the metadata may further include: name information, input, and output parameters of the prediction service.
Step 808, establishing a correspondence between the identification information of the prediction service and the combinational logic between the at least two machine learning models.
In an embodiment of the present specification, in step 808, a correspondence between the identification information of the prediction service and the multi-model combinational logic expression between the at least two machine learning models may be established.
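The correspondence established in step 808 can be sketched as a lookup table keyed by the service identity, so that an incoming access request resolves to its combinational logic expression in one lookup. The tuple key and the function names are illustrative assumptions.

```python
# Hypothetical sketch: key the combinational logic expression by the
# prediction service's identification information (serviceID + version).

SERVICE_TABLE = {}

def register_service(service_id, version, expression):
    """Establish the correspondence: service identity -> expression."""
    SERVICE_TABLE[(service_id, version)] = expression

def resolve(service_id, version):
    """Resolve an access request's identification information to the
    multi-model combinational logic expression it should run."""
    return SERVICE_TABLE[(service_id, version)]
```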
The above is a description of the deployment process of any of the predictive services. It will be appreciated that after deployment of the prediction service is complete, predictions can be made based on the prediction service. A method for performing online prediction based on the prediction service will be described in detail with reference to the accompanying drawings.
Fig. 9 is a flowchart of an online prediction method according to an embodiment of the present disclosure. The execution subject of the method may be a device with processing capability, such as a server, system, or platform, for example the prediction service gateway 30 in fig. 1.
As shown in fig. 9, the online prediction method may specifically include:
Step 902, receiving an access request for the prediction service sent by an application program, wherein the access request at least includes identification information of the prediction service.
In an embodiment of the present specification, the prediction service may be configured to extract user features based on the feature extraction logic of the multiple corresponding machine learning models, score a plurality of predetermined behaviors of the user with each model's scoring logic based on the extracted features, combine the scoring results of the multiple models according to the combinational logic among them, and use the combined result to predict the user's future behavior.
The access request may be sent by an application program (e.g., the business decision system 20 in fig. 1) based on the HTTP protocol or the TR protocol, and includes at least the identification information of the prediction service, i.e., serviceID + version.
Step 904, determining the multi-model combinational logic expression between the at least two machine learning models corresponding to the identification information of the prediction service according to the correspondence established in step 808.
Step 906, the multi-model combinational logic expression is run.
In an embodiment of the present specification, the executing the multi-model combinational logic expression may include:
according to the multi-model combinational logic expression, when a machine learning model needs to be called:
initializing context parameters;
determining a registration address of the machine learning model according to the identifier of the machine learning model;
determining a target machine from the machines corresponding to the registration address; and
sending a model call request to the target machine, wherein the model call request is used for instructing the target machine to execute the machine learning model through the target prediction engine so as to obtain an intermediate result.
In the embodiment of the present specification, based on the established correspondence between the identification information of the prediction service and the combinational logic between the at least two machine learning models corresponding to that service, the identifiers of the machine learning models corresponding to the prediction service and the multi-model combinational logic expression between them may be determined from the identification information of the prediction service (i.e., serviceID + version).
In an embodiment of the present specification, each machine herein may be deployed with multiple machine learning models, where the multiple machine learning models deployed on each machine include at least the machine learning model currently requested to be accessed.
In one implementation of the present description, the target machine may be determined from the machines corresponding to the registration address according to a load balancing algorithm.
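The target-machine selection can be sketched with a simple round-robin balancer; the specification does not fix a particular load balancing algorithm, so round robin here is one assumed choice, and the class name is hypothetical.

```python
import itertools

# Hypothetical sketch: pick a target machine from those behind one
# registration address. Round robin is one simple load-balancing choice.

class RoundRobinBalancer:
    def __init__(self, machines):
        # Cycle endlessly over the machines registered under one address.
        self._cycle = itertools.cycle(machines)

    def pick(self):
        """Return the next machine to receive a model call request."""
        return next(self._cycle)
```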
In an embodiment of the present specification, the model invocation request includes at least the identifier of the machine learning model and instructs the target machine to execute that model through the target prediction engine to obtain an intermediate result.
Step 908, performing combined calculation on the intermediate results output by the at least two machine learning models according to the multi-model combinational logic expression between them to obtain a prediction result.
Step 910, returning the prediction result to the application program.
In embodiments of the present description, the prediction results may be returned to business decision system 20 in fig. 1, for example.
For example, assume a prediction service C corresponds to a combined model C. After the prediction service C is deployed, the prediction service gateway 30 records the multi-model combinational logic expression X1 × modelA + X2 × modelB corresponding to the identification information of the prediction service C, together with the registration addresses of the machine learning models involved in the expression: model A and model B. In the above method, after an access request for the prediction service C is received, the identification information of the prediction service is first obtained; the multi-model combinational logic expression X1 × modelA + X2 × modelB corresponding to that identification information is then determined. Running the expression shows that the two machine learning models A and B need to be invoked concurrently, so the registration addresses of model A and model B are determined from their respective identifiers. After the context parameters are initialized, model A and model B are invoked through their registration addresses to obtain the two models' outputs a and b. Finally, the prediction result X1 × a + X2 × b is computed according to the multi-model combinational logic expression and returned to the application program. As this example shows, the deployed prediction service C for the combined model C enables concurrent invocation of the two sub-models, and the final output of the combined model C is the combined computation of the two sub-models' output results.
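The worked example above can be sketched end to end. The remote model calls are stubbed out with fixed scores, and the weight values X1 and X2 are assumed for illustration; in the real system each call would be routed to a registered address.

```python
# Toy sketch of the worked example: prediction service "C" combines two
# sub-models A and B with the expression X1 * a + X2 * b. Model calls are
# stubs; the weights and stub scores are assumed values.

X1, X2 = 0.7, 0.3  # combination weights recorded for service C (assumed)

def call_model(model_id, features):
    """Stub for the remote model call that returns the sub-model's score.
    In the real system this would send a model call request to a target
    machine behind the model's registration address."""
    stub_scores = {"A": 0.9, "B": 0.5}
    return stub_scores[model_id]

def predict_service_c(features):
    # The two sub-model calls are independent and may run concurrently.
    a = call_model("A", features)
    b = call_model("B", features)
    # Combined calculation per the multi-model combinational expression.
    return X1 * a + X2 * b
```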
It should be noted that step 902 may be performed by the access module 506 in fig. 5, steps 904 and 906 may be performed by the combinational logic determination module 508 and the service routing module 510 in fig. 5, and steps 908 and 910 may be performed by the multi-model combination calculation module 512 in fig. 5.
In short, the prediction method based on the online prediction service provided in this specification allows multiple machine learning models to be combined and deployed as one prediction service, meeting applications' need to predict with complex models and improving both the accuracy and the generalization ability of online prediction.
Furthermore, in the embodiment of the present specification, the model prediction platform 10 and the prediction service gateway 30 may be regarded as one electronic device or a cluster of several electronic devices, and therefore, the internal structure of the model prediction platform 10 and the prediction service gateway 30 may be as shown in fig. 10, including: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to each other within the device via a bus 1050.
The memory 1020 may be implemented in the form of ROM (Read-Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and may also store the modules of the model prediction platform 10 and the prediction service gateway 30 provided in the embodiments of the present specification; when the technical solutions provided in these embodiments are implemented in software or firmware, the relevant program code is stored in the memory 1020 and called and executed by the processor 1010.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The input/output interface 1030 may be used to connect input/output modules for information input and output. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (for example, USB, network cable, etc.), and can also realize communication in a wireless mode (for example, mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path that transfers information between various components of the device, such as the processor, memory, input/output interfaces, and communication interfaces.
It should be noted that although the above-described device shows only a processor, a memory, an input/output interface, a communication interface and a bus, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The technical carrier involved in payment in the embodiments of the present specification may include Near Field Communication (NFC), WIFI, 3G/4G/5G, POS machine card swiping technology, two-dimensional code scanning technology, barcode scanning technology, bluetooth, infrared, Short Message Service (SMS), Multimedia Message (MMS), and the like, for example.
The biometric features related to biometric identification in the embodiments of the present specification may include, for example, eye features, voice prints, fingerprints, palm prints, heart beats, pulse, chromosomes, DNA, human teeth bites, and the like. Wherein the eye pattern may include biological features of the iris, sclera, etc.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Computer-readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present description, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (21)

1. A deployment method of an online prediction service comprises the following steps:
acquiring sub-model configuration information of a prediction service to be deployed; wherein the prediction service corresponds to at least two machine learning models that have completed training; the sub-model configuration information comprises configuration information of each of the at least two machine learning models; the configuration information of the machine learning model includes: feature extraction logic and scoring logic of the machine learning model;
according to the sub-model configuration information, deploying the at least two machine learning models on a machine cluster corresponding to the prediction service respectively, and determining registration addresses corresponding to the at least two machine learning models;
acquiring multi-model combinational logic configuration information of the prediction service; wherein the multi-model combinational logic configuration information includes combinational logic between the at least two machine learning models;
sending the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway, so that the prediction service gateway loads the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service; wherein the post-deployment prediction service is configured to predict future behavior of the user based on feature extraction logic and scoring logic of the at least two machine learning models and combinatorial logic between the at least two machine learning models.
2. The method of claim 1, wherein the respectively deploying the at least two machine learning models on the cluster of machines corresponding to the prediction service comprises:
respectively executing the following steps for the configuration information of each machine learning model in the sub-model configuration information:
determining a machine cluster corresponding to the machine learning model; the machine cluster comprises a plurality of machines, and each machine runs with a plurality of prediction engines; each prediction engine is used for loading and executing a machine learning model of a corresponding configuration form;
distributing configuration information of the machine learning model to each machine in the machine cluster;
for any first machine of the machines, after receiving the configuration information of the machine learning model, the first machine analyzes the configuration form of the machine learning model based on the configuration information of the machine learning model; selecting a target prediction engine from the plurality of prediction engines based on the determined configuration form; loading, by the target prediction engine, configuration information of the machine learning model into a memory to complete deployment of the machine learning model on the first machine;
receiving a registration request sent by each machine after the prediction service is deployed;
and responding to the registration request, registering the machines and distributing a uniform registration address for the machines.
3. The method of claim 2, wherein the configuration modality of the prediction service is one of a file configuration modality, an autonomous coding modality, and a visualization configuration modality;
when the configuration modality of the prediction service is the file configuration modality, the target prediction engine selected from the plurality of prediction engines is the C++ prediction engine CMPS;
when the configuration modality of the prediction service is the autonomous coding modality, the target prediction engine selected from the plurality of prediction engines is the Python prediction engine PyMPS; and
when the configuration modality of the prediction service is the visualization configuration modality, the target prediction engine selected from the plurality of prediction engines is the Java prediction engine JMPS.
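Claim 3's mapping from configuration modality to prediction engine amounts to a dispatch table. A minimal sketch, with the modality keys (`"file"`, `"code"`, `"visual"`) chosen here for illustration:

```python
# Sketch of claim 3's engine selection: each configuration modality maps to
# exactly one prediction engine. Key names are illustrative assumptions.

ENGINE_BY_MODALITY = {
    "file": "CMPS",    # file configuration modality -> C++ prediction engine
    "code": "PyMPS",   # autonomous coding modality  -> Python prediction engine
    "visual": "JMPS",  # visualization modality      -> Java prediction engine
}

def select_engine(modality: str) -> str:
    """Pick the target prediction engine for a configuration modality."""
    try:
        return ENGINE_BY_MODALITY[modality]
    except KeyError:
        raise ValueError(f"unsupported configuration modality: {modality}")
```

A table keeps the mapping declarative, so adding a fourth modality later means one new entry rather than another branch.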
4. The method of claim 1, wherein obtaining the multi-model combinational logic configuration information of the prediction service comprises: acquiring a multi-model combinational logic expression configured by a user through a file configuration modality, an autonomous coding modality, or a visualization configuration modality.
5. The method of claim 1, wherein the combinational logic between the at least two machine learning models comprises one of model-segmented combinational logic, model-integrated combinational logic, model-chained combinational logic, and model-weighted combinational logic.
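Two of claim 5's combination styles are easy to sketch: model-weighted combination (a weighted sum of sub-model scores) and model-chained combination (each model consumes the previous model's output). The helper names and the stand-in models below are illustrative, not from the patent.

```python
# Sketches of two of claim 5's combination logics. A "model" here is any
# callable returning a score; real sub-models would be remote calls.

def weighted_combine(scores, weights):
    """Model-weighted combination: weighted sum of sub-model scores."""
    return sum(s * w for s, w in zip(scores, weights))

def chained_combine(models, x):
    """Model-chained combination: feed each model the previous output."""
    for model in models:
        x = model(x)
    return x

# stand-in sub-models for illustration
double = lambda v: 2 * v
inc = lambda v: v + 1

assert chained_combine([double, inc], 3) == 7       # inc(double(3))
assert weighted_combine([2, 4], [0.5, 0.5]) == 3.0  # 0.5*2 + 0.5*4
```

Model-segmented combination (route each input to one model) and model-integrated combination (e.g. take the max or a vote across models) follow the same pattern with a different reducer.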
6. The method of claim 1, further comprising:
obtaining predefined metadata of the prediction service, the metadata including identification information of the prediction service; and
sending the metadata to the prediction service gateway, so that the prediction service gateway establishes a correspondence between the identification information of the prediction service and the multi-model combinational logic of the at least two machine learning models.
7. A method for deploying an online prediction service, comprising:
receiving multi-model combinational logic configuration information of a prediction service to be deployed and registration addresses of at least two machine learning models corresponding to the prediction service, both sent by a model prediction platform; and
loading the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service; wherein the deployed prediction service is configured to predict future behavior of a user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
8. The method of claim 7, wherein the loading the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into memory comprises:
loading the identification information of the at least two machine learning models and the registration addresses corresponding to the at least two machine learning models into an in-memory repository; and
parsing the multi-model combinational logic configuration information, determining the multi-model combinational logic expression of the at least two machine learning models, instantiating the expression, and loading it into the in-memory repository.
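The gateway-side loading of claim 8 can be sketched with a toy in-memory repository: model ids and registration addresses go into one table, and the combination expression is parsed and instantiated once at deploy time so that serving only evaluates a prepared callable. Using Python's `compile`/`eval` as the expression parser is an illustrative stand-in for the gateway's real parser; `memory_repo` and both function names are assumed.

```python
# Hypothetical sketch of claim 8: the gateway keeps model addresses and the
# instantiated combination expression in an in-memory repository.

memory_repo = {"models": {}, "expressions": {}}

def load_models(model_addresses):
    """Load {model_id: registration_address} into the in-memory repository."""
    memory_repo["models"].update(model_addresses)

def load_expression(service_id, expression_src):
    """Instantiate the multi-model combination expression once, at deploy
    time. Here it is compiled to a callable over per-model scores; a real
    gateway would use its own expression parser."""
    code = compile(expression_src, service_id, "eval")
    memory_repo["expressions"][service_id] = lambda scores: eval(code, {}, scores)

load_models({"m1": "registry://m1", "m2": "registry://m2"})
load_expression("svc", "0.7 * m1 + 0.3 * m2")  # a model-weighted expression
combined = memory_repo["expressions"]["svc"]({"m1": 1.0, "m2": 0.0})
```

Instantiating at deploy time rather than per request is the point of claim 8's "instantiating the expression" step: the parse cost is paid once.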
9. The method of claim 7, further comprising:
receiving metadata of the prediction service, the metadata including identification information of the prediction service; and
establishing a correspondence between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models.
10. The method of claim 9, further comprising:
receiving an access request for the prediction service sent by an application program, wherein the access request comprises at least the identification information of the prediction service;
determining the multi-model combinational logic expression of the at least two machine learning models corresponding to the identification information of the prediction service according to the correspondence between the identification information of the prediction service and the multi-model combinational logic expression;
running the multi-model combinational logic expression;
when a machine learning model needs to be called, initializing context parameters according to the multi-model combinational logic expression; determining the registration address of the machine learning model according to the identifier of the machine learning model; determining a target machine from the machines corresponding to the registration address; and sending a model calling request to the target machine, the model calling request instructing the target machine to execute the machine learning model through its target prediction engine to obtain an intermediate result;
performing a combined calculation on the intermediate results output by the at least two machine learning models according to the multi-model combinational logic expression to obtain a prediction result; and
returning the prediction result to the application program.
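Claim 10's request path, end to end, is: look up the expression by service id, call each sub-model for an intermediate result, then combine. A sketch with stand-in components; `handle_request`, the request dict shape, and the fake model caller are all illustrative assumptions:

```python
# Hypothetical sketch of claim 10's serving path on the gateway.

def handle_request(request, expressions, call_model):
    """Resolve the expression for the requested service, collect each
    sub-model's intermediate result, and return the combined prediction."""
    service_id = request["service_id"]
    expr = expressions[service_id]  # via the id -> expression correspondence
    scores = {mid: call_model(mid, request["features"])
              for mid in expr["models"]}
    return expr["combine"](scores)

expressions = {
    "svc": {
        "models": ["m1", "m2"],
        # model-integrated combination, illustrated here as a max
        "combine": lambda s: max(s["m1"], s["m2"]),
    }
}

# stand-in for the routed remote model call of claim 10
fake_call = lambda model_id, features: {"m1": 0.2, "m2": 0.9}[model_id]

result = handle_request({"service_id": "svc", "features": {}},
                        expressions, fake_call)
# result == 0.9
```

In the patented system `call_model` would be the service routing step: resolve the model's registration address, pick a target machine, and issue the model calling request.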
11. The method of claim 10, wherein determining a target machine from the machines corresponding to the registration address comprises:
determining the target machine from the machines corresponding to the registration address according to a load balancing algorithm.
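Claim 11 leaves the load balancing algorithm open; round robin is a common minimal choice, sketched here (the `round_robin` helper is an assumption, not the patent's algorithm):

```python
# Sketch of claim 11's target-machine selection: successive model calls
# rotate over the machines behind the unified registration address.
import itertools

def round_robin(machines):
    """Return an iterator that yields target machines in rotation."""
    return itertools.cycle(machines)

picker = round_robin(["10.0.0.1:8500", "10.0.0.2:8500"])
targets = [next(picker) for _ in range(3)]
# targets == ["10.0.0.1:8500", "10.0.0.2:8500", "10.0.0.1:8500"]
```

Weighted or least-connections strategies would slot into the same place, since the caller only needs one target per model calling request.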
12. An online prediction service deployment apparatus, comprising:
a configuration information acquisition module, configured to acquire sub-model configuration information of a prediction service to be deployed and multi-model combinational logic configuration information of the prediction service; wherein the prediction service corresponds to at least two machine learning models that have completed training; the sub-model configuration information comprises the configuration information of each of the at least two machine learning models; the configuration information of a machine learning model comprises the feature extraction logic and scoring logic of the machine learning model; and the multi-model combinational logic configuration information comprises the combinational logic between the at least two machine learning models;
a sub-model deployment module, configured to deploy the at least two machine learning models on the machine cluster corresponding to the prediction service according to the sub-model configuration information, and to determine the registration addresses corresponding to the at least two machine learning models; and
a multi-model combinational logic deployment module, configured to send the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models to a prediction service gateway, so that the prediction service gateway loads them into memory to complete the deployment of the prediction service on the gateway; wherein the deployed prediction service is configured to predict future behavior of a user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
13. The apparatus of claim 12, wherein the configuration information acquisition module comprises:
a sub-model configuration information acquisition module, configured to acquire the configuration information of the at least two machine learning models corresponding to the prediction service; and
a multi-model combinational logic configuration information acquisition module, configured to acquire the combinational logic configuration information between the at least two machine learning models corresponding to the prediction service.
14. The apparatus of claim 12, wherein the submodel deployment module comprises:
a sub-model distribution module, configured to package the configuration information of the at least two machine learning models and distribute the packaged configuration information to each machine of a designated machine cluster, so as to complete the deployment of the machine learning models on each machine; and
a sub-model registration module, configured to receive, for each of the at least two machine learning models, the registration request sent by each machine after deployment is completed, register the machines, and allocate a unified registration address to the machines.
15. The apparatus of claim 12, further comprising:
a metadata definition module for defining metadata of the prediction service, wherein the metadata includes identification information of the prediction service.
16. An online prediction service deployment apparatus, comprising:
a multi-model prediction service configuration information receiving module, configured to receive multi-model combinational logic configuration information of a prediction service to be deployed and registration addresses of at least two machine learning models corresponding to the prediction service, both sent by a model prediction platform; and
a multi-model prediction service configuration loading module, configured to load the multi-model combinational logic configuration information and the registration addresses corresponding to the at least two machine learning models into a memory to complete the deployment of the prediction service; wherein the deployed prediction service is configured to predict future behavior of a user based on the feature extraction logic and scoring logic of the at least two machine learning models and the combinational logic between the at least two machine learning models.
17. The apparatus of claim 16, wherein the multi-model prediction service configuration loading module comprises:
a sub-model loading unit, configured to load the identification information of the at least two machine learning models and the corresponding registration addresses into an in-memory repository; and
a multi-model combinational logic loading unit, configured to parse the multi-model combinational logic configuration information, determine the multi-model combinational logic expression of the at least two machine learning models, instantiate the expression, and load it into the in-memory repository.
18. The apparatus of claim 17, wherein,
the multi-model prediction service configuration information receiving module is further configured to receive metadata of the prediction service, the metadata including identification information of the prediction service; and
the multi-model prediction service configuration loading module is further configured to establish a correspondence between the identification information of the prediction service and the multi-model combinational logic expression of the at least two machine learning models.
19. The apparatus of claim 18, further comprising:
an access module, configured to receive an access request for a prediction service sent by an application program, the access request comprising at least the identification information of the prediction service;
a combinational logic determining module, configured to determine the multi-model combinational logic expression corresponding to the prediction service according to the identification information of the prediction service, run the expression, initialize context parameters when a machine learning model needs to be called, and trigger the service routing module to call the corresponding machine learning model;
a service routing module, configured to determine the registration address of the machine learning model to be called according to its identifier, determine a target machine from the machines corresponding to the registration address, and send a model calling request to the target machine, the model calling request instructing the target machine to execute the machine learning model through a target prediction engine to obtain an intermediate result; and
a multi-model combined calculation module, configured to perform a combined calculation on the intermediate results output by the at least two machine learning models according to the multi-model combinational logic expression of the at least two machine learning models to obtain a prediction result, and to return the prediction result to the application program.
20. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 11 when executing the program.
21. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 11.
CN202010096889.3A 2020-02-17 2020-02-17 Online prediction service deployment method and device, electronic equipment and storage medium Pending CN111340232A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010096889.3A CN111340232A (en) 2020-02-17 2020-02-17 Online prediction service deployment method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN111340232A true CN111340232A (en) 2020-06-26

Family

ID=71185353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010096889.3A Pending CN111340232A (en) 2020-02-17 2020-02-17 Online prediction service deployment method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111340232A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615081A (en) * 2018-09-26 2019-04-12 阿里巴巴集团控股有限公司 A kind of Model forecast system and method
CN110543950A (en) * 2019-09-27 2019-12-06 宁波和利时智能科技有限公司 Industrial big data modeling platform
CN110555550A (en) * 2019-08-22 2019-12-10 阿里巴巴集团控股有限公司 Online prediction service deployment method, device and equipment


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111866855A (en) * 2020-07-17 2020-10-30 江苏海全科技有限公司 Intelligent terminal initialization activation method
CN111866855B (en) * 2020-07-17 2021-01-08 江苏海全科技有限公司 Intelligent terminal initialization activation method
CN112540835A (en) * 2020-12-10 2021-03-23 北京奇艺世纪科技有限公司 Operation method and device of hybrid machine learning model and related equipment
CN112540835B (en) * 2020-12-10 2023-09-08 北京奇艺世纪科技有限公司 Method and device for operating hybrid machine learning model and related equipment
WO2022161230A1 (en) * 2021-02-01 2022-08-04 大唐移动通信设备有限公司 Model update method and apparatus in communication system, and storage medium
CN113448545A (en) * 2021-06-23 2021-09-28 北京百度网讯科技有限公司 Method, apparatus, storage medium, and program product for machine learning model servitization
CN113448545B (en) * 2021-06-23 2023-08-08 北京百度网讯科技有限公司 Method, apparatus, storage medium and program product for machine learning model servitization
CN113608762A (en) * 2021-07-30 2021-11-05 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN113608762B (en) * 2021-07-30 2024-05-17 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN115185543A (en) * 2022-09-09 2022-10-14 腾讯科技(深圳)有限公司 Model deployment method, packing method, device, equipment and storage medium
CN115185543B (en) * 2022-09-09 2022-11-25 腾讯科技(深圳)有限公司 Model deployment method, packing method, device, equipment and storage medium
CN116775047A (en) * 2023-08-18 2023-09-19 北京偶数科技有限公司 Deployment method, device and medium of AI model service cluster architecture

Similar Documents

Publication Publication Date Title
CN111340232A (en) Online prediction service deployment method and device, electronic equipment and storage medium
US20200249936A1 (en) Method and system for a platform for api based user supplied algorithm deployment
CN110555550B (en) Online prediction service deployment method, device and equipment
US11210070B2 (en) System and a method for automating application development and deployment
CN110249304A (en) The Visual intelligent management of electronic equipment
US20210191759A1 (en) Elastic Execution of Machine Learning Workloads Using Application Based Profiling
CN114997412A (en) Recommendation method, training method and device
CN111158884A (en) Data analysis method and device, electronic equipment and storage medium
Behan et al. Adaptive graphical user interface solution for modern user devices
CN111784000A (en) Data processing method and device and server
CN112099848A (en) Service processing method, device and equipment
CN117009650A (en) Recommendation method and device
US20230215430A1 (en) Contextual attention across diverse artificial intelligence voice assistance systems
Ribeiro et al. A microservice based architecture topology for machine learning deployment
Fowdur et al. Big data analytics with machine learning tools
CN112099882B (en) Service processing method, device and equipment
CN111580883B (en) Application program starting method, device, computer system and medium
CN112882769A (en) Skill pack data processing method, skill pack data processing device, computer equipment and storage medium
WO2021120177A1 (en) Method and apparatus for compiling neural network model
CN111783985A (en) Information processing method, information processing device, model processing method, model processing device, and model processing medium
CN116225567A (en) Page loading method and device, storage medium and computer equipment
Rosendo et al. KheOps: Cost-effective Repeatability, Reproducibility, and Replicability of Edge-to-Cloud Experiments
CN114675978A (en) Operation framework of algorithm application element, data processing method, equipment and storage medium
Kiesel et al. TRIZ–develop or die in a world driven by volatility, uncertainty, complexity and ambiguity
CN110389754A (en) Method for processing business, system, server, terminal, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200626)