CN111966382A - Online deployment method and device of machine learning model and related equipment - Google Patents


Info

Publication number
CN111966382A
CN111966382A (application CN202010888528.2A)
Authority
CN
China
Prior art keywords
model
model file
machine learning
feature
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010888528.2A
Other languages
Chinese (zh)
Inventor
韦家强
王国印
郑德鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xunmeng Information Technology Co Ltd
Original Assignee
Shanghai Xunmeng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xunmeng Information Technology Co Ltd filed Critical Shanghai Xunmeng Information Technology Co Ltd
Priority to CN202010888528.2A priority Critical patent/CN111966382A/en
Publication of CN111966382A publication Critical patent/CN111966382A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/65Updates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/71Version control; Configuration management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an online deployment method and device of a machine learning model and related equipment. The online deployment method of the machine learning model comprises the following steps: matching, in a model storage system, a model file whose model version number meets a first loading rule; loading the matched model file; receiving a call request of a prediction service; reading feature data according to the call request; and inputting the feature data into the loaded model file for prediction. The method and the device provided by the invention achieve complete deployment of the online module, thereby improving the execution efficiency and stability of the online module.

Description

Online deployment method and device of machine learning model and related equipment
Technical Field
The invention relates to the field of computer application, in particular to an online deployment method and device of a machine learning model and related equipment.
Background
Currently, mainstream internet enterprises generally generate hundreds of millions of business-data records every day, and analysis and decision-making based on these data have become key to an enterprise's survival and development. Since the speed and volume of business-data generation have long exceeded the limits of manual processing, large-scale machine learning algorithms have become an important means of daily data analysis. When an enterprise implements a large-scale machine learning algorithm, the corresponding engineering architecture is generally divided into an offline module and an online module. The offline module mainly completes the learning process of the decision model; specifically, it comprises a model training module and a model evaluation module, and outputs a decision model file. The online module mainly completes the inference process of the decision model; specifically, it converts an online request into the input format corresponding to the decision model and obtains an inference result through the computation logic in the decision model.
At present, the offline module of large-scale machine learning algorithms has been explored and researched in depth by both academia and industry, and its development is relatively mature; however, there is still no effective, unified implementation scheme for the online module of large-scale machine learning algorithms.
Therefore, how to deploy a complete online module so as to improve its execution efficiency and stability is a technical problem urgently to be solved by those skilled in the art.
Disclosure of Invention
In order to overcome the defects of the related art, the invention provides an online deployment method and device of a machine learning model, an electronic device and a storage medium, so as to achieve complete deployment of an online module and improve its execution efficiency and stability.
According to one aspect of the invention, an online deployment method of a machine learning model is provided, which comprises the following steps:
matching, in a model storage system, a model file whose model version number meets a first loading rule;
loading the matched model file;
receiving a call request of a prediction service;
reading feature data according to the call request;
inputting the feature data into the loaded model file for prediction.
In some embodiments of the present invention, the model file is generated through offline training and is associated with a model version number when stored in the model storage system.
In some embodiments of the present invention, a write flag is associated with the model file in the model storage system; after the model file has been generated by offline training and written into the model storage system, the write flag is set to indicate that the model file has been completely written.
In some embodiments of the invention, the first loading rule comprises one or more of the following loading rules:
the model version number is the latest;
the write flag indicates that the model file has been completely written;
the model version number matches the specified model version number.
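The first loading rule above can be illustrated with a small sketch. The dictionary entries, the sortable version-string format, and the `write_complete` field are assumptions made for this example, not the patent's concrete data model:

```python
# Hypothetical sketch of matching a model file under the first loading
# rule: prefer a specified version if given, otherwise take the latest
# version whose write flag shows writing has completed.

def match_model(entries, specified_version=None):
    """entries: list of dicts with 'version' (sortable string) and
    'write_complete' (bool, the write flag)."""
    # Only consider files whose write flag says writing has completed.
    complete = [e for e in entries if e["write_complete"]]
    if specified_version is not None:
        # Rule: the model version number matches the specified version.
        for e in complete:
            if e["version"] == specified_version:
                return e
        return None
    # Rule: otherwise take the latest fully written version.
    return max(complete, key=lambda e: e["version"], default=None)

entries = [
    {"version": "20200829_1200", "write_complete": True},
    {"version": "20200830_1200", "write_complete": False},  # still writing
    {"version": "20200828_1200", "write_complete": True},
]
print(match_model(entries)["version"])
print(match_model(entries, "20200828_1200")["version"])
```

Note that the newest entry is skipped because its write flag is not yet set, which is exactly the waiting problem the write flag is meant to avoid.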
In some embodiments of the present invention, after loading the matched model file and before receiving the call request of the prediction service, the method further comprises:
checking whether the model file in the model storage system has been updated;
if so, judging whether the updated model file conforms to a second loading rule;
and if so, loading the updated model file.
In some embodiments of the invention, the second loading rule comprises one or more of the following loading rules:
the model version number is later than the model version number of the currently loaded model file;
the write flag indicates that the model file has been completely written;
no model version number is specified.
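The second loading rule reduces to a reload decision, sketched below. The field names and the assumption that version strings sort chronologically are illustrative:

```python
# Illustrative check for the second loading rule: reload only when no
# version is pinned, the candidate is fully written, and it is newer.

def should_reload(current_version, candidate, specified_version=None):
    """Decide whether an updated model file should replace the loaded one."""
    # Rule: no model version number is specified (a pinned version wins).
    if specified_version is not None:
        return False
    # Rule: the write flag indicates the file is completely written.
    if not candidate["write_complete"]:
        return False
    # Rule: the candidate version is later than the loaded version.
    return candidate["version"] > current_version

print(should_reload("v1", {"version": "v2", "write_complete": True}))
print(should_reload("v1", {"version": "v2", "write_complete": False}))
print(should_reload("v1", {"version": "v2", "write_complete": True}, "v1"))
```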
In some embodiments of the present invention, after loading the matched model file and before reading the feature data according to the call request, the method further comprises:
judging whether the feature data to be read includes feature data generated by the model file during offline training, the feature data generated during offline training having been stored in a cache system;
if so, reading the feature data from the cache system according to a reading rule.
In some embodiments of the present invention, the feature data generated by the model file during offline training is stored in the cache system in the form of a first key-value pair, where the key of the first key-value pair stores the feature name and feature version number of the feature data, and the value of the first key-value pair stores the feature information of the feature data.
In some embodiments of the present invention, the feature data generated by the model file during offline training is stored in the cache system in association with an expiration time, so that feature data with at least two feature version numbers for the same feature name exist in the cache system at the same time.
In some embodiments of the present invention, after the feature data generated by the model file during offline training is stored in the cache system, it is associated with a second key-value pair in the cache system; the key of the second key-value pair stores a feature version identifier, the value of the second key-value pair stores a feature version number, and the second key-value pair is obtained before the first key-value pair.
In some embodiments of the invention, the feature version number of the feature data is consistent with the model version number of the model file that produced the feature data.
In some embodiments of the invention, the reading rule comprises any one of the following reading rules:
the feature version number is consistent with the model version number of the currently loaded model file;
the feature version number is the latest;
the feature version number matches the specified feature version number.
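The reading rules above are alternatives; one plausible way to combine them is a precedence order (specified version, then model-consistent version, then latest), which is an assumption of this sketch rather than something the patent fixes:

```python
# Hypothetical selection of a feature version under the reading rules.

def pick_feature_version(model_version, available, specified=None):
    """available: iterable of feature version strings present in the cache."""
    versions = set(available)
    if specified is not None:
        # Rule: the feature version number matches the specified version.
        return specified if specified in versions else None
    if model_version in versions:
        # Rule: consistent with the currently loaded model's version.
        return model_version
    # Rule: otherwise fall back to the latest feature version.
    return max(versions) if versions else None

print(pick_feature_version("v3", ["v2", "v3"]))
print(pick_feature_version("v4", ["v2", "v3"]))
print(pick_feature_version("v3", ["v2", "v3"], "v2"))
```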
In some embodiments of the invention, the model storage system stores model files for a set historical period of time.
In some embodiments of the invention, the model storage system is an object storage system or a distributed storage system.
According to another aspect of the present invention, there is also provided an online deployment apparatus of a machine learning model, including:
the matching module is configured to match a model file with a model version number meeting a first loading rule from a model storage system;
a loading module configured to load the matched model file;
a receiving module configured to receive a call request of a prediction service;
the reading module is configured to read the feature data according to the call request;
a prediction module configured to input the feature data into the loaded model file for prediction.
According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps as described above.
According to yet another aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps as described above.
Compared with the prior art, the invention has the advantages that:
The invention matches, in a model storage system, a model file whose model version number conforms to a first loading rule, loads the matched model file, reads the feature data, and performs model prediction based on the model file. Accurate and stable loading of the model file is thereby achieved based on the first loading rule, complete deployment of the online module is realized, and the execution efficiency and stability of the online module are improved.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 shows a flow diagram of a method for online deployment of a machine learning model according to an embodiment of the invention.
FIG. 2 illustrates a flow diagram for update loading of a model file, according to a specific embodiment of the invention.
FIG. 3 is a flow chart illustrating online prediction using feature data in training a model according to an embodiment of the present invention.
FIG. 4 shows a flow diagram for prediction using a machine learning model, according to an embodiment of the invention.
FIG. 5 illustrates a block diagram of an apparatus for online deployment of a machine learning model, in accordance with an embodiment of the present invention.
Fig. 6 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the invention.
Fig. 7 schematically illustrates an electronic device in an exemplary embodiment of the invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the steps. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
In various embodiments of the present invention, the online deployment method of the machine learning model provided by the present invention may be applied to various scenarios, for example, prediction of logistics delivery time, prediction of logistics service quality, user-portrait prediction, recommended-goods prediction, and the like; the invention is not limited thereto. Through online deployment of the machine learning model, the invention realizes data storage, data transmission and data calling of the model file and the feature file, improves the efficiency and stability of online model prediction, and avoids errors in online model prediction, thereby further improving system stability during data storage, data transmission, data calling and online prediction.
FIG. 1 shows a flow diagram of a method for online deployment of a machine learning model according to an embodiment of the invention. The online deployment method of the machine learning model comprises the following steps:
step S110: model files matching model version numbers in a model storage system and meeting a first loading rule are stored in the model storage system.
In particular, the model file may be produced via offline training. Each model file, when produced offline, is associated with a model version number, so that the version number can be stored in the model storage system together with the file. The model version number can be used to distinguish model files for different prediction functions, as well as model files for the same prediction function produced by offline training at different times.
In the above embodiment, the model file in the model storage system may further be associated with a write flag. After the model file has been output through offline training and written into the model storage system, the write flag is set to indicate that the model file has been completely written. In this way, the write state of that version of the model file can be determined through the write flag, which prevents calling a model file that has not been completely written; such a call would either fail to realize the model's prediction function, or would have to wait for the write to finish, increasing latency and reducing online prediction efficiency.
Specifically, the first loading rule includes one or more of the following loading rules: the model version number is the latest; the write flag indicates that the model file has been completely written; the model version number matches the specified model version number. A first loading rule can thus be set as needed. Loading the model file with the latest version number (a version update usually means the new model file has higher prediction accuracy or better prediction performance) avoids using an old version and degrading prediction performance. Loading a model whose version number matches a specified model version number avoids problems such as a new-version model file being lost due to system downtime, which would make model prediction impossible. The write-flag rule can be combined with either the latest-version rule or the specified-version rule, ensuring that the currently loaded model file has definitely been written into the model storage system in full.
Step S120: and loading the matched model file.
Step S130: a call request for a predictive service is received.
Step S140: and reading the characteristic data according to the calling request.
Step S150: inputting the characteristic data into the loaded model file for prediction.
Specifically, the ordering above only schematically illustrates one execution sequence of the steps; the present invention is not limited thereto, and the steps may be executed in other sequences. For example, step S130 and step S140 may be executed before step S110, before step S120, or in synchronization with step S110 or step S120.
In the online deployment method of the machine learning model, the matched model file is loaded by matching, in the model storage system, a model file whose model version number conforms to the first loading rule; the feature data is then read and model prediction is performed based on the model file. Accurate and stable loading of the model file is thus achieved based on the first loading rule, complete deployment of the online module is realized, and the execution efficiency and stability of the online module are improved.
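Steps S110 through S150 can be strung together in a minimal sketch. Here a dict stands in for the model storage system, a callable stands in for the model file, and all names are illustrative rather than the patent's interfaces:

```python
# Minimal end-to-end sketch of steps S110-S150.

storage = {
    "v1": {"write_complete": True,  "model": lambda feats: sum(feats)},
    "v2": {"write_complete": True,  "model": lambda feats: sum(feats) * 2},
    "v3": {"write_complete": False, "model": None},  # still being written
}
feature_store = {"req-42": [1.0, 2.0, 3.0]}

# S110: match the model whose version meets the first loading rule
# (latest version among fully written files).
version = max(v for v, e in storage.items() if e["write_complete"])
# S120: load the matched model file.
model = storage[version]["model"]
# S130/S140: receive a call request and read its feature data.
features = feature_store["req-42"]
# S150: input the feature data into the loaded model for prediction.
prediction = model(features)
print(version, prediction)
```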
FIG. 2 illustrates a flow diagram for update loading of a model file, according to a specific embodiment of the invention. Specifically, the update loading of the model file may be performed after the matched model file is loaded at step S120 and before the call request of the prediction service is received at step S130. Fig. 2 shows the following steps together:
step S121: checking whether the model file in the model storage system has update.
Specifically, the update judgment in step S121 may determine whether the model file has been updated according to the version number associated with each model file in the model storage system. For example, version-number values may increase monotonically in update order, so whether a version number in the model storage system is greater than that of the currently loaded model file can be determined by comparing the values, thereby determining whether the model file has been updated; the present invention is not limited in this respect.
If an updated model file exists in step S121, step S122 is executed to determine whether the updated model file conforms to the second loading rule. If step S121 determines that no updated model file exists, no update is needed.
In particular, the second loading rule may include one or more of the following loading rules: the model version number is later than the model version number of the currently loaded model file; the write flag indicates that the model file has been completely written; no model version number is specified. The three loading rules can be used together. The rule that no model version number is specified means the user has not pinned a version, so when the model file is updated, the updated model file can be loaded (if the user has specified a version number, the model file of that version is loaded and no update loading is needed). The rule that the model version number is later than that of the currently loaded model file identifies that the model file has been updated. The write flag indicating that the model file has been completely written ensures that the model file to be loaded has been fully written into the model storage system.
If step S122 determines yes, step S123 is executed: load the updated model file. If step S122 determines that the rule is not met, no loading is needed.
Specifically, in an embodiment of the present invention, the model file may be stored as follows.
The size and format of a model file trained by a machine learning algorithm typically depend on the machine learning framework and training data set used; for example, models trained with the TensorFlow framework (an open-source software library for high-performance numerical computation) are typically pb files, and other frameworks may export PMML (Predictive Model Markup Language) files. A model file in pb or PMML format is generally large, may be generated periodically by an offline training task, and the online service must be able to update the model without repackaging and republishing. Thus, the model file may be stored using an object storage service or a distributed file service.
Specifically, the model file is updated iteratively as the offline task executes continuously, so the update and expiration design of the online model is a critical problem to be solved. In an object storage or distributed file service environment, after an offline training run completes, the model can be updated in the following manner: first, mark the version number of the model file (the generation date and time of the model file can be used); second, generate a blank file (serving as the write flag) under the model file's directory to mark that the model write has completed; then, while online requests access the model file, a background thread periodically checks whether the model version has been updated and completely written. If a new model version exists and loads successfully, the new model replaces the old model; otherwise the check exits and waits for the next detection period. Model update rules may also be customized according to the machine learning task; for example, the model update operation may be allowed only after the hit rate and accuracy on the test set reach certain thresholds.
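The write-then-mark protocol described above can be sketched with the local filesystem standing in for object storage. The directory layout and the marker name `_WRITE_COMPLETE` are assumptions of this sketch:

```python
# Sketch of the publish protocol: write the model under a versioned path,
# then drop an empty marker file last, so readers can tell a fully
# written version from one still in flight.
import os
import tempfile

def publish_model(base_dir, version, model_bytes):
    model_dir = os.path.join(base_dir, version)
    os.makedirs(model_dir, exist_ok=True)
    with open(os.path.join(model_dir, "model.pb"), "wb") as f:
        f.write(model_bytes)
    # The blank marker is written only after the model bytes are on disk.
    open(os.path.join(model_dir, "_WRITE_COMPLETE"), "w").close()

def complete_versions(base_dir):
    """Versions a background checker would consider loadable."""
    return sorted(
        v for v in os.listdir(base_dir)
        if os.path.exists(os.path.join(base_dir, v, "_WRITE_COMPLETE"))
    )

base = tempfile.mkdtemp()
publish_model(base, "20200830120000", b"\x00model")
os.makedirs(os.path.join(base, "20200831120000"))  # partial write, no marker
print(complete_versions(base))
```

A background thread would call `complete_versions` each detection period and reload when a new version appears.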
For the online model file, in order to guard against system downtime, historical versions over a certain period need to be retained for rollback in an emergency. For request tasks with low real-time requirements, a manually specified model version can be configured; if real-time requirements are high, or the manually specified version configuration fails, the current latest version of the model file can be used by default.
Specifically, after the model file is loaded, it may be parsed using a model-parsing Software Development Kit (SDK) for the corresponding model format. A pb file generated by TensorFlow can be parsed using the TensorFlow SDK. A PMML model may be parsed using JPMML. Custom model formats can be handled by user-defined parsing logic. The present invention may also be implemented in many other ways, which are not described herein.
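Picking a parser by model format amounts to a small dispatch table, sketched below. The registered callables are placeholders where, for example, TensorFlow's SavedModel loader or a JPMML/pypmml parser would plug in; neither library is invoked here:

```python
# Format-based parser dispatch (hypothetical registry, not a real SDK).

PARSERS = {}

def register_parser(suffix, parse_fn):
    PARSERS[suffix] = parse_fn

def parse_model(path):
    # Try longer suffixes first so ".pmml" is not shadowed by shorter ones.
    for suffix, parse_fn in sorted(PARSERS.items(), key=lambda kv: -len(kv[0])):
        if path.endswith(suffix):
            return parse_fn(path)
    raise ValueError(f"no parser registered for {path!r}")

register_parser(".pb", lambda p: ("tensorflow-pb", p))  # stand-in for TF SDK
register_parser(".pmml", lambda p: ("pmml", p))         # stand-in for JPMML

print(parse_model("models/20200830/model.pb")[0])
```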
The above is merely an illustrative description of the storage and deployment of the model files of the present invention and the present invention is not so limited.
FIG. 3 is a flow chart illustrating online prediction using feature data produced during model training, according to an embodiment of the present invention. These steps are performed after the matched model file is loaded in step S120 and before the feature data is read according to the call request in step S140. Fig. 3 shows the following steps:
step S131: and judging whether the characteristic data to be read comprises the characteristic data generated by the model file during the offline training, and storing the characteristic data generated by the model file during the offline training into a cache system.
If the determination in step S131 is yes, step S132 is executed: read the feature data from the cache system according to a reading rule.
In particular, in further embodiments of the present invention, the reading rule may include any one of the following reading rules: the feature version number is consistent with the model version number of the currently loaded model file; the feature version number is the latest; the feature version number matches the specified feature version number. Keeping the feature version number consistent with the model version number of the currently loaded model file ensures consistency between the feature data and the model file. Using the latest feature version number allows real-time prediction with the most recent feature data, avoiding the loss of prediction accuracy caused by stale feature data. A feature version number matching a specified feature version number avoids problems such as a new-version feature file being lost due to system downtime, which would make model prediction impossible.
Thus, multiplexing of feature data can be achieved, so that feature data can be used for off-line training and on-line prediction at the same time.
Specifically, the feature data generated by the model file during offline training may be stored in the cache system in the form of a first key-value pair, where the key of the first key-value pair stores the feature name and feature version number, and the value stores the feature information. Feature data from model training sometimes needs to be read and analyzed a second time in the online environment, so the storage design of the feature file must take this requirement into account and make online reads convenient. The content of the feature file also changes as the model changes; considering that feature data is generally unstructured and short-lived, a caching service in key-value format (for example, a Redis cluster) is chosen to store it. For a model feature file cached online, the key format can be designed as business-rule prefix + feature plaintext + feature version (the feature version is usually kept consistent with the model version); the value records the acquired feature information. The present invention is not limited in this respect. In particular, the invention can also provide a setting in an online configuration service for manually specifying the feature version. The online prediction service preferentially uses the manually specified version as the feature version; if the forced feature version is configured as empty, the feature data of the currently effective version is automatically obtained from the cache, enabling emergency rollback of the feature data.
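The cache layout just described can be sketched with a plain dict standing in for the Redis cluster. The `prefix:name:version` key shape and the pinned-version precedence are illustrative assumptions:

```python
# Hypothetical cache schema: key = business-rule prefix + feature name +
# feature version; value = feature information.

cache = {}

def feature_key(prefix, name, version):
    return f"{prefix}:{name}:{version}"

def write_feature(prefix, name, version, info):
    cache[feature_key(prefix, name, version)] = info

def read_feature(prefix, name, model_version, pinned_version=None):
    # A manually pinned feature version takes precedence; when the pin is
    # empty, fall back to the version matching the loaded model.
    version = pinned_version if pinned_version else model_version
    return cache.get(feature_key(prefix, name, version))

write_feature("eta", "avg_transit_days", "v20200829", 3.2)
write_feature("eta", "avg_transit_days", "v20200830", 3.0)
print(read_feature("eta", "avg_transit_days", "v20200830"))
print(read_feature("eta", "avg_transit_days", "v20200830",
                   pinned_version="v20200829"))
```

The second read demonstrates emergency rollback: pinning an older feature version overrides the model-consistent default.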
In some embodiments, the feature version number of the feature data is consistent with the model version number of the model file that produced the feature data. The association between the feature data and the model file can thus be realized intuitively.
Specifically, the feature data generated by the model file during offline training is stored in the cache system in association with an expiration time, so that feature data with at least two feature version numbers for the same feature name exist in the cache system at the same time. The life cycle of the feature data is thereby managed. Further, a reasonable expiration time must be set for the feature information, at least ensuring that two versions exist simultaneously for online use throughout the feature's life cycle. In a specific execution, before offline feature writing, the feature format is assembled, and the corresponding expiration time is set when writing into the cache.
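The expiration constraint above (the cache must hold at least two feature versions at once, so the previous version stays available for rollback) can be illustrated with a toy per-key-TTL cache in the style of Redis SETEX. The hourly generation interval is an assumption for the example:

```python
# Toy expiring cache; the TTL is sized to outlive two feature-generation
# cycles so two versions coexist.
import time

class TTLCache:
    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.time() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or time.time() > entry[1]:
            return None  # expired or absent
        return entry[0]

cache = TTLCache()
generation_interval = 3600            # assumed: features exported hourly
ttl = 2 * generation_interval + 600   # spans two cycles, with margin
cache.set("feat:ctr:v1", 0.12, ttl)
cache.set("feat:ctr:v2", 0.15, ttl)
print(cache.get("feat:ctr:v1"), cache.get("feat:ctr:v2"))  # both versions live
```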
Specifically, after the feature data generated by the model file during offline training is stored in the cache system, it is associated with a second key-value pair in the cache system; the key of the second key-value pair stores a feature version identifier, the value stores the feature version number, and the second key-value pair is acquired before the first key-value pair. Further, after all features have been written into the cache, a special key-value pair (the second key-value pair) may be defined, where the key is the feature version identifier and the value is the version number; this second key-value pair must be read before the online service accesses the feature content.
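The two-step read (version pointer first, then the versioned feature key) can be sketched as follows. The key names are illustrative; the point is that the pointer is written only after all features land, so readers never see a half-written version:

```python
# Sketch of the second key-value pair acting as a version pointer.

cache = {
    # first key-value pairs: feature name + version -> feature info
    "user_profile:age_bucket:v41": "18-24",
    "user_profile:age_bucket:v42": "25-34",
    # second key-value pair: feature version identifier -> version number
    "feature_version:user_profile": "v42",
}

def read_feature(cache, group, name):
    # Fetch the version pointer first, then the versioned feature key.
    version = cache[f"feature_version:{group}"]
    return cache[f"{group}:{name}:{version}"]

print(read_feature(cache, "user_profile", "age_bucket"))
```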
The above is merely an illustrative description of the manner in which the feature data of the present invention is stored and read, and the present invention is not limited thereto.
Referring now to fig. 4, fig. 4 illustrates a flow diagram for prediction using a machine learning model, according to an embodiment of the invention.
As shown in fig. 4, in the offline phase: the machine learning platform 250 may first perform step S201, reading historical data from the big data platform 270 and training to produce a model file; the produced model file is written, through step S203, into a path carrying the model version in the object storage service 230. After the model file has been successfully written into object storage, a blank write-completion marker file (identifying that this version's model file has been fully produced and can be loaded) can be written under the same path prefix. If the feature data generated by the training process needs to be read by the online prediction service 220, the feature platform 260 may, through steps S203 and S204, assemble the feature data into key-value format (the key itself carries a version) according to the business rules, export it into the caching service 240, and set a reasonable cache time (at least ensuring that two model versions exist at any moment, so that a rollback version is available when a problem occurs). After all feature writes complete, the value of the key representing the feature version is updated, in the same caching service 240, to the currently exported feature version.
As shown in fig. 4, in the online phase: when the online prediction service 220 starts, it loads the model through step S205 according to the model loading policy (loading a specified model version or loading the current latest version), and the model object is cached in the client corresponding to the online prediction service 220 for online real-time prediction. On startup, the online prediction service 220 may also launch a background timing task that periodically checks whether a new version of the model file has been produced and whether the loading conditions are met (for example, the new version is newer than the version currently in service, no model version has been manually specified, and so on); if they are met, the new version of the model is loaded. If the online prediction service 220 needs to read feature data, it can read features of the specified version through step S206 according to the policy (either the model version and feature version must be consistent, or they need not be consistent and each only needs to be its own latest version). The business system 210 calls the online prediction service 220 through step S200; the service reads features according to the business rules and performs prediction using the model, then returns the prediction result through step S207. In this embodiment, when a problem is encountered, the online prediction service 220 performs a rollback using an appropriate rollback policy (e.g., the aforementioned model and feature rollback policies).
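The loading policy and the background version check can be sketched as follows. The in-memory `available` table stands in for listing versioned paths in the object storage, and all names are illustrative assumptions:

```python
from typing import Optional

# Version directories as produced by the offline phase; "complete" mirrors
# the presence of the write-completion marker file (names are illustrative).
available = {
    "20201118": {"complete": True},
    "20201119": {"complete": True},
    "20201120": {"complete": False},   # still being written
}

def resolve_version(pinned: Optional[str] = None) -> str:
    """Pick the model version to load: the manually specified version if
    one is pinned, otherwise the newest fully-written version."""
    if pinned is not None:
        return pinned
    complete = [v for v, meta in available.items() if meta["complete"]]
    return max(complete)   # date-style version ids sort chronologically

def should_reload(current: str, pinned: Optional[str] = None) -> bool:
    """Background timing task: reload only when no version is pinned and
    a newer, fully-written version exists."""
    return pinned is None and resolve_version() > current
```

A timer would call `should_reload` periodically; because incomplete versions are filtered out, the half-written 20201120 model is never picked up.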
The above are merely several specific implementations of the present invention; each may be implemented independently or in combination, and the present invention is not limited thereto.
Referring now to FIG. 5, FIG. 5 illustrates a block diagram of an apparatus for online deployment of a machine learning model, according to an embodiment of the invention. The online deployment apparatus 300 of the machine learning model includes a matching module 310, a loading module 320, a receiving module 330, a reading module 340, and a prediction module 350.
The matching module 310 is configured to match model files from a model storage system whose model version numbers meet a first loading rule;
the loading module 320 is configured to load the matched model file;
the receiving module 330 is configured to receive a call request of a prediction service;
the reading module 340 is configured to read the feature data according to the call request;
the prediction module 350 is configured to input the feature data into the loaded model file for prediction.
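The five modules form one pipeline, which the toy sketch below wires together with stand-in callables (class and method names are illustrative assumptions, not from the patent):

```python
class OnlineDeployment:
    """Toy counterpart of apparatus 300: match -> load -> receive ->
    read -> predict, with each stage a stand-in for the real module."""

    def __init__(self, storage: dict):
        self.storage = storage       # stands in for the model storage system
        self.model = None

    def match_and_load(self, rule=max):
        """Matching module 310 + loading module 320: pick the version
        whose number satisfies the loading rule, then 'load' its file."""
        version = rule(self.storage)         # default rule: latest version
        self.model = self.storage[version]
        return version

    def handle_call(self, request: dict):
        """Receiving module 330 + reading module 340 + prediction module
        350: read feature data from the call request and run the model."""
        features = request["features"]
        return self.model(features)

# Each 'model file' is a callable standing in for a loaded model object.
app = OnlineDeployment({"v1": lambda f: 0.0, "v2": lambda f: sum(f) / len(f)})
```

Splitting match/load from receive/read/predict mirrors the apparatus: version selection happens at startup (and on the background timer), while feature reading and prediction happen per call.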
In the online deployment apparatus of the machine learning model according to the exemplary embodiment of the present invention, a model file whose model version number conforms to the first loading rule is matched from the model storage system and loaded, after which feature data is read and prediction is performed based on the model file. Loading the model file according to the first loading rule is accurate and stable, which enables the deployment of a complete online module and improves the execution efficiency and stability of the online module.
Fig. 5 is merely a schematic diagram of the online deployment apparatus 300 of the machine learning model provided by the present invention; modules may be split, merged, or added without departing from the concept of the invention, and such variations remain within its scope. The online deployment apparatus 300 may be implemented in software, hardware, firmware, a plug-in, or any combination thereof, and the present invention is not limited thereto.
In an exemplary embodiment of the invention, a computer-readable storage medium is also provided, on which a computer program is stored, which when executed by, for example, a processor, may implement the steps of the online deployment method of the machine learning model described in any of the above embodiments. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present invention described in the method for online deployment of a machine learning model section above in this specification, when the program product is run on the terminal device.
Referring to fig. 6, a program product 700 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable storage medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable storage medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).
In an exemplary embodiment of the invention, there is also provided an electronic device that may include a processor and a memory for storing executable instructions of the processor. Wherein the processor is configured to perform the steps of the online deployment method of the machine learning model in any of the above embodiments via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," module "or" system.
An electronic device 500 according to this embodiment of the invention is described below with reference to fig. 7. The electronic device 500 shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the electronic device 500 is embodied in the form of a general purpose computing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one memory unit 520, a bus 530 that couples various system components including the memory unit 520 and the processing unit 510, a display unit 540, and the like.
Wherein the storage unit stores program code, which is executable by the processing unit 510 to cause the processing unit 510 to perform steps according to various exemplary embodiments of the present invention described in the online deployment method of machine learning models section above in this specification. For example, the processing unit 510 may perform the steps as shown in any one or more of fig. 1-4.
The memory unit 520 may include a readable medium in the form of a volatile memory unit, such as a random access memory unit (RAM)5201 and/or a cache memory unit 5202, and may further include a read only memory unit (ROM) 5203.
The memory unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 530 may be one or more of any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 500 may also communicate with one or more external devices 600 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any devices (e.g., a router, a modem, etc.) that enable the electronic device 500 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 550. Also, the electronic device 500 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 560. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, or a network device, etc.) to execute the online deployment method of the machine learning model according to the embodiment of the present invention.
Compared with the prior art, the invention has the advantages that:
by matching, in the model storage system, a model file whose model version number conforms to the first loading rule, loading the matched model file, reading the feature data, and performing prediction based on the model file, the present invention achieves accurate and stable loading of the model file according to the first loading rule, thereby enabling complete deployment of the online module and improving the execution efficiency and stability of the online module.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

Claims (17)

1. A method for online deployment of a machine learning model, comprising:
matching, from a model storage system, a model file whose model version number meets a first loading rule;
loading the matched model file;
receiving a call request of a prediction service;
reading feature data according to the calling request;
inputting the characteristic data into the loaded model file for prediction.
2. The method for online deployment of a machine learning model of claim 1, wherein the model file is produced via offline training and associated with a model version number for storage in the model storage system.
3. The method for online deployment of a machine learning model of claim 1, wherein a write flag is associated with a model file in the model storage system, and wherein the write flag is set to indicate that the model file is completely written when the model file is generated by offline training and written to the model storage system.
4. The method for online deployment of a machine learning model of claim 3, wherein the first loading rule comprises one or more of the following loading rules:
the model version number is latest;
the write flag indicates that the model file is completely written;
the model version number conforms to the specified model version number.
5. The method for online deployment of a machine learning model of claim 3, wherein, after the loading of the matched model file and before the receiving of a call request of a prediction service, the method further comprises:
checking whether the model file in the model storage system has update;
if yes, judging whether the updated model file conforms to a second loading rule;
and if so, loading the updated model file.
6. The method for online deployment of a machine learning model of claim 5, wherein the second loading rule comprises one or more of the following loading rules:
the model version number is later than the model version number of the currently loaded model file;
the write flag indicates that the model file is completely written;
model version numbers are not specified.
7. The method for online deployment of a machine learning model of claim 1, wherein, after the loading of the matched model file and before the reading of feature data according to the call request, the method further comprises:
determining whether the feature data to be read includes feature data generated by the model file during offline training, the feature data generated by the model file during offline training being stored in a cache system;
if yes, reading the characteristic data from the cache system according to a reading rule.
8. The method of online deployment of machine learning models of claim 7, wherein the feature data generated by the model file during offline training is stored in the cache system in the form of a first key-value pair, wherein the key of the first key-value pair stores the feature name and the feature version number of the feature data, and the value of the first key-value pair stores the feature information of the feature data.
9. The method of claim 8, wherein the feature data generated by the model file during offline training is stored in the cache system in association with an expiration time, such that feature data of at least two feature version numbers exists in the cache system for the same feature name.
10. The method of claim 8, wherein, after the feature data generated by the model file during offline training is stored in the cache system, it is associated with a second key-value pair in the cache system, wherein the key of the second key-value pair stores a feature version identifier, the value of the second key-value pair stores a feature version number, and the second key-value pair is obtained before the first key-value pair.
11. The method for online deployment of a machine learning model of claim 8, wherein the feature version number of the feature data is consistent with a model version number of a model file that produced the feature data.
12. The method for online deployment of a machine learning model of claim 8, wherein the read rules comprise any one of the following read rules:
the characteristic version number is consistent with the model version number of the currently loaded model file;
the characteristic version number is latest;
the feature version number conforms to the specified feature version number.
13. The method for online deployment of a machine learning model of any one of claims 1 to 12, wherein model files are saved in the model storage system for a set historical period of time.
14. The method for online deployment of a machine learning model of any one of claims 1 to 12, wherein the model storage system is an object storage system or a distributed storage system.
15. An apparatus for online deployment of a machine learning model, comprising:
the matching module is configured to match a model file with a model version number meeting a first loading rule from a model storage system;
a loading module configured to load the matched model file;
a receiving module configured to receive a call request of a prediction service;
the reading module is configured to read the feature data according to the calling request;
a prediction module configured to input the feature data into the loaded model file for prediction.
16. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory having stored thereon a computer program that, when executed by the processor, performs:
a method of online deployment of a machine learning model as claimed in any one of claims 1 to 14.
17. A storage medium having a computer program stored thereon, the computer program when executed by a processor performing:
a method of online deployment of a machine learning model as claimed in any one of claims 1 to 14.
CN202010888528.2A 2020-08-28 2020-08-28 Online deployment method and device of machine learning model and related equipment Pending CN111966382A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010888528.2A CN111966382A (en) 2020-08-28 2020-08-28 Online deployment method and device of machine learning model and related equipment


Publications (1)

Publication Number Publication Date
CN111966382A true CN111966382A (en) 2020-11-20

Family

ID=73399897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010888528.2A Pending CN111966382A (en) 2020-08-28 2020-08-28 Online deployment method and device of machine learning model and related equipment

Country Status (1)

Country Link
CN (1) CN111966382A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031992A (en) * 2021-04-27 2021-06-25 中国平安人寿保险股份有限公司 Annoy hot update method, device, equipment and medium
CN113608762A (en) * 2021-07-30 2021-11-05 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN114168177A (en) * 2022-02-10 2022-03-11 浙江大学 Personalized task processing method and device supporting mass mobile devices

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070220034A1 (en) * 2006-03-16 2007-09-20 Microsoft Corporation Automatic training of data mining models
CN107341237A (en) * 2017-07-04 2017-11-10 北京百度网讯科技有限公司 Method and apparatus for processing information
CN108734293A (en) * 2017-04-13 2018-11-02 北京京东尚科信息技术有限公司 Task management system, method and apparatus
CN108921301A (en) * 2018-06-29 2018-11-30 长扬科技(北京)有限公司 A kind of machine learning model update method and system based on self study
CN110297659A (en) * 2018-03-21 2019-10-01 北京京东尚科信息技术有限公司 Algorithm model disposes loading method and device
CN110297640A (en) * 2019-06-12 2019-10-01 北京三快在线科技有限公司 Method, apparatus, storage medium and the electronic equipment of model deployment
CN110637286A (en) * 2017-03-15 2019-12-31 西门子股份公司 Method for deploying and executing machine learning model on field device
US20200019882A1 (en) * 2016-12-15 2020-01-16 Schlumberger Technology Corporation Systems and Methods for Generating, Deploying, Discovering, and Managing Machine Learning Model Packages
CN110765077A (en) * 2019-11-07 2020-02-07 中电福富信息科技有限公司 Method and system for uniformly managing AI model based on distributed file system
CN110928553A (en) * 2019-10-16 2020-03-27 中国平安人寿保险股份有限公司 Deployment method, device and system of deep learning model
US10614382B1 (en) * 2019-07-12 2020-04-07 Capital One Services, Llc Computer-based systems and methods configured to utilize automating deployment of predictive models for machine learning tasks
CN111144578A (en) * 2019-12-27 2020-05-12 创新奇智(重庆)科技有限公司 Artificial intelligence model management system and management method under distributed environment
CN111240698A (en) * 2020-01-14 2020-06-05 北京三快在线科技有限公司 Model deployment method and device, storage medium and electronic equipment
CN111309378A (en) * 2020-02-25 2020-06-19 电子科技大学 Machine learning model life cycle management system and method
CN111324379A (en) * 2020-01-15 2020-06-23 携程旅游网络技术(上海)有限公司 Model deployment system based on general SOA service
US20200234188A1 (en) * 2019-01-22 2020-07-23 Microsoft Technology Licensing, Llc Techniques for training and deploying a model based feature in a software application
CN111461283A (en) * 2020-03-18 2020-07-28 上海携程商务有限公司 Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model
CN111488170A (en) * 2020-04-07 2020-08-04 支付宝(杭州)信息技术有限公司 Method, device and equipment for updating business processing model


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031992A (en) * 2021-04-27 2021-06-25 中国平安人寿保险股份有限公司 Annoy hot update method, device, equipment and medium
CN113608762A (en) * 2021-07-30 2021-11-05 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN113608762B (en) * 2021-07-30 2024-05-17 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN114168177A (en) * 2022-02-10 2022-03-11 浙江大学 Personalized task processing method and device supporting mass mobile devices

Similar Documents

Publication Publication Date Title
US10515002B2 (en) Utilizing artificial intelligence to test cloud applications
US11151024B2 (en) Dynamic automation of DevOps pipeline vulnerability detecting and testing
US8494996B2 (en) Creation and revision of network object graph topology for a network performance management system
US7814194B2 (en) Method and system for machine-aided rule construction for event management
US11714629B2 (en) Software dependency management
CN111966382A (en) Online deployment method and device of machine learning model and related equipment
US8032232B2 (en) Natively retaining project documentation in a controller
CN111158741B (en) Method and device for monitoring dependency relationship change of service module on third party class library
US20210092029A1 (en) Service ticket escalation based on interaction patterns
US20210158307A1 (en) Blockchain ledger entry upon maintenance of asset and anomaly detection correction
JP2019192158A (en) Apparatus and method for supporting in creating flow using visual programming tool
CN110515647A (en) A kind of static resource management method, device, equipment and storage medium
CN114371857A (en) Digital twin enabled asset performance and upgrade management
US20200228403A1 (en) Identifying groups of related nodes in an integration flow
US10310961B1 (en) Cognitive dynamic script language builder
US10228916B2 (en) Predictive optimization of next task through asset reuse
CN110865806A (en) Code processing method, device, server and storage medium
CN116088846A (en) Processing method, related device and equipment for continuous integrated code format
WO2023151397A1 (en) Application program deployment method and apparatus, device, and medium
CN111857847A (en) Method, device, equipment and storage medium for dynamically configuring BIOS character string
US9524204B2 (en) Methods and apparatus for customizing and using a reusable database framework for fault processing applications
US20200081700A1 (en) Intention-based command optimization
CN115291928A (en) Task automatic integration method and device of multiple technology stacks and electronic equipment
EP3999917B1 (en) Method and system for generating a digital representation of asset information in a cloud computing environment
CN114489704A (en) Version compiling and deploying method and device based on strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201120