CN111461283A - Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model - Google Patents

Publication number
CN111461283A
Authority
CN
China
Prior art keywords
model
training
trained
automatically
replaced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010190700.7A
Other languages
Chinese (zh)
Inventor
范博
周海刚
陈宇
艾青
王乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ctrip Business Co Ltd
Original Assignee
Shanghai Ctrip Business Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Ctrip Business Co Ltd
Priority to CN202010190700.7A
Publication of CN111461283A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/008 Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Robotics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses an automatic iterative operation and maintenance method, system, device and storage medium for an AI model. The method comprises the following steps: acquiring a version number corresponding to a training set and a version name corresponding to an AI model to be trained; automatically downloading, by a training service, the training set and the AI model to be trained from a file system according to the version number and the version name; automatically training, by the training service, the AI model to be trained using the training set; automatically evaluating, by the training service, the training effect to obtain an evaluation result; and determining, by the training service, whether to replace the model according to the evaluation result and, if so, replacing the model by automatic hot loading. Centered on the AI training-side service and working together with a management end, the invention enables automatic iterative operation and maintenance of AI models and supports different model operation and maintenance and version configurations for different scenes. Operation and maintenance personnel are isolated from the complex logic of each business model, versions are transparent, and operation and maintenance efficiency is improved while operation and maintenance cost is reduced.

Description

Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model
Technical Field
The invention relates to the technical field of artificial intelligence (AI), and in particular to an automatic iterative operation and maintenance method, system, device and storage medium for an AI model.
Background
With AI technology, business lines produce large numbers of deep learning and machine learning models. The training data for these models comes from online traffic, and online data changes over time; for example, the proportions of black samples, white samples and dirty data change. A model trained on historical data cannot perceive these changes, so it no longer fits the actual data and its accuracy or recall rate declines. To iterate an AI model, online data must be collected periodically and, after data cleaning, the model must either be retrained or be incrementally trained on the basis of the original model. With many models, this makes operation and maintenance complex.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, in which manually iterating the training of a model is difficult to operate and maintain and is costly, and provides an automatic iterative operation and maintenance method, system, device and storage medium for an AI model.
The invention solves the above technical problem through the following technical solutions:
A first aspect of the invention provides an automatic iterative operation and maintenance method for an AI model, which comprises the following steps:
S1, acquiring a version number corresponding to the training set and a version name corresponding to the AI model to be trained;
S2, automatically downloading the training set and the AI model to be trained from a file system by using a training service according to the version number and the version name;
S3, the training service automatically trains the AI model to be trained by using the training set;
S4, the training service automatically evaluates the training effect to obtain an evaluation result;
S5, the training service determines whether to replace the model according to the evaluation result, and if so, the model is replaced by automatic hot loading.
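Steps S1 to S5 can be read as a small pipeline in which each stage is a pluggable function. The following is a minimal sketch under that reading, not the patented implementation: the names `run_iteration` and `IterationResult`, and the signatures of the injected callables, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class IterationResult:
    evaluation: dict   # evaluation result produced in S4
    replaced: bool     # whether the new model was hot-loaded in S5


def run_iteration(
    dataset_version: str,                            # S1: version number of the training set
    model_version: str,                              # S1: version name of the model to be trained
    download: Callable[[str, str], Tuple[Any, Any]], # S2: returns (training_set, model)
    train: Callable[[Any, Any], Any],                # S3: returns the trained model
    evaluate: Callable[[Any, Any], dict],            # S4: compares old and new model
    should_replace: Callable[[dict], bool],          # S5: replacement decision
    hot_load: Callable[[Any], None],                 # S5: puts the new model into service
) -> IterationResult:
    training_set, model = download(dataset_version, model_version)
    trained = train(model, training_set)
    evaluation = evaluate(model, trained)
    replaced = should_replace(evaluation)
    if replaced:
        hot_load(trained)
    return IterationResult(evaluation=evaluation, replaced=replaced)
```

Passing the stages in as callables keeps the sketch independent of any particular file system, training framework or serving layer, none of which the text specifies.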
Preferably, the S1 further includes: acquiring a training mode and a version name corresponding to the AI model to be replaced;
the S2 further includes: automatically downloading the AI model to be replaced from the file system by using a training service according to the version name corresponding to the AI model to be replaced;
the training mode comprises incremental training and full training;
the S3 includes:
if the training mode is full training, the training service uses the training set to automatically train the AI model to be trained so as to obtain a fully trained model;
if the training mode is incremental training, the training service automatically trains the AI model to be replaced by using the training set to obtain an incrementally trained model;
the S4 includes the steps of:
if the training mode is full training, the training service automatically compares the accuracy and/or recall rate of the AI model to be trained with the model after full training to obtain an evaluation result;
if the training mode is incremental training, the training service automatically compares the accuracy and/or recall rate of the AI model to be replaced and the model after incremental training to obtain an evaluation result.
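The mode-dependent choice described above can be sketched as a single selection function: in full training the starting point is the AI model to be trained, and in incremental training it is the AI model to be replaced; in both cases the same model also serves as the baseline for the comparison in S4. The names `TrainingMode` and `select_base_model` are hypothetical.

```python
from enum import Enum
from typing import Any


class TrainingMode(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


def select_base_model(mode: TrainingMode, model_to_train: Any, model_to_replace: Any) -> Any:
    """Return the model that training starts from and that S4 compares against."""
    if mode is TrainingMode.FULL:
        return model_to_train
    if mode is TrainingMode.INCREMENTAL:
        return model_to_replace
    raise ValueError(f"unknown training mode: {mode!r}")
```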
Preferably, the following steps are further included between the S2 and the S3:
the training service automatically performs format verification on the training set and the AI model to be trained and/or the AI model to be replaced, and the training service automatically judges whether the training set and the AI model to be trained and/or the AI model to be replaced are available according to a verification result.
Preferably, the AI model to be trained, the AI model to be replaced, the fully trained model, and the incrementally trained model are Python (a cross-platform computer programming language) data models;
the following step is further included between the S3 and the S4:
the training service automatically converts the Python data model into a Java (an object-oriented programming language) data model, performs format verification on the Java data model, and executes the S4 if the verification is passed.
Preferably, the version name includes a service line name, a scene name, and a data cycle name.
Preferably, the S5 includes the following steps:
the training service determines whether to replace the model according to the evaluation result; if so, it determines whether the replacement is to be automatic; if so, it automatically hot-loads the replacement model; otherwise, it sends the evaluation result to a preset receiving end by email and/or short message.
A second aspect of the invention provides an automatic iterative operation and maintenance system for an AI model, which comprises a front end, a management end and a training server;
the front end is used for acquiring a version number corresponding to the training set;
the management end is used for acquiring a version name corresponding to the AI model to be trained;
the training server comprises a downloading module, a training module, an evaluation module and a replacement module;
the downloading module is used for automatically downloading the training set and the AI model to be trained from a file system according to the version number and the version name;
the training module is used for automatically training the AI model to be trained by using the training set;
the evaluation module is used for automatically evaluating the training effect to obtain an evaluation result;
and the replacement module is used for judging whether to replace the model according to the evaluation result, and if so, automatically hot-loading the replacement model.
Preferably, the front end is further configured to obtain a training mode, and the management end is further configured to obtain a version name corresponding to the AI model to be replaced;
the downloading module is further used for automatically downloading the AI model to be replaced from the file system according to the version name corresponding to the AI model to be replaced;
the training mode comprises incremental training and full training;
the training module comprises a full training unit and an incremental training unit;
the full training unit is used for automatically training the AI model to be trained by using the training set when the training mode is full training so as to obtain a fully trained model;
the incremental training unit is used for automatically training the AI model to be replaced by using the training set when the training mode is incremental training so as to obtain an incrementally trained model;
the evaluation module comprises a full evaluation unit and an incremental evaluation unit;
the full-scale evaluation unit is used for automatically comparing the precision rate and/or the recall rate of the AI model to be trained and the model after full-scale training when the training mode is full-scale training so as to obtain an evaluation result;
and the increment evaluation unit is used for automatically comparing the accuracy and/or recall ratio of the AI model to be replaced and the model after increment training when the training mode is increment training so as to obtain an evaluation result.
Preferably, the training server further comprises a verification module;
the verification module is used for automatically performing format verification on the training set and the AI model to be trained and/or the AI model to be replaced, and automatically judging whether the training set and the AI model to be trained and/or the AI model to be replaced are available according to a verification result.
Preferably, the AI model to be trained, the AI model to be replaced, the fully trained model and the incrementally trained model are Python data models;
the training server also comprises a model conversion module;
the model conversion module is used for automatically converting the Python data model into a Java data model, carrying out format verification on the Java data model, and calling the evaluation module if the verification is passed.
Preferably, the version name includes a service line name, a scene name, and a data cycle name.
Preferably, the replacement module includes a first judgment unit, a second judgment unit, an automatic replacement unit and a result sending unit;
the first judging unit is used for judging whether to carry out model replacement according to the evaluation result, and if so, the second judging unit is called;
the second judgment unit is used for judging whether automatic replacement is carried out or not, if so, the automatic replacement unit is called, and if not, the result sending unit is called;
the automatic replacement unit is used for automatically hot-loading a replacement model;
and the result sending unit is used for sending the evaluation result to a preset receiving end through an email and/or a short message.
A third aspect of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the automatic iterative operation and maintenance method for the AI model according to the first aspect.
A fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the automatic iterative operation and maintenance method of an AI model according to the first aspect.
The positive progress effects of the invention are as follows:
the invention discloses an automatic iteration operation and maintenance method, a system, equipment and a storage medium of an AI model, which take an AI training side service as a center and are matched with a management terminal, so that the automatic iteration operation and maintenance of the AI model can be realized, and various different model operation and maintenance and version configurations can be realized according to different scenes; and the operation and maintenance personnel are isolated from the complex logic of each service model, the version is transparent, and the operation and maintenance efficiency can be improved on the basis of reducing the operation and maintenance cost.
Drawings
Fig. 1 is a flowchart of an automatic iterative operation and maintenance method for an AI model according to embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of an automatic iterative operation and maintenance system of an AI model according to embodiment 2 of the present invention.
Fig. 3 is a schematic structural diagram of a training server in the automatic iterative operation and maintenance system of the AI model according to embodiment 2 of the present invention.
Fig. 4 is a schematic structural diagram of a training module in the automatic iterative operation and maintenance system of the AI model according to embodiment 2 of the present invention.
Fig. 5 is a schematic structural diagram of an evaluation module in the automatic iterative operation and maintenance system of the AI model according to embodiment 2 of the present invention.
Fig. 6 is a schematic structural diagram of a replacement module in the automatic iterative operation and maintenance system of the AI model according to embodiment 2 of the present invention.
Fig. 7 is a schematic structural diagram of an electronic device according to embodiment 3 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
As shown in fig. 1, the present embodiment provides an automatic iterative operation and maintenance method for an AI model, which includes the following steps:
Step 101: acquire a version number corresponding to a training set, a training mode, a version name corresponding to an AI model to be replaced, and a version name corresponding to an AI model to be trained. The training mode comprises incremental training and full training. The training set is the data of the last week, or may be uploaded data.
In this embodiment, the training mode is determined by the configuration set by the operation and maintenance staff, who may select incremental training and/or full training.
Step 102: the training service automatically downloads the training set, the AI model to be replaced and the AI model to be trained from the file system according to the version number, the version name corresponding to the AI model to be replaced and the version name corresponding to the AI model to be trained, respectively.
Step 103: the training service automatically performs format verification on the training set and the AI model to be trained and/or the AI model to be replaced, and automatically determines from the verification result whether they are available. If they are unavailable, step 104 is executed; if they are available, step 105 is executed.
In this embodiment, the training service automatically aggregates and cleans the data.
Step 104: enter the exception handling flow.
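The text does not say what the format verification in step 103 checks. As one possible illustration only, the sketch below assumes the training set is a sequence of dictionary-like rows and treats it as available when it is non-empty and every row carries a set of required fields. The function names and the notion of required fields are assumptions, not details from the source.

```python
from typing import Iterable, List, Mapping, Sequence


def find_format_errors(rows: Iterable[Mapping], required_fields: Sequence[str]) -> List[str]:
    """Collect one message per row that lacks any required field."""
    errors = []
    for index, row in enumerate(rows):
        missing = [name for name in required_fields if name not in row]
        if missing:
            errors.append(f"row {index}: missing fields {missing}")
    return errors


def is_available(rows: Iterable[Mapping], required_fields: Sequence[str]) -> bool:
    """Step 103 outcome: True leads to step 105, False to step 104."""
    rows = list(rows)
    return bool(rows) and not find_format_errors(rows, required_fields)
```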
Step 105: the training service splits the training set into a sub-training set and a sub-test set at a ratio of 1:9.
The sub-training set is used for automatically training the AI model to be trained and/or the AI model to be replaced; the sub-test set is used for automatically carrying out effect evaluation on the AI model to be trained, the model after full-scale training, the AI model to be replaced and the model after incremental training.
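The split in step 105 can be sketched as follows. The default parts follow the ratio as stated in the text (sub-training set to sub-test set of 1:9); both parts are parameters so that a different ratio can be supplied. Whether the data is shuffled before splitting is not stated in the source, so shuffling is optional here and is an assumption, as is the function name.

```python
import random
from typing import Iterable, List, Optional, Tuple


def split_training_set(
    samples: Iterable,
    train_parts: int = 1,
    test_parts: int = 9,
    seed: Optional[int] = None,
) -> Tuple[List, List]:
    """Split samples into (sub_training_set, sub_test_set) at train_parts:test_parts."""
    items = list(samples)
    if seed is not None:
        # Optional reproducible shuffle; the source does not specify one.
        random.Random(seed).shuffle(items)
    cut = len(items) * train_parts // (train_parts + test_parts)
    return items[:cut], items[cut:]
```

For example, `split_training_set(range(100))` yields 10 samples for training and 90 for testing.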
Step 106: the training service automatically trains the AI model to be trained using the training set. Specifically, if the training mode is full training, the training service automatically trains the AI model to be trained using the sub-training set to obtain a fully trained model; if the training mode is incremental training, the training service automatically trains the AI model to be replaced using the sub-training set to obtain an incrementally trained model.
In this embodiment, the training service is a program written in Python that can be invoked automatically, and the AI model to be trained, the AI model to be replaced, the fully trained model and the incrementally trained model are all Python data models.
Step 107: the training service automatically converts the Python data model into a Java data model and performs format verification on the Java data model. If the verification fails, step 104 is executed; if the verification passes, step 108 is executed.
Step 108: the training service automatically evaluates the training effect to obtain an evaluation result. Specifically, if the training mode is full training, the training service uses the sub-test set to automatically compare the accuracy and/or recall rate of the AI model to be trained with that of the fully trained model to obtain an evaluation result; if the training mode is incremental training, the training service uses the sub-test set to automatically compare the accuracy and/or recall rate of the AI model to be replaced with that of the incrementally trained model to obtain an evaluation result.
Step 109: the training service determines whether to replace the model according to the evaluation result. If not, the process ends; if so, step 110 is executed.
Step 110: determine whether the replacement is to be automatic. If so, step 111 is executed; otherwise, step 112 is executed.
Step 111: automatically hot-load the replacement model, and end the process.
Step 112: send the evaluation result to a preset receiving end by email and/or short message. The preset receiving end then chooses whether to load the model into a production environment or a test environment, and the model is replaced if loading is chosen.
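Steps 109 to 112 form a two-level decision. The sketch below shows that flow together with one common way to hot-load a model, namely swapping an in-memory reference under a lock so that the serving process does not restart. The source does not describe how hot loading is implemented, so `ModelHolder`, `handle_evaluation` and the returned labels are all hypothetical.

```python
import threading
from typing import Any, Callable


class ModelHolder:
    """Holds the model used for online prediction and allows it to be swapped."""

    def __init__(self, model: Any) -> None:
        self._model = model
        self._lock = threading.Lock()

    def get(self) -> Any:
        with self._lock:
            return self._model

    def hot_load(self, new_model: Any) -> None:
        # Step 111: replace the reference without stopping the service.
        with self._lock:
            self._model = new_model


def handle_evaluation(
    replace: bool,
    auto_replace: bool,
    holder: ModelHolder,
    new_model: Any,
    evaluation: dict,
    notify: Callable[[dict], None],
) -> str:
    if not replace:          # step 109: no replacement, the process ends
        return "no_replacement"
    if auto_replace:         # step 110 -> step 111
        holder.hot_load(new_model)
        return "hot_loaded"
    notify(evaluation)       # step 110 -> step 112: email and/or short message
    return "notified"
```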
In this embodiment, model evaluation uses an existing implementation. If the training mode in step 108 is incremental training, the training service uses the sub-test set to automatically compare the accuracy and/or recall rate of the AI model to be replaced with that of the incrementally trained model to obtain an evaluation result, which may include the following steps:
Acquire the precision rate and the recall rate of the AI model to be replaced on the test set.
Acquire the accuracy and the recall rate of the AI model to be trained on the test set.
Compare the AI model to be trained with the AI model to be replaced, and replace the model if the accuracy of the AI model to be trained is greater than that of the AI model to be replaced, provided that the change in recall rate is less than 10%.
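The comparison rule can be sketched for a binary classifier as follows. Two points are assumptions rather than statements from the source: the metric paired with recall is computed here as precision, since the text uses "accuracy" and "precision rate" interchangeably, and the 10% limit is read as an absolute difference in recall. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Tuple[float, float]:
    """Compute precision and recall of predictions for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def should_replace(
    old: Tuple[float, float],
    new: Tuple[float, float],
    max_recall_change: float = 0.10,
) -> bool:
    """Replace only if the new precision is higher and recall moved by less than the limit."""
    old_precision, old_recall = old
    new_precision, new_recall = new
    return new_precision > old_precision and abs(new_recall - old_recall) < max_recall_change
```

Both models are scored on the same sub-test set, so the two `(precision, recall)` pairs are directly comparable.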
In this embodiment, a reasonable iteration cycle may be chosen for model evaluation according to the online data, so that the data volume is sufficient. The accuracy and recall rate of the new model are compared with those of the old model (the new model is the model to be trained, and the old model is the model to be replaced), and the operation and maintenance staff decide whether to trigger the hot-loading model replacement.
With the automatic iterative operation and maintenance method of this embodiment, once operation and maintenance personnel have configured a model scene, the model in that scene is incrementally trained on online data in a sliding-window manner, and the test results of the new and old models on online data are returned. Based on these results, the operation and maintenance personnel can manually trigger hot loading of the new model to replace the old model, or the new model can be hot-loaded automatically. The method is suitable for automatic iterative updating of machine learning and deep learning models. Compared with conventional manual retraining and updating, it unifies version management of models across multiple scenes, reduces manual operation and maintenance cost, and improves operation and maintenance efficiency.
Compared with model iteration by manual training, the method of this embodiment achieves model iteration in multiple scenes without human intervention. Operation and maintenance personnel only need to configure the model version for each scene; the model in each scene is then incrementally trained on new data at regular intervals, and the old model is automatically replaced by hot loading according to the results of the new and old models.
By configuring scenes, the method automatically and iteratively updates the model of each scene: it automatically splits off a test set from the new data, evaluates the new and old model versions, returns their scores, and decides from the scores whether to perform hot-loading replacement. Its main characteristics are as follows. Training is performed offline and does not interfere with the online prediction service, and the model is replaced by hot loading so that the business side perceives nothing, which achieves zero intrusion on the business side. For incremental business data, the same model is incrementally trained, which keeps the training data timely and the model usable. In addition, the automatic iterative operation and maintenance system of the AI model is compatible with all deployable deep learning and machine learning models.
In this embodiment, a model version is named by the service line name, the scene name and the data cycle name. This naming scheme ensures the uniqueness of the model version with minimal information and simplifies version management.
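A version name built from these three parts can be sketched as a join with a fixed separator. The separator, the function names and the example values are assumptions; the source specifies only which three names make up the version name.

```python
from typing import Tuple


def build_version_name(service_line: str, scene: str, data_cycle: str, sep: str = "_") -> str:
    """Join the three components into one version name."""
    parts = (service_line, scene, data_cycle)
    for part in parts:
        # Reject empty parts and parts containing the separator so names stay parseable.
        if not part or sep in part:
            raise ValueError(f"invalid version name component: {part!r}")
    return sep.join(parts)


def parse_version_name(name: str, sep: str = "_") -> Tuple[str, str, str]:
    """Split a version name back into (service_line, scene, data_cycle)."""
    parts = name.split(sep)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part version name: {name!r}")
    return parts[0], parts[1], parts[2]
```

For example, `build_version_name("hotel", "fraud", "2020w11")` gives `"hotel_fraud_2020w11"`, where all three values are made up for illustration.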
In the embodiment, an AI training side service is taken as a center, and the automatic iteration operation and maintenance of an AI model can be realized by matching with a management terminal, and various different model operation and maintenance and version configurations can be realized according to different scenes; and the operation and maintenance personnel are isolated from the complex logic of each service model, the version is transparent, and the operation and maintenance efficiency can be improved on the basis of reducing the operation and maintenance cost.
Example 2
As shown in fig. 2, the embodiment provides an automatic iterative operation and maintenance system for an AI model, which includes a front end 1, a management end 2, and a training server end 3.
The front end 1 is used for acquiring a version number and a training mode corresponding to a training set, wherein the training mode comprises incremental training and full training; the training set is data of the last week.
The management end 2 is configured to obtain a version name corresponding to the AI model to be trained and a version name corresponding to the AI model to be replaced.
As shown in fig. 3, the training server 3 includes a downloading module 301, a checking module 302, a training module 303, a model conversion module 304, an evaluation module 305, and a replacement module 306.
The downloading module 301 is configured to automatically download the training set, the to-be-trained AI model and/or the to-be-replaced AI model from the file system according to the version number and the version name.
The checking module 302 is configured to automatically perform format checking on the training set, the AI model to be trained, and/or the AI model to be replaced, and automatically determine whether the training set, the AI model to be trained, and/or the AI model to be replaced is available according to a checking result.
The training module 303 is configured to automatically train the available AI models to be trained using the available training sets.
As shown in fig. 4, the training module 303 includes a full training unit 3031 and an incremental training unit 3032.
The full training unit 3031 is configured to automatically train the AI model to be trained by using the training set when the training mode is full training, so as to obtain a fully trained model.
The incremental training unit 3032 is configured to automatically train the to-be-replaced AI model by using the training set when the training mode is incremental training, so as to obtain an incrementally trained model.
In this embodiment, the training service is a program written in Python that can be invoked automatically, and the AI model to be trained, the AI model to be replaced, the fully trained model and the incrementally trained model are all Python data models.
The model conversion module 304 is configured to automatically convert the Python data model into a Java data model, perform format verification on the Java data model, and call the evaluation module 305 if the format verification passes.
The evaluation module 305 is used for automatically evaluating the effect of the training to obtain an evaluation result.
As shown in fig. 5, the evaluation module 305 includes a full-scale evaluation unit 3051 and an incremental evaluation unit 3052.
The full-scale evaluation unit 3051 is configured to, when the training mode is full-scale training, automatically compare accuracy rates and/or recall rates of the AI model to be trained and the model after full-scale training to obtain an evaluation result.
The increment evaluation unit 3052 is configured to, when the training mode is increment training, automatically compare accuracy rates and/or recall rates of the to-be-replaced AI model and the incrementally trained model to obtain an evaluation result.
As shown in fig. 6, the replacement module 306 includes a first determination unit 3061, a second determination unit 3062, an automatic replacement unit 3063, and a result transmission unit 3064.
The first determination unit 3061 is used for determining whether to perform model replacement according to the evaluation result, and if so, the second determination unit 3062 is called.
The second determination unit 3062 is used for determining whether to perform automatic replacement, if so, the automatic replacement unit 3063 is called, and if not, the result sending unit 3064 is called.
The automatic replacement unit 3063 is used to automatically hot load the replacement model.
The result sending unit 3064 is configured to send the evaluation result to a preset receiving end through an email and/or a short message.
In this embodiment, when the training server 3 calls the downloading module 301 to download the training set, the AI model to be trained, and/or the AI model to be replaced, the message "preparation for training during data download" is returned to the front end 1 through the management end 2, and the front end 1 displays "preparation for training".
In this embodiment, the training server 3 calls the verification module 302 to automatically perform format verification on the downloaded training set, the AI model to be trained and/or the AI model to be replaced, and automatically determines from the verification result whether they are available. If they are unavailable, the exception handling flow is entered and the message "data damage please retrain" is returned to the front end 1 through the management end 2, and the front end 1 displays "data damage please retrain". If they are available, the management end 2 returns the message "training" to the front end 1, and the front end 1 displays "training".
In this embodiment, when the training server 3 calls the evaluation module 305 to evaluate the training effect, the management end 2 returns the message "successful training and unloaded" to the front end 1, the front end 1 displays "successful training and unloaded", and the AI model to be trained is named with the corresponding version name and sent to the file system.
In this embodiment, when the training server 3 calls the automatic replacement unit 3063 to automatically hot-load the replacement model, the management end 2 returns the message "load" to the front end 1, and the front end 1 displays "load".
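The status reporting described above amounts to a mapping from training-server events to the text shown by the front end. A minimal sketch follows; the display strings are copied from the text, while the event names, the enum and the function are hypothetical.

```python
from enum import Enum


class FrontEndStatus(Enum):
    PREPARING = "preparation for training"
    DATA_DAMAGED = "data damage please retrain"
    TRAINING = "training"
    TRAINED_NOT_LOADED = "successful training and unloaded"
    LOADED = "load"


def status_for(event: str, ok: bool = True) -> FrontEndStatus:
    """Map a training-server event to the status the front end displays."""
    if event == "download_started":        # downloading module 301 called
        return FrontEndStatus.PREPARING
    if event == "verification_finished":   # verification module 302 result
        return FrontEndStatus.TRAINING if ok else FrontEndStatus.DATA_DAMAGED
    if event == "evaluation_started":      # evaluation module 305 called
        return FrontEndStatus.TRAINED_NOT_LOADED
    if event == "hot_load_started":        # automatic replacement unit 3063 called
        return FrontEndStatus.LOADED
    raise ValueError(f"unknown event: {event!r}")
```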
By configuring scenes, the system automatically and iteratively updates the model of each scene: it automatically splits off a test set from the new data, evaluates the new and old model versions, returns their scores, and decides from the scores whether to perform hot-loading replacement. Its main characteristics are as follows. Training is performed offline and does not interfere with the online prediction service, and the model is replaced by hot loading so that the business side perceives nothing, which achieves zero intrusion on the business side. For incremental business data, the same model is incrementally trained, which keeps the training data timely and the model usable. In addition, the automatic iterative operation and maintenance system of the AI model is compatible with all deployable deep learning and machine learning models.
This embodiment is centered on the AI training service and, together with the management terminal, realizes automatic iterative operation and maintenance of AI models; different model operation and maintenance and version configurations can be set up for different scenes. Operation and maintenance personnel are shielded from the complex logic of each business model while versions remain transparent, so that operation and maintenance efficiency can be improved while operation and maintenance cost is reduced.
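A per-scene configuration of this kind could be represented as follows. The sketch assumes, as the claims state, that a version name comprises a line of business name, a scene name and a data cycle name; the "-" separator, the default values, and the class names VersionName and SceneConfig are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

SEPARATOR = "-"  # assumption: the source does not say how the three parts are joined


@dataclass(frozen=True)
class VersionName:
    """Version name built from line of business, scene and data cycle."""
    business_line: str
    scene: str
    data_cycle: str

    def __str__(self) -> str:
        return SEPARATOR.join((self.business_line, self.scene, self.data_cycle))

    @classmethod
    def parse(cls, text: str) -> "VersionName":
        parts = text.split(SEPARATOR)
        if len(parts) != 3:
            raise ValueError(f"expected 3 parts, got {len(parts)}: {text!r}")
        return cls(*parts)


@dataclass(frozen=True)
class SceneConfig:
    """Hypothetical per-scene settings kept by the management terminal."""
    version_name: VersionName
    training_mode: str = "incremental"  # "incremental" or "full"
    auto_replace: bool = False          # hot-load automatically, or only notify

    def __post_init__(self) -> None:
        if self.training_mode not in ("incremental", "full"):
            raise ValueError(f"unknown training mode: {self.training_mode!r}")
```

For example, str(VersionName("hotel", "faq", "2020w12")) yields "hotel-faq-2020w12", where the three values are made-up placeholders.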
Example 3
Fig. 7 is a schematic structural diagram of an electronic device according to embodiment 3 of the present invention. The electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the automatic iterative operation and maintenance method of the AI model of embodiment 1 when executing the computer program. The electronic device 30 shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the electronic device 30 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus, and a control bus.
The memory 32 may include volatile memory, such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
Memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 31 executes various functional applications and data processing, such as an automatic iterative operation and maintenance method of the AI model provided in embodiment 1 of the present invention, by running the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., a keyboard, a pointing device, etc.). Such communication may take place through input/output (I/O) interfaces 35. The electronic device 30 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 36. As shown, the network adapter 36 communicates with the other modules of the electronic device 30 through the bus 33. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 30, including, but not limited to, microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.
Example 4
The present embodiment provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the automatic iterative operation and maintenance method for the AI model provided in embodiment 1.
More specific examples of the readable storage medium include, but are not limited to: a portable disk, a hard disk, random access memory, read only memory, erasable programmable read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation manner, the present invention can also be implemented in the form of a program product, which includes program code for causing a terminal device to execute the steps of the automatic iterative operation and maintenance method of the AI model described in embodiment 1 when the program product runs on the terminal device.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and the program code may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (14)

1. An automatic iteration operation and maintenance method of an AI model is characterized by comprising the following steps:
S1, acquiring a version number corresponding to the training set and a version name corresponding to the AI model to be trained;
S2, automatically downloading the training set and the AI model to be trained from a file system by using a training service according to the version number and the version name;
S3, the training service automatically trains the AI model to be trained by using the training set;
S4, the training service automatically evaluates the training effect to obtain an evaluation result;
and S5, the training service judges whether to replace the model according to the evaluation result, and if so, the model is replaced by automatic hot loading.
2. The AI model automatic iterative operation and maintenance method of claim 1, wherein the S1 further comprises: acquiring a training mode and a version name corresponding to the AI model to be replaced;
the S2 further includes: automatically downloading the AI model to be replaced from the file system by using a training service according to the version name corresponding to the AI model to be replaced;
the training mode comprises incremental training and full training;
the S3 includes:
if the training mode is full training, the training service uses the training set to automatically train the AI model to be trained so as to obtain a fully trained model;
if the training mode is incremental training, the training service automatically trains the AI model to be replaced by using the training set to obtain an incrementally trained model;
the S4 includes the steps of:
if the training mode is full training, the training service automatically compares the accuracy and/or recall rate of the AI model to be trained with the model after full training to obtain an evaluation result;
if the training mode is incremental training, the training service automatically compares the accuracy and/or recall rate of the AI model to be replaced and the model after incremental training to obtain an evaluation result.
3. The AI model automatic iterative operation and maintenance method of claim 2, further comprising, between the S2 and the S3, the steps of:
the training service automatically performs format verification on the training set and the AI model to be trained and/or the AI model to be replaced, and the training service automatically judges whether the training set and the AI model to be trained and/or the AI model to be replaced are available according to a verification result.
4. The automatic iterative operation and maintenance method for AI models according to claim 2, wherein the AI model to be trained, the AI model to be replaced, the fully trained model, and the incrementally trained model are python data models;
the method further comprises, between the S3 and the S4, the following step:
the training service automatically converts the python data model into a JAVA data model and performs format verification on the JAVA data model, and if the verification passes, the S4 is executed.
5. The AI model automatic iterative operation and maintenance method of claim 1, wherein the version names include a line of business name, a scene name, and a data cycle name.
6. The AI model automatic iterative operation and maintenance method according to claim 1, wherein said S5 includes the steps of:
the training service judges whether to replace the model according to the evaluation result; if so, it judges whether the replacement is to be performed automatically; if so, it automatically hot-loads the replacement model, and otherwise it sends the evaluation result to a preset receiving end by email and/or short message.
7. An automatic iteration operation and maintenance system of an AI model is characterized by comprising a front end, a management end and a training service end;
the front end is used for acquiring a version number corresponding to the training set;
the management terminal is used for acquiring a version name corresponding to the AI model to be trained;
the training server comprises a downloading module, a training module, an evaluation module and a replacement module;
the downloading module is used for automatically downloading the training set and the AI model to be trained from a file system according to the version number and the version name;
the training module is used for automatically training the AI model to be trained by using the training set;
the evaluation module is used for automatically evaluating the training effect to obtain an evaluation result;
and the replacement module is used for judging whether to replace the model according to the evaluation result, and if so, automatically hot-loading the replacement model.
8. The AI model auto-iterative operation and maintenance system of claim 7, wherein the front end is further configured to obtain a training mode, and the management end is further configured to obtain a version name corresponding to the AI model to be replaced;
the downloading module is further used for automatically downloading the AI model to be replaced from the file system according to the version name corresponding to the AI model to be replaced;
the training mode comprises incremental training and full training;
the training module comprises a full training unit and an incremental training unit;
the full training unit is used for automatically training the AI model to be trained by using the training set when the training mode is full training so as to obtain a fully trained model;
the incremental training unit is used for automatically training the AI model to be replaced by using the training set when the training mode is incremental training so as to obtain an incrementally trained model;
the evaluation module comprises a full evaluation unit and an incremental evaluation unit;
the full-scale evaluation unit is used for automatically comparing the precision rate and/or the recall rate of the AI model to be trained and the model after full-scale training when the training mode is full-scale training so as to obtain an evaluation result;
and the incremental evaluation unit is used for automatically comparing the accuracy and/or recall rate of the AI model to be replaced and the incrementally trained model when the training mode is incremental training so as to obtain an evaluation result.
9. The AI model auto-iterative operation and maintenance system of claim 8, wherein the training server further comprises a verification module;
the verification module is used for automatically performing format verification on the training set and the AI model to be trained and/or the AI model to be replaced, and automatically judging whether the training set and the AI model to be trained and/or the AI model to be replaced are available according to a verification result.
10. The AI model auto-iterative operation and maintenance system of claim 8, wherein the AI model to be trained, the AI model to be replaced, the fully trained model, and the incrementally trained model are python data models;
the training server also comprises a model conversion module;
the model conversion module is used for automatically converting the python data model into a JAVA data model, carrying out format verification on the JAVA data model, and calling the evaluation module if the verification is passed.
11. The AI model automated iterative operation and maintenance system of claim 7, wherein the version names include a line of business name, a scene name, and a data cycle name.
12. The AI model automatic iterative operation and maintenance system of claim 7,
the replacement module comprises a first judgment unit, a second judgment unit, an automatic replacement unit and a result sending unit;
the first judging unit is used for judging whether to carry out model replacement according to the evaluation result, and if so, the second judging unit is called;
the second judgment unit is used for judging whether automatic replacement is carried out or not, if so, the automatic replacement unit is called, and if not, the result sending unit is called;
the automatic replacement unit is used for automatically hot-loading a replacement model;
and the result sending unit is used for sending the evaluation result to a preset receiving end through an email and/or a short message.
13. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the automatic iterative operation and maintenance method for an AI model according to any one of claims 1-6.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the automatic iterative operation and maintenance method of an AI model according to any one of claims 1 to 6.
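For orientation only, steps S3 to S5 of claim 1, together with the training modes of claim 2 and the notification branch of claim 6, might be arranged as in the sketch below. It is not the claimed implementation: the function run_iteration, its parameters, the returned outcome strings, the "strictly better score" rule and the toy dictionary model are all assumptions, and training, evaluation, hot loading and notification are passed in as callables rather than implemented.

```python
from typing import Callable, Dict

Model = Dict[str, float]  # toy stand-in: a "model" is just a dict of metrics


def run_iteration(
    mode: str,
    to_train: Model,
    to_replace: Model,
    train: Callable[[Model], Model],
    evaluate: Callable[[Model], float],
    auto_replace: bool,
    hot_load: Callable[[Model], None],
    notify: Callable[[str], None],
) -> str:
    """Run S3 to S5 for one already-downloaded pair of models; return the outcome."""
    # Claim 2: full training starts from the model to be trained,
    # incremental training starts from the model to be replaced.
    if mode == "full":
        baseline = to_train
    elif mode == "incremental":
        baseline = to_replace
    else:
        raise ValueError(f"unknown training mode: {mode!r}")

    candidate = train(baseline)      # S3: train with the training set
    old_score = evaluate(baseline)   # S4: compare baseline and trained model
    new_score = evaluate(candidate)

    if new_score <= old_score:       # S5: no replacement if there is no improvement
        return "kept"
    if auto_replace:                 # claim 6: automatic hot loading
        hot_load(candidate)
        return "replaced"
    notify(f"old={old_score:.3f} new={new_score:.3f}")  # claim 6: email and/or SMS
    return "notified"
```

The evaluate callable would typically compute accuracy and/or recall on a held-out test set; here it is left abstract so that the control flow stays visible.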
CN202010190700.7A 2020-03-18 2020-03-18 Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model Pending CN111461283A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010190700.7A CN111461283A (en) 2020-03-18 2020-03-18 Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010190700.7A CN111461283A (en) 2020-03-18 2020-03-18 Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model

Publications (1)

Publication Number Publication Date
CN111461283A true CN111461283A (en) 2020-07-28

Family

ID=71683167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010190700.7A Pending CN111461283A (en) 2020-03-18 2020-03-18 Automatic iteration operation and maintenance method, system, equipment and storage medium of AI model

Country Status (1)

Country Link
CN (1) CN111461283A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111966382A (en) * 2020-08-28 2020-11-20 上海寻梦信息技术有限公司 Online deployment method and device of machine learning model and related equipment
CN112508715A (en) * 2020-11-30 2021-03-16 泰康保险集团股份有限公司 Method and device for online deployment of insurance dual-core data model, electronic equipment and medium
CN113642622A (en) * 2021-08-03 2021-11-12 浙江数链科技有限公司 Data model effect evaluation method, system, electronic device and storage medium
WO2024130709A1 (en) * 2022-12-23 2024-06-27 Oppo广东移动通信有限公司 Model updating method and device
WO2024140272A1 (en) * 2022-12-30 2024-07-04 华为技术有限公司 Method for sending training data of ai model and communication apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203518A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Method, system and device, the electronic equipment of on-line system personalized recommendation
CN109299178A (en) * 2018-09-30 2019-02-01 北京九章云极科技有限公司 A kind of application method and data analysis system
CN109815991A (en) * 2018-12-29 2019-05-28 北京城市网邻信息技术有限公司 Training method, device, electronic equipment and the storage medium of machine learning model
CN109978062A (en) * 2019-03-28 2019-07-05 北京九章云极科技有限公司 A kind of model on-line monitoring method and system
CN110276074A (en) * 2019-06-20 2019-09-24 出门问问信息科技有限公司 Distributed training method, device, equipment and the storage medium of natural language processing
CN110362333A (en) * 2019-06-29 2019-10-22 上海淇馥信息技术有限公司 A kind of quick solution, device and electronic equipment that client upgrading hinders
CN110378463A (en) * 2019-07-15 2019-10-25 北京智能工场科技有限公司 A kind of artificial intelligence model standardized training platform and automated system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
朱祥磊: "加速AI分布式训练研究和实践" *
杨体东 等: "基于多维度评价信息的在线服务信誉度量" *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination