WO2021159749A1

WO2021159749A1 - Self-learning online update method and system for multi-classification model, and apparatus

Info

Publication number: WO2021159749A1
Application number: PCT/CN2020/124883
Authority: WO
Inventors: 李弦; 阮晓雯; 徐亮
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-09-04
Filing date: 2020-10-29
Publication date: 2021-08-19
Also published as: CN112036579B; CN112036579A

Abstract

Disclosed is a self-learning online update method for a multi-classification model, relating to artificial intelligence. The method comprises: monitoring and compiling statistics on the prediction performance of a model to be updated according to a preset statistical period, and storing the prediction performance statistical result of each statistical period in a statistical database (S110); checking the data in the statistical database by using a preset trigger mechanism, so as to determine whether said model needs to be updated online (S120); if said model needs to be updated online, acquiring newly generated online data, and updating the training data of said model according to the newly generated data (S130); and updating and training said model by using the updated training data, so as to obtain an updated multi-classification model (S140). The present application further relates to blockchain technology: the statistical database is stored in a blockchain. The method addresses two existing problems: the prediction precision of a multi-classification model drops significantly over time, and the multi-classification model cannot be updated automatically.

Description

Multi-classification model self-learning online update method, system and device

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 4, 2020, with application number 2020109227529 and the invention title "Multi-class model self-learning online update method, system and device", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, system, device and storage medium for self-learning online updating of multiple classification models.

Background technique

In the field of artificial intelligence technology, machine learning models are in common use. Multi-classification models, for example, classify the data to be tested, automate data classification, and improve classification efficiency. However, the prediction performance of a machine learning model (especially a multi-class model) depends mainly on what it learns from the training sample data: the more closely the training samples resemble actual data, the stronger the prediction performance of the model.

After a trained model is deployed online, however, the distribution or pattern of the online data to be predicted may change over time. As more patterns appear that are not covered by the training data, the prediction accuracy of the model drops significantly. In a classification model for government official documents, for example, the content of the documents to be predicted changes with current events or policies. It is therefore necessary to use newly acquired annotated data to update a model affected by timeliness. If the model is updated manually, technicians must track its performance and repeatedly retrain and redeploy it, which inevitably consumes a great deal of manpower.

The inventor found that there are currently few automatic update methods for machine learning models, and that multi-class models in particular cannot be updated automatically. The main problem is the lack of specific technical solutions for the update trigger mechanism, the selection of training data, the model update itself, and so on, so the automatic update of a multi-classification model cannot be realized.

Based on the above problems, there is an urgent need for a method that can realize automatic updating of multi-classification models.

Technical problem

This application provides a self-learning online update method, system, electronic device, and computer storage medium for a multi-classification model. Its main purpose is to solve the problem that the prediction accuracy of an existing multi-classification model decreases significantly over time and that the model cannot be updated automatically.

Technical solutions

In order to achieve the above objective, this application provides a self-learning online update method for a multi-classification model. The method includes the following steps:

According to the preset statistical period, the prediction performance of the model to be updated is monitored and statistics on it are compiled, and the statistical results of the prediction performance in each statistical period are stored in the statistical database;

Use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

If the model to be updated needs to be updated online, acquiring newly generated data online, and updating the training data of the model to be updated according to the newly generated data;

The updated training data is used to update and train the model to be updated to obtain an updated multi-classification model.

In addition, this application also provides a self-learning online update system for multi-classification models, which includes:

The performance monitoring unit is used to monitor and count the predicted performance of the model to be updated according to the preset statistical period, and store the statistical results of the predicted performance in each statistical period into the statistical database;

The mechanism trigger unit is configured to use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

The data update unit is configured to, if the model to be updated needs to be updated online, obtain newly generated data online, and update the training data of the model to be updated according to the newly generated data;

The model update unit is used to update and train the model to be updated by using the updated training data to obtain an updated multi-classification model.

In addition, in order to achieve the above object, the present application also provides an electronic device, the electronic device comprising: a memory, a processor, and a multi-class model self-learning online update program stored in the memory and executable on the processor, wherein the multi-classification model self-learning online update method is implemented when the program is executed by the processor;

Wherein, the steps of the multi-classification model self-learning online update method include:

In addition, in order to achieve the above object, this application also provides a computer-readable storage medium in which a multi-class model self-learning online update program is stored, and the multi-classification model self-learning online update method is implemented when the program is executed by a processor.

Beneficial effect

The multi-classification model self-learning online update method, electronic device and computer-readable storage medium proposed in this application realize online automatic updating of a multi-classification model by designing a multi-classification model update trigger mechanism, a training data update mechanism and a model update method, and thereby keep the prediction accuracy of the multi-class model at a high level.

Description of the drawings

Fig. 1 is a flowchart of a preferred embodiment of a self-learning online update method for a multi-classification model according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a preferred embodiment of an electronic device according to an embodiment of the present application;

Fig. 3 is a schematic diagram of internal logic of a multi-class model self-learning online update program according to an embodiment of the present application.

The realization of the purpose of this application, its functional characteristics, and its advantages will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

The best mode of the present invention

In the following description, for illustrative purposes, in order to provide a comprehensive understanding of one or more embodiments, many specific details are set forth. However, it is obvious that these embodiments can also be implemented without these specific details.

The specific embodiments of the present application will be described in detail below with reference to the accompanying drawings.

Example 1

In order to illustrate the multi-class model self-learning online update method provided in this application, FIG. 1 shows the flow of the multi-class model self-learning online update method provided in this application.

As shown in Figure 1, the self-learning online update method for multi-classification models provided by this application includes:

S110: Perform monitoring and statistics on the prediction performance of the model to be updated according to the preset statistical period, and store the statistical results of the prediction performance in each statistical period in the statistical database.

It should be noted that, in order to reflect the prediction performance of the model to be updated more accurately, this application uses the overall prediction Precision value of the model to be updated as the statistic that characterizes its prediction performance. The Precision value is calculated as follows:

Precision = (number of correctly classified samples) / (total number of samples), where the number of correctly classified samples is the number of samples that the model to be updated classified correctly within the statistical period, and the total number of samples is the number of samples input to the model to be updated within the statistical period.
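As a minimal sketch of this statistic, the following Python function computes the Precision value of one statistical period from two parallel label lists. The function name `period_precision` and the list-based inputs are illustrative assumptions and are not part of the application.

```python
def period_precision(predicted_labels, true_labels):
    """Precision for one statistical period: correct predictions / all samples."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predicted and true label lists must have equal length")
    if not true_labels:
        raise ValueError("the statistical period contains no samples")
    # Count the samples whose predicted label equals the annotated label.
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)
```

For instance, three correct predictions out of four samples give a Precision value of 0.75 for that period.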

It should be further explained that the statistical period needs to be preset according to the volume of business data of the system. If there is a large amount of business data, you can set daily statistics (that is, 1 day is a statistical cycle). If the amount of data is small, you can set statistics on a weekly or monthly basis (that is, 1 week or 1 month is a statistics cycle). In practical applications, for official document classification scenarios, the prediction accuracy of the model is usually calculated with a weekly statistical cycle.

Taking the document classification scenario as an example, the amount of data is generally considered large when more than 1,000 new records are added to the system. If more than 1,000 new records are added per day, statistics are calculated daily; if the data accumulated over a week exceeds 1,000 records, statistics are calculated weekly; and so on.

What needs to be explained here is that, in the official document classification scenario, staff assign each official document to the corresponding office or department for processing, according to the content of the document and the functions of each office or department within the organization. Put simply, the task is to classify official documents, and the classification labels are the names of the offices. The number of official documents received by an agency may differ from day to day, but it is generally relatively small: about 200 official documents need to be distributed daily, so the statistics are calculated on a weekly basis.
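The rule above can be sketched as a small helper that picks the shortest period whose accumulated volume exceeds the threshold. The function name, the 30-day month, and the fallback to the longest period are assumptions made for illustration only.

```python
def choose_statistical_period(daily_volume, threshold=1000):
    """Pick the shortest period whose accumulated data volume exceeds the threshold."""
    # Candidate periods, shortest first (assumption: a month counts as 30 days).
    candidates = [("day", 1), ("week", 7), ("month", 30)]
    for name, days in candidates:
        if daily_volume * days > threshold:
            return name
    # Assumption: fall back to the longest candidate when volume is very low.
    return "month"
```

With a volume of about 200 documents per day, the daily total stays under 1,000 while the weekly total exceeds it, so the helper returns a weekly period.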

In addition, it should be emphasized that, in order to further ensure the privacy and security of the data in the above statistical database, the statistical database is stored in the nodes of the blockchain.

S120: Use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online.

It should be noted that the trigger mechanism is used to determine, based on the data in the statistical database, whether the model to be updated needs to be updated. Whether a model update needs to be triggered can be determined from the historical prediction performance of the online model (that is, the model to be updated) recorded in the statistical database.

Specifically, the condition for triggering an update according to the historical prediction performance of the online model can be as follows: the online model is updated if any one of the conditions below is met, in which case the subsequent steps of this application are carried out; if no condition is met, the above steps continue to loop.

For example, trigger mechanisms include:

Mechanism A: If the prediction accuracy value of the historical N statistical periods including the current statistical period continues to decrease, it is determined that the model to be updated needs to be updated online, where N is the first preset parameter.

Mechanism B: If the prediction Precision value of the current statistical period is less than (average prediction accuracy - 2 × prediction accuracy standard deviation), or is less than (average prediction accuracy - decrease percentage P), it is determined that the model to be updated needs to be updated online, where:

The average prediction accuracy is the mean of the prediction Precision values of the historical N statistical periods, the prediction accuracy standard deviation is the standard deviation of the prediction Precision values of the historical N statistical periods, and the decrease percentage P is the second preset parameter. N and the decrease percentage P can be set according to the business scenario. In the document classification scenario, N is set to 5.
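Mechanisms A and B can be sketched in Python as follows. This is only an illustration under stated assumptions: "continues to decrease" is read as a strict decrease at every step, the baseline for Mechanism B is taken from the N periods before the current one, the population standard deviation is used, and P is an absolute drop expressed as a fraction (0.05 for 5 percentage points). All function names are hypothetical.

```python
from statistics import mean, pstdev


def mechanism_a(history, n):
    """True if precision fell at every step across the last n periods.

    history: precision values ordered oldest to newest; the last entry
    is the current statistical period.
    """
    if n < 2 or len(history) < n:
        return False  # not enough recorded periods to judge a trend
    window = history[-n:]
    return all(later < earlier for earlier, later in zip(window, window[1:]))


def mechanism_b(previous, current, p):
    """True if the current precision is abnormally low against the baseline.

    previous: precision values of the periods before the current one
    (assumption: the current period is excluded from the baseline).
    """
    if not previous:
        return False
    avg = mean(previous)
    std = pstdev(previous)  # assumption: population standard deviation
    return current < avg - 2 * std or current < avg - p


def needs_update(history, n, p):
    """Trigger an online update if either mechanism fires."""
    if not history:
        return False
    previous, current = history[-(n + 1):-1], history[-1]
    return mechanism_a(history, n) or mechanism_b(previous, current, p)
```

With N = 5 and P = 0.05, a history that hovers around 0.78 and then falls to 0.70 triggers an update through Mechanism B even though the values did not decrease in every period.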

It should be noted that the above trigger mechanism is designed on the basis of actual conditions and experience in real scenarios. Updating the model according to this rule keeps the model at a certain level of accuracy, so that its accuracy does not decrease significantly over time.

What needs to be further explained here is that business experience refers to the change in accuracy that each business scenario can tolerate. Some business scenarios have strict requirements: if accuracy is not allowed to drop from 90% to 89%, the model must be updated once accuracy drops by more than P = 1%. Other scenarios are less demanding and may update the model only at P = 10%. In the official document scenario, the current value is generally compared with the values of the previous N = 5 statistical periods with P = 5%, and the model is updated if the drop exceeds 5%.

In addition, other mechanisms can also be used to start the model update (that is, the subsequent steps of this application). For example, an update interval is set on the basis of business experience, and the online model is updated when the number of statistical periods recorded in the statistical database reaches the set interval.

For example, the trigger mechanism further includes Mechanism C: determine whether the online duration of the model to be updated has reached a preset update cycle threshold, and if it has, determine that the model to be updated needs to be updated online. The update cycle threshold is M times the statistical period, where M is a natural number and M ≥ 2.
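A minimal sketch of Mechanism C, assuming the deployment time of the model is known and the statistical period is expressed as a `timedelta`. The function and parameter names are illustrative and not taken from the application.

```python
from datetime import datetime, timedelta


def mechanism_c(deployed_at, now, statistical_period, m):
    """True once the model has been online for at least M statistical periods."""
    if m < 2:
        raise ValueError("M must be a natural number >= 2")
    # The update cycle threshold is M times the statistical period.
    return now - deployed_at >= m * statistical_period
```

For a weekly statistical period and M = 4, the check first returns True 28 days after deployment.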

S130: If the model to be updated needs to be updated online, obtain newly generated data online, and update the training data of the model to be updated according to the newly generated data.

It should be noted that the most important aspect of model update is the update of training data. Including the generated new data into the training data of the model is the preferred method for model update.

Specifically, the update of training data includes the following steps:

Check whether the distribution of the newly generated annotation data on the line is consistent with the historical training data of the model. If they are consistent, the newly generated data and the historical training data are merged to generate training update data.

It should be noted that, since the prediction accuracy of a multi-class model is highly sensitive to the data distribution, the consistency between the current online data distribution and the historical training data distribution must be checked. If the proportions of the various sample classes are close, the historical training data can be merged directly with the new data. Otherwise, if the online data distribution has changed so that it differs greatly from the historical training data (for example, the proportion of a certain class of samples has changed greatly), the sample proportions of the historical training data must first be adjusted to match the online data through down-sampling and over-sampling, and the data are then merged.

Specifically, if the trigger mechanism is Mechanism A or Mechanism B, updating the training data of the model to be updated according to the newly generated data includes:

Incremental update:

Checking whether the distribution of the newly generated data is consistent with the historical training data of the model to be updated;

If they are consistent, merge the newly generated data with the historical training data to generate training update data; wherein,

If the proportion difference between the various samples of the newly generated data and the historical training data is less than the preset proportion threshold, it is determined that the distribution of the newly generated data is consistent with the historical training data.

If the distribution of the newly generated data is inconsistent with the historical training data, the sample classes of the historical training data are processed iteratively through down-sampling and over-sampling until the proportion difference between the sample classes of the newly generated data and those of the historical training data is less than the preset proportion threshold;

The newly generated data is then combined with the iteratively processed historical training data to generate the training update data.
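The incremental update can be sketched as follows. This is a simplified illustration, not the procedure of the application: samples are assumed to be `(text, label)` tuples, consistency is assumed to mean that the absolute proportion difference of every class stays below the threshold, and the iterative down-sampling and over-sampling is replaced by a single resampling pass that targets the class proportions of the new data. Classes absent from the new data are dropped from the resampled history in this sketch.

```python
import random
from collections import Counter


def class_proportions(samples):
    """Map each label to its share of the samples."""
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def is_consistent(new, hist, threshold):
    """True if every class proportion differs by less than the threshold."""
    p_new, p_hist = class_proportions(new), class_proportions(hist)
    return all(abs(p_new.get(c, 0.0) - p_hist.get(c, 0.0)) < threshold
               for c in set(p_new) | set(p_hist))


def rebalance(hist, target, rng):
    """Resample historical data toward the target class proportions."""
    by_class = {}
    for sample in hist:
        by_class.setdefault(sample[1], []).append(sample)
    out = []
    for label, items in by_class.items():
        want = round(target.get(label, 0.0) * len(hist))
        if want <= len(items):
            out.extend(rng.sample(items, want))  # down-sampling
        else:
            # over-sampling: keep all items and draw extras with replacement
            out.extend(items + rng.choices(items, k=want - len(items)))
    return out


def incremental_update(new, hist, threshold=0.05, seed=0):
    """Merge new data into the full history, rebalancing first if needed."""
    if not is_consistent(new, hist, threshold):
        hist = rebalance(hist, class_proportions(new), random.Random(seed))
    return hist + new
```

A production version would likely loop until `is_consistent` holds, as the text describes, and would need a policy for classes that appear only in the historical data.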

It should be noted that there are two ways to update the training data. One is incremental: the newly generated data is added directly to the historical training data, and the full set of historical training data is retained. The other is rolling with a fixed length: the time span covered by the training data is fixed. For example, the updated training data consists of the data from the 2 years before the current statistical period, so adding new data removes the earliest data of the corresponding length from the training data. The newly added data refers to the data accumulated from the last update to the current statistical period. The second method is generally chosen in scenarios where the data changes rapidly.

Specifically, if the trigger mechanism is Mechanism C, updating the training data of the model to be updated according to the newly generated data includes:

Rolling update:

Set a fixed rolling duration, and select corresponding data from the newly generated data and the historical training data according to the fixed rolling duration as the training update data; wherein,

The fixed rolling duration is L times the statistical period, where L is a natural number and L > M. It should be noted that only when L > M does the generated training update data include both historical training data and newly generated data, which ensures that the training update data fits the actual scenario.

S140: Perform update training on the model to be updated using the training update data generated by the update, so as to obtain an updated multi-classification model.
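The rolling update can be sketched as a time-window filter over the combined data. Samples are assumed to be `(timestamp, text, label)` tuples, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def rolling_update(new, hist, now, statistical_period, l, m):
    """Keep only samples from the last L statistical periods."""
    if l <= m:
        raise ValueError("L must be greater than M")
    # The fixed rolling duration is L times the statistical period.
    window_start = now - l * statistical_period
    return [s for s in hist + new if window_start <= s[0] <= now]
```

Because the model is retrained every M periods and the window spans L > M periods, the window always reaches back beyond the last update and so retains some historical data alongside the new data.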

In actual use, the prediction performance of the newly trained model and that of the old model are evaluated and compared in the new statistical period to decide whether to replace the old model with the new one. If the prediction accuracy of the new model is greater than that of the old model, the old model is replaced; otherwise it is kept.
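The replacement decision can be sketched as follows, assuming each model is represented by a callable that maps a document text to a label and that both models are scored on the same annotated samples from the new statistical period. These representations and names are illustrative assumptions.

```python
def precision(predict, samples):
    """Share of (text, label) samples that the model labels correctly."""
    if not samples:
        raise ValueError("no samples to evaluate")
    correct = sum(predict(text) == label for text, label in samples)
    return correct / len(samples)


def select_model(old_predict, new_predict, samples):
    """Return the new model only if it is strictly more accurate than the old one."""
    if precision(new_predict, samples) > precision(old_predict, samples):
        return new_predict
    return old_predict
```

Keeping the old model on a tie avoids an unnecessary redeployment, which matches the rule that replacement happens only when the new model is more accurate.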

The following uses official document data as an example to explain in detail the process of the multi-class model self-learning online update method provided in this application. Take, for example, a notice with the classification label "Social Development Department" (an office name). Both the newly labeled data and the historical training data take this form; the difference is that the newly labeled data consists of the latest official documents, while the historical training data consists of earlier official documents.

The proportion of each sample class refers to the share of all samples that belongs to that class, for example:

Office 1: 20% of all official documents

Office 2: 7% of all official documents

Office 3: 0.2% of all official documents

…

If a particular event generates a large number of official documents within a certain period of time, the original distribution of sample proportions changes, for example to the following:

Office 1: 9% of all official documents

Office 2: 10% of all official documents

Office 3: 3% of all official documents

…

The proportion of Office 1 changed from the dominant 20% to 9%, which is a big change in the sample distribution. A big change can be defined here as a relative change in the proportion of a given office that exceeds a threshold of 50%: (20 - 9) / 20 = 55% > 50%.
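The relative-change check can be sketched with the proportions from the example. The function name is hypothetical, and treating increases as well as decreases as changes (by taking the absolute value) is an assumption; the text only works through the decrease for Office 1.

```python
def shifted_classes(old_props, new_props, threshold=0.5):
    """List the classes whose proportion changed by more than the threshold, relative to the old value."""
    return sorted(
        label
        for label, old in old_props.items()
        if old > 0 and abs(new_props.get(label, 0.0) - old) / old > threshold
    )


# Proportions in percent, taken from the example above.
old_props = {"Office 1": 20, "Office 2": 7, "Office 3": 0.2}
new_props = {"Office 1": 9, "Office 2": 10, "Office 3": 3}
changed = shifted_classes(old_props, new_props)
```

Under this reading, Office 1 (55% relative change) and Office 3 (a much larger relative increase) exceed the threshold, while Office 2 (about 43%) does not.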

The corresponding multi-classification model is updated according to the above model update process. For example, in the official document classification scenario, the weekly prediction accuracy stayed at about 78% after the model went online. When the epidemic occurred, the content of the official documents changed, and without an update the model accuracy dropped to 70%. After the automatic model update mechanism proposed in this patent was adopted, the accuracy of the model could be maintained at 78%.

As the above description of the technical solution shows, the self-learning online update method for multi-classification models provided by this application realizes online automatic updating of a multi-classification model by designing an update trigger mechanism, a training data update mechanism and a model update method, and thereby keeps the prediction accuracy of the multi-classification model at a high level. In addition, model accuracy tracking automatically reflects the prediction performance of the online multi-class model, and the corresponding update trigger mechanism provides a criterion for judging performance degradation, so that the right time to update the multi-class model can be found effectively and a decline in the model's prediction performance can be prevented. Furthermore, by checking the online data distribution and adjusting the sample ratio of the training data, the updated model adapts to changes in the distribution of the online data; the model update conditions keep the model's prediction accuracy within a certain range, which significantly improves the adaptive ability of the model while keeping its prediction accuracy stable. Finally, the solution provided by the present application avoids the tedious work of manually updating the model and can respond in real time to maintain the performance of the prediction model.

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.

Example 2

Corresponding to the above method, this application also provides a multi-classification model self-learning online update system, which includes:

Example 3

The application also provides an electronic device 70. Referring to FIG. 2, this figure is a schematic structural diagram of a preferred embodiment of the electronic device 70 provided by this application.

In this embodiment, the electronic device 70 may be a terminal device with a computing function, such as a server, a smart phone, a tablet computer, a portable computer, a desktop computer, and the like.

The electronic device 70 includes a processor 71 and a memory 72.

The memory 72 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, card-type memory, and the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 70, such as a hard disk of the electronic device 70. In other embodiments, the readable storage medium may also be an external memory of the electronic device 70, such as a plug-in hard disk equipped on the electronic device 70, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.

In this embodiment, the readable storage medium of the memory 72 is generally used to store the multi-class model self-learning online update program 73 installed in the electronic device 70. The memory 72 can also be used to temporarily store data that has been output or will be output.

In some embodiments, the processor 71 may be a central processing unit (CPU), a microprocessor or another data processing chip, and is used to run the program code or process the data stored in the memory 72, for example to execute the multi-classification model self-learning online update program 73.

In some embodiments, the electronic device 70 is a terminal device such as a smart phone, a tablet computer, and a portable computer. In other embodiments, the electronic device 70 may be a server.

FIG. 2 only shows the electronic device 70 with the components 71-73, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.

Optionally, the electronic device 70 may also include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 70 may further include a display, which may also be referred to as a display screen or a display unit. In some embodiments, it may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, etc. The display is used for displaying information processed in the electronic device 70 and for displaying a visualized user interface.

Optionally, the electronic device 70 may also include a touch sensor. The area provided by the touch sensor for the user to perform touch operations is called the touch area. In addition, the touch sensor here may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor, but also a proximity type touch sensor and the like. In addition, the touch sensor may be a single sensor, or may be, for example, a plurality of sensors arranged in an array.

In addition, the area of the display of the electronic device 70 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen. The device detects the touch operation triggered by the user based on the touch screen.

Optionally, the electronic device 70 may also include a radio frequency (RF) circuit, a sensor, an audio circuit, etc., which will not be repeated here.

In the device embodiment shown in FIG. 2, the memory 72, as a computer storage medium, may include an operating system and a multi-class model self-learning online update program 73; the following steps are implemented when the processor 71 executes the multi-class model self-learning online update program 73 stored in the memory 72:

In this embodiment, FIG. 3 is a schematic diagram of the internal logic of the multi-class model self-learning online update program according to an embodiment of the present application. As shown in FIG. 3, the multi-class model self-learning online update program 73 can also be divided into one or more modules, which are stored in the memory 72 and executed by the processor 71 to carry out this application. A module as referred to in this application is a series of computer program instruction segments that can complete a specific function. FIG. 3 is a program module diagram of a preferred embodiment of the multi-class model self-learning online update program 73 in FIG. 2. The multi-class model self-learning online update program 73 can be divided into: a performance monitoring module 74, a mechanism trigger module 75, a data update module 76, and a model update module 77. The functions or operation steps implemented by modules 74-77 are similar to those described above and will not be described in detail here. By way of example:

The performance monitoring module 74 is used to monitor and count the predicted performance of the model to be updated according to the preset statistical period, and store the statistical results of the predicted performance in each statistical period into the statistical database;

The mechanism trigger module 75 is configured to use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

The data update module 76 is configured to, if the model to be updated needs to be updated online, obtain newly generated data online, and update the training data of the model to be updated according to the newly generated data;

The model update module 77 is configured to update and train the model to be updated by using the updated training data to obtain an updated multi-classification model.

Example 4

This application also provides a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium stores a multi-classification model self-learning online update program 73, and when the multi-class model self-learning online update program 73 is executed by the processor, the following operations are implemented:

The specific implementation of the computer-readable storage medium provided in this application is substantially the same as the specific implementation of the above-mentioned multi-class model self-learning online update method and electronic device, and will not be repeated here.

It should be noted that the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.

It should be further clarified that, in this document, the terms "include", "comprise" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article or method including a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, device, article, or method. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments. Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform; they can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to enable a terminal device (which can be a mobile phone, a computer, a server, a network device, etc.) to execute the methods of the various embodiments of the present application.

The above are only preferred embodiments of this application and do not limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

A self-learning online update method for a multi-classification model, applied to an electronic device, wherein the method includes:

According to a preset statistical period, the prediction performance of the model to be updated is monitored and statistics on it are compiled, and the prediction performance statistics for each statistical period are stored in a statistical database;

Use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

If the model to be updated needs to be updated online, acquiring newly generated data online, and updating the training data of the model to be updated according to the newly generated data;

The updated training data is used to update and train the model to be updated to obtain an updated multi-classification model.
The multi-class model self-learning online update method according to claim 1, wherein the statistical database is stored in a node of a blockchain, the prediction performance includes a prediction accuracy Precision value, and the Precision value is calculated as:

Precision value = number of correctly classified samples / total number of samples; and,

The trigger mechanism includes:

Mechanism A: If the Precision value decreases continuously over the historical N statistical periods, including the current statistical period, it is determined that the model to be updated needs to be updated online, where N is a first preset parameter.
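
The Precision formula and Mechanism A can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names are hypothetical, the history is assumed to be a list ordered from oldest to newest with the current period last, and "continues to decrease" is interpreted here as strictly decreasing from each period to the next.

```python
from typing import Sequence


def precision_value(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Correctly classified samples divided by all samples in the period."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mechanism_a(history: Sequence[float], n: int) -> bool:
    """True if the last n Precision values (current period included)
    are strictly decreasing. Assumed reading of 'continues to decrease'."""
    if n < 2 or len(history) < n:
        return False  # not enough periods to establish a trend
    window = list(history[-n:])
    return all(later < earlier for earlier, later in zip(window, window[1:]))
```

For example, `mechanism_a([0.90, 0.88, 0.85, 0.80], 3)` inspects the last three values and reports a sustained decline.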
The multi-class model self-learning online update method according to claim 2, wherein the trigger mechanism further comprises:

Mechanism B: If the Precision value of the current statistical period is less than (prediction accuracy average value - 2 * prediction accuracy standard deviation), or the Precision value of the current statistical period is less than (prediction accuracy average value - decrease percentage P), it is determined that the model to be updated needs to be updated online; wherein,

The prediction accuracy average value is the average of the Precision values of the historical N statistical periods, the prediction accuracy standard deviation is the standard deviation of the Precision values of the historical N statistical periods, and the decrease percentage P is a second preset parameter.
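
A minimal sketch of the Mechanism B check is given below. Several points are assumptions of this example rather than statements of the claim: the historical window is taken to be the N periods before the current one, the population standard deviation is used, and P is treated as an absolute drop in Precision (for example 0.05). The function name is hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def mechanism_b(previous: Sequence[float], current: float, n: int, p: float) -> bool:
    """previous: Precision values of earlier periods, oldest first.
    Returns True if the current value falls below either threshold."""
    if n < 1 or len(previous) < n:
        return False  # not enough history to compute the statistics
    window = list(previous[-n:])
    avg = mean(window)
    std = pstdev(window)  # population standard deviation (assumed choice)
    # Trigger on a drop of more than two standard deviations,
    # or on a drop of more than P below the historical average.
    return current < avg - 2 * std or current < avg - p
```

The two conditions complement each other: the standard deviation rule adapts to how noisy the history is, while the fixed drop P still fires when the history itself is so volatile that two standard deviations would be a very wide band.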
The multi-class model self-learning online update method according to claim 3, wherein the trigger mechanism further comprises:

Mechanism C: Determine whether the online duration of the model to be updated has reached a preset update period threshold, and if so, determine that the model to be updated needs to be updated online; wherein,

The update period threshold is M times the statistical period, where M is a natural number and M ≥ 2.
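
Mechanism C reduces to a single comparison, sketched below under the assumption that the deployment time of the model is recorded and that the statistical period is expressed as a `timedelta`. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def mechanism_c(deployed_at: datetime, now: datetime,
                statistical_period: timedelta, m: int) -> bool:
    """True once the model has been online for at least M statistical periods."""
    if m < 2:
        raise ValueError("M must be a natural number >= 2")
    return now - deployed_at >= m * statistical_period
```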
The multi-class model self-learning online update method according to claim 4, wherein if the trigger mechanism is the mechanism A or the mechanism B, the updating the training data of the model to be updated according to the newly generated data comprises:

Incremental update:

Checking whether the distribution of the newly generated data is consistent with the historical training data of the model to be updated;

If they are consistent, merge the newly generated data with the historical training data to generate training update data; wherein,

If, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold, it is determined that the distribution of the newly generated data is consistent with the historical training data.
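
The consistency check and merge of the incremental update can be sketched as follows. The representation of a sample as a `(features, label)` tuple, the function names, and the choice to return `None` when the distributions are inconsistent are assumptions of this example.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Sample = Tuple[object, Hashable]  # (features, label); hypothetical layout


def class_proportions(samples: Sequence[Sample]) -> Dict[Hashable, float]:
    """Share of each class label in the data set."""
    if not samples:
        raise ValueError("data set must not be empty")
    counts = Counter(label for _, label in samples)
    return {label: k / len(samples) for label, k in counts.items()}


def is_consistent(new_data: Sequence[Sample], hist_data: Sequence[Sample],
                  threshold: float) -> bool:
    """True if every class proportion differs by less than the threshold."""
    new_p = class_proportions(new_data)
    hist_p = class_proportions(hist_data)
    classes = set(new_p) | set(hist_p)
    return all(abs(new_p.get(c, 0.0) - hist_p.get(c, 0.0)) < threshold
               for c in classes)


def incremental_update(new_data: Sequence[Sample], hist_data: Sequence[Sample],
                       threshold: float) -> Optional[List[Sample]]:
    """Merge the two sets if consistent; otherwise return None so the
    caller can resample the historical data first."""
    if not is_consistent(new_data, hist_data, threshold):
        return None
    return list(hist_data) + list(new_data)
```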
The multi-class model self-learning online update method according to claim 5, wherein:

If the distribution of the newly generated data is inconsistent with the historical training data, the samples of each class in the historical training data are cyclically processed by down-sampling and over-sampling until, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold;

Combining the newly generated data with the historical training data after cyclic processing to generate the training update data.
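
One possible reading of the cyclic down-sampling and over-sampling is sketched below. The claim does not specify the resampling procedure, so the specific loop here (each round removes one sample of the most over-represented class and duplicates one sample of the most under-represented class) is an assumption, as are the function names, the `(features, label)` sample layout, the seed, and the `max_rounds` safety cap.

```python
import random
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[object, Hashable]  # (features, label); hypothetical layout


def proportions(samples: Sequence[Sample]) -> Dict[Hashable, float]:
    counts = Counter(label for _, label in samples)
    return {label: k / len(samples) for label, k in counts.items()}


def rebalance_history(new_data: Sequence[Sample], hist_data: Sequence[Sample],
                      threshold: float, max_rounds: int = 10_000,
                      seed: int = 0) -> List[Sample]:
    """Resample the historical data until its class proportions are
    within `threshold` of those of the newly generated data."""
    if not new_data or not hist_data or threshold <= 0:
        raise ValueError("need non-empty data sets and a positive threshold")
    rng = random.Random(seed)
    hist = list(hist_data)
    target = proportions(new_data)
    for _ in range(max_rounds):
        current = proportions(hist)
        gaps = {c: current.get(c, 0.0) - target.get(c, 0.0)
                for c in set(current) | set(target)}
        if all(abs(g) < threshold for g in gaps.values()):
            return hist
        over = max(gaps, key=gaps.get)    # most over-represented class
        under = min(gaps, key=gaps.get)   # most under-represented class
        donors = [s for s in hist if s[1] == under]
        if not donors:
            raise ValueError(f"class {under!r} has no historical sample to over-sample")
        # Down-sample: drop one random sample of the over-represented class.
        hist.pop(rng.choice([i for i, s in enumerate(hist) if s[1] == over]))
        # Over-sample: duplicate one random sample of the under-represented class.
        hist.append(rng.choice(donors))
    raise RuntimeError("proportions did not converge within max_rounds")


def merge_after_rebalance(new_data: Sequence[Sample], hist_data: Sequence[Sample],
                          threshold: float) -> List[Sample]:
    """Training update data: resampled history plus newly generated data."""
    return rebalance_history(new_data, hist_data, threshold) + list(new_data)
```

Because each round removes one sample and adds one, the size of the historical set stays constant in this sketch. A threshold finer than the resolution 1/len(hist_data) may never be met, which is why the loop is capped.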
The multi-class model self-learning online update method according to claim 6, wherein if the trigger mechanism is the mechanism C, the updating the training data of the model to be updated according to the newly generated data comprises:

Rolling update:

Set a fixed rolling duration, and select corresponding data from the newly generated data and the historical training data according to the fixed rolling duration as the training update data; wherein,

The fixed rolling duration is L times the statistical period, where L is a natural number and L > M.
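
The rolling update can be sketched as a time-window filter over the pooled data. This example assumes each sample carries a timestamp as its first element and that the window ends at the current time; neither detail is fixed by the claim, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple

TimedSample = Tuple[datetime, object, int]  # (timestamp, features, label); hypothetical


def rolling_update(new_data: Sequence[TimedSample], hist_data: Sequence[TimedSample],
                   now: datetime, statistical_period: timedelta,
                   l: int, m: int) -> List[TimedSample]:
    """Keep only samples from the last L statistical periods."""
    if l <= m:
        raise ValueError("L must be a natural number greater than M")
    cutoff = now - l * statistical_period
    pool = list(hist_data) + list(new_data)
    return [s for s in pool if cutoff <= s[0] <= now]
```

Requiring L > M means the window is longer than the update period, so each retraining still sees some data from before the previous update.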
A self-learning online update system for a multi-classification model, wherein the system includes:

The performance monitoring unit is configured to monitor and compile statistics on the prediction performance of the model to be updated according to a preset statistical period, and to store the prediction performance statistics for each statistical period in a statistical database;

The mechanism trigger unit is configured to use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

The data update unit is configured to, if the model to be updated needs to be updated online, obtain newly generated data online, and update the training data of the model to be updated according to the newly generated data;

The model update unit is used to update and train the model to be updated by using the updated training data to obtain an updated multi-classification model.
An electronic device, wherein the electronic device includes a memory, a processor, and a multi-class model self-learning online update program stored in the memory and executable on the processor, and the multi-class model self-learning online update program, when executed by the processor, implements a multi-class model self-learning online update method;

Wherein, the steps of the multi-classification model self-learning online update method include:

According to a preset statistical period, the prediction performance of the model to be updated is monitored and statistics on it are compiled, and the prediction performance statistics for each statistical period are stored in a statistical database;

Use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

If the model to be updated needs to be updated online, acquiring newly generated data online, and updating the training data of the model to be updated according to the newly generated data;

The updated training data is used to update and train the model to be updated to obtain an updated multi-classification model.
The electronic device according to claim 9, wherein the statistical database is stored in a node of a blockchain, the prediction performance includes a prediction accuracy Precision value, and the Precision value is calculated as:

Precision value = number of correctly classified samples / total number of samples; and,

The trigger mechanism includes:

Mechanism A: If the Precision value decreases continuously over the historical N statistical periods, including the current statistical period, it is determined that the model to be updated needs to be updated online, where N is a first preset parameter.
The electronic device according to claim 10, wherein the trigger mechanism further comprises:

Mechanism B: If the Precision value of the current statistical period is less than (prediction accuracy average value - 2 * prediction accuracy standard deviation), or the Precision value of the current statistical period is less than (prediction accuracy average value - decrease percentage P), it is determined that the model to be updated needs to be updated online; wherein,

The prediction accuracy average value is the average of the Precision values of the historical N statistical periods, the prediction accuracy standard deviation is the standard deviation of the Precision values of the historical N statistical periods, and the decrease percentage P is a second preset parameter.
The electronic device according to claim 11, wherein the trigger mechanism further comprises:

Mechanism C: Determine whether the online duration of the model to be updated has reached a preset update period threshold, and if so, determine that the model to be updated needs to be updated online; wherein,

The update period threshold is M times the statistical period, where M is a natural number and M ≥ 2.
The electronic device according to claim 12, wherein if the trigger mechanism is the mechanism A or the mechanism B, the updating the training data of the model to be updated according to the newly generated data comprises:

Incremental update:

Checking whether the distribution of the newly generated data is consistent with the historical training data of the model to be updated;

If they are consistent, merge the newly generated data with the historical training data to generate training update data; wherein,

If, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold, it is determined that the distribution of the newly generated data is consistent with the historical training data.
The electronic device according to claim 13, wherein:

If the distribution of the newly generated data is inconsistent with the historical training data, the samples of each class in the historical training data are cyclically processed by down-sampling and over-sampling until, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold;

Combining the newly generated data with the historical training data after cyclic processing to generate the training update data.
A computer-readable storage medium, wherein a multi-class model self-learning online update program is stored in the computer-readable storage medium, and the multi-class model self-learning online update program, when executed by a processor, implements a multi-class model self-learning online update method;

Wherein, the steps of the multi-classification model self-learning online update method include:

According to a preset statistical period, the prediction performance of the model to be updated is monitored and statistics on it are compiled, and the prediction performance statistics for each statistical period are stored in a statistical database;

Use a preset trigger mechanism to check the data in the statistical database to determine whether the model to be updated needs to be updated online;

If the model to be updated needs to be updated online, acquiring newly generated data online, and updating the training data of the model to be updated according to the newly generated data;

The updated training data is used to update and train the model to be updated to obtain an updated multi-classification model.
The computer-readable storage medium according to claim 15, wherein the statistical database is stored in a node of a blockchain, the prediction performance includes a prediction accuracy Precision value, and the Precision value is calculated as:

Precision value = number of correctly classified samples / total number of samples; and,

The trigger mechanism includes:

Mechanism A: If the Precision value decreases continuously over the historical N statistical periods, including the current statistical period, it is determined that the model to be updated needs to be updated online, where N is a first preset parameter.
The computer-readable storage medium of claim 16, wherein the trigger mechanism further comprises:

Mechanism B: If the Precision value of the current statistical period is less than (prediction accuracy average value - 2 * prediction accuracy standard deviation), or the Precision value of the current statistical period is less than (prediction accuracy average value - decrease percentage P), it is determined that the model to be updated needs to be updated online; wherein,

The prediction accuracy average value is the average of the Precision values of the historical N statistical periods, the prediction accuracy standard deviation is the standard deviation of the Precision values of the historical N statistical periods, and the decrease percentage P is a second preset parameter.
The computer-readable storage medium of claim 17, wherein the trigger mechanism further comprises:

Mechanism C: Determine whether the online duration of the model to be updated has reached a preset update period threshold, and if so, determine that the model to be updated needs to be updated online; wherein,

The update period threshold is M times the statistical period, where M is a natural number and M ≥ 2.
The computer-readable storage medium according to claim 18, wherein if the trigger mechanism is the mechanism A or the mechanism B, the updating the training data of the model to be updated according to the newly generated data comprises:

Incremental update:

Checking whether the distribution of the newly generated data is consistent with the historical training data of the model to be updated;

If they are consistent, merge the newly generated data with the historical training data to generate training update data; wherein,

If, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold, it is determined that the distribution of the newly generated data is consistent with the historical training data.
The computer-readable storage medium according to claim 19, wherein:

If the distribution of the newly generated data is inconsistent with the historical training data, the samples of each class in the historical training data are cyclically processed by down-sampling and over-sampling until, for each class of samples, the difference between its proportion in the newly generated data and its proportion in the historical training data is less than the preset proportion threshold;

Combining the newly generated data with the historical training data after cyclic processing to generate the training update data.