CN116432780A - Model incremental learning method, device, equipment and storage medium - Google Patents
Model incremental learning method, device, equipment and storage medium
- Publication number
- CN116432780A (application number CN202310399396.0A)
- Authority
- CN
- China
- Prior art keywords
- model
- preset
- training
- data set
- trained
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Abstract
The application discloses a model incremental learning method, device, equipment and storage medium, relating to the technical field of model training. The method comprises the following steps: acquiring a plurality of historical prediction results as a current data set to be trained, and performing a data update operation on the current data set to be trained based on a preset data update rule to obtain a corresponding updated data set to be trained; performing model training with the updated data set to be trained to obtain a corresponding current training model, and placing the current training model into a preset integrated model based on a preset model addition rule to update the preset integrated model; and performing a weight update operation on each training model in the updated preset integrated model so as to update a preset weight adjuster, and performing a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results. By updating and learning both the data and the models within an integrated model, the application alleviates the concept drift problem and makes the prediction behavior of the model smoother and more stable.
Description
Technical Field
The present invention relates to the field of model training technologies, and in particular, to a model incremental learning method, device, equipment, and storage medium.
Background
With the continuous popularization of artificial intelligence, model training in machine learning has been widely applied across industries. During model training, however, the detection performance of a model may degrade because of concept drift. Concept drift is classified into true concept drift and false concept drift. False concept drift means that the data once used for training no longer covers the data population, so that newly arriving data requires the model to be retrained. True concept drift means that the mapping underlying the existing training data changes over time from y=f(x) to y=g(x), so that the original model must be retrained. Referring to FIG. 1, a data set D_1 is used to train a model m_1. After a certain time, a data set D_2 appears. The part C of D_2 represents newly emerging data, which the model m_1 cannot predict; and for the part A shared by D_1 and D_2, if the data mapping function has changed from y=f(x) to y=g(x), the model m_1 can no longer predict A correctly either.
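To make the two drift types concrete, the following toy sketch (not taken from the patent; all data and mappings are hypothetical) shows a region C falling outside the trained population and the labels on the shared region A changing when the mapping moves from y=f(x) to y=g(x):

```python
import numpy as np

# Illustrative toy example of false vs. true concept drift.
rng = np.random.default_rng(0)
f = lambda x: (x > 0.5).astype(int)     # original mapping y = f(x)
g = lambda x: (x > 0.8).astype(int)     # changed mapping y = g(x)

d1 = rng.uniform(0.0, 1.0, 1000)        # population used to train m_1
c = rng.uniform(1.0, 2.0, 5)            # false drift: new region C outside D_1
a = rng.uniform(0.5, 0.8, 5)            # true drift: shared region A, new labels
print("C lies outside the training range of m_1:", c.min() > d1.max())
print("labels on A under f:", f(a), "under g:", g(a))   # the labels now disagree
```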
Methods for alleviating concept drift exist in the prior art, such as dynamic weighting based on G-means, or using a K-means clustering algorithm to determine the similarity between data sets for retraining. These methods can alleviate the influence of conflicting data on a model to a certain extent, but they require manual labeling of data, and the virtual data they generate can negatively affect the model and trap it in local problems. Solving the concept drift problem, that is, learning the new knowledge in data C while handling the conflicting data in B, therefore still requires attention at the present stage.
Disclosure of Invention
In view of the above, the present invention aims to provide a model incremental learning method, device, equipment and storage medium, which update and learn both the data and the model by means of an integrated model, so as to alleviate and, to a certain extent, even solve the concept drift problem, and to make the prediction behavior of the model smoother and more stable. The specific scheme is as follows:
in a first aspect, the present application discloses a model incremental learning method, including:
acquiring a plurality of historical prediction results as a current data set to be trained, and performing data updating operation on the current data set to be trained based on preset data updating rules to obtain a corresponding updated data set to be trained;
model training is carried out by utilizing the updated data set to be trained to obtain a corresponding current training model, and the current training model is put into a preset integrated model based on a preset model addition rule to update the preset integrated model; the preset integrated model is used for storing all the training models participating in category prediction;
and carrying out a weight update operation on each training model in the updated preset integrated model to update the preset weight adjuster, carrying out a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jumping to the step of acquiring a plurality of historical prediction results as a current data set to be trained so as to carry out the next round of model iterative learning.
Optionally, before the obtaining the plurality of historical prediction results as the current data set to be trained, the method further includes:
and training the model by using the initial data set to be trained to obtain a first training model, and creating the preset integrated model to put the first training model into the preset integrated model.
Optionally, after the obtaining the plurality of historical prediction results as the current data set to be trained, the method further includes:
calculating the similarity between the current data set to be trained and the data set to be trained of the previous round of iterative learning of the model of the present round, and judging whether the similarity is smaller than a preset similarity threshold value or not;
if not, determining reserved data in the current data set to be trained, and directly carrying out category prediction operation by using the preset integrated model and the weight adjuster to obtain a plurality of prediction results; the reserved data are used for being added into a data set to be trained of the next round when the model of the next round is iteratively learned.
Optionally, the performing a data update operation on the current data set to be trained based on a preset data update rule to obtain a corresponding updated data set to be trained includes:
and if the similarity is smaller than the preset similarity threshold, carrying out data updating operation on the current data set to be trained which is iteratively learned by the round of model based on a preset data updating rule so as to obtain a corresponding updated data set to be trained.
Optionally, the step of placing the current training model into a preset integrated model based on a preset model addition rule to update the preset integrated model includes:
judging whether the number of all training models in the preset integrated model reaches a preset number threshold;
if not, directly putting the current training model into the preset integration model to update the preset integration model;
if yes, a training model is removed from the preset integrated model by using a preset model removing rule, and the current training model is put into the preset integrated model to update the preset integrated model.
Optionally, the updating the weight of each training model in the updated preset integrated model to update the preset weight adjuster includes:
selecting a data set to be predicted from a historical data set based on a preset data selection rule, and inputting the data set to be predicted into each training model of the preset integrated model to obtain a first predicted value corresponding to each training model;
and splicing the first predicted values of each training model into an array, inputting the array into the preset weight adjuster, and training to obtain the weight corresponding to each training model so as to update the preset weight adjuster.
Optionally, the performing a class prediction operation by using the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results includes:
acquiring a preset data set and inputting the preset data set into each training model of the updated preset integrated model to obtain a second predicted value corresponding to each training model;
and splicing the second predicted values corresponding to each training model into an array, and inputting the array into the updated weight adjuster for category prediction operation so as to obtain a plurality of predicted results.
In a second aspect, the present application discloses a model incremental learning device comprising:
the data updating module is used for acquiring a plurality of historical prediction results as a current data set to be trained, and carrying out data updating operation on the current data set to be trained based on a preset data updating rule so as to obtain a corresponding updated data set to be trained;
the model updating module is used for carrying out model training by utilizing the updated data set to be trained to obtain a corresponding current training model, and placing the current training model into a preset integrated model based on a preset model addition rule to update the preset integrated model; the preset integrated model is used for storing all the training models participating in category prediction;
The weight updating module is used for carrying out weight updating operation on each training model in the updated preset integrated model so as to update the preset weight adjuster;
and the prediction result determining module is used for carrying out category prediction operation by utilizing the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jumping to the step of obtaining a plurality of historical prediction results as a current data set to be trained so as to carry out next model iterative learning.
In a third aspect, the present application discloses an electronic device comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the model increment learning method.
In a fourth aspect, the present application discloses a computer readable storage medium storing a computer program which, when executed by a processor, implements the foregoing model incremental learning method.
Therefore, the method and the device acquire a plurality of historical prediction results as the current data set to be trained and perform a data update operation on it based on the preset data update rule to obtain the corresponding updated data set to be trained; perform model training with the updated data set to be trained to obtain the corresponding current training model, and place the current training model into the preset integrated model based on the preset model addition rule to update the preset integrated model, the preset integrated model storing all training models participating in category prediction; and perform a weight update operation on each training model in the updated preset integrated model so as to update the preset weight adjuster, perform a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jump back to the step of acquiring a plurality of historical prediction results as the current data set to be trained so as to carry out the next round of model iterative learning. It can be seen that the present application can update and train the data and the models periodically, collect the training models obtained from each round of training into an integrated model, and determine the weight of each training model in the integrated model. Effective data is thereby extracted through the data update operation, less redundant data participates in model training, the iteration speed of model training is accelerated, and the participation of manual labeling work is reduced to a certain extent. In addition, by applying the ensemble idea, training models containing different knowledge systems are collected into an integrated model and the weight of each training model in it is adjusted, which highlights the contribution of each training model in the current time period, reduces the negative influence of the concept drift problem on model category prediction, and makes the prediction of the model more accurate and stable while realizing rapid iterative updating.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the embodiments or the description of the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present invention, and other drawings can be obtained from them by a person of ordinary skill in the art without inventive effort.
FIG. 1 is a schematic diagram of a data relationship due to concept drift as disclosed herein;
FIG. 2 is a flow chart of a model increment learning method disclosed in the present application;
FIG. 3 is an overall flow chart of a model increment learning method disclosed herein;
FIG. 4 is a flowchart of a specific model incremental learning method disclosed herein;
FIG. 5 is a schematic diagram of a data approximation update disclosed in the present application;
FIG. 6 is a schematic diagram of a first model addition method disclosed in the present application;
FIG. 7 is a schematic diagram of a second model addition method disclosed herein;
FIG. 8 is a schematic diagram of a model weight update flow disclosed in the present application;
FIG. 9 is a schematic diagram of a model class prediction flow disclosed in the present application;
FIG. 10 is a schematic diagram of a model incremental learning device disclosed in the present application;
fig. 11 is a block diagram of an electronic device disclosed in the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is evident that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the scope of protection of the present application.
In the prior art, methods for alleviating concept drift exist, such as dynamic weighting based on G-means, or using a K-means clustering algorithm to determine the similarity between data sets for retraining. Although these methods can alleviate the influence of conflicting data on a model to a certain extent, they require manual labeling of data, and the virtual data they generate can negatively affect the model and trap it in local problems. In order to solve the concept drift problem, learn the new knowledge in data C and handle the conflicting data in the existing B, the present application discloses a model incremental learning method.
Referring to fig. 2, an embodiment of the present application discloses a model incremental learning method, including:
step S11: and acquiring a plurality of historical prediction results as a current data set to be trained, and performing data updating operation on the current data set to be trained based on a preset data updating rule so as to obtain a corresponding updated data set to be trained.
In this embodiment, first, a plurality of historical prediction results generated during the category prediction of the previous round of models are acquired and used as the current data set to be trained in the current round of iterative training; then, based on the preset data update rule, a data update operation is performed on the current data set to be trained so as to obtain the corresponding updated data set to be trained. In this way, the effective data causing the concept offset can be extracted through the data update, which reduces the number of model training runs and the participation and influence of repeated data on model training.
Step S12: model training is carried out by utilizing the updated data set to be trained to obtain a corresponding current training model, and the current training model is put into a preset integration model based on a preset model adding rule to update the preset integration model; the preset integrated model is used for storing the training models which all participate in category prediction.
In this embodiment, after the data update operation is completed, a model training operation is performed with the updated data set to be trained to obtain the current training model corresponding to the current round of iterative training, and the current training model is then placed into the preset integrated model based on the preset model addition rule to update the preset integrated model, where the preset integrated model stores all the training models participating in category prediction. In this way, the model is trained with the updated data, and a new training model can be quickly constructed from the effective data.
Step S13: and carrying out a weight update operation on each training model in the updated preset integrated model to update the preset weight adjuster, carrying out a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jumping to the step of acquiring a plurality of historical prediction results as a current data set to be trained so as to carry out the next round of model iterative learning.
In this embodiment, after the current training model has been added to the preset integrated model, a weight update operation is performed on each training model in the updated preset integrated model so as to update the preset weight adjuster, and a category prediction operation is performed with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results; the process then jumps back to step S11, so that the next round of model iterative learning is carried out on the prediction results obtained in the current round. It should be noted that the preset weight adjuster can be regarded as a linear neural network: if the preset integrated model contains n training models and the predicted value output by each training model covers 3 categories, then each category receives n probability predictions, each of which is combined with its corresponding weight, so that the weighted combination yields 3 final probability values, one per category, from which the final prediction is obtained. In this way, by using the existing models in the preset integrated model to strengthen the prediction of effective models on the data, problems such as reduced model detection performance caused by concept drift can be alleviated and, to a certain extent, even solved.
Referring to fig. 3, which shows the overall flowchart of the model incremental learning provided in the present application: first, initial training is performed on the model, that is, when model training is performed for the first time, preset initial data is used for training to obtain an initial model, so that iterative incremental learning of the model can proceed continuously; then, when new data to be trained is obtained, it is judged whether the data needs to be updated; if so, the data is updated and the model update and weight adjustment operations are carried out, and if no update operation is performed on the data, the model update and weight adjustment operations can be performed directly with the model generated by the previous round of iterative training; finally, the category prediction operation is performed with the updated model obtained in the current round of training, completing the current round of model iterative training.
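As a rough, end-to-end illustration of this loop, the following sketch runs a few rounds of the Fig. 3 flow on synthetic binary data; the batch generator, the unconditional per-round update, and the use of logistic regression for both the ensemble members and the adjuster are simplifying assumptions, not the patent's prescribed components:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_batch(shift, n=200):
    # Hypothetical drifting data stream: the decision rule moves with `shift`.
    X = rng.normal(shift, 1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

W = 5                                                  # ensemble time window
X0, y0 = make_batch(0.0)
ensemble = [LogisticRegression().fit(X0, y0)]          # initial training

for t in range(1, 4):                                  # rounds of incremental learning
    Xt, yt = make_batch(0.3 * t)                       # new data to be trained
    ensemble.append(LogisticRegression().fit(Xt, yt))  # model update
    if len(ensemble) > W:                              # window full: evict an old model
        ensemble.pop(rng.integers(0, len(ensemble) - 1))
    # weight adjustment: splice member probabilities and fit the adjuster
    P = np.hstack([m.predict_proba(Xt) for m in ensemble])
    adjuster = LogisticRegression(max_iter=1000).fit(P, yt)
    # category prediction with the updated ensemble and adjuster
    auc = roc_auc_score(yt, adjuster.predict_proba(P)[:, 1])
    print(f"round {t}: ensemble size {len(ensemble)}, AUC {auc:.3f}")
```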
Therefore, the method and the device acquire a plurality of historical prediction results as the current data set to be trained and perform a data update operation on it based on the preset data update rule to obtain the corresponding updated data set to be trained; perform model training with the updated data set to be trained to obtain the corresponding current training model, and place the current training model into the preset integrated model based on the preset model addition rule to update the preset integrated model, the preset integrated model storing all training models participating in category prediction; and perform a weight update operation on each training model in the updated preset integrated model so as to update the preset weight adjuster, perform a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jump back to the step of acquiring a plurality of historical prediction results as the current data set to be trained so as to carry out the next round of model iterative learning. It can be seen that the present application can update and train the data and the models periodically, collect the training models obtained from each round of training into an integrated model, and determine the weight of each training model in the integrated model. Effective data is thereby extracted through the data update operation, less redundant data participates in model training, the iteration speed of model training is accelerated, and the participation of manual labeling work is reduced to a certain extent. In addition, by applying the ensemble idea, training models containing different knowledge systems are collected into an integrated model and the weight of each training model in it is adjusted, which highlights the contribution of each training model in the current time period, reduces the negative influence of the concept drift problem on model category prediction, and makes the prediction of the model more accurate and stable while realizing rapid iterative updating.
Based on the above embodiments, the model increment learning method provided in the present application includes operations such as data update, model update, weight adjustment, and category prediction, and the process of iterative training of a round of model will be described in detail. Referring to fig. 4, an embodiment of the present application discloses a specific model incremental learning method, which includes:
step S201: and training the model by using the initial data set to be trained to obtain a first training model, and creating a preset integrated model to put the first training model into the preset integrated model.
In this embodiment, when model training is performed for the first time, since no training model exists yet, model training is performed with the initial data set to be trained to obtain a first training model m_1, and a preset integrated model M is created so that the first training model can be placed into it; at this point the preset integrated model is M = {m_1}. The initial data set to be trained may be a data set acquired in advance by the user, which is not specifically limited herein. In this way, the preset integrated model and the first training model are created at the first model training, so that the prediction results obtained through category prediction by the first training model can be used directly for incremental learning.
Step S202: and obtaining a plurality of historical prediction results as a current data set to be trained, calculating the similarity between the current data set to be trained and the previous data set to be trained in the iterative learning of the model, and judging whether the similarity is smaller than a preset similarity threshold value.
In this embodiment, after category prediction has been performed with the training model obtained in the previous round of model iterative training to produce a plurality of historical prediction results, those historical prediction results are acquired in the current round of model iterative training as the current data set to be trained D_i, where i is the number of model updates; for example, if the current model iteration is the second round, the current data to be trained is D_2, and so on. The data labels in the current data set to be trained D_i are then set to 1 and the data labels in the previous round's data set to be trained D_{i-1} are set to 0; the two labeled data sets are merged into a new data set D_m, model training is performed with D_m to obtain a model M_c, and M_c is then used to compute the AUC (Area Under Curve, i.e. the area enclosed between the ROC curve and the coordinate axes) of the data set D_m; the AUC index is in turn used to calculate the similarity between the current training data set D_i and the previous round's data set to be trained D_{i-1}. It should be noted that the new data set D_m and the model M_c are used only for calculating the similarity: the model M_c is independent of the preset integrated model and is not placed into it, and once the similarity calculation is completed, D_m and M_c have no further use. The similarity is derived from the intermediate parameter S, calculated as:
S=max(1-AUC,AUC)
wherein S is an intermediate parameter, max denotes taking the maximum value, and AUC is the AUC index; the more separable the two data sets are, the larger S becomes and the smaller the similarity between D_i and D_{i-1}. After the similarity is determined, it is judged whether the similarity is smaller than the preset similarity threshold thresh; in a specific embodiment the preset similarity threshold thresh may be set to 0.8, or the preset similarity threshold may be adjusted adaptively according to the model situation, which is not specifically limited herein.
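A minimal sketch of this similarity check, which is essentially adversarial validation, might look as follows; the use of logistic regression as the throwaway model M_c and the final rescaling of S to a [0, 1] similarity are assumptions, since the patent's exact similarity formula is not reproduced above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def dataset_similarity(d_curr, d_prev):
    X = np.vstack([d_curr, d_prev])
    y = np.concatenate([np.ones(len(d_curr)), np.zeros(len(d_prev))])  # labels 1 / 0
    m_c = LogisticRegression(max_iter=1000).fit(X, y)   # throwaway model M_c
    auc = roc_auc_score(y, m_c.predict_proba(X)[:, 1])  # AUC of M_c on D_m
    s = max(1.0 - auc, auc)                             # S = max(1 - AUC, AUC)
    return 2.0 * (1.0 - s)                              # assumed rescaling to [0, 1]

rng = np.random.default_rng(0)
same = dataset_similarity(rng.normal(0, 1, (300, 4)), rng.normal(0, 1, (300, 4)))
drift = dataset_similarity(rng.normal(0, 1, (300, 4)), rng.normal(2, 1, (300, 4)))
print(f"similar batches: {same:.2f}, drifted batches: {drift:.2f}")  # compare to thresh = 0.8
```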
Step S203: if not, determining reserved data in the current data set to be trained, and directly carrying out category prediction operation by using the preset integrated model and the weight adjuster to obtain a plurality of prediction results; the reserved data are used for being added into a data set to be trained of the next round when the model of the next round is iteratively learned.
In this embodiment, if the similarity is greater than or equal to the preset similarity threshold thresh, the current data set to be trained D_i and the previous round's data set to be trained D_{i-1} are very similar: the data offset is small, the category prediction effect of the model is stable, and the model does not need to be iteratively updated; it is only necessary to determine the reserved data D_{i-s} of the current data set to be trained D_i.
It should be noted that the s in the reserved data D_{i-s} is unrelated to the intermediate parameter S described above and is merely a subscript referring to the reserved data. Referring to FIG. 5, which is a schematic diagram of the data approximation between the current data set to be trained D_i and the previous round's data set to be trained D_{i-1}: the data update process of the present application uses a data approximation algorithm, whose inputs are the data sets D_i and D_{i-1} carrying their prediction probabilities, and whose outputs are the data A of the intersecting part of D_i and D_{i-1}, composed of the part A_i belonging to D_i and the part A_{i-1} belonging to D_{i-1}; the data B of the independent part of D_{i-1}; and the data C of the independent part of D_i. During the data approximation, several intermediate sets are produced: the sample set N_1 of D_{i-1} samples whose real label 0 is predicted as 1; the sample set N_2 of D_i samples whose real label 1 is predicted as 0; the set V_1 of D_{i-1} samples with real label 0, sorted in descending order of their predicted probability p_0; and the set V_2 of D_i samples with real label 1, sorted in descending order of their predicted probability p_1. The TOP-Size(N_1) samples of V_1 form the data set B, where Size(N_1) is the number of samples in the set N_1; the TOP-Size(N_2) samples of V_2 form the data set C, where Size(N_2) is the number of samples in the set N_2; N_1 ∪ (V_1 - B) gives the part A_{i-1} of the intersection belonging to D_{i-1}; N_2 ∪ (V_2 - C) gives the part A_i of the intersection belonging to D_i; and the data A of the intersecting part of D_i and D_{i-1} is A_{i-1} ∪ A_i.
In this embodiment, after the reserved data D_{i-s} has been determined, since the subsequent data update, model update and weight update operations are not required, the preset integrated model and the weight adjuster obtained in the previous round of model iterative learning can be used directly to perform the category prediction operation and obtain a plurality of prediction results, after which the process jumps back to step S202 for the next round of model iterative learning. It should be noted that the reserved data D_{i-s} obtained in the current round of model iterative learning is added to the data set to be trained in the next round; this prevents repeated data from being used to train the model again and again, trains the model with effective data to the greatest extent, reduces the number of iterations, and improves the efficiency of model incremental learning.
Step S204: and if the similarity is smaller than the preset similarity threshold, carrying out data updating operation on the current data set to be trained which is iteratively learned by the round of model based on a preset data updating rule so as to obtain a corresponding updated data set to be trained.
In this embodiment, if the similarity is smaller than the preset similarity threshold thresh, a data update operation is performed, based on the preset data update rule, on the current data set to be trained D_i of the current round of model iterative learning, so as to obtain the corresponding updated data set to be trained D_{i-u}.
In this way, the data update retains the data B and C and uses the similarity to select data from A, with the selected data leaning toward the data C from the time perspective, so that effective data is selected to guarantee the novelty of the data and the participation of manual labeling work is reduced to a certain extent.
Step S205: and carrying out model training by using the updated data set to be trained to obtain a corresponding current training model, and judging whether the number of all training models in the preset integrated model reaches a preset number threshold.
In this embodiment, after the data update operation is completed, model training is performed with the updated data set to be trained D_{i-u} to obtain the current training model of the current round of model iterative learning, and it is then judged whether the number of all training models in the preset integrated model M has reached the preset number threshold W; the preset number threshold W is the size of the time window W, which may be set according to the specific requirements of the training models and is not specifically limited herein.
Step S206: if not, the current training model is directly put into the preset integration model to update the preset integration model.
In this embodiment, if the number of all the training models in the preset integrated model M does not reach the preset number threshold W, that is, the number of training models is smaller than the time window W, the current training model can be put directly into the preset integrated model M to update it. Referring to FIG. 6, the models m_1 to m_{i+1} inside the solid-line box form the preset integrated model M, and the dotted-line box marks the set time window of size W; when the number of models in the preset integrated model M does not exceed the time window size, the new model m_{i+1} is put directly into the preset integrated model M.
Step S207: if yes, a training model is removed from the preset integrated model by using a preset model removing rule, and the current training model is put into the preset integrated model to update the preset integrated model.
In this embodiment, if the number of all the training models in the preset integrated model M reaches the preset number threshold W, the latest K training models are retained, one training model is randomly selected from the remaining ones and removed, and the current training model is put into the preset integrated model M to update it; the parameter K may be set by the user according to the model and is not specifically limited herein. Referring to FIG. 7, when the number of training models in the solid-line box, i.e. the preset integrated model, has reached the preset number threshold and a new model m_{i+1} arrives, the models m_{i-k} to m_i are retained, one training model among m_1 to m_{i-k-1} is randomly screened out and removed, and the new model m_{i+1} is added to the preset integrated model.
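A small sketch of this addition-and-ejection rule, with hypothetical window and retention parameters, could read:

```python
import random

# Below the time window W the new model is simply appended; once the window
# is full, the latest K models are protected and one older model is evicted
# at random before the new model joins (Figs. 6 and 7).
def add_model(ensemble, new_model, window_w, keep_k):
    if len(ensemble) >= window_w:
        ensemble.pop(random.randrange(0, len(ensemble) - keep_k))  # evict an old model
    ensemble.append(new_model)

random.seed(0)
ensemble = []
for i in range(8):                                   # hypothetical model stream
    add_model(ensemble, f"m{i + 1}", window_w=5, keep_k=3)
print(ensemble)                                      # the newest K models always survive
```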
Step S208: and selecting a data set to be predicted from the historical data set based on a preset data selection rule, and inputting the data set to be predicted into each training model of the preset integrated model to obtain a first predicted value corresponding to each training model.
In this embodiment, as shown in fig. 8, after the update of the preset integrated model M is completed, 20% of the data is randomly selected as the data set to be predicted D_f from the historical data set composed of the training data sets D_{i-k} to D_i corresponding to the latest K training models; the data set to be predicted D_f is then input into each training model of the preset integrated model M for category prediction, so as to obtain the first predicted value of each category output by each training model.
Step S209: and splicing the first predicted values of each training model into an array, inputting the array into the preset weight adjuster, and training to obtain the weight corresponding to each training model so as to update the preset weight adjuster.
In this embodiment, the first predicted value output by each training model is spliced into a one-dimensional array and input into the preset weight adjuster for simple training to obtain a weight corresponding to each training model, so as to update the preset weight adjuster. It can be understood that in the previous iteration process of the model, each training model in the preset integrated model has a corresponding weight through a weight updating operation, and after the current training model is generated by the present iteration and is placed in the preset integrated model, because the current training model is a newly added model, no corresponding weight exists yet, the preset weight adjuster needs to be updated to obtain the weight of the current training model, and the weights corresponding to other training models are updated.
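A hedged sketch of this weight update, using a linear model as a stand-in for the preset weight adjuster (the patent fixes no concrete architecture beyond the linear-network reading given earlier), might be; the function names and the random-draw details are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_prediction_set(X_hist, y_hist, frac=0.2, seed=0):
    # 20% random draw from the history of the latest K training sets (Fig. 8)
    idx = np.random.default_rng(seed).choice(
        len(X_hist), size=int(frac * len(X_hist)), replace=False)
    return X_hist[idx], y_hist[idx]                  # the data set D_f

def fit_weight_adjuster(ensemble, X_f, y_f):
    # shape (samples, n_models * n_classes): the spliced first predicted values
    first_preds = np.hstack([m.predict_proba(X_f) for m in ensemble])
    # the fitted coefficients play the role of the per-model weights
    return LogisticRegression(max_iter=1000).fit(first_preds, y_f)
```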
Step S210: and acquiring a preset data set and inputting the preset data set into each training model of the updated preset integrated model to obtain a second predicted value corresponding to each training model.
In this embodiment, as shown in fig. 9, after the data updating, model updating and weight updating operations are all completed, a preset data set is obtained and input into each training model of the updated preset integrated model M, so as to obtain a second predicted value of each category output by each training model, where the preset data set may be new data, i.e. a data set that has not been predicted or trained in the previous model training, or a data set that has been predicted or trained before, which will not be specifically limited herein.
Step S211: and splicing the second predicted values corresponding to each training model into an array, inputting the array into the updated weight adjuster for category prediction operation so as to obtain a plurality of predicted results, and jumping to the step of obtaining a plurality of historical predicted results as a current data set to be trained so as to perform the next round of model iterative learning.
In this embodiment, the second predicted values corresponding to each training model are spliced into a one-dimensional array and input into the updated weight adjuster for the category prediction operation, so as to obtain a plurality of prediction results, i.e. sample categories; when enough prediction results have accumulated, the process jumps back to step S202, so that the prediction results generated in this round serve as the data set to be trained for the next round of model iterative learning.
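Reusing the hypothetical fit_weight_adjuster() above, the category-prediction step then splices the second predicted values in the same layout and feeds them to the updated adjuster:

```python
import numpy as np

# Usage sketch for the category-prediction step of Fig. 9: the per-member
# probabilities on the new batch must be spliced in exactly the layout the
# adjuster was fitted on.
def ensemble_predict(ensemble, adjuster, X_new):
    second_preds = np.hstack([m.predict_proba(X_new) for m in ensemble])
    return adjuster.predict(second_preds)            # final sample categories
```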
Therefore, the present application starts model iterative learning by generating an initial model; it updates and reduces the data through similarity comparison of data sets and a data approximation algorithm, so that the model is trained with the most effective data extracted, the participation of manual labeling work is reduced, and rapid iterative updating of the model is realized; it further introduces an integrated model, whose continuous updating with a random strategy can reduce the occurrence of local problems to a certain extent; and it provides a dynamic and static weight calculation method that adjusts the importance of models from different time periods, eliminates the differences between the category prediction rules of different time periods, improves the smoothness of data prediction, and achieves a good prediction effect.
As described with reference to fig. 10, the embodiment of the present application further correspondingly discloses a model incremental learning device, including:
The data updating module 11 is configured to obtain a plurality of historical prediction results as a current data set to be trained, and perform data updating operation on the current data set to be trained based on a preset data updating rule, so as to obtain a corresponding updated data set to be trained;
the model updating module 12 is configured to perform model training by using the updated data set to be trained to obtain a corresponding current training model, and put the current training model into a preset integrated model based on a preset model addition rule to update the preset integrated model; the preset integrated model is used for storing all the training models participating in category prediction;
the weight updating module 13 is configured to perform a weight updating operation on each training model in the updated preset integrated model so as to update a preset weight adjuster;
the prediction result determining module 14 is configured to perform a class prediction operation by using the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and skip to the step of obtaining a plurality of historical prediction results as a current data set to be trained to perform a next model iterative learning.
Therefore, the method and the device acquire a plurality of historical prediction results as the current data set to be trained and perform a data update operation on it based on the preset data update rule to obtain the corresponding updated data set to be trained; perform model training with the updated data set to be trained to obtain the corresponding current training model, and place the current training model into the preset integrated model based on the preset model addition rule to update the preset integrated model, the preset integrated model storing all training models participating in category prediction; and perform a weight update operation on each training model in the updated preset integrated model so as to update the preset weight adjuster, perform a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jump back to the step of acquiring a plurality of historical prediction results as the current data set to be trained so as to carry out the next round of model iterative learning. It can be seen that the present application can update and train the data and the models periodically, collect the training models obtained from each round of training into an integrated model, and determine the weight of each training model in the integrated model. Effective data is thereby extracted through the data update operation, less redundant data participates in model training, the iteration speed of model training is accelerated, and the participation of manual labeling work is reduced to a certain extent. In addition, by applying the ensemble idea, training models containing different knowledge systems are collected into an integrated model and the weight of each training model in it is adjusted, which highlights the contribution of each training model in the current time period, reduces the negative influence of the concept drift problem on model category prediction, and makes the prediction of the model more accurate and stable while realizing rapid iterative updating.
In some specific embodiments, the model incremental learning device may further include:
and the initial model training module is used for carrying out model training by utilizing an initial data set to be trained to obtain a first training model, and creating the preset integrated model so as to put the first training model into the preset integrated model.
In some specific embodiments, the model incremental learning device may further include:
the similarity calculation module is used for calculating the similarity between the current data set to be trained and the data set to be trained of the previous round of iterative learning of the model of the present round, and judging whether the similarity is smaller than a preset similarity threshold value or not;
the reserved data determining module is used for determining reserved data in the current data set to be trained when the similarity is not smaller than a preset similarity threshold value, and directly performing category prediction operation by using the preset integrated model and the weight adjuster to obtain a plurality of prediction results; the reserved data are used for being added into a data set to be trained of the next round when the model of the next round is iteratively learned.
In some specific embodiments, the data updating module 11 may be specifically configured to perform a data updating operation on the current data set to be trained that is iteratively learned by the present round of model based on a preset data updating rule to obtain a corresponding updated data set to be trained when the similarity is smaller than the preset similarity threshold.
In some specific embodiments, the model updating module 12 may specifically include:
the quantity judging unit is used for judging whether the quantity of all training models in the preset integrated model reaches a preset quantity threshold value or not;
the first model updating unit is used for directly putting the current training model into the preset integrated model to update the preset integrated model when the number of all training models in the preset integrated model does not reach a preset number threshold;
and the second model updating unit is used for eliminating one training model from the preset integrated model by utilizing a preset model eliminating rule when the number of all training models in the preset integrated model reaches a preset number threshold value, and placing the current training model into the preset integrated model to update the preset integrated model.
In some specific embodiments, the weight updating module 13 may specifically include:
the first predicted value acquisition unit is used for selecting a data set to be predicted from a historical data set based on a preset data selection rule, and inputting the data set to be predicted into each training model of the preset integrated model to obtain a first predicted value corresponding to each training model;
And the weight updating unit is used for splicing the first predicted value of each training model into an array and inputting the array into the preset weight adjuster to train to obtain the weight corresponding to each training model so as to update the preset weight adjuster.
In some specific embodiments, the prediction result determining module 14 may specifically include:
the second predicted value acquisition unit is used for acquiring a preset data set and inputting the preset data set into each training model of the updated preset integrated model so as to obtain a second predicted value corresponding to each training model;
and the prediction result determining unit is used for splicing the second prediction values corresponding to each training model into an array and inputting the array into the updated weight adjuster for category prediction operation so as to obtain a plurality of prediction results.
Further, the embodiment of the present application discloses an electronic device; fig. 11 is a block diagram of an electronic device 20 according to an exemplary embodiment, and the content of the figure should not be considered as any limitation on the scope of use of the present application.
Fig. 11 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is configured to store a computer program that is loaded and executed by the processor 21 to implement the relevant steps in the model incremental learning method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, Netware, Unix, Linux, etc. In addition to the computer program that can be used to perform the model incremental learning method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for performing other specific tasks.
Further, the application also discloses a computer readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the model increment learning method disclosed previously. For specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing has described in detail the model incremental learning method, device, equipment and storage medium provided by the present application. Specific examples are applied herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is only intended to help understand the method and core ideas of the present application; meanwhile, since a person skilled in the art may make modifications to the specific embodiments and the application scope in accordance with the ideas of the present application, the contents of this description should not be construed as limiting the present application.
Claims (10)
1. A model incremental learning method, comprising:
acquiring a plurality of historical prediction results as a current data set to be trained, and performing data updating operation on the current data set to be trained based on preset data updating rules to obtain a corresponding updated data set to be trained;
model training is carried out by utilizing the updated data set to be trained to obtain a corresponding current training model, and the current training model is put into a preset integrated model based on a preset model addition rule to update the preset integrated model; the preset integrated model is used for storing all the training models participating in category prediction;
and carrying out a weight update operation on each training model in the updated preset integrated model to update the preset weight adjuster, carrying out a category prediction operation with the updated preset integrated model and the updated weight adjuster to obtain a plurality of prediction results, and jumping to the step of acquiring a plurality of historical prediction results as a current data set to be trained so as to carry out the next round of model iterative learning.
2. The model incremental learning method according to claim 1, wherein, before the acquiring a plurality of historical prediction results as a current data set to be trained, the method further comprises:
performing model training by using an initial data set to be trained to obtain a first training model, and creating the preset integration model so as to put the first training model into the preset integration model.
3. The model incremental learning method according to claim 1, wherein, after the acquiring a plurality of historical prediction results as a current data set to be trained, the method further comprises:
calculating a similarity between the current data set to be trained and the data set to be trained of the previous round of model iterative learning, and judging whether the similarity is smaller than a preset similarity threshold;
if not, determining reserved data in the current data set to be trained, and directly performing the category prediction operation by using the preset integration model and the weight adjuster to obtain a plurality of prediction results, wherein the reserved data is to be added to the data set to be trained in the next round of model iterative learning.
4. The model incremental learning method according to claim 3, wherein the performing a data updating operation on the current data set to be trained based on a preset data updating rule to obtain a corresponding updated data set to be trained comprises:
if the similarity is smaller than the preset similarity threshold, performing the data updating operation on the current data set to be trained of the present round of model iterative learning based on the preset data updating rule, so as to obtain the corresponding updated data set to be trained.
5. The model incremental learning method according to claim 1, wherein the putting the current training model into a preset integration model based on a preset model adding rule to update the preset integration model comprises:
judging whether the number of training models in the preset integration model reaches a preset number threshold;
if not, directly putting the current training model into the preset integration model to update the preset integration model;
if yes, removing one training model from the preset integration model by using a preset model removing rule, and then putting the current training model into the preset integration model to update the preset integration model.
6. The model incremental learning method according to claim 1, wherein the performing a weight updating operation on each training model in the updated preset integration model to update the preset weight adjuster comprises:
selecting a data set to be predicted from a historical data set based on a preset data selection rule, and inputting the data set to be predicted into each training model of the preset integration model to obtain a first predicted value corresponding to each training model;
splicing the first predicted values of the training models into an array, inputting the array into the preset weight adjuster, and training to obtain a weight corresponding to each training model, so as to update the preset weight adjuster.
7. The model incremental learning method according to any one of claims 1 to 6, wherein the performing a category prediction operation by using the updated preset integration model and the updated weight adjuster to obtain a plurality of prediction results comprises:
acquiring a preset data set, and inputting the preset data set into each training model of the updated preset integration model to obtain a second predicted value corresponding to each training model;
splicing the second predicted values corresponding to the training models into an array, and inputting the array into the updated weight adjuster for the category prediction operation, so as to obtain a plurality of prediction results.
8. A model incremental learning device, comprising:
a data updating module, used for acquiring a plurality of historical prediction results as a current data set to be trained, and performing a data updating operation on the current data set to be trained based on a preset data updating rule to obtain a corresponding updated data set to be trained;
a model updating module, used for performing model training by using the updated data set to be trained to obtain a corresponding current training model, and putting the current training model into a preset integration model based on a preset model adding rule to update the preset integration model, wherein the preset integration model is used to store all training models that participate in category prediction;
a weight updating module, used for performing a weight updating operation on each training model in the updated preset integration model to update a preset weight adjuster; and
a prediction result determining module, used for performing a category prediction operation by using the updated preset integration model and the updated weight adjuster to obtain a plurality of prediction results, and returning to the step of acquiring a plurality of historical prediction results as a current data set to be trained, so as to carry out the next round of model iterative learning.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the model incremental learning method of any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model incremental learning method of any one of claims 1 to 7.
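The claims specify the method functionally and include no reference implementation. As orientation only, here is a minimal Python sketch of the round structure of claim 1, with the initialization of claim 2 folded in; every identifier (train_model, update_dataset, and so on) is a hypothetical stand-in rather than anything named in the patent.

```python
# Hypothetical sketch of one reading of claims 1 and 2. The callables are
# placeholders for the patent's "preset" rules, which the claims leave open.

def run_incremental_learning(initial_dataset, rounds, train_model,
                             update_dataset, add_to_ensemble,
                             fit_weight_adjuster, predict_categories):
    # Claim 2: train a first model and create the preset integration model.
    ensemble = [train_model(initial_dataset)]
    adjuster = fit_weight_adjuster(ensemble)
    predictions = predict_categories(ensemble, adjuster)

    for _ in range(rounds):
        # Step 1: the previous round's prediction results become the current
        # data set to be trained; the preset data updating rule refreshes it.
        dataset = update_dataset(predictions)

        # Step 2: train the current model and add it to the integration
        # model under the preset model adding rule.
        ensemble = add_to_ensemble(ensemble, train_model(dataset))

        # Step 3: update the weight adjuster, run category prediction, and
        # loop back with the new prediction results.
        adjuster = fit_weight_adjuster(ensemble)
        predictions = predict_categories(ensemble, adjuster)

    return ensemble, adjuster, predictions
```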
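Claims 3 and 4 make retraining conditional on how similar the incoming data is to the previous round's data. Neither the similarity measure nor the threshold value is fixed by the claims; the cosine similarity over feature means below is purely an assumed placeholder that makes the branching concrete.

```python
import numpy as np

def dataset_similarity(current: np.ndarray, previous: np.ndarray) -> float:
    # Assumed metric: cosine similarity between the mean feature vectors of
    # the two data sets. The claims say only "calculate the similarity".
    a, b = current.mean(axis=0), previous.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def should_retrain(current: np.ndarray, previous: np.ndarray,
                   threshold: float = 0.95) -> bool:
    # Claim 4 branch: retrain when similarity is below the preset threshold.
    # Claim 3 branch: otherwise keep "reserved data" for the next round and
    # predict directly with the existing ensemble and weight adjuster.
    return dataset_similarity(current, previous) < threshold
```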
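Claim 5 caps the integration model at a preset number of training models, evicting one before each new addition once the cap is reached. The claim leaves the "preset model removing rule" open; first-in-first-out eviction below is just one plausible choice, and the cap of 5 is illustrative.

```python
from collections import deque

def add_to_ensemble(ensemble: deque, new_model, max_models: int = 5) -> deque:
    # max_models stands in for the "preset number threshold" of claim 5.
    if len(ensemble) >= max_models:
        ensemble.popleft()        # "if yes": evict one model (FIFO assumed)
    ensemble.append(new_model)    # put the current training model in
    return ensemble
```

A deque makes the FIFO eviction O(1); evicting the lowest-weighted model instead would be an equally valid reading of the claim.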
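Claims 6 and 7 both splice the per-model predictions into a single array before handing it to the weight adjuster: once on a held-out data set to train the per-model weights, and once at prediction time. The adjuster's internals are not specified; the softmax-weighted mixture fitted by gradient descent below is one possible realization, and it assumes each model exposes a scikit-learn-style predict_proba and that labels are integer class indices.

```python
import numpy as np

def stack_predictions(models, X):
    # Claims 6/7: splice each model's predicted values into one array of
    # shape (n_samples, n_models, n_classes).
    return np.stack([m.predict_proba(X) for m in models], axis=1)

def fit_weight_adjuster(models, X_val, y_val, lr=0.5, steps=300):
    # Assumed adjuster: one softmax weight per model, fitted so that the
    # weighted mixture of model outputs minimizes cross-entropy on labels.
    P = stack_predictions(models, X_val)              # (n, m, c)
    onehot = np.eye(P.shape[2])[y_val]                # (n, c)
    logits = np.zeros(P.shape[1])                     # one logit per model
    for _ in range(steps):
        w = np.exp(logits) / np.exp(logits).sum()     # per-model weights
        mix = np.einsum("nmc,m->nc", P, w)            # ensemble output
        g_mix = -onehot / (mix + 1e-12) / len(y_val)  # dLoss/dmix
        g_w = np.einsum("nmc,nc->m", P, g_mix)        # dLoss/dweights
        logits -= lr * w * (g_w - g_w @ w)            # softmax backprop step
    return np.exp(logits) / np.exp(logits).sum()

def predict_categories(models, weights, X):
    # Claim 7: stack the "second predicted values", apply the adjuster, and
    # return the highest-scoring category per sample.
    P = stack_predictions(models, X)
    return np.einsum("nmc,m->nc", P, weights).argmax(axis=1)
```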
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310399396.0A | 2023-04-11 | 2023-04-11 | Model increment learning method, device, equipment and storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
CN116432780A (en) | 2023-07-14
Family
ID=87082843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310399396.0A | Model increment learning method, device, equipment and storage medium | 2023-04-11 | 2023-04-11
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116432780A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116959696A (en) * | 2023-09-20 | 2023-10-27 | 武汉光盾科技有限公司 | Data processing method and device based on laser therapeutic instrument |
CN116959696B (en) * | 2023-09-20 | 2023-12-08 | 武汉光盾科技有限公司 | Data processing method and device based on laser therapeutic instrument |
CN118090672A (en) * | 2024-04-25 | 2024-05-28 | 奥谱天成(厦门)光电有限公司 | Kiwi fruit feature detection method, device, medium and equipment |
Similar Documents
Publication | Title
---|---
CN116432780A (en) | Model increment learning method, device, equipment and storage medium
CN111260030B (en) | A-TCN-based power load prediction method and device, computer equipment and storage medium
CN111612134A (en) | Neural network structure searching method and device, electronic equipment and storage medium
KR20190124846A (en) | The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network
CN111506814A (en) | Sequence recommendation method based on variational self-attention network
US12005580B2 (en) | Method and device for controlling a robot
US11914672B2 (en) | Method of neural architecture search using continuous action reinforcement learning
US20190228297A1 (en) | Artificial Intelligence Modelling Engine
CN111611085A (en) | Man-machine hybrid enhanced intelligent system, method and device based on cloud edge collaboration
CN110490304A (en) | A kind of data processing method and equipment
CN118365099B (en) | Multi-AGV scheduling method, device, equipment and storage medium
CN113642652A (en) | Method, device and equipment for generating fusion model
JP2024532679A (en) | Evaluating output sequences using autoregressive language model neural networks
CN112215412A (en) | Dissolved oxygen prediction method and device
CN117744754B (en) | Large language model task processing method, device, equipment and medium
CN114037772A (en) | Training method of image generator, image generation method and device
WO2024012179A1 (en) | Model training method, target detection method and apparatuses
CN111445024A (en) | Medical image recognition training method
CN113052191A (en) | Training method, device, equipment and medium of neural language network model
CN114926701A (en) | Model training method, target detection method and related equipment
CN113657501A (en) | Model adaptive training method, apparatus, device, medium, and program product
CN111753519A (en) | Model training and recognition method and device, electronic equipment and storage medium
CN115186096A (en) | Recognition method, device, medium and electronic equipment for specific type word segmentation
CN113469204A (en) | Data processing method, device, equipment and computer storage medium
CN117540807A (en) | Model weight parameter storage method, device, equipment and storage medium
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination