CN114004315A - Method and device for incremental learning based on small sample - Google Patents

Method and device for incremental learning based on small sample

Info

Publication number
CN114004315A
Authority
CN
China
Prior art keywords
sample data
loss function
small sample
new model
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111653502.0A
Other languages
Chinese (zh)
Inventor
崔燕红
魏风顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Teddy Bear Mobile Technology Co ltd
Original Assignee
Beijing Teddy Bear Mobile Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Teddy Bear Mobile Technology Co ltd filed Critical Beijing Teddy Bear Mobile Technology Co ltd
Priority to CN202111653502.0A
Publication of CN114004315A
Priority to CN202210567778.5A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M 3/00: Automatic or semi-automatic exchanges
    • H04M 3/22: Arrangements for supervision, monitoring or testing
    • H04M 3/2281: Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/02: Knowledge representation; Symbolic representation
    • G06N 5/022: Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Feedback Control In General (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention discloses a method and a device for incremental learning based on a small sample, belonging to the technical field of artificial intelligence. The method comprises the following steps: first, performing incremental learning on first small sample data based on knowledge distillation to obtain a loss function; obtaining error sample data in historical training sample data based on the loss function; then performing feature clustering on the error sample data, and performing incremental processing on second small sample data based on the feature clustering result to obtain updated second small sample data; and finally, performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model. In this way, the original model can be incrementally learned from newly added small samples, which solves the prior-art problem that the original model cannot be iterated with newly added small sample data because historical training sample data has been lost, so that the original model is updated effectively.

Description

Method and device for incremental learning based on small sample
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for incremental learning based on a small sample.
Background
At present, supervised classification tasks commonly rely on deep learning algorithms: a model is trained on large-scale data and then used to predict the data to be tested, so as to improve the accuracy of the prediction results. Over time, however, new valid data that can serve as training samples is generated every day. Therefore, to improve the accuracy of the model, the original model is usually trained iteratively with newly added small sample data, so as to update the original model.
However, while the original model is in use, historical training sample data is often lost, so the original model cannot be iterated with newly added small sample data. There is therefore an urgent need for a method of incremental learning based on small samples that solves this problem and allows the original model to be updated.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for incremental learning based on a small sample, which can perform incremental learning on an original model based on a newly added small sample, and solve the problem in the prior art that the original model cannot be iterated by using newly added small sample data due to loss of historical training sample data, thereby effectively updating the original model.
To achieve the above object, according to a first aspect of the embodiments of the present invention, there is provided a method for incremental learning based on a small sample, the method including: performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function; obtaining error sample data in historical training sample data based on the loss function; performing characteristic clustering processing on the error sample data, and performing incremental processing on second small sample data based on a characteristic clustering result to obtain updated second small sample data; and performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
Optionally, the performing incremental processing on the second small sample data based on the feature clustering result to obtain updated second small sample data includes: marking a class label on the error sample data based on the characteristic clustering result to obtain the error sample data with the class label; adding error sample data with a category label to second small sample data; and updating the sampling weight of the error characteristic category in the added second small sample data to obtain the updated second small sample data.
Optionally, the incrementally learning the first small sample data based on knowledge distillation to obtain a loss function includes: training an original model based on first small sample data to generate a first new model, and obtaining a first loss function; training the first new model and the original model simultaneously based on the first small sample data to generate a second new model, and obtaining a second loss function; training the first new model based on the first small sample data to generate a third new model, and obtaining a third loss function; and determining the first loss function, the second loss function and the third loss function as the loss functions obtained by performing incremental learning on the first small sample data.
Optionally, the obtaining error sample data in historical training sample data based on the loss function includes: obtaining first error sample data in historical training sample data based on the first loss function; obtaining second error sample data in historical training sample data based on the second loss function; obtaining third error sample data in the historical training sample data based on a third loss function; determining the first, second and third error sample data as error sample data.
Optionally, the incremental learning is performed on the updated second small sample data based on knowledge distillation to obtain a final new model, including: training the original model based on the updated second small sample data to generate a fourth new model, and obtaining a fourth loss function; training the fourth new model and the original model simultaneously based on the updated second small sample data to generate a fifth new model, and obtaining a fifth loss function; training the fifth new model based on the updated second small sample data to obtain a sixth loss function; determining the fourth, fifth, and sixth loss functions as a total loss function; when the total loss function reaches a minimum, a final new model is obtained.
To achieve the above object, according to a second aspect of the embodiments of the present invention, there is also provided an apparatus for incremental learning based on a small sample, the apparatus including: the first learning module is used for performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function; the acquisition module is used for acquiring error sample data in historical training sample data based on the loss function; the increment module is used for carrying out characteristic clustering processing on the error sample data and carrying out incremental processing on second small sample data based on a characteristic clustering result to obtain updated second small sample data; and the second learning module is used for carrying out incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
Optionally, the increment module includes: the clustering unit is used for marking a class label on the error sample data based on the characteristic clustering result to obtain the error sample data with the class label; an adding unit, configured to add error sample data with a category label to the second small sample data; and the increment unit is used for updating the sampling weight of the error feature type in the added second small sample data to obtain the updated second small sample data.
Optionally, the first learning module includes: a first obtaining unit, configured to train an original model based on the first small sample data to generate a first new model, and obtain a first loss function; a second obtaining unit, configured to train the first new model and the original model simultaneously based on the first small sample data to generate a second new model, and obtain a second loss function; a third obtaining unit, configured to train the first new model based on the first small sample data to generate a third new model, and obtain a third loss function; a determining unit, configured to determine the first loss function, the second loss function, and the third loss function as the loss functions obtained by performing incremental learning on the first small sample data.
To achieve the above object, according to a third aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including: one or more processors; memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method for incremental learning based on small samples of the first aspect.
To achieve the above object, according to a fourth aspect of the embodiments of the present invention, there is further provided a computer-readable storage medium having a computer program stored therein, wherein the computer program, when executed by a processor, implements the method for incremental learning based on small samples according to the first aspect.
Compared with the prior art, the method and the device for incremental learning based on the small sample provided by the embodiments of the invention work as follows: first, incremental learning is performed on first small sample data based on knowledge distillation to obtain a loss function; error sample data in the historical training sample data is obtained based on the loss function; then feature clustering is performed on the error sample data, and incremental processing is performed on second small sample data based on the feature clustering result to obtain updated second small sample data; finally, incremental learning is performed on the updated second small sample data based on knowledge distillation to obtain a final new model. In this way, the original model can be incrementally learned from newly added small samples, which solves the prior-art problem that the original model cannot be iterated with newly added small sample data because historical training sample data has been lost, so that the original model is updated effectively.
It is to be understood that the teachings of the present invention need not achieve all of the above-described benefits, but rather that specific embodiments may achieve specific technical results, and that other embodiments of the present invention may achieve benefits not mentioned above.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein like or corresponding reference numerals designate like or corresponding parts throughout the several views.
FIG. 1 is a schematic flow chart of a method for incremental learning based on small samples in accordance with an embodiment of the present invention;
FIG. 2 is a schematic flow chart diagram of a method of incremental learning based on small samples according to another embodiment of the present invention;
FIG. 3 is a schematic block diagram of an apparatus for incremental learning based on small samples according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in fig. 1, a schematic flow chart of a method for incremental learning based on small samples according to an embodiment of the present invention is shown. A method for incremental learning based on small samples comprises the following steps: s101, performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function; s102, obtaining error sample data in historical training sample data based on a loss function; s103, performing characteristic clustering processing on the error sample data, and performing incremental processing on the second small sample data based on a characteristic clustering result to obtain updated second small sample data; and S104, performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
In S101 and S102, knowledge distillation refers to guiding the training of a student network by introducing soft targets produced by a teacher network as part of the total loss, thereby transferring knowledge. That is, a teacher network is trained first, and the student network is then trained with the teacher network's output q as its target, so that the student network's output p approaches q.
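As a concrete illustration of this soft-target idea (not taken from the patent), the sketch below combines a hard-label term with a KL-divergence term between softened teacher and student outputs; the temperature T, the weight alpha, and the function name are assumptions made for the example.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Pull the student's output p toward the teacher's output q while
    still fitting the true labels (knowledge migration via soft targets)."""
    # Hard-label term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-target term: KL divergence between temperature-softened p and q.
    p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    soft = F.kl_div(p, q, reduction="batchmean") * (T * T)
    return alpha * soft + (1.0 - alpha) * hard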
An original model can be trained based on the first small sample data to obtain a loss function; when the loss function tends to be minimum, generating a first new model; and then obtaining error sample data in the historical training sample data based on the loss function.
Or, an original model is first trained based on the first small sample data to obtain a first loss function; when the first loss function tends to its minimum, a first new model is generated. Second, the first new model and the original model are trained simultaneously based on the first small sample data to obtain a second loss function; when the second loss function tends to its minimum, a second new model is generated. Finally, error sample data in the historical training sample data is obtained based on the first loss function and the second loss function.
Or, an original model is first trained based on the first small sample data to obtain a first loss function; when the first loss function tends to its minimum, a first new model is generated. Then, the first new model and the original model are trained simultaneously based on the first small sample data to obtain a second loss function; when the second loss function tends to its minimum, a second new model is generated. Next, the first new model is trained based on the first small sample data to obtain a third loss function; when the third loss function tends to its minimum, a third new model is generated. Finally, error sample data in the historical training sample data is obtained based on the first loss function, the second loss function, and the third loss function.
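The three training stages just described can be written schematically as follows; train_until_loss_min is a hypothetical helper (not defined in the patent) that trains a model on the data, optionally distilling from a teacher, until its loss stops decreasing, and reading the "simultaneous" training as distillation from the original model is an assumption of this sketch.

import copy

def three_stage_incremental_learning(original_model, first_small_sample, train_until_loss_min):
    # Stage 1: fine-tune a copy of the original model on the new small sample
    # to get the first new model and the first loss function.
    model_1, loss_1 = train_until_loss_min(copy.deepcopy(original_model), first_small_sample)
    # Stage 2: train the first new model together with the original model
    # (read here as distillation from the original model) to get the second
    # new model and the second loss function.
    model_2, loss_2 = train_until_loss_min(copy.deepcopy(model_1), first_small_sample,
                                           teacher=original_model)
    # Stage 3: continue training the first new model on the small sample alone
    # to get the third new model and the third loss function.
    model_3, loss_3 = train_until_loss_min(copy.deepcopy(model_1), first_small_sample)
    return (model_1, model_2, model_3), (loss_1, loss_2, loss_3)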
In S103, the second small sample data may be subjected to incremental processing by adding the error sample data to the second small sample data, so as to obtain updated second small sample data.
Or, feature clustering is performed on the error sample data, and error sample data of the corresponding category is added to the second small sample data based on the categories in the clustering result, to obtain the updated second small sample data. Here, the error sample data added to the second small sample data may or may not be taken from the historical training sample data, but its features must be the same as, or similar to, those of the error sample data in the historical training sample data.
Or, the proportions of the different categories of error sample data in the clustering result are counted, and error sample data of each category is added to the second small sample data according to those proportions, to obtain the updated second small sample data.
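A minimal sketch of this ratio-based increment, assuming the error samples have already been grouped by cluster (the variable names and the total_to_add budget are illustrative, not from the patent):

import random

def augment_by_cluster_ratio(second_small_sample, error_samples_by_cluster, total_to_add):
    """Add error samples to the second small sample in proportion to how
    often each error category appears in the clustering result."""
    total_errors = sum(len(v) for v in error_samples_by_cluster.values())
    augmented = list(second_small_sample)
    for cluster_id, samples in error_samples_by_cluster.items():
        # Share of the addition budget given to this error category.
        n_add = round(total_to_add * len(samples) / total_errors)
        augmented.extend(random.choices(samples, k=n_add))
    return augmented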
In S104, the process of incrementally learning the updated second small sample data based on knowledge distillation is similar to the process of incrementally learning the first small sample data.
Specifically, training an original model based on the updated second small sample data to obtain a fourth loss function; when the fourth loss function tends to be minimal, a final new model is generated.
Or, the original model is first trained based on the updated second small sample data to obtain a fourth loss function; when the fourth loss function tends to its minimum, a fourth new model is generated; then the fourth new model and the original model are trained simultaneously based on the updated second small sample data to obtain a fifth loss function; when the fifth loss function tends to its minimum, the final new model is generated.
Or, the original model is first trained based on the updated second small sample data to generate a fourth new model and obtain a fourth loss function; then the fourth new model and the original model are trained simultaneously based on the updated second small sample data to generate a fifth new model and obtain a fifth loss function; then the fifth new model is trained based on the updated second small sample data to obtain a sixth loss function; finally, the fourth, fifth, and sixth loss functions are determined as the total loss function, and when the total loss function reaches its minimum, the final new model is obtained.
Here, the first small sample data and the second small sample data are both new training sample data.
The method comprises the steps of firstly, performing incremental learning on first small sample data based on knowledge distillation to obtain a loss function; obtaining error sample data in historical training sample data based on the loss function; then, performing incremental processing on the second small sample data based on the error sample data to obtain updated second small sample data; and finally, performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model. Therefore, incremental learning can be performed on the original model based on the newly added small samples, and the problem that the original model cannot be iterated by using the newly added small sample data due to the loss of historical training sample data in the prior art is solved, so that the original model is effectively updated.
As shown in fig. 2, another embodiment of the present invention is a schematic flow chart of a method for incremental learning based on small samples. This embodiment is further optimized based on the embodiment of fig. 1. The method for incremental learning based on the small sample comprises the following steps: S201, training an original model based on first small sample data to generate a first new model, and obtaining a first loss function; S202, training the first new model and the original model simultaneously based on the first small sample data to generate a second new model, and obtaining a second loss function; S203, training the first new model based on the first small sample data to generate a third new model, and obtaining a third loss function; S204, determining the first loss function, the second loss function and the third loss function as the loss functions obtained by performing incremental learning on the first small sample data; S205, obtaining error sample data in historical training sample data based on the loss functions; S206, performing feature clustering on the error sample data to obtain a feature clustering result, and marking the error sample data with category labels based on the feature clustering result to obtain error sample data with category labels; S207, adding the error sample data with category labels to the second small sample data; S208, updating the sampling weights of the error feature categories in the augmented second small sample data to obtain the updated second small sample data; and S209, performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
In S201 to S204, a first new model is generated when the first loss function tends to be minimum; generating a second new model when the second loss function tends to be minimal; a third new model is generated when the third loss function tends to be minimal.
In S205, obtaining first error sample data in the historical training sample data based on the first loss function; obtaining second error sample data in the historical training sample data based on the second loss function; obtaining third error sample data in the historical training sample data based on a third loss function; the first, second and third error sample data are determined to be error sample data. Specifically, when the first loss function tends to be minimum, obtaining first error sample data in historical training sample data; when the second loss function tends to be minimum, second error sample data in the historical training sample data is obtained; and when the third loss function tends to be minimum, obtaining third error sample data in the historical training sample data.
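One simple way to realize "obtaining error sample data based on a loss function" is to score whatever historical samples remain with the trained model and keep the ones it still gets wrong; the sketch below assumes a scikit-learn-style classifier and uses a 0/1 misclassification criterion, which is an interpretation rather than the patent's exact rule.

import numpy as np

def find_error_samples(model, X_hist, y_hist):
    """Return the historical samples the trained model still misclassifies,
    treating misclassification as a simple per-sample loss signal."""
    X_hist = np.asarray(X_hist)
    y_hist = np.asarray(y_hist)
    y_pred = np.asarray(model.predict(X_hist))
    error_mask = y_pred != y_hist
    return X_hist[error_mask], y_hist[error_mask]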
In S206 to S208, feature clustering is performed on the error sample data to obtain a feature clustering result; the different categories of error sample data in the feature clustering result are marked with category labels to obtain error sample data with category labels; the number of error sample data corresponding to each category label is counted to obtain the proportions of the different category labels among the error sample data; the error sample data with category labels is then added to the second small sample data according to those proportions, category by category, to obtain the updated second small sample data, thereby improving the iteration precision of the original model.
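Steps S206 to S208 could be sketched with k-means feature clustering followed by a per-category count; the number of clusters k and the way weights are derived from the proportions are assumptions made for this example.

from collections import Counter
from sklearn.cluster import KMeans

def cluster_and_weight_errors(error_features, k=3):
    """Cluster the error samples, attach a category label to each one, and
    derive per-category sampling weights from the cluster proportions."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(error_features)
    counts = Counter(labels)
    total = sum(counts.values())
    # More frequent error categories receive larger sampling weights.
    weights = {category: count / total for category, count in counts.items()}
    return labels, weights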
In S209, the original model is trained based on the updated second small sample data to generate a fourth new model, and a fourth loss function is obtained; the fourth new model and the original model are trained simultaneously based on the updated second small sample data to generate a fifth new model, and a fifth loss function is obtained; the fifth new model is trained based on the updated second small sample data to obtain a sixth loss function; the fourth, fifth, and sixth loss functions are determined as the total loss function; when the total loss function reaches its minimum, the final new model is obtained.
In this embodiment, the original model is distilled and incrementally learned with small samples; the misclassified samples are then grouped by feature clustering, and samples whose features are similar to those groups are added in greater numbers to the small sample set used for iterative training. In this way the model undergoes incremental learning with parameter updates, and the training precision is continuously improved.
It should be understood that, in the embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
The method for incremental learning based on small samples will be further described with reference to specific scenarios.
For the case where the historical sample data used to train the original model S_old is incomplete (lost), a method for incremental learning based on small samples proceeds as follows:
(1) Train the original model S_old with the first small sample data D_label_new, fine-tuning the parameters of the original model; when the loss function Loss_function = abs|Y - Y_on'| reaches its minimum, the first new model is generated. Here Y is the true value and Y_on' is the prediction output by the first new model during training.
(2) Train the first new model and the original model simultaneously with the first small sample data; when the loss function Loss_function = abs|Y_on - Y_nn| reaches its minimum, the parameters of the original model and the first new model are adjusted and the second new model is generated. Here Y_on is the prediction output by the original model during training, and Y_nn is the prediction output by the first new model during training.
(3) Train the first new model with the first small sample data; when the loss function Loss_function = abs|Y - Y_nn'| reaches its minimum, the parameters of the first new model are adjusted and the third new model is generated. Here Y_nn' is the prediction output by the first new model during training.
(4) Obtain error sample data in the historical training data based on the loss functions Loss_function = abs|Y - Y_on'|, Loss_function = abs|Y_on - Y_nn|, and Loss_function = abs|Y - Y_nn'|. Perform feature clustering on the error sample data to obtain first-category, second-category, and third-category error sample data; mark the three categories with category labels and count the number of error samples under each label to obtain the proportions of the three categories; then add error sample data with category labels to the second small sample data, adding the error sample data of each error feature category according to its label and proportion, to obtain the updated second small sample data. In this way the sampling weight of data whose features resemble the error sample data is enhanced in the updated second small sample data.
(5) Repeat steps (1) to (4) iteratively until the total loss function = abs|Y - Y_on'| + abs|Y_on - Y_nn| + abs|Y - Y_nn'| is minimized, generating the final new model.
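Written as code, the total loss minimized in step (5) is just the sum of the three absolute-error terms; the arrays below are stand-ins for the quantities Y, Y_on, Y_on', Y_nn, and Y_nn' named above.

import numpy as np

def total_loss(Y, Y_on, Y_on_prime, Y_nn, Y_nn_prime):
    """Total loss from step (5): abs|Y - Y_on'| + abs|Y_on - Y_nn| + abs|Y - Y_nn'|."""
    return (np.abs(Y - Y_on_prime).sum()
            + np.abs(Y_on - Y_nn).sum()
            + np.abs(Y - Y_nn_prime).sum())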
The invention clusters the samples misclassified by the new model to identify error sample data, and uses the clustering to add more of the error-prone samples to the second small sample. In this way, the data that tends to produce errors is refreshed in the small sample data while the training-data characteristics of the original model are preserved, further improving the accuracy of training the new model.
The innovation of the method lies in combining knowledge distillation with small-sample updating: during iteration, the small sample is used to pre-judge the error sample data in the historical training data, and feature clustering is performed on that error sample data, so that the sampling weight of each error feature category in the small sample is updated for the next round of training, which is similar to the error-correction idea of XGBoost.
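The sampling-weight update could be realized, for example, with a weighted sampler so that categories that previously caused errors are drawn more often in the next round of training; the PyTorch sampler shown here is one possible implementation, and the argument names are illustrative rather than taken from the patent.

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def build_weighted_loader(dataset, sample_categories, category_weights, batch_size=32):
    """Give every training sample the weight of its error-feature category,
    so samples from frequent error categories are drawn more often."""
    per_sample_weights = torch.tensor(
        [category_weights.get(c, 1.0) for c in sample_categories], dtype=torch.double)
    sampler = WeightedRandomSampler(per_sample_weights,
                                    num_samples=len(per_sample_weights),
                                    replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)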
As shown in fig. 3, a schematic block diagram of an apparatus for incremental learning based on small samples according to an embodiment of the present invention. An apparatus for incremental learning based on small samples, the apparatus 300 comprising: the first learning module 301 is configured to perform incremental learning on the first small sample data based on knowledge distillation to obtain a loss function; an obtaining module 302, configured to obtain error sample data in historical training sample data based on the loss function; an increment module 303, configured to perform feature clustering on the error sample data, and perform increment processing on the second small sample data based on a feature clustering result to obtain updated second small sample data; and the second learning module 304 is configured to perform incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
In an alternative embodiment, the increment module 303 includes: the clustering unit is used for marking a class label on the error sample data based on the characteristic clustering result to obtain the error sample data with the class label; an adding unit, configured to add error sample data with a category label to the second small sample data; and the increment unit is used for updating the sampling weight of the error feature type in the added second small sample data to obtain the updated second small sample data.
In an alternative embodiment, the first learning module 301 comprises: the first obtaining unit is used for training an original model based on first small sample data to generate a first new model and obtaining a first loss function; a second obtaining unit, configured to train the first new model and the original new model simultaneously based on the first small sample data to generate a second new model, and obtain a second loss function; a third obtaining unit, configured to train the first new model based on the first small sample data to generate a third new model, and obtain a third loss function; a determining unit, configured to determine the first loss function, the second loss function, and the third loss function as a loss function obtained by performing incremental learning on first small sample data.
In an alternative embodiment, the obtaining module 302 includes: a first obtaining unit, configured to obtain first error sample data in historical training sample data based on the first loss function; a second obtaining unit, configured to obtain second error sample data in historical training sample data based on the second loss function; a third obtaining unit, configured to obtain third error sample data in the historical training sample data based on a third loss function; a determining unit, configured to determine the first, second, and third error sample data as error sample data.
In an alternative embodiment, the second learning module 304 includes: the first obtaining unit is used for training an original model based on the updated second small sample data to generate a fourth new model and obtain a fourth loss function; a second obtaining unit, configured to train the fourth new model and the original new model simultaneously based on the updated second small sample data to generate a fifth new model, and obtain a fifth loss function; a third obtaining unit, configured to train the fifth new model based on the updated second small sample data, and obtain a sixth loss function; a determining unit configured to determine the fourth loss function, the fifth loss function, and the sixth loss function as a total loss function; when the total loss function reaches a minimum, a final new model is obtained.
The above device can execute the method for incremental learning based on small samples provided by the embodiments of the present invention, and has the corresponding functional modules and beneficial effects. For technical details not described in detail in this embodiment, reference may be made to the method for incremental learning based on small samples provided in the embodiments of the present invention.
According to still another embodiment of the present invention, there is also provided an electronic apparatus including: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the method for incremental learning based on small samples provided by the above-mentioned embodiment of the present invention.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform at least the following: s101, performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function; s102, obtaining error sample data in historical training sample data based on a loss function; s103, performing incremental processing on the second small sample data based on the error sample data to obtain updated second small sample data; and S104, performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A method for incremental learning based on small samples, comprising:
performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function;
obtaining error sample data in historical training sample data based on the loss function;
performing characteristic clustering processing on the error sample data, and performing incremental processing on second small sample data based on a characteristic clustering result to obtain updated second small sample data;
and performing incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
2. The method of claim 1, wherein performing incremental processing on the second small sample data based on the feature clustering result to obtain updated second small sample data comprises:
marking a class label on the error sample data based on the characteristic clustering result to obtain the error sample data with the class label;
adding error sample data with a category label to second small sample data;
and updating the sampling weight of the error characteristic category in the added second small sample data to obtain the updated second small sample data.
3. The method of claim 1, wherein incrementally learning the first small sample data based on knowledge distillation to obtain a loss function comprises:
training an original model based on first small sample data to generate a first new model, and obtaining a first loss function;
training the first new model and the original model simultaneously based on the first small sample data to generate a second new model, and obtaining a second loss function;
training the first new model based on the first small sample data to generate a third new model, and obtaining a third loss function;
and determining the first loss function, the second loss function and the third loss function as the loss functions obtained by performing incremental learning on the first small sample data.
4. The method of claim 3, wherein obtaining error sample data in historical training sample data based on the loss function comprises:
obtaining first error sample data in historical training sample data based on the first loss function;
obtaining second error sample data in historical training sample data based on the second loss function;
obtaining third error sample data in the historical training sample data based on a third loss function;
determining the first, second and third error sample data as error sample data.
5. The method of claim 1 or 2, wherein the incremental learning of the updated second small sample data based on knowledge distillation to obtain a final new model comprises:
training the original model based on the updated second small sample data to generate a fourth new model, and obtaining a fourth loss function;
training the fourth new model and the original model simultaneously based on the updated second small sample data to generate a fifth new model, and obtaining a fifth loss function;
training the fifth new model based on the updated second small sample data to obtain a sixth loss function;
determining the fourth, fifth, and sixth loss functions as a total loss function;
when the total loss function reaches a minimum, a final new model is obtained.
6. An apparatus for incremental learning based on small samples, comprising:
the first learning module is used for performing incremental learning on the first small sample data based on knowledge distillation to obtain a loss function;
the acquisition module is used for acquiring error sample data in historical training sample data based on the loss function;
the increment module is used for carrying out characteristic clustering processing on the error sample data and carrying out incremental processing on second small sample data based on a characteristic clustering result to obtain updated second small sample data;
and the second learning module is used for carrying out incremental learning on the updated second small sample data based on knowledge distillation to obtain a final new model.
7. The apparatus of claim 6, wherein the increment module comprises:
the clustering unit is used for marking a class label on the error sample data based on the characteristic clustering result to obtain the error sample data with the class label;
an adding unit, configured to add error sample data with a category label to the second small sample data;
and the increment unit is used for updating the sampling weight of the error feature type in the added second small sample data to obtain the updated second small sample data.
8. The apparatus of claim 6, wherein the first learning module comprises:
the first obtaining unit is used for training an original model based on first small sample data to generate a first new model and obtaining a first loss function;
a second obtaining unit, configured to train the first new model and the original model simultaneously based on the first small sample data to generate a second new model, and obtain a second loss function;
a third obtaining unit, configured to train the first new model based on the first small sample data to generate a third new model, and obtain a third loss function;
a determining unit, configured to determine the first loss function, the second loss function, and the third loss function as a loss function obtained by performing incremental learning on first small sample data.
9. An electronic device, comprising:
one or more processors;
memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
Application CN202111653502.0A, priority date 2021-12-31, filing date 2021-12-31: Method and device for incremental learning based on small sample. Status: Pending. Published as CN114004315A.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111653502.0A CN114004315A (en) 2021-12-31 2021-12-31 Method and device for incremental learning based on small sample
CN202210567778.5A CN114785890A (en) 2021-12-31 2022-05-24 Crank call identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111653502.0A CN114004315A (en) 2021-12-31 2021-12-31 Method and device for incremental learning based on small sample

Publications (1)

Publication Number Publication Date
CN114004315A (en) 2022-02-01

Family

ID=79932414

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111653502.0A Pending CN114004315A (en) 2021-12-31 2021-12-31 Method and device for incremental learning based on small sample
CN202210567778.5A Pending CN114785890A (en) 2021-12-31 2022-05-24 Crank call identification method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210567778.5A Pending CN114785890A (en) 2021-12-31 2022-05-24 Crank call identification method and device

Country Status (1)

Country Link
CN (2) CN114004315A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785890A (en) * 2021-12-31 2022-07-22 北京泰迪熊移动科技有限公司 Crank call identification method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298391B * 2019-06-12 2023-05-02 Tongji University Iterative incremental dialogue intention type recognition method based on small sample
CN111767711B * 2020-09-02 2020-12-08 Zhejiang Lab Compression method and platform of pre-training language model based on knowledge distillation
CN113066025B * 2021-03-23 2022-11-18 Henan Polytechnic University Image defogging method based on incremental learning and feature and attention transfer
CN113344144A * 2021-07-29 2021-09-03 National University of Defense Technology Semi-supervised small sample class increment learning method and device and classification identification method
CN114004315A (en) * 2021-12-31 2022-02-01 Beijing Teddy Bear Mobile Technology Co ltd Method and device for incremental learning based on small sample

Also Published As

Publication number Publication date
CN114785890A (en) 2022-07-22

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WD01: Invention patent application deemed withdrawn after publication
Application publication date: 20220201