CN112734038A - Training method, medium, device and computing equipment for small sample continuous learning model
- Publication number
- CN112734038A (application number CN202110077164.4A)
- Authority
- CN
- China
- Prior art keywords
- weight
- learning model
- training
- continuous learning
- slow
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The embodiment of the invention provides a training method, medium, device and computing equipment for a small sample continuous learning model. The small sample continuous learning model comprises a slow weight and a fast weight, and the method comprises the following steps: calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task; calculating a mapping value based on the current activity mark and an accumulated activity mark stored by a previous task; copying the slow weight to the fast weight, and updating the fast weight through a classification loss function; and updating parameters in the slow weight through the updated fast weight, so that training of the small sample continuous learning model is realized based on the updated fast weight and the updated slow weight. The method can balance the generalization ability of small sample learning and the fitting ability of continuous learning in the same model, improving the processing performance of the small sample continuous learning model on the whole task sequence.
Description
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a training method, a medium, a device and computing equipment for a small sample continuous learning model.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Small sample learning and continuous learning are two important tasks in the scientific and engineering fields. The aim of small sample learning is to learn a new task from a small number of training samples, while the aim of continuous learning is to prevent a neural network from catastrophically forgetting the old tasks it has already learned when learning a new task, thereby improving its ability to learn incremental tasks.
However, in practice it has been found that continuous learning attempts to memorize input tasks accurately in order to mitigate catastrophic forgetting, which inevitably over-fits the tasks already seen and interferes with the generalization ability required for small sample learning, often reducing the learning ability of small sample learning. The goals of small sample learning and continuous learning are therefore difficult to achieve simultaneously, which may affect the processing performance on the entire task sequence.
Disclosure of Invention
In this context, embodiments of the present invention desirably provide a training method of a small sample continuous learning model, the small sample continuous learning model including a slow weight and a fast weight, the method including:
calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task;
calculating a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, the mapping value being used to constrain updating of the fast weight;
copying the slow weight to the fast weight, and updating the fast weight through a classification loss function;
and updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
In an example of this embodiment, before calculating the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task, the method further includes:
acquiring a preset support data set;
and pre-training based on the support data set to obtain a slow weight.
In one example of this embodiment, the small sample continuous learning model includes a feature embedding layer and an output layer, the slow weights are composed of parameters in the feature embedding layer and parameters in the output layer, and the fast weights are composed of parameters in the feature embedding layer and parameters in the output layer.
In an example of this embodiment, calculating, based on a training data set corresponding to a current task, a current activity flag corresponding to the slow weight includes:
acquiring a training data set corresponding to a current task;
calculating the training data set to obtain expected modular lengths of parameter gradients in each layer of the slow-weight feature embedding layer;
and determining the modular length as the current active mark corresponding to the slow weight.
In an example of this embodiment, calculating a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task includes:
acquiring an accumulated activity mark corresponding to a previously stored historical task;
updating the cumulative activity flag with the current activity flag;
calculating the difference of the updated average values of the accumulated activity mark and the current activity mark;
and mapping the difference of the mean values through a Sigmoid function to obtain a mapping value.
In an embodiment of the present invention, copying the slow weight to the fast weight, and updating the fast weight by a classification loss function includes:
acquiring a historical training data set corresponding to a historical task;
adding the historical training data set to a training data set corresponding to the current task to obtain a replay data set;
copying the slow weight to the fast weight;
and updating the fast weight through a classification loss function by taking the replay data set, the slow weight and the mapping value as a basis, wherein the classification loss function is a loss function in the small sample continuous learning model.
In an example of this embodiment, after the parameters in the slow weight are updated by the updated fast weight and the training of the small sample continuous learning model is implemented based on the updated fast weight and the updated slow weight, the method further includes:
when detecting that the current task has a corresponding next task, determining the next task as the current task, and executing the step of acquiring the training data set corresponding to the current task;
and when detecting that the current task does not have a corresponding next task, determining that the training of the small sample continuous learning model is finished, and verifying the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
In a second aspect of the embodiments of the present invention, there is provided a training apparatus for a small sample continuous learning model, the small sample continuous learning model including a slow weight and a fast weight, the apparatus including:
the first calculating unit is used for calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task;
a second calculation unit, configured to calculate a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, where the mapping value is used to constrain updating of the fast weight;
the copying unit is used for copying the slow weight to the fast weight and updating the fast weight through a classification loss function;
and the updating unit is used for updating the parameters in the slow weight through the updated fast weight and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
In one embodiment of this embodiment, the apparatus further comprises:
the acquisition unit is used for acquiring a preset support data set before the first calculation unit calculates the current activity mark corresponding to the slow weight based on the training data set corresponding to the current task;
and the training unit is used for pre-training based on the support data set to obtain the slow weight.
In one example of this embodiment, the small sample continuous learning model includes a feature embedding layer and an output layer, the slow weights are composed of parameters in the feature embedding layer and parameters in the output layer, and the fast weights are composed of parameters in the feature embedding layer and parameters in the output layer.
In one embodiment of this embodiment, the first calculation unit includes:
the first acquisition subunit is used for acquiring a training data set corresponding to the current task;
the first calculating subunit is used for calculating the training data set to obtain expected modular lengths of parameter gradients in each layer of the slow-weight feature embedding layer;
and the determining subunit is configured to determine that the modular length is the current active flag corresponding to the slow weight.
In one embodiment of this embodiment, the second calculation unit includes:
the second acquisition subunit is used for acquiring the accumulated activity mark corresponding to the previously stored historical task;
a first updating subunit, configured to update the accumulated activity flag with the current activity flag;
the second calculating subunit is configured to calculate a difference between the updated accumulated activity flag and the average of the current activity flag;
and the mapping subunit is used for mapping the difference of the mean values through a Sigmoid function to obtain a mapping value.
In one embodiment of this embodiment, the copy unit includes:
the third acquisition subunit is used for acquiring a historical training data set corresponding to the historical task;
the adding subunit is used for adding the historical training data set to the training data set corresponding to the current task to obtain a replay data set;
a replicon unit to replicate the slow weight to the fast weight;
and the second updating subunit is used for updating the fast weight through a classification loss function by taking the replay data set, the slow weight and the mapping value as a basis, wherein the classification loss function is a loss function in the small sample continuous learning model.
In one embodiment of this embodiment, the apparatus further comprises:
the determining unit is used for, after the updating unit updates the parameters in the slow weight through the updated fast weight and realizes the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight, determining a next task as the current task when detecting that the current task has a corresponding next task, and controlling the first obtaining subunit to obtain a training data set corresponding to the current task;
and the verification unit is used for determining that the training of the small sample continuous learning model is finished when the current task is detected to have no corresponding next task, and verifying the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
In a third aspect of embodiments of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the embodiments of the first aspect.
In a fourth aspect of embodiments of the present invention, there is provided a computing device comprising the storage medium of the third aspect.
According to the training method, medium, device and computing equipment of the small sample continuous learning model, a slow weight and a fast weight can be set in the small sample continuous learning model, a mapping value used for constraining the fast weight can be calculated from the activity mark related to the slow weight, the fast weight is updated through the mapping value and the slow weight, and the slow weight is then updated through the updated fast weight, so that training of the small sample continuous learning model is realized. In this way the generalization capability of small sample learning and the fitting capability of continuous learning are balanced in the same model, and the processing performance of the small sample continuous learning model on the whole task sequence is improved.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 is a schematic flowchart of a training method of a small sample continuous learning model according to an embodiment of the present invention;
FIG. 2a is an example picture contained in a miniImageNet dataset;
FIG. 2b is an example picture of the Omniglot dataset;
FIG. 3 is a schematic flowchart of a training method for a small sample continuous learning model according to another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a training apparatus for a small sample continuous learning model according to an embodiment of the present invention;
FIG. 5 schematically shows the structure of a medium according to an embodiment of the invention;
fig. 6 schematically shows a structural diagram of a computing device according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the invention, a training method, a medium, a device and a computing device for a small sample continuous learning model are provided.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Exemplary method
Referring to fig. 1, fig. 1 is a schematic flowchart of a training method of a small sample continuous learning model according to an embodiment of the present invention. It should be noted that the embodiments of the present invention can be applied to any applicable scenarios.
Fig. 1 shows a flow 100 of a training method for a small sample continuous learning model according to an embodiment of the present invention, which includes:
step S101, calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task; the small sample continuous learning model comprises a slow weight and a fast weight;
step S102, calculating a mapping value based on the current activity mark and an accumulated activity mark stored by a previous task, wherein the mapping value is used for restricting the updating of the fast weight;
step S103, copying the slow weight to the fast weight, and updating the fast weight through a classification loss function;
and step S104, updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
The model processing method provided by this application is directed to a small sample continuous learning model constructed with artificial intelligence technology represented by artificial neural networks; the model learns new tasks from a small number of samples and learns incremental tasks without forgetting the old tasks it has already learned.
According to the invention, the slow weight and the fast weight can be set in the small sample continuous learning model, the mapping value for constraining the fast weight can be obtained by calculation from the activity mark related to the slow weight, the fast weight is updated through the mapping value and the slow weight, and the slow weight is then updated through the updated fast weight, so that training of the small sample continuous learning model is realized, the generalization capability of small sample learning and the fitting capability of continuous learning are balanced in the same model, and the processing performance of the small sample continuous learning model on the whole task sequence is improved.
The following describes how to balance generalization ability of small sample learning and fitting ability of continuous learning in the same model and improve the processing performance of the small sample continuous learning model on the whole task sequence with reference to the accompanying drawings:
in an embodiment of the present invention, the small sample continuous learning model includes a feature embedding layer and an output layer, the slow weight is composed of parameters in the feature embedding layer and parameters in the output layer, and the fast weight is composed of parameters in the feature embedding layer and parameters in the output layer. Therefore, the slow weight and the fast weight can be obtained through the parameters in the feature embedding layer and the parameters in the output layer in the small sample continuous learning model, and the correlation between the small sample continuous learning model and the slow weight and the fast weight is ensured.
In the embodiment of the invention, in order for the small sample continuous learning model to realize small sample learning and continuous learning in the same model at the same time, the small sample continuous learning model can be trained through a task sequence consisting of a plurality of tasks, where the tasks can be different from one another and each corresponds to a training data set with a small number of samples. The current task can be any task in the task sequence, and its training data set can contain a small number of training samples; since the slow weight assists in achieving small sample learning, the current activity mark corresponding to the slow weight can be obtained by computation on the training data set corresponding to the current task.
Further, the accumulated activity flag stored in the previous task before the current task may be obtained, and a mapping value may be calculated based on the accumulated activity flag and the current activity flag, where the mapping value may limit the updating of the fast weight, and prevent the small sample continuous learning model from being over-fitted.
In the embodiment of the invention, the slow weight can be copied to the fast weight, and the fast weight can be updated based on the slow weight and the mapping value through the classification loss function, so that the updated fast weight can be used as a basis when a new task is learned, the classification loss function can be a loss function in the small sample continuous learning model, and the loss function of the small sample continuous learning model can be subjected to gradient calculation through the calculated loss result and the mapping value, so that the parameter of the fast weight is updated.
In addition, the slow weight can be updated again through the updated fast weight so as to improve the generalization capability of the small sample continuous learning model to subsequent tasks and training samples, and the training of the small sample continuous learning model can be realized based on the updated slow weight parameter and fast weight parameter.
As an optional implementation manner, in step S101, based on the training data set corresponding to the current task, a manner of calculating the current activity flag corresponding to the slow weight may specifically include the following steps:
acquiring a training data set corresponding to a current task;
calculating the training data set to obtain expected modular lengths of parameter gradients in each layer of the slow-weight feature embedding layer;
and determining the modular length as the current active mark corresponding to the slow weight.
By implementing the implementation mode, the current activity mark corresponding to the slow weight can be obtained by calculating the training data set corresponding to the current task, so that the accuracy of calculating the current activity mark is improved.
In the embodiment of the invention, the current task may be denoted t, and the training data set corresponding to the current task t may be denoted Dt. The slow weight contained in the small sample continuous learning model may be denoted θs and the fast weight θf. The feature embedding layer in the small sample continuous learning model may be denoted e and the output layer may be denoted o, so that the slow weight composed of the parameters in the feature embedding layer e and the parameters in the output layer o may be θs = {θs^e, θs^o}, and the fast weight composed of the parameters in the feature embedding layer e and the parameters in the output layer o may be θf = {θf^e, θf^o}.
For the slow-weight feature embedding layer, each layer may be denoted l and each parameter in a layer may be denoted i. By computing on the training data set Dt, the expected modular length of each parameter gradient in each layer of the slow-weight feature embedding layer can be obtained, and this modular length can be determined as the current activity flag A_{l,i}^t corresponding to the slow weight, whose expression may be:
A_{l,i}^t = E_{(x,y)∈Dt} [ |∇_{θs,l,i} L(x, y; θs)| ]
where L is the classification loss function of the small sample continuous learning model.
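By way of illustration only, a minimal PyTorch-style sketch of this computation is given below. It assumes the slow-weight feature embedding network and the output layer are separate modules, interprets the expected modular length as the per-parameter gradient magnitude averaged over the batches of Dt, and uses illustrative names (compute_activity_flags is not a term from this patent).

```python
import torch
import torch.nn.functional as F

def compute_activity_flags(embedding, output_layer, loader):
    """Current activity flags: expected magnitude of each slow-weight gradient
    in the feature embedding layers, estimated on the current task's data Dt."""
    flags = {n: torch.zeros_like(p) for n, p in embedding.named_parameters()}
    num_batches = 0
    for x, y in loader:
        embedding.zero_grad()
        output_layer.zero_grad()
        loss = F.cross_entropy(output_layer(embedding(x)), y)
        loss.backward()
        for n, p in embedding.named_parameters():
            if p.grad is not None:
                flags[n] += p.grad.abs()
        num_batches += 1
    # Average over batches to approximate the expectation over Dt.
    return {n: f / max(num_batches, 1) for n, f in flags.items()}
```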
As an optional implementation manner, in step S102, the manner of calculating the mapping value based on the current activity flag and the accumulated activity flag stored by the previous task may specifically include the following steps:
acquiring an accumulated activity mark corresponding to a previously stored historical task;
updating the cumulative activity flag with the current activity flag;
calculating the difference of the updated average values of the accumulated activity mark and the current activity mark;
and mapping the difference of the mean values through a Sigmoid function to obtain a mapping value.
By implementing the implementation mode, the accumulated activity mark corresponding to the historical task can be updated through the current activity mark, and the mapping value can be obtained through the current activity mark and the accumulated activity mark, so that the obtained mapping value is more accurate.
According to the embodiment of the invention, the historical accumulated activity flag Ā_{l,i}^{t-1} corresponding to the previously stored historical tasks can be acquired, and the accumulated activity flag may be updated in a cumulative manner based on the current activity flag, i.e. Ā_{l,i}^t = Ā_{l,i}^{t-1} + A_{l,i}^t. Then, by comparing the updated accumulated activity flag with the current activity flag, the difference of their mean values over each layer can be calculated, and this difference can be mapped into the (0, 1) interval through a Sigmoid function to obtain the mapping value c_l^t, whose expression may be:
c_l^t = Sigmoid( (m / N_l) · Σ_i ( Ā_{l,i}^t − A_{l,i}^t ) )
where m may be a positive scale hyperparameter and N_l may be the number of parameters in layer l.
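Continuing the same illustrative sketch, the accumulation of activity flags and the Sigmoid mapping could be written as follows; the layer-wise mean and the ordering of the difference follow the description above, while the helper names and the default value of m are assumptions.

```python
import torch

def accumulate_flags(accumulated, current):
    """Add the current activity flags into the stored accumulated flags."""
    if accumulated is None:
        return {n: f.clone() for n, f in current.items()}
    return {n: accumulated[n] + current[n] for n in current}

def mapping_values(accumulated, current, m=1.0):
    """Map the per-layer difference of mean activity flags into (0, 1)
    with a Sigmoid; m is the positive scale hyperparameter."""
    return {
        n: torch.sigmoid(m * (accumulated[n].mean() - current[n].mean()))
        for n in current
    }
```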
In the embodiment of the invention, in order to realize small sample learning and continuous learning in the same model, the small sample continuous learning model can be provided with a fast-and-slow-weight mechanism and a two-step consolidation mechanism. The fast-and-slow-weight mechanism is as follows: small sample learning can be assisted through the slow weight, and continuous learning can be assisted through the fast weight. The two-step consolidation mechanism is as follows: the variation of the fast weights in the feature embedding layer can be limited according to the accumulated expected gradient size to mitigate overfitting, and the mapping value calculated through the Sigmoid function can be used to constrain the fast weights so as to avoid under-fitting.
For example, the small sample continuous learning model may be an image classification model for classifying images, and based on the small sample learning and continuous learning capabilities, the small sample continuous learning model for classifying images may classify various different types of images, thereby improving the classification performance of the model. The training data set corresponding to the task may include training data of image samples, and the ability of the small sample continuous learning model to continuously learn to classify different types of images may be achieved based on the training data set, please refer to fig. 2a and fig. 2b together, where fig. 2a is an example picture included in the miniImageNet data set, and fig. 2b is an example picture of the omniglot data set, and the small sample continuous learning model may be trained in the above manner, so that the small sample continuous learning model continuously learns 10 classification tasks, where each classification task may include 5 types of images, and each type of image may have 5 training samples; the small sample continuous learning model after the training can classify the learned 50 types of images. For example, on the miniImageNet dataset shown in FIG. 2a and on the omniglot dataset shown in FIG. 2b, the small-sample continuous learning model can improve the classification accuracy of continuous learning from 18.07% and 92.13% to 33.66% and 96.63%, respectively.
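For the 10-task, 5-way, 5-shot setting just described, the task sequence could be assembled along the following lines; the function name, the plain-list representation, and the random class split are illustrative only.

```python
import random

def build_task_sequence(samples_by_class, num_tasks=10, ways=5, shots=5):
    """Split a labelled image pool into a sequence of small-sample
    classification tasks (10 tasks of 5 classes with 5 samples each,
    matching the setting described above)."""
    classes = list(samples_by_class)
    random.shuffle(classes)
    tasks = []
    for t in range(num_tasks):
        task_classes = classes[t * ways:(t + 1) * ways]
        tasks.append([(image, cls) for cls in task_classes
                      for image in samples_by_class[cls][:shots]])
    return tasks
```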
Therefore, the ability to continuously learn new tasks and new samples is crucial for a small sample continuous learning model. For an image classification task, new classes often need to be learned continuously; for example, a face recognition system needs to enroll new users. An image classification task learned by a small sample continuous learning model usually needs a large amount of labeled data for training, which is often not easy to acquire in practical application scenarios (such as a face recognition task for a particular user, a fingerprint lock, and the like). Therefore, continuous learning in a small sample scenario, i.e., small sample continuous learning, needs to be considered.
Referring to fig. 3, fig. 3 is a schematic flow chart of a training method of a small sample continuous learning model according to another embodiment of the present invention, and a flow chart 300 of the training method of the small sample continuous learning model according to another embodiment of the present invention shown in fig. 3 includes:
step S301, acquiring a preset support data set;
and S302, pre-training based on the support data set to obtain a slow weight.
In the embodiment of the present invention, by implementing the steps S301 to S302, before the small sample continuous learning model learns the task, the slow weight may be pre-trained through a pre-set support training set to obtain an initial slow weight, so as to ensure that the small sample continuous learning model can perform normal learning based on the slow weight.
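A plain supervised pre-training pass over the support data set, corresponding to steps S301 to S302, might look as follows; the optimizer choice and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pretrain_slow_weights(model, support_loader, epochs=10, lr=1e-3):
    """Pre-train the slow weights on the preset support data set before
    the first task in the sequence is learned."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in support_loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```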
Step S303, calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task;
step S304, calculating a mapping value based on the current activity mark and the accumulated activity mark stored by the prior task, wherein the mapping value is used for restricting the updating of the fast weight;
step S305, acquiring a historical training data set corresponding to a historical task;
step S306, adding the historical training data set to a training data set corresponding to the current task to obtain a replay data set;
step S307, copying the slow weight to the fast weight;
step S308, updating the fast weight through a classification loss function by taking the replay data set, the slow weight and the mapping value as a basis, wherein the classification loss function is a loss function in the small sample continuous learning model.
In the embodiment of the present invention, by implementing the above steps S305 to S308, the historical training data set corresponding to the historical task may be added to the training data set corresponding to the current task to obtain the replay data set, and the fast weight may be updated through the replay data set, the slow weight, and the mapping value, so that the small-sample continuous learning model may simultaneously implement small-sample learning and continuous learning based on the slow weight and the fast weight, and the training effect of the small-sample continuous learning model is improved.
In the embodiment of the invention, the fast weight is updated to improve the continuous learning capability of the small sample continuous learning model. In order to avoid catastrophic forgetting of the learned historical tasks, the fast weight can be updated in multiple steps through memory replay; that is, the historical training data sets of the historical tasks can be replayed into the training data set of the current task to obtain the replay data set D_{1:t} containing the historical training data sets and the current training data set, and then the fast weight can be updated through the classification loss function L of the small sample continuous learning model, for example with a gradient step of the form:
θf^{e,l} ← θf^{e,l} − η · c_l^t · ∇_{θf^{e,l}} L(D_{1:t}; θf)
It can be seen that the classification loss function L is used to update the fast weight θf on the basis of the replay data set D_{1:t}, the slow weight θs (from which the fast weight is copied), and the mapping value c_l^t.
And S309, updating parameters in the slow weight through the updated fast weight, and realizing training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
Step S310, when detecting that the current task has a corresponding next task, determining the next task as the current task, and executing the acquisition of the training data set corresponding to the current task;
step S311, when detecting that the current task does not have a corresponding next task, determining that the training of the small sample continuous learning model is completed, and verifying the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
In the embodiment of the present invention, by implementing the above steps S310 to S311, it can be detected whether there is a task in the task sequence that has not yet been trained, that is, whether there is a next task. If there is a next task, the next task can be learned through the above steps again; if there is no next task, it can be determined that the training of the small sample continuous learning model is completed, and the small sample continuous learning model can be verified through the verification data sets, so that the training effect of the model is determined according to the verification result and the comprehensiveness of the training of the small sample continuous learning model is ensured.
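Putting the illustrative helpers above together, one pass over the task sequence might be organized as below; consolidating the fast weights back into the slow weights by interpolation with rate beta is an assumption, since the patent does not spell out the exact form of that step.

```python
import torch

def train_task_sequence(embedding, output_layer, task_loaders, replay_loaders,
                        beta=0.5):
    """Learn a sequence of small-sample tasks with the fast/slow-weight
    mechanism: activity flags, mapping values, fast-weight update on the
    replay set, and consolidation back into the slow weights."""
    accumulated = None
    for task_loader, replay_loader in zip(task_loaders, replay_loaders):
        # Steps S303-S304: activity flags and mapping values.
        current = compute_activity_flags(embedding, output_layer, task_loader)
        accumulated = accumulate_flags(accumulated, current)
        mapping = mapping_values(accumulated, current)
        # Steps S305-S308: update the fast weights on the replay set.
        fast_embed, fast_out = update_fast_weights(
            embedding, output_layer, replay_loader, mapping)
        # Step S309: consolidate fast weights into slow weights
        # (interpolation with rate beta is an illustrative choice).
        with torch.no_grad():
            for slow, fast in [(embedding, fast_embed), (output_layer, fast_out)]:
                for p_s, p_f in zip(slow.parameters(), fast.parameters()):
                    p_s.mul_(1 - beta).add_(beta * p_f)
    return embedding, output_layer
```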
According to the technical scheme, the generalization capability of small sample learning and the fitting capability of continuous learning in the same model can be balanced, and the processing performance of the small sample continuous learning model on the whole task sequence is improved. In addition, the relevance of the small sample continuous learning model to the slow weight and the fast weight can be ensured. In addition, the accuracy of calculating the current activity mark can be improved. In addition, the obtained mapping value can be more accurate. In addition, the small sample continuous learning model can be ensured to perform normal learning based on the slow weight. In addition, the training effect of the small sample continuous learning model can be improved. In addition, the comprehensiveness of the training of the small sample continuous learning model can be ensured.
Exemplary devices
Having described the method of the exemplary embodiment of the present invention, next, a training apparatus of a small sample continuous learning model of the exemplary embodiment of the present invention will be described with reference to fig. 4, the apparatus including:
a first calculating unit 401, configured to calculate, based on a training data set corresponding to a current task, a current activity flag corresponding to the slow weight; the small sample continuous learning model comprises a slow weight and a fast weight;
a second calculating unit 402, configured to calculate a mapping value based on the current activity flag obtained by the first calculating unit 401 and an accumulated activity flag stored by a previous task, where the mapping value is used to constrain updating of the fast weight;
a copying unit 403, configured to copy the slow weight to the fast weight, and update the fast weight through a classification loss function;
an updating unit 404, configured to update parameters in the slow weight through the fast weight updated by the copying unit 403, and implement training of the continuous learning model of the small sample based on the fast weight and the slow weight after updating.
As an optional implementation, the apparatus may further include:
an obtaining unit, configured to obtain a preset support data set before the first calculating unit 401 calculates a current activity flag corresponding to the slow weight based on a training data set corresponding to a current task;
and the training unit is used for pre-training based on the support data set to obtain the slow weight.
By implementing such an embodiment, the slow weight can be pre-trained through a preset support training set before the small sample continuous learning model learns any task, so that an initial slow weight is obtained, ensuring that the small sample continuous learning model can learn normally based on the slow weight.
As an alternative embodiment, the small sample continuous learning model may include a feature embedding layer and an output layer, the slow weight is composed of parameters in the feature embedding layer and parameters in the output layer, and the fast weight is composed of parameters in the feature embedding layer and parameters in the output layer.
By implementing the implementation mode, it can be seen that the slow weight and the fast weight can be obtained through the parameters in the feature embedding layer and the parameters in the output layer in the small sample continuous learning model, and the correlation between the small sample continuous learning model and the slow weight and the fast weight is ensured.
As an optional implementation, the first computing unit 401 may include:
the first acquisition subunit is used for acquiring a training data set corresponding to the current task;
the first calculating subunit is used for calculating the training data set to obtain expected modular lengths of parameter gradients in each layer of the slow-weight feature embedding layer;
and the determining subunit is configured to determine that the modular length is the current active flag corresponding to the slow weight.
By implementing such an embodiment, the current activity mark corresponding to the slow weight can be obtained by computation on the training data set corresponding to the current task, so that the accuracy of calculating the current activity mark is improved.
As an optional implementation manner, the second computing unit 402 may include:
the second acquisition subunit is used for acquiring the accumulated activity mark corresponding to the previously stored historical task;
a first updating subunit, configured to update the accumulated activity flag with the current activity flag;
the second calculating subunit is configured to calculate a difference between the updated accumulated activity flag and the average of the current activity flag;
and the mapping subunit is used for mapping the difference of the mean values through a Sigmoid function to obtain a mapping value.
By implementing the implementation mode, the accumulated activity mark corresponding to the historical task can be updated through the current activity mark, and the mapping value can be obtained through the current activity mark and the accumulated activity mark, so that the obtained mapping value is more accurate.
As an optional implementation manner, the copying unit 403 may include:
the third acquisition subunit is used for acquiring a historical training data set corresponding to the historical task;
the adding subunit is used for adding the historical training data set to the training data set corresponding to the current task to obtain a replay data set;
a replicon unit to replicate the slow weight to the fast weight;
and the second updating subunit is used for updating the fast weight through a classification loss function by taking the replay data set, the slow weight and the mapping value as a basis, wherein the classification loss function is a loss function in the small sample continuous learning model.
By implementing the implementation mode, the historical training data set corresponding to the historical task can be added to the training data set corresponding to the current task to obtain the replay data set, and the fast weight can be updated through the replay data set, the slow weight and the mapping value, so that the small sample continuous learning model can realize small sample learning and continuous learning simultaneously based on the slow weight and the fast weight, and the training effect of the small sample continuous learning model is improved.
As an optional implementation, the apparatus may further include:
a determining unit, configured to, after the updating unit 404 updates the parameters in the slow weight through the updated fast weight, and based on the updated fast weight and the updated slow weight, implement training of the continuous learning model for the small sample, and when it is detected that a next task corresponding to the current task exists, determine the next task as the current task, and control the first obtaining subunit to obtain a training data set corresponding to the current task;
and the verification unit is used for determining that the training of the small sample continuous learning model is finished when the current task is detected to have no corresponding next task, and verifying the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
By implementing the implementation mode, whether a task which is not trained exists in the task sequence or not can be detected, namely whether a next task exists or not can be detected, if the next task exists, the next task can be learned again through the steps, if the next task does not exist, the training completion of the small sample continuous learning model can be determined, the small sample continuous learning model can be verified through the verification data set, the training effect of the model can be determined according to the verification result, and the comprehensiveness of the small sample continuous learning model training is ensured.
Exemplary Medium
Having described the method and apparatus of the exemplary embodiments of the present invention, next, a computer-readable storage medium of the exemplary embodiments of the present invention is described with reference to fig. 5, please refer to fig. 5, which illustrates a computer-readable storage medium being an optical disc 50 having a computer program (i.e., a program product) stored thereon, where the computer program, when executed by a processor, implements the steps described in the above method embodiments, for example, the small sample continuous learning model includes a slow weight and a fast weight, and the current activity flag corresponding to the slow weight is calculated based on a training dataset corresponding to the current task; calculating a mapping value based on the current activity marker and an accumulated activity marker stored by a previous task; copying the slow weight to the fast weight, and updating the fast weight through a classification loss function; updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight; the specific implementation of each step is not repeated here.
It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
Exemplary computing device
Having described the methods, media, and apparatus of exemplary embodiments of the present invention, a computing device for training of a small sample continuous learning model of exemplary embodiments of the present invention is next described with reference to FIG. 6.
FIG. 6 illustrates a block diagram of an exemplary computing device 60 suitable for implementing embodiments of the present invention; the computing device 60 may be a computer system or a server. The computing device 60 shown in FIG. 6 is only an example and should not impose any limitation on the scope of use or functionality of embodiments of the present invention.
As shown in fig. 6, components of computing device 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 that couples various system components including the system memory 602 and the processing unit 601.
The system memory 602 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)6021 and/or cache memory 6022. Computing device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, ROM6023 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, but typically referred to as a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 603 by one or more data media interfaces. At least one program product may be included in system memory 602 with a set (e.g., at least one) of program modules configured to perform the functions of embodiments of the present invention.
A program/utility 6025 having a set (at least one) of program modules 6024 may be stored, for example, in the system memory 602, and such program modules 6024 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment. Program modules 6024 generally carry out the functions and/or methodologies of embodiments of the invention as described herein.
The processing unit 601 executes various functional applications and data processing by running a program stored in the system memory 602, for example, the small sample continuous learning model includes a slow weight and a fast weight, and a current activity flag corresponding to the slow weight is calculated based on a training data set corresponding to a current task; calculating a mapping value based on the current activity marker and an accumulated activity marker stored by a previous task; copying the slow weight to the fast weight, and updating the fast weight through a classification loss function; and updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight. The specific implementation of each step is not repeated here. It should be noted that although in the above detailed description several units/modules or sub-units/sub-modules of the training apparatus of the small sample continuous learning model are mentioned, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module according to embodiments of the invention. Conversely, the features and functions of one unit/module described above may be further divided into embodiments by a plurality of units/modules.
In the description of the present invention, it should be noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present invention, which are used for illustrating the technical solutions of the present invention and not for limiting the same, and the protection scope of the present invention is not limited thereto, although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Through the above description, the embodiments of the present invention provide the following technical solutions, but are not limited thereto:
1. a training method of a small sample continuous learning model, wherein the small sample continuous learning model comprises a slow weight and a fast weight, and the method comprises the following steps:
calculating a current activity mark corresponding to the slow weight based on a training data set corresponding to a current task;
calculating a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, the mapping value being used to constrain updating of the fast weight;
copying the slow weight to the fast weight, and updating the fast weight through a classification loss function;
and updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
2. The training method of the small-sample continuous learning model according to scheme 1, before calculating the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task, the method further includes:
acquiring a preset support data set;
and pre-training based on the support data set to obtain a slow weight.
3. The training method of the small sample continuous learning model according to scheme 2, wherein the small sample continuous learning model comprises a feature embedding layer and an output layer, the slow weight is composed of parameters in the feature embedding layer and parameters in the output layer, and the fast weight is composed of parameters in the feature embedding layer and parameters in the output layer.
4. The training method of the small sample continuous learning model according to scheme 3, which calculates the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task, includes:
acquiring a training data set corresponding to a current task;
calculating the training data set to obtain expected modular lengths of parameter gradients in each layer of the slow-weight feature embedding layer;
and determining the modular length as the current active mark corresponding to the slow weight.
5. The training method of the small sample continuous learning model according to scheme 4, wherein a mapping value is calculated based on the current activity flag and the accumulated activity flag stored by the previous task, and the method comprises the following steps:
acquiring an accumulated activity mark corresponding to a previously stored historical task;
updating the cumulative activity flag with the current activity flag;
calculating the difference of the updated average values of the accumulated activity mark and the current activity mark;
and mapping the difference of the mean values through a Sigmoid function to obtain a mapping value.
6. The training method of the small sample continuous learning model according to any one of schemes 1 to 5, which copies the slow weight to the fast weight and updates the fast weight through a classification loss function, includes:
acquiring a historical training data set corresponding to a historical task;
adding the historical training data set to a training data set corresponding to the current task to obtain a replay data set;
copying the slow weight to the fast weight;
and updating the fast weight through a classification loss function by taking the replay data set, the slow weight and the mapping value as a basis, wherein the classification loss function is a loss function in the small sample continuous learning model.
7. The training method of the small-sample continuous learning model according to scheme 6, wherein after updating the parameters in the slow weight through the updated fast weight and training the small-sample continuous learning model based on the updated fast weight and the updated slow weight, the method further comprises:
when detecting that the current task has a corresponding next task, determining the next task as the current task, and executing the step of acquiring the training data set corresponding to the current task;
and when detecting that the current task does not have a corresponding next task, determining that the training of the small sample continuous learning model is finished, and verifying the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
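A minimal sketch of the outer task loop in scheme 7, reusing the illustrative helpers from the earlier sketches: tasks are processed in sequence and, once no next task remains, the model is verified on the validation data set of every task.

```python
import torch

def run_task_sequence(model_slow, tasks, state):
    """tasks: a list of dicts with illustrative keys 'train_set' and 'val_loader'."""
    history_sets = []
    for task in tasks:
        # While a next task exists, it simply becomes the current task of the next round.
        model_slow = train_on_task(model_slow, task["train_set"], history_sets, state)
        history_sets.append(task["train_set"])

    # No next task left: training is finished; verify performance on every task's validation set.
    accuracies = []
    with torch.no_grad():
        for task in tasks:
            correct = total = 0
            for x, y in task["val_loader"]:
                pred = model_slow(x).argmax(dim=1)
                correct += (pred == y).sum().item()
                total += y.numel()
            accuracies.append(correct / total)
    return accuracies
```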
8. An apparatus for training a small sample continuous learning model, wherein the small sample continuous learning model comprises a slow weight and a fast weight, the apparatus comprising:
a first calculation unit, configured to calculate a current activity flag corresponding to the slow weight based on a training data set corresponding to a current task;
a second calculation unit, configured to calculate a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, where the mapping value is used to constrain updating of the fast weight;
a copying unit, configured to copy the slow weight to the fast weight and update the fast weight through a classification loss function;
and an updating unit, configured to update the parameters in the slow weight through the updated fast weight and train the small sample continuous learning model based on the updated fast weight and the updated slow weight.
9. The training device for the small-sample continuous learning model according to scheme 8, further comprising:
an acquisition unit, configured to acquire a preset support data set before the first calculation unit calculates the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task;
and a training unit, configured to perform pre-training based on the support data set to obtain the slow weight.
10. The training device for the small sample continuous learning model according to scheme 9, wherein the small sample continuous learning model comprises a feature embedding layer and an output layer, and the slow weight and the fast weight are each composed of parameters in the feature embedding layer and parameters in the output layer.
11. The training device for the small-sample continuous learning model according to scheme 10, wherein the first calculation unit comprises:
a first acquisition subunit, configured to acquire the training data set corresponding to the current task;
a first calculating subunit, configured to perform calculation on the training data set to obtain an expected norm of the parameter gradients in each layer of the feature embedding layer of the slow weight;
and a determining subunit, configured to determine the expected norm as the current activity flag corresponding to the slow weight.
12. The training device for the small-sample continuous learning model according to scheme 11, wherein the second calculation unit comprises:
a second acquisition subunit, configured to acquire the accumulated activity flag corresponding to the previously stored historical tasks;
a first updating subunit, configured to update the accumulated activity flag with the current activity flag;
a second calculating subunit, configured to calculate the difference between the mean of the updated accumulated activity flag and the mean of the current activity flag;
and a mapping subunit, configured to map the difference of the means through a Sigmoid function to obtain the mapping value.
13. The training device for the small sample continuous learning model according to any one of schemes 8 to 12, wherein the copying unit comprises:
a third acquisition subunit, configured to acquire a historical training data set corresponding to a historical task;
an adding subunit, configured to add the historical training data set to the training data set corresponding to the current task to obtain a replay data set;
a copying subunit, configured to copy the slow weight to the fast weight;
and a second updating subunit, configured to update the fast weight through a classification loss function based on the replay data set, the slow weight and the mapping value, wherein the classification loss function is a loss function in the small sample continuous learning model.
14. The training device for the small sample continuous learning model according to scheme 13, further comprising:
a determining unit, configured to, after the updating unit updates the parameters in the slow weight through the updated fast weight and trains the small sample continuous learning model based on the updated fast weight and the updated slow weight, determine a next task as the current task when detecting that the current task has a corresponding next task, and control the first acquisition subunit to acquire the training data set corresponding to the current task;
and a verification unit, configured to determine that the training of the small sample continuous learning model is finished when detecting that the current task does not have a corresponding next task, and verify the performance of the small sample continuous learning model through a verification data set corresponding to each task to obtain a performance verification result of the small sample continuous learning model.
15. A storage medium storing a program, wherein the storage medium stores a computer program which, when executed by a processor, implements the training method of the small-sample continuous learning model according to any one of schemes 1 to 7.
16. A computing device comprising the storage medium according to scheme 15.
Claims (10)
1. A training method of a small sample continuous learning model, wherein the small sample continuous learning model comprises a slow weight and a fast weight, and the method comprises the following steps:
calculating a current activity flag corresponding to the slow weight based on a training data set corresponding to a current task;
calculating a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, the mapping value being used to constrain updating of the fast weight;
copying the slow weight to the fast weight, and updating the fast weight through a classification loss function;
and updating parameters in the slow weight through the updated fast weight, and realizing the training of the small sample continuous learning model based on the updated fast weight and the updated slow weight.
2. The training method of the small-sample continuous learning model according to claim 1, wherein, before calculating the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task, the method further comprises:
acquiring a preset support data set;
and pre-training based on the support data set to obtain the slow weight.
3. The training method of the small sample continuous learning model according to claim 2, wherein the small sample continuous learning model comprises a feature embedding layer and an output layer, and the slow weight and the fast weight are each composed of parameters in the feature embedding layer and parameters in the output layer.
4. The training method of the small-sample continuous learning model according to claim 3, wherein calculating the current activity label corresponding to the slow weight based on the training data set corresponding to the current task comprises:
acquiring a training data set corresponding to a current task;
performing calculation on the training data set to obtain an expected norm of the parameter gradients in each layer of the feature embedding layer of the slow weight;
and determining the expected norm as the current activity flag corresponding to the slow weight.
5. An apparatus for training a small sample continuous learning model, wherein the small sample continuous learning model comprises a slow weight and a fast weight, the apparatus comprising:
a first calculation unit, configured to calculate a current activity flag corresponding to the slow weight based on a training data set corresponding to a current task;
a second calculation unit, configured to calculate a mapping value based on the current activity flag and an accumulated activity flag stored by a previous task, where the mapping value is used to constrain updating of the fast weight;
a copying unit, configured to copy the slow weight to the fast weight and update the fast weight through a classification loss function;
and an updating unit, configured to update the parameters in the slow weight through the updated fast weight and train the small sample continuous learning model based on the updated fast weight and the updated slow weight.
6. The training apparatus for a small-sample continuous learning model according to claim 5, further comprising:
an acquisition unit, configured to acquire a preset support data set before the first calculation unit calculates the current activity flag corresponding to the slow weight based on the training data set corresponding to the current task;
and a training unit, configured to perform pre-training based on the support data set to obtain the slow weight.
7. The training device of the small sample continuous learning model according to claim 6, wherein the small sample continuous learning model comprises a feature embedding layer and an output layer, and the slow weight and the fast weight are each composed of parameters in the feature embedding layer and parameters in the output layer.
8. The training device for the small-sample continuous learning model according to claim 7, wherein the first calculation unit comprises:
a first acquisition subunit, configured to acquire the training data set corresponding to the current task;
a first calculating subunit, configured to perform calculation on the training data set to obtain an expected norm of the parameter gradients in each layer of the feature embedding layer of the slow weight;
and a determining subunit, configured to determine the expected norm as the current activity flag corresponding to the slow weight.
9. A storage medium storing a program, wherein the storage medium stores a computer program which, when executed by a processor, implements the training method of the small-sample continuous learning model according to any one of claims 1 to 4.
10. A computing device comprising the storage medium of claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110077164.4A CN112734038A (en) | 2021-01-20 | 2021-01-20 | Training method, medium, device and computing equipment for small sample continuous learning model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110077164.4A CN112734038A (en) | 2021-01-20 | 2021-01-20 | Training method, medium, device and computing equipment for small sample continuous learning model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112734038A true CN112734038A (en) | 2021-04-30 |
Family
ID=75593519
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110077164.4A Pending CN112734038A (en) | 2021-01-20 | 2021-01-20 | Training method, medium, device and computing equipment for small sample continuous learning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112734038A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113705720A (en) * | 2021-09-08 | 2021-11-26 | 中国科学院国家天文台 | Method for reducing weighted training deviation by applying weight correction in machine learning |
CN113705720B (en) * | 2021-09-08 | 2024-05-14 | 中国科学院国家天文台 | Method for reducing weight training bias by applying weight correction in machine learning |
CN113762402A (en) * | 2021-09-14 | 2021-12-07 | 清华大学 | Image classification method and device based on small sample continuous learning and storage medium |
CN113762402B (en) * | 2021-09-14 | 2024-09-06 | 清华大学 | Image classification method, device and storage medium based on small sample continuous learning |
CN113989405A (en) * | 2021-12-27 | 2022-01-28 | 浙江大学 | Image generation method based on small sample continuous learning |
CN116452320A (en) * | 2023-04-12 | 2023-07-18 | 西南财经大学 | Credit risk prediction method based on continuous learning |
CN116452320B (en) * | 2023-04-12 | 2024-04-30 | 西南财经大学 | Credit risk prediction method based on continuous learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112734038A (en) | Training method, medium, device and computing equipment for small sample continuous learning model | |
CN109359793B (en) | Prediction model training method and device for new scene | |
US10354544B1 (en) | Predicting student proficiencies in knowledge components | |
CN112395423B (en) | Recursive time sequence knowledge graph completion method and device | |
CN116702843A (en) | Projection neural network | |
US11139980B2 (en) | Immutably storing computational determinations using distributed ledgers | |
US11113609B2 (en) | Machine-learning system and method for identifying same person in genealogical databases | |
US9330358B1 (en) | Case-based reasoning system using normalized weight vectors | |
CN110163252B (en) | Data classification method and device, electronic equipment and storage medium | |
US11736300B2 (en) | Producing and verifying computational determinations using a distributed ledger | |
CN110221959A (en) | Test method, equipment and the computer-readable medium of application program | |
JPWO2016084326A1 (en) | Information processing system, information processing method, and program | |
JP2020181255A (en) | Image analysis device, image analysis method, and image analysis program | |
CN111832693A (en) | Neural network layer operation and model training method, device and equipment | |
CN111401569B (en) | Hyper-parameter optimization method and device and electronic equipment | |
CN115527083B (en) | Image annotation method and device and electronic equipment | |
US20240143906A1 (en) | Character level embeddings for spreadsheet data extraction using machine learning | |
CN113361381B (en) | Human body key point detection model training method, detection method and device | |
CN112528500B (en) | Evaluation method and evaluation equipment for scene graph construction model | |
CN114565797A (en) | Neural network training and image classification method and device for classification | |
JP7342544B2 (en) | Study programs and methods | |
CN114124564A (en) | Counterfeit website detection method and device, electronic equipment and storage medium | |
JP2021064174A (en) | Information processing device, information processing method, and program | |
EP2820548B1 (en) | Versioned memories using a multi-level cell | |
WO2019026703A1 (en) | Learning-finished model integration method, device, program, ic chip, and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |