CN116644805A - Model training method, device, equipment and medium


Info

Publication number
CN116644805A
CN116644805A (application CN202310562897.6A)
Authority
CN
China
Prior art keywords
sample
learner
parameter
meta
loss value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310562897.6A
Other languages
Chinese (zh)
Inventor
赵蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinao Xinzhi Technology Co ltd
Original Assignee
Xinao Xinzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinao Xinzhi Technology Co ltd filed Critical Xinao Xinzhi Technology Co ltd
Priority to CN202310562897.6A
Publication of CN116644805A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The embodiment of the application provides a model training method, device, equipment and medium, which are used for solving the problem of reduced recognition capability caused by explosion of meta-learner parameters in the related technology. In the embodiment of the application, after the first loss value is acquired, it is input into an LSTM network model to acquire the corresponding gradient value, and the learner parameter is adjusted based on the gradient value, which improves training stability and efficiency. The parameter-adjusted learner then assists in training the meta-learner. Because an LSTM network model can remember values over a period of time, saving the value of each neuron in every iteration can be avoided, so the problem of parameter explosion is solved and the recognition capability of the meta-learner is improved.

Description

Model training method, device, equipment and medium
Technical Field
The present application relates to the field of model training technologies, and in particular, to a model training method, apparatus, device, and medium.
Background
With the rapid development of society, data scale and complexity are increasing, and the problems to be handled are becoming more complex. Models, as a means of solving such problems, are becoming an increasingly important tool. However, training a model requires a large number of data samples, each of which needs to be labeled, which consumes substantial human resources. This increases the time and cost of training a model and restricts its application and popularization.
In order to solve the problem that training a model requires a large number of samples, a meta-learning method has been proposed in the field of artificial intelligence. Meta-learning trains with fewer samples, reducing the burden of data annotation, and has strong adaptability and generalization capability. Typically, meta-learning trains the meta-learner with the assistance of a learner that is itself trained on samples, and adjustment and optimization of the meta-learner parameters is achieved by back-propagating the adjusted learner parameters. However, the learner contains a large number of neural network units, each of which needs parameter adjustment, and during adjustment the parameter of each unit must be memorized across multiple iterations to avoid adjusting different parameters to the same value. Since the number of iterations in the training process is large, parameter explosion is easily caused. Once the parameters explode, the recognition capability of the meta-learner is greatly reduced, which affects the effect of meta-learning itself.
Disclosure of Invention
The embodiment of the application provides a model training method, device, equipment and medium, which are used for solving the problem of reduced recognition capability caused by explosion of meta-learner parameters in the related technology.
In a first aspect, an embodiment of the present application provides a model training method, where the method includes:
in any iteration, acquiring any first sample in a training set (Dtrain) and first labeling information corresponding to the first sample; inputting the sample into a learner and acquiring a first output of the learner; acquiring, based on the first output and the first labeling information, a first loss value corresponding to the first sample, inputting the first loss value into a Long Short-Term Memory (LSTM) network model, and acquiring a gradient value output by the LSTM model; adjusting a parameter of the learner based on the gradient value; inputting any second sample in a test set (Dtest) into the parameter-adjusted learner, determining a corresponding second loss value, and adjusting the parameter of the meta-learner according to the second loss value.
Further, the inputting the second sample into the parameter-adjusted learner, and determining the corresponding second loss value includes:
acquiring any second sample in Dtest and the second labeling information stored for the second sample;
inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner;
and acquiring a corresponding second loss value based on the second output and the second labeling information.
Further, before the obtaining any first sample in Dtrain, the method further includes:
receiving training instructions carrying categories;
the obtaining any first sample in Dtrain includes:
acquiring the support set stored for the category in Dtrain; and acquiring any first sample in the support set.
Further, before any sample in Dtrain and the first labeling information corresponding to the sample are obtained, the method further includes:
the parameters of meta-learner are initialized.
Further, initializing the parameter of the meta-learner includes:
initializing the parameter of the meta-learner by a random initialization mode.
Further, initializing the parameter of the meta-learner includes:
randomly determining a parameter within a pre-stored parameter range, and initializing the parameter of the meta-learner with the determined parameter.
In a second aspect, an embodiment of the present application further provides a model training apparatus, where the apparatus includes:
an acquisition module, configured to, in any iteration, acquire any first sample in Dtrain and first labeling information corresponding to the first sample; input the sample into a learner and acquire a first output of the learner; acquire, based on the first output and the first labeling information, a first loss value corresponding to the first sample; and input the first loss value into a Long Short-Term Memory (LSTM) network model and acquire a gradient value output by the LSTM model;
a processing module, configured to adjust the parameter of the learner based on the gradient value; input any second sample in Dtest into the parameter-adjusted learner, determine a corresponding second loss value, and adjust the parameter of the meta-learner according to the second loss value.
Further, the processing module is specifically configured to obtain any second sample in the Dtest and second labeling information corresponding to the second sample; inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner; and acquiring a corresponding second loss value based on the second output and the second labeling information.
Further, the processing module is further configured to receive a training instruction carrying a category;
the acquisition module is specifically configured to acquire a support set corresponding to the category in the Dtrain; any first sample in the support set is acquired.
Further, the processing module is further configured to initialize a parameter of the meta-learner.
Further, the processing module is specifically configured to initialize the parameter of the meta-learner in a random initialization manner.
Further, the processing module is specifically configured to randomly determine a parameter within a pre-saved parameter range, and initialize the parameter of the meta-learner with the determined parameter.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes at least a processor and a memory, where the processor is configured to implement the steps of the training method according to any one of the preceding claims when executing a computer program stored in the memory.
In a fourth aspect, embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the training method as described in any of the preceding claims.
In the embodiment of the application, in any iteration, any first sample in Dtrain and the first labeling information corresponding to the first sample are acquired; the sample is input into the learner to obtain a first output of the learner; based on the first output and the first labeling information, a first loss value corresponding to the first sample is acquired, the first loss value is input into an LSTM network model, and the gradient value output by the LSTM model is acquired; a parameter of the learner is adjusted based on the gradient value; any second sample of Dtest is input into the parameter-adjusted learner, a corresponding second loss value is determined, and the parameter of the meta-learner is adjusted according to the second loss value, so as to improve the recognition capability of the meta-learner. Because the gradient value used to adjust the learner parameter comes from the LSTM network model, the stability and efficiency of training are improved, and the parameter-adjusted learner assists the meta-learner in training. Since an LSTM network model can remember values over a period of time, saving the value of each neuron in every iteration can be avoided, so the problem of parameter explosion is solved and the recognition capability of the meta-learner is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a process of any one iteration in training according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a process for obtaining a second loss value according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a process for acquiring a first sample according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a model training device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the application, fall within the scope of protection of the application.
In order to improve the identification capability of meta-learner, the embodiment of the application provides a model training method, device, equipment and medium.
The training method comprises the following steps: in any iteration, acquiring any first sample in Dtrain and first labeling information corresponding to the first sample; inputting the sample into a learner to obtain a first output of the learner; based on the first output and the first labeling information, acquiring a first loss value corresponding to the first sample, inputting the first loss value into the LSTM network model, and acquiring a gradient value output by the LSTM model; adjusting a parameter of the learner based on the gradient value; inputting any second sample of Dtest into the parameter-adjusted learner, determining a corresponding second loss value, and adjusting the parameter of the meta-learner according to the second loss value, so as to improve the recognition capability of the meta-learner.
Example 1:
fig. 1 is a schematic diagram of a process of any iteration in training according to an embodiment of the present application, where the process includes the following steps:
s101: any first sample in Dtrain is obtained, and first labeling information corresponding to the first sample is stored.
The training method provided by the embodiment of the application is applied to the electronic equipment, and the electronic equipment can be intelligent equipment such as a PC or a server.
In order to realize training of the meta-learner, the electronic device locally stores Dtrain in advance, and corresponding labeling information is stored in Dtrain for each sample; this labeling information is referred to as first labeling information for ease of distinction.
It should be noted that, if the meta-learner is used for executing a classification task, the first labeling information is the class label of the corresponding sample; for example, the class label of a water cup may be 00, and the class label of a trash can may be 01. If the meta-learner is used for executing a target detection task, the first labeling information is information such as the type of the target in the corresponding sample and the position of the target. What the first labeling information is when the meta-learner is used to perform other tasks is not limited herein.
In order to realize the training of meta-learner, the electronic device may acquire any first sample in Dtrain and the first labeling information corresponding to the first sample.
S102: the sample is input into the learner, and a first output of the learner is acquired; based on the first output and the first labeling information, a first loss value corresponding to the first sample is acquired, the first loss value is input into an LSTM network model, and a gradient value output by the LSTM model is acquired.
To enable training of the meta-learner, after inputting the acquired sample into the learner, the electronic device may acquire an output of the learner, which may be referred to as a first output for ease of distinction. Depending on the task type and the learner, the first output may be a probability distribution, a real value, or information such as the positions of a set of target detection frames; the specific first output is not limited herein.
Based on the first output and the first labeling information, the electronic device may calculate a loss value corresponding to the sample, which may be referred to as a first loss value for ease of distinction. Common loss functions include cross-entropy loss, squared loss and the like; which loss function is selected to determine the loss value is predetermined by service personnel and is not limited herein.
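As an illustrative sketch of the two loss functions named above (the function names and sample values here are hypothetical, not from the patent), the first loss value could be computed as follows:

```python
import math

def cross_entropy_loss(probs, label_index):
    # Cross-entropy for one sample: negative log-probability of the true class.
    return -math.log(probs[label_index])

def squared_loss(output, target):
    # Squared error for a single real-valued output.
    return (output - target) ** 2

# A learner's first output for a 3-class task, and the first labeling
# information as a class index (toy values for illustration):
first_output = [0.7, 0.2, 0.1]
first_label = 0
first_loss = cross_entropy_loss(first_output, first_label)  # -log(0.7)
```

Which of the two is appropriate depends on the task: cross-entropy for classification outputs, squared loss for real-valued outputs.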
After the first loss value is obtained, the electronic device may input it into the LSTM network model and acquire the gradient value output by the LSTM network model, where the gradient value is used to optimize a parameter of the learner. The LSTM network model typically calculates gradient values using a back-propagation algorithm.
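The role of the LSTM cell state can be illustrated with a minimal single-unit cell whose input is the loss value. All weights below are fixed toy constants (a real LSTM optimizer would learn them by backpropagation), so this is a sketch of the memory mechanism only, not the patent's trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_optimizer_step(loss_value, h_prev, c_prev, w=0.5, u=0.5, b=0.0):
    # All gates read the current loss and the previous hidden state; the
    # cell state c carries remembered values across iterations, which is
    # why per-neuron values need not be stored for every iteration.
    z = w * loss_value + u * h_prev + b
    f = sigmoid(z)            # forget gate
    i = sigmoid(z)            # input gate
    g = math.tanh(z)          # candidate cell value
    c = f * c_prev + i * g    # updated cell state
    o = sigmoid(z)            # output gate
    h = o * math.tanh(c)      # emitted "gradient value"
    return h, c

h, c = 0.0, 0.0
for first_loss in [0.9, 0.7, 0.4]:  # first loss values over three iterations
    h, c = lstm_optimizer_step(first_loss, h, c)
```

The key property is that `c` accumulates information from every earlier step, so the optimizer's memory lives in one state vector rather than in a stored copy of each neuron's value per iteration.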
S103: adjusting a parameter of the learner based on the gradient value; inputting any second sample in Dtest to the parameter-adjusted learner, determining a corresponding second loss value, and adjusting the parameter of meta-learner according to the second loss value.
After the gradient value is obtained, the electronic device may adjust the parameter of the learner based on the gradient value; how to adjust the parameter of the learner according to the gradient value is the prior art and is not described here.
After the adjustment of the learner's parameter is completed, the electronic device may acquire any sample in Dtest, referred to as a second sample for ease of distinction, input the second sample into the parameter-adjusted learner, and acquire a second loss value corresponding to the second sample. The electronic device can then adjust the parameter of the meta-learner using the obtained second loss value, finally achieving a better optimization effect. How to adjust the meta-learner parameter based on a loss value is the prior art and is not described herein.
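The steps S101 to S103 can be sketched on a toy learner. The linear learner `y = w * x`, the squared loss, and the constant-scaled hand-derived gradient standing in for the LSTM's output are all simplifying assumptions for illustration, not the patent's implementation:

```python
def one_meta_iteration(learner_w, d_train, d_test, lstm_scale=0.1):
    # S101: a first sample and its first labeling information from Dtrain.
    x1, y1 = d_train[0]
    first_output = learner_w * x1          # toy learner: y = w * x
    first_loss = (first_output - y1) ** 2

    # S102: the LSTM model would map the first loss to a gradient value;
    # here a hand-derived gradient scaled by a constant stands in for it.
    gradient_value = lstm_scale * 2 * (first_output - y1) * x1

    # S103: adjust the learner's parameter with the gradient value,
    # then pass a second Dtest sample through the adjusted learner.
    adjusted_w = learner_w - gradient_value
    x2, y2 = d_test[0]
    second_loss = (adjusted_w * x2 - y2) ** 2
    # The second loss would drive the meta-learner update (omitted here).
    return adjusted_w, second_loss

w, loss = one_meta_iteration(0.0, d_train=[(1.0, 2.0)], d_test=[(1.0, 2.0)])
```

With these toy numbers, one iteration moves `w` from 0.0 to 0.4 and the second loss drops from 4.0 to 2.56, showing the inner (learner) step that the outer (meta-learner) step would then consume.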
In each iteration, the electronic device may extract a first sample and a second sample from Dtrain and Dtest, respectively; a sample together with the labeling information stored for it may be referred to as a sample pair.
The meta-learner trained by the embodiment of the application can be applied to an elderly-care robot or other robots or devices. Taking the elderly-care robot as an example, the robot can provide basic life assistance for the elderly who need help, such as fetching a cup, and the meta-learner can determine whether the currently acquired image contains the cup.
It should be noted that a conventional model needs a large number of samples collected for each target for training, and cannot detect a new target, i.e. it has low compatibility. The meta-learner, by contrast, can continually be trained to detect new targets on the basis of existing target detection, so its compatibility is high. In the embodiment of the application, an LSTM-based optimization method is provided for optimizing the meta-learner.
The updating of the meta-learner's parameters is similar to the updating of the state of an LSTM network model, so optimization of the meta-learner can be achieved by means of the LSTM network model. In the embodiment of the application, the optimal state of the learner is determined by the state of the LSTM network model, so that the learner is rapidly optimized.
It should be noted that the learner includes a plurality of neurons, the parameter of each neuron needs to be adjusted, and the parameter of each neuron needs to be memorized across multiple adjustments to avoid adjusting different parameters to the same value; since the number of iterations in the training process is large, this may cause parameter explosion.
In the embodiment of the application, after the first loss value is acquired, it is input into the LSTM network model to acquire the corresponding gradient value, and the learner parameter is adjusted based on the gradient value, which improves the stability and efficiency of training. Any second sample in Dtest is then input into the parameter-adjusted learner, the corresponding second loss value is determined, and the parameter of the meta-learner is adjusted according to the second loss value, so that the learner assists the meta-learner in training. Because an LSTM network model can remember values over a period of time, saving the value of each neuron in every iteration can be avoided; thus the problem of parameter explosion is solved and the recognition capability of the meta-learner is improved.
Example 2:
In order to improve the recognition capability of the meta-learner, in the embodiment of the present application, the inputting the second sample into the parameter-adjusted learner and determining the corresponding second loss value includes:
acquiring any second sample in Dtest and the second labeling information stored for the second sample;
inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner;
and acquiring a corresponding second loss value based on the second output and the second labeling information.
In order to realize training of the meta-learner, the electronic device locally stores Dtest in advance, and corresponding labeling information is stored in Dtest for each sample; this labeling information is referred to as second labeling information for ease of distinction. The electronic device can acquire any second sample in Dtest and the second labeling information stored for the second sample.
After the second sample is obtained, the electronic device inputs the second sample into the parameter-adjusted learner, and obtains an output of the parameter-adjusted learner, where the output may be referred to as a second output for convenience of distinction, and based on the second output and the second labeling information, the electronic device may calculate a loss value corresponding to the second sample, and may be referred to as a second loss value for convenience of distinction.
Fig. 2 is a schematic diagram of a process for obtaining a second loss value according to an embodiment of the present application, where the process includes the following steps:
s201: and obtaining any second sample in Dtest and correspondingly storing second labeling information aiming at the second sample.
S202: the second sample is input into the parameter-adjusted learner.
S203: a second output of the learner is obtained.
S204: and acquiring a corresponding second loss value based on the second output and the second labeling information.
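The steps S201 to S204 can be condensed into one small function. The callable learner, the first-entry sample choice, and the squared loss are illustrative assumptions, not requirements of the patent:

```python
def get_second_loss(adjusted_learner, d_test):
    # S201: a second sample and its stored second labeling information
    # (the first Dtest entry here, for determinism).
    second_sample, second_label = d_test[0]
    # S202/S203: run the parameter-adjusted learner and take its second output.
    second_output = adjusted_learner(second_sample)
    # S204: loss between the second output and the second labeling information.
    return (second_output - second_label) ** 2

loss = get_second_loss(lambda x: 2 * x, [(3.0, 5.0)])  # output 6.0, label 5.0
```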
Example 3:
In order to accurately train the meta-learner, based on the above embodiments, in the embodiment of the present application, before any first sample in Dtrain is obtained, the method further includes:
receiving training instructions carrying categories;
the obtaining any first sample in Dtrain includes:
acquiring the support set stored for the category in Dtrain; and acquiring any first sample in the support set.
In meta-learner training for classification tasks, Dtrain contains samples of multiple classes, with each class's images forming a support set. In order to enable the meta-learner to accurately identify a new category, when the first sample is acquired, the first sample corresponding to the new category may be acquired.
The electronic device may first receive a training instruction carrying the category. Specifically, when service personnel need to train the meta-learner on a new category, they select the category to be trained through a preset page on their own device or a preset device and click a preset button, for example a "training" button, so that the electronic device receives the training instruction carrying the category.
In order to realize training of the meta-learner, a support set is stored for each category in the Dtrain stored on the electronic device, and the samples in a support set are samples of that category. In each iteration, when the electronic device acquires the first sample, it can acquire the support set corresponding to the category in Dtrain, acquire any first sample from the support set, and train the meta-learner based on the first sample.
When acquiring any first sample in the support set, the electronic device may select one sample according to a certain policy. Specifically, the electronic device may employ different sampling methods, such as random sampling, minimum-interval sampling, high-resolution sampling, and the like, selecting the most suitable method according to the actual situation.
It should be noted that, when training meta-learner for a certain class, the corresponding training process is completed through a preset number of samples of the class.
For example, if there are 100 categories in Dtrain, each category contains 10 images, then 10 images of a certain category can be used as a support set of the category, and one image is selected from the 10 images for optimizing the task of identifying the category by meta-learner, and the image is the first sample described in the embodiment of the present application.
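The 100-category example above can be sketched as follows; the grouping helper and the naming scheme for images and classes are hypothetical, for illustration only:

```python
def build_support_sets(d_train):
    # Group Dtrain samples by category; each category's images form its
    # support set.
    support = {}
    for image, category in d_train:
        support.setdefault(category, []).append(image)
    return support

# 100 categories with 10 images each, as in the example above:
d_train = [("img_%d_%d" % (c, i), "class_%d" % c)
           for c in range(100) for i in range(10)]
support_sets = build_support_sets(d_train)
first_sample = support_sets["class_7"][0]  # any image of the category works
```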
Before training, a data set related to the task of the meta-learner is collected by service personnel. The electronic device samples it in N-way K-shot fashion to obtain Dtrain and Dtest, each of which consists of a support set and a query set. That is, the samples to be classified belong to one of N categories, with K samples in Dtrain for each category.
The categories learned before training in meta-learning may be referred to as meta-training tasks (meta-training task), and the newly learned categories may be referred to as meta-testing tasks (meta-testing task). The categories in the meta-training tasks are completely different from the categories in the meta-testing tasks, i.e., the latter are completely new categories.
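The disjoint category split and the N-way K-shot draw described above can be sketched as below; the seeded RNG, helper names, and toy data are illustrative assumptions:

```python
import random

def split_meta_tasks(categories, n_new, seed=0):
    # Hold out n_new categories as completely new meta-testing tasks,
    # disjoint from the meta-training tasks.
    cats = sorted(categories)
    random.Random(seed).shuffle(cats)
    return cats[n_new:], cats[:n_new]  # (meta-training, meta-testing)

def sample_support_set(samples_by_cat, categories, n_way, k_shot, seed=0):
    # Draw an N-way K-shot support set: N categories, K samples per category.
    rng = random.Random(seed)
    chosen = rng.sample(categories, n_way)
    return {c: rng.sample(samples_by_cat[c], k_shot) for c in chosen}

cats = ["class_%d" % c for c in range(10)]
train_cats, test_cats = split_meta_tasks(cats, n_new=3)
data = {c: ["s%d" % i for i in range(5)] for c in cats}
episode = sample_support_set(data, train_cats, n_way=2, k_shot=3)
```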
Fig. 3 is a schematic diagram of a process for obtaining a first sample according to an embodiment of the present application, where the process includes the following steps:
s301: and receiving training instructions carrying the categories.
S302: and acquiring a support set which is correspondingly stored for the category in Dtrain.
S303: any first sample of the support set is acquired.
Example 4:
In order to accurately train the meta-learner, in the embodiment of the present application, before any sample in Dtrain and the first labeling information corresponding to the sample are obtained, the method further includes:
the parameters of meta-learner are initialized.
In the embodiment of the present application, before training the meta-learner, the electronic device may initialize parameters of the meta-learner, and specifically, the electronic device may first acquire each parameter related to the meta-learner, and initialize each parameter respectively.
In order to accurately train the meta-learner, in the embodiments of the present application, initializing the parameters of the meta-learner includes:
initializing the parameter of the meta-learner by a random initialization mode.
The main reason for random initialization is to avoid initial parameters that are too close to each other, which would degrade the performance of the meta-learner. The electronic device may initialize the parameters of the meta-learner in a random initialization manner. Specifically, a parameter may be drawn at random from a standard normal distribution or a uniform distribution and taken as an initial parameter of the meta-learner.
In the meta-learning field, the parameters of the meta-learner are more important than those of a traditional neural network, because differences in the meta-learner's parameters may greatly influence the whole learning process. Therefore, selecting appropriate initial parameters can improve the recognition capability of the meta-learner.
In order to improve the efficiency of meta-learner training, in the embodiments of the present application, initializing the parameters of the meta-learner includes:
randomly determining a parameter within a pre-stored parameter range, and initializing the parameter of the meta-learner with that parameter.
Because a freely random determination may yield a parameter that is too high or too low, reducing the efficiency of meta-learner training, in the embodiment of the present application the electronic device may randomly determine the parameter within a pre-saved parameter range, and initialize the parameter of the meta-learner with the randomly determined value.
It should be noted that there may be more than one parameter to initialize. If there are multiple parameters, the electronic device obtains, for each parameter, the parameter range stored for that parameter, randomly determines a value within that range, and initializes the parameter with it.
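The per-parameter, range-bounded initialization described above can be sketched as follows. The parameter names and ranges here are hypothetical stand-ins for whatever ranges the electronic device would have pre-stored:

```python
import random

# Hypothetical pre-stored ranges, one per parameter to initialize.
PARAM_RANGES = {"learning_rate": (1e-4, 1e-1), "forget_bias": (0.5, 5.0)}

def init_within_range(ranges):
    """For each parameter, draw an initial value uniformly from its
    pre-stored range, avoiding values that are too high or too low."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

params = init_within_range(PARAM_RANGES)
```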
Example 5:
fig. 4 is a schematic structural diagram of a model training device according to an embodiment of the present application, where the device includes:
in any iteration, the obtaining module 401 is configured to obtain any first sample in Dtrain and first labeling information corresponding to the first sample; an acquisition module for inputting the sample into a learner leanner, acquiring a first output of the leanner; based on the first output and the first labeling information, acquiring a first loss value corresponding to the first sample, inputting the first loss value into a long-short-time memory LSTM network model, and acquiring a gradient value output by the LSTM model;
a processing module 402, configured to adjust a parameter of the learner based on the gradient value; inputting any second sample in Dtest to the parameter-adjusted learner, determining a corresponding second loss value, and adjusting the parameter of the element learnermeta-learner according to the second loss value.
In a possible implementation manner, the processing module 402 is specifically configured to obtain any second sample in the Dtest and second labeling information corresponding to the second sample; inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner; and acquiring a corresponding second loss value based on the second output and the second labeling information.
In a possible implementation manner, the processing module 402 is further configured to receive a training instruction carrying a category;
the acquisition module is specifically configured to acquire a support set corresponding to the category in the Dtrain; any first sample in the support set is acquired.
In a possible implementation, the processing module 402 is further configured to initialize a parameter of meta-learner.
In a possible implementation manner, the processing module 402 is specifically configured to initialize the parameter of the meta-learner in a random initialization manner.
In one possible implementation, the processing module 402 is specifically configured to randomly determine a parameter within a pre-saved parameter range, and use the parameter to initialize the parameter of the meta-learner.
Example 6:
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application, and on the basis of the foregoing embodiments, the embodiment of the present application further provides an electronic device, as shown in fig. 5, including: the device comprises a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 are in communication with each other through the communication bus 504;
the memory 503 has stored therein a computer program which, when executed by the processor 501, causes the processor 501 to perform the steps of:
in any iteration, any first sample in Dtrain and first labeling information corresponding to the first sample are obtained; the sample is input into a learner to obtain a first output of the learner; based on the first output and the first labeling information, a first loss value corresponding to the first sample is acquired, the first loss value is input into an LSTM network model, and a gradient value output by the LSTM model is acquired; a parameter of the learner is adjusted based on the gradient value; and any second sample in Dtest is input to the parameter-adjusted learner, a corresponding second loss value is determined, and the parameter of the meta-learner is adjusted according to the second loss value.
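The iteration these steps describe can be sketched end to end. This is a deliberately simplified illustration under stated assumptions: the learner is a toy linear model, the LSTM optimizer is replaced by a simple stateful update rule that maps the first loss value to a gradient-like value, and the meta-learner parameter is reduced to a single step size; all names are illustrative and none are from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

w_learner = rng.standard_normal(3)   # learner parameter (toy linear model)
w_meta = 0.1                         # meta-learner parameter (here: a step size)

def learner_output(x):
    return float(w_learner @ x)

def loss(pred, label):
    return (pred - label) ** 2

def lstm_like_update(loss_value, state):
    """Stand-in for the LSTM optimizer: turns the first loss value into a
    gradient value, carrying a running state across iterations."""
    state = 0.9 * state + 0.1 * loss_value
    return state, state                        # (new state, gradient value)

state = 0.0
for _ in range(5):                             # a few iterations
    # First sample from Dtrain with its labeling information.
    x1, y1 = rng.standard_normal(3), 1.0
    l1 = loss(learner_output(x1), y1)          # first loss value
    state, grad = lstm_like_update(l1, state)  # gradient from the "LSTM"
    w_learner = w_learner - w_meta * grad      # adjust the learner parameter
    # Second sample from Dtest: its loss adjusts the meta-learner.
    x2, y2 = rng.standard_normal(3), 0.0
    l2 = loss(learner_output(x2), y2)          # second loss value
    w_meta = max(1e-3, w_meta - 0.01 * (l2 - l1))  # adjust the meta-learner
```

The essential structure survives the simplification: the first (support) loss drives the learner update through a learned optimizer, while the second (query) loss drives the meta-learner update.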
Further, the processor 501 is specifically configured to obtain any second sample in the Dtest and second labeling information corresponding to the second sample;
inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner;
and acquiring a corresponding second loss value based on the second output and the second labeling information.
Further, the processor 501 is further configured to receive a training instruction carrying a category;
the processor 501 is specifically configured to obtain a support set corresponding to the category in the Dtrain; any first sample in the support set is acquired.
The processor 501 is further configured to initialize a parameter of the meta-learner.
The processor 501 is specifically configured to initialize the parameter of the meta-learner by a random initialization method.
The processor 501 is specifically configured to randomly determine a parameter within a pre-saved parameter range, and use the parameter to initialize the parameter of the meta-learner.
The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, only one bold line is shown in the figure, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit, a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
Example 7:
on the basis of the above embodiments, the embodiments of the present application further provide a computer readable storage medium having stored therein a computer program executable by an electronic device, which when run on the electronic device, causes the electronic device to perform the steps of:
the memory has stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of:
in any iteration, any first sample in Dtrain and first labeling information corresponding to the first sample are obtained; the sample is input into a learner to obtain a first output of the learner; based on the first output and the first labeling information, a first loss value corresponding to the first sample is acquired, the first loss value is input into an LSTM network model, and a gradient value output by the LSTM model is acquired; a parameter of the learner is adjusted based on the gradient value; and any second sample in Dtest is input to the parameter-adjusted learner, a corresponding second loss value is determined, and the parameter of the meta-learner is adjusted according to the second loss value.
In one possible implementation manner, the inputting the second sample of the Dtest to the parameter-adjusted learner, and determining the corresponding second loss value includes:
acquiring any second sample in the Dtest and second labeling information correspondingly stored for the second sample;
inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner;
and acquiring a corresponding second loss value based on the second output and the second labeling information.
In one possible embodiment, before the obtaining any first sample in Dtrain, the method further includes:
receiving training instructions carrying categories;
the obtaining any first sample in Dtrain includes:
acquiring a support set which is correspondingly stored for the category in the Dtrain; any first sample in the support set is acquired.
In one possible implementation manner, before any sample in Dtrain and the first labeling information corresponding to the sample are obtained, the method further includes:
the parameters of meta-learner are initialized.
In one possible implementation, initializing the parameter of the meta-learner includes:
initializing the parameter of the meta-learner by a random initialization mode.
In one possible implementation, initializing the parameter of the meta-learner includes:
randomly determining a parameter within a pre-stored parameter range, and initializing the parameter of the meta-learner with the parameter.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A method of model training, the method comprising:
in any iteration, obtaining any first sample in the training set Dtrain and first labeling information corresponding to the first sample; inputting the sample into a learner to obtain a first output of the learner; based on the first output and the first labeling information, acquiring a first loss value corresponding to the first sample, inputting the first loss value into a long short-term memory (LSTM) network model, and acquiring a gradient value output by the LSTM model; adjusting a parameter of the learner based on the gradient value; and inputting any second sample in the test set Dtest to the parameter-adjusted learner, determining a corresponding second loss value, and adjusting a parameter of the meta-learner according to the second loss value.
2. The method of claim 1, wherein the inputting of any second sample of Dtest to the parameter-adjusted learner, determining a corresponding second loss value, comprises:
acquiring any second sample in the Dtest and second labeling information correspondingly stored for the second sample;
inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner;
and acquiring a corresponding second loss value based on the second output and the second labeling information.
3. The method of claim 1, wherein prior to the obtaining any first sample in Dtrain, the method further comprises:
receiving training instructions carrying categories;
the obtaining any first sample in Dtrain includes:
acquiring a support set corresponding to the category in the Dtrain; any first sample in the support set is acquired.
4. The method of claim 1, wherein before obtaining any sample in Dtrain and the first labeling information corresponding to the sample, the method further comprises:
the parameters of meta-learner are initialized.
5. The method of claim 4, wherein initializing the parameters of the meta-learner comprises:
initializing the parameter of the meta-learner by a random initialization mode.
6. The method of claim 4 or 5, wherein initializing the meta-learner parameter comprises:
and randomly determining a parameter within a pre-stored parameter range, and initializing the parameter of the meta-learner with the parameter.
7. A model training apparatus, the apparatus comprising:
in any iteration, an acquisition module, configured to acquire any first sample in a training set Dtrain and first labeling information corresponding to the first sample; input the sample into a learner to acquire a first output of the learner; acquire, based on the first output and the first labeling information, a first loss value corresponding to the first sample; and input the first loss value into a long short-term memory LSTM network model to acquire a gradient value output by the LSTM model;
a processing module, configured to adjust a parameter of the learner based on the gradient value; input any second sample in the test set Dtest to the parameter-adjusted learner, determine a corresponding second loss value, and adjust a parameter of the meta-learner according to the second loss value.
8. The apparatus of claim 7, wherein the processing module is specifically configured to obtain any second sample in the Dtest and second labeling information corresponding to the second sample; inputting the second sample into the parameter-adjusted learner to obtain a second output of the learner; and acquiring a corresponding second loss value based on the second output and the second labeling information.
9. An electronic device comprising at least a processor and a memory, the processor being adapted to implement the steps of the model training method according to any of the preceding claims 1-6 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the steps of the training method as claimed in any of the preceding claims 1-6.
CN202310562897.6A 2023-05-18 2023-05-18 Model training method, device, equipment and medium Pending CN116644805A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310562897.6A CN116644805A (en) 2023-05-18 2023-05-18 Model training method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116644805A true CN116644805A (en) 2023-08-25

Family

ID=87618106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310562897.6A Pending CN116644805A (en) 2023-05-18 2023-05-18 Model training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116644805A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination