CN113688944B

CN113688944B - Image identification method based on meta-learning

Info

Publication number: CN113688944B
Application number: CN202111149647.7A
Authority: CN
Inventors: 张鸿杰; 盛谦; 蒋斌; 郭延文
Original assignee: Nanjing L Zone Intelligent Technology Co ltd
Current assignee: Nanjing L Zone Intelligent Technology Co ltd
Priority date: 2021-09-29
Filing date: 2021-09-29
Publication date: 2022-12-27
Anticipated expiration: 2041-09-29
Also published as: CN113688944A

Abstract

The invention provides an image identification method based on meta-learning, which comprises the following steps: step 1, enhancing training data: generating a virtual open set sample for the images in each training set based on the generated confrontation network; step 2, dividing training data: combining the generated open set sample, dividing the training data into a meta training set and a meta testing set according to different domain distributions; step 3, meta-training: calculating the gradient of the model parameters on the meta-training set and modifying the parameters; step 4, meta-testing: evaluating the best model learned in step 3 on the meta-test set; step 5, meta-optimization: combining the loss function optimization models of the step 3 and the step 4; step 6, element iteration: and (5) repeating the steps 2 to 5 to carry out iterative optimization until an image recognition model with good generalization capability is obtained. Compared with the prior art, the method and the device can improve the generalization capability of the model to the domain distribution change on the training set with a plurality of different domain distributions.

Description

Image identification method based on meta-learning

Technical Field

The invention relates to the technical field of computer vision, in particular to an image identification method based on meta-learning.

Background

At present, most image recognition methods are proposed based on the assumption that the closed set and the domain distribution are the same. Wherein the closed set assumption means that the class distribution of the training set completely covers all classes appearing in the test set. However, this assumption does not hold true in real application scenarios. The recognition system is highly likely to face risks from outside the distribution of the training classes, and the closed set assumption forces the recognition system to effectively recognize these samples from unknown classes, which may cause security problems in practical applications.

A more common case is open set assumptions, i.e., the test set contains classes that have not appeared in the training set. This requires that the recognition system be able to accurately recognize both known classes that have appeared during the training process and to respond correctly to unknown classes that have not appeared. The assumption that the domain distribution is the same means that the domain distribution of the training set and the domain distribution of the test set are the same, that is, the image styles of the training set and the test set are the same, which also does not hold true in real application scenarios. The recognition system faces a variety of image styles, and the training process cannot exhaust all possible image styles. However, such a domain distribution difference will cause a serious decrease in the accuracy of the recognition system, forcing the recognition system to have a stronger generalization capability to cope with the influence of the domain distribution difference.

Disclosure of Invention

The purpose of the invention is as follows: aiming at the problem of open set and domain distribution difference of image recognition in a real application scene, an image recognition method based on meta-learning is provided. The invention trains an image recognition model on a plurality of training set data with class labels, and the domain distribution of each training set is different, so as to simulate the domain distribution change of a test set and the training set in the real world. In the image identification method based on meta-learning, a virtual open set sample is generated for each training set data by generating a confrontation network, and overfitting to any category distribution and domain distribution is reduced by a meta-learning optimization strategy based on the open set samples and original training data, so that the generalization capability of a model is improved.

The technical scheme is as follows: the invention discloses an image identification method based on meta-learning, which is characterized in that the generalization capability of a model to category distribution change and domain distribution change is improved through a meta-learning optimization strategy, and comprises the following steps:

step 1, generating a virtual open set sample for images in each training set based on a generated confrontation network, wherein the training set comprises a plurality of sub-training sets with class labels, the class labels contained in each sub-training set are completely overlapped, but the domain distribution of each sub-training set is different, namely the image style in each sub-training set is different; through the steps, the training data can be enhanced;

step 2, combining the virtual open set samples, dividing the training set into a meta-training set and a meta-testing set according to different domain distributions, wherein the classes in the meta-training set completely cover all the classes in the meta-testing set, namely the meta-training set comprises known classes with class labels and unknown classes formed by the virtual open set samples; through the steps, the division of training data is realized;

step 3, calculating the gradient of model parameters on the meta-training set and modifying the parameters to obtain the current optimal model; through the steps, meta-training is realized;

step 4, evaluating the current optimal model obtained in the step 3 on the meta test set; through the steps, the meta test is realized;

step 5, optimizing the current optimal model by combining all the loss functions in the step 3 and the step 4; through the steps, meta-optimization is realized;

and 6, repeating the steps 2 to 5 to carry out iterative optimization until a final optimal model is obtained. Through the steps, the element iteration is realized;

further, in one implementation, the step 1 includes:

step 1-1, projecting the original images in the training set to a feature space z = E (x) by using an encoder;

step 1-2, reversely generating an image G (z) from the feature space by using a generator;

step 1-3, jointly training the encoder, generator and discriminator by cross-optimizing two loss functions according to the following formula:

wherein L is _dis Representing the discrimination loss function, L _recons Representing the loss function of image reconstruction, S _i Representing the ith sub-training set of the plurality S of training sets,

representing the images in the ith sub-training set, D representing the discriminator, G representing the generator, and E representing the encoder; in the invention, through the steps 1-3, the generated reconstructed image can be as vivid as possible as an original image, and the identifier can not distinguish the authenticity of the original image;

step 1-4, according to the following formula, the ith sub-training set S _i Training classifier

Where k represents the number of known classes in the training set, L _c A function representing the loss of classification is represented,

represents the ith sub-training set S _i Image of (1)

The class labels of (a), the class labels containing only class labels of k known classes of the training set;

step 1-5, using the classifier

Approximating decision boundaries of known class data and unknown class data, generating each trainletExercise and Collection S _i Corresponding virtual open set sample

Generating the virtual open set sample of each

Are all integrated into each virtual open set sample

Corresponding sub-training set S _i And then, an enhanced training set is obtained, expressed as S = { S = { [ S ] ⁺ ,S ^- In which S is ⁺ Representing known class data, S ^- Representing the generated virtual open set sample.

Further, in one implementation, the steps 1-5 include:

inputting the image G (z) to a classifier

When the classifier is used

When the output confidence degrees of all the categories are low, and the low confidence degree is low relative to the output confidence degree of zero, determining that the image belongs to the unknown data;

from the ith sub-training set S _i In randomly selecting a seed image x _i As input, the following formula is minimized by the gradient descent method:

wherein z is ^* Representing the feature space vector to be found, the first term | z-E (x) _i ) II denotes the feature space vector z to be found ^* It is desirable to match the seed image x as much as possible _i Similar style, second item

Representing a log-likelihood assuming that the output confidence of an unknown class is zero, and by minimizing the log-likelihood, the output confidence of all the known classes is pushed down;

finding z by a fixed number of steps ^* And to z ^* Decoding using generator G, i.e. for each sub-training set S _i Generating a virtual open set sample, the generated virtual open set sample being represented as

Each of the sub-training sets S _i All have a sub-training set S _i Corresponding virtual open set sample

All the sub-training sets S _i Corresponding virtual open set sample

I.e. as a meta-training set

Virtual open set sample in (1)

All the generated virtual open set samples

Are all integrated into respective corresponding sub-training sets S _i Thereafter, an enhanced training set S is obtained, which is derived from the original known class data S ⁺ And the generated open set sample S ^- And (4) forming.

Specifically, in the present invention, the fixed number of steps is set according to different image tasks in an experiment.

Further, in one implementation, the step 2 includes:

dividing the enhanced training set S into N according to different domain distribution _s Training set of individual units

And N _t Test set of individuals

The meta test set

Number N of _t Is set to 1; the meta training set

For training models, the set of meta-tests

For testing the trained model. In the present invention, the step 2 is to simulate the training and testing set in the real scene, i.e. simulate the domain distribution difference in the real scene.

Further, in one implementation, the step 3 includes:

step 3-1, using the meta training set generated in step 1

Virtual open set sample in (1)

Training an image recognition model F, said

Presentation element training set

The formula of the process is as follows:

wherein L is _o Representing virtual open set examples

θ represents a parameter of the image recognition model F;

step 3-2, training the meta training set

Known class data of

Training is carried out, and the formula of the process is expressed as follows:

L _c ＝-log((F(θ；x)) _y )；

wherein L is _c Application of representation to meta-training set

Known class data in

Cross entropy loss of (d);

step 3-3, the overall loss function is expressed as:

step 3-4, calculating all the meta training sets according to the following formula

Loss of

Wherein, N _s Presentation element training set

The number of (2);

step 3-5, calculating the gradient of the parameter theta according to the following formula:

and 3-6, modifying the parameters of the model F according to the following formula:

wherein α is the step size of the meta training;

and obtaining the model after the whole meta-training parameter is updated, namely the current optimal model.

Further, in an implementation manner, the step 4 includes:

step 4-1, according to the following formula, in the meta-test set

The adaptive parameter loss L (theta') is calculated:

wherein the meta test set

The loss function L (theta') of (C) uses a training set of secondary elements

Calculating the parameter theta' of middle learning;

step 4-2, calculating all the meta-test sets according to the following formula

Loss of (c):

further, in one implementation, the step 5 includes:

in meta-optimization, meta-training and meta-testing are simultaneously optimized by gradients computed from a joint loss function, the process of the optimization being expressed as the following formula:

where β is a trade-off parameter between meta-training and meta-testing.

Further, in one implementation, the step 6 includes:

and repeating the step 2 to the step 5, and obtaining a final optimal model, namely the image recognition model F with good generalization capability when the model optimized in the step 5 is optimized to be converged on the training set S.

The method uses the generated countermeasure network to generate open set samples for training set data distributed in a plurality of different domains, and improves the generalization capability of the model to domain distribution change and category distribution change in image recognition through a meta-learning optimization strategy based on the generated open set samples.

Compared with the prior art, the invention at least comprises the following beneficial effects:

(1) And the model acquires knowledge for identifying unknown classes by using the generated open set sample, and the open set identification knowledge is transferred to an image identification environment with unknown domain distribution through meta-learning.

(2) And the generalization capability of the model for coping with the domain distribution change is improved on a training set distributed in a plurality of different domains based on the meta-learning method.

(3) All models (including an encoder, a generator, a discriminator and an image recognition model) in the invention are not limited to specific structures, and suitable models can be selected according to specific requirements of image recognition tasks.

Drawings

In order to more clearly illustrate the technical solution of the present invention, the drawings required to be used in the embodiments will be briefly described below, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.

FIG. 1 is a schematic workflow diagram of an image recognition method based on meta-learning according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of approximate image generation in an image recognition method based on meta learning according to an embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating the generation of open sample in an image recognition method based on meta-learning according to an embodiment of the present invention;

fig. 4 is a schematic diagram of an open set sample in an image recognition method based on meta learning according to an embodiment of the present invention.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, the present invention is described in detail with reference to the accompanying drawings and the detailed description thereof. In different embodiments, the deep learning model is selected, and different models can be selected according to the situation of a specific image recognition task. All other embodiments obtained by a person skilled in the art without making any inventive step are within the scope of the present invention.

The embodiment of the invention discloses an image identification method based on meta learning, which is applied to an image identification scene with unknown images and image domain distribution, namely, different image styles.

The process of the present invention is shown in FIG. 1. First, a plurality of open set examples approximating the domain distribution of the training set, such as "minus" in fig. 1, are generated on a plurality of training set data by generating a confrontation network, and used as an unknown class to learn knowledge for identifying the unknown class. And then, dividing the enhanced training set data into a plurality of virtual training sets and test sets according to different domain distributions, and using the virtual training sets and the test sets to simulate the domain distribution change in a real application scene. And finally, based on a meta-learning optimization strategy, learning a more robust feature space in an iterative optimization mode, and reducing overfitting of the model to any specific training set domain distribution and category distribution, thereby improving the generalization capability of the model in image recognition.

Specifically, as shown in fig. 1, the present embodiment provides an image recognition method based on meta-learning, which is characterized in that the generalization capability of a model to class distribution change and domain distribution change is enhanced by a meta-learning optimization strategy, and includes the following steps:

step 2, combining the virtual open set samples, dividing the training set into a meta training set and a meta test set according to different domain distributions, wherein the classes in the meta training set completely cover all the classes in the meta test set, namely the meta training set comprises known classes with class labels and unknown classes formed by the virtual open set samples; through the steps, the division of training data is realized;

step 4, evaluating the current optimal model obtained in the step 3 on the meta-test set; through the steps, the meta test is realized;

and 6, repeating the steps 2 to 5 for iterative optimization until a final optimal model is obtained. Through the steps, the element iteration is realized;

in the image recognition method based on meta learning according to this embodiment, the step 1 includes:

wherein L is _dis Represents the discrimination loss function, L _recons Representing the loss function of image reconstruction, S _i Representing the ith sub-training set of the plurality S of training sets,

representing the images in the ith sub-training set, D representing the discriminator, G representing the generator, and E representing the encoder; in the invention, through the steps 1-3, the generated reconstructed image can be as vivid as possible as an original image, and a discriminator cannot distinguish the true image from the false image; specifically, in this embodiment, a flowchart of the process is shown in fig. 2;

represents the ith sub-training set S _i Image of (1)

The class labels of (1), the class labels only containing class labels of k known classes of the training set S;

step 1-5, using the classifier

Approximating decision boundaries of the known class data and the unknown class data, generating each sub-training set S _i Corresponding virtual open set sample

Generating the virtual open set sample of each

Are all integrated into each virtual open set sample

Corresponding sub-training set S _i After that, an enhanced training set is obtained, denoted as S = { S = } ⁺ ,S ^- In which S is ⁺ Representing known class data, S ^- Representing the generated virtual open set sample.

In the image recognition method based on meta learning according to this embodiment, the steps 1 to 5 include:

inputting the image G (z) to a classifier

When the classifier is used

When the output confidence coefficients of all the categories are lower, and the lower confidence coefficient is lower relative to the output confidence coefficient of zero, determining that the image belongs to the unknown data;

All the sub-training sets S _i Corresponding virtual open set sample

I.e. as a meta training set

Virtual open set sample in (1)

All the generated virtual open set samples

Are all integrated into the respective corresponding sub-training sets S _i Thereafter, an enhanced training set S is obtained, which is derived from the original known class data S ⁺ And the generated open set sample S ^- And (4) forming. Specifically, in the present invention, the fixed number of steps is set according to different image tasks in an experiment. A flow chart of this process is shown in fig. 3. Fig. 4 shows a schematic diagram of some open set generation examples on a Digits-DG image dataset, which is a classical domain generalization task experiment dataset, including a MNIST image dataset, a MNIST-M image dataset, a SVHN image dataset, and a USPS image dataset.

In the image recognition method based on meta learning according to this embodiment, the step 2 includes:

And N _t Test set of individuals

The meta test set

Number N of _t Is set to 1; the meta training set

For training models, the set of meta-tests

For testing the trained model. Specifically, in the present embodiment, N _s Is set to be 3,N _t Is set to 1. The step 2 is to simulate a training and testing set in a real scene, that is, to simulate a domain distribution difference in the real scene.

In the image recognition method based on meta learning according to this embodiment, the step 3 includes:

step 3-1, using the meta training set generated in step 1

Virtual open set sample in (1)

Training an image recognition model F, the formula of the process is as follows:

wherein L is _o Representing virtual open set instances

θ represents a parameter of the image recognition model F;

step 3-2, training the meta training set

Known class data of

L _c ＝-log((F(θ；x)) _y )；

wherein L is _c Representation application to Meta-training set

Known class data of

Cross entropy loss of (d);

step 3-3, the overall loss function is expressed as:

Loss of

Wherein N is _s Presentation element training set

The number of (2);

wherein α is the step size of the meta training;

In the image recognition method based on meta learning according to this embodiment, the step 4 includes:

step 4-1, according to the following formula, in the meta-test set

Upper-calculated adaptive parameter loss L (θ'):

wherein the meta test set

The loss function L (theta') of (C) uses a training set of secondary elements

Calculating the parameter theta' of middle learning;

step 4-2, calculating all the meta-test sets according to the following formula

Loss of (2):

in the image recognition method based on meta learning according to this embodiment, the step 5 includes:

where β is a trade-off parameter between meta-training and meta-testing.

In the image recognition method based on meta learning according to this embodiment, the step 6 includes:

repeating the steps 2 to 5, and obtaining a final optimal model when the model optimized in the step 5 is optimized to be converged on the training set S, namely the image recognition model F with good generalization capability ^* 。

In specific implementation, the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program may include some or all of the steps in the embodiments of the image recognition method based on meta learning provided by the present invention when executed. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like.

Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented using software plus any required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.

The same and similar parts among the various embodiments in this specification may be referred to each other. The above-described embodiments of the present invention do not limit the scope of the present invention.

Claims

1. An image recognition method based on meta-learning is characterized by comprising the following steps:

step 1, generating a virtual open set sample for images in each training set based on a generated confrontation network, wherein the training set comprises a plurality of sub-training sets with class labels, the class labels contained in each sub-training set are completely overlapped, but the domain distribution of each sub-training set is different, namely the image style in each sub-training set is different;

the step 1 comprises the following steps:

representing the images in the ith sub-training set, D representing the discriminator, G representing the generator, and E representing the encoder;

Where k represents the number of known classes in the training set S, L _c A function representing the loss of the classification is expressed,

represents the ith sub-training set S _i Image of (1)

step 1-5, using the classifier

Approximating the decision boundary of the known class data and the unknown class data to generate each sub-training set S _i Corresponding virtual open set sample

Generating the virtual open set sample of each

Are all integrated into each virtual open set sample

Corresponding sub-training set S _i After that, an enhanced training set is obtained, denoted as S = { S = } ⁺ ,S ^- In which S is ⁺ Representing known class data, S ^- Representing the generated virtual open set sample;

step 2, combining the virtual open set samples, dividing the training set into a meta-training set and a meta-testing set according to different domain distributions, wherein the classes in the meta-training set completely cover all the classes in the meta-testing set, namely the meta-training set comprises known classes with class labels and unknown classes formed by the virtual open set samples;

step 3, calculating the gradient of model parameters on the meta-training set and modifying the parameters to obtain a current optimal model;

step 4, evaluating the current optimal model obtained in the step 3 on the meta test set;

step 5, optimizing the current optimal model by combining all the loss functions in the step 3 and the step 4;

and 6, repeating the steps 2 to 5 for iterative optimization until a final optimal model is obtained.

2. The method for image recognition based on meta-learning as claimed in claim 1, wherein the steps 1-5 comprise:

inputting the image G (z) to a classifier

When the classifier is used

wherein z is ^* The first term | | z-E (x) represents the feature space vector to be searched _i ) | | represents a feature space vector z to be found ^* It is desirable to match the seed image x as much as possible _i Similar style, second item

Representing a log-likelihood assuming that the output confidence of the unknown class is zero, and by minimizing the log-likelihood, the output confidence of all the known classes is pushed down;

All the sub-training sets S _i Corresponding virtual open set sample

I.e. as a meta-training set

Virtual open set sample in (1)

All the generated virtual open set samples

3. The method for image recognition based on meta learning according to claim 1, wherein the step 2 comprises:

And N _t Test set of individuals

The meta test set

Number N of _t Is set to 1; the meta training set

For training models, the meta-test set

For testing the trained model.

4. The method for image recognition based on meta learning according to claim 3, wherein the step 3 comprises:

step 3-1, using the meta training set generated in the step 1

Virtual open set sample in (1)

Training an image recognition model F, wherein the formula of the step is expressed as follows:

wherein L is _o Representing virtual open set examples

Theta represents a parameter of the image recognition model F, and k is the number of class labels of known classes;

step 3-2, training the meta training set

Known class data of

Training is performed, and the formula of the step is expressed as follows:

L _c ＝-log((F(θ；x)) _y )；

wherein L is _c Representation application to Meta-training set

Known class data of

Cross entropy loss of (F (theta; x)) _y ；

Step 3-3, the entire loss function is expressed as:

Loss of (2)

Wherein, N _s Presentation element training set

The number of (2);

wherein α is the step size of the meta training;

and obtaining a model after the whole meta-training parameter is updated, namely the current optimal model.

5. The method for image recognition based on meta learning according to claim 4, wherein the step 4 comprises:

step 4-1, according to the following formula, in the meta-test set

Upper-calculated adaptive parameter loss L (θ'):

wherein the meta test set

The loss function L (theta') of (C) uses a training set of secondary elements

Calculating the parameter theta' of middle learning;

step 4-2, calculating all the meta-test sets according to the following formula

Loss of (2):

6. the method for image recognition based on meta-learning according to claim 5, wherein the step 5 comprises:

where β is a trade-off parameter between meta-training and meta-testing.

7. The method according to claim 6, wherein the step 6 comprises:

repeating the step 2 to the step 5, and obtaining a final optimal model, namely an image recognition model F with good generalization capability when the model optimized in the step 5 is optimized to be converged on the training set S ^* 。