CN111639679A - Small sample learning method based on multi-scale metric learning - Google Patents


Info

Publication number: CN111639679A (granted as CN111639679B)
Application number: CN202010384767.4A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: scale, sample, learning, feature mapping, support set
Inventors: 蒋雯, 黄凯, 耿杰, 邓鑫洋
Applicant and current assignee: Northwestern Polytechnical University
Priority/filing date: 2020-05-09
Publication date: 2020-09-08 (CN111639679A); grant publication date: 2022-03-04 (CN111639679B)
Legal status: Granted; active

Classifications

    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G06F18/24 Classification techniques
    • G06N20/00 Machine learning
    • G06N3/045 Combinations of networks (neural network architectures)

Abstract

The invention discloses a small sample (few-shot) learning method based on multi-scale metric learning, comprising the following steps: step one, establish a data set; step two, generate multi-scale feature mapping layers; step three, transfer learning: a conversion module performs a secondary mapping of the samples' multi-scale features; step four, generate multi-scale feature mapping pairs; step five, compute relation scores for the multi-scale feature mapping pairs in a multi-scale relation generation network; step six, measure sample similarity with a multi-scale metric learning model. The invention has a simple structure and a reasonable design: it obtains multi-scale feature mapping pairs through transfer learning so that the trained model is transferable, and it adds the loss term that the sample interval contributes to the overall model on top of the mean square error loss function to form a new loss function, realizing metric learning suited to training under the few-shot setting.

Description

Small sample learning method based on multi-scale metric learning
Technical Field
The invention belongs to the technical field of image processing and recognition, and particularly relates to a small sample learning method based on multi-scale metric learning.
Background
Humans are very good at recognizing a new object from a very small number of samples; for example, a small child may need only a few pictures in a book to know what a "zebra" is and what a "rhinoceros" is. Inspired by this fast learning ability, researchers hope that after a machine learning model has learned from a large amount of data for certain classes, it can learn a new class quickly from only a small number of samples. This is the problem that few-shot learning aims to solve.
Although deep learning can achieve quite satisfactory results on image recognition tasks in some scenarios, by virtue of deep and complex network models, huge amounts of training data, and strong hardware support, model training becomes very difficult in rare task scenarios where sufficient training samples cannot be obtained. A human, by contrast, can learn a new category from only a few reference samples. Therefore, in order to give models the same capability, small sample learning, which learns quickly from a small number of samples, has recently become a popular research direction in the field of image recognition.
Small sample learning aims at the ability to identify unknown classes from a small number of training samples, mimicking the process by which the human brain associates and infers unknown things from prior knowledge. At its core, small sample learning identifies new classes by migrating source knowledge to target knowledge. However, conventional transfer learning algorithms cannot solve the small sample learning task well; the most important difference is that small sample learning must acquire the ability to recognize unknown classes, i.e. to identify a large number of classes never seen during training, which conventional transfer learning cannot do.
Currently, small sample learning methods fall mainly into three categories: transfer-based learning, metric-based learning, and meta-learning-based methods. Transfer-based learning, which fine-tunes a pre-trained base network on a certain amount of labeled data, has been widely used in many fields. Metric-based learning mainly models the distance between samples, so that similar samples are drawn close and heterogeneous samples are pushed apart. Meta-learning-based methods aim to create a model that can adapt to multi-task learning scenarios; since the input classes differ in each round of small sample training, each round can be regarded as a different learning task, so the model can be trained with meta-learning methods.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a small sample learning method based on multi-scale metric learning that has a simple structure and a reasonable design: it obtains multi-scale feature mapping pairs through transfer learning, learns the multi-scale relation pairs by constructing a hierarchical relation learning network so as to mine the feature associations between the support set and the query set, and adds the loss term that the sample interval contributes to the overall model on top of the mean square error loss function to form a new loss function, adapting to the training of few-shot learning.
In order to solve the technical problems, the invention adopts the technical scheme that: a small sample learning method based on multi-scale metric learning is characterized by comprising the following steps:
step one, establishing a data set: establish a support set S_su and a query set S_qu:

S_su = {(x_i, y_i)}, i = 1, 2, …, C×K
S_qu = {(x_j, y_j)}, j = 1, 2, …, C×N

where x_i denotes the i-th sample in the support set S_su, y_i denotes the label of sample x_i, x_j denotes the j-th sample in the query set S_qu, y_j denotes the label of sample x_j, C, K, and N are positive integers, C×K denotes the number of samples in the support set S_su, and C×N denotes the number of samples in the query set S_qu;
step two, generating multi-scale feature mapping layers:

step 201, generating support set multi-scale feature mapping prototypes: pass the samples in the support set S_su through the support set feature extraction module to obtain several support set multi-scale feature mapping prototypes;

step 202, generating query set multi-scale feature mapping layers: pass each sample in the query set S_qu through the query set feature extraction module to obtain several query set multi-scale feature mapping layers;

step three, transfer learning: the conversion module receives the support set multi-scale feature mapping prototypes and maps them into a feature space suited to the target knowledge, generating the support set multi-scale feature mapping layers;

step four, generating multi-scale feature mapping pairs: combine the support set multi-scale feature mapping layers and the query set multi-scale feature mapping layers correspondingly to obtain the multi-scale feature mapping pairs;

step five, computing the relation score RES_ω of each multi-scale feature mapping pair in the multi-scale relation generation network;
step six, measuring sample similarity with the multi-scale metric learning model: sample similarity L_total = λ·L_IIRL + μ·L_avmse, where L_IIRL [formula given as an image in the original] is built from the prediction average of the query set S_qu samples belonging to the ω-th class and the prediction average of the query set S_qu samples not belonging to the ω-th class, a_tr denotes the interval between the prediction averages of same-class and different-class samples, L_avmse denotes the sample class prediction loss on the query set, and λ and μ denote multi-scale weights.
The small sample learning method based on multi-scale metric learning is characterized in that: in step six, L_avmse is computed according to formulas that appear as images in the original document.
The small sample learning method based on multi-scale metric learning is characterized in that: the relation score in step five is computed by a formula [given as an image in the original], where RES_ω denotes the relation score, f_ge denotes the multi-scale relation generation network, and B denotes the multi-scale relation generation module.
The small sample learning method based on multi-scale metric learning is characterized in that: in the generation of the support set multi-scale feature mapping prototypes in step 201, feature extraction is performed on the C×K samples of the support set S_su to obtain the multi-scale feature mapping layer corresponding to each sample, and then the multi-scale feature mapping layers corresponding to the samples of each category are fused into that category's multi-scale feature mapping layer.
The small sample learning method based on multi-scale metric learning is characterized in that: the support set feature extraction module in step 201 may differ from the query set feature extraction module in step 202, but the two have the same feature layer scale.
Compared with the prior art, the invention has the following advantages:

1. The invention has a simple structure and a reasonable design, and is convenient to realize, use, and operate.

2. In this application, the feature extraction modules receive sample data from the support set and the query set and map it into a feature space, outputting multi-scale features for each sample; the conversion module then performs a secondary mapping of these multi-scale features into another feature space suited to the target knowledge, so that the trained model is transferable and can recognize new classes.

3. This application constructs a hierarchical relation learning network to learn the multi-scale relation pairs, thereby mining the feature associations between the support set and the query set.

4. This application reconstructs a new loss function: on the basis of the mean square error loss function, the multi-scale metric learning model adds the loss term that the sample interval contributes to the overall model, forming the new loss function L_total; this realizes metric learning and adapts to training under the few-shot setting.

5. This application builds a multi-scale metric learning model that combines transfer learning and metric learning; from a small number of reference samples it can predict labels for the query set, while also being able to recognize new categories and improving recognition accuracy.

In conclusion, the invention has a simple structure and a reasonable design; it obtains multi-scale feature mapping pairs through transfer learning so that the trained model is transferable, and it adds the loss term contributed by the sample interval to the mean square error loss function to form a new loss function, realizing metric learning and adapting to the training of few-shot learning.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of a network architecture according to the present invention.
Fig. 3 is a schematic diagram of a network structure of the conversion module according to the present invention.
Detailed Description
The method of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments of the invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Spatially relative terms, such as "above", "over", "on", and the like, may be used herein for ease of description to describe one device's or feature's spatial relationship to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above" can encompass both an orientation of "above" and "below". The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
As shown in fig. 1 to 3, the present invention includes the steps of:
Step one, establishing a data set: establish a support set S_su and a query set S_qu:

S_su = {(x_i, y_i)}, i = 1, 2, …, C×K
S_qu = {(x_j, y_j)}, j = 1, 2, …, C×N

where x_i denotes the i-th sample in the support set S_su, y_i denotes the label of sample x_i, x_j denotes the j-th sample in the query set S_qu, y_j denotes the label of sample x_j, C, K, and N are positive integers, C×K denotes the number of samples in the support set S_su, and C×N denotes the number of samples in the query set S_qu.
It should be noted that the image set with labeled samples is the support set S_su. The support set S_su comprises C classes randomly drawn from the training set, each class comprising K samples, so the support set S_su contains C×K sample images in total, each represented by a two-dimensional vector (x_i, y_i). Note that the C classes of the support set S_su may not have been learned before.

The image set to be classified is denoted the query set S_qu. The query set S_qu draws a portion of the remaining data from the same C classes of the training set; therefore the query set S_qu also contains C classes, each class comprising N samples, for a total of C×N sample images, each represented by a two-dimensional vector (x_j, y_j).
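As an illustration of this episode construction, the following is a minimal sketch in Python; the function name, the in-memory `images_by_class` mapping, and the default C, K, N values are assumptions made for the example, not part of the patent.

```python
import random

def sample_episode(images_by_class, C=5, K=1, N=15):
    """Draw a support set S_su (C x K samples) and a query set S_qu (C x N samples)."""
    classes = random.sample(list(images_by_class), C)  # C classes drawn at random
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw K + N distinct samples of this class; K go to the support set,
        # the remaining N (part of the leftover data) go to the query set.
        picks = random.sample(images_by_class[cls], K + N)
        support += [(x, label) for x in picks[:K]]   # (x_i, y_i) pairs
        query += [(x, label) for x in picks[K:]]     # (x_j, y_j) pairs
    return support, query
```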
The invention aims to construct a model that combines transfer learning and metric learning, can predict labels for the query set from a small number of reference samples, can recognize new classes, and improves recognition accuracy.
Step two, generating a multi-scale feature mapping layer:
Step 201, generating the support set multi-scale feature mapping prototypes: the samples in the support set S_su are processed by the support set feature extraction module to obtain several support set multi-scale feature mapping prototypes.
The feature extraction module f_fe receives sample data from the support set S_su and maps it into a feature space. Using a feature pyramid structure, feature extraction is performed on the C×K samples of the support set S_su to obtain the multi-scale feature mapping layer corresponding to each sample; the multi-scale feature mapping layers corresponding to the samples of each category are then fused into that category's multi-scale feature mapping layer, called the support set multi-scale feature mapping prototype. The support set S_su thus yields C support set multi-scale feature mapping prototypes. In this embodiment, the fusion uses a weighted average method.

In actual use, the computer computes the support set multi-scale feature mapping according to a formula [given as an image in the original], where V_su denotes the feature vector of the support set S_su, m denotes the feature layer scale, and f_fe denotes the feature extraction module adopted for the support set S_su. The feature extraction module f_fe mainly comprises a deep convolutional neural network pre-trained on a large scene classification data set. In practice, m = 3. In this specific implementation, the feature extraction module f_fe consists of a ResNet50 convolutional neural network with its fully connected layer removed. The ResNet50 network consists of a stack of four different bottleneck-layer structures; since the feature layer dimensions do not change within the same bottleneck stage, the tops of the last three bottleneck stages are selected as the inputs of the multi-scale features.
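A sketch of this extractor is shown below, assuming PyTorch/torchvision: the fully connected layer of ResNet50 is dropped and the outputs of the last three bottleneck stages serve as the m = 3 scales. ImageNet weights stand in for the "large scene classification data set" pre-training, and the plain mean in `class_prototypes` stands in for the weighted-average fusion, whose weights the text does not specify.

```python
import torch
import torchvision

class MultiScaleExtractor(torch.nn.Module):
    """ResNet50 backbone without its fully connected layer; returns the
    outputs of the last three bottleneck stages as m = 3 feature scales."""
    def __init__(self):
        super().__init__()
        net = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.stem = torch.nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.layer1, self.layer2 = net.layer1, net.layer2
        self.layer3, self.layer4 = net.layer3, net.layer4

    def forward(self, x):
        x = self.layer1(self.stem(x))
        f2 = self.layer2(x)      # scale 1 (largest spatial size)
        f3 = self.layer3(f2)     # scale 2
        f4 = self.layer4(f3)     # scale 3 (top of the network)
        return [f2, f3, f4]

def class_prototypes(extractor, support_x):
    """Fuse the K per-sample feature maps of one class into a prototype per
    scale; a simple mean stands in for the patent's weighted average."""
    feats = extractor(support_x)                       # support_x: (K, 3, H, W)
    return [f.mean(dim=0, keepdim=True) for f in feats]
```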
Step 202, generating the query set multi-scale feature mapping layers.

The computer computes the query set multi-scale feature mapping according to a formula [given as an image in the original], where V_qu denotes the feature vector of the query set S_qu and f_fu denotes the feature extraction module adopted for the query set S_qu.

In actual use, the feature extraction module f_fu mainly comprises a deep convolutional neural network pre-trained on a large scene classification data set; it receives samples from the query set S_qu and maps them into the feature space. For each input sample x_j, the feature extraction module outputs a multi-scale feature mapping layer corresponding to it, so the query set S_qu yields C×N multi-scale feature mapping layers of different classes.

It should be noted that the feature extraction module f_fu adopted for the query set S_qu and the feature extraction module f_fe adopted for the support set S_su have the same feature layer scale m, with m = 3.
Step three, transfer learning: the conversion module receives the support set multi-scale feature mapping prototype and maps the support set multi-scale feature mapping prototype into a feature space adaptive to target knowledge to generate a support set multi-scale feature mapping layer.
Unlike other small sample learning methods, in this application the feature extraction module f_fe outputs multi-scale features for each sample, and a corresponding conversion module is designed to perform a secondary mapping of the feature embedding. Based on this feature mapping method, the support set S_su is mapped from the original high-dimensional feature space to a low-dimensional feature space in which the C support set multi-scale feature mapping prototypes and the C×N multi-scale feature mapping layers of the query set S_qu have the same or similar distributions. The labeled support set S_su sample data represented in the low-dimensional space can therefore be used to predict the query set S_qu sample data, so that the trained model is transferable and able to recognize new categories.
As shown in FIG. 2, in actual use the conversion module adopts three convolutional neural networks, called the first conversion block, the second conversion block, and the third conversion block, respectively. The input of the first conversion block is the third-layer output of the feature extraction module f_fe; the input of the second conversion block is the second-layer output of the feature extraction module f_fe together with the output of the first conversion block; and the input of the third conversion block is the first-layer output of the feature extraction module f_fe together with the output of the second conversion block. After passing through the first, second, and third conversion blocks, the multi-scale feature mapping layers are output as feature matrices with the same dimensions as the corresponding scales.

As shown in fig. 3, conv denotes a convolution block and interpolate denotes an interpolation block. Input1 of the conversion module is the output from the top layer of the ResNet50 convolutional neural network, and input2 is the output of the upper conversion block, deconvolved (upsampled) to the same size as input1. With this top-down structure, the conversion module can combine the more accurate feature semantics of the upper conversion block with the more comprehensive target feature information of the lower conversion block.
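A hedged sketch of one such conversion block follows; the layer widths, kernel size, and nearest-neighbour upsampling are illustrative choices, since FIG. 3 only fixes the conv-plus-interpolate pattern. The first conversion block would be built with `top_ch=0` and called without `input2`.

```python
import torch
import torch.nn.functional as F

class ConversionBlock(torch.nn.Module):
    """conv + interpolate block: fuses a backbone feature map (input1) with
    the upsampled output of the upper conversion block (input2)."""
    def __init__(self, in_ch, top_ch, out_ch):
        super().__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_ch + top_ch, out_ch, kernel_size=3, padding=1),
            torch.nn.BatchNorm2d(out_ch),
            torch.nn.ReLU(inplace=True),
        )

    def forward(self, input1, input2=None):
        if input2 is not None:
            # Bring the upper block's output to input1's spatial size.
            input2 = F.interpolate(input2, size=input1.shape[-2:], mode="nearest")
            input1 = torch.cat([input1, input2], dim=1)
        return self.conv(input1)
```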
Step four, generating a multi-scale feature mapping pair: and correspondingly combining the support set multi-scale feature mapping layer and the inquiry set multi-scale feature mapping layer to obtain a multi-scale feature mapping pair.
In actual use, the C×N multi-scale feature mapping layers of the query set S_qu and the C classes of multi-scale feature mapping layers of the support set S_su are taken at corresponding scales and connected head-to-tail, obtaining C×(C×N) multi-scale feature mapping pairs, which serve as the input of the multi-scale relation generation network.
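The pairing step can be sketched as below, reading "head-to-tail connection" as channel-wise concatenation of the two feature maps at each scale (an interpretation; the patent does not spell out the axis).

```python
import torch

def make_pairs(support_protos, query_feats):
    """support_protos: C prototypes, each a list of m per-scale maps (1, ch, h, w).
    query_feats: C*N query samples, each a list of m per-scale maps (1, ch, h, w).
    Returns C*(C*N) multi-scale feature mapping pairs."""
    pairs = []
    for q in query_feats:
        for p in support_protos:
            pairs.append([torch.cat([ps, qs], dim=1)   # concatenate at each scale
                          for ps, qs in zip(p, q)])
    return pairs
```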
Step five, calculating the relationship score of the multi-scale feature mapping pair in the multi-scale relationship generation network: relationship score
Figure BDA0002483383270000081
Wherein RESωRepresents a relationship score, fgeAnd B represents a multi-scale relation generation network and a multi-scale relation generation module.
The multi-scale relation generation module B mainly learns the multi-scale relation pairs by constructing a hierarchical relation learning network, thereby mining the feature associations between the support set S_su and the query set S_qu.

For a multi-scale relation mapping pair, a lower scale means the features represented by the feature mapping layer are coarser and more common, whereas a higher scale means the represented features are finer and more unique. Therefore, feature mapping pairs of lower scales need relation learning in a deeper network, while feature mapping pairs of higher scales show their differences or similarities more obviously, so a shallower network suffices. Based on this idea, a hierarchical relation learning network, namely the multi-scale relation generation network, performs relation learning on the multi-scale feature mapping pairs, with the low-level and high-level learning networks sharing parameters.
The relation score RES_ω is obtained by integrating the learning results of the multi-scale feature mappings in the hierarchical relation learning network, so as to screen out features favorable to the relation. RES_ω takes values limited to [0, 1] and reflects the similarity between a sample from the query set S_qu and a class of the support set S_su. Sorting the C relation scores RES_ω, a higher RES_ω value means that the multi-scale feature mapping layer corresponding to the j-th sample x_j of the query set S_qu is more closely related to the multi-scale feature mapping layer of the ω-th category of the support set S_su, and the probability of belonging to the same class is greater, thereby realizing sample classification.
The multi-scale relation generation network f_ge lets feature layers of larger scale be learned by deeper networks, while the small-scale and large-scale learning networks share training parameters, reducing network complexity. The multi-scale relation generation network f_ge is built from hierarchical neural networks using a four-layer feedforward convolutional neural network, which comprises convolutional layers, fully connected layers, and a classifier; in actual use, the first fully connected layer uses ReLU as its activation function and the second fully connected layer uses sigmoid. A four-layer convolution structure handles the feature mapping pair of the largest scale, the number of layers of the convolution structure decreases as the scale decreases, and the low scales share network training parameters with the larger scales.
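One branch of such a relation network might look as follows: stride-2 convolution blocks followed by two fully connected layers, ReLU on the first and sigmoid on the second so that RES_ω lies in [0, 1]. Channel counts and the hidden width are assumptions, and the cross-scale parameter sharing described above is not modelled in this single-branch sketch.

```python
import torch

class RelationBranch(torch.nn.Module):
    """Relation branch for one scale: `depth` conv blocks (4 at the largest
    scale, fewer at smaller scales), then FC + ReLU and FC + sigmoid."""
    def __init__(self, in_ch, depth=4, hidden=8):
        super().__init__()
        blocks, ch = [], in_ch
        for _ in range(depth):
            blocks += [torch.nn.Conv2d(ch, 64, kernel_size=3, stride=2, padding=1),
                       torch.nn.BatchNorm2d(64),
                       torch.nn.ReLU(inplace=True)]
            ch = 64
        self.convs = torch.nn.Sequential(*blocks)
        self.pool = torch.nn.AdaptiveAvgPool2d(1)      # collapse spatial dims
        self.fc1 = torch.nn.Linear(64, hidden)
        self.fc2 = torch.nn.Linear(hidden, 1)

    def forward(self, pair):                           # pair: (B, in_ch, H, W)
        h = self.pool(self.convs(pair)).flatten(1)
        h = torch.relu(self.fc1(h))
        return torch.sigmoid(self.fc2(h))              # relation score in [0, 1]
```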
Step six, measuring sample similarity with the multi-scale metric learning model: sample similarity L_total = λ·L_IIRL + μ·L_avmse, where L_IIRL [formula given as an image in the original] is built from the prediction average of the query set S_qu samples belonging to the ω-th class and the prediction average of the query set S_qu samples not belonging to the ω-th class, a_tr denotes the interval between the prediction averages of same-class and different-class samples, L_avmse denotes the sample class prediction loss on the query set, and λ and μ denote the multi-scale weights.
During few-shot training, the sample classes of each round are uncertain and the training samples are few, so an ordinary mean square error loss function or cross entropy loss function may converge slowly or become unstable in the later stage of training. Since the metric learning model in this application is a hierarchical deep metric network, a new loss function needs to be designed so that the network can implement metric learning.
The method reconstructs a new loss function, obtained through loss function training of the multi-scale metric learning model: L_avmse uses the mean square error loss function, and L_IIRL denotes the loss term that the sample interval contributes to the overall model. On the basis of the mean square error loss function L_avmse, the multi-scale metric learning model adds the loss term L_IIRL contributed by the sample interval, forming the new loss function L_total, so as to adapt to training under the few-shot setting.
In addition, L_avmse can also be chosen as the cross entropy loss function, the mean square error loss function, or other metric distance functions commonly used by metric learning models.
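As a sketch of how L_total could be assembled: L_avmse below is the mean square error between relation scores and one-hot targets, as the text states, while the exact formula for L_IIRL appears only as an image in the original, so the margin hinge used here, penalising same-class and different-class prediction averages that are closer than a_tr, is an assumption consistent with the symbol definitions rather than the patent's verbatim formula.

```python
import torch

def total_loss(scores, targets, a_tr=0.5, lam=1.0, mu=1.0):
    """scores: (Q, C) relation scores for Q query samples over C classes;
    targets: (Q,) ground-truth class indices."""
    one_hot = torch.nn.functional.one_hot(targets, scores.shape[1]).float()
    l_avmse = torch.mean((scores - one_hot) ** 2)      # MSE term L_avmse

    same_avg = scores[one_hot.bool()].mean()           # same-class prediction average
    diff_avg = scores[~one_hot.bool()].mean()          # different-class average
    # Assumed hinge form: penalise averages closer than the interval a_tr.
    l_iirl = torch.clamp(a_tr - (same_avg - diff_avg), min=0.0)

    return lam * l_iirl + mu * l_avmse                 # L_total = lambda*L_IIRL + mu*L_avmse
```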
As shown in Table 1, the experimental results of the multi-scale metric learning model used in this application (denoted MSML in the table) are compared with those of other prior art learning models on the miniImageNet data set. Since few-shot image classification may provide only very few samples, the experiments include a 5-way 1-shot benchmark and a 5-way 5-shot benchmark. The multi-scale metric learning model (MSML) obtained good experimental results on the miniImageNet dataset, including 72% accuracy on the 5-way 1-shot benchmark and 84% accuracy on the 5-way 5-shot task, far higher than the experimental results of the other learning models.

TABLE 1. Comparison of experimental results of the multi-scale metric learning model and other learning models on the miniImageNet dataset

[Table 1 is rendered as an image in the original and is not reproduced here.]
In Table 1, Matching Network denotes the matching network model, Meta-learner LSTM denotes the meta-learning recurrent model, MAML denotes the model-agnostic meta-learning model, Complex Nets denotes a comparison network model, SNAIL denotes the attention-based meta-learning model, AdaResNet denotes the adaptive residual network model, Prototypical Nets denotes the prototypical network model, K-tuple Net denotes the K-tuple network model, MTL denotes the meta-transfer learning model, and LEO denotes the latent-space optimization model.
As shown in Table 2, the experimental results of the multi-scale metric learning model used in this application (MSML in the table) are compared with those of other prior art learning models on the tieredImageNet data set; the experiments include a 5-way 1-shot benchmark and a 5-way 5-shot benchmark. The multi-scale metric learning model obtained good experimental results on the tieredImageNet dataset, including 70% accuracy on the 5-way 1-shot benchmark and 83% accuracy on the 5-way 5-shot task, far higher than the experimental results of the other learning models.

TABLE 2. Comparison of experimental results of the multi-scale metric learning model and other learning models on the tieredImageNet dataset

[Table 2 is rendered as an image in the original and is not reproduced here.]
As shown in Table 3, the classification accuracy of the loss function used in this application is compared with other prior art loss functions under the same conditions. The proposed loss function achieves the highest classification accuracy on both the 5-way 1-shot task and the 5-way 5-shot task.

TABLE 3. Comparison of classification accuracy between the loss function of this application and other loss functions

[Table 3 is rendered as an image in the original and is not reproduced here.]
The above embodiments are only examples of the present invention and are not intended to limit it; all simple modifications, changes, and equivalent structural changes made to the above embodiments according to the technical spirit of the invention still fall within the protection scope of the technical solution of the invention.

Claims (5)

1. A small sample learning method based on multi-scale metric learning, characterized by comprising the following steps:

step one, establishing a data set: establish a support set S_su and a query set S_qu:

S_su = {(x_i, y_i)}, i = 1, 2, …, C×K
S_qu = {(x_j, y_j)}, j = 1, 2, …, C×N

where x_i denotes the i-th sample in the support set S_su, y_i denotes the label of sample x_i, x_j denotes the j-th sample in the query set S_qu, y_j denotes the label of sample x_j, C, K, and N are positive integers, C×K denotes the number of samples in the support set S_su, and C×N denotes the number of samples in the query set S_qu;

step two, generating multi-scale feature mapping layers:

step 201, generating support set multi-scale feature mapping prototypes: pass the samples in the support set S_su through the support set feature extraction module to obtain several support set multi-scale feature mapping prototypes;

step 202, generating query set multi-scale feature mapping layers: pass the samples in the query set S_qu through the query set feature extraction module to obtain several query set multi-scale feature mapping layers;

step three, transfer learning: the conversion module receives the support set multi-scale feature mapping prototypes and maps them into a feature space suited to the target knowledge, generating the support set multi-scale feature mapping layers;

step four, generating multi-scale feature mapping pairs: combine the support set multi-scale feature mapping layers and the query set multi-scale feature mapping layers correspondingly to obtain the multi-scale feature mapping pairs;

step five, computing the relation score RES_ω of each multi-scale feature mapping pair in the multi-scale relation generation network;

step six, measuring sample similarity with the multi-scale metric learning model: sample similarity L_total = λ·L_IIRL + μ·L_avmse, where L_IIRL [formula given as an image in the original] is built from the prediction average of the query set S_qu samples belonging to the ω-th class and the prediction average of the query set S_qu samples not belonging to the ω-th class, a_tr denotes the interval between the prediction averages of same-class and different-class samples, L_avmse denotes the sample class prediction loss on the query set, and λ and μ denote multi-scale weights.
2. The small sample learning method based on multi-scale metric learning according to claim 1, characterized in that: in step six, L_avmse is computed according to formulas that appear as images in the original document.
3. The small sample learning method based on multi-scale metric learning according to claim 1, characterized in that: the relation score in step five is computed by a formula [given as an image in the original], where RES_ω denotes the relation score, f_ge denotes the multi-scale relation generation network, and B denotes the multi-scale relation generation module.
4. The small sample learning method based on multi-scale metric learning according to claim 1, characterized in that: in the generation of the support set multi-scale feature mapping prototypes in step 201, feature extraction is performed on the C×K samples of the support set S_su to obtain the multi-scale feature mapping layer corresponding to each sample, and then the multi-scale feature mapping layers corresponding to the samples of each category are fused into that category's multi-scale feature mapping layer.
5. The small sample learning method based on multi-scale metric learning according to claim 1, characterized in that: the support set feature extraction module in step 201 may differ from the query set feature extraction module in step 202, but the two have the same feature layer scale.

