CN114782779B - Small sample image feature learning method and device based on feature distribution migration

Info

Publication number: CN114782779B
Application number: CN202210487387.2A
Authority: CN
Inventors: 李晓旭; 王湘阳; 刘俊; 金志宇; 任凯; 张文斌; 曾俊瑀; 李睿凡; 陶剑; 董洪飞
Original assignee: Lanzhou University of Technology
Current assignee: Lanzhou University of Technology
Priority date: 2022-05-06
Filing date: 2022-05-06
Publication date: 2023-06-02
Anticipated expiration: 2042-05-06
Also published as: CN114782779A

Abstract

The invention discloses a small-sample image feature learning method and device based on feature distribution migration. In the early stage, base class data are used with gradient descent to optimize the parameters of the embedding module and the distribution learning module; the later distribution correction stage requires no additional parameter settings. Each dimension of the feature representation is assumed to follow a Gaussian distribution, so that the mean and variance of the distribution can be transferred between similar categories. Because the statistics of these categories are better estimated when enough samples are available, the transfer reduces bias. The distribution correction model is then used to correct the distribution of the samples, so that new class samples are classified more accurately. The method can be paired with any classifier and feature extractor without additional parameters. It solves the problem of prototype deviation in small sample image classification, improves image classification performance, and has high practical value.

Description

Small sample image feature learning method and device based on feature distribution migration

Technical Field

The present invention relates to the technical field of image classification, and in particular to a small sample image feature learning method and device based on feature distribution migration.

Background Art

In recent years, with the development of computer technology, the information people browse has become increasingly rich. A large number of pictures are uploaded to the Internet every day, and their sheer volume makes manual classification impossible. On many large-sample image classification tasks, machine recognition performance has surpassed that of humans. However, when the number of samples is small, machine recognition still lags far behind human performance. There is therefore a pressing need for efficient and reliable image classification algorithms.

Few-shot classification belongs to the field of few-shot learning and usually involves two kinds of data with disjoint class spaces, namely base class data and new class data. Few-shot classification aims to use the knowledge learned from base class data, together with a small number of labeled samples (support samples) of the new class data, to learn classification rules and accurately predict the classes of unlabeled samples (query samples) in new class tasks.

The existing technology has several shortcomings. First, existing deep learning techniques are not suited to small sample classification tasks with very few labeled samples. How to learn highly discriminative feature representations from base class data and new class data with very few labeled samples is therefore a problem worth exploring. Second, because labeled samples are so few, the learned prototypes often have a large bias. How to improve small sample image classification performance by reducing prototype bias is therefore also a challenging task. Finally, errors arise both in the discriminability of base class features and in their transferability to new class data, which easily leads to inaccurate classification.

Summary of the Invention

To address the prototype bias problem in small sample image classification described above, the present invention proposes a small sample image feature learning method and device based on feature distribution migration. Each dimension of the feature representation is assumed to follow a Gaussian distribution, so that the mean and variance of the Gaussian distribution can be transferred between similar classes. Classes are determined by comparing the distance between samples, or between a sample and a distribution prototype. Because the statistics of these classes are better estimated when enough samples are available, image classification performance is improved.

In order to achieve the above object, the present invention provides the following technical solutions:

In one aspect, the present invention provides a small sample image feature learning method based on feature distribution migration, comprising the following steps:

S1, preprocess the data, where the data include a training set and a test set;

S2, pre-train the embedding module f_θ on the base class data to obtain a good feature space;

S3, input D_train into the embedding module f_θ to obtain sample feature maps, input them into the distribution learning module g_φ, minimize the loss function, and optimize the distribution learning module g_φ;

S4, divide the new class data into a support set and a query set, and pass the support set through the embedding module f_θ and the distribution learning module g_φ to calculate the distribution prototype (mean and variance) of each class;

S5, calculate the class probability of each class in the base class data, select the n classes with the highest probability, and merge the distributions of these n classes with the distribution of the current class to obtain the corrected distribution prototype (mean and variance) of each class;

S6, calculate the predicted probability of the new class query samples.

Furthermore, the preprocessing method of step S1 is:

S11, divide the data into two parts, D_train and D_test, whose class spaces are mutually exclusive; D_train is used to adjust parameters during training, and D_test is used as new class data to evaluate model performance;

S12, for a C-way K-shot classification task, randomly select C classes from D_train and randomly select M samples from each class, of which K samples serve as support samples S_i and the remaining M-K samples serve as query samples Q_i; S_i and Q_i constitute a task T_i. Tasks are constructed from D_test in the same way.
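The task construction of step S12 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `sample_task`, the dictionary layout of the data and the toy values C=5, K=1, M=16 are assumptions made for the example.

```python
import random

def sample_task(dataset, C, K, M, rng=random):
    """Sample one C-way K-shot task from a {class_label: [samples]} mapping."""
    classes = rng.sample(sorted(dataset), C)          # pick C distinct classes
    support, query = [], []
    for label in classes:
        chosen = rng.sample(dataset[label], M)        # M distinct samples of this class
        support += [(x, label) for x in chosen[:K]]   # K support samples
        query += [(x, label) for x in chosen[K:]]     # remaining M - K query samples
    return support, query

# Toy stand-in for D_train: 10 classes with 20 placeholder "images" each.
D_train = {c: [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
S_i, Q_i = sample_task(D_train, C=5, K=1, M=16, rng=random.Random(0))
```

Because the M samples of a class are drawn without replacement and then split, the support and query samples of a task never overlap.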

Furthermore, in step S2 an embedding module f_θ containing four convolutional blocks, comprising convolutional layers, pooling layers and non-linear activation functions, is used to extract features from the image. Each convolutional block consists of a convolution with a 3×3 kernel, a batch normalization, a ReLU non-linear layer and a 2×2 max pooling layer; the max pooling layers of the last two blocks are removed.

Furthermore, the distribution learning module g_φ in step S3 consists of two fully connected layers and is used to extract the distribution representation of the image features, yielding the mean and variance of each sample in a class.

Furthermore, in step S3 the loss function is minimized with the gradient descent algorithm, which repeatedly adjusts the weight ω and the bias b so that the value of the loss function decreases.

Furthermore, step S3 specifically includes:

S31, input D_train of the base classes into the embedding module f_θ, passing it through the convolution layers, pooling layers and activation functions in turn to obtain the sample feature map of each class;

S32, calculate the mean μ_c and variance σ_c of the sample feature maps of each class according to formulas (1) and (2), compare them with the spatial distribution of the pre-trained sample features, and adjust the parameters of the embedding module f_θ accordingly, where x_i denotes the feature vector of the i-th sample of base class C and n_c denotes the total number of samples in class C;
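Formulas (1) and (2) are not reproduced in this text. A minimal sketch, assuming they are the usual per-dimension sample mean and variance over the n_c feature vectors x_i of a class, is given below. Dividing by n_c rather than n_c - 1 is an assumption, and the function name `class_statistics` is hypothetical.

```python
def class_statistics(features):
    """Per-dimension mean and variance of one class's feature vectors."""
    n_c = len(features)
    dim = len(features[0])
    mu_c = [sum(x[d] for x in features) / n_c for d in range(dim)]
    # Population variance (divide by n_c); an unbiased estimate would use n_c - 1.
    sigma_c = [sum((x[d] - mu_c[d]) ** 2 for x in features) / n_c
               for d in range(dim)]
    return mu_c, sigma_c

# Two toy 2-dimensional feature vectors of one class.
feats = [[1.0, 2.0], [3.0, 6.0]]
mu_c, sigma_c = class_statistics(feats)
```

For the two toy vectors the class mean is [2.0, 4.0] and the per-dimension variance is [1.0, 4.0].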

S33, input the sample feature maps of each class into the distribution learning module g_φ to obtain the mean and variance of each sample, and use the Gaussian distribution formula (3) to calculate the class probability of each sample x_i, where Σ_c denotes the covariance matrix of the features of class C, calculated according to formula (4);

S34, minimize the cross-entropy loss of formula (5) to optimize the parameters of the distribution learning module g_φ, where y denotes a set of labeled feature vectors.
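Formulas (3) to (5) are not reproduced in this text. The sketch below illustrates the idea of steps S33 and S34 under stated assumptions: the covariance Σ_c is taken to be diagonal (the patent refers to a full covariance matrix), the class densities are normalised with a softmax, and the loss is the negative log-probability of the true class. All function names are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian N(mean, diag(var)) at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def class_probabilities(x, class_stats):
    """Softmax over per-class log densities, giving one probability per class."""
    logs = [log_gaussian(x, m, v) for m, v in class_stats]
    top = max(logs)                        # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])

# Two toy classes, each given as (mean, variance) per dimension.
stats = [([0.0, 0.0], [1.0, 1.0]), ([4.0, 4.0], [1.0, 1.0])]
p = class_probabilities([0.5, 0.2], stats)
loss = cross_entropy(p, 0)
```

Minimising this loss over the training tasks pushes the probability mass of each sample toward its labeled class, which is the role the text assigns to formula (5).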

Furthermore, step S4 is specifically as follows:

S41, each task consists of a support set and a query set;

S42, input the support set into the embedding module f_θ to obtain the mean μ_c and variance σ_c of the sample feature maps of each class;

S43, input the sample feature maps of each class into the distribution learning module g_φ to obtain the mean and variance of each sample;

S44, from the mean and variance of each sample, use formulas (6) and (7) to calculate the distribution prototype (mean and variance) of each class in the support set.

In formula (6), S_c denotes class C of the support set, x_i denotes a sample in class C of the support set, the mean of sample x_i is the per-sample mean obtained in S43, and μ_c denotes the mean of class C, that is, the distribution of class C. Formula (6) as a whole computes the weighted harmonic mean of class C, which represents the position of the distribution prototypes of the different classes in the model, in order to tighten intra-class relationships and satisfy the recognition gap;

The purpose of formula (7) is to compute the variance of class C, so as to eliminate class-irrelevant representations of individual data given sufficient class information and to reduce the variation in magnitude of the overall class information.
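Formula (6) is described as a weighted harmonic mean, but the formula itself, and in particular its weights, is not reproduced in this text. The sketch below therefore only shows the generic weighted harmonic mean applied per dimension to the per-sample means of one class. Equal weights are a placeholder assumption, and the function names are hypothetical.

```python
def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / v), defined for non-zero values."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))

def harmonic_prototype(sample_means, weights=None):
    """Per-dimension weighted harmonic mean of the per-sample means of one class."""
    if weights is None:
        weights = [1.0] * len(sample_means)   # equal weights: an assumption
    dim = len(sample_means[0])
    return [weighted_harmonic_mean([m[d] for m in sample_means], weights)
            for d in range(dim)]

# Two toy per-sample means of one support class.
proto = harmonic_prototype([[1.0, 2.0], [3.0, 6.0]])
```

The harmonic mean is undefined when a value is zero, so features that can be exactly zero (for example after a ReLU) would need an offset or another safeguard in practice.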

Furthermore, step S5 is specifically as follows:

S51, calculate the class probability of each class in the base class sample data by a formula in which the mean and variance of class C in the base class sample data follow a Gaussian distribution and the mean of class C of the support set is taken as input; S_d denotes the set of distances obtained by comparing the distribution of class C of the support set with the distribution of class C in the base class sample data;

S52, select the n classes with the highest probability and merge the distributions of these n classes with the distribution of the current class, where topn(·) denotes an operator that selects the top elements from the input distance set S_d, and S_N stores the n base class sample data nearest to the feature vector;
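The selection and merging of steps S51 and S52 can be sketched as below. The scores in `S_d`, the class names and the (mean, variance) layout of the statistics are made-up illustration values, and both function names are hypothetical; the patent's own formulas for these steps are not reproduced in this text.

```python
def top_n_classes(scores, n):
    """Return the labels of the n base classes with the largest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

def merge_with_support(base_stats, selected, support_stat):
    """Collect the statistics of the selected base classes plus the current class."""
    return [base_stats[label] for label in selected] + [support_stat]

# Hypothetical class probabilities of four base classes for one support class.
S_d = {"cat": 0.50, "dog": 0.30, "car": 0.05, "fox": 0.15}
# Hypothetical (mean, variance) statistics of the base classes.
base_stats = {"cat": ([1.0], [0.2]), "dog": ([2.0], [0.3]),
              "car": ([9.0], [0.1]), "fox": ([1.5], [0.4])}

S_N = top_n_classes(S_d, n=2)
merged = merge_with_support(base_stats, S_N, ([1.2], [0.5]))
```

The merged list is what step S53 would pass on to formulas (6) and (7) to obtain the corrected prototype.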

S53, input the merged classes into formulas (6) and (7) to obtain the corrected distribution prototype (mean and variance) of each class; as before, S_c in formula (6) denotes class C of the support set and x_i denotes a sample in that class.

Furthermore, step S6 is specifically as follows:

S61, input the samples of the query set of the new class data into the embedding module f_θ to obtain the mean μ_c and variance σ_c of the sample feature maps of each class;

S62, input the sample feature maps of each class into the distribution learning module g_φ to obtain the mean and variance of each sample;

S63, substitute the mean and variance of each sample into formula (3) to calculate the predicted probability of the new class query sample, and input it into the metric module, which outputs the corresponding class label.
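A minimal sketch of the final labeling step, assuming the metric module simply returns the class whose corrected prototype gives the query feature the highest Gaussian density. The diagonal covariance, the toy prototypes and the function names are assumptions for illustration only.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian N(mean, diag(var)) at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def predict(x, prototypes):
    """Return the label whose prototype gives x the highest log density."""
    return max(prototypes, key=lambda label: log_gaussian(x, *prototypes[label]))

# Hypothetical corrected prototypes: label -> (mean, variance).
prototypes = {"A": ([0.0, 0.0], [1.0, 1.0]), "B": ([3.0, 3.0], [0.5, 0.5])}
label = predict([2.6, 3.1], prototypes)
```

Taking the arg max of the log density gives the same label as taking the arg max of the normalised class probability, so no explicit normalisation is needed for prediction.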

In another aspect, the present invention also provides a task-adaptive metric learning device for small sample images, which is used to implement any of the above methods and includes the following modules:

an embedding module, used to perform feature extraction on image samples and construct a feature space, where the image samples include base class samples, new class support samples and query samples;

a distribution learning module, used to extract the distribution representation of image features and obtain the mean and variance of each sample in a class;

a distribution correction module, used to correct the distribution of new class samples with the distribution of base class samples and to build an image distribution correction model;

a metric module, used to classify the new class query set samples with the optimized distribution of the base class samples and obtain the class labels.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention establishes a small sample image feature learning method and device based on feature distribution migration, which can be paired with any classifier and feature extractor without additional parameters. It solves the prototype bias problem in small sample image classification, improves image classification performance, and has high practical value.

In the early stage, the device of the present invention uses base class data with gradient descent to optimize the parameters of the embedding module and the distribution learning module; the later distribution correction requires no additional parameter settings. In addition, each dimension of the feature representation is assumed to follow a Gaussian distribution, so that the mean and variance of the Gaussian distribution can be transferred between similar classes, which reduces bias because the statistics of these classes are better estimated when enough samples are available. The distribution correction model is then used to correct the distribution of the samples, so that new class samples are classified more accurately.

Brief Description of the Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them.

FIG. 1 is a flow chart of the small sample image feature learning method based on feature distribution migration provided in an embodiment of the present invention.

FIG. 2 is a structural diagram of the feature learning network, covering transfer learning and distribution migration, of the small sample image feature learning model based on feature distribution migration provided in an embodiment of the present invention.

FIG. 3 is a flow chart of the distribution correction module provided in an embodiment of the present invention.

FIG. 4 is a schematic diagram of the functional modules of the small sample image feature learning device based on feature distribution migration provided in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

According to one aspect disclosed herein, a small sample image feature learning method based on feature distribution migration is provided, as shown in FIG. 1, including the following steps:

Specifically, the preprocessing method of step S1 includes:

S11, divide the data into two parts, D_train and D_test, whose class spaces are mutually exclusive; D_train is used to adjust parameters during training, and D_test is used as new class data to evaluate model performance.

Furthermore, in step S2 an embedding module f_θ containing four convolutional blocks, comprising convolutional layers, pooling layers and non-linear activation functions, is used to extract features from the image. Each convolutional block consists of a convolution with a 3×3 kernel, a batch normalization, a ReLU non-linear layer and a 2×2 max pooling layer; the max pooling layers of the last two blocks are removed. For example, for an 84×84×3 RGB image, each block uses a 3×3 convolution with 64 filters.
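The size of the feature map produced by this four-block embedding module can be checked with a short calculation. The text does not state the convolution stride or padding; the sketch assumes stride 1 and padding 1 by default, and the function names are hypothetical.

```python
def conv_block_size(size, kernel=3, padding=1, pool=True):
    """Spatial size after one 3x3 convolution (stride 1) and optional 2x2 max pooling."""
    size = size + 2 * padding - kernel + 1    # convolution with stride 1
    return size // 2 if pool else size        # 2x2 max pooling halves the size

def embedding_output_shape(height=84, width=84, filters=64, padding=1):
    """Output (channels, height, width) of the four-block embedding module."""
    pools = [True, True, False, False]        # last two blocks have no max pooling
    for pool in pools:
        height = conv_block_size(height, padding=padding, pool=pool)
        width = conv_block_size(width, padding=padding, pool=pool)
    return filters, height, width

shape = embedding_output_shape()
```

Under these assumptions an 84×84 input yields a 64×21×21 feature map; with no padding the same stack would yield 64×15×15 instead.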

The distribution learning module g_φ consists of two fully connected layers and is used to extract the distribution representation of the image features, yielding the mean and variance of each sample in a class.

The loss function is minimized with the gradient descent algorithm, which repeatedly adjusts the weight ω and the bias b so that the value of the loss function decreases. Stochastic gradient descent or batch gradient descent may be used instead.
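The update of a weight and a bias by gradient descent can be illustrated on a toy problem. The sketch fits a one-dimensional linear model with a mean squared error loss, which stands in for the patent's loss function; the learning rate, step count and data are arbitrary illustration values.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n   # dL/dw
        grad_b = 2 * sum(errors) / n                              # dL/db
        w -= lr * grad_w      # step against the gradient
        b -= lr * grad_b
    return w, b

# Data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
w, b = gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

A stochastic variant would compute the same gradients on a randomly chosen subset of the samples at each step rather than on the full batch.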

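Specifically, step S3 includes sub-steps S31 to S34 as set out above.

S4, pass the support set through the embedding module f_θ and the distribution learning module g_φ to calculate the distribution prototype (mean and variance) of each class.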

Specifically, step S4 includes:

S41, divide the new class data into a support set and a query set; each task consists of a support set and a query set;

S42, input the support set into the embedding module f_θ and the distribution learning module g_φ to obtain the means and variances, as set out above;

S44, from the mean and variance of each sample, use formulas (6) and (7) to calculate the distribution prototype (mean and variance) of each class in the support set, where S_c in formula (6) denotes class C of the support set and x_i denotes a sample in that class.

Specifically, step S5 includes sub-steps S51 to S53 as set out above, in which the mean and variance of class C in the base class sample data follow a Gaussian distribution, S_d is the distance set computed from the mean of class C of the support set, and S_c and x_i in formula (6) denote class C of the support set and a sample in that class.

S6, calculate the predicted probability of the new class query samples and output the class labels.

Specifically, step S6 includes sub-steps S61 to S63 as set out above.

The small sample image feature learning method of the present invention, based on feature distribution migration, is built on top of off-the-shelf pre-trained feature extractors and classification models, and can be paired with any classifier and feature extractor without additional parameters. The device uses a distribution learning module and assumes that each dimension of the feature representation follows a Gaussian distribution, so that the mean and variance of the Gaussian distribution can be transferred between similar classes. From the mean and variance of each sample, the weighted harmonic mean of the corresponding class is calculated to represent the position of the distribution prototype of each class in the model; samples are then classified by comparing their distance to each class representative. The aim is to update and classify the query samples by fusing the features of the samples connected to a query sample with the similarity of the distances between them.

The specific implementation of the proposed small sample image feature learning method and model based on feature distribution migration has been described above with reference to the accompanying drawings. From this description, those skilled in the art can clearly understand how to implement the method and device.

It should be noted that implementations not described in the drawings or in the body of the specification take forms known to those of ordinary skill in the art and are not described in detail. In addition, the above definitions of elements and methods are not limited to the specific structures, shapes or manners mentioned in the examples, and those of ordinary skill in the art may simply modify or replace them.

In addition, unless specifically described or required to occur in sequence, the order of the above steps is not limited to that listed and may be changed or rearranged according to the desired design. The above examples may be mixed and matched with each other or with other examples based on design and reliability considerations; that is, technical features of different embodiments may be freely combined to form further embodiments. The algorithms and displays provided herein are not inherently tied to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Furthermore, the present disclosure is not directed to any particular programming language. It should be understood that the content described herein may be implemented in a variety of programming languages, and that any description of a specific language above is given to disclose the best mode of the disclosure.

Similarly, it should be understood that, to keep this document concise and to aid understanding of one or more of the disclosed aspects, various features are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This manner of disclosure should not, however, be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the disclosed aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into it, with each claim standing on its own as a separate embodiment of the present disclosure.

The embodiments described above are only specific implementations of the present application, intended to illustrate its technical solutions rather than to limit them, and the scope of protection of the present application is not limited to them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of their technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and all fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be determined by the scope of protection of the claims.

Claims

1. A small sample image feature learning method based on feature distribution migration, characterized by comprising the following steps:

S1, preprocessing data, wherein the data comprise a training set and a test set;

S2, pre-training an embedding module f_θ using base class data to obtain a feature space;

S3, inputting D_train into the embedding module f_θ to obtain sample feature maps, inputting the sample feature maps into a distribution learning module g_φ, minimizing a loss function, and optimizing the distribution learning module g_φ;

S4, dividing new class data into a support set and a query set, and passing the support set through the embedding module f_θ and the distribution learning module g_φ to calculate a distribution prototype (mean and variance) of each class;

S5, calculating the class probability of each class in the base class data, selecting the n classes with the highest probability, and merging the distributions of the n classes with the distribution of the current class to obtain a corrected distribution prototype (mean and variance) of each class;

wherein step S5 specifically comprises:

S51, calculating the class probability of each class in the base class sample data by a formula in which the mean and variance of class C in the base class sample data follow a Gaussian distribution and the mean of class C of the support set is taken as input, S_d denoting the set of distances obtained by comparing the distribution of class C of the support set with the distribution of class C in the base class sample data;

S52, selecting the n classes with the highest probability and merging the distributions of the n classes with the distribution of the current class, wherein topn(·) denotes an operator that selects the top elements from the input distance set S_d, and S_N stores the n base class sample data nearest to the feature vector;

S53, inputting the merged classes into formulas (6) and (7) to obtain the corrected distribution prototype of each class;

in formula (6), S_c denotes class C of the support set, x_i denotes a sample in class C of the support set, the mean of sample x_i is its per-sample mean, and μ_c denotes the mean of class C, that is, the distribution of class C; formula (6) as a whole computes a weighted harmonic mean of class C to represent the positions of the distribution prototypes of the different classes in the model, for tightening intra-class relationships and satisfying the recognition gap;

the purpose of formula (7) is to compute the variance of class C, so as to eliminate class-independent representations of individual data given sufficient class information and to reduce the magnitude variation of the overall class information;

S6, calculating the prediction probability of the new class query samples.

2. The small sample image feature learning method based on feature distribution migration of claim 1, wherein the preprocessing method of step S1 is as follows:

S11, the data set is divided into two parts, D_train and D_test, whose class spaces are mutually exclusive; D_train is used for adjusting parameters during training, and D_test is used as new data for evaluating the performance of the model;

S12, for a C-way K-shot classification task, C classes are randomly selected from D_train and M samples are randomly selected from each class, of which K samples are used as support samples S_i and the remaining M-K samples are used as query samples Q_i; S_i and Q_i form a task T_i; similarly, tasks are constructed from D_test.
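The task construction in step S12 can be sketched as below. The function name `sample_episode`, the dictionary layout of the data and the toy file names are hypothetical; only the C-way K-shot split into K support samples and M-K query samples per class comes from the claim.

```python
import random


def sample_episode(data_by_class, c_way, k_shot, m_per_class, rng=None):
    """Build one task T_i = (S_i, Q_i) from a mapping {label: [samples]}."""
    rng = rng or random.Random()
    # Randomly select C classes, then M samples from each selected class.
    classes = rng.sample(sorted(data_by_class), c_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(data_by_class[label], m_per_class)
        # The first K picked samples form the support set, the rest the query set.
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query


# Hypothetical toy D_train: 10 classes with 20 samples each.
d_train = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
support, query = sample_episode(
    d_train, c_way=5, k_shot=1, m_per_class=16, rng=random.Random(0)
)
```

The same function could be applied to D_test to build evaluation tasks, since the two parts have disjoint class spaces.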

3. The method for learning small sample image features based on feature distribution migration according to claim 1, wherein in step S2 an embedding module f_θ comprising four convolution blocks is used to extract features from the images; the module contains convolutional layers, pooling layers and nonlinear activation functions; each convolution block uses a convolution kernel with a window size of 3×3, batch normalization, a ReLU nonlinear layer and a 2×2 max pooling layer; and the max pooling layers of the last two blocks are removed.
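The effect of removing the last two pooling layers on the spatial size of the feature map can be illustrated with a small calculation. The claim does not state the padding, stride or input resolution, so a padding of 1, a stride of 1 and an 84×84 input are assumptions made only for this sketch, and the function names are hypothetical.

```python
def conv_block_output_size(size, kernel=3, padding=1, pool=2, use_pool=True):
    """Side length after one block: stride-1 convolution, then optional pooling."""
    size = size + 2 * padding - kernel + 1  # 3x3 kernel with padding 1 keeps the size
    if use_pool:
        size //= pool  # 2x2 max pooling halves the side length
    return size


def embedding_output_size(size, pooled_blocks=(True, True, False, False)):
    """Side length after four blocks; by default only the first two blocks pool."""
    for use_pool in pooled_blocks:
        size = conv_block_output_size(size, use_pool=use_pool)
    return size


side_without_last_pools = embedding_output_size(84)
side_with_all_pools = embedding_output_size(84, pooled_blocks=(True,) * 4)
```

Under these assumptions the feature map keeps a 21×21 spatial layout instead of shrinking to 5×5, which leaves more spatial detail for the distribution learning module.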

4. The small sample image feature learning method based on feature distribution migration of claim 1, wherein the feature distribution learning module g_φ in step S3 consists of two fully connected layers and is used for extracting a distribution representation of the image features so as to obtain the mean and variance of each sample in the class.

5. The small sample image feature learning method based on feature distribution migration according to claim 1, wherein the loss function in step S3 is minimized using a gradient descent algorithm, in which the weight ω and the bias b are adjusted continuously so that the value of the loss function decreases progressively.
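The way gradient descent lowers a loss by repeatedly adjusting a weight and a bias can be shown on a deliberately tiny model. The one-dimensional logistic model, the toy data and the learning rate below are assumptions for illustration; the claimed modules have many more parameters.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(w, b, xs, ys):
    """Mean binary cross-entropy of the model sigmoid(w * x + b)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def gradient_step(w, b, xs, ys, lr):
    """One gradient descent update of the weight w and the bias b."""
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / len(xs)
    return w - lr * grad_w, b - lr * grad_b


# Hypothetical toy data: negative inputs are class 0, positive inputs are class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b = 0.0, 0.0
losses = [cross_entropy(w, b, xs, ys)]
for _ in range(50):
    w, b = gradient_step(w, b, xs, ys, lr=0.5)
    losses.append(cross_entropy(w, b, xs, ys))
```

Each update moves ω and b against the gradient of the loss, so the recorded loss values shrink from step to step.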

6. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S3 specifically includes:

S31, inputting the base class data D_train into the embedding module f_θ, where it passes sequentially through the convolutional layers, pooling layers and activation functions to obtain the sample feature maps of each class;

S32, calculating the mean μ_c and variance σ_c of the sample feature maps of each class and comparing them with the spatial distribution of the features of the pre-training samples so as to adjust the parameters of the embedding module f_θ, the mean μ_c and variance σ_c of each class being calculated according to formulas (1) and (2):

where x_i denotes the feature vector of the i-th sample of class C in the base classes, and n_c denotes the total number of samples in class C;

S33, inputting the sample feature maps of each class into the distribution learning module g_φ to obtain the mean μ_xi and the variance of each sample, and calculating the class probability of each sample x_i using the Gaussian distribution formula (3):

where σ_c denotes the covariance matrix of the features of class C, calculated as shown in formula (4):

S34, minimizing the loss function using the cross-entropy formula so as to optimize the parameters of the distribution learning module g_φ, as shown in formula (5):

where y denotes the set of labeled feature vectors.
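Steps S32 to S34 can be sketched end to end on toy features. Formulas (1) to (5) are not reproduced in this text, so the sketch makes several assumptions: the variance divides by n_c, each feature dimension is treated as an independent Gaussian (a diagonal simplification of the covariance matrix), classes have equal priors, and the class probability is obtained by normalizing the Gaussian densities. All function names and data are hypothetical.

```python
import math


def class_mean_and_variance(features):
    """Per-dimension mean and variance of one class (variance divides by n_c)."""
    n_c = len(features)
    dims = list(zip(*features))
    mean = [sum(d) / n_c for d in dims]
    var = [sum((v - m) ** 2 for v in d) / n_c for d, m in zip(dims, mean)]
    return mean, var


def gaussian_log_density(x, mean, var, eps=1e-6):
    """Log-density of x under independent per-dimension Gaussians."""
    return sum(
        -0.5 * (math.log(2 * math.pi * (s + eps)) + (v - m) ** 2 / (s + eps))
        for v, m, s in zip(x, mean, var)
    )


def class_probabilities(x, stats):
    """Normalize the Gaussian densities over classes (equal priors assumed)."""
    logs = {c: gaussian_log_density(x, m, s) for c, (m, s) in stats.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(value - top) for c, value in logs.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def cross_entropy(probs, label):
    """Cross-entropy loss of one sample given its true label."""
    return -math.log(max(probs[label], 1e-12))


# Hypothetical toy base classes with two-dimensional features.
base = {
    "A": [[0.0, 0.1], [0.2, -0.1], [-0.2, 0.0]],
    "B": [[3.0, 3.1], [2.8, 2.9], [3.2, 3.0]],
}
stats = {c: class_mean_and_variance(f) for c, f in base.items()}
probs = class_probabilities([0.1, 0.0], stats)
loss = cross_entropy(probs, "A")
```

A sample lying near the mean of class A receives almost all of the probability mass for that class, so its cross-entropy loss against the label A is close to zero.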

7. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S4 specifically comprises the following steps:

S41, dividing the new class data into a support set and a query set, each task consisting of a support set and a query set;

S42, inputting the support set into the embedding module f_θ to obtain the mean μ_c and variance σ_c of the sample feature maps of each class;

S43, inputting the sample feature maps of each class into the distribution learning module g_φ to obtain the mean and variance of each sample;

S44, calculating, from the mean and variance of each sample, the distribution prototype of each class in the support set, namely its mean and variance, using formulas (6) and (7):

In formula (6), S_c denotes class C of the support set and x_i denotes a sample in class C of the support set;

the purpose of formula (7) is to solve for the variance of class C so as to eliminate, given sufficient class information, the class-independent representations of individual data and thereby reduce the variation in magnitude of the overall class information.
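One way in which per-sample means and variances could be aggregated into a class prototype is sketched below. Formulas (6) and (7) are not reproduced in this text, so the sketch substitutes standard inverse-variance weighting, which is an assumption and should not be read as the claimed formulas; the function name `class_prototype` and the toy numbers are hypothetical.

```python
def class_prototype(sample_means, sample_vars, eps=1e-6):
    """Aggregate per-sample (mean, variance) pairs into one class prototype.

    Uses inverse-variance weighting per dimension: samples with a small
    variance contribute more to the prototype mean, and the prototype
    variance is the reciprocal of the summed precisions.
    """
    dims = len(sample_means[0])
    proto_mean, proto_var = [], []
    for d in range(dims):
        precisions = [1.0 / (v[d] + eps) for v in sample_vars]
        total = sum(precisions)
        weighted = sum(p * m[d] for p, m in zip(precisions, sample_means))
        proto_mean.append(weighted / total)
        proto_var.append(1.0 / total)
    return proto_mean, proto_var


# Hypothetical outputs of the distribution learning module for two support samples.
means = [[1.0, 2.0], [3.0, 2.0]]
variances = [[1.0, 0.5], [1.0, 0.5]]
mu_c, var_c = class_prototype(means, variances)

# A confident sample (small variance) dominates an uncertain one.
skewed_mean, _ = class_prototype([[0.0], [10.0]], [[1.0], [9.0]])
```

With equal variances the prototype mean is the ordinary average of the sample means, while unequal variances pull it toward the more certain sample.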

8. The small sample image feature learning method based on feature distribution migration of claim 1, wherein step S6 specifically comprises:

S61, inputting the sample information of the query set of the new class data into the embedding module f_θ to obtain the mean μ_c and variance σ_c of the sample feature maps of each class;

S62, inputting the sample feature maps of each class into the distribution learning module g_φ to obtain the mean and variance of each sample;

S63, inputting the mean and variance of each sample into formula (3), calculating the prediction probability of the new class query samples, inputting the prediction probability into the measurement module, and outputting the corresponding class label:

9. A task adaptive metric learning device for small sample images, characterized in that it is configured to implement the task adaptive metric learning method for small sample images according to any one of claims 1-8, and comprises the following modules:

an embedding module, used for extracting features from the image samples and constructing a feature space, wherein the image samples comprise base class samples, new class support samples and query samples;

a distribution learning module, used for extracting a distribution representation of the image features and obtaining the mean and variance of each sample in the class;

a distribution correction module, used for correcting the distribution of the new class samples using the distribution of the base class samples and constructing an image distribution correction model;

and a measurement module, used for classifying the new class query set samples using the optimized distribution of the base class samples and obtaining class labels.