CN116994076B - Small sample image recognition method based on double-branch mutual learning feature generation - Google Patents


Info

Publication number
CN116994076B
CN116994076B (application CN202311264423.XA)
Authority
CN
China
Prior art keywords
image
generation module
feature
branch
global
Prior art date
Legal status
Active
Application number
CN202311264423.XA
Other languages
Chinese (zh)
Other versions
CN116994076A (en)
Inventor
魏志强
王矶法
黄磊
Current Assignee
Ocean University of China
Original Assignee
Ocean University of China
Priority date
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202311264423.XA
Publication of CN116994076A
Application granted
Publication of CN116994076B
Legal status: Active
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 — using classification, e.g. of video objects
    • G06V 10/74 — Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 — Proximity, similarity or dissimilarity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small-sample image recognition method based on dual-branch mutual-learning feature generation, relating to the technical field of small-sample image recognition. The method comprises the following steps: acquiring a set of small-sample images to be recognized to form a query set; feeding each image of the query set into the first feature generation module of a pre-constructed global branch to generate a first semantic feature of the image; feeding each image of the query set into the second feature generation module of a pre-constructed local branch to generate a second semantic feature of the image; adding the first and second semantic features to determine a third semantic feature of each image in the query set; and computing the similarity between the third semantic feature of each query image and the prototypes of the classes in the support set to determine the image class of each query image. By mining the semantic relation between the local features and the global features of a sample, the method achieves accurate recognition of small-sample images.

Description

A Small-Sample Image Recognition Method Based on Dual-Branch Mutual-Learning Feature Generation

Technical Field

The present invention relates to the technical field of small-sample image recognition and, more specifically, to a small-sample image recognition method based on dual-branch mutual-learning feature generation.

Background

In recent years, powered by large-scale datasets and massive computing resources, artificial intelligence algorithms represented by deep learning have achieved great success in image-recognition-related fields such as face recognition, autonomous driving, and robotics. Deep learning, however, relies on large amounts of labeled data, which are often difficult to obtain in practice: some data raise privacy concerns (for example, face data), some target objects are inherently scarce (for example, rare protected animals), and data annotation usually consumes substantial manpower and material resources. These obstacles have hindered the development of deep learning in certain image-recognition fields. By contrast, humans can recognize a new object from only a handful of samples. Inspired by this rapid learning ability, researchers hope that after a machine learning model has learned from abundant data of certain base classes, it can quickly learn a new class from only a few samples. The resulting small-sample (few-shot) image recognition problem has gradually become a research hotspot.

The core difficulty of small-sample image recognition is that the sample size is too small, which leads to low sample diversity. When data are limited, sample diversity can be improved through data augmentation, that is, expanding the original small-sample dataset or enhancing its features with the help of auxiliary data or auxiliary information. Data expansion adds new unlabeled data or synthesized labeled data to the original dataset; feature enhancement adds classification-friendly features to the feature space of the original samples, increasing their feature diversity.

When generating new features, existing feature-enhancement methods rely mainly on the global semantic features of samples: they generate new global semantic features by analyzing the similarities or differences of global semantic features across samples. Although this increases the size and diversity of the dataset, it ignores the local feature information of each sample. In the small-sample setting, where data are scarce, every sample may carry unique or important local feature information that is highly useful for distinguishing classes or tasks. Generating new features from global semantic features alone may lose or confuse this local information, yielding new features of low quality or accuracy.

Summary of the Invention

In view of the shortcomings of the prior art, the present invention provides a small-sample image recognition method based on dual-branch mutual-learning feature generation.

According to one aspect of the present invention, a small-sample image recognition method based on dual-branch mutual-learning feature generation is provided, comprising:

acquiring a set of small-sample images to be recognized to form a query set to be recognized;

feeding each image of the query set into the first feature generation module of a pre-constructed global branch to generate a first semantic feature of the image;

feeding each image of the query set into the second feature generation module of a pre-constructed local branch to generate a second semantic feature of the image;

adding the first semantic feature and the second semantic feature of each image in the query set to determine a third semantic feature of the image;

computing the similarity between the third semantic feature of each image in the query set and the prototypes of the classes in the support set, and determining the image class of each image in the query set.
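The classification pipeline described above (fuse the two branch features by addition, then compare with the class prototypes) can be sketched as follows. This is a minimal illustration in which the cosine similarity measure, the toy feature vectors, and the function name `classify_query` are assumptions, not details from the patent:

```python
import numpy as np

def classify_query(first_feat, second_feat, prototypes):
    """Fuse the first (global-branch) and second (local-branch) semantic
    features by addition, then assign the class of the most similar
    prototype. `prototypes` is a (num_classes, dim) array."""
    third_feat = first_feat + second_feat          # third semantic feature
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    q = third_feat / np.linalg.norm(third_feat)
    sims = p @ q                                   # cosine similarity per class
    return int(np.argmax(sims)), sims

# Toy example: 3 classes with 4-dimensional features.
protos = np.eye(3, 4)
label, sims = classify_query(np.array([0.9, 0.1, 0.0, 0.0]),
                             np.array([0.8, 0.0, 0.1, 0.0]),
                             protos)
```

Any similarity measure with the same interface would serve; the text only requires that the query be assigned to the class whose prototype is most similar.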

Optionally, the class prototype of each class in the support set is constructed as follows:

all images in the support set are fed in turn into the first feature generation module of the global branch, which outputs a fourth semantic feature for each support image;

all images in the support set are fed in turn into the second feature generation module of the local branch, which outputs a fifth semantic feature for each support image;

the fourth semantic features and the fifth semantic features of all images of each class in the support set are added and averaged to determine the class prototype of that class.
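The prototype construction above can be sketched with NumPy. The array shapes and the helper name `class_prototypes` are illustrative assumptions:

```python
import numpy as np

def class_prototypes(fourth_feats, fifth_feats, labels, num_classes):
    """Add the global-branch (fourth) and local-branch (fifth) semantic
    features of every support image, then average within each class.
    fourth_feats / fifth_feats: (num_images, dim); labels: (num_images,)."""
    fused = fourth_feats + fifth_feats
    return np.stack([fused[labels == c].mean(axis=0)
                     for c in range(num_classes)])

# Toy support set: 2 classes x 2 images, 3-dimensional features.
f4 = np.array([[1., 0., 0.], [3., 0., 0.], [0., 1., 0.], [0., 3., 0.]])
f5 = np.zeros_like(f4)
protos = class_prototypes(f4, f5, np.array([0, 0, 1, 1]), 2)
```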

Optionally, the first feature generation module of the global branch and the second feature generation module of the local branch are trained as follows:

a few-shot recognition task is constructed from a small-sample image training set, the task comprising N classes with M sample images per class;

a feature extraction network extracts features from every sample image of the task to determine the global features and local features of each sample image;

the first feature generation module of the global branch is trained with the global features of every sample image of the task;

the second feature generation module of the local branch is trained with the global features and local features of every sample image of the task;

the global branch and the local branch learn from each other's training information, jointly training the first feature generation module and the second feature generation module;

the first feature generation module and the second feature generation module are optimized according to a preset total training loss function.

Optionally, training the first feature generation module of the global branch with the global features of every sample image of the task comprises:

masking, for each class, the global feature of one of its M sample images;

replacing the masked global feature of each class with a learnable vector;

training the first feature generation module of the global branch with the learnable vector substituted for each class and the global features preserved by the mask;

optimizing the first feature generation module according to a preset global-branch loss function, where the global-branch loss function comprises a global prediction loss and a global classification loss.
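The mask-and-replace step for the global branch can be illustrated as below. The choice of masked index and the zero initialization of the learnable vector are toy assumptions; in the actual method the vector would be a trained parameter:

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_and_replace(global_feats, mask_idx, learnable_vec):
    """For one class: hide the global feature of the image at `mask_idx`
    by substituting a shared learnable vector. The first feature
    generation module is then trained to recover the hidden feature from
    the remaining M-1 real features (prediction loss) while keeping its
    output classifiable (classification loss)."""
    feats = global_feats.copy()
    target = feats[mask_idx].copy()      # ground truth for the prediction loss
    feats[mask_idx] = learnable_vec      # masked slot now holds the learnable vector
    return feats, target

# M = 4 images of one class, 5-dimensional global features.
M, dim = 4, 5
g = rng.normal(size=(M, dim))
lv = np.zeros(dim)                       # learnable vector (toy initialization)
masked, target = mask_and_replace(g, 2, lv)
```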

Optionally, training the second feature generation module of the local branch with the global features and local features of every sample image of the task comprises:

selecting the local features of one sample image from each class;

training the second feature generation module with the local features selected for each class and M preset learnable vectors;

optimizing the second feature generation module according to a preset local-branch loss function, where the local-branch loss function comprises a local prediction loss and a local classification loss.

Optionally, the method further comprises: computing the KL divergence as the mutual learning loss with which the training information of the global branch and the local branch is learned mutually.
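A minimal sketch of the KL-divergence mutual learning loss between the class distributions predicted by the two branches; the symmetric (two-way) form is an assumption, since the text only states that KL divergence is computed:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL(p || q) for discrete probability distributions."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def mutual_learning_loss(global_probs, local_probs):
    """Symmetric KL between the global branch's and the local branch's
    predicted class distributions, so each branch distills knowledge
    from the other."""
    return kl_div(global_probs, local_probs) + kl_div(local_probs, global_probs)

# Toy class distributions from the two branches over 3 classes.
p_g = np.array([0.7, 0.2, 0.1])
p_l = np.array([0.6, 0.3, 0.1])
loss = mutual_learning_loss(p_g, p_l)
```

The loss vanishes only when the two branches agree exactly, which is what drives the implicit knowledge transfer between them.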

Optionally, the total training loss function is the sum of the global-branch loss function, the local-branch loss function, and the mutual learning loss function.

Optionally, computing the similarity between the third semantic feature of each query image and the class prototypes of the classes in the support set and determining the image class of each query image comprises:

computing the similarity between the third semantic feature of each query image and the class prototypes of the classes in the support set, and determining the probability that each query image belongs to each class of the support set;

taking, for each query image, the class with the largest probability as the image class of that image.

According to another aspect of the present invention, a small-sample image recognition device based on dual-branch mutual-learning feature generation is provided, comprising:

an acquisition module for acquiring a set of small-sample images to be recognized to form a query set to be recognized;

a first generation module for feeding each image of the query set into the first feature generation module of a pre-constructed global branch to generate a first semantic feature of the image;

a second generation module for feeding each image of the query set into the second feature generation module of a pre-constructed local branch to generate a second semantic feature of the image;

a first determination module for adding the first semantic feature and the second semantic feature of each image in the query set to determine a third semantic feature of the image;

a second determination module for computing the similarity between the third semantic feature of each query image and the class prototypes in the support set, and determining the image class of each query image.

According to yet another aspect of the present invention, a computer-readable storage medium is provided; the storage medium stores a computer program for executing the method described in any of the above aspects of the present invention.

According to yet another aspect of the present invention, an electronic device is provided, comprising: a processor; and a memory for storing instructions executable by the processor; the processor reads the executable instructions from the memory and executes them to implement the method described in any of the above aspects of the present invention.

Thus, the present invention proposes a small-sample image recognition method based on dual-branch mutual-learning feature generation and constructs a feature generation module based on local feature information. By mining the semantic relation between the local features and global features of a sample, this module uses the sample's local feature information to generate diverse global semantic features for feature enhancement. The branch that generates features from local feature information and the branch that generates features from global semantic features learn from each other, capture complementary information, and promote implicit knowledge transfer between them, enabling the model to generate more discriminative features.

Brief Description of the Drawings

Exemplary embodiments of the present invention may be understood more completely by reference to the following drawings:

Figure 1 is a schematic flowchart of the small-sample image recognition method based on dual-branch mutual-learning feature generation provided by an exemplary embodiment of the present invention;

Figure 2 is a schematic structural diagram of the small-sample image recognition device based on dual-branch mutual-learning feature generation provided by an exemplary embodiment of the present invention;

Figure 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention, and it should be understood that the present invention is not limited to the exemplary embodiments described here.

It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.

Those skilled in the art will understand that terms such as "first" and "second" in the embodiments of the present invention are used only to distinguish different steps, devices, or modules; they carry no specific technical meaning and imply no necessary logical order among them.

It should also be understood that in the embodiments of the present invention, "multiple" may mean two or more, and "at least one" may mean one, two, or more.

It should also be understood that any component, data, or structure mentioned in the embodiments of the present invention may generally be understood as one or more, unless explicitly limited or the context suggests otherwise.

In addition, the term "and/or" in the present invention merely describes an association between related objects, indicating that three relationships are possible; for example, "A and/or B" covers three cases: A alone, both A and B, and B alone. The character "/" in the present invention generally indicates an "or" relationship between the objects it connects.

It should also be understood that the description of the embodiments of the present invention emphasizes the differences between them; for their identical or similar parts, the embodiments may be referred to one another, and for brevity these parts are not repeated one by one.

Meanwhile, it should be understood that, for ease of description, the parts shown in the drawings are not drawn to actual scale.

The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.

Embodiments of the present invention may be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with such electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments including any of the above systems, and so on.

Electronic devices such as terminal devices, computer systems, and servers may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures, and so on, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media, including storage devices.

Exemplary Method

Figure 1 is a schematic flowchart of the small-sample image recognition method based on dual-branch mutual-learning feature generation provided by an exemplary embodiment of the present invention. This embodiment may be applied to an electronic device. As shown in Figure 1, the small-sample image recognition method 100 based on dual-branch mutual-learning feature generation comprises the following steps:

Step 101: acquire a set of small-sample images to be recognized to form a query set to be recognized.

Step 102: feed each image of the query set into the first feature generation module of a pre-constructed global branch to generate a first semantic feature of the image.

Step 103: feed each image of the query set into the second feature generation module of a pre-constructed local branch to generate a second semantic feature of the image.

Step 104: add the first semantic feature and the second semantic feature of each image in the query set to determine a third semantic feature of the image.

Step 105: compute the similarity between the third semantic feature of each image in the query set and the class prototypes in the support set, and determine the image class of each image in the query set.

Optionally, the class prototype of each class in the support set is constructed as follows:

all images in the support set are fed in turn into the first feature generation module of the global branch, which outputs a fourth semantic feature for each support image;

all images in the support set are fed in turn into the second feature generation module of the local branch, which outputs a fifth semantic feature for each support image;

the fourth semantic features and the fifth semantic features of all images of each class in the support set are added and averaged to determine the class prototype of that class.

Optionally, computing the similarity between the third semantic feature of each query image and the class prototypes of the classes in the support set and determining the image class of each query image comprises:

computing the similarity between the third semantic feature of each query image and the class prototypes of the classes in the support set, and determining the probability that each query image belongs to each class of the support set;

taking, for each query image, the class with the largest probability as the image class of that image.

Specifically, N classes are selected, and S images are then selected from each class to form the support set; the query-set images are recognized on the basis of the support-set images.
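The N-way S-shot support-set construction described here can be sketched as follows; the dictionary-based toy dataset is an illustrative stand-in for a real image collection:

```python
import random

def sample_support_set(dataset, n_way, s_shot, seed=0):
    """Sample an N-way S-shot support set: N classes, S images per class.
    `dataset` maps class name -> list of image ids."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    return {c: rng.sample(dataset[c], s_shot) for c in classes}

# Toy dataset with 3 classes of 3 image ids each.
toy = {"cat": [1, 2, 3], "dog": [4, 5, 6], "bird": [7, 8, 9]}
support = sample_support_set(toy, n_way=2, s_shot=2)
```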

Computing the class prototypes of the support set. First, the N×S images of the support set are fed into the global branch and the local branch, respectively. The first feature generation module of the global branch yields the fourth semantic features and, likewise, the second feature generation module of the local branch yields the fifth semantic features. The semantic features generated by the two branches are then added to obtain the overall semantic features. Finally, the semantic features within each class are added and averaged to serve as the class prototype of that class, computed from the global features of the images.

Recognizing the images of the query set. Each image x of the query set to be recognized is fed into the global branch and the local branch, respectively. The first feature generation module of the global branch yields M first semantic features; likewise, the second feature generation module of the local branch yields M second semantic features. The M semantic features generated by the two branches are then added and averaged to obtain the third semantic feature of image x. The probability that each query sample belongs to each class is computed from the similarity between the sample and each class prototype, and the class with the largest probability is taken as the predicted label of the sample. The probability calculation formula is:

where C_i is the class label of the query sample among the N classes, and the predicted label is the class label inferred from the distances between the query sample and the N class prototypes.
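The probability formula itself was an inline image that did not survive extraction. One common instantiation consistent with the surrounding description (a softmax over the similarities between the query feature and the N class prototypes) can be sketched as follows; the cosine similarity and the exact softmax form are assumptions:

```python
import numpy as np

def class_probabilities(feat, prototypes):
    """Softmax over cosine similarities between a query feature and the
    N class prototypes, giving the probability of each class label."""
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    q = feat / np.linalg.norm(feat)
    sims = p @ q
    e = np.exp(sims - sims.max())        # stabilized softmax
    return e / e.sum()

# Toy query feature against 3 one-hot prototypes.
probs = class_probabilities(np.array([1.0, 0.2, 0.0]), np.eye(3))
pred = int(np.argmax(probs))             # predicted label = most probable class
```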

Optionally, the first feature generation module of the global branch and the second feature generation module of the local branch are trained as follows:

a few-shot recognition task is constructed from a small-sample image training set, the task comprising N classes with M sample images per class;

a feature extraction network extracts features from every sample image of the task to determine the global features and local features of each sample image;

the first feature generation module of the global branch is trained with the global features of every sample image of the task;

the second feature generation module of the local branch is trained with the global features and local features of every sample image of the task;

the global branch and the local branch learn from each other's training information, jointly training the first feature generation module and the second feature generation module;

the first feature generation module and the second feature generation module are optimized according to a preset total training loss function.

可选地,通过小样本识别任务中每张样本图像的全局特征训练全局分支的第一特征生成模块,包括:Optionally, train the first feature generation module of the global branch through the global features of each sample image in the small sample recognition task, including:

分别对每个类别的M张样本图像的全局特征进行掩码，仅保留一张样本图像的全局特征；For each category, mask the global features of its M sample images so that only one sample image's global features are retained;

将每个类别掩码的样本图像的全局特征使用可学习向量替换;Replace the global features of the sample images of each category mask with learnable vectors;

根据每个类别替换的可学习向量以及掩码保留的全局特征训练全局分支的第一特征生成模块;Train the first feature generation module of the global branch based on the learnable vectors replaced by each category and the global features retained by the mask;

根据预先设定的全局分支损失函数,优化第一特征生成模块,其中全局分支的损失函数包括全局预测损失函数以及全局分类损失函数。The first feature generation module is optimized according to the preset global branch loss function, where the global branch loss function includes a global prediction loss function and a global classification loss function.

可选地,通过小样本识别任务中每张样本图像的全局特征和局部特征训练局部分支的第二特征生成模块,包括:Optionally, train the second feature generation module of the local branch through the global features and local features of each sample image in the small sample recognition task, including:

选择每个类别中一张样本图像的局部特征;Select local features of a sample image in each category;

根据每个类别选取的局部特征和预先设定的M个可学习向量训练第二特征生成模块;Train the second feature generation module based on the local features selected for each category and the preset M learnable vectors;

根据预先设定的局部分支损失函数,优化第二特征生成模块,其中局部分支损失函数包括局部预测损失函数以及局部分类损失函数。The second feature generation module is optimized according to the preset local branch loss function, where the local branch loss function includes a local prediction loss function and a local classification loss function.

可选地,还包括:计算KL散度作为全局分支以及局部分支的训练信息互相学习的相互学习损失函数。Optionally, it also includes: calculating KL divergence as a mutual learning loss function in which the training information of the global branch and the local branch learn from each other.

可选地,训练总损失函数为全局分支损失函数、局部分支损失函数以及互相学习损失函数之和。Optionally, the total training loss function is the sum of the global branch loss function, the local branch loss function and the mutual learning loss function.

具体地,全局分支的第一特征生成模块和局部分支的第二特征生成模块的训练步骤如下:Specifically, the training steps of the first feature generation module of the global branch and the second feature generation module of the local branch are as follows:

步骤1：构建小样本识别任务。构建一系列小样本识别任务进行训练，具体地，随机从训练集中选取N个类，然后从每个类中随机采样M个样本，构成任务T，共N×M张图像。Step 1: Construct few-shot recognition tasks. A series of few-shot recognition tasks is constructed for training: specifically, N classes are randomly selected from the training set, and M samples are then randomly drawn from each class, forming a task T of N×M images in total.
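Step 1 amounts to standard episodic (N-way, M-shot) sampling. A minimal sketch, with an assumed dataset layout (class label mapped to a list of image identifiers) and illustrative names not taken from the patent:

```python
import random

def sample_task(dataset, n_way, m_shot, seed=0):
    # dataset: dict mapping class label -> list of image identifiers.
    # Randomly pick N classes, then M images per class, giving an
    # N-way M-shot task T of N*M images in total.
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    return {c: rng.sample(dataset[c], m_shot) for c in classes}
```

Repeated calls with different seeds produce the series of training tasks described above.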

步骤2：特征提取。对于任务T中的每个图像x，输入到特征提取网络提取特征，特征提取网络使用视觉自注意力模型ViT（Vision Transformer），包括四个基本的编码器块Transformer Blocks，局部分支和全局分支共享前三个Transformer Blocks的参数。对于局部分支，提取图像的全局特征和局部特征。对于全局分支，只提取图像的全局特征。其中H、W和C分别是特征图的高度、宽度和通道数。Step 2: Feature extraction. Each image x in task T is fed into the feature extraction network, a Vision Transformer (ViT) consisting of four basic Transformer encoder blocks; the local branch and the global branch share the parameters of the first three blocks. The local branch extracts both the image's global feature and its local features, while the global branch extracts only the global feature. Here H, W and C are the height, width and number of channels of the feature map, respectively.
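The patent does not spell out how the ViT output is split into a global feature and W×H local features. A common convention, assumed here, takes the class (CLS) token as the image-level global feature and reshapes the remaining patch tokens into the local feature map:

```python
import numpy as np

def split_vit_tokens(tokens, h, w):
    # tokens: (1 + H*W, C) output sequence of a ViT encoder —
    # a CLS token followed by H*W patch tokens (assumed layout).
    f_global = tokens[0]                       # image-level global feature, (C,)
    f_local = tokens[1:].reshape(h, w, -1)     # local feature map, (H, W, C)
    return f_global, f_local
```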

步骤3：全局分支。基于步骤2得到任务T中所有图像的全局特征，对于每个类M张图像的全局特征，对其进行掩码，只保留一张图像的全局特征，然后将掩码的全局特征使用可学习的向量进行替换。Step 3: Global branch. From the global features of all images in task T obtained in step 2, the global features of each class's M images are masked so that only one image's global feature is retained, and the masked global features are then replaced with a learnable vector.

式中为类别i中M张图像的全局特征向量，mask表示掩码操作，被掩码的特征由可学习向量替换。Here the global feature vectors of the M images in class i are masked, where mask denotes the masking operation and the masked features are replaced by the learnable vector.

然后将掩码后的特征向量送入到全局分支的第一特征生成模块。第一特征生成模块由一系列的Transformer Blocks组成。特征生成模块使用保留的全局特征信息对掩码的特征进行预测，通过一对多的方式学习样本的类内变化，使模型可以生成多样化的特征。The masked feature sequence is then fed into the first feature generation module of the global branch, which consists of a series of Transformer blocks. The module uses the retained global feature information to predict the masked features, learning intra-class variation in a one-to-many manner so that the model can generate diverse features.

式中前者为第一特征生成模块，后者为第一特征生成模块生成的特征。Here the former denotes the first feature generation module, and the latter denotes the features generated by it.

使用均方误差(MSE)作为全局预测损失函数来测量预测特征与掩码之前的特征之间的差异。Use mean square error (MSE) as the global prediction loss function to measure the difference between the predicted features and the features before masking.
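The masking step and the MSE prediction loss above can be sketched as follows. The Transformer generation module itself is omitted (a stand-in for it would map the masked sequence back to feature space); function and variable names are illustrative:

```python
import numpy as np

def mask_and_replace(class_feats, keep_idx, learnable_vec):
    # class_feats: (M, C) global features of one class's M images.
    # All features except the one at keep_idx are "masked": replaced by a
    # shared learnable vector before being fed to the generation module,
    # which must then reconstruct them from the single retained feature.
    masked = np.tile(learnable_vec, (class_feats.shape[0], 1))
    masked[keep_idx] = class_feats[keep_idx]
    return masked

def mse_loss(pred, target):
    # Global prediction loss: MSE between the generated features and the
    # original (pre-mask) features.
    return float(np.mean((pred - target) ** 2))
```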

为了使全局分支提取的特征具有判别性，全局分支在输入图像的真实标签的监督下进行训练，在整个类空间中进行类间关系挖掘，全局分类损失为：To make the features extracted by the global branch discriminative, the global branch is trained under the supervision of the ground-truth labels of the input images, mining inter-class relations over the entire class space. The global classification loss is:

其中y i x i 的类别标签,h表示分类器,它是一个全连接层。where yi is the category label of xi , h represents the classifier, which is a fully connected layer.

最后,全局分支的损失为:Finally, the loss of the global branch is:

其中的权重系数为超参数。where the weighting coefficient is a hyperparameter.

步骤4：局部分支。基于步骤2确定任务T中所有图像的全局特征和局部特征，对于每个类的M张图像，只选择一张图像的局部特征，并将M个可学习的向量送入到局部分支的特征生成模块；M个可学习的向量用于生成预测的全局特征，并使用特征提取模块提取的M个全局特征进行监督。Step 4: Local branch. From the global and local features of all images in task T determined in step 2, only one image's local features are selected for each class, and M learnable vectors are fed into the local branch's feature generation module; the M learnable vectors are used to generate the predicted global features, supervised by the M global features produced by the feature extraction module.

式中前者表示M个可学习的向量，用于生成预测的全局特征，后者表示W×H个局部特征。Here the former denotes the M learnable vectors used to generate the predicted global features, and the latter denotes the W×H local features.

然后将M个可学习的向量和W×H个局部特征送入到局部分支的第二特征生成模块,通过挖掘局部特征和全局特征之间的语义关系,使用局部特征信息生成全局语义特征。Then, M learnable vectors and W×H local features are sent to the second feature generation module of the local branch. By mining the semantic relationship between local features and global features, local feature information is used to generate global semantic features.
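How the M learnable vectors and the W×H local features are combined before entering the second feature generation module is not fully specified; a natural reading (assumed here) concatenates them into one token sequence, letting attention inside the module relate the query vectors to the local features:

```python
import numpy as np

def local_branch_sequence(queries, local_feats):
    # queries: (M, C) learnable vectors that will become the M predicted
    # global features; local_feats: (H, W, C) local feature map of the one
    # retained image. Both are concatenated into a single token sequence
    # for the second feature generation module (assumed input layout).
    flat = local_feats.reshape(-1, local_feats.shape[-1])   # (H*W, C)
    return np.concatenate([queries, flat], axis=0)          # (M + H*W, C)
```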

式中前者为第二特征生成模块，后者为第二特征生成模块生成的M个全局特征。Here the former denotes the second feature generation module, and the latter denotes the M global features generated by it.

同全局分支一样,使用均方误差(MSE)作为局部预测损失函数来测量预测特征与原来特征之间的差异。As with the global branch, the mean square error (MSE) is used as the local prediction loss function to measure the difference between the predicted features and the original features.

g i,j 表示生成的全局特征,f i,j 表示第i类的第j个全局特征。 g i,j represents the generated global feature, and f i,j represents the j -th global feature of the i- th category.

其中y i x i 在选择的N个类中的类别标签,h表示分类器,它是一个全连接层,k表示任务T中图像的个数。where yi is the category label of xi in the selected N classes, h represents the classifier, which is a fully connected layer, and k represents the number of images in task T.

最后,局部分支的总损失为:Finally, the total loss of the local branch is:

其中的权重系数为超参数。where the weighting coefficient is a hyperparameter.

步骤5：相互学习。为了使两个分支之间进行信息交互，两个分支从彼此学习互补信息，计算KL散度作为两个分支的相互学习损失：Step 5: Mutual learning. To enable information exchange between the two branches, each branch learns complementary information from the other, and the KL divergence is computed as the mutual learning loss of the two branches:

其中，前者为全局分支提取的图像特征，F_l 为局部分支提取的图像特征。where the former denotes the image features extracted by the global branch, and F_l denotes the image features extracted by the local branch.
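A one-directional KL mutual-learning term can be sketched as follows (NumPy illustration; the patent does not state whether the loss is one-directional or symmetric, nor which quantities are softmaxed, so class-probability distributions derived from each branch's logits are assumed):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mutual_learning_loss(logits_global, logits_local):
    # KL(p_global || p_local), averaged over the batch. In practice the
    # symmetric form (both directions) is often used so that each branch
    # learns from the other.
    p = softmax(logits_global)
    q = softmax(logits_local)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())
```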

步骤6:总损失。将全局分支损失、局部分支损失和两个分支相互学习损失结合构成总损失:Step 6: Total loss. The global branch loss, local branch loss and mutual learning loss of the two branches are combined to form the total loss:

其中各权重系数均为超参数。where the weighting coefficients are hyperparameters.
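The combination in step 6 is a weighted sum of the three terms; the weight names below stand in for the patent's unnamed hyperparameters:

```python
def total_loss(loss_global, loss_local, loss_mutual,
               w_global=1.0, w_local=1.0, w_mutual=1.0):
    # Weighted sum of the global-branch loss, local-branch loss and
    # mutual-learning loss; the weights are hyperparameters.
    return (w_global * loss_global
            + w_local * loss_local
            + w_mutual * loss_mutual)
```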

综上，现有的基于特征增强的方法主要通过分析不同样本之间全局语义特征的相似性或差异性，生成新的全局语义特征。这种方法虽然能够提高数据集的规模和多样性，但也存在问题，那就是忽略了样本的局部特征信息。在小样本的情况下，由于数据量有限，每个样本都可能包含一些独特或重要的局部特征信息，这些信息对于区分不同类别或任务是非常有用的。如果只使用全局语义特征来生成新特征，就可能丢失或混淆这些局部特征信息，导致生成的新特征质量不高或不准确。对此，本发明构建了基于局部特征信息的特征生成模块，该模块通过挖掘样本局部特征和全局特征的语义关系，利用样本的局部特征信息生成多样化的全局语义特征，进行特征增强。此外，针对只使用全局特征或局部特征进行特征增强的局限性，本发明提出了双分支相互学习特征生成方法，包括基于局部特征信息进行特征生成的分支和基于全局语义特征进行特征生成的分支，两个分支之间相互学习，捕获互补信息，促进彼此间的隐性知识转移，使模型生成更具有判别性的特征。In summary, existing feature-augmentation-based methods mainly generate new global semantic features by analyzing the similarities or differences of global semantic features between samples. Although this can increase the size and diversity of a dataset, it ignores the samples' local feature information. In the few-shot setting, where data are limited, each sample may carry unique or important local features that are highly useful for distinguishing classes or tasks; generating new features from global semantics alone may lose or confuse this local information, yielding low-quality or inaccurate features. To address this, the present invention builds a feature generation module based on local feature information, which mines the semantic relation between a sample's local and global features and uses the local information to generate diverse global semantic features for feature augmentation. Further, to overcome the limitation of augmenting with only global or only local features, the invention proposes a dual-branch mutual-learning feature generation method comprising a branch that generates features from local feature information and a branch that generates features from global semantic features; the two branches learn from each other, capture complementary information, and promote the transfer of tacit knowledge between them, enabling the model to generate more discriminative features.

示例性装置Exemplary device

图2是本发明一示例性实施例提供的基于双分支相互学习特征生成的小样本图像识别装置的结构示意图。如图2所示,装置200包括:Figure 2 is a schematic structural diagram of a small sample image recognition device based on dual-branch mutual learning feature generation provided by an exemplary embodiment of the present invention. As shown in Figure 2, device 200 includes:

获取模块210,用于获取待识别小样本图像集合,构成待识别查询集;The acquisition module 210 is used to acquire a small sample image set to be identified to form a query set to be identified;

第一生成模块220,用于将待识别查询集中的每个图像送入预先构建的全局分支的第一特征生成模块,生成每个图像的第一语义特征;The first generation module 220 is used to send each image in the query set to be recognized to the first feature generation module of the pre-built global branch to generate the first semantic feature of each image;

第二生成模块230,用于将待识别查询集中的每个图像送入预先构建的局部分支的第二特征生成模块,生成每个图像的第二语义特征;The second generation module 230 is used to send each image in the query set to be recognized to the second feature generation module of the pre-constructed local branch to generate the second semantic feature of each image;

第一确定模块240,用于将待识别查询集中每个图像的第一语义特征和第二语义特征相加,确定待识别查询集中每个图像的第三语义特征;The first determination module 240 is used to add the first semantic feature and the second semantic feature of each image in the query set to be recognized, and determine the third semantic feature of each image in the query set to be recognized;

第二确定模块250,用于分别计算待识别查询集中每张图像的第三语义特征与支持集中多个类别原型的相似度,确定待识别查询集中每个图像的图像类别。The second determination module 250 is used to respectively calculate the similarity between the third semantic feature of each image in the query set to be recognized and multiple category prototypes in the support set, and determine the image category of each image in the query set to be recognized.

可选地,第二确定模块250中支持集每个类别的类别原型的构建过程如下:Optionally, the construction process of the category prototype for each category of the support set in the second determination module 250 is as follows:

第一输出子模块,用于将支持集中的所有图像依次输入到全局分支的第一特征生成模块中,输出支持集中每张图像的第四语义特征;The first output sub-module is used to sequentially input all images in the support set into the first feature generation module of the global branch, and output the fourth semantic feature of each image in the support set;

第二输出子模块,用于将支持集中的所有图像依次输入到局部分支的第二特征生成模块中,输出支持集中每张图像的第五语义特征;The second output submodule is used to input all the images in the support set into the second feature generation module of the local branch in sequence, and output the fifth semantic feature of each image in the support set;

确定子模块,用于将支持集中每个类别的所有图像的第四语义特征和第五语义特征相加取平均,确定支持集中每个类别的类别原型。The determination submodule is used to add and average the fourth semantic features and fifth semantic features of all images of each category in the support set to determine the category prototype of each category in the support set.

可选地,第一生成模块220和所述第二生成模块230中全局分支的第一特征生成模块和局部分支的第二特征生成模块的训练过程如下:Optionally, the training process of the first feature generation module of the global branch and the second feature generation module of the local branch in the first generation module 220 and the second generation module 230 is as follows:

构建子模块,用于根据小样本图像训练集,构建小样本识别任务,其中小样本识别任务包括N类,每个类别中包括M张样本图像;Construct a submodule for constructing a small sample recognition task based on the small sample image training set, where the small sample recognition task includes N categories, and each category includes M sample images;

提取子模块,用于利用特征提取网络对小样本识别任务中每张样本图像进行特征提取,确定每张样本图像的全局特征和局部特征;The extraction submodule is used to use the feature extraction network to extract features from each sample image in the small sample recognition task, and determine the global features and local features of each sample image;

第一训练子模块,用于通过小样本识别任务中每张样本图像的全局特征训练全局分支的第一特征生成模块;The first training submodule is used to train the first feature generation module of the global branch through the global features of each sample image in the small sample recognition task;

第二训练子模块,用于通过小样本识别任务中每张样本图像的全局特征和局部特征训练局部分支的第二特征生成模块;The second training submodule is used to train the second feature generation module of the local branch through the global features and local features of each sample image in the small sample recognition task;

第三训练子模块,用于将全局分支以及局部分支的训练信息互相学习,训练第一特征生成模块以及第二特征生成模块;The third training sub-module is used to learn the training information of the global branch and the local branch from each other, and train the first feature generation module and the second feature generation module;

优化子模块,用于根据预先设置的训练总损失函数,优化第一特征生成模块以及第二特征生成模块。The optimization sub-module is used to optimize the first feature generation module and the second feature generation module according to the preset training total loss function.

可选地,第一训练子模块,包括:Optionally, the first training sub-module includes:

掩码单元,用于分别对每个类别的M张样本图像的全局特征进行掩码,确定一张样本图像的全局特征;The masking unit is used to mask the global features of M sample images of each category to determine the global features of a sample image;

替换单元,用于将每个类别掩码的样本图像的全局特征使用可学习向量替换;The replacement unit is used to replace the global features of the sample images of each category mask with learnable vectors;

第一训练单元,用于根据每个类别替换的可学习向量以及掩码保留的全局特征训练全局分支的第一特征生成模块;A first training unit configured to train the first feature generation module of the global branch based on the learnable vectors replaced by each category and the global features retained by the mask;

第一优化单元,用于根据预先设定的全局分支损失函数,优化第一特征生成模块,其中全局分支的损失函数包括全局预测损失函数以及全局分类损失函数。The first optimization unit is used to optimize the first feature generation module according to a preset global branch loss function, where the global branch loss function includes a global prediction loss function and a global classification loss function.

可选地,第二训练子模块,包括:Optionally, the second training sub-module includes:

选择单元,用于选择每个类别中一张样本图像的局部特征;The selection unit is used to select local features of a sample image in each category;

第二训练单元,用于根据每个类别选取的局部特征和预先设定的M个可学习向量训练第二特征生成模块;The second training unit is used to train the second feature generation module based on the local features selected for each category and the preset M learnable vectors;

第二优化单元,用于根据预先设定的局部分支损失函数,优化第二特征生成模块,其中局部分支损失函数包括局部预测损失函数以及局部分类损失函数。The second optimization unit is used to optimize the second feature generation module according to a preset local branch loss function, where the local branch loss function includes a local prediction loss function and a local classification loss function.

可选地，装置200还包括：计算模块，用于计算KL散度作为全局分支以及局部分支的训练信息互相学习的相互学习损失函数。Optionally, the apparatus 200 further includes a computing module for calculating the KL divergence as the mutual learning loss function by which the training information of the global branch and the local branch learn from each other.

可选地,训练总损失函数为全局分支损失函数、局部分支损失函数以及互相学习损失函数之和。Optionally, the total training loss function is the sum of the global branch loss function, the local branch loss function and the mutual learning loss function.

可选地,第二确定模块250,包括:Optionally, the second determination module 250 includes:

计算子模块，用于分别计算待识别查询集中每张图像的第三语义特征与支持集中多个类别的类别原型的相似度，确定待识别查询集中每张图像属于支持集中每个类别的概率值；The calculation submodule is used to respectively calculate the similarity between the third semantic feature of each image in the query set to be recognized and the category prototypes of the multiple categories in the support set, and to determine the probability value of each image in the query set belonging to each category in the support set;

作为子模块，用于取待识别查询集中每个图像对应的概率值最大的类别作为该图像的图像类别。A submodule for taking, as the image category of each image in the query set to be recognized, the category with the largest corresponding probability value.

示例性电子设备Example electronic device

图3是本发明一示例性实施例提供的电子设备的结构示意图。如图3所示，电子设备30包括一个或多个处理器31和存储器32。Figure 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present invention. As shown in Figure 3, the electronic device 30 includes one or more processors 31 and a memory 32.

处理器31可以是中央处理单元(CPU)或者具有数据处理能力和/或指令执行能力的其他形式的处理单元,并且可以控制电子设备中的其他组件以执行期望的功能。The processor 31 may be a central processing unit (CPU) or other form of processing unit with data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.

存储器32可以包括一个或多个计算机程序产品，所述计算机程序产品可以包括各种形式的计算机可读存储介质，例如易失性存储器和/或非易失性存储器。所述易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。所述非易失性存储器例如可以包括只读存储器(ROM)、硬盘、闪存等。在所述计算机可读存储介质上可以存储一个或多个计算机程序指令，处理器31可以运行所述程序指令，以实现上文所述的本发明的各个实施例的方法以及/或者其他期望的功能。在一个示例中，电子设备还可以包括：输入装置33和输出装置34，这些组件通过总线系统和/或其他形式的连接机构(未示出)互连。The memory 32 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 31 may execute the program instructions to implement the methods of the various embodiments of the present invention described above and/or other desired functions. In one example, the electronic device may further include an input device 33 and an output device 34, these components being interconnected through a bus system and/or other forms of connection mechanisms (not shown).

此外,该输入装置33还可以包括例如键盘、鼠标等等。In addition, the input device 33 may also include, for example, a keyboard, a mouse, and the like.

该输出装置34可以向外部输出各种信息。该输出装置34可以包括例如显示器、扬声器、打印机、以及通信网络及其所连接的远程输出设备等等。The output device 34 can output various information to the outside. The output device 34 may include, for example, a display, a speaker, a printer, a communication network and remote output devices connected thereto, and the like.

当然,为了简化,图3中仅示出了该电子设备中与本发明有关的组件中的一些,省略了诸如总线、输入/输出接口等的组件。除此之外,根据具体应用情况,电子设备还可以包括任何其他适当的组件。Of course, for simplicity, only some of the components related to the present invention in the electronic device are shown in FIG. 3 , and components such as buses, input/output interfaces, etc. are omitted. In addition to this, the electronic device may include any other suitable components depending on the specific application.

示例性计算机程序产品和计算机可读存储介质Example computer program products and computer-readable storage media

除了上述方法和设备以外，本发明的实施例还可以是计算机程序产品，其包括计算机程序指令，所述计算机程序指令在被处理器运行时使得所述处理器执行本说明书上述“示例性方法”部分中描述的根据本发明各种实施例的方法中的步骤。In addition to the above methods and devices, an embodiment of the present invention may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the methods according to the various embodiments of the present invention described in the "Exemplary Method" section of this specification.

所述计算机程序产品可以以一种或多种程序设计语言的任意组合来编写用于执行本发明实施例操作的程序代码，所述程序设计语言包括面向对象的程序设计语言，诸如Java、C++等，还包括常规的过程式程序设计语言，诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算设备上执行、部分地在用户设备上执行、作为一个独立的软件包执行、部分在用户计算设备上部分在远程计算设备上执行、或者完全在远程计算设备或服务器上执行。The computer program product may include program code for performing the operations of embodiments of the present invention, written in any combination of one or more programming languages, including object-oriented languages such as Java and C++, as well as conventional procedural languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.

此外，本发明的实施例还可以是计算机可读存储介质，其上存储有计算机程序指令，所述计算机程序指令在被处理器运行时使得所述处理器执行本说明书上述“示例性方法”部分中描述的根据本发明各种实施例的方法中的步骤。In addition, an embodiment of the present invention may also be a computer-readable storage medium having computer program instructions stored thereon which, when executed by a processor, cause the processor to perform the steps of the methods according to the various embodiments of the present invention described in the "Exemplary Method" section of this specification.

所述计算机可读存储介质可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以包括但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件，或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列表)包括：具有一个或多个导线的电连接、便携式盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

以上结合具体实施例描述了本发明的基本原理，但是，需要指出的是，在本发明中提及的优点、优势、效果等仅是示例而非限制，不能认为这些优点、优势、效果等是本发明的各个实施例必须具备的。另外，上述公开的具体细节仅是为了示例的作用和便于理解的作用，而非限制，上述细节并不限制本发明为必须采用上述具体的细节来实现。The basic principles of the present invention have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, effects, and the like mentioned in the present invention are only examples, not limitations, and cannot be regarded as necessarily possessed by every embodiment of the present invention. In addition, the specific details disclosed above serve only the purposes of illustration and ease of understanding, not limitation; they do not restrict the present invention to being implemented with those specific details.

本说明书中各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其它实施例的不同之处,各个实施例之间相同或相似的部分相互参见即可。对于系统实施例而言,由于其与方法实施例基本对应,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。Each embodiment in this specification is described in a progressive manner, and each embodiment focuses on its differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple. For relevant details, please refer to the partial description of the method embodiment.

本发明中涉及的器件、系统、设备的方框图仅作为例示性的例子并且不意图要求或暗示必须按照方框图示出的方式进行连接、布置、配置。如本领域技术人员将认识到的，可以按任意方式连接、布置、配置这些器件、系统、设备。诸如“包括”、“包含”、“具有”等等的词语是开放性词汇，指“包括但不限于”，且可与其互换使用。这里所使用的词汇“或”和“和”指词汇“和/或”，且可与其互换使用，除非上下文明确指示不是如此。这里所使用的词汇“诸如”指词组“诸如但不限于”，且可与其互换使用。The block diagrams of the devices, systems, and apparatuses involved in the present invention are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown. As those skilled in the art will recognize, these devices, systems, and apparatuses may be connected, arranged, and configured in any manner. Words such as "including", "comprising", and "having" are open-ended terms meaning "including but not limited to" and may be used interchangeably therewith. As used herein, the words "or" and "and" refer to "and/or" and may be used interchangeably therewith unless the context clearly indicates otherwise. As used herein, "such as" refers to the phrase "such as but not limited to" and may be used interchangeably therewith.

可能以许多方式来实现本发明的方法和系统。例如,可通过软件、硬件、固件或者软件、硬件、固件的任何组合来实现本发明的方法和系统。用于所述方法的步骤的上述顺序仅是为了进行说明,本发明的方法的步骤不限于以上具体描述的顺序,除非以其它方式特别说明。此外,在一些实施例中,还可将本发明实施为记录在记录介质中的程序,这些程序包括用于实现根据本发明的方法的机器可读指令。因而,本发明还覆盖存储用于执行根据本发明的方法的程序的记录介质。The methods and systems of the present invention may be implemented in many ways. For example, the method and system of the present invention can be implemented through software, hardware, firmware, or any combination of software, hardware, and firmware. The above order for the steps of the method is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated. Furthermore, in some embodiments, the present invention can also be implemented as programs recorded in recording media, and these programs include machine-readable instructions for implementing the methods according to the present invention. Thus, the present invention also covers recording media storing a program for executing the method according to the present invention.

还需要指出的是,在本发明的系统、设备和方法中,各部件或各步骤是可以分解和/或重新组合的。这些分解和/或重新组合应视为本发明的等效方案。提供所公开的方面的以上描述以使本领域的任何技术人员能够做出或者使用本发明。对这些方面的各种修改对于本领域技术人员而言是非常显而易见的,并且在此定义的一般原理可以应用于其他方面而不脱离本发明的范围。因此,本发明不意图被限制到在此示出的方面,而是按照与在此公开的原理和新颖的特征一致的最宽范围。It should also be noted that in the system, device and method of the present invention, each component or each step can be decomposed and/or recombined. These decompositions and/or recombinations should be regarded as equivalent solutions of the present invention. The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

为了例示和描述的目的已经给出了以上描述。此外,此描述不意图将本发明的实施例限制到在此公开的形式。尽管以上已经讨论了多个示例方面和实施例,但是本领域技术人员将认识到其某些变型、修改、改变、添加和子组合。The foregoing description has been presented for the purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the invention to the form disclosed herein. Although various example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions and sub-combinations thereof.

Claims (8)

1.一种基于双分支相互学习特征生成的小样本图像识别方法，其特征在于，包括：1. A small-sample image recognition method based on dual-branch mutual learning feature generation, characterized by comprising: 获取待识别小样本图像集合，构成待识别查询集；Obtain a small sample image set to be identified to form a query set to be identified; 将所述待识别查询集中的每个图像送入预先构建的全局分支的第一特征生成模块，生成每个图像的第一语义特征；Send each image in the query set to be recognized to the first feature generation module of the pre-built global branch to generate the first semantic feature of each image; 将所述待识别查询集中的每个图像送入预先构建的局部分支的第二特征生成模块，生成每个图像的第二语义特征；Send each image in the query set to be recognized to the second feature generation module of the pre-constructed local branch to generate the second semantic feature of each image; 将所述待识别查询集中每个图像的第一语义特征和所述第二语义特征相加，确定所述待识别查询集中每个图像的第三语义特征；Add the first semantic feature and the second semantic feature of each image in the query set to be recognized to determine the third semantic feature of each image in the query set to be recognized; 分别计算所述待识别查询集中每张图像的所述第三语义特征与支持集中多个类别原型的相似度，确定所述待识别查询集中每个图像的图像类别；Calculate the similarity between the third semantic feature of each image in the query set to be recognized and multiple category prototypes in the support set respectively, and determine the image category of each image in the query set to be recognized; 所述支持集每个类别的类别原型的构建过程如下：The construction process of the category prototype for each category of the support set is as follows: 将所述支持集中的所有图像依次输入到所述全局分支的所述第一特征生成模块中，输出所述支持集中每张图像的第四语义特征；Input all images in the support set into the first feature generation module of the global branch in sequence, and output the fourth semantic feature of each image in the support set; 将所述支持集中的所有图像依次输入到所述局部分支的所述第二特征生成模块中，输出所述支持集中每张图像的第五语义特征；Input all images in the support set into the second feature generation module of the local branch in sequence, and output the fifth semantic feature of each image in the support set; 将所述支持集中每个类别的所有图像的第四语义特征和第五语义特征相加取平均，确定所述支持集中每个类别的所述类别原型；Add and average the fourth semantic features and fifth semantic features of all images of each category in the support set to determine the category prototype of each category in the support set; 所述全局分支的第一特征生成模块和所述局部分支的第二特征生成模块的训练过程如下：The training process of the first feature generation module of the global branch and the second feature generation module of the local branch is as follows: 根据小样本图像训练集，构建小样本识别任务，其中所述小样本识别任务包括N类，每个类别中包括M张样本图像；Construct a small sample recognition task based on the small sample image training set, where the small sample recognition task includes N categories, and each category includes M sample images; 利用特征提取网络对所述小样本识别任务中每张样本图像进行特征提取，确定每张样本图像的全局特征和局部特征；Use a feature extraction network to extract features from each sample image in the small sample recognition task, and determine the global features and local features of each sample image; 通过所述小样本识别任务中每张样本图像的所述全局特征训练所述全局分支的所述第一特征生成模块；Train the first feature generation module of the global branch through the global features of each sample image in the small sample recognition task; 通过所述小样本识别任务中每张样本图像的所述全局特征和所述局部特征训练所述局部分支的所述第二特征生成模块；Train the second feature generation module of the local branch through the global features and the local features of each sample image in the small sample recognition task; 将所述全局分支以及所述局部分支的训练信息互相学习，训练所述第一特征生成模块以及所述第二特征生成模块；Learn the training information of the global branch and the local branch from each other, and train the first feature generation module and the second feature generation module; 根据预先设置的训练总损失函数，优化所述第一特征生成模块以及所述第二特征生成模块。According to the preset training total loss function, the first feature generation module and the second feature generation module are optimized.

2.根据权利要求1所述的方法，其特征在于，通过所述小样本识别任务中每张样本图像的所述全局特征训练所述全局分支的所述第一特征生成模块，包括：2. The method according to claim 1, characterized in that training the first feature generation module of the global branch through the global features of each sample image in the small sample recognition task includes: 分别对每个类别的M张样本图像的所述全局特征进行掩码，得到一张样本图像的全局特征；Mask the global features of the M sample images of each category respectively to obtain the global features of one sample image; 将每个类别掩码的样本图像的所述全局特征使用可学习向量替换；Replace the masked global features of each category's sample images with learnable vectors; 根据每个类别替换的可学习向量以及掩码保留的所述全局特征训练所述全局分支的第一特征生成模块；Train the first feature generation module of the global branch based on the learnable vectors replaced for each category and the global features retained by the mask; 根据预先设定的全局分支损失函数，优化所述第一特征生成模块，其中所述全局分支的损失函数包括全局预测损失函数以及全局分类损失函数。The first feature generation module is optimized according to a preset global branch loss function, where the loss function of the global branch includes a global prediction loss function and a global classification loss function.

3.根据权利要求2所述的方法，其特征在于，通过所述小样本识别任务中每张样本图像的所述全局特征和所述局部特征训练所述局部分支的所述第二特征生成模块，包括：3. The method according to claim 2, characterized in that training the second feature generation module of the local branch through the global features and the local features of each sample image in the small sample recognition task includes: 选择每个类别中一张样本图像的局部特征；Select local features of one sample image in each category; 根据每个类别选取的局部特征和预先设定的M个可学习向量训练所述第二特征生成模块；Train the second feature generation module according to the local features selected for each category and the preset M learnable vectors; 根据预先设定的局部分支损失函数，优化所述第二特征生成模块，其中所述局部分支损失函数包括局部预测损失函数以及局部分类损失函数。The second feature generation module is optimized according to a preset local branch loss function, where the local branch loss function includes a local prediction loss function and a local classification loss function.

4.根据权利要求3所述的方法，其特征在于，还包括：计算KL散度作为所述全局分支以及所述局部分支的训练信息互相学习的相互学习损失函数。4. The method according to claim 3, further comprising: calculating KL divergence as a mutual learning loss function by which the training information of the global branch and the local branch learn from each other.

5.根据权利要求4所述的方法，其特征在于，所述训练总损失函数为所述全局分支损失函数、所述局部分支损失函数以及所述互相学习损失函数之和。5. The method according to claim 4, wherein the total training loss function is the sum of the global branch loss function, the local branch loss function and the mutual learning loss function.

6.根据权利要求1所述的方法，其特征在于，分别计算所述待识别查询集中每张图像的所述第三语义特征与支持集中多个类别的类别原型的相似度，确定所述待识别查询集中每个图像的图像类别，包括：6. The method according to claim 1, characterized in that calculating the similarity between the third semantic feature of each image in the query set to be recognized and the category prototypes of multiple categories in the support set, and determining the image category of each image in the query set to be recognized, includes: 分别计算所述待识别查询集中每张图像的所述第三语义特征与所述支持集中多个类别的类别原型的相似度，确定所述待识别查询集中每张图像属于所述支持集中每个类别的概率值；Calculate the similarity between the third semantic feature of each image in the query set to be identified and the category prototypes of the multiple categories in the support set, and determine the probability value of each image in the query set belonging to each category in the support set; 取所述待识别查询集中每个图像对应的所述概率值最大的类别作为该图像的图像类别。The category with the largest probability value corresponding to each image in the query set to be recognized is taken as the image category of the image.

7.一种基于双分支相互学习特征生成的小样本图像识别装置，其特征在于，包括：7.
A small sample image recognition device based on dual-branch mutual learning feature generation, which is characterized by including: 获取模块,用于获取待识别小样本图像集合,构成待识别查询集;The acquisition module is used to obtain a small sample image set to be identified to form a query set to be identified; 第一生成模块,用于将所述待识别查询集中的每个图像送入预先构建的全局分支的第一特征生成模块,生成每个图像的第一语义特征;A first generation module, configured to send each image in the query set to be recognized to a first feature generation module of a pre-constructed global branch to generate the first semantic feature of each image; 第二生成模块,用于将所述待识别查询集中的每个图像送入预先构建的局部分支的第二特征生成模块,生成每个图像的第二语义特征;A second generation module, configured to send each image in the query set to be recognized to a second feature generation module of a pre-constructed local branch to generate a second semantic feature of each image; 第一确定模块,用于将所述待识别查询集中每个图像的第一语义特征和所述第二语义特征相加,确定所述待识别查询集中每个图像的第三语义特征;A first determination module, configured to add the first semantic feature and the second semantic feature of each image in the query set to be recognized, and determine the third semantic feature of each image in the query set to be recognized; 第二确定模块,用于分别计算所述待识别查询集中每张图像的所述第三语义特征与支持集中多个类别原型的相似度,确定所述待识别查询集中每个图像的图像类别;a second determination module, configured to separately calculate the similarity between the third semantic feature of each image in the query set to be recognized and multiple category prototypes in the support set, and determine the image category of each image in the query set to be recognized; 第二确定模块250中支持集每个类别的类别原型的构建过程如下:The construction process of the category prototype for each category of the support set in the second determination module 250 is as follows: 第一输出子模块,用于将支持集中的所有图像依次输入到全局分支的第一特征生成模块中,输出支持集中每张图像的第四语义特征;The first output sub-module is used to sequentially input all images in the support set into the first feature generation module of the global branch, and output the fourth semantic feature of each image in the support set; 
第二输出子模块,用于将支持集中的所有图像依次输入到局部分支的第二特征生成模块中,输出支持集中每张图像的第五语义特征;The second output submodule is used to input all the images in the support set into the second feature generation module of the local branch in sequence, and output the fifth semantic feature of each image in the support set; 确定子模块,用于将支持集中每个类别的所有图像的第四语义特征和第五语义特征相加取平均,确定支持集中每个类别的类别原型;The determination submodule is used to add and average the fourth semantic features and fifth semantic features of all images of each category in the support set to determine the category prototype of each category in the support set; 第一生成模块220和所述第二生成模块230中全局分支的第一特征生成模块和局部分支的第二特征生成模块的训练过程如下:The training process of the first feature generation module of the global branch and the second feature generation module of the local branch in the first generation module 220 and the second generation module 230 is as follows: 构建子模块,用于根据小样本图像训练集,构建小样本识别任务,其中小样本识别任务包括N类,每个类别中包括M张样本图像;Construct a submodule for constructing a small sample recognition task based on the small sample image training set, where the small sample recognition task includes N categories, and each category includes M sample images; 提取子模块,用于利用特征提取网络对小样本识别任务中每张样本图像进行特征提取,确定每张样本图像的全局特征和局部特征;The extraction submodule is used to use the feature extraction network to extract features from each sample image in the small sample recognition task, and determine the global features and local features of each sample image; 第一训练子模块,用于通过小样本识别任务中每张样本图像的全局特征训练全局分支的第一特征生成模块;The first training submodule is used to train the first feature generation module of the global branch through the global features of each sample image in the small sample recognition task; 第二训练子模块,用于通过小样本识别任务中每张样本图像的全局特征和局部特征训练局部分支的第二特征生成模块;The second training submodule is used to train the second feature generation module of the local branch through the global features and local features of each sample image in the small sample recognition task; 第三训练子模块,用于将全局分支以及局部分支的训练信息互相学习,训练第一特征生成模块以及第二特征生成模块;The third training 
sub-module is used to learn the training information of the global branch and the local branch from each other, and train the first feature generation module and the second feature generation module; 优化子模块,用于根据预先设置的训练总损失函数,优化第一特征生成模块以及第二特征生成模块。The optimization sub-module is used to optimize the first feature generation module and the second feature generation module according to the preset training total loss function. 8.一种计算机可读存储介质,其特征在于,所述存储介质存储有计算机程序,所述计算机程序用于执行上述权利要求1-6任一所述的方法。8. A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the method described in any one of claims 1-6.
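Claims 1 and 6 describe prototype construction (a per-class average of the summed global and local semantic features) and nearest-prototype classification. The following NumPy sketch is for illustration only and is not part of the claimed method; the function names are invented here, and cosine similarity with a softmax is one plausible choice of similarity measure, which the claims do not fix:

```python
import numpy as np

def build_prototypes(support_global, support_local, labels, n_classes):
    """Class prototype = per-class mean of (global + local) semantic features,
    i.e. the 'fourth' and 'fifth' semantic features of claim 1."""
    fused = support_global + support_local                    # (num_support, dim)
    return np.stack([fused[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(query_global, query_local, prototypes):
    """Assign each query image to the class whose prototype is most similar,
    turning cosine similarities into per-class probabilities with a softmax."""
    fused = query_global + query_local                        # 'third' semantic feature
    fused = fused / np.linalg.norm(fused, axis=1, keepdims=True)
    protos = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = fused @ protos.T                                   # cosine similarity matrix
    exp = np.exp(sims - sims.max(axis=1, keepdims=True))      # stable softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs                        # predicted class, probabilities
```

A query whose fused feature lies closest to a class prototype receives that class label, matching the "largest probability value" rule of claim 6.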
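The masked-feature training of claim 2 hides the global feature of one of a class's M sample images and substitutes a learnable vector, so the generator must reconstruct the hidden feature from the remaining ones. The substitution step might look like the sketch below (an assumption-laden illustration; the claim does not say how the masked index is chosen, and the function name is invented):

```python
import numpy as np

def mask_and_replace(class_global_feats, learnable_vector, mask_index):
    """Mask one of a class's M global features and put a learnable vector in
    its place; the masked feature is kept as the target for the global
    prediction loss of claim 2.

    class_global_feats: (M, dim) global features of one class.
    learnable_vector:   (dim,) trainable replacement token (hypothetical name).
    """
    feats = class_global_feats.copy()       # leave the caller's array intact
    target = feats[mask_index].copy()       # ground truth the generator must predict
    feats[mask_index] = learnable_vector    # substitute the learnable token
    return feats, target
```

In training, the prediction loss would compare the generator's output at the masked position against `target`, alongside the classification loss named in claim 2.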
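Claims 4 and 5 combine a global-branch loss, a local-branch loss, and a KL-divergence mutual-learning term into the total training loss. A minimal sketch of those two pieces, assuming a symmetric KL over the two branches' softmax outputs (the claims do not specify whether the KL term is one-sided or symmetric):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))   # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """KL(p || q) between probability rows, averaged over the batch."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def mutual_learning_loss(logits_global, logits_local):
    """Symmetric KL between the two branches' class distributions, so each
    branch learns from the other's predictions (claim 4)."""
    p, q = softmax(logits_global), softmax(logits_local)
    return kl_div(p, q) + kl_div(q, p)

def total_loss(loss_global, loss_local, logits_global, logits_local):
    # Claim 5: total loss = global-branch + local-branch + mutual-learning loss.
    return loss_global + loss_local + mutual_learning_loss(logits_global, logits_local)
```

When both branches agree exactly, the mutual term vanishes and the total loss reduces to the sum of the two branch losses.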
CN202311264423.XA 2023-09-28 2023-09-28 Small sample image recognition method based on double-branch mutual learning feature generation Active CN116994076B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311264423.XA CN116994076B (en) 2023-09-28 2023-09-28 Small sample image recognition method based on double-branch mutual learning feature generation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311264423.XA CN116994076B (en) 2023-09-28 2023-09-28 Small sample image recognition method based on double-branch mutual learning feature generation

Publications (2)

Publication Number Publication Date
CN116994076A (en) 2023-11-03
CN116994076B (en) 2024-01-19

Family

ID=88528724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311264423.XA Active CN116994076B (en) 2023-09-28 2023-09-28 Small sample image recognition method based on double-branch mutual learning feature generation

Country Status (1)

Country Link
CN (1) CN116994076B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007317133A (en) * 2006-05-29 2007-12-06 Nippon Telegr & Teleph Corp <Ntt> Image classification method, apparatus and program
CN112200111A (en) * 2020-10-19 2021-01-08 厦门大学 An Occlusion Robust Pedestrian Re-identification Method Based on Global and Local Feature Fusion
WO2021121127A1 (en) * 2020-07-28 2021-06-24 平安科技(深圳)有限公司 Sample type identification method, apparatus, computer device, and storage medium
CN113111971A (en) * 2021-05-07 2021-07-13 浙江宇视科技有限公司 Intelligent processing method and device for classification model, electronic equipment and medium
CN113255787A (en) * 2021-05-31 2021-08-13 西安电子科技大学 Small sample target detection method and system based on semantic feature and metric learning
AU2021105831A4 (en) * 2021-08-18 2021-10-21 Affiliated Zhongshan Hospital Of Dalian University Method for Automatically Noting Medical Image Under Small Samples and System Thereof
CN114005096A (en) * 2021-11-09 2022-02-01 河北工业大学 Vehicle weight recognition method based on feature enhancement
WO2022077901A1 (en) * 2020-10-13 2022-04-21 同济大学 Bearing failure mode diagnosis method using small sample data sets, and system
CN115953630A (en) * 2023-01-09 2023-04-11 西北工业大学 A cross-domain small-sample image classification method based on global-local knowledge distillation
WO2023056889A1 (en) * 2021-10-09 2023-04-13 百果园技术(新加坡)有限公司 Model training and scene recognition method and apparatus, device, and medium
CN116049349A (en) * 2022-05-11 2023-05-02 北京理工大学 Small sample intention recognition method based on multi-level attention and hierarchical category characteristics
WO2023071680A1 (en) * 2021-10-26 2023-05-04 北京字节跳动网络技术有限公司 Endoscope image feature learning model training method and apparatus, and endoscope image classification model training method and apparatus
WO2023137889A1 (en) * 2022-01-20 2023-07-27 北京邮电大学 Few-shot image incremental classification method and apparatus based on embedding enhancement and adaption
CN116597384A (en) * 2023-06-02 2023-08-15 中国人民解放军国防科技大学 Space target identification method and device based on small sample training and computer equipment
CN116597312A (en) * 2023-05-22 2023-08-15 武汉轻工大学 A method for identifying crop leaf diseases and insect pests based on semantic segmentation of small sample images
CN116612324A (en) * 2023-05-17 2023-08-18 四川九洲电器集团有限责任公司 Small sample image classification method and device based on semantic self-adaptive fusion mechanism

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8891908B2 (en) * 2012-11-14 2014-11-18 Nec Laboratories America, Inc. Semantic-aware co-indexing for near-duplicate image retrieval


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
An image semantic retrieval method based on a probabilistic rough set model; Xu Jiucheng; Li Xiaoyan; Sun Lin; Journal of Nanjing University (Natural Science) (Issue 4); full text *
A multi-layer semantic binary descriptor for image retrieval; Wu Zebin; Yu Junqing; He Yunfeng; Guan Tao; Chinese Journal of Computers (Issue 9); full text *
A multi-label image classification and recognition method based on conditional random fields; Wang Li; Chen Zhaoxi; Yu Li; Computer Simulation (Issue 8); full text *
A few-shot learning algorithm based on adaptive feature comparison; Nian Fudong; Shu Jianhua; Lyu Gang; Journal of Xi'an University (Natural Science Edition) (Issue 4); full text *

Also Published As

Publication number Publication date
CN116994076A (en) 2023-11-03

Similar Documents

Publication Publication Date Title
CN107735804B (en) System and method for transfer learning techniques for different sets of labels
Caicedo et al. Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization
WO2023179429A1 (en) Video data processing method and apparatus, electronic device, and storage medium
WO2023173555A1 (en) Model training method and apparatus, text classification method and apparatus, device, and medium
CN114638960A (en) Model training method, image description generation method and device, equipment, medium
CN114547307A (en) Text vector model training method, text matching method, device and equipment
CN116579618A (en) Data processing method, device, equipment and storage medium based on risk management
Kalaivani et al. A review on feature extraction techniques for sentiment classification
Hossain et al. Multiobjective evolution of deep learning parameters for robot manipulator object recognition and grasping
CN114416998A (en) Recognition method, device, electronic device and storage medium of text label
CN116610831A (en) Semanteme subdivision and modal alignment reasoning learning cross-modal retrieval method and retrieval system
Yu et al. Enhancing Label Correlations in multi-label classification through global-local label specific feature learning to Fill Missing labels
CN119179767B (en) Dialog intention recognition method, compound dialog system and storage medium
US11983183B2 (en) Techniques for training machine learning models using actor data
CN109255098B (en) A matrix factorization hashing method based on reconstruction constraints
CN116994076B (en) Small sample image recognition method based on double-branch mutual learning feature generation
CN109857892A (en) Semi-supervised cross-module state Hash search method based on category transmitting
Zhang et al. Trigraph regularized collective matrix tri-factorization framework on multiview features for multilabel image annotation
CN117079160A (en) Unmanned aerial vehicle image recognition network training method, application method and electronic equipment
WO2023168818A1 (en) Method and apparatus for determining similarity between video and text, electronic device, and storage medium
CN114970775A (en) Clustering-based military industry group personnel information labeling method
CN115099344A (en) Model training method and device, user portrait generation method and device, and equipment
Hua et al. Cross-modal correlation learning with deep convolutional architecture
Gan et al. Cross-Modal Semantic Alignment Learning for Text-Based Person Search
Feng et al. A new method of microblog rumor detection based on transformer model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant