CN114972834B

CN114972834B - Image classification method and device of multi-level multi-classifier

Info

Publication number: CN114972834B
Application number: CN202110516975.XA
Authority: CN
Inventors: 丁小波; 蔡茂贞; 刘井安; 彭琨; 钟地秀; 李小青
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Internet Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Internet Co Ltd
Priority date: 2021-05-12
Filing date: 2021-05-12
Publication date: 2023-09-05
Anticipated expiration: 2041-05-12
Also published as: CN114972834A

Abstract

The present application discloses a multi-level multi-classifier image classification method and device. The method includes: obtaining the feature map of each convolutional layer in each category, and the first feature probability of the feature map, produced by a feature extraction classifier performing pre-classification and feature extraction on a target image; based on the first feature probability, performing probability accumulation on the feature maps of each convolutional layer in each category to obtain a first heat map corresponding to each convolutional layer in each category; processing the first heat maps into a uniform size and splicing them into a first multi-level heat map for each category; inputting the first multi-level heat map into a multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier; and outputting the category of the target image according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold.

Description

Image classification method and device of multi-level multi-classifier

Technical Field

The present application relates to the field of image classification, and in particular to a multi-level multi-classifier image classification method and device.

Background

With the advancement of science and technology, using neural networks for image recognition or image classification has become common practice. Existing image classification methods generally train a single neural network on labeled images of the various categories. At prediction time, the image to be predicted is input into the neural network, the network directly outputs the probability of each category, and the category with the highest probability is selected as the predicted output.

However, when all the category probabilities output by the neural network are relatively low, directly selecting the category with the highest probability as the output can lead to classification errors. How to accurately output the category of the image to be predicted has therefore become a problem to be solved.

Summary of the Invention

The present application discloses a multi-level multi-classifier image classification method and device, to solve the problem that current image classification methods produce a high rate of misidentification and missed identification when the probabilities of all categories are low.

To solve the above problem, the present application adopts the following technical solutions:

In a first aspect, an embodiment of the present application discloses a multi-level multi-classifier image classification method. The method includes: obtaining the feature map of each convolutional layer in each category, and the first feature probability of the feature map, produced by a feature extraction classifier performing pre-classification and feature extraction on a target image; based on the first feature probability, performing probability accumulation on the feature maps of each convolutional layer in each category to obtain a first heat map corresponding to each convolutional layer in each category; processing the first heat maps into a uniform size and splicing them into a first multi-level heat map for each category; inputting the first multi-level heat map into a multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier; and outputting the category of the target image according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold.

In a second aspect, an embodiment of the present application discloses a multi-level multi-classifier image classification device. The device includes: an acquisition module, configured to obtain the feature map of each convolutional layer in each category, and the first feature probability of the feature map, produced by a feature extraction classifier performing pre-classification and feature extraction on a target image; an accumulation module, configured to perform, based on the first feature probability, probability accumulation on the feature maps of each convolutional layer in each category to obtain a first heat map corresponding to each convolutional layer in each category; a splicing module, configured to process the first heat maps into a uniform size and splice them into a first multi-level heat map for each category; an input module, configured to input the first multi-level heat map into a multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier; and an output module, configured to output the category of the target image according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold.

The technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects:

An embodiment of the present application discloses a multi-level multi-classifier image classification method. A feature extraction classifier performs pre-classification and feature extraction on the target image, and the feature extraction results are processed into first heat maps. After the first heat maps are spliced into a first multi-level heat map, the first multi-level heat map is input into the multi-level multi-classifier, which performs fine-grained classification of the target image and yields the classification probability of each layer's categories. The category of the target image is then output according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold. This multi-level, multi-classification method improves the accuracy of target image classification and better addresses the high rate of misidentification and missed identification that current image classification methods suffer when the probabilities of all categories are low.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of a multi-level multi-classifier image classification method disclosed in an embodiment of the present application;

FIG. 2 is a schematic flowchart of another multi-level multi-classifier image classification method disclosed in an embodiment of the present application;

FIG. 3 is a schematic diagram of the pre-training process of a feature extraction classifier and a multi-level multi-classifier disclosed in an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a multi-level multi-classifier image classification device disclosed in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the protection scope of this application.

The terms "first", "second" and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are generally of one type, and their number is not limited; for example, there may be one first object or several. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.

A multi-level multi-classifier image classification method and device provided in the embodiments of the present application are described in detail below through specific embodiments and application scenarios, with reference to the accompanying drawings.

FIG. 1 is a schematic flowchart of a multi-level multi-classifier image classification method provided by an embodiment of the present application. The method can be executed by an electronic device; in other words, it can be executed by software or hardware installed in the electronic device. As shown in FIG. 1, the multi-level multi-classifier image classification method disclosed in the embodiment of the present application may include the following steps:

S110: Obtain the feature map of each convolutional layer in each category, and the first feature probability of the feature map, produced by the feature extraction classifier performing pre-classification and feature extraction on the target image.

After the target image is input into the feature extraction classifier, the convolutional layers of the feature extraction classifier extract features of the target image. The pooling layers then compress these features so as to retain the main features of the target image. Next, the fully connected layer connects all the features and sends its output values to the classifier, which performs the classification, so that the output layer obtains, for each category of the target image, the feature map corresponding to each convolutional layer and the first feature probability of the feature map. Many kinds of classifier can be used, for example a softmax classifier. Specifically, the softmax function converts the output values of the fully connected layer into a probability distribution whose values lie in [0, 1] and sum to 1, namely the first feature probability.

It should be noted that the feature extraction classifier is equivalent to a convolutional neural network, which includes an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer.
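
The softmax conversion described for S110 can be illustrated with a minimal sketch. The logit values and the variable names below are hypothetical and only stand in for the outputs of the fully connected layer; the source does not specify them.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Map fully connected layer outputs to values in [0, 1] that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the maximum for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical fully connected layer outputs for three categories.
logits = np.array([2.0, 1.0, 0.1])
first_feature_probability = softmax(logits)
```

Subtracting the maximum before exponentiating does not change the result; it only avoids overflow for large output values.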

S120: Based on the first feature probability, perform probability accumulation on the feature maps of each convolutional layer in each category to obtain the first heat map corresponding to each convolutional layer in each category.

S130: Process the first heat maps into a uniform size and splice them into the first multi-level heat map of each category.

By inputting the target image into the feature extraction classifier, the weight corresponding to each feature map in the multiple convolutional layers (that is, the first feature probability) can be obtained. Accumulating the feature maps of each convolutional layer according to these weights yields the first heat maps. The first heat maps are then processed into a uniform size; in one possible implementation, ROI Pooling (Region of Interest Pooling) is used to bring the multiple first heat maps to a uniform size, so that they can be spliced into the first multi-level heat map of each category.
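
Steps S120 and S130 might be sketched as follows for a single category. This is an illustration under stated assumptions, not the implementation of the application: the weighted accumulation is read as a per-channel weighted sum, ROI Pooling is approximated by max pooling the whole map onto a fixed grid, and the layer shapes, weights, and function names are all hypothetical.

```python
import numpy as np


def weighted_heat_map(feature_maps: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Sum the channels of one layer, each scaled by its weight.

    feature_maps has shape (C, H, W), weights has shape (C,); the result is (H, W).
    """
    return np.tensordot(weights, feature_maps, axes=1)


def max_pool_to_size(heat_map: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Max-pool a 2-D map onto a fixed out_h x out_w grid (ROI-pooling-style bins)."""
    h, w = heat_map.shape
    rows = np.linspace(0, h, out_h + 1).astype(int)
    cols = np.linspace(0, w, out_w + 1).astype(int)
    pooled = np.empty((out_h, out_w), dtype=heat_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Guard against empty bins when the map is smaller than the grid.
            r0, r1 = rows[i], max(rows[i + 1], rows[i] + 1)
            c0, c1 = cols[j], max(cols[j + 1], cols[j] + 1)
            pooled[i, j] = heat_map[r0:r1, c0:c1].max()
    return pooled


rng = np.random.default_rng(0)
# Hypothetical feature maps of three convolutional layers for one category.
layers = [rng.random((8, 32, 32)), rng.random((16, 16, 16)), rng.random((32, 8, 8))]
# Hypothetical per-feature-map weights standing in for the first feature probability.
weights = [rng.random(f.shape[0]) for f in layers]

heat_maps = [weighted_heat_map(f, w) for f, w in zip(layers, weights)]
uniform = [max_pool_to_size(h, 8, 8) for h in heat_maps]
multi_level_heat_map = np.stack(uniform, axis=0)  # one channel per convolutional layer
```

Stacking along a new leading axis is one way to splice the maps; concatenating them side by side would be an equally plausible reading of the text.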

S140: Input the first multi-level heat map into the multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier.

S150: Output the category of the target image according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold.

After the combined first multi-level heat map is input into the multi-level classifier, which contains multiple layers of classifiers, each equivalent to a convolutional neural network, the first multi-level heat map is processed by the classifier of each layer to obtain the classification probability of each layer's categories for the target image. The category with the highest probability can then be output according to the magnitude relationship between the maximum of the bottom-layer classification probabilities and the preset threshold. This multi-level, multi-classification approach avoids the misidentification and missed identification that arise when all the probabilities obtained in fine-grained classification of the target image are low.

In one possible implementation, inputting the first multi-level heat map into the multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier includes:

Inputting the first multi-level heat map into each layer of classifier in the multi-level classifier to obtain the first classification probability of each layer's categories.

Based on the first feature probability, performing probability accumulation on the first classification probability of each layer's categories and the first feature probability to obtain a second classification probability.

Normalizing the second classification probability to obtain the classification probability of each layer's categories.

After the multi-level classifier receives the first multi-level heat map, the probability of each category at the current level, namely the first classification probability, can be obtained. Using the first feature probability obtained when the feature extraction classifier pre-classified the target image, the first classification probabilities of the categories at the current level are accumulated with the first feature probability to give the probability of the target image at the current level, namely the second classification probability. Normalizing the second classification probability then gives the classification probability of each layer's categories output by the multi-level classifier.
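
The accumulation and normalization just described could look like the sketch below. It assumes that "probability accumulation" means summing the first classification probabilities over the pre-classification categories, each weighted by its first feature probability; the array values and names are hypothetical.

```python
import numpy as np


def layer_classification_probability(p_i: np.ndarray, p_ij: np.ndarray) -> np.ndarray:
    """Combine pre-classification and per-layer probabilities, then normalize.

    p_i has shape (k1,): first feature probability of each pre-classification category.
    p_ij has shape (k1, k2): first classification probability of each current-level
    category j, obtained from the multi-level heat map of category i.
    """
    p_j = p_i @ p_ij        # second classification probability (assumed weighted sum)
    return p_j / p_j.sum()  # normalized classification probability


# Hypothetical values with k1 = 2 pre-classification categories and k2 = 3 categories.
p_i = np.array([0.7, 0.3])
p_ij = np.array([[0.6, 0.3, 0.1],
                 [0.2, 0.5, 0.3]])
layer_probs = layer_classification_probability(p_i, p_ij)
```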

After the classification probability of each layer's categories output by the multi-level multi-classifier is obtained, the category of the target image can be determined and output according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and the preset threshold. In one possible implementation, outputting the category of the target image according to this quantitative relationship includes:

When the maximum probability is greater than the preset threshold, outputting the category corresponding to the maximum probability as the category of the target image.

When the maximum probability is not greater than the preset threshold, determining whether the current level is the highest level. If so, the category corresponding to the maximum probability is output as the category of the target image; otherwise, the third classification probability of each category in the bottom layer is recalculated in combination with the classification probabilities output by the classifier of the layer above, and, when the maximum of the third classification probabilities is greater than the preset threshold, the category corresponding to that maximum is output as the category of the target image.

Using the classification probabilities of each layer's categories obtained by the multi-level classifier, the judgment starts from the bottom layer: the maximum probability among the categories is selected and compared with the preset threshold. If it is greater, the classifier is confident about that category, and the category with the maximum probability can be selected directly as the output. Otherwise, the judgment is made in combination with the classification probabilities output by the layer above: the probability of each bottom-layer category is recalculated (namely the third classification probability), and, when the maximum of the third classification probabilities is greater than the preset threshold, the category corresponding to that maximum is output as the category of the target image.
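
The decision rule above can be sketched for a two-level hierarchy as follows. The function name, the `parent_of` mapping, and the choice to return `None` when the recalculated maximum still does not exceed the threshold are assumptions made for illustration; the text does not state what happens in that last case.

```python
from typing import Optional, Sequence


def decide_category(
    bottom_probs: Sequence[float],
    parent_probs: Optional[Sequence[float]],
    parent_of: Sequence[int],
    threshold: float,
) -> Optional[int]:
    """Return the index of the chosen bottom-layer category, or None if undecided.

    parent_probs is None when the bottom layer is also the highest layer.
    parent_of[m] is the index of the upper-layer category that contains category m.
    """
    best = max(range(len(bottom_probs)), key=bottom_probs.__getitem__)
    if bottom_probs[best] > threshold or parent_probs is None:
        return best

    # Third classification probability: rescale by the upper-layer probability, normalize.
    rescored = [parent_probs[parent_of[m]] * p for m, p in enumerate(bottom_probs)]
    total = sum(rescored)  # assumed to be positive
    third = [p / total for p in rescored]
    best = max(range(len(third)), key=third.__getitem__)
    return best if third[best] > threshold else None  # None: assumed fallback


# Hypothetical probabilities: three bottom-layer categories under two upper-layer ones.
choice = decide_category([0.45, 0.15, 0.40], [0.8, 0.2], [0, 0, 1], threshold=0.5)
```

With more than two levels, the same rescaling would presumably be repeated with each higher layer in turn until the threshold is exceeded or the highest layer is reached.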

FIG. 2 is a schematic flowchart of another multi-level multi-classifier image classification method. As shown in FIG. 2, an embodiment of the present application discloses another multi-level multi-classifier image classification method, which includes the following steps:

S210: Input the target image into the feature extraction classifier for feature extraction and category judgment to obtain the probability $p_i$ ($i = 1, \dots, k_1$) of each category, where $k_1$ is the number of categories of the feature extraction classifier.

S220: Calculate the heat maps of the multiple convolutional layers corresponding to each category.

S230: Use ROI Pooling to process the multiple heat maps into a uniform size and combine them into one multi-level heat map.

S240: Input the multi-level heat map of each category into each layer of classifier in the multi-level classifier to obtain the probability $p_{ij}$ ($i = 1, \dots, k_1$; $j = 1, \dots, k_2$) of each category at the current level, where $k_2$ is the number of categories at the current level.

S250: Perform probability accumulation on the probabilities of the categories at the current level to obtain $p_j = \sum_{i=1}^{k_1} p_i \, p_{ij}$.

S260: Normalize $p_j$ to obtain the final classification probability $p'_j = p_j / \sum_{j=1}^{k_2} p_j$.

S270: From the classification probabilities of each layer, select the maximum probability $p$ among the bottom-layer classification probabilities.

S280: Determine whether the maximum probability $p$ is greater than the preset threshold $h$;

If so, the category with the maximum probability is directly selected as the output;

Otherwise, the judgment is made in combination with the classification probabilities output by the layer above, and the probability $p_m$ of each bottom-layer category is recalculated as $p_m = p_n \, p_{mn}$, where $p_{mn}$ is the current probability of each bottom-layer category, $p_n$ is the probability of the corresponding upper-layer category, and $p_m$ is the recalculated probability of each bottom-layer category. $p_m$ is then normalized to obtain $p'_m = p_m / \sum_m p_m$.

S290: Determine whether the current level is the highest level; if so, directly output the category with the highest probability.

For example, suppose the bottom-layer classification is cat (0.4), dog (0.1), flower (0.4), grass (0.1) and the preset threshold is 0.5, so that no category can be decided, while the upper-layer classification is animal (0.9), plant (0.1). Recalculating with $p_m = p_n \, p_{mn}$ gives cat (0.9 × 0.4 = 0.36), dog (0.9 × 0.1 = 0.09), flower (0.1 × 0.4 = 0.04), grass (0.1 × 0.1 = 0.01). Normalization then gives cat (0.72), dog (0.18), flower (0.08), grass (0.02). The probability of cat is now greater than the preset threshold of 0.5, so the output category is cat.
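
The arithmetic of this example can be checked with a few lines of NumPy. The array layout and the `parent` index array are illustrative choices; only the probabilities, the threshold, and the category names come from the example.

```python
import numpy as np

labels = ["cat", "dog", "flower", "grass"]
bottom = np.array([0.4, 0.1, 0.4, 0.1])  # bottom-layer probabilities p_mn
upper = np.array([0.9, 0.1])             # upper-layer probabilities p_n: animal, plant
parent = np.array([0, 0, 1, 1])          # upper-layer index of each bottom category
threshold = 0.5

p_m = upper[parent] * bottom             # approximately 0.36, 0.09, 0.04, 0.01
p_m_normalized = p_m / p_m.sum()         # approximately 0.72, 0.18, 0.08, 0.02

winner = labels[int(np.argmax(p_m_normalized))]
decided = bool(p_m_normalized.max() > threshold)
```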

To better ensure prediction accuracy, in one possible implementation, before obtaining the feature map of each convolutional layer in each category and the first feature probability of the feature map produced by the feature extraction classifier performing pre-classification and feature extraction on the target image, the method further includes:

Pre-training the feature extraction classifier and the multi-level classifier.

Pre-training the feature extraction classifier and the multi-level classifier allows a more accurate category to be output when the target image is actually predicted.

In one possible implementation, pre-training the feature extraction classifier and the multi-level classifier may include the following steps:

Inputting training images of a target category into the convolutional neural network of the feature extraction classifier, and outputting the feature vectors of the feature maps of each convolutional neural network layer.

Selecting the feature vectors of the last n convolutional neural network layers for global average pooling, and splicing the pooled feature vectors into a first feature vector, where n is an integer greater than 1.

Fully connecting the first feature vector to the output layer and performing softmax classification.

Splicing the feature vectors of the last few convolutional neural network layers into the first feature vector allows the feature extraction classifier to be pre-trained on the feature vectors of those layers and, once pre-training is complete, to perform feature extraction and pre-classification on the training images.

Based on the pre-trained feature extraction classifier, performing feature extraction on the training images to obtain the feature map of each convolutional neural network layer and the second feature probability of the feature map.

Based on the second feature probability, performing probability accumulation on the feature maps of each convolutional neural network layer corresponding to the target category to obtain a second heat map corresponding to each convolutional neural network layer.

Processing the second heat maps into a uniform size and splicing them into one multi-level heat map.

Inputting the multi-level heat map into each layer of classifier in the multi-level classifier, and training each layer of classifier in the multi-level classifier.

Based on the pre-trained feature extraction classifier, the weight corresponding to each feature map in the multiple convolutional layers (that is, the second feature probability) can be obtained. The weights of the target category are selected, and the feature maps of each convolutional layer are accumulated according to these weights to obtain the second heat map of each convolutional layer. The second heat maps are spliced into one multi-level heat map, and the classifier of each layer can then be trained separately on this heat map.

Specifically, FIG. 3 shows a schematic diagram of the pre-training process of the feature extraction classifier and the multi-level multi-classifier. After an image is input into the feature extraction classifier and processed by multiple convolutional neural network layers, the feature vectors extracted by several of these layers are selected and subjected to global average pooling, and the pooled feature vectors are spliced to obtain the first feature vector. The first feature vector is then fully connected, and the fully connected output values are classified by softmax, which pre-trains the feature extraction classifier and also provides pre-classification and feature extraction for the image. Based on the trained feature extraction classifier, the weight corresponding to each feature map in the multiple convolutional layers can be obtained. The weights of the image's target category are selected, and accumulating the feature maps of each convolutional layer according to these weights gives the heat map of each convolutional layer (the second heat map described above). The multiple heat maps are spliced into one multi-level heat map and input into the multi-level multi-classifier; based on this multi-level heat map, the multiple convolutional neural networks in the multi-level multi-classifier, that is, the classifiers of each layer, can be trained.
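
The forward pass of the pre-training head described here (global average pooling of the selected layers, splicing, full connection, softmax) might be sketched as below. The layer shapes, the number of classes, and the randomly initialized weights `W` and `b` are hypothetical; training itself (loss and weight updates) is not shown.

```python
import numpy as np


def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Reduce feature maps of shape (C, H, W) to a feature vector of shape (C,)."""
    return feature_maps.mean(axis=(1, 2))


rng = np.random.default_rng(0)
# Hypothetical feature maps of the last n = 2 convolutional layers.
last_n_layers = [rng.random((16, 16, 16)), rng.random((32, 8, 8))]

# Splice the pooled vectors into the first feature vector (length 16 + 32 = 48).
first_feature_vector = np.concatenate([global_average_pool(f) for f in last_n_layers])

# Fully connected output layer with hypothetical, untrained parameters.
num_classes = 4
W = rng.normal(size=(num_classes, first_feature_vector.size))
b = np.zeros(num_classes)
logits = W @ first_feature_vector + b

# Softmax classification over the output layer.
exp = np.exp(logits - logits.max())
class_probabilities = exp / exp.sum()
```

Because global average pooling removes the spatial dimensions, layers with different resolutions can be spliced directly without resizing.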

Based on the multi-level multi-classifier image classification method described above, an embodiment of the present application discloses a multi-level multi-classifier image classification device 400. As shown in FIG. 4, the device includes:

An acquisition module 410, configured to obtain the feature map of each convolutional layer in each category, and the first feature probability of the feature map, produced by the feature extraction classifier performing pre-classification and feature extraction on the target image.

An accumulation module 420, configured to perform, based on the first feature probability, probability accumulation on the feature maps of each convolutional layer in each category to obtain the first heat map corresponding to each convolutional layer in each category.

A splicing module 430, configured to process the first heat maps into a uniform size and splice them into the first multi-level heat map of each category.

An input module 440, configured to input the first multi-level heat map into the multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier.

An output module 450, configured to output the category of the target image according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold.
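
One way to picture how the five modules chain together is the structural sketch below. The class name, field names, and the placeholder stages are hypothetical; each stage would be replaced by the actual processing of modules 410 to 450.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class ImageClassificationDevice:
    """Chains the five modules of device 400 in the order described above."""

    acquire: Callable[[Any], Any]     # acquisition module 410
    accumulate: Callable[[Any], Any]  # accumulation module 420
    splice: Callable[[Any], Any]      # splicing module 430
    classify: Callable[[Any], Any]    # input module 440
    output: Callable[[Any], Any]      # output module 450

    def run(self, image: Any) -> Any:
        features = self.acquire(image)
        heat_maps = self.accumulate(features)
        multi_level_heat_map = self.splice(heat_maps)
        layer_probabilities = self.classify(multi_level_heat_map)
        return self.output(layer_probabilities)


trace: List[str] = []


def stage(name: str) -> Callable[[Any], Any]:
    """Build a placeholder stage that records its name and passes the value through."""
    def run(value: Any) -> Any:
        trace.append(name)
        return value
    return run


device = ImageClassificationDevice(
    stage("410"), stage("420"), stage("430"), stage("440"), stage("450")
)
result = device.run("image")
```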

An embodiment of the present application discloses a multi-level multi-classifier image classification device. A feature extraction classifier performs pre-classification and feature extraction on the target image, and the feature extraction results are processed into first heat maps. After the first heat maps are spliced into a first multi-level heat map, the first multi-level heat map is input into the multi-level multi-classifier, which performs fine-grained classification of the target image and yields the classification probability of each layer's categories. The category of the target image is then output according to the quantitative relationship between the maximum of the bottom-layer classification probabilities and a preset threshold. This multi-level, multi-classification method improves the accuracy of target image classification and better addresses the high rate of misidentification and missed identification that current image classification methods suffer when the probabilities of all categories are low.

In one possible implementation, the input module 440 is configured to:

After the classification probability of each layer's categories output by the multi-level multi-classifier is obtained, the category of the target image can be determined and output according to the quantitative relationship between the maximum probability among the lowest-layer classification probabilities and the preset threshold. In one possible implementation, the output module 450 may be configured to:

In this way, the decision is based on the quantitative relationship between the maximum probability among the lowest-layer categories and the preset threshold, combined with the probabilities output by the layer above, until the maximum probability exceeds the preset threshold, at which point the category with the maximum probability is output.
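The threshold-based decision described above can be sketched as follows. This is an illustration only: the function name `decide_category`, the `parents` lookup table, and the rule for combining a lowest-layer probability with the probability from the layer above (multiplying by the ancestor category's probability and renormalizing) are assumptions, since the text does not fix a specific formula.

```python
import numpy as np

def decide_category(layer_probs, parents, threshold):
    """Return the index of the lowest-layer category to output.

    layer_probs: list of probability vectors, layer_probs[0] for the highest
                 layer and layer_probs[-1] for the lowest layer
    parents: parents[k][j] is the index, in layer k, of the parent of
             category j in layer k + 1 (hypothetical hierarchy table)
    threshold: preset probability threshold
    """
    probs = np.asarray(layer_probs[-1], dtype=float)
    # ancestor[j]: ancestor of lowest-layer category j in the layer being consulted
    ancestor = np.arange(len(probs))
    level = len(layer_probs) - 1
    while probs.max() <= threshold and level > 0:
        ancestor = np.asarray(parents[level - 1])[ancestor]
        level -= 1
        # Assumed combination rule: weight by the ancestor's probability.
        combined = probs * np.asarray(layer_probs[level], dtype=float)[ancestor]
        total = combined.sum()
        if total == 0.0:
            break  # nothing to renormalize; keep the previous probabilities
        probs = combined / total
    # Either the threshold was exceeded or the highest layer was reached.
    return int(probs.argmax())
```

With two layers, `layer_probs = [[0.2, 0.8], [0.4, 0.35, 0.25]]`, `parents = [[0, 1, 1]]` and a threshold of 0.5, no lowest-layer probability exceeds the threshold on its own, so the upper layer is consulted and the result shifts from category 0 to category 1, whose parent is far more probable.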

In one possible implementation, the device further includes:

A pre-training module, configured to pre-train the feature extraction classifier and the multi-level classifier before the acquisition module 410 operates.
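Claim 1 recites that pre-training the feature extraction classifier applies global average pooling to the feature maps of the last n convolutional layers, splices the pooled vectors into a first feature vector, and connects that vector to an output layer with softmax. The forward pass of such a head can be sketched as below. The function name `classifier_head` and the `weights` and `bias` parameters are hypothetical, and training (loss and weight updates) is omitted.

```python
import numpy as np

def classifier_head(feature_maps, weights, bias, n):
    """Forward pass: GAP over the last n layers, concatenation, FC, softmax.

    feature_maps: list of arrays of shape (channels_i, H_i, W_i), one per
                  convolutional layer, ordered from first to last
    weights: array of shape (num_classes, total pooled channels)
    bias: array of shape (num_classes,)
    n: number of trailing convolutional layers to use (n > 1)
    """
    # Global average pooling turns each (C, H, W) map into a length-C vector.
    pooled = [fm.mean(axis=(1, 2)) for fm in feature_maps[-n:]]
    first_feature_vector = np.concatenate(pooled)
    logits = weights @ first_feature_vector + bias
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()
```

Because pooling removes the spatial dimensions, layers with different resolutions can be concatenated directly; only the channel counts determine the length of the first feature vector.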

In a further technical solution, the pre-training module is configured to:

It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or device that comprises that element. In addition, the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Furthermore, features described with reference to certain examples may be combined in other examples.

The embodiments above focus on the differences between the individual embodiments. Provided they do not contradict one another, the different optimization features of the embodiments may be combined to form better embodiments; for brevity, these combinations are not described again here.

The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.

Claims

1. An image classification method of a multi-level multi-classifier, the method comprising:

obtaining a feature map of each convolutional layer in each category and a first feature probability of the feature map, the feature map and the first feature probability being obtained by a feature extraction classifier pre-classifying a target image and extracting features;

based on the first feature probability, performing probability accumulation on the feature maps of each convolutional layer in each category to obtain a first heat map corresponding to each convolutional layer in each category;

resizing the first heat maps to a uniform size and splicing them into a first multi-level heat map for each category;

inputting the first multi-level heat map into a multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier;

outputting the category of the target image according to the quantitative relationship between the maximum probability among the lowest-layer classification probabilities and a preset threshold;

wherein inputting the first multi-level heat map into the multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier comprises:

inputting the first multi-level heat map into the classifier of each layer in the multi-level classifier to obtain a first classification probability of each layer's categories;

based on the first feature probability, performing probability accumulation on the first classification probability of each layer's categories and the first feature probability to obtain a second classification probability;

normalizing the second classification probability to obtain the classification probability of each layer's categories;

wherein outputting the category of the target image according to the quantitative relation between the maximum probability of the lowest-layer classification probability and a preset threshold value comprises:

outputting a category corresponding to the maximum probability as the category of the target image under the condition that the maximum probability is larger than the preset threshold value;

when the maximum probability is not greater than the preset threshold, determining whether the current layer is the highest layer and, if so, outputting the category corresponding to the maximum probability as the category of the target image;

otherwise, recalculating a third classification probability of each category in the lowest layer by combining the classification probability output by the classifier of the layer above, and, when the maximum probability among the third classification probabilities is greater than the preset threshold, outputting the category corresponding to that maximum probability as the category of the target image;

wherein, before obtaining the feature map of each convolutional layer in each category and the first feature probability of the feature map, the method further comprises:

pre-training the feature extraction classifier and the multi-level classifier;

wherein said pre-training said feature extraction classifier and said multi-level classifier comprises:

inputting training images of a target category into the convolutional neural network in the feature extraction classifier, and outputting feature vectors of the feature maps of each convolutional neural network layer;

selecting the feature vectors of the last n convolutional neural network layers to carry out global average pooling, and splicing the pooled feature vectors into a first feature vector, wherein n is an integer greater than 1;

fully connecting the first feature vector to an output layer and performing softmax classification;

performing feature extraction on the training image based on the pre-trained feature extraction classifier to obtain a feature map of each convolutional neural network layer and a second feature probability of the feature map;

based on the second feature probability, performing probability accumulation on the feature maps of each convolutional neural network layer corresponding to the target category to obtain a second heat map corresponding to each convolutional neural network layer;

resizing the second heat maps to a uniform size and splicing them into a multi-level heat map;

and inputting the multi-level heat map into the classifier of each layer in the multi-level classifier, and training the classifier of each layer in the multi-level classifier.

2. An image classification apparatus of a multi-level multi-classifier, the apparatus comprising:

the acquisition module is used for acquiring a feature map of each convolutional layer in each category and a first feature probability of the feature map, wherein the feature map is obtained by the feature extraction classifier pre-classifying a target image and extracting features;

the accumulation module is used for performing probability accumulation on the feature maps of each convolutional layer in each category based on the first feature probability to obtain a first heat map corresponding to each convolutional layer in each category;

the splicing module is used for resizing the first heat maps to a uniform size and splicing them into a first multi-level heat map for each category;

the input module is used for inputting the first multi-level heat map into a multi-level classifier to obtain the classification probability of each layer's categories output by the multi-level classifier;

the output module is used for outputting the category of the target image according to the quantitative relationship between the maximum probability among the lowest-layer classification probabilities and a preset threshold;

wherein, the input module is used for:

wherein, the output module is used for:

the pre-training module is used for pre-training the feature extraction classifier and the multi-level classifier before the acquisition module;

wherein, the pre-training module is used for: