CN115496948A - A network-supervised fine-grained image recognition method and system based on deep learning - Google Patents

A network-supervised fine-grained image recognition method and system based on deep learning

Info

Publication number
CN115496948A
Authority
CN
China
Prior art keywords
graph
feature
noise label
feature map
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211167812.6A
Other languages
Chinese (zh)
Other versions
CN115496948B (en)
Inventor
林坚满
陈添水
林坚涛
杨志景
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202211167812.6A priority Critical patent/CN115496948B/en
Publication of CN115496948A publication Critical patent/CN115496948A/en
Application granted granted Critical
Publication of CN115496948B publication Critical patent/CN115496948B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a network-supervised fine-grained image recognition method and system based on deep learning. An input image carrying a noisy web label is processed to extract features and build an instance graph containing the noisy-label features; a graph prototype is constructed for each category from the labelled instance graphs; a preset graph matching neural network model is trained with the instance graphs and the graph prototypes; and the optimized graph matching neural network model is used to recognize fine-grained images. By introducing graph prototypes and contrasting them with the instance graphs containing noisy-label features, the method can effectively correct noisy labels and remove outlier samples, which significantly improves both the efficiency and the accuracy of fine-grained image recognition.

Description

A network-supervised fine-grained image recognition method and system based on deep learning

Technical Field

The present invention relates to the technical field of image recognition, and more specifically, to a network-supervised fine-grained image recognition method and system based on deep learning.

Background Art

Fine-grained image recognition aims to distinguish subcategories of a given object class, such as different species of birds or different models of aircraft and cars, and has important scientific significance and application value in fields such as smart-city construction and the Internet. In recent years, with the continuous development of deep learning, fine-grained image recognition has made great progress.

At present, most algorithms rely on deep learning driven by high-quality data to achieve fine-grained image recognition, and therefore depend heavily on large-scale manually annotated datasets. The difficulty of collecting such datasets and the high cost of annotation have become a bottleneck restricting the promotion and popularization of these methods.

With the rapid development of the Internet, a large amount of weakly labelled web data can be used to alleviate the dependence of current fine-grained image recognition algorithms on manual annotation; that is, data retrieved from the web can be used to train neural network models. However, web-retrieved data contains a certain proportion of noisy labels, which adversely affects model training. In addition, the small inter-class variance and large intra-class variance inherent to fine-grained images further increase the difficulty of recognition.

The prior art discloses a fine-grained image recognition algorithm based on distributed labels derived from inter-class similarity, comprising the following steps: a backbone network extracts the feature representation of the input image; a center-loss module computes the center loss from the feature representation and updates the class centers; a classification-loss module computes the classification loss (for example, cross-entropy loss) from the feature representation and the final label distribution, where the final label distribution is the weighted sum of the one-hot label distribution and the distributed label distribution generated from the class centers; and the weighted sum of the center loss and the classification loss gives the final objective function used to optimize the whole model. This prior-art method can alleviate overfitting by reducing the confidence of the model's predictions, can effectively learn the discriminative features of fine-grained data, and to a certain extent improves the accuracy of distinguishing different fine-grained categories. However, it still relies on deep learning driven by high-quality data to distinguish subordinate categories and therefore depends on large-scale manually annotated image data; data collection and annotation are costly, fine-grained image recognition with it is often time-consuming and labour-intensive, and both its efficiency and its accuracy remain low.

Summary of the Invention

To overcome the above-mentioned defect of the prior art, namely its low efficiency and accuracy in fine-grained image recognition, the present invention provides a network-supervised fine-grained image recognition method and system based on deep learning that can recognize fine-grained images efficiently and accurately.

To solve the above technical problem, the technical solution of the present invention is as follows:

A network-supervised fine-grained image recognition method based on deep learning, comprising the following steps:

S1: obtain input images containing noisy labels from the Internet;

S2: perform feature extraction on the input image containing the noisy label to obtain a region-discriminative feature map and an overall feature map;

S3: obtain an instance graph containing noisy-label features from the obtained region-discriminative feature map and overall feature map;

S4: construct a graph prototype for each category from the obtained instance graphs containing noisy-label features;

S5: input the obtained instance graphs containing noisy-label features and the graph prototypes into a preset graph matching neural network model for training, to obtain an optimized graph matching neural network model;

S6: acquire an image to be recognized, extract its features, and recognize it with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.
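For orientation, a toy end-to-end sketch of the order of operations in steps S1-S6 on random data follows; every component in it (the pooling stand-in for feature extraction and graph construction, the momentum value 0.9, the cosine matching) is an illustrative assumption, not the patent's actual network.

```python
# Toy sketch of the S1-S6 pipeline on random data; all components are
# illustrative stand-ins, not the patent's actual network.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
K, D = 5, 32                                  # number of classes, feature dim
prototypes = torch.randn(K, D)                # one prototype per category (S4)

def instance_feature(image):                  # stand-in for S2 + S3
    return image.mean(dim=(-2, -1))           # pool a (D, H, W) map to (D,)

# S1: one noisy web sample (random image, possibly wrong label).
image, label = torch.randn(D, 14, 14), 2
f_ins = instance_feature(image)
prototypes[label] = 0.9 * prototypes[label] + 0.1 * f_ins   # S4: moving average
sims = F.cosine_similarity(f_ins.unsqueeze(0), prototypes)  # S5: matching
loss = F.cross_entropy(sims.unsqueeze(0), torch.tensor([label]))
print("matching loss:", float(loss))

def recognize(image):                         # S6: nearest-prototype prediction
    f = instance_feature(image)
    return int(F.cosine_similarity(f.unsqueeze(0), prototypes).argmax())

print("predicted class:", recognize(torch.randn(D, 14, 14)))
```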

Preferably, in step S2, feature extraction is performed on the input image containing the noisy label to obtain the region-discriminative feature map and the overall feature map as follows:

A feature extractor extracts features from the input image containing the noisy label to obtain the overall feature map; the overall feature map is passed through a convolutional layer to obtain a mean-filtered overall feature map; the mean of each position of the mean-filtered overall feature map is computed over the channels to obtain an overall mean feature map; the maximum-response region of the overall mean feature map is searched and its coordinates located, and the region-discriminative feature map is obtained from the coordinates of the maximum-response region.

Preferably, the maximum-response region of the overall mean feature map is searched, and its coordinates located, according to the following formulas:

$$\bar{f}_g = \frac{1}{C}\sum_{c=1}^{C} f'_g(\cdot,\cdot,c)$$

$$(i, j) = \operatorname*{arg\,max}_{i,j}\; \bar{f}_g(i, j)$$

where $\bar{f}_g$ denotes the overall mean feature map, $f'_g$ denotes the mean-filtered overall feature map, $C$ denotes the number of channels of the mean-filtered overall feature map, the $\operatorname{arg\,max}$ searches the row and column corresponding to the maximum-response region, and $(i, j)$ are the coordinates of the maximum-response region.
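As a concrete illustration of these two formulas, the following sketch locates the maximum-response coordinate on a random feature map; the (C, H, W) tensor layout and all variable names are assumptions made for the example.

```python
# Sketch of the maximum-response localization: channel-wise mean, then argmax.
import torch

def locate_max_response(f_g_filtered: torch.Tensor):
    mean_map = f_g_filtered.mean(dim=0)         # overall mean feature map (H, W)
    flat_idx = int(torch.argmax(mean_map))      # index of the maximum response
    i, j = divmod(flat_idx, mean_map.shape[1])  # row i and column j
    return i, j

f_g = torch.randn(2048, 14, 14)                 # e.g. a 14x14x2048 backbone map
print(locate_max_response(f_g))
```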

Preferably, in step S3, the instance graph containing noisy-label features is obtained from the obtained region-discriminative feature map and overall feature map as follows:

The obtained region-discriminative feature maps are transformed to the same dimension by bilinear interpolation to obtain region feature maps of equal dimension; global average pooling is applied to the overall feature map and the equal-dimension region feature maps to obtain the dimension-reduced overall feature map and dimension-reduced region feature maps; the instance graph containing noisy-label features is obtained from the dimension-reduced overall feature map and region feature maps:

$$G_{ins} = \langle V_{ins}, E_{ins} \rangle$$

where $G_{ins}$ denotes the instance graph containing noisy-label features, $V_{ins}$ denotes the set of all feature points of the dimension-reduced overall feature map and dimension-reduced region feature maps, and $E_{ins}$ denotes the adjacency matrix of the connections between the feature points of the instance graph containing noisy-label features.
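A minimal sketch of this construction is given below; the target resize size, the use of a fully connected adjacency for E_ins, and all tensor shapes are assumptions for illustration.

```python
# Sketch of step S3: bilinear resize, global average pooling, graph assembly.
import torch
import torch.nn.functional as F

def build_instance_graph(f_global, region_maps):
    # Transform every region-discriminative map to the same spatial dimension.
    same_dim = [F.interpolate(r.unsqueeze(0), size=f_global.shape[-2:],
                              mode="bilinear", align_corners=False).squeeze(0)
                for r in region_maps]
    # Global average pooling reduces each (C, H, W) map to a C-dim feature point.
    v_ins = torch.stack([m.mean(dim=(-2, -1)) for m in [f_global] + same_dim])
    e_ins = torch.ones(len(v_ins), len(v_ins))  # assumed fully connected E_ins
    return v_ins, e_ins

v, e = build_instance_graph(torch.randn(2048, 14, 14),
                            [torch.randn(2048, 7, 7), torch.randn(2048, 9, 5)])
print(v.shape, e.shape)  # torch.Size([3, 2048]) torch.Size([3, 3])
```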

Preferably, in step S4, the graph prototypes are constructed from the obtained instance graphs containing noisy-label features as follows:

For each category, a graph prototype with the same structure as the instance graph containing noisy-label features is constructed, and the graph prototype is updated by a moving average:

$$G_k = \langle V_k, E_k \rangle$$

$$G'_k = m\,G_k + (1 - m)\,G_{ins}$$

where $G_k$ denotes the constructed graph prototype of the $k$-th category, $V_k$ denotes the set of all feature points of the graph prototype of the $k$-th category, $E_k$ denotes the adjacency matrix of the connections between the feature points of the graph prototype of the $k$-th category, $G'_k$ is the updated graph prototype, and $m$ is a preset parameter.
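A sketch of the moving-average update follows; applying the momentum m to the node features and the adjacency separately is an assumption consistent with updating the whole graph as G'_k = m*G_k + (1-m)*G_ins.

```python
# Sketch of the prototype moving-average update in step S4.
import torch

def update_prototype(v_k, e_k, v_ins, e_ins, m=0.9):  # m: preset momentum
    v_new = m * v_k + (1.0 - m) * v_ins   # node features of the prototype
    e_new = m * e_k + (1.0 - m) * e_ins   # adjacency of the prototype
    return v_new, e_new

v_k, e_k = torch.randn(3, 2048), torch.ones(3, 3)
v_i, e_i = torch.randn(3, 2048), torch.ones(3, 3)
print(update_prototype(v_k, e_k, v_i, e_i)[0].shape)  # torch.Size([3, 2048])
```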

Preferably, in step S5, the obtained instance graphs containing noisy-label features and the graph prototypes are input into the preset graph matching neural network model for training, and the optimized graph matching neural network model is obtained as follows:

The preset graph matching neural network model comprises an intra-graph propagation layer, a graph aggregation layer, an inter-graph propagation layer and a graph matching layer, and obtaining the optimized graph matching neural network model comprises the following steps:

S5.1: input the obtained instance graph $G_{ins}$ containing noisy-label features and the graph prototype $G_k$ into the intra-graph propagation layer to obtain a first feature matrix and a second feature matrix, and iteratively update the two matrices by graph convolution operations;

S5.2: input the iteratively updated first and second feature matrices into the graph aggregation layer for feature combination to obtain an aggregated feature vector;

S5.3: input the aggregated feature vector into the inter-graph propagation layer for graph convolution, and iteratively update it to obtain a first feature expression $f_{ins}$ and a second feature expression $Z_k$;

S5.4: input the first feature expression $f_{ins}$ and the second feature expression $Z_k$ into the graph matching layer to compute the similarity $S_k$, and compute the graph matching loss $\mathcal{L}_{gm}$ from the similarity $S_k$;

S5.5: correct the noisy labels of the instance graphs containing noisy-label features and remove outlier samples;

S5.6: compute the classification cross-entropy loss $\mathcal{L}_{ce}$ and the total loss $\mathcal{L}$, and optimize the graph matching neural network model according to the total loss $\mathcal{L}$ to obtain the optimized graph matching neural network model.

Preferably, in step S5.4, the first feature expression $f_{ins}$ and the second feature expression $Z_k$ are input into the graph matching layer to compute the similarity $S_k$, and the graph matching loss $\mathcal{L}_{gm}$ is computed from the similarity $S_k$, specifically:

The first feature expression $f_{ins}$ and the second feature expression $Z_k$ are input into the graph matching layer for graph matching, and the similarity $S_k$ is computed:

$$S_k = \frac{f_{ins}^{\top} Z_k}{\lVert f_{ins} \rVert\,\lVert Z_k \rVert}$$

A graph matching loss function is set in the graph matching layer, and the graph matching loss is computed according to the similarity $S_k$; the graph matching loss function is specifically:

$$P(k \mid G_{ins}) = \frac{\exp(S_k)}{\sum_{k'=1}^{K} \exp(S_{k'})}$$

$$\mathcal{L}_{gm} = -\log P(y_i \mid G_{ins})$$

where $\mathcal{L}_{gm}$ is the graph matching loss, $y_i$ denotes the original label, $k$ denotes the category of the graph prototype, and $K$ denotes the total number of graph prototype categories.
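A sketch of step S5.4's similarity and loss computation follows, under the cosine-similarity and softmax cross-entropy readings reconstructed above; both readings are assumptions.

```python
# Sketch of step S5.4: similarity to each prototype, then matching loss.
import torch
import torch.nn.functional as F

def graph_matching_loss(f_ins, z, y_i):
    # f_ins: (D,) instance expression; z: (K, D) prototype expressions.
    s = F.cosine_similarity(f_ins.unsqueeze(0), z)        # similarities S_k
    return F.cross_entropy(s.unsqueeze(0), torch.tensor([y_i]))

print(graph_matching_loss(torch.randn(2048), torch.randn(200, 2048), 3))
```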

Preferably, in step S5.5, the noisy labels of the instance graphs containing noisy-label features are corrected and outlier samples are removed as follows:

The intra-graph propagation layer is provided with a classifier. The instance graph containing noisy-label features is input into the classifier to obtain the classifier distribution probability $p_i$; the graph matching distribution probability $d_i$ is computed; and the total probability $q_i$ is computed from the classifier distribution probability $p_i$ and the graph matching distribution probability $d_i$:

$$q_i = \alpha\, p_i + (1 - \alpha)\, d_i$$

$$d_i = \frac{\exp(S_k / \tau)}{\sum_{k'=1}^{K} \exp(S_{k'} / \tau)}$$

where $\alpha$ is a preset parameter and $\tau$ is a temperature coefficient;

The noisy labels of the instance graphs containing noisy-label features are corrected, and outlier (OOD) samples removed, according to the total probability $q_i$ and a preset threshold $T$, specifically:

$$\hat{y}_i = \begin{cases} \arg\max_k q_{ik}, & \max_k q_{ik} > T \\ y_i, & q_{i y_i} > \frac{1}{K}\sum_{k=1}^{K} q_{ik} \\ \mathrm{OOD}, & \text{otherwise} \end{cases}$$

where $\hat{y}_i$ is the pseudo-label and $T$ is a preset threshold. When the maximum of the total probability $q_i$ is greater than $T$, the category corresponding to that maximum is taken as the pseudo-label; when the total probability $q_i$ of the original class is greater than the mean class probability, the original label $y_i$ is taken as the pseudo-label, thereby correcting the noisy labels of the instance graphs containing noisy-label features; in all other cases OOD is taken as the pseudo-label, where OOD denotes an outlier sample, thereby removing the outlier samples.
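The correction rule can be sketched as below; encoding OOD as -1 and passing the raw similarities S_k in place of a precomputed d_i are illustrative choices.

```python
# Sketch of step S5.5: pseudo-label assignment with correction and OOD removal.
import torch

def correct_label(p, s, y, alpha=0.5, tau=0.1, T=0.75):
    d = torch.softmax(s / tau, dim=0)      # graph matching distribution d_i
    q = alpha * p + (1 - alpha) * d        # total probability q_i
    if q.max() > T:                        # confident: arg-max class
        return int(q.argmax())
    if q[y] > q.mean():                    # plausible: keep the web label
        return y
    return -1                              # otherwise: out-of-distribution

p = torch.softmax(torch.randn(200), dim=0)
print(correct_label(p, torch.randn(200), y=7))
```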

Preferably, in step S5.6, the classification cross-entropy loss $\mathcal{L}_{ce}$ and the total loss $\mathcal{L}$ are computed, and the graph matching neural network model is optimized according to the total loss $\mathcal{L}$ to obtain the optimized graph matching neural network model, specifically:

The intra-graph propagation layer is provided with a classification cross-entropy loss function, specifically:

$$\mathcal{L}_{ce} = -\sum_{i}\sum_{j} \hat{y}_{ij}\, \log p_{ij}$$

where $\mathcal{L}_{ce}$ is the classification cross-entropy loss, $p_{ij}$ is the classifier distribution probability of the $i$-th instance graph containing noisy-label features with respect to the $j$-th category, and $\hat{y}_{ij}$ is the pseudo-label of the $i$-th instance graph containing noisy-label features with respect to the $j$-th category;

The total loss function is constructed from the classification cross-entropy loss function and the graph matching loss function, specifically:

$$\mathcal{L} = \mathcal{L}_{ce} + \lambda_{pro}\, \mathcal{L}_{gm}$$

where $\mathcal{L}$ is the total loss and $\lambda_{pro}$ is a proportionality coefficient;

The graph matching neural network model is optimized according to the total loss $\mathcal{L}$ to obtain the optimized graph matching neural network model.
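A sketch of the total objective follows; averaging the cross-entropy over only the retained (non-OOD) samples is an assumption.

```python
# Sketch of step S5.6: total loss = cross-entropy + lambda_pro * matching loss.
import torch
import torch.nn.functional as F

def total_loss(logits, pseudo_labels, l_gm, lambda_pro=1.0):
    keep = pseudo_labels >= 0                      # drop OOD samples (-1)
    l_ce = F.cross_entropy(logits[keep], pseudo_labels[keep])
    return l_ce + lambda_pro * l_gm

logits = torch.randn(4, 200)
labels = torch.tensor([3, -1, 7, 3])               # one sample marked OOD
print(total_loss(logits, labels, l_gm=torch.tensor(0.5)))
```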

The present invention further provides a network-supervised fine-grained image recognition system based on deep learning, applying the above network-supervised fine-grained image recognition method based on deep learning and comprising:

an image acquisition unit, used to obtain input images containing noisy labels from the Internet;

a feature extraction unit, used to perform feature extraction on the input image containing the noisy label to obtain the region-discriminative feature map and the overall feature map;

an instance graph generation unit, used to obtain the instance graph containing noisy-label features from the obtained region-discriminative feature map and overall feature map;

a graph prototype construction unit, used to construct a graph prototype for each category from the obtained instance graphs containing noisy-label features;

a graph matching unit, used to input the obtained instance graphs containing noisy-label features and the graph prototypes into the preset graph matching neural network model for training, to obtain the optimized graph matching neural network model;

an image recognition unit, used to acquire the image to be recognized, extract its features, and recognize it with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.

Compared with the prior art, the beneficial effects of the technical solution of the present invention are:

The present invention provides a network-supervised fine-grained image recognition method and system based on deep learning. The method performs feature processing on input images containing noisy labels to obtain instance graphs containing noisy-label features, uses these instance graphs to construct a corresponding graph prototype for each category, trains a preset graph matching neural network model with the instance graphs and the graph prototypes while correcting the noisy labels, and uses the optimized model to recognize fine-grained images. The method performs network-supervised fine-grained image recognition based on deep learning; by introducing graph prototypes and contrasting them with the instance graphs containing noisy-label features, it can effectively correct noisy labels, and it significantly improves the efficiency and accuracy of fine-grained image recognition.

Brief Description of the Drawings

Fig. 1 is a flow chart of the deep-learning-based network-supervised fine-grained image recognition method provided in Embodiment 1.

Fig. 2 is a schematic diagram of the deep-learning-based network-supervised fine-grained image recognition method provided in Embodiment 2.

Fig. 3 is a structural diagram of the deep-learning-based network-supervised fine-grained image recognition system provided in Embodiment 3.

301: image acquisition unit; 302: feature extraction unit; 303: instance graph generation unit; 304: graph prototype construction unit; 305: graph matching unit; 306: image recognition unit.

Detailed Description of the Embodiments

The accompanying drawings are for illustrative purposes only and shall not be construed as limiting this patent;

In order to better illustrate the embodiments, some parts of the drawings may be omitted, enlarged or reduced, and they do not represent the dimensions of the actual product;

Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.

The technical solution of the present invention is further described below with reference to the drawings and embodiments.

Embodiment 1

As shown in Fig. 1, this embodiment provides a network-supervised fine-grained image recognition method based on deep learning, comprising the following steps:

S1: obtain input images containing noisy labels from the Internet;

S2: perform feature extraction on the input image containing the noisy label to obtain a region-discriminative feature map and an overall feature map;

S3: obtain an instance graph containing noisy-label features from the obtained region-discriminative feature map and overall feature map;

S4: construct a graph prototype for each category from the obtained instance graphs containing noisy-label features;

S5: input the obtained instance graphs containing noisy-label features and the graph prototypes into a preset graph matching neural network model for training, to obtain an optimized graph matching neural network model;

S6: acquire an image to be recognized, extract its features, and recognize it with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.

In the specific implementation process, input images containing noisy labels are first obtained by web retrieval; a CNN (convolutional neural network) is then used to extract features from the input image containing the noisy label to obtain the region-discriminative feature map and the overall feature map; the instance graph containing noisy-label features is obtained from these two maps; a corresponding graph prototype is constructed for each category from the instance graphs containing noisy-label features; the instance graphs and the graph prototypes are input into the preset graph matching neural network model for training, with the graph matching loss and the classification cross-entropy loss computed to optimize the network and obtain the optimized graph matching neural network model; finally, the optimized graph matching neural network model is used to recognize the image to be recognized and obtain its recognition result.

This method performs fine-grained image recognition based on deep learning; by introducing graph prototypes and contrasting them with instance graphs containing noisy-label features, it can effectively correct noisy labels and significantly improves the efficiency and accuracy of fine-grained image recognition.

Embodiment 2

As shown in Fig. 2, this embodiment provides a network-supervised fine-grained image recognition method based on deep learning, comprising the following steps:

S1: obtain input images containing noisy labels from the Internet;

S2: perform feature extraction on the input image containing the noisy label to obtain the region-discriminative feature map and the overall feature map, specifically:

A feature extractor extracts features from the input image containing the noisy label to obtain the overall feature map; the overall feature map is passed through a convolutional layer to obtain a mean-filtered overall feature map; the mean of each position of the mean-filtered overall feature map is computed over the channels to obtain an overall mean feature map; the maximum-response region of the overall mean feature map is searched and its coordinates located, and the region-discriminative feature map is obtained from the coordinates of the maximum-response region;

The maximum-response region of the overall mean feature map is searched, and its coordinates located, according to the following formulas:

$$\bar{f}_g = \frac{1}{C}\sum_{c=1}^{C} f'_g(\cdot,\cdot,c)$$

$$(i, j) = \operatorname*{arg\,max}_{i,j}\; \bar{f}_g(i, j)$$

where $\bar{f}_g$ denotes the overall mean feature map, $f'_g$ denotes the mean-filtered overall feature map, $C$ denotes the number of channels of the mean-filtered overall feature map, the $\operatorname{arg\,max}$ searches the row and column corresponding to the maximum-response region, and $(i, j)$ are the coordinates of the maximum-response region;

S3: obtain the instance graph containing noisy-label features from the obtained region-discriminative feature map and overall feature map, specifically:

The obtained region-discriminative feature maps are transformed to the same dimension by bilinear interpolation to obtain region feature maps of equal dimension; global average pooling is applied to the overall feature map and the equal-dimension region feature maps to obtain the dimension-reduced overall feature map and dimension-reduced region feature maps; the instance graph containing noisy-label features is obtained from the dimension-reduced overall feature map and region feature maps:

$$G_{ins} = \langle V_{ins}, E_{ins} \rangle$$

where $G_{ins}$ denotes the instance graph containing noisy-label features, $V_{ins}$ denotes the set of all feature points of the dimension-reduced overall feature map and dimension-reduced region feature maps, and $E_{ins}$ denotes the adjacency matrix of the connections between the feature points of the instance graph containing noisy-label features;

S4: construct a graph prototype for each category from the obtained instance graphs containing noisy-label features, specifically:

For each category, a graph prototype with the same structure as the instance graph containing noisy-label features is constructed, and the graph prototype is updated by a moving average:

$$G_k = \langle V_k, E_k \rangle$$

$$G'_k = m\,G_k + (1 - m)\,G_{ins}$$

where $G_k$ denotes the constructed graph prototype of the $k$-th category, $V_k$ denotes the set of all feature points of the graph prototype of the $k$-th category, $E_k$ denotes the adjacency matrix of the connections between the feature points of the graph prototype of the $k$-th category, $G'_k$ is the updated graph prototype, and $m$ is a preset parameter;

S5: input the obtained instance graphs containing noisy-label features and the graph prototypes into the preset graph matching neural network model for training, to obtain the optimized graph matching neural network model;

The preset graph matching neural network model comprises an intra-graph propagation layer, a graph aggregation layer, an inter-graph propagation layer and a graph matching layer, and obtaining the optimized graph matching neural network model comprises the following steps:

S5.1: input the obtained instance graph $G_{ins}$ containing noisy-label features and the graph prototype $G_k$ into the intra-graph propagation layer to obtain the first feature matrix and the second feature matrix, and iteratively update them by graph convolution operations, specifically:

The obtained instance graph $G_{ins}$ containing noisy-label features and the graph prototype $G_k$ are input into the intra-graph propagation layer. The set $V_{ins}$ of all feature points of the dimension-reduced overall feature map and dimension-reduced region feature maps is reshaped into the first feature matrix $V_{ins} \in \mathbb{R}^{n_1 \times c_1}$, where $n_1$ is the number of feature points of the instance graph containing noisy-label features and $c_1$ is the dimension of each feature point of that instance graph;

The set $V_k$ of all feature points of the graph prototype is reshaped into the second feature matrix $V_k \in \mathbb{R}^{n_2 \times c_2}$, where $n_2$ is the number of feature points of the graph prototype and $c_2$ is the dimension of each feature point of the graph prototype;

Graph convolution operations are performed on the first feature matrix and the second feature matrix, which are iteratively updated as follows:

$$V_{ins}^{(l+1)} = \sigma\!\left(E_{ins}\, V_{ins}^{(l)}\, W_{ins}^{(l)}\right)$$

$$V_{k}^{(l+1)} = \sigma\!\left(E_{k}\, V_{k}^{(l)}\, W_{k}^{(l)}\right)$$

where $V_{ins}^{(l+1)}$ is the first feature matrix after the $l$-th iterative update, $V_{k}^{(l+1)}$ is the second feature matrix after the $l$-th iterative update, and $W_{ins}^{(l)}$ and $W_{k}^{(l)}$ are the parameters of the intra-graph propagation layer;
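One intra-graph propagation step, under the reconstructed form V^(l+1) = sigma(E V^(l) W^(l)) above, can be sketched as follows; the ReLU activation and the output width are assumptions (the embodiment reports output channel numbers of 1024 and 2048).

```python
# Sketch of one intra-graph propagation (GCN) layer: V' = ReLU(E @ V @ W).
import torch

class IntraGraphLayer(torch.nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.w = torch.nn.Linear(c_in, c_out, bias=False)  # layer parameter W

    def forward(self, v, e):
        # v: (n, c_in) feature matrix; e: (n, n) adjacency matrix.
        return torch.relu(self.w(e @ v))

layer = IntraGraphLayer(2048, 1024)
print(layer(torch.randn(10, 2048), torch.ones(10, 10)).shape)  # (10, 1024)
```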

S5.2: input the iteratively updated first feature matrix and second feature matrix into the graph aggregation layer for feature combination to obtain the aggregated feature vector, specifically:

The iteratively updated first feature matrix and second feature matrix are input into the graph aggregation layer and combined, here read as a concatenation of the two updated matrices:

$$V_{cross}^{(0)} = \left[\, V_{ins}^{(L)} \,;\; V_{k}^{(L)} \,\right]$$

where $V_{cross}^{(0)}$ is the aggregated feature vector, $V_{ins}^{(L)}$ is the updated first feature matrix, and $V_{k}^{(L)}$ is the updated second feature matrix;
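Under the concatenation reading above, the aggregation layer can be sketched in a few lines; concatenating along the node dimension is an assumption.

```python
# Sketch of the graph aggregation layer in step S5.2.
import torch

def aggregate(v_ins, v_k):
    return torch.cat([v_ins, v_k], dim=0)  # (n1 + n2, c) aggregated features

print(aggregate(torch.randn(3, 1024), torch.randn(3, 1024)).shape)  # (6, 1024)
```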

S5.3: input the aggregated feature vector into the inter-graph propagation layer for graph convolution, and iteratively update it to obtain the first feature expression $f_{ins}$ and the second feature expression $Z_k$, specifically:

The aggregated feature vector is input into the inter-graph propagation layer for graph convolution and iteratively updated:

$$V_{cross}^{(l+1)} = \sigma\!\left(E_{cross}\, V_{cross}^{(l)}\, W_{cross}^{(l)}\right)$$

where $V_{cross}^{(l+1)}$ is the aggregated feature vector after the $l$-th iterative update, $E_{cross}$ is the adjacency matrix of the aggregated feature vector, and $W_{cross}^{(l)}$ denotes the parameters of the inter-graph propagation layer;

The first feature expression $f_{ins}$ and the second feature expression $Z_k$ are obtained from the aggregated feature vector after the $l$-th iterative update;

S5.4: input the first feature expression $f_{ins}$ and the second feature expression $Z_k$ into the graph matching layer to compute the similarity $S_k$, and compute the graph matching loss $\mathcal{L}_{gm}$ from the similarity $S_k$, specifically:

The first feature expression $f_{ins}$ and the second feature expression $Z_k$ are input into the graph matching layer for graph matching, and the similarity $S_k$ is computed:

$$S_k = \frac{f_{ins}^{\top} Z_k}{\lVert f_{ins} \rVert\,\lVert Z_k \rVert}$$

A graph matching loss function is set in the graph matching layer, and the graph matching loss is computed according to the similarity $S_k$:

$$P(k \mid G_{ins}) = \frac{\exp(S_k)}{\sum_{k'=1}^{K} \exp(S_{k'})}$$

$$\mathcal{L}_{gm} = -\log P(y_i \mid G_{ins})$$

where $\mathcal{L}_{gm}$ is the graph matching loss, $y_i$ denotes the original label, $k$ denotes the category of the graph prototype, and $K$ denotes the total number of graph prototype categories;

S5.5: correct the noisy labels of the instance graphs containing noisy-label features and remove outlier samples, specifically:

The intra-graph propagation layer is provided with a classifier. The instance graph containing noisy-label features is input into the classifier to obtain the classifier distribution probability $p_i$; the graph matching distribution probability $d_i$ is computed; and the total probability $q_i$ is computed from the classifier distribution probability $p_i$ and the graph matching distribution probability $d_i$:

$$q_i = \alpha\, p_i + (1 - \alpha)\, d_i$$

$$d_i = \frac{\exp(S_k / \tau)}{\sum_{k'=1}^{K} \exp(S_{k'} / \tau)}$$

where $\alpha$ is a preset parameter and $\tau$ is a temperature coefficient;

The noisy labels of the instance graphs containing noisy-label features are corrected, and outlier (OOD) samples removed, according to the total probability $q_i$ and the preset threshold $T$:

$$\hat{y}_i = \begin{cases} \arg\max_k q_{ik}, & \max_k q_{ik} > T \\ y_i, & q_{i y_i} > \frac{1}{K}\sum_{k=1}^{K} q_{ik} \\ \mathrm{OOD}, & \text{otherwise} \end{cases}$$

where $\hat{y}_i$ is the pseudo-label and $T$ is a preset threshold; when the maximum of the total probability $q_i$ is greater than $T$, the category corresponding to that maximum is taken as the pseudo-label; when the total probability $q_i$ of the original class is greater than the mean class probability, the original label $y_i$ is taken as the pseudo-label, thereby correcting the noisy labels of the instance graphs containing noisy-label features; in all other cases OOD is taken as the pseudo-label, where OOD denotes an outlier sample, thereby removing the outlier samples;

S5.6: compute the classification cross-entropy loss $\mathcal{L}_{ce}$ and the total loss $\mathcal{L}$, and optimize the graph matching neural network model according to the total loss $\mathcal{L}$ to obtain the optimized graph matching neural network model, specifically:

The intra-graph propagation layer is provided with a classification cross-entropy loss function:

$$\mathcal{L}_{ce} = -\sum_{i}\sum_{j} \hat{y}_{ij}\, \log p_{ij}$$

where $\mathcal{L}_{ce}$ is the classification cross-entropy loss, $p_{ij}$ is the classifier distribution probability of the $i$-th instance graph containing noisy-label features with respect to the $j$-th category, and $\hat{y}_{ij}$ is the pseudo-label of the $i$-th instance graph containing noisy-label features with respect to the $j$-th category;

The total loss function is constructed from the classification cross-entropy loss function and the graph matching loss function:

$$\mathcal{L} = \mathcal{L}_{ce} + \lambda_{pro}\, \mathcal{L}_{gm}$$

where $\mathcal{L}$ is the total loss and $\lambda_{pro}$ is a proportionality coefficient;

The graph matching neural network model is optimized according to the total loss $\mathcal{L}$ to obtain the optimized graph matching neural network model;

S6: acquire the image to be recognized, extract its features, and recognize it with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.

In the specific implementation process, input images containing noisy labels are first obtained by web retrieval. The dataset used in this embodiment is WebFG-496, which consists of three sub-datasets, Web-Bird, Web-Aircraft and Web-Car; the input images containing noisy labels have a size of 448×448;

A convolutional neural network with a ResNet-50 variant as the backbone CNN is then set up. The feature extractor extracts features from the input image containing the noisy label to obtain the overall feature map, whose dimensions are 14×14×2048; the overall feature map is passed through a convolutional layer to obtain the mean-filtered overall feature map, and the mean of each position of the mean-filtered overall feature map is computed over the channels to obtain the overall mean feature map;

The maximum-response region of the overall mean feature map is searched, and its coordinates located, according to the following formulas:

$$\bar{f}_g = \frac{1}{C}\sum_{c=1}^{C} f'_g(\cdot,\cdot,c)$$

$$(i, j) = \operatorname*{arg\,max}_{i,j}\; \bar{f}_g(i, j)$$

where $\bar{f}_g$ denotes the overall mean feature map, $f'_g$ denotes the mean-filtered overall feature map, $C$ denotes the number of channels of the mean-filtered overall feature map, the $\operatorname{arg\,max}$ searches the row and column corresponding to the maximum-response region, and $(i, j)$ are the coordinates of the maximum-response region;

Several local regions of different sizes are cropped from the overall feature map according to the coordinates of the maximum-response region. This embodiment crops the overall feature map with the nine combinations of three area sizes $S_1$, $S_2$, $S_3$ and three aspect ratios $A_1$, $A_2$, $A_3$, where the three area sizes $S_1$, $S_2$, $S_3$ are one half, one third and two thirds of the area of the overall feature map, respectively, and the three aspect ratios $A_1$, $A_2$, $A_3$ are 1, 0.5 and 2, respectively (see the sketch after this paragraph);
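The following sketch enumerates the nine crop boxes; centring each crop on the maximum-response coordinate and clamping it to the map border are illustrative choices.

```python
# Sketch of the nine crops: areas {1/2, 1/3, 2/3} x aspect ratios {1, 0.5, 2}.
import itertools
import math

def crop_boxes(h, w, i, j):
    boxes = []
    for s, a in itertools.product([1/2, 1/3, 2/3], [1.0, 0.5, 2.0]):
        ch = int(round(math.sqrt(s * h * w / a)))       # crop height
        cw = int(round(math.sqrt(s * h * w * a)))       # crop width (cw/ch = a)
        top = min(max(i - ch // 2, 0), max(h - ch, 0))  # clamp to the map
        left = min(max(j - cw // 2, 0), max(w - cw, 0))
        boxes.append((top, left, min(ch, h), min(cw, w)))
    return boxes

print(crop_boxes(14, 14, 6, 7))  # nine (top, left, height, width) boxes
```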

The feature extractor then extracts features from the cropped local regions of different sizes to obtain the region-discriminative feature maps;

The instance graph containing noisy-label features and the graph prototype for each category are constructed, and both are input into the intra-graph propagation layer (a GCN) for graph convolution; in this embodiment the numbers of output channels are 1024 and 2048, respectively. The output instance-graph features and graph-prototype features are aggregated to obtain the first feature expression $f_{ins}$ and the second feature expression $Z_k$, from which the graph matching loss and the classification cross-entropy loss are computed to optimize the graph matching neural network model;

In this embodiment, $\alpha = 0.5$, $\tau = 0.1$, $T = 0.75$ and $\lambda_{pro} = 1$;

Images to be recognized are obtained from CUB200-2011, FGVC-Aircraft and Stanford Cars as validation data; after the features of the image to be recognized are extracted, the optimized graph matching neural network model is used to recognize it and obtain the recognition result;

The recognition accuracy of fine-grained images for the different methods is compared in Table 1 (the table itself appears only as an image in the original publication):

Table 1. Comparison of the recognition accuracy of fine-grained images for different methods

Compared with the baseline models, the method of this embodiment far outperforms all of them on the three datasets. The backbone network used in this embodiment is ResNet-50; compared with the plain ResNet-50 model, the method of this embodiment achieves a large improvement on all three datasets, raising the average recognition accuracy by 20.14%. For a fair comparison, ResNet-50 is used uniformly as the backbone network: with ResNet-50 as the backbone, the method of this embodiment achieves the highest average accuracy of 83.53%, with accuracies of 76.62%, 85.79% and 82.09% on Web-Bird, Web-Aircraft and Web-Car, respectively, exceeding the relatively advanced Peer-learning method by 2.23%, 4.2% and 1.94%. When other models such as B-CNN are further used as the backbone network, the comparison results show that the method of this embodiment can be adapted to different backbone networks and thereby obtain a clear performance improvement in fine-grained image recognition;

This method performs network-supervised fine-grained image recognition based on deep learning; by introducing graph prototypes and contrasting them with instance graphs containing noisy-label features, it can effectively correct noisy labels and significantly improves the efficiency and accuracy of fine-grained image recognition.

实施例3Example 3

如图3所示,本实施例提供一种基于深度学习的网络监督细粒度图像识别系统,应用实施例1或2所述的基于深度学习的网络监督细粒度图像识别方法,包括:As shown in FIG. 3 , this embodiment provides a network-supervised fine-grained image recognition system based on deep learning, applying the deep-learning-based network-supervised fine-grained image recognition method described in Embodiment 1 or 2, including:

图像获取单元301:用来从互联网中获取含有噪声标签的输入图像;Image acquisition unit 301: used to acquire input images containing noise labels from the Internet;

特征提取单元302:用来对所述含有噪声标签的输入图像进行特征提取,获取区域判别特征图和整体特征图;Feature extraction unit 302: used to perform feature extraction on the input image containing noise labels, and obtain a region discrimination feature map and an overall feature map;

实例图生成单元303:用来根据所获得的区域判别特征图和整体特征图,获取含有噪声标签特征的实例图;Instance map generation unit 303: used to obtain an instance map containing noise label features according to the obtained region discrimination feature map and overall feature map;

图原型构造单元304:用来根据所获取的含有噪声标签特征的实例图,为每个类别构造图原型;Graph prototype construction unit 304: used to construct a graph prototype for each category according to the obtained instance graph containing noise label features;

图匹配单元305:用来将所获得的含有噪声标签特征的实例图与图原型输入预置的图匹配神经网络模型中进行训练,获得优化后的图匹配神经网络模型;Graph matching unit 305: used to input the obtained instance graph and graph prototype containing noise label features into the preset graph matching neural network model for training, and obtain the optimized graph matching neural network model;

图像识别单元306:用来获取待识别图像,提取待识别图像特征后,利用所述优化后的图匹配神经网络模型对待识别图像进行识别,获得待识别图像的识别结果;Image recognition unit 306: used to acquire the image to be recognized, extract the features of the image to be recognized, use the optimized graph matching neural network model to recognize the image to be recognized, and obtain the recognition result of the image to be recognized;

在具体实施过程中,首先利用图像获取单元301进行网络检索,获取含有噪声标签的输入图像;之后利用特征提取单元302对所述含有噪声标签的输入图像进行特征提取,获取区域判别特征图和整体特征图;利用实例图生成单元303根据所获得的区域判别特征图和整体特征图,获取含有噪声标签特征的实例图;之后根据所获取的含有噪声标签特征的实例图,利用图原型构造单元304为每个类别构造图原型;之后利用图匹配单元305将所获得的含有噪声标签特征的实例图与图原型输入预置的图匹配神经网络模型中进行训练,获得优化后的图匹配神经网络模型;最后图像识别单元306获取待识别图像,提取待识别图像特征后,利用所述优化后的图像匹配神经网络模型对待识别图像进行识别,获得待识别图像的识别结果;In the specific implementation process, first use the image acquisition unit 301 to perform network retrieval to obtain the input image containing noise labels; then use the feature extraction unit 302 to perform feature extraction on the input image containing noise labels to obtain the region discrimination feature map and the overall Feature map: Utilize the example map generation unit 303 to obtain an example map containing noise label features according to the obtained region discrimination feature map and overall feature map; then use the graph prototype construction unit 304 according to the obtained example map containing noise label features Construct a graph prototype for each category; then use the graph matching unit 305 to input the obtained instance graph containing noise label features and the graph prototype into the preset graph matching neural network model for training, and obtain the optimized graph matching neural network model ; Finally, the image recognition unit 306 acquires the image to be recognized, extracts the features of the image to be recognized, uses the optimized image matching neural network model to recognize the image to be recognized, and obtains the recognition result of the image to be recognized;

The system performs fine-grained image recognition based on deep learning. By introducing graph prototypes and contrasting them with instance graphs containing noise label features, it can effectively correct noise labels, significantly improving both the efficiency and the accuracy of fine-grained image recognition.
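A minimal PyTorch sketch of how the inference path of these units might be wired together; every name below (recognize, extractor, graph_builder, matcher) is an illustrative assumption, not an identifier from this patent.

```python
# A minimal sketch under assumed interfaces for each unit of Embodiment 3.
import torch

def recognize(image: torch.Tensor, extractor, graph_builder, matcher) -> int:
    overall, region = extractor(image)              # unit 302: two feature maps
    V_ins, E_ins = graph_builder(overall, region)   # unit 303: instance graph
    logits = matcher.classify(V_ins, E_ins)         # unit 306: model trained by 304/305
    return int(logits.argmax(dim=-1))               # recognition result
```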

The same or similar reference numerals correspond to the same or similar components;

The terms describing positional relationships in the drawings are used for illustrative purposes only and shall not be construed as limiting this patent;

Apparently, the above embodiments of the present invention are merely examples given to clearly illustrate the present invention, and are not intended to limit its implementation. Those of ordinary skill in the art may make changes or variations in other forms on the basis of the above description. It is neither necessary nor possible to exhaustively list all implementations here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A deep-learning-based network-supervised fine-grained image recognition method, characterized by comprising the following steps:
S1: acquiring an input image containing a noise label from the Internet;
S2: performing feature extraction on the input image containing the noise label to obtain a region discrimination feature map and an overall feature map;
S3: acquiring an instance graph containing noise label features according to the obtained region discrimination feature map and overall feature map;
S4: constructing a graph prototype for each category according to the obtained instance graphs containing noise label features;
S5: inputting the obtained instance graphs containing noise label features and the graph prototypes into a preset graph matching neural network model for training to obtain an optimized graph matching neural network model;
S6: acquiring an image to be recognized, extracting its features, and recognizing the image to be recognized with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.
2. The method according to claim 1, wherein in step S2, feature extraction is performed on the input image containing the noise label to obtain a region discrimination feature map and an overall feature map, and the specific method is as follows:
performing feature extraction on the input image containing the noise label with a feature extractor to obtain an overall feature map; passing the overall feature map through a convolution layer to obtain a mean-filtered overall feature map; averaging the mean-filtered overall feature map at each position over its channels to obtain an overall mean feature map; searching for the maximum response value region in the overall mean feature map, locating the coordinate of the maximum response value region, and obtaining the region discrimination feature map according to that coordinate.
3. The deep-learning-based network-supervised fine-grained image recognition method according to claim 2, wherein the specific method for searching for the maximum response value region in the overall mean feature map and locating its coordinate comprises:
searching for the maximum response value region in the overall mean feature map and locating its coordinate according to the following formulas:

\bar{f}_g = \frac{1}{C} \sum_{c=1}^{C} f'_g(c)

(i, j) = \arg\max_{i,j} \bar{f}_g(i, j)

wherein \bar{f}_g denotes the overall mean feature map, f'_g denotes the mean-filtered overall feature map, C denotes the number of channels of the mean-filtered overall feature map, \arg\max_{i,j} searches the row and column corresponding to the maximum response value region, and (i, j) denotes the coordinate of the maximum response value region.
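Below is a minimal PyTorch sketch of the localization in claims 2 and 3. The 2048-channel feature map, the 3×3 convolution used for mean filtering, and the fixed 7×7 crop are illustrative assumptions; the claims only fix the channel-mean and arg-max steps.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(2048, 2048, kernel_size=3, padding=1)  # mean-filtering conv layer

def locate_region(f_g: torch.Tensor, crop: int = 7):
    """f_g: overall feature map of shape (C, H, W); returns region map and (i, j)."""
    f_g_filtered = conv(f_g.unsqueeze(0)).squeeze(0)   # f'_g, mean-filtered map
    f_bar = f_g_filtered.mean(dim=0)                   # channel mean -> (H, W)
    i, j = divmod(int(torch.argmax(f_bar)), f_bar.shape[1])  # max response (i, j)
    h, w = f_bar.shape
    top = max(0, min(i - crop // 2, h - crop))         # clamp the crop window
    left = max(0, min(j - crop // 2, w - crop))
    return f_g[:, top:top + crop, left:left + crop], (i, j)
```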
4. The deep-learning-based network-supervised fine-grained image recognition method according to claim 3, wherein in step S3, an instance graph containing noise label features is obtained according to the obtained region discrimination feature map and overall feature map, and the specific method is as follows:
converting the obtained region discrimination feature maps to a common dimension by bilinear interpolation to obtain region feature maps of the same dimension; reducing the overall feature map and the region feature maps of the same dimension by global average pooling to obtain the dimension-reduced overall feature map and region feature maps; and obtaining the instance graph containing noise label features from the dimension-reduced overall feature map and region feature maps:

G_{ins} = \langle V_{ins}, E_{ins} \rangle

wherein G_{ins} denotes the instance graph containing noise label features, V_{ins} denotes the set of all feature points in the dimension-reduced overall feature map and region feature maps, and E_{ins} denotes the adjacency matrix encoding the connections between feature points in the instance graph.
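A sketch of the instance-graph construction in claim 4 under simplifying assumptions: a single region feature map and a fully connected adjacency matrix, neither of which is fixed by the claim.

```python
import torch
import torch.nn.functional as F

def build_instance_graph(f_g: torch.Tensor, f_r: torch.Tensor):
    """f_g: overall feature map (C, H, W); f_r: region feature map (C, h, w)."""
    # bilinear interpolation brings the region map to a common spatial size
    f_r = F.interpolate(f_r.unsqueeze(0), size=f_g.shape[1:],
                        mode="bilinear", align_corners=False).squeeze(0)
    # global average pooling reduces each map to a single C-dim feature point
    v_g, v_r = f_g.mean(dim=(1, 2)), f_r.mean(dim=(1, 2))
    V_ins = torch.stack([v_g, v_r])   # node features V_ins, shape (2, C)
    E_ins = torch.ones(2, 2)          # assumed fully connected adjacency E_ins
    return V_ins, E_ins
```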
5. The deep-learning-based network-supervised fine-grained image recognition method according to claim 4, wherein in step S4, the specific method for constructing a graph prototype according to the obtained instance graph containing noise label features comprises:
constructing, for each category, a graph prototype with the same structure as the instance graph containing noise label features, the graph prototype being updated by a moving average:

G_k = \langle V_k, E_k \rangle

G'_k = m \, G_k + (1 - m) \, G_{ins}

wherein G_k denotes the constructed graph prototype of the k-th category, V_k denotes the set of all feature points in the graph prototype of the k-th category, E_k denotes its adjacency matrix, G'_k is the updated graph prototype, and m is a preset parameter.
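A sketch of the moving-average prototype update in claim 5. Applying the update elementwise to both the node-feature matrix and the adjacency matrix, and the value m = 0.99, are illustrative assumptions.

```python
import torch

def update_prototype(V_k, E_k, V_ins, E_ins, m: float = 0.99):
    """G'_k = m * G_k + (1 - m) * G_ins, applied to nodes and edges."""
    return m * V_k + (1 - m) * V_ins, m * E_k + (1 - m) * E_ins
```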
6. The deep-learning-based network-supervised fine-grained image recognition method according to claim 5, wherein in step S5, the obtained instance graphs containing noise label features and the graph prototypes are input into a preset graph matching neural network model for training to obtain an optimized graph matching neural network model, and the specific method is as follows:
the preset graph matching neural network model comprises an intra-graph propagation layer, a graph aggregation layer, an inter-graph propagation layer, and a graph matching layer, and obtaining the optimized graph matching neural network model comprises the following steps:
S5.1: inputting the obtained instance graph G_{ins} containing noise label features and the graph prototype G_k into the intra-graph propagation layer to obtain a first feature matrix and a second feature matrix, and iteratively updating the first and second feature matrices through graph convolution operations;
S5.2: inputting the iteratively updated first and second feature matrices into the graph aggregation layer for feature combination to obtain aggregated feature vectors;
S5.3: inputting the aggregated feature vectors into the inter-graph propagation layer for graph convolution operations, and iteratively updating the aggregated feature vectors to obtain a first feature expression f_{ins} and a second feature expression Z_k;
S5.4: inputting the first feature expression f_{ins} and the second feature expression Z_k into the graph matching layer to calculate the similarity S_k, and calculating the graph matching loss \mathcal{L}_{pro} according to the similarity S_k;
S5.5: correcting the noise labels in the instance graph containing noise label features and removing outlier samples;
S5.6: calculating the classification cross-entropy loss \mathcal{L}_{cls} and the total loss \mathcal{L}_{total}, and optimizing the graph matching neural network model according to the total loss \mathcal{L}_{total} to obtain the optimized graph matching neural network model.
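A skeleton of the four-layer model in claim 6. The use of dense single-linear graph convolutions, two propagation iterations, and all layer widths are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

class GraphMatchingNet(nn.Module):
    def __init__(self, dim: int = 2048, num_classes: int = 200):
        super().__init__()
        self.intra = nn.Linear(dim, dim)               # intra-graph propagation layer
        self.aggregate = nn.Linear(2 * dim, dim)       # graph aggregation layer
        self.inter = nn.Linear(dim, dim)               # inter-graph propagation layer
        self.classifier = nn.Linear(dim, num_classes)  # classifier used for p_i (claim 8)

    def propagate(self, V, E):
        # one graph-convolution step: mix neighbors via the adjacency, then transform
        return torch.relu(self.intra(E @ V))

    def forward(self, V_ins, E_ins, V_k, E_k):
        for _ in range(2):                             # S5.1: iterative intra-graph updates
            V_ins, V_k = self.propagate(V_ins, E_ins), self.propagate(V_k, E_k)
        f_ins = self.aggregate(V_ins.flatten())        # S5.2: feature combination
        z_k = self.aggregate(V_k.flatten())
        return self.inter(f_ins), self.inter(z_k)      # S5.3: first/second feature expressions
```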
7. The deep-learning-based network-supervised fine-grained image recognition method according to claim 6, wherein in step S5.4, the first feature expression f_{ins} and the second feature expression Z_k are input into the graph matching layer to calculate the similarity S_k, and the graph matching loss \mathcal{L}_{pro} is calculated according to the similarity S_k; the specific method comprises the following steps:
inputting the first feature expression f_{ins} and the second feature expression Z_k into the graph matching layer for graph matching and calculating the similarity S_k, specifically:

S_k = \frac{f_{ins} \cdot Z_k}{\lVert f_{ins} \rVert \, \lVert Z_k \rVert}

the graph matching layer sets a graph matching loss function and calculates the graph matching loss according to the similarity S_k; the graph matching loss function is specifically:

\hat{S}_k = \frac{\exp(S_k)}{\sum_{k'=1}^{K} \exp(S_{k'})}

\mathcal{L}_{pro} = -\sum_i \log \hat{S}_{y_i}

wherein \mathcal{L}_{pro} is the graph matching loss, y_i denotes the original label of the i-th sample, k denotes the category index of the graph prototypes, and K denotes the total number of graph prototype categories.
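A sketch of the matching loss under the reconstruction above: cosine similarities to the K prototypes, softmax-normalized and scored against the original web label. The cosine form itself is an assumption recovered from the surrounding definitions, not stated explicitly by the claim.

```python
import torch
import torch.nn.functional as F

def graph_matching_loss(f_ins, Z, y):
    """f_ins: (N, D) instance features; Z: (K, D) prototypes; y: (N,) original labels."""
    S = F.normalize(f_ins, dim=1) @ F.normalize(Z, dim=1).T  # (N, K) cosine similarities
    return F.cross_entropy(S, y)  # -log softmax(S)[y_i], averaged over the batch
```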
8. The deep-learning-based network-supervised fine-grained image recognition method according to claim 7, wherein in step S5.5, the noise labels in the instance graph containing noise label features are corrected and outlier samples are removed, and the specific method is as follows:
the intra-graph propagation layer is provided with a classifier; the instance graph containing noise label features is input into the classifier to obtain the classifier distribution probability p_i, the graph matching distribution probability d_i is calculated, and the total probability q_i is calculated from the classifier distribution probability p_i and the graph matching distribution probability d_i, specifically:

q_i = \alpha p_i + (1 - \alpha) d_i

d_{i,k} = \frac{\exp(S_k / \tau)}{\sum_{k'=1}^{K} \exp(S_{k'} / \tau)}

wherein \alpha is a preset parameter and \tau is a temperature coefficient;
the noise labels in the instance graph containing noise label features are corrected and outlier (OOD) samples are removed according to the total probability q_i and a preset threshold T, specifically:

\hat{y}_i = \begin{cases} \arg\max_k q_{i,k}, & \max_k q_{i,k} > T \\ y_i, & q_{i,y_i} > \text{the class-average probability} \\ \text{OOD}, & \text{otherwise} \end{cases}

wherein \hat{y}_i is the pseudo label and T is a preset threshold. When the maximum of the total probability q_i exceeds T, the category corresponding to that maximum is taken as the pseudo label; when the total probability of the original class exceeds the class-average probability, the original label y_i is taken as the pseudo label, thereby correcting the noise label in the instance graph containing noise label features; in all other cases the pseudo label is OOD, which denotes an outlier sample, and the outlier sample is removed.
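A sketch of the correction rule in claim 8. The values of alpha, tau, and T are illustrative, and flagging OOD samples with the label -1 is purely an implementation convenience.

```python
import torch
import torch.nn.functional as F

def correct_labels(p, S, y, alpha=0.5, tau=0.1, T=0.7):
    """p: (N, K) classifier probs; S: (N, K) similarities; y: (N,) web labels."""
    d = F.softmax(S / tau, dim=1)                 # graph matching distribution d_i
    q = alpha * p + (1 - alpha) * d               # total probability q_i
    q_max, q_argmax = q.max(dim=1)
    class_avg = q.mean(dim=0)                     # per-class average probability
    pseudo = torch.full_like(y, -1)               # default: OOD, to be removed
    keep = q.gather(1, y.unsqueeze(1)).squeeze(1) > class_avg[y]
    pseudo[keep] = y[keep]                        # original label kept as pseudo label
    pseudo[q_max > T] = q_argmax[q_max > T]       # confident: corrected to argmax of q_i
    return pseudo
```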
9. The deep-learning-based network-supervised fine-grained image recognition method according to claim 8, wherein in step S5.6, the classification cross-entropy loss \mathcal{L}_{cls} and the total loss \mathcal{L}_{total} are calculated, and the graph matching neural network model is optimized according to the total loss \mathcal{L}_{total} to obtain the optimized graph matching neural network model; the specific method comprises the following steps:
the intra-graph propagation layer is provided with a classification cross-entropy loss function, specifically:

\mathcal{L}_{cls} = -\sum_i \sum_j \hat{y}_{ij} \log p_{ij}

wherein \mathcal{L}_{cls} is the classification cross-entropy loss, p_{ij} is the classifier distribution probability of the i-th instance graph containing noise label features for the j-th category, and \hat{y}_{ij} is the pseudo label of the i-th instance graph containing noise label features for the j-th category;
a total loss function is constructed from the classification cross-entropy loss function and the graph matching loss function, specifically:

\mathcal{L}_{total} = \mathcal{L}_{cls} + \lambda_{pro} \mathcal{L}_{pro}

wherein \mathcal{L}_{total} is the total loss and \lambda_{pro} is a proportionality coefficient;
the graph matching neural network model is optimized according to the total loss \mathcal{L}_{total} to obtain the optimized graph matching neural network model.
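A sketch of the training objective in claim 9, combining the pseudo-label cross-entropy with the graph matching loss; lambda_pro = 1.0 is an assumed default.

```python
import torch
import torch.nn.functional as F

def total_loss(logits, pseudo, l_pro, lambda_pro: float = 1.0):
    """logits: (N, K) classifier outputs; pseudo: (N,), -1 marks removed OOD samples."""
    keep = pseudo >= 0                            # outlier samples are excluded
    l_cls = F.cross_entropy(logits[keep], pseudo[keep])
    return l_cls + lambda_pro * l_pro             # L_total = L_cls + lambda_pro * L_pro
```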
10. A deep-learning-based network-supervised fine-grained image recognition system, applying the deep-learning-based network-supervised fine-grained image recognition method of any one of claims 1 to 9, characterized by comprising:
an image acquisition unit: used to acquire an input image containing a noise label from the Internet;
a feature extraction unit: used to perform feature extraction on the input image containing the noise label to obtain a region discrimination feature map and an overall feature map;
an instance graph generation unit: used to obtain an instance graph containing noise label features according to the obtained region discrimination feature map and overall feature map;
a graph prototype construction unit: used to construct a graph prototype for each category according to the obtained instance graphs containing noise label features;
a graph matching unit: used to input the obtained instance graphs containing noise label features and the graph prototypes into a preset graph matching neural network model for training to obtain an optimized graph matching neural network model;
an image recognition unit: used to acquire the image to be recognized, extract its features, and recognize the image to be recognized with the optimized graph matching neural network model to obtain the recognition result of the image to be recognized.
CN202211167812.6A 2022-09-23 2022-09-23 A network-supervised fine-grained image recognition method and system based on deep learning Active CN115496948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211167812.6A CN115496948B (en) 2022-09-23 2022-09-23 A network-supervised fine-grained image recognition method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN115496948A true CN115496948A (en) 2022-12-20
CN115496948B CN115496948B (en) 2025-06-27

Family

ID=84470196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211167812.6A Active CN115496948B (en) 2022-09-23 2022-09-23 A network-supervised fine-grained image recognition method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN115496948B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800811A (en) * 2019-01-24 2019-05-24 吉林大学 A kind of small sample image-recognizing method based on deep learning
CN113392875A (en) * 2021-05-20 2021-09-14 广东工业大学 Method, system and equipment for classifying fine granularity of image
CN113592023A (en) * 2021-08-11 2021-11-02 杭州电子科技大学 High-efficiency fine-grained image classification model based on depth model framework

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116012569A (en) * 2023-03-24 2023-04-25 广东工业大学 Multi-label image recognition method based on deep learning and under noisy data
CN116012569B (en) * 2023-03-24 2023-08-15 广东工业大学 Multi-label image recognition method based on deep learning and under noisy data
CN119579992A (en) * 2024-11-26 2025-03-07 济南大学 Semi-supervised image classification method and system based on pseudo-label and embedded cluster matching
CN119579992B (en) * 2024-11-26 2025-05-30 济南大学 Semi-supervised image classification method and system based on pseudo tag and embedded cluster matching

Also Published As

Publication number Publication date
CN115496948B (en) 2025-06-27

Similar Documents

Publication Publication Date Title
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN110942091B (en) Semi-supervised few-sample image classification method for searching reliable abnormal data center
CN107437100A (en) A kind of picture position Forecasting Methodology based on the association study of cross-module state
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
CN110297931B (en) Image retrieval method
CN110728694B (en) Long-time visual target tracking method based on continuous learning
CN110046671A (en) A kind of file classification method based on capsule network
CN107515895A (en) A visual target retrieval method and system based on target detection
CN113326731A (en) Cross-domain pedestrian re-identification algorithm based on momentum network guidance
CN110097060B (en) Open set identification method for trunk image
CN106203523A (en) The classification hyperspectral imagery of the semi-supervised algorithm fusion of decision tree is promoted based on gradient
CN109753897B (en) Behavior recognition method based on memory cell reinforcement-time sequence dynamic learning
CN110390275A (en) A hand gesture classification method based on transfer learning
CN113159066B (en) Fine-grained image recognition algorithm of distributed labels based on inter-class similarity
CN111783688B (en) A classification method of remote sensing image scene based on convolutional neural network
CN112926451B (en) Cross-modal pedestrian re-identification method based on self-simulation mutual distillation
CN115496948A (en) A network-supervised fine-grained image recognition method and system based on deep learning
CN109543546B (en) Gait age estimation method based on depth sequence distribution regression
CN111259917B (en) An Image Feature Extraction Method Based on Local Neighbor Component Analysis
CN117058437A (en) Flower classification method, system, equipment and medium based on knowledge distillation
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN111832580B (en) SAR Target Recognition Method Combining Few-Shot Learning and Target Attribute Features
CN111695531B (en) Cross-domain pedestrian re-identification method based on heterogeneous convolution network
CN116681975A (en) An open set image recognition method and system based on active learning
CN112949771A (en) Hyperspectral remote sensing image classification method based on multi-depth multi-scale hierarchical attention fusion mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant