CN118429733A - Multi-head attention-driven kitchen garbage multi-label classification method - Google Patents
Info
- Publication number
- CN118429733A CN118429733A CN202410900342.2A CN202410900342A CN118429733A CN 118429733 A CN118429733 A CN 118429733A CN 202410900342 A CN202410900342 A CN 202410900342A CN 118429733 A CN118429733 A CN 118429733A
- Authority
- CN
- China
- Prior art keywords
- kitchen waste
- module
- label classification
- layer
- head attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 59
- 239000010813 municipal solid waste Substances 0.000 title claims abstract description 14
- 239000010806 kitchen waste Substances 0.000 claims abstract description 43
- 238000000605 extraction Methods 0.000 claims abstract description 14
- 238000012549 training Methods 0.000 claims abstract description 14
- 238000005096 rolling process Methods 0.000 claims abstract description 11
- 230000009467 reduction Effects 0.000 claims abstract description 6
- 238000013145 classification model Methods 0.000 claims abstract description 4
- 239000000284 extract Substances 0.000 claims abstract description 3
- 238000010586 diagram Methods 0.000 claims description 21
- 239000011159 matrix material Substances 0.000 claims description 18
- 230000008569 process Effects 0.000 claims description 18
- 230000003068 static effect Effects 0.000 claims description 18
- 230000006870 function Effects 0.000 claims description 15
- 230000007246 mechanism Effects 0.000 claims description 8
- 238000010606 normalization Methods 0.000 claims description 7
- 238000004364 calculation method Methods 0.000 claims description 5
- 238000009826 distribution Methods 0.000 claims description 3
- 230000003213 activating effect Effects 0.000 claims description 2
- 238000006243 chemical reaction Methods 0.000 claims description 2
- 238000013507 mapping Methods 0.000 claims description 2
- 238000011176 pooling Methods 0.000 claims description 2
- 238000005728 strengthening Methods 0.000 claims description 2
- 230000000694 effects Effects 0.000 abstract description 8
- 239000010410 layer Substances 0.000 description 27
- 230000004927 fusion Effects 0.000 description 4
- 238000007635 classification algorithm Methods 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 230000008447 perception Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 238000012935 Averaging Methods 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 239000002356 single layer Substances 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a multi-head attention-driven kitchen garbage multi-label classification method, which comprises the following steps: constructing a kitchen waste multi-label classification data set, wherein the data set comprises a plurality of images of different categories of kitchen waste and the label of each image comprises one or more categories; constructing a multi-head attention-driven graph convolution lightweight network model, wherein the model comprises a feature extraction module, a multi-head attention module and a dynamic graph convolution module; the feature extraction module extracts features from an input image, the features are sent to the multi-head attention module to strengthen the category perception regions of the feature map, and then sent to the dynamic graph convolution module to adaptively capture the category perception regions; training the graph convolution lightweight network model with the constructed kitchen waste multi-label classification data set; and finally performing multi-label classification on the kitchen waste image to be predicted by using the classification model obtained through training. The invention can reduce the performance loss caused by the reduction of model parameters, enhance the recognition capability and improve the multi-label classification effect.
Description
Technical Field
The invention relates to the field of image processing, in particular to a multi-head attention-driven kitchen garbage multi-label classification method.
Background
In recent years, deep learning has been proposed as a means to improve garbage classification efficiency. However, most currently published garbage datasets are designed for household garbage recognition, and research on garbage classification in real scenes is lacking. In addition, garbage images often contain multiple categories, which makes this a typical multi-label image classification problem in computer vision. Multi-label image classification is one of the tasks in the computer vision field whose main goal is to accurately predict all the categories contained in an image; since objects in the real world usually appear together, multi-label image classification also accords with human cognition.
From the perspective of multi-label classification of kitchen waste, real scenes place high real-time requirements on the model, while the image backgrounds are complex, the object categories are diverse, and the object labels are correlated. Existing methods therefore face a trade-off: achieving high classification accuracy usually requires a complex network model, whereas a basic deep convolutional network cannot guarantee sufficient classification accuracy. Therefore, researching an intelligent and efficient kitchen garbage multi-label classification algorithm for real scenes has important practical value and significance.
Disclosure of Invention
The invention provides a multi-head attention-driven kitchen garbage multi-label classification method, which reduces performance loss caused by reduction of model parameters, further enhances identification capability and improves multi-label classification effect.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
A multi-head attention-driven kitchen garbage multi-label classification method comprises the following steps:
Constructing a kitchen waste multi-label classification data set, wherein the multi-label classification data set comprises a plurality of images of kitchen waste in different categories, and the labels of each image comprise one or more categories;
constructing a multi-head attention-driven graph convolution lightweight network model, wherein the model comprises a lightweight feature extraction module, a multi-head attention module and a dynamic graph convolution module; the feature extraction module extracts feature maps from the model input, the extracted feature maps are sent to the multi-head attention module for processing to strengthen the category perception regions of the feature maps, and the feature maps with strengthened category perception regions are sent to the dynamic graph convolution module for processing to adaptively capture the category perception regions and output the predicted categories;
training the graph convolution lightweight network model by using the constructed kitchen waste multi-label classification data set;
Finally, a model obtained through training, namely a kitchen waste multi-label classification model, is used for multi-label classification of the kitchen waste image to be predicted.
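For illustration only, and not as a limitation of the claimed method, the following minimal PyTorch sketch shows how the three modules could be chained into one network. The class names MultiHeadAttentionModule and DynamicGCNHead, the feature dimensions, and the use of a truncated torchvision ShuffleNetV2 backbone are assumptions introduced for this sketch; the corresponding module sketches appear later in this document.

```python
import torch
import torch.nn as nn
import torchvision

class MHAGraphConvLightweightNet(nn.Module):
    """Sketch of the overall pipeline: backbone features -> multi-head attention -> dynamic GCN."""
    def __init__(self, num_classes: int = 8, feat_dim: int = 1024):
        super().__init__()
        net = torchvision.models.shufflenet_v2_x1_0(weights="IMAGENET1K_V1")
        # keep only the convolutional stages so the backbone outputs a spatial feature map
        self.features = nn.Sequential(net.conv1, net.maxpool, net.stage2,
                                      net.stage3, net.stage4, net.conv5)
        self.mha = MultiHeadAttentionModule(feat_dim, num_classes)  # strengthens category regions
        self.gcn = DynamicGCNHead(num_classes, dim=512)             # adaptive label dependencies

    def forward(self, x):                  # x: (B, 3, 448, 448)
        f = self.features(x)               # (B, 1024, 14, 14) feature map
        v_c = self.mha(f)                  # (B, C, 512) category-of-interest features
        return self.gcn(v_c)               # (B, C) one logit per label
```

At inference time, applying a sigmoid to the C logits would give one independent probability per kitchen-waste category.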
Further, based on the different category distributions of kitchen waste in different seasons, the kitchen waste multi-label classification data set is obtained by selecting a plurality of kitchen waste images with different category combinations from different time periods of one year.
Further, the lightweight feature extraction module uses the lightweight backbone network ShuffleNetV2.
Further, the multi-head attention module comprises a first fully connected layer, a scaled dot-product attention sub-module, a second fully connected layer, a Dropout layer and a normalization layer;
The first fully connected layer performs dimension reduction on the input feature map $F\in\mathbb{R}^{W\times H\times D}$, converting it into a feature map $F'$; wherein $W$, $H$, $D$ respectively represent the length, width, and number of channels of the image;
The scaled dot-product attention sub-module adopts a multi-head attention mechanism, each head takes the mapped feature map $F'$ as the key and the value, and the query adopts a set of learnable parameters; the calculation formulas are as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

$$\mathrm{head}_i=\mathrm{Attention}\!\left(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}\right)$$

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}\!\left(\mathrm{head}_1,\ldots,\mathrm{head}_h\right)W^{O}$$

wherein $\mathrm{MultiHead}(Q,K,V)$ represents the output of the scaled dot-product attention sub-module employing the multi-head attention mechanism, $\mathrm{Concat}$ represents feature concatenation, $W^{O}$ is the additional weight matrix, $\mathrm{head}_i$ is the output of the $i$-th head of the scaled dot-product attention sub-module, $Q$, $K$, $V$ are respectively obtained by multiplying the query, key, and value input to the scaled dot-product attention sub-module with the corresponding weight matrices, $\sqrt{d_k}$ is the scaling factor, and $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ are respectively the learnable weight matrices of the $i$-th head, i.e., the weights of $W^{Q}$, $W^{K}$, $W^{V}$ in the $i$-th dimension;
The second fully connected layer, the Dropout layer, and the normalization layer further process the output of the scaled dot-product attention sub-module, expressed as:

$$V_c=\mathrm{LN}\big(\mathrm{Dropout}\!\left(\mathrm{FC}\!\left(\mathrm{MultiHead}(Q,K,V)\right)\right)\oplus Q\big)$$

wherein $\oplus$ represents point-by-point addition, $\mathrm{FC}(\cdot)$, $\mathrm{Dropout}(\cdot)$ and $\mathrm{LN}(\cdot)$ respectively represent the processing of the second fully connected layer, the Dropout layer and the normalization layer; and the feature map $V_c$ represents the output of the multi-head attention module.
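Purely as an illustrative sketch (the class name, head count, hidden dimension and dropout rate are assumptions, not taken from the claims), the multi-head attention module described above could be written in PyTorch as follows, with the backbone feature map providing the key and value and a learnable parameter tensor providing the query:

```python
import torch
import torch.nn as nn

class MultiHeadAttentionModule(nn.Module):
    """Sketch: FC dimension reduction -> scaled dot-product multi-head attention with a
    learnable query -> second FC + Dropout + LayerNorm with a residual connection."""
    def __init__(self, in_dim: int, num_classes: int, dim: int = 512, heads: int = 8, p: float = 0.1):
        super().__init__()
        self.fc_in = nn.Linear(in_dim, dim)                        # first fully connected layer (reduction)
        self.query = nn.Parameter(torch.randn(num_classes, dim))   # learnable query, one per category
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fc_out = nn.Linear(dim, dim)                          # second fully connected layer
        self.drop = nn.Dropout(p)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feat):                                 # feat: (B, D, H, W) backbone feature map
        B, D, H, W = feat.shape
        kv = self.fc_in(feat.flatten(2).transpose(1, 2))     # (B, H*W, dim) keys and values
        q = self.query.unsqueeze(0).expand(B, -1, -1)        # (B, C, dim) learnable queries
        out, _ = self.attn(q, kv, kv)                        # scaled dot-product, multi-head
        v_c = self.norm(self.drop(self.fc_out(out)) + q)     # residual + Dropout + LayerNorm
        return v_c                                           # (B, C, dim) category features
```

The residual addition of the learnable query before layer normalization mirrors the point-by-point addition ⊕ in the formula above.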
Further, the dynamic graph convolution module comprises a static graph convolution layer and a dynamic graph convolution layer;
The dynamic graph convolution layer processes the output $H$ of the static graph convolution layer, expressed as:

$$Z=f_{\mathrm{LeakyReLU}}\!\left(A_d\,H\,W_d\right),\qquad A_d=f_{\mathrm{Sigmoid}}\!\left(W_A\,\tilde{H}\right)$$

wherein $A_d$ is the correlation matrix of the dynamic graph convolution layer, $W_d$ is the state-update weight of the dynamic graph convolution layer, $W_A$ is the weight used to construct $A_d$, $f_{\mathrm{LeakyReLU}}$ is the LeakyReLU activation function, and $f_{\mathrm{Sigmoid}}$ is the Sigmoid activation function; $\tilde{H}\in\mathbb{R}^{C\times 2D_1}$ is obtained by concatenating the feature map $H$ with its global representation $h_g$, wherein the global representation $h_g$ is obtained by applying pooling, a 1×1 one-dimensional convolution and an activation function to the output $H$ of the static graph convolution layer; $C$ represents the total number of categories of the kitchen waste multi-label classification, and $D_1$ represents the dimension of the output feature map $H$ of the static graph convolution layer.
Further, the static graph convolution layer processes the input feature map, expressed as:

$$H=f_{\mathrm{LeakyReLU}}\!\left(A_s\,V_c\,W_s\right)$$

wherein $V_c$ is the feature map output by the multi-head attention module, $H$ is the output feature map of the static graph convolution layer and consists of the features corresponding to the $C$ categories, i.e., $H=[h_1,h_2,\ldots,h_C]$; $f_{\mathrm{LeakyReLU}}$ is the activation function, $A_s$ is the correlation matrix of the static graph convolution layer, and $W_s$ is the state-update weight of the static graph convolution layer.
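Likewise as an illustrative sketch only (the class name, the use of 1×1 convolutions for the state-update weights, and the final per-category linear classifier are assumptions), the static and dynamic graph convolution layers described above could look as follows in PyTorch:

```python
import torch
import torch.nn as nn

class DynamicGCNHead(nn.Module):
    """Sketch of the graph-convolution head: a static GCN with a learned correlation matrix A_s,
    followed by a dynamic GCN whose correlation matrix A_d is built from the features themselves."""
    def __init__(self, num_classes: int, dim: int):
        super().__init__()
        self.A_s = nn.Parameter(torch.eye(num_classes))     # static correlation matrix (C x C)
        self.W_s = nn.Conv1d(dim, dim, 1)                    # static state-update weight W_s
        self.gap = nn.AdaptiveAvgPool1d(1)                   # global pooling for the global representation
        self.conv_g = nn.Conv1d(dim, dim, 1)                 # 1x1 one-dimensional convolution for h_g
        self.W_A = nn.Conv1d(2 * dim, num_classes, 1)        # builds the dynamic correlation matrix A_d
        self.W_d = nn.Conv1d(dim, dim, 1)                    # dynamic state-update weight W_d
        self.act = nn.LeakyReLU(0.2)
        self.cls = nn.Linear(dim, 1)                         # one score per category feature

    def forward(self, v_c):                                  # v_c: (B, C, dim) from the attention module
        x = v_c.transpose(1, 2)                              # (B, dim, C)
        h = self.act(self.W_s(torch.matmul(x, self.A_s)))    # static GCN: H = LeakyReLU(A_s V_c W_s)
        h_g = torch.sigmoid(self.conv_g(self.gap(h)))        # global representation h_g of H
        h_tilde = torch.cat([h, h_g.expand_as(h)], dim=1)    # concatenation [H; h_g], (B, 2*dim, C)
        a_d = torch.sigmoid(self.W_A(h_tilde))               # dynamic correlation matrix A_d, (B, C, C)
        z = self.act(self.W_d(torch.matmul(h, a_d)))         # dynamic GCN: Z = LeakyReLU(A_d H W_d)
        return self.cls(z.transpose(1, 2)).squeeze(-1)       # (B, C) label logits
```

Here the dynamic correlation matrix A_d is recomputed from each image's own features, which is what makes the graph "dynamic" rather than shared across the whole dataset.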
According to the multi-head attention-driven kitchen waste multi-label classification method, a lightweight network is used to optimize the graph convolution model; a multi-head attention mechanism is introduced to reduce the loss of feature information, capture feature information at different levels, enhance the feature extraction capability of the backbone network in complex scenes, and reduce the performance loss caused by the reduction of the model parameter quantity; and a dynamic graph convolution module is further used to adaptively capture the semantic perception regions, further enhancing the recognition capability and improving the multi-label classification effect. Compared with existing kitchen waste classification techniques, the method has the following advantages:
(1) The method overcomes the drawbacks of conventional GCN-based methods, which either use an ordinary deep convolutional network as the feature extraction backbone and suffer from low accuracy, or use a Transformer as the feature extraction backbone and suffer from a large parameter count; the method adopts ShuffleNetV2 as the feature extraction backbone and optimizes the network model to achieve a lightweight design.
(2) The invention designs a multi-head attention module and a dynamic graph convolution module to optimize a lightweight graph convolution classification network, reduces performance loss caused by the reduction of model parameter quantity, further enhances identification capability and improves multi-label classification effect.
(3) The method has strong practicability and generalization capability, not only can be applied to the current multi-label classification of kitchen waste to obtain a good effect, but also can obtain excellent multi-label classification precision on MS-COCO and VOC 2007 data sets.
Drawings
Fig. 1 is a flow chart of a multi-head attention-driven kitchen garbage multi-label classification method according to an embodiment of the invention.
FIG. 2 is a schematic diagram of a multi-head attention module according to an embodiment of the present invention.
Fig. 3 is a relationship between parameters and accuracy of different backbone networks according to an embodiment of the present invention.
FIG. 4 shows the effect of fusion of different modules on classification mAP according to an embodiment of the invention.
Fig. 5 illustrates the effect of different module fusion on classifying various classes of APs in an embodiment of the present invention.
Detailed Description
The following describes in detail the embodiments of the present invention, which are developed based on the technical solution of the present invention, and provide detailed embodiments and specific operation procedures, and further explain the technical solution of the present invention.
The technical scheme of the invention is to realize multi-label classification of kitchen waste images based on a multi-head attention-driven graph convolution lightweight network; experiments can be carried out with the Python programming language, and engineering applications can also be implemented in C/C++.
The invention provides a multi-head attention-driven graph convolution lightweight network multi-label classification method, the processing flow of which is shown in Fig. 1 and comprises the following steps:
Step 1, constructing a kitchen waste multi-label classification dataset MLKW collected in a real scene, including the collection and labeling of the dataset.
The data in this embodiment were collected on the conveyor line of a kitchen waste sorting center; the 15,994 collected kitchen waste images were used to build the MLKW dataset. MLKW is close to the real scene and therefore has strong engineering application significance. To reflect the characteristics of kitchen waste in different periods, and considering the different category distributions of kitchen waste across seasons, images were selected from different time periods of the year: 3,771 images in spring, 4,735 in summer, 2,130 in autumn, and 4,358 in winter. After cleaning and screening the image data, a multi-label kitchen waste dataset containing eight categories with a total of 3,107 fully labeled images was finally obtained; the remaining images were then labeled, and the dataset was divided into a training set and a test set in a ratio of 8:2. Verification was also performed on the PASCAL VOC 2007 and MS-COCO datasets, which are widely used in the multi-label classification field: the VOC 2007 dataset contains 9,963 images covering 20 common categories, and the MS-COCO dataset contains a training set (82,081 images) and a validation set (40,504 images), for a total of 122,581 images covering 80 common categories, with approximately 2.9 category labels per image.
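As a minimal sketch of how such a multi-label dataset and its 8:2 split could be organized in PyTorch (the annotation-file format, file name, and the way labels are stored are assumptions made for illustration; the real MLKW data layout is not described in the patent):

```python
import torch
from torch.utils.data import Dataset, random_split
from PIL import Image

class MLKWDataset(Dataset):
    """Sketch: each annotation line is assumed to be 'image_path idx1,idx2,...' with
    category indices for the eight kitchen-waste classes; labels become multi-hot vectors."""
    def __init__(self, ann_file: str, num_classes: int = 8, transform=None):
        self.samples, self.num_classes, self.transform = [], num_classes, transform
        with open(ann_file) as f:
            for line in f:
                path, idxs = line.split()
                self.samples.append((path, [int(i) for i in idxs.split(",")]))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        path, idxs = self.samples[i]
        img = Image.open(path).convert("RGB")
        if self.transform:
            img = self.transform(img)
        target = torch.zeros(self.num_classes)
        target[idxs] = 1.0                       # multi-hot label: one image, several categories
        return img, target

# 8:2 train/test split, as described in the embodiment
full = MLKWDataset("mlkw_annotations.txt")       # hypothetical annotation file
n_train = int(0.8 * len(full))
train_set, test_set = random_split(full, [n_train, len(full) - n_train])
```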
Step 2, constructing a multi-head attention-driven graph convolution lightweight network model, wherein the model comprises a lightweight feature extraction module, a multi-head attention module and a dynamic graph convolution module.
1. Feature extraction module.
In this embodiment, the lightweight backbone network ShuffleNetV2 is used as the feature extraction network of the GCN module, realizing the lightweight design, i.e., GCLN. To adapt to the lightweight model of the present invention, the images were resized from their original resolution of 3,256×2,724 to 448×448. Before training the model, ShuffleNetV2_x1_0 pre-trained on the ImageNet1K dataset was used as the backbone network.
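A brief sketch of this step, assuming the torchvision implementation of ShuffleNetV2_x1_0 and ImageNet normalization statistics (the truncation point and preprocessing details are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms

# Pre-trained ShuffleNetV2_x1_0 backbone, truncated before global pooling / classifier
# so that it returns a spatial feature map for the attention module.
net = torchvision.models.shufflenet_v2_x1_0(weights="IMAGENET1K_V1")
backbone = nn.Sequential(net.conv1, net.maxpool, net.stage2, net.stage3, net.stage4, net.conv5)

preprocess = transforms.Compose([
    transforms.Resize((448, 448)),               # original 3,256x2,724 images resized to 448x448
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

x = torch.randn(1, 3, 448, 448)                  # dummy input with the training resolution
feat = backbone(x)                               # (1, 1024, 14, 14) feature map
```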
2. Multi-head attention module.
The feature maps extracted by the feature extraction module are sent to the multi-head attention module to strengthen the category perception regions of the feature maps.
The purpose of the multi-head attention module (MHA) is to capture the category regions of interest. The features extracted by the backbone network, $F\in\mathbb{R}^{W\times H\times D}$ (where $W$, $H$, $D$ respectively represent the length, width and number of channels of the image), are converted and taken as the key and value of a multi-head attention mechanism, while the query adopts a set of learnable parameters, completing a global examination of the image features, see Fig. 2. In MHA, scaled dot-product attention is the computational core, whose formula is:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \tag{1}$$

where $Q$, $K$, $V$ are matrices obtained by multiplying the inputs $q'$, $k'$, $v'$ with the corresponding weights of the fully connected layers, $\sqrt{d_k}$ is the scaling factor, and the dimensions of the attention output are the same as those of the initial inputs $q$, $k$, $v$ (see Fig. 2).
Unlike the multi-head attention mechanism in Vision Transformer, the present invention redesigns it and introduces residual connections to reduce the loss of information. The calculation process is as follows: the $q$, $k$ and $v$ matrices are mapped to different subspaces to generate a plurality of heads, and scaled dot-product attention on each head captures information from different subspaces and different dimensions:

$$\mathrm{head}_i=\mathrm{Attention}\!\left(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}\right) \tag{2}$$

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}\!\left(\mathrm{head}_1,\ldots,\mathrm{head}_h\right)W^{O} \tag{3}$$

where $W^{O}$ is an additional weight matrix and $\mathrm{Concat}$ denotes feature concatenation.
After the multi-head attention, a Dropout layer and normalization are added to alleviate overfitting during training; in addition, a residual connection reduces the loss of information. The calculation formula is:

$$V_c=\mathrm{LN}\big(\mathrm{Dropout}\!\left(\mathrm{FC}\!\left(\mathrm{MultiHead}(Q,K,V)\right)\right)\oplus Q\big) \tag{4}$$

where $V_c$ denotes the category-of-interest features, $c$ denotes the number of categories, $\oplus$ denotes point-by-point addition, and $Q$ is obtained by multiplying $q'$ with the corresponding weight matrix.
3. Dynamic graph convolution module.
The feature maps output by the multi-head attention module are fed into the dynamic graph convolution module for final classification.
In this embodiment, after the category-of-interest features $V_c$ are obtained through the multi-head attention mechanism, a dynamic graph convolution module (Dynamic GCN) is designed to model the label dependencies specific to each image.
$V_c$ is passed through the static GCN and the dynamic GCN in sequence. The single-layer static GCN is simply defined as $H=f_{\mathrm{LeakyReLU}}(A_s V_c W_s)$, where $f_{\mathrm{LeakyReLU}}$ is the activation function, $A_s$ is the correlation matrix, and $W_s$ is the state-update weight. The dynamic GCN is then employed: its correlation matrix $A_d$ is influenced by the image features, which effectively alleviates overfitting. The output $Z$ of the dynamic GCN is computed as:

$$Z=f_{\mathrm{LeakyReLU}}\!\left(A_d\,H\,W_d\right),\qquad A_d=f_{\mathrm{Sigmoid}}\!\left(W_A\,\tilde{H}\right) \tag{5}$$

where $f_{\mathrm{LeakyReLU}}$ is the LeakyReLU activation function, $f_{\mathrm{Sigmoid}}$ is the Sigmoid activation function, $W_d$ is the state-update weight, and $W_A$ is the weight used to construct the dynamic correlation matrix $A_d$; $\tilde{H}$ is obtained by concatenating $H$ with its global representation $h_g$, which comes from a global average pooling and a conv-layer sequence. Formally, $h_g$ is defined as:

$$h_g=\delta\!\left(\mathrm{Conv1d}\!\left(\mathrm{GAP}(H)\right)\right) \tag{6}$$

where $\mathrm{GAP}(\cdot)$ denotes global average pooling, $\mathrm{Conv1d}(\cdot)$ denotes the 1×1 one-dimensional convolution, and $\delta(\cdot)$ denotes the activation function.
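Again purely as an illustrative continuation of the earlier sketches, the category features V_c produced above could then be fed through the hypothetical DynamicGCNHead to obtain one score per label:

```python
# Illustrative continuation: static + dynamic graph convolution over the category features.
gcn = DynamicGCNHead(num_classes=8, dim=512)
logits = gcn(v_c)                                # (2, 8) one logit per kitchen-waste category
probs = torch.sigmoid(logits)                    # independent per-label probabilities
```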
and step 3, training the graph rolling lightweight network model by using the constructed kitchen waste multi-label classification data set.
During training, the dynamic graph convolution module (DGCN) uses the nonlinear activation function LeakyReLU with the slope set to 0.2. Some of the hyperparameters in the training process are as follows: the optimizer is SGD with a momentum of 0.9 and a weight decay coefficient of 0.0001; the batch size is 16; the initial learning rate of the multi-head attention module (MHA) and the dynamic graph convolution module (DGCN) is 0.5, and the initial learning rate of the backbone network is 0.05; a total of 50 epochs are trained, and the learning rates are decayed by a factor of 0.1 at the 30th and 40th epochs. The software environment is Ubuntu 16.04 LTS with PyTorch 1.6 and Python 3.6. The hardware platform comprises Tesla V100 GPUs with CUDA 11.0 and CuDNN 8.0.2; the CPU memory is not less than 64 GB and the solid-state disk is not less than 512 GB.
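The hyperparameters above translate roughly into the following PyTorch training setup; the loss function (BCEWithLogitsLoss, a common choice for multi-label classification but not stated in the patent), the parameter grouping, and all variable names are assumptions for this sketch, and the MHAGraphConvLightweightNet model and train_set objects from the earlier hypothetical sketches are reused.

```python
import torch
import torch.nn as nn

model = MHAGraphConvLightweightNet(num_classes=8)           # hypothetical model from the earlier sketch
criterion = nn.BCEWithLogitsLoss()                          # common multi-label loss (assumption)

# SGD with momentum 0.9 and weight decay 1e-4; backbone lr 0.05, MHA/DGCN lr 0.5
optimizer = torch.optim.SGD(
    [
        {"params": model.features.parameters()},            # backbone uses the default lr of 0.05
        {"params": model.mha.parameters(), "lr": 0.5},
        {"params": model.gcn.parameters(), "lr": 0.5},
    ],
    lr=0.05, momentum=0.9, weight_decay=1e-4,
)
# learning rate decayed by 0.1 at epochs 30 and 40, 50 epochs in total, batch size 16
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 40], gamma=0.1)
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

for epoch in range(50):
    for images, targets in loader:   # assumes train_set yields (tensor, multi-hot target) pairs
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```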
Step 4, performing multi-label classification on the kitchen waste image to be predicted by using the model obtained through training, namely the kitchen waste multi-label classification model.
In order to facilitate understanding of the technical effects of the present invention, a comparison between the present invention and conventional methods is provided as follows:
Table 1 and Fig. 3 compare different lightweight backbone networks, including ShuffleNetV2, MobileNet and EfficientNet, across different multi-label image classification algorithms, including ML-GCN, ADD-GCN and MHA-GCN. The comparison focuses mainly on the mean average precision (mAP) and parameter quantity (Params) of the different backbone networks under the different algorithms. As can be seen from Table 1, the present invention outperforms the other methods on all metrics under the same backbone. As can be seen from Fig. 3, a small increase in the number of model parameters yields a large performance gain: on the MLKW dataset, the performance of the present invention is improved by at least 8.6% over ML-GCN and at least 4.8% over ADD-GCN, indicating the effectiveness of the method.
Table 2 compares the experimental results of the present invention with other methods on the VOC 2007 dataset. The AP values over multiple categories show an advantage over the other methods, and the mAP is increased to 94.0% compared with ResNet-101. Notably, although the invention is designed for kitchen waste datasets, it performs on par with ML-GCN on VOC 2007, which also demonstrates the generalization and extensibility of the method.
Table 3 shows the performance of MHA-GCN on the MS-COCO dataset and the comparison with ResNet, SRTN, ML-GCN, etc. MHA-GCN is superior to the other methods on multiple metrics such as mAP, OF1 and CF1; in addition, compared with ResNet, the overall performance of MHA-GCN is improved by 6.1%, which further demonstrates the advantages of the method.
To further verify the effectiveness of the multi-head attention module (MHA) and the dynamic graph convolution module (DGCN) of the method, the influence of fusing different modules on the classification mAP and on the per-category APs was evaluated on the MLKW data, see Fig. 4 and Fig. 5, respectively.
The above embodiments are preferred embodiments of the present application, and various changes or modifications may be made thereto by those skilled in the art, which should be construed as falling within the scope of the present application as claimed herein, without departing from the general inventive concept.
Claims (6)
1. A multi-head attention-driven kitchen garbage multi-label classification method is characterized by comprising the following steps of:
Constructing a kitchen waste multi-label classification data set, wherein the multi-label classification data set comprises a plurality of images of kitchen waste in different categories, and the labels of each image comprise one or more categories;
constructing a multi-head attention-driven graph convolution lightweight network model, wherein the model comprises a lightweight feature extraction module, a multi-head attention module and a dynamic graph convolution module; the feature extraction module extracts feature maps from the model input, the extracted feature maps are sent to the multi-head attention module for processing to strengthen the category perception regions of the feature maps, and the feature maps with strengthened category perception regions are sent to the dynamic graph convolution module for processing to adaptively capture the category perception regions and output the predicted categories;
training the graph convolution lightweight network model by using the constructed kitchen waste multi-label classification data set;
Finally, a model obtained through training, namely a kitchen waste multi-label classification model, is used for multi-label classification of the kitchen waste image to be predicted.
2. The multi-head attention-driven kitchen waste multi-label classification method according to claim 1, wherein the kitchen waste multi-label classification data set is obtained by selecting a plurality of kitchen waste images with different category combinations from different time periods of one year, based on the different category distributions of kitchen waste in different seasons.
3. The multi-head attention-driven kitchen waste multi-label classification method according to claim 1, wherein the lightweight feature extraction module uses the lightweight backbone network ShuffleNetV2.
4. The multi-head attention-driven kitchen waste multi-label classification method according to claim 1, wherein the multi-head attention module comprises a first fully connected layer, a scaled dot-product attention sub-module, a second fully connected layer, a Dropout layer and a normalization layer;
The first fully connected layer performs dimension reduction on the input feature map $F\in\mathbb{R}^{W\times H\times D}$, converting it into a feature map $F'$; wherein $W$, $H$, $D$ respectively represent the length, width, and number of channels of the image;
The scaled dot-product attention sub-module adopts a multi-head attention mechanism, each head takes the mapped feature map $F'$ as the key and the value, and the query adopts a set of learnable parameters; the calculation formulas are as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V;$$

$$\mathrm{head}_i=\mathrm{Attention}\!\left(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}\right);$$

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}\!\left(\mathrm{head}_1,\ldots,\mathrm{head}_h\right)W^{O};$$

wherein $\mathrm{MultiHead}(Q,K,V)$ represents the output of the scaled dot-product attention sub-module employing the multi-head attention mechanism, $\mathrm{Concat}$ represents feature concatenation, $W^{O}$ is the additional weight matrix, $\mathrm{head}_i$ is the output of the $i$-th head of the scaled dot-product attention sub-module, $Q$, $K$, $V$ are respectively obtained by multiplying the query, key, and value input to the scaled dot-product attention sub-module with the corresponding weight matrices, $\sqrt{d_k}$ is the scaling factor, and $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ are respectively the learnable weight matrices of the $i$-th head, i.e., the weights of $W^{Q}$, $W^{K}$, $W^{V}$ in the $i$-th dimension;
The second fully connected layer, the Dropout layer, and the normalization layer further process the output of the scaled dot-product attention sub-module, expressed as:

$$V_c=\mathrm{LN}\big(\mathrm{Dropout}\!\left(\mathrm{FC}\!\left(\mathrm{MultiHead}(Q,K,V)\right)\right)\oplus Q\big);$$

wherein $\oplus$ represents point-by-point addition, $\mathrm{FC}(\cdot)$, $\mathrm{Dropout}(\cdot)$ and $\mathrm{LN}(\cdot)$ respectively represent the processing of the second fully connected layer, the Dropout layer and the normalization layer; and the feature map $V_c$ represents the output of the multi-head attention module.
5. The multi-head attention driven kitchen waste multi-label classification method according to claim 1, wherein the dynamic graph convolution module comprises a static graph convolution layer and a dynamic graph convolution layer;
The dynamic graph convolution layer processes the output $H$ of the static graph convolution layer, expressed as:

$$Z=f_{\mathrm{LeakyReLU}}\!\left(A_d\,H\,W_d\right),\qquad A_d=f_{\mathrm{Sigmoid}}\!\left(W_A\,\tilde{H}\right);$$

wherein $A_d$ is the correlation matrix of the dynamic graph convolution layer, $W_d$ is the state-update weight of the dynamic graph convolution layer, $W_A$ is the weight used to construct $A_d$, $f_{\mathrm{LeakyReLU}}$ is the LeakyReLU activation function, and $f_{\mathrm{Sigmoid}}$ is the Sigmoid activation function; $\tilde{H}\in\mathbb{R}^{C\times 2D_1}$ is obtained by concatenating the feature map $H$ with its global representation $h_g$, wherein the global representation $h_g$ is obtained by applying pooling, a 1×1 one-dimensional convolution and an activation function to the output $H$ of the static graph convolution layer; $C$ represents the total number of categories of the kitchen waste multi-label classification, and $D_1$ represents the dimension of the output feature map $H$ of the static graph convolution layer.
6. The multi-head attention-driven kitchen waste multi-label classification method according to claim 5, wherein the static graph convolution layer processes the input feature map, expressed as:

$$H=f_{\mathrm{LeakyReLU}}\!\left(A_s\,V_c\,W_s\right);$$

wherein $V_c$ is the feature map output by the multi-head attention module, $H$ is the output feature map of the static graph convolution layer and consists of the features corresponding to the $C$ categories, i.e., $H=[h_1,h_2,\ldots,h_C]$; $f_{\mathrm{LeakyReLU}}$ is the activation function, $A_s$ is the correlation matrix of the static graph convolution layer, and $W_s$ is the state-update weight of the static graph convolution layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410900342.2A CN118429733B (en) | 2024-07-05 | 2024-07-05 | Multi-head attention-driven kitchen garbage multi-label classification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410900342.2A CN118429733B (en) | 2024-07-05 | 2024-07-05 | Multi-head attention-driven kitchen garbage multi-label classification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118429733A true CN118429733A (en) | 2024-08-02 |
CN118429733B CN118429733B (en) | 2024-10-11 |
Family
ID=92321837
Family Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202410900342.2A Active CN118429733B (en) | 2024-07-05 | 2024-07-05 | Multi-head attention-driven kitchen garbage multi-label classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118429733B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021115159A1 (en) * | 2019-12-09 | 2021-06-17 | 中兴通讯股份有限公司 | Character recognition network model training method, character recognition method, apparatuses, terminal, and computer storage medium therefor |
JP6980958B1 (en) * | 2021-06-23 | 2021-12-15 | 中国科学院西北生態環境資源研究院 | Rural area classification garbage identification method based on deep learning |
CN114612681A (en) * | 2022-01-30 | 2022-06-10 | 西北大学 | GCN-based multi-label image classification method, model construction method and device |
CN116484740A (en) * | 2023-04-28 | 2023-07-25 | 南京信息工程大学 | Line parameter identification method based on space topology characteristics of excavated power grid |
CN116863531A (en) * | 2023-05-22 | 2023-10-10 | 山东师范大学 | Human behavior recognition method and system based on self-attention enhanced graph neural network |
US20240119721A1 (en) * | 2022-10-06 | 2024-04-11 | Qualcomm Incorporated | Processing data using convolution as a transformer operation |
WO2024139297A1 (en) * | 2022-12-30 | 2024-07-04 | 深圳云天励飞技术股份有限公司 | Road disease identification method and re-identification method, and related device |
-
2024
- 2024-07-05 CN CN202410900342.2A patent/CN118429733B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021115159A1 (en) * | 2019-12-09 | 2021-06-17 | 中兴通讯股份有限公司 | Character recognition network model training method, character recognition method, apparatuses, terminal, and computer storage medium therefor |
JP6980958B1 (en) * | 2021-06-23 | 2021-12-15 | 中国科学院西北生態環境資源研究院 | Rural area classification garbage identification method based on deep learning |
CN114612681A (en) * | 2022-01-30 | 2022-06-10 | 西北大学 | GCN-based multi-label image classification method, model construction method and device |
US20240119721A1 (en) * | 2022-10-06 | 2024-04-11 | Qualcomm Incorporated | Processing data using convolution as a transformer operation |
WO2024139297A1 (en) * | 2022-12-30 | 2024-07-04 | 深圳云天励飞技术股份有限公司 | Road disease identification method and re-identification method, and related device |
CN116484740A (en) * | 2023-04-28 | 2023-07-25 | 南京信息工程大学 | Line parameter identification method based on space topology characteristics of excavated power grid |
CN116863531A (en) * | 2023-05-22 | 2023-10-10 | 山东师范大学 | Human behavior recognition method and system based on self-attention enhanced graph neural network |
Non-Patent Citations (3)
Title |
---|
HAI QIN, ET AL.: "Active Learning-DETR: Cost-Effective Object Detection for Kitchen Waste", IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 22 February 2024 (2024-02-22), pages 1 - 15 * |
CHEN JIAWEI; HAN FANG; WANG ZHIJIE: "Aspect-specific sentiment analysis based on self-attention gated graph convolutional network", Journal of Computer Applications, no. 08, 10 August 2020 (2020-08-10), pages 38 - 42 *
GONG LIANGWEI ET AL.: "Multi-label image classification algorithm based on multi-head class-specific residual attention and graph convolution", Microelectronics & Computer, 31 August 2023 (2023-08-31), pages 45 - 54 *
Also Published As
Publication number | Publication date |
---|---|
CN118429733B (en) | 2024-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021042828A1 (en) | Neural network model compression method and apparatus, and storage medium and chip | |
CN109711481B (en) | Neural networks for drawing multi-label recognition, related methods, media and devices | |
CN109271522B (en) | Comment emotion classification method and system based on deep hybrid model transfer learning | |
CN109711463B (en) | Attention-based important object detection method | |
CN112949673B (en) | Feature fusion target detection and identification method based on global attention | |
WO2021022521A1 (en) | Method for processing data, and method and device for training neural network model | |
CN115937655B (en) | Multi-order feature interaction target detection model, construction method, device and application thereof | |
CN105243154B (en) | Remote sensing image retrieval method based on notable point feature and sparse own coding and system | |
CN110222718B (en) | Image processing method and device | |
CN110321805B (en) | Dynamic expression recognition method based on time sequence relation reasoning | |
CN112488301B (en) | Food inversion method based on multitask learning and attention mechanism | |
CN113505719B (en) | Gait recognition model compression system and method based on local-integral combined knowledge distillation algorithm | |
CN114780767B (en) | Large-scale image retrieval method and system based on deep convolutional neural network | |
CN112507800A (en) | Pedestrian multi-attribute cooperative identification method based on channel attention mechanism and light convolutional neural network | |
CN113378938A (en) | Edge transform graph neural network-based small sample image classification method and system | |
CN115049941A (en) | Improved ShuffleNet convolutional neural network and remote sensing image classification method thereof | |
Liu et al. | A novel image retrieval algorithm based on transfer learning and fusion features | |
Ramesh Babu et al. | A novel framework design for semantic based image retrieval as a cyber forensic tool | |
CN115457332A (en) | Image multi-label classification method based on graph convolution neural network and class activation mapping | |
Ye et al. | PlantBiCNet: A new paradigm in plant science with bi-directional cascade neural network for detection and counting | |
CN118429733B (en) | Multi-head attention-driven kitchen garbage multi-label classification method | |
CN116704382A (en) | Unmanned aerial vehicle image semantic segmentation method, device, equipment and storage medium | |
CN115496948A (en) | Network supervision fine-grained image identification method and system based on deep learning | |
CN115984699A (en) | Illegal billboard detection method, device, equipment and medium based on deep learning | |
CN116543250A (en) | Model compression method based on class attention transmission |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |