CN111476315A - Image multi-label identification method based on statistical correlation and graph convolution technology - Google Patents
- Publication number
- CN111476315A
- Authority
- CN
- China
- Prior art keywords
- image
- label
- network
- graph convolution
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Computational Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computing Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to an image multi-label identification method based on statistical correlation and graph convolution technology, which overcomes the defect of the prior art that the relationships between objects in a multi-label image are not fully considered. The invention comprises the following steps: collecting and preprocessing multi-label images; calculating the correlation between the labels; constructing an image multi-label identification network; training the image multi-label identification network; acquiring a multi-label image to be detected; and obtaining an image multi-label identification result. The method learns the adjacency matrix from image label data, updates the feature representations of objects in the image through a graph convolution network, and improves image multi-label classification performance by combining global feature residuals.
Description
Technical Field
The invention relates to the technical field of image analysis, in particular to an image multi-label identification method based on statistical correlation and graph convolution technology.
Background
In recent years, convolutional neural networks have developed rapidly in the field of computer vision, especially in image classification. Because the local receptive field of a convolution kernel is limited, a convolutional neural network is better at identifying single objects and ignores the relationships between objects. An image usually contains several related objects that appear together, such as teachers and students, mice and keyboards, or goats and grasslands. Other combinations hardly ever appear in the same image, such as dogs and airplanes, yaks and seas, or snowflakes and swimsuits. Images therefore contain a large number of dependency relationships, yet current convolutional neural networks cannot model the dependencies between objects from the training data in order to improve classification accuracy.
Graph convolutional networks are widely used to address this inherent limitation of convolutional neural networks. Their main components are an adjacency matrix, a feature representation matrix of the nodes, and a learnable weight matrix. Much research focuses on the adjacency matrix. Some studies build the adjacency matrix from semantic networks, context information, knowledge graphs and similar sources, but message passing between nodes is then limited to each node's first-order neighbors. Furthermore, a graph obtained from external information may not fit the image dataset being learned, so that the knowledge graph misleads training.
In particular, a multi-label image contains several objects to be identified (labeled), and these objects are usually correlated. For example, when objects such as a keyboard and a mouse are found in part of an image, the image very probably also contains a computer, and a display can accordingly be expected as well. In other words, identifying a keyboard and a mouse in a picture raises the probability that a computer host and a display are present and lowers the probability that objects such as an airplane or an elephant are present. Modeling and reasoning about the dependencies between multiple objects in an image is therefore crucial.
Therefore, given that current convolutional neural networks ignore the relationships between objects, how to model the dependency relationships between objects from the image data has become an urgent technical problem.
Disclosure of Invention
The invention aims to solve the defect that the relationship between objects in a multi-label image is not fully considered in the prior art, and provides an image multi-label identification method based on statistical correlation and graph convolution technology to solve the problem.
In order to achieve the purpose, the technical scheme of the invention is as follows:
an image multi-label identification method based on statistical correlation and graph convolution technology comprises the following steps:
collecting and preprocessing multi-label images: collecting multi-label images, and processing labels into a matrix of N × C, wherein N is the number of samples, and C is the type or category number of the labels;
calculating the correlation between the labels: calculating the mutual dependency among the labels by using mutual information, constructing a fully connected dependency graph, and normalizing it to obtain an adjacency matrix;
constructing an image multi-label identification network: constructing an image multi-label identification network based on the graph convolution network;
training the image multi-label recognition network: training a graph convolution network and a full connection layer in the image multi-label identification network;
acquiring a multi-label image to be detected: acquiring a multi-label image to be detected;
obtaining an image multi-label identification result: and inputting the multi-label image to be detected into the trained image multi-label identification network to obtain a final multi-label classification result.
The collection and pre-processing of the multi-label image comprises the steps of:
constructing an N × C all-zero matrix D, wherein N is the number of images in the training set, C is the total number of categories in the training set, and the categories may be arranged in any order;
converting the image annotation data into the label data matrix D, wherein one image and its annotation information correspond to one row of the label data matrix D; for every image in the annotated data, if a certain label exists in the image, the entry at the corresponding row and column of the label data matrix D is assigned "1", indicating that the label exists.
The calculating the correlation between the labels comprises the following steps:
for each column in the label data matrix D, the mutual information between that column and every other column is calculated, with the following formulas:
$$I(X;Y) = H(X) - H(X \mid Y)$$
$$H(X) = -\sum_{x} P(x)\,\log P(x),$$
wherein X and Y are random variables representing label categories; x and y are the values taken by X and Y, with x, y ∈ {0,1}; P(x) is the probability that the random variable X takes the value x; P(x|y) is the conditional probability; H(X) is the information entropy; and H(X|Y) is the conditional information entropy;
regarding each column of the label data as a random variable X or Y and the value in each row as x or y, calculating the mutual information between nodes and constructing a matrix A with C rows and C columns to store the mutual information values, where $A_{ij}$ represents the mutual information value between the i-th column and the j-th column;
computing the normalization of the matrix A as the adjacency matrix of the graph convolution network:
$$\hat{A}_{ij} = \mathrm{softmax}(A_{ij}) = \frac{\exp(A_{ij})}{\sum_{k=1}^{C}\exp(A_{ik})}$$
wherein $A_{ij}$ is the mutual information value between the i-th class and the j-th class, exp is the exponential function, softmax is the normalizing function, and $\hat{A}$ is the normalized adjacency matrix.
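For reference, the conditional entropy used in the mutual information formula can be written out for the binary label variables as follows. This is the standard textbook definition rather than a formula quoted from the source; $P(x, y)$ denotes the joint probability, which the source does not name explicitly.

```latex
\begin{aligned}
H(X \mid Y) &= -\sum_{y \in \{0,1\}} P(y) \sum_{x \in \{0,1\}} P(x \mid y)\,\log P(x \mid y) \\
I(X;Y) &= H(X) - H(X \mid Y)
        = \sum_{x,\,y \in \{0,1\}} P(x, y)\,\log \frac{P(x, y)}{P(x)\,P(y)}
\end{aligned}
```

The second form is symmetric in X and Y, so $I(X;Y) = I(Y;X)$ and the matrix A of mutual information values is symmetric.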
The method for constructing the image multi-label identification network comprises the following steps:
setting and utilizing Fast R-CNN as the baseline module to obtain the feature $X_I$ of each picture and the bounding boxes;
setting and utilizing the mutual information method to obtain a fully connected adjacency matrix and normalizing it, wherein the expression is as follows:
$$I(X;Y) = H(X) - H(X \mid Y),$$
setting the graph convolution network: the initial feature representations of the bounding boxes are combined to form $X^{(0)}$, which, together with the fully connected adjacency matrix $\hat{A}$, is taken as the input of the graph convolution network; L layers of graph convolution are performed to obtain the feature representation $X^{(L)}$, wherein the expression is as follows:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $\hat{A}$ is the adjacency matrix, X is the matrix formed by the feature vectors of the nodes, W is a learnable parameter, and σ(·) is an activation function;
setting the fully connected layers: concatenating the overall image features with the bounding box features output by the graph convolution network, connecting a two-layer fully connected neural network, and applying softmax activation to obtain the final classification result.
The training of the graph convolution network comprises the following steps:
obtaining the global feature representation of the image, the bounding boxes of the objects in the image, and the feature representations of those objects by using Fast R-CNN and ROI;
taking the feature representations of the objects as the input of the graph convolution network and updating the corresponding node representations:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $X^{(l+1)}$ is the graph convolution feature of the (l+1)-th layer, σ is a nonlinear activation function, $\hat{A}$ is the normalized global adjacency matrix obtained in the second step, $X^{(l)}$ is the feature representation of the l-th layer, and W is the learnable parameter;
concatenating the global feature representation of the image with the object representations updated by the graph convolution network, connecting two FC layers, and obtaining the final multi-label identification result after normalization by a softmax function.
The training of the full connection layer comprises the following steps:
inputting the training images into the network to obtain a training result;
modifying the connection weights of the fully connected layers according to a gradient descent algorithm;
correcting the graph convolution network parameter W according to a gradient descent algorithm.
The obtaining of the image multi-label identification result comprises the following steps:
obtaining the feature $X_I$ of the multi-label image to be detected and its bounding boxes by using Fast R-CNN as the baseline module;
obtaining the initial feature representation of each bounding box in the multi-label image to be detected by using ROI;
merging the initial feature representations of all bounding boxes of the image to be detected into $X^{(0)}$ as the input of the graph convolution network;
concatenating the output of the graph convolution network with the overall initial feature $X_I$ of the image and feeding the result to the fully connected network;
passing the output of the two trained fully connected layers through a softmax function to obtain the final multi-label classification result.
Advantageous effects
Compared with the prior art, the image multi-label identification method based on statistical correlation and graph convolution technology learns the adjacency matrix from image label data, updates the feature representations of objects in the image through the graph convolution network, and improves image multi-label classification performance by combining global feature residuals.
The method can well combine the image feature extraction capability of the convolutional neural network and the mutual dependency relationship of the labels, thereby improving the precision of multi-label classification.
Drawings
FIG. 1 is a sequence diagram of the method of the present invention.
Detailed Description
So that the above features of the present invention can be readily understood, a more particular description of the invention, briefly summarized above, is given below with reference to embodiments, some of which are illustrated in the appended drawings, wherein:
as shown in fig. 1, the image multi-label identification method based on statistical correlation and graph convolution technology according to the present invention includes the following steps:
the first step, the collection and preprocessing of multi-label images: collecting multi-label images, and processing labels into a matrix of N × C, wherein N is the number of samples, and C is the type or category number of the labels. The method comprises the following specific steps:
(1) Constructing an all-zero matrix D of size N × C, wherein N is the number of images in the training set, C is the total number of categories in the training set, and the categories may be arranged in any order.
(2) Converting the image annotation data into the label data matrix D, wherein one image and its annotation information correspond to one row of the label data matrix D; for every image in the annotated data, if a certain label exists in the image, the entry at the corresponding row and column of the label data matrix D is assigned "1", indicating that the label exists.
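The two sub-steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation: the input format (one list of label names per image), the function name `build_label_matrix`, and the example categories are all assumptions made for the sketch.

```python
import numpy as np

def build_label_matrix(annotations, categories):
    """Build the N x C binary label matrix D.

    annotations: list of label-name lists, one per image (hypothetical input format).
    categories:  the C category names in a fixed but arbitrary order.
    """
    column_of = {name: j for j, name in enumerate(categories)}
    D = np.zeros((len(annotations), len(categories)), dtype=np.int64)
    for i, labels in enumerate(annotations):
        for name in labels:
            D[i, column_of[name]] = 1  # label `name` is present in image i
    return D

# Illustrative data only.
categories = ["keyboard", "mouse", "monitor", "airplane"]
annotations = [["keyboard", "mouse"], ["monitor", "keyboard"], ["airplane"]]
D = build_label_matrix(annotations, categories)
```

Each row of `D` corresponds to one image and each column to one category, matching the layout described in the text.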
Secondly, calculating the correlation between the labels: calculating the mutual dependency among the labels by using mutual information, constructing a fully connected dependency graph, and normalizing it to obtain the adjacency matrix. Modeling the statistical correlation among the labels can improve the performance of multi-label classification. The adjacency matrix guides the passing of feature messages between objects in the graph convolution network, thereby enhancing the feature representations of related objects and reducing message passing between objects that are not statistically related. Most existing methods construct the adjacency matrix from external knowledge (such as semantic networks or knowledge graphs), but external knowledge may not fit the training data set well, so that the adjacency matrix misleads message passing; the statistical correlation among labels is therefore modeled here from the label data of the training data set itself. Traditional statistical correlation modeling often requires independence tests on the label data, which is time-consuming and laborious. Information entropy describes the amount of uncertainty contained in a random variable, and mutual information describes how much the uncertainty of one random variable is reduced when another random variable is known. In addition, the computational complexity of mutual information is far lower than that of the chi-square test. Mutual information is therefore used to compute the correlation between image labels, and the result is normalized and used as the adjacency matrix that guides message passing between the objects in image multi-label identification.
The method comprises the following specific steps:
(1) For each column in the label data matrix D, the mutual information between that column and every other column is calculated, with the following formulas:
$$I(X;Y) = H(X) - H(X \mid Y)$$
$$H(X) = -\sum_{x} P(x)\,\log P(x),$$
wherein X and Y are random variables representing label categories; x and y are the values taken by X and Y, with x, y ∈ {0,1}; P(x) is the probability that the random variable X takes the value x; P(x|y) is the conditional probability; H(X) is the information entropy; and H(X|Y) is the conditional information entropy. Information entropy describes the uncertainty of information; mutual information is innovatively used here in place of the conditional independence test to describe the correlation between picture category labels quantitatively.
Regarding each column of the label data as a random variable X or Y and the value in each row as x or y, the mutual information between nodes is calculated and a matrix A with C rows and C columns is constructed to store the mutual information values, where $A_{ij}$ represents the mutual information value between the i-th column and the j-th column.
(2) Computing the normalization of the matrix A as the adjacency matrix of the graph convolution network:
$$\hat{A}_{ij} = \mathrm{softmax}(A_{ij}) = \frac{\exp(A_{ij})}{\sum_{k=1}^{C}\exp(A_{ik})}$$
wherein $A_{ij}$ is the mutual information value between the i-th class and the j-th class, exp is the exponential function, softmax is the normalizing function, and $\hat{A}$ is the normalized adjacency matrix.
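A possible NumPy sketch of this second step follows. It estimates probabilities by simple frequency counts over the rows of D and computes the conditional entropy via the identity H(X|Y) = H(X,Y) - H(Y). Two points are assumptions rather than statements from the source: the softmax is applied row by row (the text names softmax and exp but not the axis), and the diagonal of A is filled with each column's mutual information with itself, which equals its entropy. All function names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a probability vector; 0*log(0) is treated as 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(x, y):
    """I(X;Y) = H(X) - H(X|Y) for two binary columns of equal length."""
    x, y = np.asarray(x), np.asarray(y)
    joint = np.zeros((2, 2))
    for a in (0, 1):
        for b in (0, 1):
            joint[a, b] = np.mean((x == a) & (y == b))  # empirical P(X=a, Y=b)
    h_x = entropy(joint.sum(axis=1))                    # H(X) from the marginal of X
    h_x_given_y = entropy(joint.ravel()) - entropy(joint.sum(axis=0))  # H(X,Y) - H(Y)
    return h_x - h_x_given_y

def mutual_information_matrix(D):
    """C x C matrix A with A[i, j] = I(column i; column j)."""
    C = D.shape[1]
    A = np.zeros((C, C))
    for i in range(C):
        for j in range(C):
            A[i, j] = mutual_information(D[:, i], D[:, j])
    return A

def normalize_adjacency(A):
    """Row-wise softmax of A (the row-wise axis is an assumption)."""
    shifted = A - A.max(axis=1, keepdims=True)  # subtract the row maximum for stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Illustrative label matrix: columns 0 and 1 always co-occur.
D = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
A = mutual_information_matrix(D)
A_hat = normalize_adjacency(A)
```

In this toy data the two co-occurring labels receive a larger weight toward each other in `A_hat` than toward the third label, which is the behavior the text expects from a correlation-guided adjacency matrix.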
Thirdly, constructing an image multi-label identification network: and constructing an image multi-label identification network based on the graph convolution network.
The graph convolution network can effectively fuse connectionist frameworks, such as convolutional networks, with symbolic reasoning frameworks, and performs message passing and reasoning between image objects under the guidance of the adjacency matrix, thereby improving multi-label classification performance. Most conventional applications of graph convolution networks use external knowledge as the adjacency matrix and semantic vectors as the node feature vectors. Because external knowledge and external node vector representations may not fit the training data set well, Fast R-CNN and ROI (region of interest) are used here to extract the feature representation of each object as the feature vector of its node. Meanwhile, the feature representation of the whole image and the node (object) feature representations obtained from the graph convolution network are concatenated and passed to a two-layer fully connected network to obtain the final classification result. This has two main benefits: 1. the message passing and feature enhancement of the graph convolution network are driven by the training data and are not misled by external knowledge; 2. concatenating the overall image features with the graph convolution node features ensures that the global information of the image is not lost while the receptive field of the classifier covers the local object regions, thereby achieving a stable classification result.
The method for constructing the image multi-label identification network comprises the following steps:
(1) setting and utilizing Fast R-CNN as the baseline module to obtain the feature $X_I$ of each picture and the bounding boxes;
(2) setting the ROI (region of interest) to obtain the initial feature representation of each bounding box;
(3) setting and utilizing the mutual information method to obtain a fully connected adjacency matrix and normalizing it, wherein the expression is as follows:
$$I(X;Y) = H(X) - H(X \mid Y),$$
(4) setting the graph convolution network: the initial feature representations of the bounding boxes are combined to form $X^{(0)}$, which, together with the fully connected adjacency matrix $\hat{A}$, is taken as the input of the graph convolution network; L layers of graph convolution are performed to obtain the feature representation $X^{(L)}$, wherein the expression is as follows:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $\hat{A}$ is the adjacency matrix, X is the matrix formed by the feature vectors of the nodes, W is a learnable parameter, and σ(·) is an activation function.
(5) setting the fully connected layers: concatenating the overall image features with the bounding box features output by the graph convolution network, connecting a two-layer fully connected neural network, and applying softmax activation to obtain the final classification result.
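The graph convolution in step (4) can be illustrated with a short NumPy sketch of the propagation rule, in which each layer multiplies the node features by the normalized adjacency matrix and a weight matrix and then applies an activation. Several things here are assumptions: ReLU is used as the activation (the source only says "an activation function"), each layer has its own weight matrix, there is one node per label category so that the C × C adjacency matrix matches the number of rows of the feature matrix, and random values stand in for the ROI features that Fast R-CNN would provide.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def gcn_forward(A_hat, X0, weights, activation=relu):
    """Apply one graph-convolution layer per weight matrix: X <- activation(A_hat @ X @ W)."""
    X = X0
    for W in weights:
        X = activation(A_hat @ X @ W)
    return X

rng = np.random.default_rng(0)
C, d_in, d_hidden = 3, 8, 4                  # hypothetical sizes
A_hat = np.full((C, C), 1.0 / C)             # placeholder row-normalized adjacency matrix
X0 = rng.normal(size=(C, d_in))              # placeholder for ROI features, one row per node
weights = [rng.normal(size=(d_in, d_hidden)) * 0.1,
           rng.normal(size=(d_hidden, d_hidden)) * 0.1]   # L = 2 layers
X_L = gcn_forward(A_hat, X0, weights)
```

With the uniform placeholder adjacency used here every node receives the same average message, so all rows of `X_L` coincide; an adjacency matrix derived from mutual information would instead weight strongly correlated labels more heavily.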
Fourthly, training the image multi-label recognition network: and training a graph convolution network and a full connection layer in the image multi-label identification network.
The method for training the graph convolution network comprises the following steps:
(1) obtaining the global feature representation of the image, the bounding boxes of the objects in the image, and the feature representations of those objects by using Fast R-CNN and ROI;
(2) taking the feature representations of the objects as the input of the graph convolution network and updating the corresponding node representations:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $X^{(l+1)}$ is the graph convolution feature of the (l+1)-th layer, σ is a nonlinear activation function, $\hat{A}$ is the normalized global adjacency matrix obtained in the second step, $X^{(l)}$ is the feature representation of the l-th layer, and W is the learnable parameter;
(3) concatenating the global feature representation of the image with the object representations updated by the graph convolution network, connecting two FC layers, and obtaining the final multi-label identification result after normalization by a softmax function.
Training the fully-connected layer utilizes a conventional method, which includes the steps of:
(1) inputting the training images into the network to obtain a training result;
(2) modifying the connection weights of the fully connected layers according to a gradient descent algorithm;
(3) correcting the graph convolution network parameter W according to a gradient descent algorithm.
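The two update steps above follow the usual gradient descent rule. As a sketch, with a loss function $\mathcal{L}$ and a learning rate $\eta$ that are both assumptions (the source names neither), and $W_{\mathrm{FC}}$ as a hypothetical symbol for the weights of the fully connected layers:

```latex
W_{\mathrm{FC}} \leftarrow W_{\mathrm{FC}} - \eta\,\frac{\partial \mathcal{L}}{\partial W_{\mathrm{FC}}},
\qquad
W \leftarrow W - \eta\,\frac{\partial \mathcal{L}}{\partial W}
```

As described, the adjacency matrix is computed from the label statistics in the second step, so these updates appear to apply only to $W_{\mathrm{FC}}$ and $W$.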
Fifthly, acquiring the multi-label image to be detected: acquiring a multi-label image to be detected.
Sixthly, obtaining an image multi-label identification result: and inputting the multi-label image to be detected into the trained image multi-label identification network to obtain a final multi-label classification result. The method comprises the following specific steps:
(1) obtaining the feature $X_I$ of the multi-label image to be detected and its bounding boxes by using Fast R-CNN as the baseline module;
(2) obtaining the initial feature representation of each bounding box in the multi-label image to be detected by using ROI;
(3) merging the initial feature representations of all bounding boxes of the image to be detected into $X^{(0)}$ as the input of the graph convolution network;
(4) concatenating the output of the graph convolution network with the overall initial feature $X_I$ of the image and feeding the result to the fully connected network;
(5) passing the output of the two trained fully connected layers through a softmax function to obtain the final multi-label classification result.
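Sub-steps (4) and (5) can be sketched as a small classification head in NumPy: the global image feature is concatenated with the flattened node features, passed through two fully connected layers, and normalized with softmax. The ReLU between the two layers, the flattening of the node features, all layer sizes, and the random placeholder weights and features are assumptions for illustration; in practice the weights would come from training.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(x_global, X_nodes, W1, b1, W2, b2):
    """Concatenate global and node features, apply two FC layers, then softmax."""
    fused = np.concatenate([x_global, X_nodes.ravel()])
    hidden = np.maximum(W1 @ fused + b1, 0.0)   # ReLU between the FC layers (assumption)
    return softmax(W2 @ hidden + b2)

rng = np.random.default_rng(0)
C, d_node, d_global, d_hidden = 3, 4, 6, 5      # hypothetical sizes
x_global = rng.normal(size=d_global)            # placeholder for the global feature X_I
X_nodes = rng.normal(size=(C, d_node))          # placeholder for the graph convolution output
W1 = rng.normal(size=(d_hidden, d_global + C * d_node)) * 0.1
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(C, d_hidden)) * 0.1
b2 = np.zeros(C)
scores = classify(x_global, X_nodes, W1, b1, W2, b2)
```

The resulting `scores` form a probability vector over the C categories. The source does not say how several labels are read off from this vector, so any thresholding or top-k rule applied afterwards would be a further design choice.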
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (7)
1. An image multi-label identification method based on statistical correlation and graph convolution technology is characterized by comprising the following steps:
11) collecting and preprocessing multi-label images: collecting multi-label images, and processing labels into a matrix of N × C, wherein N is the number of samples, and C is the type or category number of the labels;
12) calculating the correlation between the labels: calculating the mutual dependency among the labels by using mutual information, constructing a fully connected dependency graph, and normalizing it to obtain an adjacency matrix;
13) constructing an image multi-label identification network: constructing an image multi-label identification network based on the graph convolution network;
14) training the image multi-label recognition network: training a graph convolution network and a full connection layer in the image multi-label identification network;
15) acquiring a multi-label image to be detected: acquiring a multi-label image to be detected;
16) obtaining an image multi-label identification result: and inputting the multi-label image to be detected into the trained image multi-label identification network to obtain a final multi-label classification result.
2. The method for image multi-label recognition based on statistical correlation and graph convolution technology as claimed in claim 1, wherein the collecting and preprocessing of the multi-label image comprises the following steps:
21) constructing an N × C all-zero matrix D, wherein N is the number of images in the training set, C is the total number of categories in the training set, and the categories may be arranged in any order;
22) converting the image annotation data into the label data matrix D, wherein one image and its annotation information correspond to one row of the label data matrix D; for every image in the annotated data, if a certain label exists in the image, the entry at the corresponding row and column of the label data matrix D is assigned "1", indicating that the label exists.
3. The method for image multi-label recognition based on statistical correlation and graph convolution technology as claimed in claim 1, wherein said calculating the correlation between labels comprises the following steps:
31) for each column in the label data matrix D, the mutual information between that column and every other column is calculated, with the following formulas:
$$I(X;Y) = H(X) - H(X \mid Y)$$
$$H(X) = -\sum_{x} P(x)\,\log P(x),$$
wherein X and Y are random variables representing label categories; x and y are the values taken by X and Y, with x, y ∈ {0,1}; P(x) is the probability that the random variable X takes the value x; P(x|y) is the conditional probability; H(X) is the information entropy; and H(X|Y) is the conditional information entropy;
regarding each column of the label data as a random variable X or Y and the value in each row as x or y, calculating the mutual information between nodes and constructing a matrix A with C rows and C columns to store the mutual information values, where $A_{ij}$ represents the mutual information value between the i-th column and the j-th column;
32) computing the normalization of the matrix A as the adjacency matrix of the graph convolution network:
$$\hat{A}_{ij} = \mathrm{softmax}(A_{ij}) = \frac{\exp(A_{ij})}{\sum_{k=1}^{C}\exp(A_{ik})}$$
4. The image multi-label identification method based on statistical correlation and graph convolution technology as claimed in claim 1, wherein said constructing image multi-label identification network comprises the following steps:
41) setting and utilizing Fast R-CNN as the baseline module to obtain the feature $X_I$ of each picture and the bounding boxes;
43) setting and utilizing the mutual information method to obtain a fully connected adjacency matrix and normalizing it, wherein the expression is as follows:
$$I(X;Y) = H(X) - H(X \mid Y),$$
44) setting the graph convolution network: the initial feature representations of the bounding boxes are combined to form $X^{(0)}$, which, together with the fully connected adjacency matrix $\hat{A}$, is taken as the input of the graph convolution network; L layers of graph convolution are performed to obtain the feature representation $X^{(L)}$, wherein the expression is as follows:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $\hat{A}$ is the adjacency matrix, X is the matrix formed by the feature vectors of the nodes, W is a learnable parameter, and σ(·) is an activation function;
45) setting the fully connected layers: concatenating the overall image features with the bounding box features output by the graph convolution network, connecting a two-layer fully connected neural network, and applying softmax activation to obtain the final classification result.
5. The method of claim 1, wherein the training of the graph convolution network comprises the following steps:
51) obtaining the global feature representation of the image, the bounding boxes of the objects in the image, and the feature representations of those objects by using Fast R-CNN and ROI;
52) taking the feature representations of the objects as the input of the graph convolution network and updating the corresponding node representations:
$$X^{(l+1)} = \sigma\left(\hat{A}\,X^{(l)}\,W\right)$$
wherein $X^{(l+1)}$ is the graph convolution feature of the (l+1)-th layer, σ is a nonlinear activation function, $\hat{A}$ is the normalized global adjacency matrix obtained in the second step, $X^{(l)}$ is the feature representation of the l-th layer, and W is the learnable parameter;
53) concatenating the global feature representation of the image with the object representations updated by the graph convolution network, connecting two FC layers, and obtaining the final multi-label identification result after normalization by a softmax function.
6. The method of claim 1, wherein the training of the fully connected layer comprises the following steps:
61) inputting the training image into the network to obtain a training result;
62) updating the connection weights of the fully connected layers according to a gradient descent algorithm;
63) updating the graph convolution network parameter W according to a gradient descent algorithm.
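The weight update of steps 62) and 63) can be sketched for a single linear layer followed by softmax. The cross-entropy loss, the learning rate, and the toy feature and target values are assumptions made for illustration. The claim only states that gradient descent is used, and the same rule would be applied to every fully connected weight and to W via backpropagation.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def gradient_descent_step(w, features, target, lr=0.1):
    """One update of a linear layer under softmax cross-entropy (assumed loss).

    Returns the updated weights and the loss measured before the update.
    """
    logits = [sum(wi * f for wi, f in zip(row, features)) for row in w]
    probs = softmax(logits)
    loss = -sum(t * math.log(p) for p, t in zip(probs, target) if t > 0)
    # For softmax + cross-entropy, d(loss)/d(logit_k) = probs_k - target_k.
    grad_logits = [p - t for p, t in zip(probs, target)]
    new_w = [[wi - lr * g * f for wi, f in zip(row, features)]
             for row, g in zip(w, grad_logits)]
    return new_w, loss

# Hypothetical training example: 2 input features, 2 labels.
w = [[0.0, 0.0], [0.0, 0.0]]
losses = []
for _ in range(20):
    w, loss = gradient_descent_step(w, [1.0, 2.0], [1.0, 0.0])
    losses.append(loss)
```

The recorded losses shrink from step to step, which is the behavior that repeated application of steps 61) to 63) is meant to produce.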
7. The method for image multi-label recognition based on statistical correlation and graph convolution technology as claimed in claim 1, wherein the obtaining of the image multi-label recognition result comprises the following steps:
71) obtaining the overall feature $X_I$ and the bounding boxes of the multi-label image to be detected by using Fast R-CNN as the baseline module;
72) obtaining the initial feature representation of each bounding box in the multi-label image to be detected by using ROI;
73) merging the initial feature representations of all bounding boxes of the image to be detected into $X^{(0)}$ as the input of the graph convolution network;
74) concatenating the output of the graph convolution network with the overall initial image feature $X_I$ and feeding the result to the fully connected network;
75) passing the output of the two trained fully connected layers through a softmax function to obtain the final multi-label classification result.
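Steps 74) and 75) can be sketched as a small classification head. Flattening the bounding-box features before concatenation (which presumes a fixed number of boxes), the ReLU between the two fully connected layers, and all weights shown are assumptions. The claim specifies only the concatenation, the two layers, and the softmax.

```python
import math

def linear(w, b, x):
    """Fully connected layer: W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def classify(global_feat, box_feats, fc1, fc2):
    """Concatenate X_I with the graph convolution output, then FC, FC, softmax."""
    joined = list(global_feat) + [v for row in box_feats for v in row]
    w1, b1 = fc1
    hidden = [max(0.0, h) for h in linear(w1, b1, joined)]  # assumed ReLU
    w2, b2 = fc2
    return softmax(linear(w2, b2, hidden))

# Hypothetical values: 1 global feature, 2 boxes with 1 feature each, 2 labels.
x_i = [1.0]
gcn_out = [[2.0], [3.0]]
fc1 = ([[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0])
fc2 = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

probs = classify(x_i, gcn_out, fc1, fc2)
```

The output holds one score per label and sums to 1. How those scores are thresholded into a set of labels is not described in the claim.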
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010342622.8A CN111476315B (en) | 2020-04-27 | 2020-04-27 | Image multi-label identification method based on statistical correlation and graph convolution technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111476315A (en) | 2020-07-31 |
CN111476315B (en) | 2023-05-05 |
Family
ID=71763058
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010342622.8A Active CN111476315B (en) | 2020-04-27 | 2020-04-27 | Image multi-label identification method based on statistical correlation and graph convolution technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111476315B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112183299A (en) * | 2020-09-23 | 2021-01-05 | 成都佳华物链云科技有限公司 | Pedestrian attribute prediction method and device, electronic equipment and storage medium |
CN112487207A (en) * | 2020-12-09 | 2021-03-12 | Oppo广东移动通信有限公司 | Image multi-label classification method and device, computer equipment and storage medium |
CN112862089A (en) * | 2021-01-20 | 2021-05-28 | 清华大学深圳国际研究生院 | Medical image deep learning method with interpretability |
CN112906720A (en) * | 2021-03-19 | 2021-06-04 | 河北工业大学 | Multi-label image identification method based on graph attention network |
CN113204659A (en) * | 2021-03-26 | 2021-08-03 | 北京达佳互联信息技术有限公司 | Label classification method and device for multimedia resources, electronic equipment and storage medium |
CN113988147A (en) * | 2021-12-08 | 2022-01-28 | 南京信息工程大学 | Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device |
CN114550310A (en) * | 2022-04-22 | 2022-05-27 | 杭州魔点科技有限公司 | Method and device for identifying multi-label behaviors |
CN114648635A (en) * | 2022-03-15 | 2022-06-21 | 安徽工业大学 | Multi-label image classification method fusing strong correlation among labels |
CN115031794A (en) * | 2022-04-29 | 2022-09-09 | 天津大学 | Novel gas-solid two-phase flow measuring method of multi-characteristic-diagram convolution |
CN117475240A (en) * | 2023-12-26 | 2024-01-30 | 创思(广州)电子科技有限公司 | Vegetable checking method and system based on image recognition |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109816009A (en) * | 2019-01-18 | 2019-05-28 | 南京旷云科技有限公司 | Multi-tag image classification method, device and equipment based on picture scroll product |
CN110705425A (en) * | 2019-09-25 | 2020-01-17 | 广州西思数字科技有限公司 | Tongue picture multi-label classification learning method based on graph convolution network |
WO2020048119A1 (en) * | 2018-09-04 | 2020-03-12 | Boe Technology Group Co., Ltd. | Method and apparatus for training a convolutional neural network to detect defects |
Non-Patent Citations (2)
Title |
---|
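Li Hui et al.: "Multi-label food ingredient recognition based on graph convolutional networks", Journal of Nanjing University of Information Science & Technology (Natural Science Edition) * |
Jiang Junzhao et al.: "Multi-label classification algorithm using convolutional neural networks based on label correlation", Industrial Control Computer * |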
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112183299B (en) * | 2020-09-23 | 2024-02-09 | 成都佳华物链云科技有限公司 | Pedestrian attribute prediction method and device, electronic equipment and storage medium |
CN112183299A (en) * | 2020-09-23 | 2021-01-05 | 成都佳华物链云科技有限公司 | Pedestrian attribute prediction method and device, electronic equipment and storage medium |
CN112487207A (en) * | 2020-12-09 | 2021-03-12 | Oppo广东移动通信有限公司 | Image multi-label classification method and device, computer equipment and storage medium |
CN112862089B (en) * | 2021-01-20 | 2023-05-23 | 清华大学深圳国际研究生院 | Medical image deep learning method with interpretability |
CN112862089A (en) * | 2021-01-20 | 2021-05-28 | 清华大学深圳国际研究生院 | Medical image deep learning method with interpretability |
CN112906720A (en) * | 2021-03-19 | 2021-06-04 | 河北工业大学 | Multi-label image identification method based on graph attention network |
CN112906720B (en) * | 2021-03-19 | 2022-03-22 | 河北工业大学 | Multi-label image identification method based on graph attention network |
CN113204659B (en) * | 2021-03-26 | 2024-01-19 | 北京达佳互联信息技术有限公司 | Label classification method and device for multimedia resources, electronic equipment and storage medium |
CN113204659A (en) * | 2021-03-26 | 2021-08-03 | 北京达佳互联信息技术有限公司 | Label classification method and device for multimedia resources, electronic equipment and storage medium |
CN113988147B (en) * | 2021-12-08 | 2022-04-26 | 南京信息工程大学 | Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device |
CN113988147A (en) * | 2021-12-08 | 2022-01-28 | 南京信息工程大学 | Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device |
CN114648635A (en) * | 2022-03-15 | 2022-06-21 | 安徽工业大学 | Multi-label image classification method fusing strong correlation among labels |
CN114648635B (en) * | 2022-03-15 | 2024-07-09 | 安徽工业大学 | Multi-label image classification method fusing strong correlation among labels |
CN114550310A (en) * | 2022-04-22 | 2022-05-27 | 杭州魔点科技有限公司 | Method and device for identifying multi-label behaviors |
CN115031794A (en) * | 2022-04-29 | 2022-09-09 | 天津大学 | Novel gas-solid two-phase flow measuring method of multi-characteristic-diagram convolution |
CN115031794B (en) * | 2022-04-29 | 2024-07-26 | 天津大学 | Novel gas-solid two-phase flow measuring method based on multi-feature graph convolution |
CN117475240A (en) * | 2023-12-26 | 2024-01-30 | 创思(广州)电子科技有限公司 | Vegetable checking method and system based on image recognition |
Also Published As
Publication number | Publication date |
---|---|
CN111476315B (en) | 2023-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111476315B (en) | Image multi-label identification method based on statistical correlation and graph convolution technology | |
CN114067160B (en) | Small sample remote sensing image scene classification method based on embedded smooth graph neural network | |
Kauffmann et al. | From clustering to cluster explanations via neural networks | |
CN108875827B (en) | Method and system for classifying fine-grained images | |
US11003949B2 (en) | Neural network-based action detection | |
CN112906720B (en) | Multi-label image identification method based on graph attention network | |
CN113657425B (en) | Multi-label image classification method based on multi-scale and cross-modal attention mechanism | |
CN110909820A (en) | Image classification method and system based on self-supervision learning | |
CN112116599B (en) | Sputum smear tubercle bacillus semantic segmentation method and system based on weak supervised learning | |
CN112308115B (en) | Multi-label image deep learning classification method and equipment | |
CN110705490B (en) | Visual emotion recognition method | |
CN111612051A (en) | Weak supervision target detection method based on graph convolution neural network | |
Cholakkal et al. | Backtracking spatial pyramid pooling-based image classifier for weakly supervised top–down salient object detection | |
Hossain et al. | Recognition and solution for handwritten equation using convolutional neural network | |
CN115131613B (en) | Small sample image classification method based on multidirectional knowledge migration | |
CN114332893A (en) | Table structure identification method and device, computer equipment and storage medium | |
CN112183464A (en) | Video pedestrian identification method based on deep neural network and graph convolution network | |
Juyal et al. | Multilabel image classification using the CNN and DC-CNN model on Pascal VOC 2012 dataset | |
CN108960005B (en) | Method and system for establishing and displaying object visual label in intelligent visual Internet of things | |
CN113553326A (en) | Spreadsheet data processing method, device, computer equipment and storage medium | |
CN111259176B (en) | Cross-modal Hash retrieval method based on matrix decomposition and integrated with supervision information | |
Liu et al. | Self-supervised image co-saliency detection | |
CN114299342B (en) | Unknown mark classification method in multi-mark picture classification based on deep learning | |
Kanungo | Analysis of Image Classification Deep Learning Algorithm | |
CN112232398B (en) | Semi-supervised multi-category Boosting classification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||