CN113378965A - Multi-label image identification method and system based on DCGAN and GCN - Google Patents

Multi-label image identification method and system based on DCGAN and GCN

Info

Publication number
CN113378965A
Authority
CN
China
Prior art keywords
dcgan
algorithm
gcn
label
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110713085.8A
Other languages
Chinese (zh)
Other versions
CN113378965B (en
Inventor
刘嵩
来庆涵
周梓涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology
Priority to CN202110713085.8A
Publication of CN113378965A
Application granted
Publication of CN113378965B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a multi-label recognition algorithm based on DCGAN and GCN, including: constructing a DCGAN model based on the GAN model, and generating similar images with the DCGAN model; extracting features with a transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of multi-label images; and generating a class label classifier with a GCN algorithm from a relation graph among the training labels. A pre-training model is obtained by having a deep convolutional generative adversarial network generate data, and the parameters of the convolutional neural network of the pre-training model are transferred to the target task to fine-tune the network, so as to obtain a more accurate image recognition result. In addition, random noise is added when the images are generated, which can improve the robustness of the pre-training model.

Description

Multi-label image identification method and system based on DCGAN and GCN
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular to a multi-label image recognition method and system based on a DCGAN (deep convolutional generative adversarial network) and a GCN (graph convolutional network).
Background
In the internet era, multimedia data such as images and short videos have become the mainstream form of information and have an important influence on people's lives. Image recognition is a branch of computer vision: by assigning appropriate labels to an image, the visual information it conveys is converted into semantic information that is easy for people to understand, so that the image can be better understood and analyzed. Single-label image classification has been studied for many years with algorithms such as support vector machines and random forests. Supervised deep learning algorithms such as convolutional neural networks perform very well in single-label image recognition and are widely applied in fields such as transportation and medical treatment. In real life, however, an image in most cases contains multiple objects, scenes, and the like, and the different objects in a picture are associated with each other. Multi-label image classification is therefore the more general problem, and it is a generalization of single-label classification. The progress that convolutional neural networks have made in single-label image classification provides a basis for research on multi-label images.
The content of a multi-label image is complex: targets may be occluded, the background may be cluttered, and targets may be inconspicuous. A processing algorithm designed for single-label images may therefore not be applicable when applied directly to multi-label images. A simple way to approach multi-label image recognition is to convert the multi-label problem into several single-label problems.
One of the main difficulties in multi-label learning is the explosive growth of the output space. To cope with a label space of exponential complexity, the correlation between labels needs to be mined. For example, if an image is labeled with tropical rainforest and soccer, it is highly likely to also carry the label Brazil; a document labeled as entertainment is less likely to be related to politics. Effectively mining the correlation among labels is the key to successful multi-label learning.
The inventors found that a multi-label image recognition method based on DCGAN and GCN can be formed by generating data with a GAN algorithm and mapping label features to the corresponding label classifiers with a graph convolutional neural network.
Disclosure of Invention
To overcome the defects of the prior art, the embodiments of the present disclosure provide a multi-label image recognition method based on DCGAN and GCN, which extracts features with a deep learning method, addresses the recognition problem of multi-label images, and reduces labor cost.
To achieve this purpose, the following technical scheme is adopted:
a multi-label recognition algorithm and a multi-label recognition system based on DCGAN and GCN are provided.
In a first aspect, the present disclosure provides a multi-label recognition algorithm based on DCGAN and GCN, including:
constructing a DCGAN model based on the GAN model, and generating similar images based on the DCGAN model;
extracting features with a transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of the multi-label images, and generating a class label classifier with the GCN algorithm from a relation graph among the training labels;
and classifying and recognizing the multi-label images based on the class label classifier generated by the GCN algorithm, in which the features extracted by the CNN algorithm are dot-multiplied with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and the images are recognized by the multi-label classifier.
In a second aspect, the present disclosure provides a multi-label recognition system with self-supervised learning based on DCGAN and GCN, comprising a picture generation module configured to generate similar images based on a DCGAN model;
a feature extraction module configured to extract features with the transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of the multi-label images, and to generate a class label classifier with the GCN algorithm from a relation graph among the training labels;
and an image recognition module configured to classify and recognize the multi-label images based on the GCN algorithm, in which the features extracted by the CNN algorithm are dot-multiplied with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and the images are recognized by the multi-label classifier.
In a third aspect, the present disclosure provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the multi-label recognition algorithm based on DCGAN and GCN described in the first aspect.
In a fourth aspect, the present disclosure provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and executed on the processor, wherein the computer instructions, when executed by the processor, implement the multi-label recognition algorithm based on DCGAN and GCN described in the first aspect.
Compared with the prior art, the beneficial effect of this disclosure is:
The data generated by the DCGAN model and the data in the original data set are mixed into a new data set, so that the multi-label images are more diverse and the features produced during training are easier to classify and recognize. This alleviates the overfitting that may occur during training, and random noise is added during training to enhance the robustness of the pre-training model;
the deep convolutional generative adversarial network (DCGAN) algorithm is selected to generate realistic pictures resembling those in the data set, which preserves the integrity of the pictures and increases the diversity of pictures in the data set;
a pre-training model is obtained by having the deep convolutional generative adversarial network generate data, and the parameters of the convolutional neural network of the pre-training model are transferred to the target task to fine-tune the network, so as to obtain a more accurate image recognition result. In addition, random noise is added when the images are generated, which can improve the robustness of the pre-training model.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a schematic flow chart of a multi-label image recognition method based on DCGAN and GCN according to the present disclosure;
FIG. 2 is a schematic flow chart of the DCGAN algorithm of the embodiment of the present disclosure;
FIG. 3 shows an original picture (a) and a picture generated by the DCGAN model (b) according to the embodiment of the present disclosure;
FIG. 4 is a graph of a loss function of an embodiment of the present disclosure;
FIG. 5 shows a ResNet shallow residual unit (a) and a deep residual unit (b) of an embodiment of the present disclosure;
FIG. 6 is a label dependency modeling diagram of an embodiment of the present disclosure;
FIG. 7 is a structure diagram of a graph convolutional neural network according to an embodiment of the present disclosure.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments of the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The deep convolutional generative adversarial network (DCGAN) algorithm is an unsupervised generative model that improves on the generative adversarial network (GAN) model. Compared with a general GAN algorithm, it generates data better, trains more stably, and produces more diverse samples.
The graph convolutional neural network (GCN) algorithm is a semi-supervised learning method that is trained end to end using node attributes and node labels; its core idea is to update the representation of each node by passing information between nodes. The graph convolutional neural network constructs a directed graph among multiple objects according to their topological structure and expresses the correlation among labels as a graph, which is simple and easy to understand.
Example 1
As shown in FIG. 1, the present disclosure provides a multi-label recognition algorithm based on DCGAN and GCN, including:
constructing a DCGAN model based on the GAN model, and generating similar images based on the DCGAN model;
extracting features with a transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of the multi-label images, and generating a class label classifier with the GCN algorithm from a relation graph among the training labels;
and classifying and recognizing the multi-label images based on the class label classifier generated by the GCN algorithm, in which the features extracted by the CNN algorithm are dot-multiplied with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and the images are recognized by the multi-label classifier.
Further, generating similar images based on the DCGAN model comprises generating similar images with the DCGAN model to augment the training data of the data set, and mixing the data generated by the DCGAN model with the data in the original data set to form a new data set.
Specifically,
generating data with the DCGAN algorithm is equivalent to a pretext task used for pre-training. During training, the added training data can alleviate overfitting and enhance the robustness of the algorithm. The present disclosure selects the deep convolutional generative adversarial network (DCGAN) algorithm to generate realistic pictures resembling those in the data set, which preserves the integrity of the pictures and increases the diversity of pictures in the data set.
The DCGAN model turns the generator G and the discriminator D of the GAN model into two convolutional neural networks.
FIG. 2 shows the flow of the DCGAN algorithm. The generator is replaced by a pre-trained ResNet-101 network. Of the two convolutional neural networks, the G network uses ReLU as its activation function with a Tanh function in the last layer, and the D network uses LeakyReLU as its activation function. A BN (batch normalization) layer is used in both the generator G and the discriminator D; applying normalization after the convolutional layers helps the network converge quickly.
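The activation and normalization choices named above can be illustrated with a minimal scalar sketch in plain Python. It is not the networks of the disclosure: the LeakyReLU slope of 0.2 and the epsilon value are assumed for illustration only, and the learnable scale and shift of batch normalization are omitted.

```python
import math

def relu(x):
    """ReLU, the activation the text assigns to the generator G."""
    return max(0.0, x)

def leaky_relu(x, slope=0.2):
    """LeakyReLU, the activation assigned to the discriminator D (slope is an assumed value)."""
    return x if x > 0 else slope * x

def tanh_out(x):
    """Tanh, used in the last layer of G; squashes outputs into (-1, 1)."""
    return math.tanh(x)

def batch_norm(values, eps=1e-5):
    """Normalize a batch of activations to zero mean and roughly unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

normalized = batch_norm([1.0, 2.0, 3.0])
print(relu(-1.0), leaky_relu(-1.0), tanh_out(0.0))
print(normalized)
```

In a real DCGAN these functions act element-wise on convolutional feature maps; the scalar form only shows what each one computes.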
During model training, the goal of the generator network G is to produce pictures realistic enough to deceive the discriminator network D, while the goal of D is to distinguish the pictures generated by G from real pictures as well as possible. Through this game between the generator G and the discriminator D, pictures that can pass for real ones are produced.
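The game between G and D is commonly written as the minimax objective of the original GAN formulation. The source does not state this formula, so it is given here only as the standard form, with $p_{\text{data}}$ denoting the distribution of real images and $p_{z}$ the distribution of the input noise:

```latex
\min_{G}\max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to increase both terms, while G is trained to decrease the second term by making $D(G(z))$ close to 1.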
As shown in FIG. 3, part of the images in the data set are fed into the deep convolutional generative adversarial network (DCGAN) model for training. Through continued training, the game between the generator and the discriminator finally produces pictures similar to the images in the original data set.
As shown in FIG. 4, the data generated by the DCGAN model are mixed with the data in the original data set to form a new data set, so that the multi-label images are more diverse and the features produced during training are easier to classify and recognize. This also alleviates the overfitting that may occur during training, and random noise is added during training to enhance the robustness of the pre-training model.
FIG. 4 shows the loss function curve of the DCGAN during training. The relevant experiments of the present disclosure were carried out on the PASCAL VOC data set, and this section shows the results of an algorithm trained on that data set.
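The mixing step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_mixed_dataset`, the file names, and the fixed shuffle seed are hypothetical and do not come from the source, which does not describe how the merged set is stored or ordered.

```python
import random

def build_mixed_dataset(original, generated, seed=0):
    """Merge original and DCGAN-generated samples into one shuffled list."""
    # Tag each sample with its origin so the two sources can be told apart later.
    mixed = [(img, "original") for img in original]
    mixed += [(img, "generated") for img in generated]
    # Shuffle with a fixed seed so the merged order is reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed

original_images = ["voc_0001.jpg", "voc_0002.jpg", "voc_0003.jpg"]  # placeholder names
generated_images = ["gen_0001.png", "gen_0002.png"]                 # placeholder names
mixed = build_mixed_dataset(original_images, generated_images)
print(len(mixed))
```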
Further, the transfer-based CNN feature extraction comprises transferring the parameters of the generator's neural network in the DCGAN algorithm to the CNN algorithm to extract features from the multi-label images in the newly merged training set, and, after the loss function has been computed on the input training set, using back propagation to fine-tune the network.
Further, the residual network corresponding to the generator pre-trained while the DCGAN generates images is used as the CNN algorithm for feature extraction, and the image features are obtained by global max pooling;
optionally, the image label relation graph is input into the GCN algorithm, and a class classifier is generated through training by mapping the labels of the images.
Specifically,
the generator G that is pre-trained while the DCGAN generates images uses the residual network ResNet-101. After pre-training, the ResNet-101 parameters of the generator G are transferred to the CNN algorithm, so that features can be extracted from the images of the training set; at the same time, after the loss function has been computed on the input training set, back propagation is used to fine-tune the network.
In the present disclosure, the residual network ResNet-101 corresponding to the pre-trained generator G is used as the CNN algorithm for feature extraction, and the feature x of an image is finally obtained by global max pooling:
$x = f_{GMP}(f_{cnn}(I; \theta_{CNN})) \in \mathbb{R}^{D}$ (1)
In formula (1), $\theta_{CNN}$ denotes the model parameters, $D = 2048$, and $I$ denotes the image.
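As a small illustration of the global max pooling in formula (1), the sketch below reduces a feature map with D channels to a length-D vector by taking the maximum over each channel. The nested-list layout and the tiny 3-channel example are assumptions made for readability; in the disclosure D is 2048 and the feature map comes from ResNet-101.

```python
def global_max_pool(feature_map):
    """Reduce a D x h x w feature map (nested lists) to a length-D vector."""
    # For every channel, keep the single largest activation over all positions.
    return [max(max(row) for row in channel) for channel in feature_map]

# Toy feature map with D = 3 channels of size 2 x 2.
fmap = [
    [[0.1, 0.5], [0.3, 0.2]],
    [[-1.0, -0.4], [-0.7, -0.9]],
    [[2.0, 0.0], [1.5, 2.5]],
]
x = global_max_pool(fmap)
print(x)  # [0.5, -0.4, 2.5]
```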
As shown in FIG. 5, ResNet uses two types of residual unit, a shallow residual unit and a deep residual unit; the ResNet-101 algorithm uses deep residual units.
The ResNet-101 parameters are shown in Table 1. ResNet differs mainly in that it downsamples directly with a stride of 2 and replaces the fully connected layer with a global average pooling layer, while maintaining the complexity of the network layers. As can be seen from the table, the deeper networks perform residual learning across three layers, whose convolution kernels are 1x1, 3x3 and 1x1 respectively.
TABLE 1 ResNet-101 Algorithm parameters
[Table 1 is provided as an image in the original publication: Figure BDA0003133733140000091]
First, all of the labels in the training set are input, the correlation among the class labels is learned through the GCN, and the label correlation matrix is used as the initial adjacency matrix for learning the probabilities among labels.
Second, the correlation dependence between labels is modeled in the form of conditional probability, and a correlation coefficient matrix is constructed.
Finally, after the features are extracted by the CNN algorithm, they are dot-multiplied with the output matrix obtained from GCN training to obtain the vector used for classification, and multi-label classification is performed with a cross-entropy loss function.
The GCN algorithm learns the semantic features of the labels corresponding to the images. The label information is embedded with a pre-trained GloVe language model to obtain the input matrix of the GCN. During training, the input is the full set of labels in the training set, the correlation among the class labels is learned through the GCN, and the label correlation matrix serves as the initial adjacency matrix for learning the probabilities among labels.
As shown in FIG. 6, the present disclosure models the correlation dependence between labels in the form of conditional probabilities.
From FIG. 6 it can be seen that $P(L_j \mid L_i)$ is not equal to $P(L_i \mid L_j)$, and therefore the correlation coefficient matrix is not symmetric.
Constructing the correlation coefficient matrix comprises the following steps:
(1) counting the occurrences of label pairs in the training data set to obtain a co-occurrence matrix $M$ of size $C \times C$;
(2) using the label co-occurrence matrix to obtain the conditional probability matrix $P_i = M_i / N_i$, where $N_i$ is the number of times label $L_i$ appears in the training data set;
(3) carrying out binarization to eliminate the noise introduced by the co-occurrence probabilities. A threshold $\tau$ is used to filter noisy edges, where $A$ (of size $C \times C$) is the binary correlation coefficient matrix:
$A_{ij} = \begin{cases} 0, & \text{if } P_{ij} < \tau \\ 1, & \text{if } P_{ij} \geq \tau \end{cases}$ (2)
(4) during training, the word embedding vectors corresponding to the image labels of the training set are obtained, which gives the GCN input $H$ (of size $C \times d$) and the adjacency matrix $A$. $H$ and $A$ are fed into the GCN together, which finally yields an output matrix of size $C \times D$. The output matrix of the GCN is dot-multiplied with the output vector of each image from the CNN to obtain the vector used for classification, after which a cross-entropy loss function is used for multi-label classification and the parameters are adjusted by back propagation.
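Steps (1) to (3) above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation of the disclosure: the function name `correlation_matrices`, the toy label lists, and the threshold value 0.6 are assumptions, and $N_i$ is taken to be the occurrence count of label $i$ so that $M_{ij} / N_i$ estimates $P(L_j \mid L_i)$.

```python
def correlation_matrices(label_lists, num_labels, tau):
    """Return the conditional probability matrix P and the binary matrix A."""
    # Step (1): count label co-occurrences M[i][j] and label occurrences N[i].
    M = [[0] * num_labels for _ in range(num_labels)]
    N = [0] * num_labels
    for labels in label_lists:
        for i in labels:
            N[i] += 1
            for j in labels:
                if i != j:
                    M[i][j] += 1
    # Step (2): P[i][j] = M[i][j] / N[i] estimates P(L_j | L_i).
    P = [[M[i][j] / N[i] if N[i] else 0.0 for j in range(num_labels)]
         for i in range(num_labels)]
    # Step (3): binarize with threshold tau to drop noisy edges.
    A = [[1 if P[i][j] >= tau else 0 for j in range(num_labels)]
         for i in range(num_labels)]
    return P, A

# Toy training set: each entry lists the label indices present in one image.
label_lists = [[0], [0], [0, 1], [0, 1], [1, 2]]
P, A = correlation_matrices(label_lists, num_labels=3, tau=0.6)
print(P[0][1], P[1][0])  # P(L_1 | L_0) differs from P(L_0 | L_1)
print(A)                 # [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
```

The resulting matrix is not symmetric, which matches the observation that $P(L_j \mid L_i)$ and $P(L_i \mid L_j)$ generally differ. The diagonal is left at zero here; whether self-loops are added before the matrix is used as the adjacency matrix is not specified in the text.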
In the testing stage, after a picture is input, its features are extracted by the CNN algorithm and dot-multiplied with the output matrix obtained from GCN training to obtain the vector used for classification, and multi-label classification is carried out with the classifier trained under the cross-entropy loss function.
FIG. 7 shows the structure of the graph convolutional neural network. The whole graph is taken as input. In convolutional layer 1, a convolution is performed over the neighbors of each node and the node is updated with the result; the output then passes through an activation function such as ReLU, followed by convolutional layer 2 and another activation function. This process is repeated until the number of layers reaches the desired depth.
Similar to the graph neural network (GNN) algorithm, the graph convolutional neural network also has a local output function that converts node states (including hidden states and node features) into task-related labels, for example to classify paid-poster (spam) accounts; there are also tasks that classify the whole graph, such as compound classification.
Unlike standard convolution methods, the goal of a graph convolutional neural network is to learn a function $f(\cdot, \cdot)$ on a graph G. The input to this function is a feature description $H^{l}$ and a relationship matrix $A \in \mathbb{R}^{n \times n}$, and the function updates the node features to $H^{l+1} \in \mathbb{R}^{n \times d'}$. Each GCN layer can be written as a non-linear function:
$H^{l+1} = f(H^{l}, A)$ (3)
$f(\cdot, \cdot)$ may be expressed as:
[Equation (4) is provided as an image in the original publication: Figure BDA0003133733140000111]
As can be seen from the formula, complex relationships between nodes can be modeled by stacking multiple GCN layers.
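Because the explicit form of $f(\cdot, \cdot)$ is available only as an image in this text, the sketch below assumes the commonly used propagation rule $H^{l+1} = h(A H^{l} W^{l})$, where $W^{l}$ is a learnable weight matrix and $h$ is ReLU. The helper names, the already-normalized toy adjacency matrix, and the weight values are all illustrative assumptions rather than values from the disclosure.

```python
def matmul(X, Y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def gcn_layer(H, A, W):
    """One assumed propagation step: ReLU(A @ H @ W)."""
    Z = matmul(matmul(A, H), W)
    return [[max(0.0, v) for v in row] for row in Z]

A = [[1.0, 0.0], [0.5, 0.5]]               # toy relationship matrix, n = 2 (assumed normalized)
H0 = [[1.0, 2.0], [3.0, 4.0]]              # node features, n x d with d = 2
W0 = [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]   # layer 1 weights, d x d' with d' = 3
W1 = [[1.0], [1.0], [1.0]]                 # layer 2 weights, 3 x 1

H1 = gcn_layer(H0, A, W0)   # first layer mixes each node with its neighbors
H2 = gcn_layer(H1, A, W1)   # stacking a second layer reaches neighbors of neighbors
print(H1)  # [[1.0, 2.0, 0.0], [2.0, 3.0, 0.0]]
print(H2)  # [[3.0], [4.0]]
```

Each additional layer lets a node's representation depend on nodes one hop further away, which is why stacking layers can capture more complex relationships.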
A graph is constructed among the target labels of the multi-label images in the data set, in which each node (label) is represented by a word vector (word embedding). The graph convolutional neural network (GCN) maps the label graph into a set of interdependent target classifiers. During training, a GCN-based mapping function learns the interdependent object classifiers $W = \{w_i\}_{i=1}^{C}$ from the label features.
Stacked GCNs are used in this work, where each GCN layer $l$ takes the node representation $H^{l}$ of the previous layer as input and outputs a new node representation $H^{l+1}$. The input to the first layer is the word embedding matrix $H \in \mathbb{R}^{C \times d}$, and the output of the last layer is the classifier matrix $W \in \mathbb{R}^{C \times D}$.
Finally, the semantic feature vector matrix (a $C \times D$ matrix) of the category classifier generated by the GCN is dot-multiplied with the feature vector extracted by the ResNet-101 algorithm to obtain the vector used for classification, and the classifier is then trained to classify and recognize multi-label images.
By applying the learned classifier to the image features, the prediction score is obtained:
$\hat{y} = W x$ (5)
Suppose that the true label of an image is $y \in \mathbb{R}^{C}$ and that there are C label classes in total. The loss function of the whole multi-label classification and recognition network is:
$\mathcal{L} = -\sum_{c=1}^{C} \left[ y^{c} \log\left(\sigma(\hat{y}^{c})\right) + (1 - y^{c}) \log\left(1 - \sigma(\hat{y}^{c})\right) \right]$ (6)
In the above formula, $\sigma(\cdot)$ is the sigmoid function.
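The scoring and loss computation can be sketched as follows, assuming the prediction is the product of the classifier matrix $W$ ($C \times D$) with the image feature $x$ (length $D$) and the loss is the usual sigmoid cross-entropy summed over the C labels with a leading minus sign. The toy values of `W`, `x`, and `y_true` are made up for illustration.

```python
import math

def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_scores(W, x):
    """Apply the C x D classifier matrix W to the length-D image feature x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def multilabel_loss(y_true, y_score):
    """Sigmoid cross-entropy summed over all C labels."""
    total = 0.0
    for y, s in zip(y_true, y_score):
        p = sigmoid(s)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

W = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]  # toy classifier matrix, C = 3, D = 2
x = [2.0, 1.0]                             # toy image feature
y_true = [1, 0, 1]                         # toy ground-truth labels

scores = predict_scores(W, x)
loss = multilabel_loss(y_true, scores)
print(scores)  # [2.0, -1.0, 1.5]
print(loss)
```

At test time only `predict_scores` is needed; a label can be reported as present when its sigmoid score exceeds a chosen threshold, which is a design choice the text does not fix.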
Example 2
The multi-label recognition system with self-supervised learning based on DCGAN and GCN is implemented on a server, and the server comprises:
a picture generation module configured to generate similar images based on a DCGAN model;
a feature extraction module configured to extract features with the transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of the multi-label images, and to generate a class label classifier with the GCN algorithm from a relation graph among the training labels;
and an image recognition module configured to classify and recognize the multi-label images based on the GCN algorithm, in which the features extracted by the CNN algorithm are dot-multiplied with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and the images are recognized by the multi-label classifier.
Example 3
A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the multi-label recognition algorithm based on DCGAN and GCN described in the first aspect.
Example 4
An electronic device comprising a memory, a processor, and computer instructions stored in the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the multi-label recognition algorithm based on DCGAN and GCN described in the first aspect.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A multi-label recognition algorithm based on DCGAN and GCN, characterized by comprising the following steps:
constructing a DCGAN model based on the GAN model, and generating similar images based on the DCGAN model;
extracting features with a transfer-based CNN algorithm, in which parameters of a neural network of the DCGAN model are transferred to the CNN algorithm to extract features of the multi-label images, and generating a class label classifier with the GCN algorithm from a relation graph among the training labels;
and classifying and recognizing the multi-label images based on the class label classifier generated by the GCN algorithm, in which the features extracted by the CNN algorithm are dot-multiplied with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and the images are recognized by the multi-label classifier.
2. The DCGAN and GCN based multi-label recognition algorithm of claim 1, wherein the DCGAN model based generation of similar images comprises generating training data of similar image augmentation data sets by the DCGAN model; and mixing the data generated in the DCGAN model with the data in the original data set into a new data set.
3. The DCGAN and GCN based multi-label recognition algorithm of claim 1, wherein the migration based CNN algorithm feature extraction comprises migrating the neural network parameters of the generator in the DCGAN algorithm into the CNN algorithm to perform feature extraction on the multi-label images in the newly merged training set, and using the back propagation of the algorithm after the loss function is calculated in the input training set to realize the fine tuning of the network;
and adopting a residual error network corresponding to a pre-trained generator when the DCGAN generates the image as a CNN algorithm for extracting the features, and acquiring the features of the image by adopting global maximum pooling.
4. The multi-label recognition algorithm based on DCGAN and GCN as claimed in claim 1, wherein the residual network corresponding to the pre-trained generator when DCGAN is used to generate image is used as CNN algorithm for extracting features, and the features of image are obtained by global maximum pooling.
5. The multi-label recognition algorithm based on DCGAN and GCN as claimed in claim 1, wherein the number of all labels in the training set is inputted, the correlation between each class label is learned by GCN, and the probability between the learned labels is trained by using the initial adjacency matrix.
6. The DCGAN and GCN based multi-label recognition algorithm of claim 1, wherein the correlation dependence between the labels is modeled in the form of conditional probability, and a correlation coefficient matrix is constructed.
7. The DCGAN and GCN based multi-label recognition algorithm of claim 1, wherein, after the features are extracted by the CNN algorithm, the vector used for classification is obtained by taking their dot product with the output matrix obtained from GCN training, and multi-class classification is performed using a cross-entropy loss function.
8. A multi-label recognition system based on DCGAN and GCN, implemented on a server, characterized in that the server comprises:
a picture generation module configured to generate similar images based on a DCGAN model;
a feature extraction module configured to, on the basis of the similar images generated by the DCGAN model, extract features using a transfer-based CNN algorithm, transfer the neural network parameters of the DCGAN model into the CNN algorithm to extract features of the multi-label images, and generate a class label classifier with the GCN algorithm from the relationship graph among the training labels;
and an image recognition module configured to generate a class label classifier based on the GCN algorithm and to classify and recognize the multi-label image by taking the dot product of the features extracted by the CNN algorithm with the semantic feature vector matrix of the class classifier generated by the GCN algorithm, and recognizing the image with the resulting multi-label classifier.
9. A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor of a terminal device to execute the DCGAN and GCN based multi-label recognition algorithm according to any one of claims 1-7.
10. A terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions, and the computer-readable storage medium storing a plurality of instructions adapted to be loaded by the processor to execute the DCGAN and GCN based multi-label recognition algorithm according to any one of claims 1-7.
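
The computations named in claims 3 to 7 (global max pooling, a conditional-probability correlation matrix, a GCN-generated classifier, and the dot product with CNN features) can be illustrated with a minimal pure-Python sketch. This is not the patented implementation. All function names, the toy data, the single linear graph-convolution layer, the symmetric normalization with self-loops, and the per-label binary form of the cross-entropy loss are assumptions made for illustration only; the claims do not specify them.

```python
import math

def global_max_pool(feature_map):
    """Reduce a C x H x W feature map to a length-C vector (max of each channel)."""
    return [max(max(row) for row in channel) for channel in feature_map]

def cooccurrence_correlation(label_sets, num_labels):
    """Estimate A[i][j] = P(label j | label i) from training-label co-occurrence."""
    occur = [0] * num_labels
    co = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for i in labels:
            occur[i] += 1
            for j in labels:
                if i != j:
                    co[i][j] += 1
    return [[co[i][j] / occur[i] if occur[i] else 0.0
             for j in range(num_labels)] for i in range(num_labels)]

def normalize_adjacency(A):
    """Assumed normalization: add self-loops, then D^-1/2 (A + I) D^-1/2."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]  # each degree is >= 1 because of the self-loop
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(X, Y):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def gcn_layer(A_norm, H, W):
    """One linear graph-convolution step: A_norm @ H @ W (no activation)."""
    return matmul(matmul(A_norm, H), W)

def classify(features, classifiers):
    """Dot each per-label classifier row with the image feature, then apply sigmoid."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in classifiers]
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

def multilabel_bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over labels (an assumed form of the loss)."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(probs, targets)) / len(probs)

# Toy data (hypothetical): 4 training images, 3 labels, 4 feature channels.
label_sets = [{0, 1}, {0, 1}, {0, 2}, {1}]
A = cooccurrence_correlation(label_sets, 3)
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                  # label embeddings, 3 x 2
W = [[0.5, -0.2, 0.1, 0.0], [0.3, 0.4, -0.1, 0.2]]        # layer weights, 2 x 4
classifiers = gcn_layer(normalize_adjacency(A), H, W)     # one row per label, 3 x 4

feature_map = [[[0.2, 0.9], [0.1, 0.3]], [[0.1, 0.0], [0.05, 0.1]],
               [[0.4, 0.4], [0.2, 0.1]], [[0.7, 0.6], [0.0, 0.5]]]
features = global_max_pool(feature_map)                   # length-4 image feature
probs = classify(features, classifiers)                   # one probability per label
loss = multilabel_bce(probs, [1, 1, 0])
```

In this sketch the GCN output has one row per label and as many columns as the pooled image feature has channels, which is what allows the dot product in claim 7 to yield one score per label.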
CN202110713085.8A 2021-06-25 2021-06-25 Multi-label image identification method and system based on DCGAN and GCN Active CN113378965B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110713085.8A CN113378965B (en) 2021-06-25 2021-06-25 Multi-label image identification method and system based on DCGAN and GCN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110713085.8A CN113378965B (en) 2021-06-25 2021-06-25 Multi-label image identification method and system based on DCGAN and GCN

Publications (2)

Publication Number Publication Date
CN113378965A (en) 2021-09-10
CN113378965B (en) 2022-09-02

Family

ID=77579281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110713085.8A Active CN113378965B (en) 2021-06-25 2021-06-25 Multi-label image identification method and system based on DCGAN and GCN

Country Status (1)

Country Link
CN (1) CN113378965B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183464A (en) * 2020-10-26 2021-01-05 天津大学 Video pedestrian identification method based on deep neural network and graph convolution network
CN113868240A (en) * 2021-11-30 2021-12-31 深圳佑驾创新科技有限公司 Data cleaning method and computer readable storage medium
CN114612681A (en) * 2022-01-30 2022-06-10 西北大学 GCN-based multi-label image classification method, model construction method and device
CN114882279A (en) * 2022-05-10 2022-08-09 西安理工大学 Multi-label image classification method based on transductive semi-supervised deep learning
CN115240037A (en) * 2022-09-23 2022-10-25 卡奥斯工业智能研究院(青岛)有限公司 Model training method, image processing method, device and storage medium
CN115439845A (en) * 2022-08-02 2022-12-06 北京邮电大学 Image extrapolation method and device based on graph neural network, storage medium and terminal
CN115909390A (en) * 2021-09-30 2023-04-04 腾讯科技(深圳)有限公司 Vulgar content identification method, vulgar content identification device, computer equipment and storage medium
CN117392470A (en) * 2023-12-11 2024-01-12 安徽中医药大学 Fundus image multi-label classification model generation method and system based on knowledge graph

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816009A (en) * 2019-01-18 2019-05-28 南京旷云科技有限公司 Multi-label image classification method, device and equipment based on graph convolution
CN110084296A (en) * 2019-04-22 2019-08-02 中山大学 A kind of figure expression learning framework and its multi-tag classification method based on certain semantic
CN110516561A (en) * 2019-08-05 2019-11-29 西安电子科技大学 SAR image target recognition method based on DCGAN and CNN
CN111276240A (en) * 2019-12-30 2020-06-12 广州西思数字科技有限公司 Multi-label multi-mode holographic pulse condition identification method based on graph convolution network
CN112183464A (en) * 2020-10-26 2021-01-05 天津大学 Video pedestrian identification method based on deep neural network and graph convolution network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
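Alec Radford et al.: "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", arXiv *
Long Cheng (龙程): "Research and Implementation of Image Dataset Augmentation Based on Adversarial Networks", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology Series *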

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183464A (en) * 2020-10-26 2021-01-05 天津大学 Video pedestrian identification method based on deep neural network and graph convolution network
CN115909390A (en) * 2021-09-30 2023-04-04 腾讯科技(深圳)有限公司 Vulgar content identification method, vulgar content identification device, computer equipment and storage medium
CN113868240A (en) * 2021-11-30 2021-12-31 深圳佑驾创新科技有限公司 Data cleaning method and computer readable storage medium
CN114612681A (en) * 2022-01-30 2022-06-10 西北大学 GCN-based multi-label image classification method, model construction method and device
CN114882279A (en) * 2022-05-10 2022-08-09 西安理工大学 Multi-label image classification method based on transductive semi-supervised deep learning
CN114882279B (en) * 2022-05-10 2024-03-19 西安理工大学 Multi-label image classification method based on transductive semi-supervised deep learning
CN115439845A (en) * 2022-08-02 2022-12-06 北京邮电大学 Image extrapolation method and device based on graph neural network, storage medium and terminal
CN115240037A (en) * 2022-09-23 2022-10-25 卡奥斯工业智能研究院(青岛)有限公司 Model training method, image processing method, device and storage medium
CN117392470A (en) * 2023-12-11 2024-01-12 安徽中医药大学 Fundus image multi-label classification model generation method and system based on knowledge graph
CN117392470B (en) * 2023-12-11 2024-03-01 安徽中医药大学 Fundus image multi-label classification model generation method and system based on knowledge graph

Also Published As

Publication number Publication date
CN113378965B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN113378965B (en) Multi-label image identification method and system based on DCGAN and GCN
Iscen et al. Label propagation for deep semi-supervised learning
De Rezende et al. Exposing computer generated images by using deep convolutional neural networks
CN111582409B (en) Training method of image tag classification network, image tag classification method and device
Socher et al. Parsing natural scenes and natural language with recursive neural networks
Zhang et al. Patch strategy for deep face recognition
CN111476315A (en) Image multi-label identification method based on statistical correlation and graph convolution technology
US11816882B2 (en) Image recognition learning device, image recognition device, method and program
CN107491782B (en) Image classification method for small amount of training data by utilizing semantic space information
CN112396106A (en) Content recognition method, content recognition model training method, and storage medium
CN111985520A (en) Multi-mode classification method based on graph convolution neural network
CN112101364A (en) Semantic segmentation method based on parameter importance incremental learning
CN114037055A (en) Data processing system, method, device, equipment and storage medium
CN114298179A (en) Data processing method, device and equipment
CN113987236A (en) Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network
Quiroga et al. A study of convolutional architectures for handshape recognition applied to sign language
CN116958729A (en) Training of object classification model, object classification method, device and storage medium
EP3910549A1 (en) System and method for few-shot learning
Tang et al. HRRegionNet: Chinese character segmentation in historical documents with regional awareness
Rad et al. A multi-view-group non-negative matrix factorization approach for automatic image annotation
CN113627466A (en) Image tag identification method and device, electronic equipment and readable storage medium
CN113569094A (en) Video recommendation method and device, electronic equipment and storage medium
CN114692715A (en) Sample labeling method and device
Kampffmeyer Advancing Segmentation and Unsupervised Learning Within the Field of Deep Learning
Du et al. One-stage object detection with graph convolutional networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant