CN110929029A - Text classification method and system based on graph convolution neural network - Google Patents

Text classification method and system based on graph convolution neural network

Info

Publication number
CN110929029A
CN110929029A (application CN201911064089.7A)
Authority
CN
China
Prior art keywords
text
graph
node
layer
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911064089.7A
Other languages
Chinese (zh)
Inventor
唐钰葆
于静
曹聪
刘燕兵
谭建龙
郭莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201911064089.7A
Publication of CN110929029A
Legal status: pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a text classification method and system based on a graph convolution neural network. The method comprises the following steps: 1) for each category-labeled text in a text training set of a target field, generating a text feature vector for the text according to the term frequency and inverse document frequency of the words in the text; combining all text feature vectors into a text feature matrix, namely a TF-IDF matrix, and constructing a graph structure of the text training set according to the word-vector similarity of the words; 2) training a graph convolution neural network using the graph structure and the text feature matrix; 3) for a text a to be classified in the target field, inputting the text feature vector of text a into the trained graph convolution neural network to obtain the category of text a. The invention not only considers the semantic structure information of the text but also captures the hidden features of the text from another angle, and achieves high classification accuracy.

Description

Text classification method and system based on graph convolution neural network
Technical Field
The invention belongs to the field of graph data mining and graph classification, and particularly relates to a text classification method and system based on a graph convolution neural network.
Background
With the arrival of big data, data volumes are growing explosively and the relationships among massive heterogeneous data are becoming ever tighter. A graph is a common abstract data structure that represents relationships between things. Closely related data elements in real life, such as social networks and academic networks, can be represented as graph data, and many practical problems can thus be converted into graph data-mining problems. For example, the social software WeChat takes WeChat accounts as nodes and the relationships between them, such as friend relationships, comments, and likes, as the edges of a graph, thereby constructing graph-structured data. Graph data classification is accordingly a research focus in large-scale data processing. Graph classification, i.e., automatically distinguishing and categorizing graphs of different types, is mainly applied to the identification of emergent and terrorist behavior, social network relationship classification, chemical molecule classification, and the like.
Graph classification can provide important technical means for data analysis and understanding in different fields, and related research and applications have attracted much attention. Although graph classification plays an important role in many areas of society, it still faces many technical challenges.
Graph data exhibits strong local coupling: nodes are interrelated, so the representation of a graph must contain both its structural information and its attributes. Existing data representations mainly target serialized documents, structured images, and the like, and are difficult to extend to graphs, so graph classification faces serious challenges.
Moreover, the feature representation of a graph, namely computing node features from the connection relationships among nodes, and the training of a classifier on that feature set are two independent processes. Each requires separate design and optimization, and even if each step is individually optimal, a classifier with the best overall effect is hard to guarantee.
As discussed above, graph classification plays an important role in many fields but faces the challenges of strong local coupling and difficult feature representation. Applications of graph classification include the chemical molecule classification and relationship-network entity classification mentioned above; this application targets the text classification task. Text classification means preprocessing given labeled text content and classifying the text with some algorithm or model. There are two main categories of text classification methods. The first is traditional text classification, which consists of feature extraction followed by classification with a classifier. The second uses deep learning: instead of manually extracting features, a deep learning model learns the features and specific pattern rules in the text, a classification model is obtained by training, and the text is then classified with that model. Common models include LSTM, CNN, RNN, GRU, etc. Despite their advantages, these methods have difficulty guaranteeing an overall-optimal classification model.
Disclosure of Invention
The application provides a text classification method and system based on a graph convolution neural network. The texts in the invention are natural-language texts, for example news texts in categories such as entertainment, finance, and military news. The basic idea of the method is to represent a text as a graph structure, taking into account both the semantic structure relationships of the text and its features, and to construct a graph convolution neural network that classifies the graph data end to end: the graph-structured text information and the text features are used directly as input, and the output is the category, i.e., the label, of each text. By representing text as a graph structure, the semantic structure information of the text is considered and hidden features of the text are captured from another angle; after processing by the graph convolution neural network, the results are competitive with mainstream text classification methods. The algorithm flow chart of the invention is shown in fig. 1.
A text classification method based on a graph convolution neural network comprises the following steps:
1) preprocessing the text: word segmentation, stop-word removal, punctuation removal, TF-IDF matrix computation, and the like;
2) constructing a graph structure from the preprocessed text obtained in step 1): words serve as the nodes of the graph, and for each node the several most similar words (measured by the cosine similarity of the two word vectors; 8 words are selected in this application) serve as its neighbor nodes;
3) preprocessing the graph structure: computing the Laplacian matrix of the graph, and the like;
4) constructing and training a graph convolution neural network comprising an input layer, two hidden layers, and an output layer, where each hidden layer comprises a graph convolution layer, an activation layer, and a pooling layer;
5) preprocessing the text to be classified, constructing its text feature matrix and graph structure as the input of the graph convolution neural network, and classifying it with the network trained in step 4) to obtain the category of the text.
Furthermore, the graph convolution neural network comprises an input layer and two hidden layers connected in sequence. Each hidden layer comprises a graph convolution layer, a pooling layer, and an activation layer that operate in the same way; the input of the second hidden layer is the output of the first, on which it performs further feature capture, and the second hidden layer is finally connected to the fully connected layer and the softmax output layer. The input layer imports the constructed graph structure and the TF-IDF matrix of the texts into the network for subsequent training. The graph convolution layer performs the convolution operation on the input graph structure and text features, capturing the feature information of the text. The activation layer performs nonlinear activation, using the ReLU activation function, on the features obtained by the graph convolution layer. The pooling layer hierarchically samples the features obtained by the activation layer. The fully connected layer processes the output of the activation layer, integrating the previous layer's output into a richer representation. The softmax layer takes the output of the fully connected layer as input and predicts the category of the corresponding article; its formula is given in the detailed description below. Cross entropy is used as the loss function of the graph convolution neural network.
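To make the layer stack above concrete, the following is a minimal, runnable sketch of the forward pass, not the patent's own implementation: the graph convolution is reduced to a single propagation L_norm @ X @ W instead of the Chebyshev filtering detailed later, the coarsened Laplacian is obtained by simple subsampling purely for illustration, and all sizes, weights, and the toy graph are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, f_in, f_hidden, n_classes = 8, 16, 32, 4

# Toy undirected graph and normalized Laplacian (stand-ins for the word graph).
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0.0)
d = np.maximum(A.sum(axis=1), 1e-12)
L_norm = np.eye(n) - A / np.sqrt(d[:, None] * d[None, :])

X = rng.normal(size=(n, f_in))                 # stand-in for TF-IDF node features
W1 = 0.1 * rng.normal(size=(f_in, f_hidden))
W2 = 0.1 * rng.normal(size=(f_hidden, f_hidden))
W_fc = 0.1 * rng.normal(size=((n // 4) * f_hidden, n_classes))

def hidden_layer(L, H, W):
    """One hidden layer: simplified graph convolution, ReLU, pair-wise max pooling."""
    H = np.maximum(L @ H @ W, 0.0)             # one-hop propagation + ReLU
    H_pooled = np.maximum(H[0::2], H[1::2])    # max-pool consecutive node pairs
    L_coarse = L[0::2, 0::2]                   # crude subsampled "coarse" Laplacian
    return H_pooled, L_coarse

H, Lc = hidden_layer(L_norm, X, W1)            # hidden layer 1: n -> n/2 nodes
H, _ = hidden_layer(Lc, H, W2)                 # hidden layer 2: n/2 -> n/4 nodes
logits = H.reshape(-1) @ W_fc                  # fully connected layer
probs = np.exp(logits - logits.max())
probs /= probs.sum()                           # softmax -> category probabilities
print(probs, probs.argmax())
```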
Further, the TF-IDF matrix of the text is used as the feature matrix of the text. TF-IDF (term frequency-inverse document frequency) is a common statistical weighting technique used to evaluate the importance of a word to one of the documents in a corpus. The importance of a word increases in proportion to the number of times it appears in a document, but at the same time decreases in inverse proportion to the frequency with which it appears in the corpus.
Furthermore, the graph convolution layer operation performs a Fourier transform of the graph structure into the spectral domain, carries out the convolution there, and then applies the inverse Fourier transform, completing the convolution on the graph. Its theoretical basis is spectral graph theory. An undirected connected graph is defined as $G = (V, E, W)$, where $V$ is a finite set of $|V| = n$ nodes, $E$ is a set of edges, and $W \in \mathbb{R}^{n \times n}$ is a weighted adjacency matrix encoding the connection weight between two nodes; the weights are defined according to the specific problem, and in this application $W$ is an unweighted adjacency matrix. A signal $x: V \to \mathbb{R}$ defined on the nodes (i.e., vertices) of the graph can be regarded as a vector $x \in \mathbb{R}^n$, where $x_i$ is the value of $x$ at the $i$-th node. The signal $x$ can be understood as the attribute information contained in a node; in this application, for example, a node is represented by a word vector, which contains the semantic information of the word, i.e., the node's signal. An important operator in spectral graph analysis is the graph Laplacian, whose combinatorial definition is $L = D - W \in \mathbb{R}^{n \times n}$, where $D \in \mathbb{R}^{n \times n}$ is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$. Its normalized form is defined as
$$L = I_n - D^{-1/2} W D^{-1/2}$$
Here $W_{ij}$ is the entry of the adjacency matrix for the $i$-th and $j$-th nodes: 1 if the two nodes are connected by an edge, and 0 otherwise. $I_n$ is the identity matrix, $\mathbb{R}$ denotes the real numbers, $\mathbb{R}^n$ a vector of length $n$, and $\mathbb{R}^{n \times n}$ a matrix of size $n \times n$. Extending the Fourier transform to graph structures, for a function $f \in \mathbb{R}^n$ defined on the nodes of any graph $G$, the graph Fourier transform with respect to the eigenvectors of the graph Laplacian is given by the expansion
$$\hat{f}(\lambda_l) = \sum_{i=1}^{n} f(i)\, u_l^{*}(i)$$
where $n$ is the number of nodes, $u_l$ is the $l$-th Laplacian eigenvector, and $u_l^{*}(i)$ is its coefficient for node $i$; the function $f$ is the general abstract object of the graph Fourier transform formula and represents node information in this invention. The corresponding inverse graph Fourier transform is defined as
$$f(i) = \sum_{l=0}^{n-1} \hat{f}(\lambda_l)\, u_l(i)$$
where $u_l(i)$ is the coefficient for node $i$ in the inverse Fourier transform. In classical Fourier analysis, eigenvalues carry the notion of frequency: when an eigenvalue is close to 0, i.e., at low frequency, the associated complex exponential eigenfunction is smooth and slowly varying; conversely, when the eigenvalue is far from 0, i.e., at high frequency, the corresponding eigenfunction fluctuates rapidly. For graph structures, the graph Laplacian eigenvalues and eigenvectors play an analogous role: the frequency of the conventional Fourier transform is analogous to the Laplacian eigenvalues/eigenvectors of the graph Fourier transform.
The graph Laplacian matrix $L$ obtained above is a real symmetric positive semi-definite matrix. Its eigenvalue decomposition yields an orthonormal set of eigenvectors $\{u_l\}_{l=0}^{n-1}$ (called the graph Fourier modes) and the associated nonnegative eigenvalues $\{\lambda_l\}_{l=0}^{n-1}$, which are regarded as the frequencies of the graph. The Laplacian is diagonalized by the Fourier basis $U = [u_0, \dots, u_{n-1}] \in \mathbb{R}^{n \times n}$, so that $L = U \Lambda U^T$, where $\Lambda = \mathrm{diag}([\lambda_0, \dots, \lambda_{n-1}]) \in \mathbb{R}^{n \times n}$. The graph Fourier transform of a signal $x \in \mathbb{R}^n$ is then defined as
$$\hat{x} = U^T x$$
and its inverse is
$$x = U \hat{x}$$
After the graph Fourier transform, the graph behaves like a Euclidean space, so that basic operations of graph signal processing such as filtering and down-sampling can be implemented.
Furthermore, the pooling layer coarsens the graph structure (i.e., hierarchically samples the features obtained by the activation layer), finding representative nodes of the graph and completing the sampling; a balanced binary tree is then constructed to pool the graph-structured features produced by the activation layer.
Further, the pooling layer computes, for each node and its neighbors, the normalized cut value $W_{i,j}(1/d_i + 1/d_j)$, where $d_i$ and $d_j$ are the degrees of nodes $i$ and $j$ (the degree of a node is the number of nodes connected to it) and $W_{i,j}$ is the weight of the edge between nodes $i$ and $j$. The neighbor with the largest normalized cut value with respect to the current node is merged with it, completing one coarsening step. Coarsening can be repeated several times; after coarsening to a suitable level, the nodes of each level are numbered randomly and a balanced binary tree is constructed following the coarsening mapping. Max pooling is performed at the top level of the binary tree and mapped back level by level to the original graph structure, completing the pooling.
Furthermore, during training of the graph convolution neural network, the fully connected layer adopts a dropout strategy: in each iteration a number of nodes are randomly selected with probability p and excluded from the actual computation. After the fully connected layer output y is obtained, the softmax function is applied to it, and its maximum value is selected as the category of the corresponding article.
Further, in step 1), punctuation marks and invisible characters are removed, stop words and low-frequency words are removed from each article, and the TF-IDF (term frequency-inverse document frequency) matrix of the articles is computed as the article feature matrix.
Further, for the words in the text processed in step 1), the word-vector similarity between each word and all other words is computed in turn, and the several words most similar to each word (8 words are selected in the invention) are chosen as its neighbor nodes, thereby constructing the graph structure.
Further, a Mini-batch gradient descent method or a momentum optimization method is adopted to train the graph convolution neural network.
Since real datasets often contain a large amount of "noisy" data that interferes with subsequent feature capture, the proposal of this application preprocesses the original data to remove the noise from the original dataset, making it easier to extract refined, non-redundant features.
Because corpus data is stored as text, it must be converted to numerical form before it can serve as input for training the graph convolution neural network. Therefore, after the preprocessing of the original article dataset is completed, the articles are represented using their TF-IDF matrix together with word vectors to improve the effect. Once the word vectors corresponding to the article information are obtained, the word-vector similarity between words is computed, from which the graph is constructed. The proposal of this application constructs a graph convolution neural network and trains the model on a dataset to classify articles; after training is completed, the model is scored on the test set to check its effect.
Compared with the prior text classification technical scheme, the text classification method has the following technical advantages:
1. the text classification is realized with a graph convolution neural network: texts are represented as graph structures, so the semantic structure correlation among texts can be captured and text features are captured better. Meanwhile, parameter sharing is achieved through the graph convolution operation, the number of parameters is reduced through the pooling operation, and dropout avoids model overfitting. This overcomes disadvantages such as low efficiency and low text classification accuracy, requires no manual feature extraction, places loose requirements on the data (only plain text is needed), and has high generality;
2. the data preprocessing operation adopted by the application proposal, the method for constructing the graph and text feature matrix, the graph convolution neural network structure and the like are easy to use;
3. the text classification method and system overcome the defects of existing text classification schemes such as low efficiency, low classification accuracy, and lack of persuasiveness; articles are classified via a quantitative representation, with high accuracy and a solid theoretical foundation.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of data preprocessing;
FIG. 3 is a schematic diagram of a construction diagram;
FIG. 4 is a diagram of a graph convolution neural network structure;
FIG. 5 is a diagram illustrating a graph convolution operation;
FIG. 6 is a schematic view of pooling;
FIG. 7 is a schematic view of a fully connected layer;
FIG. 8 is a schematic drawing of dropout;
fig. 9 is a schematic diagram of gradient descent.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and examples.
As shown in fig. 1, the text classification algorithm mainly includes five key processes: preprocessing data, constructing a graph structure, preprocessing the graph structure, constructing and training a graph convolution neural network model and predicting text categories by using the graph convolution neural network model. In the following, a specific embodiment of this algorithm will be described by way of elaborating the above five key processes, respectively.
Process one: data preprocessing
Real data often contains much redundant information, default values, and noise, and there may be outliers due to human error. In addition, owing to the characteristics of text information, the dataset adopted in this proposal is unstructured and lacks separators between words, which is unfavorable for feature extraction. Therefore, data preprocessing is an essential part of the text classification algorithm proposed in this application.
Common data preprocessing operations include numerical normalization, data structuring, data de-redundancy, and the like. For this application, the original dataset (text information) must be represented in numerical form, and preprocessing operations such as removing stop words, punctuation marks, invisible characters, and low-frequency words are performed on it. There are many ways to represent text as numbers, such as word-frequency statistics, TF-IDF, and word vectors (see fig. 2 for the flow). The model needs two inputs: the feature matrix of the texts and the graph structure.
For the text feature matrix, the TF-IDF matrix of the texts is employed. TF-IDF is a statistical method for evaluating the importance of a word to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document but decreases in inverse proportion to its frequency across the corpus. The term frequency (TF) is the frequency with which a term (keyword) occurs in a document:
$$\mathrm{TF}_{ij} = \frac{n_{ij}}{\sum_k n_{kj}}$$
where $n_{ij}$ is the number of occurrences of term $t_i$ in document $d_j$. The inverse document frequency (IDF) of a particular term is obtained by dividing the total number of documents by the number of documents containing the term and taking the logarithm of the quotient:
$$\mathrm{IDF}_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$
where $|D|$ is the total number of documents in the corpus and $|\{j : t_i \in d_j\}|$ is the number of documents containing term $t_i$; if the term does not appear in the corpus the denominator would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used. In summary, TF-IDF is computed as $\mathrm{TF\text{-}IDF} = \mathrm{TF} \times \mathrm{IDF}$. Thus, the column length of the TF-IDF matrix (the number of rows) is the total number of documents, the row length is the number of words per document, and each value in the matrix is the TF-IDF value of the corresponding word.
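As an illustration of building the TF-IDF feature matrix described above, the following sketch uses scikit-learn's TfidfVectorizer; the patent does not specify an implementation, so the library choice, the toy corpus, and the parameters are assumptions.

```python
# A minimal sketch (not the patent's own code) of the TF-IDF feature matrix.
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "stock market rises on strong earnings",      # finance
    "team wins championship after overtime",      # sports
    "central bank adjusts interest rate policy",  # finance
]

# smooth_idf=True applies the 1 + |{j : t_i in d_j}| style smoothing that
# avoids a zero denominator for out-of-corpus terms.
vectorizer = TfidfVectorizer(smooth_idf=True, stop_words="english")
X = vectorizer.fit_transform(corpus)   # shape: (num_documents, vocabulary_size)

print(X.shape)
print(vectorizer.get_feature_names_out()[:5])
```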
Process two: building the graph structure
For the graph structure, the nodes of the graph use the word vector of each word, and the neighbor nodes are the words with the highest similarity. In this application the best effect was obtained by selecting the 8 most similar words, so the number of neighbor nodes is set to 8 (see fig. 3 for the schematic). Finally, the graph structure is represented as a matrix $G \in \{0,1\}^{N \times N}$, where $N$ is the number of all words and $G_{ij}$ indicates whether there is an edge between the $i$-th and $j$-th words: a value of 1 means there is an edge, and 0 means there is none.
Word vectors, also known as word embedding, represent words in a corpus or vocabulary in the form of vectors.
In this way, the words in the original corpus or vocabulary are mapped to points in a vector space, which can be used as input for training the graph convolution neural network model. In practice there are many techniques for obtaining word vectors, such as Skip-gram, CBOW, or randomly generating word vectors and adjusting them continually. Since ample corpus data is available, Skip-gram is adopted to obtain the word vectors.
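The graph construction of this process can be sketched as follows, here with randomly generated stand-ins for the word vectors; k = 8 matches the application's choice of neighbor count, everything else is an illustrative assumption.

```python
# A sketch of the k-nearest-neighbor word graph built from cosine similarity.
import numpy as np

rng = np.random.default_rng(0)
num_words, dim, k = 100, 50, 8
word_vecs = rng.normal(size=(num_words, dim))  # stand-in for word2vec vectors

# Cosine similarity between all pairs of word vectors.
unit = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
sim = unit @ unit.T
np.fill_diagonal(sim, -np.inf)                 # exclude self-similarity

# Unweighted adjacency matrix G: G[i, j] = 1 iff j is among i's top-k neighbors.
G = np.zeros((num_words, num_words), dtype=np.int8)
topk = np.argpartition(-sim, k, axis=1)[:, :k]
rows = np.repeat(np.arange(num_words), k)
G[rows, topk.ravel()] = 1
G = np.maximum(G, G.T)                         # symmetrize for an undirected graph

print(G.sum(axis=1)[:10])                      # node degrees (>= k after symmetrizing)
```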
Process three: graph structure preprocessing
Since the subsequent computation involves the convolution on the graph, spectral graph theory requires the Laplacian matrix of the graph, which is therefore computed in advance. The combinatorial definition of the graph Laplacian is $L = D - W \in \mathbb{R}^{n \times n}$, where $D \in \mathbb{R}^{n \times n}$ is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$; the normalized form is
$$L = I_n - D^{-1/2} W D^{-1/2}$$
where $I_n$ is the identity matrix. First the graph matrix constructed from the word vectors is taken and its degree matrix computed; then it is decided whether regularization is needed. If no regularization is needed, the graph Laplacian is obtained from $L = D - W$; otherwise the normalized formula above is used.
To implement the subsequent graph convolution (filtering) operation, the Fourier transform of the graph must be realized. The graph Laplacian $L$ obtained above is a real symmetric positive semi-definite matrix with an orthonormal set of eigenvectors $\{u_l\}_{l=0}^{n-1}$, called the graph Fourier modes, and associated eigenvalues $\{\lambda_l\}_{l=0}^{n-1}$, regarded as the frequencies of the graph. The Laplacian is diagonalized by the Fourier basis $U = [u_0, \dots, u_{n-1}] \in \mathbb{R}^{n \times n}$, so that $L = U \Lambda U^T$ with $\Lambda = \mathrm{diag}([\lambda_0, \dots, \lambda_{n-1}]) \in \mathbb{R}^{n \times n}$. The graph Fourier transform of a signal $x \in \mathbb{R}^n$ is then defined as
$$\hat{x} = U^T x$$
and its inverse is
$$x = U \hat{x}$$
where $x$ is the text feature matrix and $U$ is the Fourier basis obtained from the decomposition of the graph Laplacian.
The graph-structure preprocessing step thus computes the Laplacian matrix of the graph and performs the graph Fourier transform.
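A small numpy sketch of this preprocessing step, computing the (optionally normalized) graph Laplacian, its eigendecomposition into the Fourier basis U, and the graph Fourier transform; the toy graph and signal are assumptions.

```python
import numpy as np

def graph_laplacian(W, normalized=True):
    d = W.sum(axis=1)                       # node degrees D_ii
    if not normalized:
        return np.diag(d) - W               # combinatorial L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Normalized L = I_n - D^{-1/2} W D^{-1/2}
    return np.eye(W.shape[0]) - (d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :])

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # toy 4-node graph

L = graph_laplacian(W)
lam, U = np.linalg.eigh(L)                  # eigenvalues = graph frequencies,
                                            # eigenvectors = Fourier basis U

x = np.array([0.5, 1.0, -0.2, 0.3])         # a signal on the nodes
x_hat = U.T @ x                             # graph Fourier transform
x_back = U @ x_hat                          # inverse transform recovers x
assert np.allclose(x, x_back)
```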
Process four: construction and training of the graph convolution neural network model
The convolutional neural network (CNN) is one of the most representative network structures in deep learning. Through local connectivity, weight sharing, and pooling it overcomes shortcomings of traditional neural networks such as excessive parameters, achieving excellent performance in fields such as visual processing and natural language processing. A model applying CNNs to graph data is called a graph convolutional network (GCN). Generalizing CNNs to graph data requires three main steps: (1) to realize the filtering operation, the graph must be transformed from the node domain to the spectral domain, and a localized convolution filter for use on the graph must be designed; (2) the graph must be coarsened so that similar nodes are gathered together. The reason is that max or average pooling of an image takes the maximum or average over every few data points; similarly, pooling graph data requires marking and distinguishing similar nodes and coarsening the graph so that they are grouped; (3) after coarsening, graphs of different coarsened versions are obtained and the aggregation of similar nodes is realized. The pooling operation of the graph is then performed, trading spatial resolution for higher filter resolution.
The graph convolution neural network structure adopted in the present application is shown in fig. 4, and includes network structures such as a graph convolution layer, an activation function layer, a pooling layer, and a full connection layer. To facilitate understanding of the structure of the convolutional neural network used in the present application, the structure thereof will be described in detail.
Structure one: graph convolution layer
The filtering operation is implemented in the spectral domain of the graph, to which the graph data has been transformed via the graph Fourier transform (see fig. 5 for the flow). The convolution of two signals x and y on the graph in the Fourier, i.e., spectral, domain is defined as $x *_G y = U((U^T x) \odot (U^T y))$, where $\odot$ is the element-wise Hadamard product. Filtering a signal $x$ with a filter $g_\theta$ gives $y = g_\theta(L)x = g_\theta(U \Lambda U^T)x = U g_\theta(\Lambda) U^T x$. A non-parametric filter, i.e., one whose parameters are all free, is defined as $g_\theta(\Lambda) = \mathrm{diag}(\theta)$, where the parameter $\theta \in \mathbb{R}^n$ is a vector of Fourier coefficients.
Although the filtering operation can be performed after the graph has been Fourier-transformed into the spectral domain, such a non-parametric filter (i.e., convolution kernel) has drawbacks: it cannot capture local features, and its learning complexity remains proportional to the size of the graph, so when the graph data is too large the learning cost becomes prohibitive and efficiency suffers. This problem can be solved with a polynomial filter:
$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k$$
where the parameter $\theta \in \mathbb{R}^K$ is a vector of polynomial coefficients. The value at a neighbor node $j$ of the filter $g_\theta$ centered at node $i$ is $(g_\theta(L)\delta_i)_j = (g_\theta(L))_{i,j} = \sum_k \theta_k (L^k)_{i,j}$, where $\delta_i \in \mathbb{R}^n$ is the Kronecker delta; the initial representation of a node is its word vector, and during training the node information is updated under the influence of its neighbors, computed and refreshed continually through this formula, so that the convolution performed by the kernel captures local features. $d_G(i,j) > K$ implies $(L^K)_{i,j} = 0$, where $d_G$ is the shortest-path distance, i.e., the minimum number of edges connecting two nodes on the graph. Hence spectral filters represented by K-th order polynomials of the Laplacian are exactly K-localized. Moreover, their learning complexity is O(K), the support size of the filter, the same complexity as classical CNNs.
Even when learning a localized filter with the above K parameters, filtering the signal x as $y = U g_\theta(\Lambda) U^T x$ still costs $O(n^2)$ because of the multiplication with the Fourier basis U. The solution to this problem is to parameterize $g_\theta(L)$ as a polynomial function that can be computed recursively from L, since K multiplications by the sparse matrix L cost $O(K|E|)$, much smaller than $O(n^2)$. One such polynomial, traditionally used in image signal processing to approximate kernels (e.g., wavelets), is the Chebyshev expansion.
The k-th order Chebyshev polynomial $T_k(x)$ is computed by the recurrence $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$, with $T_0 = 1$ and $T_1 = x$. These polynomials form an orthogonal basis of $L^2([-1,1],\, dx/\sqrt{1-x^2})$, the Hilbert space of functions square-integrable with respect to the measure $dx/\sqrt{1-x^2}$. The filter can thus be parameterized as the truncated expansion
$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda})$$
of order K-1, where the parameter $\theta \in \mathbb{R}^K$ is a vector of Chebyshev coefficients and $T_k(\tilde{\Lambda}) \in \mathbb{R}^{n \times n}$ is the Chebyshev polynomial of order k evaluated at the rescaled diagonal matrix $\tilde{\Lambda} = 2\Lambda/\lambda_{max} - I_n$, whose eigenvalues lie in [-1, 1]. The filtering operation can then be written as
$$y = g_\theta(L)\,x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\,x$$
where $T_k(\tilde{L})$ is the Chebyshev polynomial of order k evaluated at the rescaled Laplacian $\tilde{L} = 2L/\lambda_{max} - I_n$. Denoting $\bar{x}_k = T_k(\tilde{L})\,x$, the iterative relationship can be used to compute $\bar{x}_k = 2\tilde{L}\,\bar{x}_{k-1} - \bar{x}_{k-2}$, with $\bar{x}_0 = x$ and $\bar{x}_1 = \tilde{L}\,x$. The entire filtering operation $y = \sum_{k=0}^{K-1} \theta_k \bar{x}_k$ then costs $O(K|E|)$.
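The Chebyshev recurrence above translates directly into code. The following sketch applies a K-localized spectral filter to a node signal; the coefficients theta are assumed to be given (in the network they would be the learned filter weights), and the path graph is a toy example.

```python
import numpy as np

def chebyshev_filter(L, x, theta, lam_max):
    """Apply y = sum_k theta_k T_k(L_tilde) x via the Chebyshev recurrence."""
    n = L.shape[0]
    L_tilde = (2.0 / lam_max) * L - np.eye(n)   # rescale eigenvalues into [-1, 1]
    x_bars = [x, L_tilde @ x]                   # T_0(L~)x = x, T_1(L~)x = L~ x
    for _ in range(2, len(theta)):
        x_bars.append(2 * L_tilde @ x_bars[-1] - x_bars[-2])
    return sum(t * xb for t, xb in zip(theta, x_bars))

# Toy example on a 4-node path graph.
W = np.diag(np.ones(3), 1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W
lam_max = np.linalg.eigvalsh(L).max()

x = np.array([1.0, 0.0, 0.0, 0.0])
theta = np.array([0.5, 0.3, 0.2])               # K = 3 coefficients: 2-hop localized
y = chebyshev_filter(L, x, theta, lam_max)
print(y)
```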
Structure two: nonlinear activation layer
To introduce nonlinearity, an activation layer is added. This application adopts the ReLU (rectified linear unit), defined as:
$$\mathrm{ReLU}(x) = \max(0, x)$$
Other activation functions exist, such as the sigmoid and tanh functions, but ReLU has advantages they lack. If stochastic gradient descent is used for model optimization, ReLU converges faster. Moreover, the sigmoid and tanh activation functions involve exponentials, which are computationally expensive; this drawback is especially evident when the data volume is large, whereas ReLU, by its definition, is intuitively cheap to compute. In addition, sigmoid and tanh suffer from the vanishing-gradient problem, which ReLU effectively alleviates. ReLU has certain shortcomings of its own, but in this experiment its advantages weigh more, so the ReLU activation function is chosen.
Structure three: pooling layer
After the graph convolution layer completes the convolution on the graph structure and the features used for classification are extracted, the next step is to classify with these features. However, the features and associated parameters produced by the graph convolution are still too numerous, leading to excessive computation and even overfitting. This application therefore places a pooling layer after the graph convolution layer to avoid these adverse effects.
Pooling can be simply understood as sampling the features obtained by the graph convolution layer. When pooling conventional regular data, a value is kept every few data points. For downsampling the nodes of a weighted graph, however, there is no notion of "every other node". Therefore, analogously to regular data, similar nodes of the graph must be clustered together, i.e., graph clustering. In practice, a single round of clustering on a graph with many nodes cannot group most similar nodes together, so the operation must be repeated, which is in fact multi-scale clustering of the graph. Clustering of graphs is, however, an NP-hard problem, so a method yielding an approximate result must be adopted.
The clustering algorithms for graphs mainly include partitional clustering, hierarchical clustering, density-based clustering, grid-based clustering, and the like. The multi-scale clustering algorithm comprises three steps: graph coarsening, graph partitioning, and graph refinement.
Graph coarsening: nodes and edges of the graph are merged according to a set rule to obtain a coarsened version. On this basis the merging rule is applied repeatedly to obtain ever higher-level coarsened versions; the degree and number of coarsenings are determined by the specific requirements. In this proposal, the merging rule adopts the Graclus greedy algorithm. The greedy rule of Graclus picks an unmarked node i at each coarsening level and matches it with one of its unmarked neighbors j so as to maximize the local normalized cut value $W_{i,j}(1/d_i + 1/d_j)$. The two matched nodes are then marked, and the coarsened weight is set to the sum of their weights. Matching is repeated until all nodes are marked. From one level to the next coarser level, this roughly halves the number of nodes, though a few individual nodes may remain unmatched.
In this application, graph clustering is applied as follows: after the graph structure is coarsened, the nodes of the graph are numbered randomly and a balanced binary tree is constructed. Each coarsened version of the nodes corresponds to one level of the balanced binary tree: the most coarsened nodes are the parent nodes of the tree, the next most coarsened version corresponds to the second level, and so on, with the original nodes as the leaf nodes.
After convolution and activation, the graph structure yields a new feature graph; the pooling layer coarsens this feature graph to a certain degree and constructs the corresponding balanced binary tree. A downsampling operation is then performed on the binary tree, mapping from the parent nodes down through the second and third levels in turn, so that pooling the graph becomes equivalent to pooling one-dimensional data.
As an example (see fig. 6), G0 is the original, finest graph, with each node numbered randomly as shown. Nodes and edges are merged with the Graclus algorithm: assuming nodes 0 and 1 attain the maximum normalized cut value, they are merged into one node; likewise nodes 4 and 5 are merged, and nodes 8 and 9 are merged. Nodes 6 and 10 find no match and remain singletons, so to satisfy the balanced-binary-tree requirement fake nodes 7 and 11 are added with initial value 0, yielding G1. Similarly, the nodes of G1 are numbered randomly, nodes 2 and 3 and nodes 4 and 5 are merged by the Graclus algorithm, and node 0 has no matching node, so a fake node 1 is added to satisfy the balanced-binary-tree rule, yielding G2. G2 is then the most coarsened graph.
A balanced binary tree is constructed from the three coarsened versions, and pooling starts from the parent nodes of the tree, taking max pooling as the example here. Parent node 0 maps in turn to its children in the second level, nodes 0 and 1; node 0 of the second level is a singleton corresponding to leaf nodes 0 and 1, while node 1 of the second level is a fake node whose children are all fake nodes with value 0, so it does not affect the pooling result. Max pooling of parent node 0 is therefore equivalent to max pooling of nodes 0 and 1 of the original graph structure. By analogy, max pooling of parent node 1 is equivalent to max pooling of nodes 4, 5, and 6 of the original graph, and max pooling of parent node 2 to that of nodes 8, 9, and 10. The pooling result of the whole graph is thus z = {max{0,1}, max{4,5,6}, max{8,9,10}}.
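The coarsening-and-pooling scheme of this section can be sketched as below: one level of greedy matching maximizing $W_{i,j}(1/d_i + 1/d_j)$, with unmatched nodes paired to fake nodes of value 0 so that pooling reduces to a max over pairs. This is a simplified single-level illustration of the multi-level procedure, not the patent's code.

```python
import numpy as np

def coarsen_once(W):
    """Greedy matching maximizing the normalized cut W_ij * (1/d_i + 1/d_j)."""
    n = W.shape[0]
    d = W.sum(axis=1)
    matched, pairs = np.zeros(n, dtype=bool), []
    for i in range(n):
        if matched[i]:
            continue
        nbrs = [j for j in range(n) if W[i, j] > 0 and not matched[j] and j != i]
        if nbrs:
            cuts = [W[i, j] * (1.0 / d[i] + 1.0 / d[j]) for j in nbrs]
            j = nbrs[int(np.argmax(cuts))]
            matched[i] = matched[j] = True
            pairs.append((i, j))
        else:
            matched[i] = True
            pairs.append((i, None))          # singleton: pair with a fake node
    return pairs

def pool_max(signal, pairs):
    """Max-pool node values over each matched pair (fake nodes contribute 0)."""
    return np.array([max(signal[i], signal[j]) if j is not None else
                     max(signal[i], 0.0) for i, j in pairs])

W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)    # toy 4-node path graph
x = np.array([0.7, -0.1, 0.4, 0.9])
pairs = coarsen_once(W)
print(pairs, pool_max(x, pairs))
```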
Structure four: fully connected layer
In a fully connected layer, as the name implies, every node is connected to every node of the previous layer, as shown in fig. 7. In this proposal the previous layer is the pooling layer, and the output layer follows the fully connected layer, performing category prediction with softmax. In addition, to avoid the drawbacks of the fully connected layer's many weight parameters, namely difficult computation and a tendency to overfit, this proposal adopts a dropout strategy: during training, each iteration randomly selects some nodes with probability p to sit out the actual computation, as shown in fig. 8, where the second node of the input layer temporarily does not participate.
Structure five: output layer
The output layer outputs the category of the article. After the fully connected layer output y is obtained, applying the softmax function to it yields the corresponding category, i.e., the category of the article. The softmax function is:
$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{l} e^{y_j}}$$
where $l$ is the number of categories and $y_i$ is the $i$-th value of the fully connected layer output. The result of the formula is a probability value. The softmax value is computed for all values output by the fully connected layer, and the maximum is selected as the category of the article.
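A minimal sketch of this output-layer computation, with a numerically stable softmax applied to an assumed fully-connected-layer output y:

```python
import numpy as np

def softmax(y):
    e = np.exp(y - y.max())       # subtract the max for numerical stability
    return e / e.sum()

y = np.array([1.2, 0.3, 2.5, -0.7])   # assumed FC-layer output, l = 4 categories
probs = softmax(y)
predicted_category = int(np.argmax(probs))
print(probs, predicted_category)       # category 2 has the highest probability
```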
Structure six: loss function and training method
After the model is determined, the next and final step is to determine the loss function and the training method.
The loss function measures the predicted values of the model. It is a non-negative real-valued function, usually written $L(y, f(x))$. The smaller the loss function, the more robust the model; that is, during training the parameters are adjusted by the training method so that the value of the loss function decreases. Commonly used loss functions include the mean absolute error, the mean squared error, and the cross-entropy loss. In most networks the cross-entropy loss is experimentally superior to the others and reflects well the difference between the expected output and the current actual output. This application therefore uses the common cross entropy as the loss function:
$$L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{l} y_{ic} \log \hat{y}_{ic}$$
where $N$ is the number of samples, $l$ the number of categories, $y_{ic}$ the true (one-hot) label, and $\hat{y}_{ic}$ the predicted probability. After the loss function is determined, the next step is to determine the training method. In neural networks, the adjustment and optimization of the parameters is accomplished by gradient descent.
The gradient descent method is a first-order optimization algorithm, also commonly called the method of steepest descent. To find a local minimum of a function with gradient descent, one must iteratively step from the current point in the direction opposite to the gradient (or approximate gradient) by a specified step size:
$$x_2 = x_1 - \gamma \nabla f(x_1)$$
where the function $f(x)$ is differentiable and defined at the point $x_1$, and $\gamma$ is the step size. It is easy to see that when $\gamma > 0$ is sufficiently small, $f(x_1) \ge f(x_2)$. The gradient descent process is illustrated in fig. 9.
However, because the model is complex, computing the gradient over all training samples at once is too expensive, so academia and industry commonly adopt improved gradient descent methods as the scheme for finding the optimal or locally optimal model. Commonly used variants include the stochastic gradient descent method, the batch gradient descent method, the Adam method, and the like. This proposal adopts the mini-batch gradient descent method together with momentum optimization as the model optimization scheme, since the latter can adapt the effective learning rate of each parameter.
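For illustration, the following sketch runs mini-batch gradient descent with momentum on a toy least-squares problem. The learning rate 0.0001 and batch size 128 mirror the hyper-parameter table below; the momentum coefficient 0.9 and the toy problem itself are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.01 * rng.normal(size=512)

w = np.zeros(10)
velocity = np.zeros(10)
lr, momentum, batch_size = 1e-4, 0.9, 128

for epoch in range(50):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)  # MSE gradient
        velocity = momentum * velocity - lr * grad               # momentum update
        w += velocity

print(np.linalg.norm(w - true_w))   # distance to the true weights shrinks
```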
Process five: prediction
Finally, after model training is completed, the graph convolution neural network model is used to classify the text information on the dataset, and the classification effect is compared against other text classification schemes as a check.
To verify the performance of the graph convolution neural network on the text classification problem, this section compares its classification effect with other text classification schemes on the same article dataset.
The hardware environment of the experiments in this section is a server with a 2.8 GHz CPU, 506.3 GB of memory, and 88 cores, running a 64-bit Linux operating system.
The data set used in this experiment is shown in table 1:
TABLE 1 data set
(Table 1 appears as an image in the original; per the experimental analysis below, the dataset contains 4 article categories.)
Specifically, based on the characteristics of the dataset and conventional hyper-parameter settings for convolutional neural networks, the model hyper-parameters of this application are shown in table 2.
TABLE 2 model hyper-parameter table

Hyper-parameter     Meaning                                Value
num_GCN             Number of graph convolution layers     2
learning_rate       Initial learning rate                  0.0001
dropout_keep_prob   Dropout keep probability               0.5
batch_size          Batch size                             128
num_epochs          Number of training epochs              50
output_dim          Output dimension of the output layer   512
In the experiments, word vectors are generated with the skip-gram method of the word2vec tool, the ReLU function is selected as the activation function, the cross-entropy loss is selected as the model's loss function, the mini-batch gradient descent method with momentum optimization is adopted as the training method, and the initial learning rate is set to 0.0001. The experimental results are shown in table 3.
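Before turning to the results, here is a sketch of the word-vector generation step using gensim's word2vec with the Skip-gram architecture (sg=1); the toy sentences and the 100-dimensional vector size are assumptions, as the patent does not state the embedding dimensionality.

```python
from gensim.models import Word2Vec

sentences = [
    ["stock", "market", "rises", "on", "earnings"],
    ["team", "wins", "championship", "final"],
    ["bank", "adjusts", "interest", "rate"],
]

# sg=1 selects Skip-gram; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
vec = model.wv["market"]                          # 100-dimensional word vector
similar = model.wv.most_similar("market", topn=3)  # nearest words by cosine
print(vec.shape, similar)
```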
TABLE 3 experimental results

Model            Accuracy
CBOW             0.92
GCN + CBOW       0.95
Fast Text        0.91
GCN + Fast Text  0.95
LSTM             0.93
Text-CNN         0.94
The experimental analysis in this section is as follows:
as can be seen from table 1, there are 4 article categories in this experiment, and each sample belongs to one category. Randomly guessing the category of a document would therefore be correct about 1/4 of the time. As table 3 shows, the accuracy of the graph convolution neural network is far higher than random selection, and its final accuracy exceeds the other text classification schemes, which is a satisfactory result. For the above experimental results, the following specific analysis can be given:
1) this application represents text information as a graph structure and constructs the graph via word similarity, capturing the semantic structure correlation among texts well and thereby describing the implicit relationships in the text information.
2) The graph convolution neural network captures structural information between texts through the graph convolution operation while taking the statistical attributes of the texts into account with the TF-IDF matrix; together, these two aspects comprehensively cover the explicit and implicit features of the texts. Meanwhile, the number of parameters is reduced by the multi-level clustering pooling operation, dropout avoids model overfitting, the defects of low efficiency and low text classification accuracy are overcome, no manual feature extraction is needed, and the final experimental results are clearly superior to the other schemes.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A text classification method based on a graph convolution neural network comprises the following steps:
1) for each category-labeled text in a text training set of a target field, generating a text feature vector for the text according to the term frequency and inverse document frequency of the words in the text; combining all text feature vectors to generate a text feature matrix, namely a TF-IDF matrix, and constructing a graph structure of the text training set according to the word-vector similarity of the words;
2) training a graph convolution neural network by using the graph structure and the text feature matrix;
3) and for a text a to be classified in the target field, inputting the text feature vector of the text a into the trained graph convolution neural network to obtain the category of the text a.
2. The method of claim 1, wherein the graph structure is generated by: and taking the words in the text as nodes of the graph, and taking a plurality of words most similar to one node as neighbor nodes of the node to generate the graph structure.
3. The method according to claim 1 or 2, wherein in step 2), the graph structure is preprocessed first, and a laplacian matrix of the graph is calculated; and then training the graph convolutional neural network by using the Laplace matrix and the text characteristic matrix of the graph.
4. The method of claim 3, wherein the graph Laplacian matrix is $L = D - W \in \mathbb{R}^{n \times n}$, where $D \in \mathbb{R}^{n \times n}$ is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$, and $W \in \mathbb{R}^{n \times n}$ is an adjacency matrix encoding the connection weights between two nodes, $W_{ij}$ being the value corresponding to the $i$-th and $j$-th nodes: $W_{ij} = 1$ if the $i$-th and $j$-th nodes are connected by an edge, and 0 otherwise.
5. The method of claim 1, wherein the graph convolution neural network comprises an input layer, a number of hidden layers, a fully connected layer, and an output layer connected in sequence; each hidden layer comprises a graph convolution layer, a pooling layer, and an activation layer; the input layer is used for receiving the graph structure and the text features and inputting them into the hidden layers; the graph convolution layer is used for performing the convolution operation on the input graph structure and text features to obtain the feature information of the text and inputting it into the activation layer; the activation layer is used for performing nonlinear activation on the features captured by the graph convolution layer; the pooling layer is used for hierarchically sampling the information obtained by the activation layer; and the hierarchically sampled information passes through the fully connected layer into the output layer, which predicts the category of the corresponding text.
6. The method of claim 4, wherein the graph convolution layer performs a graph Fourier transform of the graph structure into the spectral domain, performs the convolution operation in the spectral domain, and performs the inverse graph Fourier transform back to the vertex domain to obtain the convolution result; the pooling layer computes the normalized cut value of each node and its neighbors by the formula $W_{i,j}(1/d_i + 1/d_j)$, then selects the neighbor with the maximum normalized cut value with respect to the current node and merges it with the current node, and then completes pooling through one-dimensional pooling; where $d_i$ is the degree of node $i$, $d_j$ is the degree of node $j$, and $W_{i,j}$ is the weight of the edge between node $i$ and node $j$.
7. The method of claim 6, wherein for a function $f \in \mathbb{R}^n$ defined on the nodes of any graph $G$, the graph Fourier transform based on the eigenvectors of the graph Laplacian is given by the expansion
$$\hat{f}(\lambda_l) = \sum_{i=1}^{n} f(i)\, u_l^{*}(i)$$
where $n$ is the number of nodes in the graph structure, $u_l$ is the $l$-th Laplacian eigenvector, and $u_l^{*}(i)$ is its coefficient for node $i$; the corresponding inverse graph Fourier transform is defined as
$$f(i) = \sum_{l=0}^{n-1} \hat{f}(\lambda_l)\, u_l(i)$$
where $u_l(i)$ is the coefficient for node $i$ in the inverse Fourier transform; and the graph is $G = (V, E, W)$, where $V$ is a finite set of $|V| = n$ nodes, $E$ is a set of edges, and $W \in \mathbb{R}^{n \times n}$ is an adjacency matrix encoding the connection weights between two nodes.
8. The method of claim 6 or 7, wherein the graph convolution layer filters the node signal $x$ in the graph structure with a filter, the filtering operation being
$$y = g_\theta(L)\,x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\,x$$
where $y$ is the filtered signal, $\theta \in \mathbb{R}^K$ is a Chebyshev coefficient vector, and $T_k(\tilde{L})$ is the Chebyshev polynomial of order $k$ evaluated at the rescaled Laplacian $\tilde{L} = 2L/\lambda_{max} - I_n$; the signal $x \in \mathbb{R}^n$ is the semantic information of the words corresponding to the nodes, and $x_i$ is the value of $x$ at the $i$-th node.
9. The method of claim 1, wherein the graph convolution neural network is trained using a mini-batch gradient descent method or a momentum optimization method.
10. A text classification system based on a graph convolution neural network is characterized by comprising a text preprocessing module, a graph convolution neural network training module and a text classification module; wherein,
the text preprocessing module is used for generating a text feature vector for each text according to the term frequency and inverse document frequency of the words in the text, then combining the text feature vectors to generate a text feature matrix, namely a TF-IDF matrix, and constructing a graph structure of the text training set according to the word-vector similarity of the words;
the graph convolution neural network training module is used for training a graph convolution neural network according to the text feature matrix and the graph structure;
and the text classification module is used for inputting the text feature vector of a text a to be classified into the trained graph convolution neural network to obtain the category of text a.
Priority application: CN201911064089.7A, filed 2019-11-04, priority date 2019-11-04. Publication: CN110929029A, published 2020-03-27. Family ID: 69850245. Country: CN (China). Legal status: pending.

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134934A (en) * 2018-02-02 2019-08-16 普天信息技术有限公司 Text emotion analysis method and device
US20190304156A1 (en) * 2018-04-03 2019-10-03 Sri International Artificial intelligence for generating structured descriptions of scenes
CN109543084A (en) * 2018-11-09 2019-03-29 西安交通大学 A method of establishing the detection model of the hidden sensitive text of network-oriented social media
CN109783696A (en) * 2018-12-03 2019-05-21 中国科学院信息工程研究所 A kind of multi-mode index of the picture construction method and system towards weak structure correlation
CN109902288A (en) * 2019-01-17 2019-06-18 深圳壹账通智能科技有限公司 Intelligent clause analysis method, device, computer equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
David I. Shuman, Sunil K. Narang, Pascal Frossard: "The Emerging Field of Signal Processing on Graphs", arXiv:1211.0053v2 *
Jing Yu: "Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval", arXiv:1802.00985 *
Jing Yu: "Semantic Modeling of Textual Relationships in Cross-modal Retrieval", International Conference on Knowledge Science, Engineering and Management 2019 *
Liang Yao, Chengsheng Mao, Yuan Luo: "Graph Convolutional Networks for Text Classification", arXiv:1809.05679 *
Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst: "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering", arXiv:1606.09375v3 *

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598214B (en) * 2020-04-02 2023-04-18 浙江工业大学 Cross-modal retrieval method based on graph convolution neural network
CN111598214A (en) * 2020-04-02 2020-08-28 浙江工业大学 Cross-modal retrieval method based on graph convolution neural network
CN111552803A (en) * 2020-04-08 2020-08-18 西安工程大学 Text classification method based on graph wavelet network model
CN111552803B (en) * 2020-04-08 2023-03-24 西安工程大学 Text classification method based on graph wavelet network model
CN111694957A (en) * 2020-05-29 2020-09-22 新华三大数据技术有限公司 Question list classification method and device based on graph neural network and storage medium
CN111694957B (en) * 2020-05-29 2024-03-12 新华三大数据技术有限公司 Method, equipment and storage medium for classifying problem sheets based on graph neural network
CN111965476A (en) * 2020-06-24 2020-11-20 国网江苏省电力有限公司淮安供电分公司 Low-voltage diagnosis method based on graph convolution neural network
CN111538870B (en) * 2020-07-07 2020-12-18 北京百度网讯科技有限公司 Text expression method and device, electronic equipment and readable storage medium
CN111538870A (en) * 2020-07-07 2020-08-14 北京百度网讯科技有限公司 Text expression method and device, electronic equipment and readable storage medium
CN111984762A (en) * 2020-08-05 2020-11-24 中国科学院重庆绿色智能技术研究院 Text classification method sensitive to attack resistance
CN111984762B (en) * 2020-08-05 2022-12-13 中国科学院重庆绿色智能技术研究院 Text classification method sensitive to attack resistance
CN112131506A (en) * 2020-09-24 2020-12-25 厦门市美亚柏科信息股份有限公司 Webpage classification method, terminal equipment and storage medium
CN112347246A (en) * 2020-10-15 2021-02-09 中科曙光南京研究院有限公司 Self-adaptive document clustering method and system based on spectral decomposition
CN112347246B (en) * 2020-10-15 2024-04-02 中科曙光南京研究院有限公司 Self-adaptive document clustering method and system based on spectrum decomposition
WO2022105108A1 (en) * 2020-11-18 2022-05-27 苏州浪潮智能科技有限公司 Network data classification method, apparatus, and device, and readable storage medium
CN112487305A (en) * 2020-12-01 2021-03-12 重庆邮电大学 GCN-based dynamic social user alignment method
CN112529071A (en) * 2020-12-08 2021-03-19 广州大学华软软件学院 Text classification method, system, computer equipment and storage medium
CN112529068A (en) * 2020-12-08 2021-03-19 广州大学华软软件学院 Multi-view image classification method, system, computer equipment and storage medium
CN112529071B (en) * 2020-12-08 2023-10-17 广州大学华软软件学院 Text classification method, system, computer equipment and storage medium
CN112529068B (en) * 2020-12-08 2023-11-28 广州大学华软软件学院 Multi-view image classification method, system, computer equipment and storage medium
CN112270322A (en) * 2020-12-17 2021-01-26 恒银金融科技股份有限公司 Method for recognizing crown word number of bank note by utilizing neural network model
CN112651487A (en) * 2020-12-21 2021-04-13 广东交通职业技术学院 Data recommendation method, system and medium based on graph collapse convolution neural network
CN112287664B (en) * 2020-12-28 2021-04-06 望海康信(北京)科技股份公司 Text index data analysis method and system, corresponding equipment and storage medium
CN112287664A (en) * 2020-12-28 2021-01-29 望海康信(北京)科技股份公司 Text index data analysis method and system, corresponding equipment and storage medium
CN112685504B (en) * 2021-01-06 2021-10-08 广东工业大学 Production process-oriented distributed migration chart learning method
CN112685504A (en) * 2021-01-06 2021-04-20 广东工业大学 Production process-oriented distributed migration chart learning method
US11367002B1 (en) 2021-01-06 2022-06-21 Guangdong University Of Technology Method for constructing and training decentralized migration diagram neural network model for production process
CN112733933B (en) * 2021-01-08 2024-01-05 北京邮电大学 Data classification method and device based on unified optimization target frame graph neural network
CN112733933A (en) * 2021-01-08 2021-04-30 北京邮电大学 Data classification method and device based on unified optimization target frame graph neural network
CN112765352A (en) * 2021-01-21 2021-05-07 东北大学秦皇岛分校 Graph convolution neural network text classification method based on self-attention mechanism
CN112800239B (en) * 2021-01-22 2024-04-12 中信银行股份有限公司 Training method of intention recognition model, and intention recognition method and device
CN112800239A (en) * 2021-01-22 2021-05-14 中信银行股份有限公司 Intention recognition model training method, intention recognition method and device
CN112925907A (en) * 2021-02-05 2021-06-08 昆明理工大学 Microblog comment viewpoint object classification method based on event graph convolutional neural network
CN113435478A (en) * 2021-06-03 2021-09-24 华东师范大学 Method and system for classifying clothing template pictures by using graph convolution neural network
CN113435478B (en) * 2021-06-03 2022-07-08 华东师范大学 Method and system for classifying clothing template pictures by using graph convolution neural network
CN113360648A (en) * 2021-06-03 2021-09-07 山东大学 Case classification method and system based on correlation graph learning
CN113642674A (en) * 2021-09-03 2021-11-12 贵州电网有限责任公司 Multi-round dialogue classification method based on graph convolution neural network
CN113946683A (en) * 2021-09-07 2022-01-18 中国科学院信息工程研究所 Knowledge fusion multi-mode false news identification method and device
CN113792144A (en) * 2021-09-16 2021-12-14 南京理工大学 Text classification method based on semi-supervised graph convolution neural network
CN113792144B (en) * 2021-09-16 2024-03-12 南京理工大学 Text classification method of graph convolution neural network based on semi-supervision
CN113987152A (en) * 2021-11-01 2022-01-28 北京欧拉认知智能科技有限公司 Knowledge graph extraction method, system, electronic equipment and medium
CN114021550A (en) * 2021-11-04 2022-02-08 成都中科信息技术有限公司 News trend prediction system and method based on graph convolution neural network
CN114817538A (en) * 2022-04-26 2022-07-29 马上消费金融股份有限公司 Training method of text classification model, text classification method and related equipment
CN114817538B (en) * 2022-04-26 2023-08-08 马上消费金融股份有限公司 Training method of text classification model, text classification method and related equipment
CN114943324A (en) * 2022-05-26 2022-08-26 中国科学院深圳先进技术研究院 Neural network training method, human motion recognition method and device, and storage medium
CN114943324B (en) * 2022-05-26 2023-10-13 中国科学院深圳先进技术研究院 Neural network training method, human motion recognition method and device, and storage medium

Similar Documents

Publication Publication Date Title
CN110929029A (en) Text classification method and system based on graph convolution neural network
Huixian The analysis of plants image recognition based on deep learning and artificial neural network
Van Der Maaten Accelerating t-SNE using tree-based algorithms
Xia et al. Complete random forest based class noise filtering learning for improving the generalizability of classifiers
Adams et al. A survey of feature selection methods for Gaussian mixture models and hidden Markov models
Friedman et al. Introduction to pattern recognition: statistical, structural, neural, and fuzzy logic approaches
US7362892B2 (en) Self-optimizing classifier
Van Hulle Self-organizing Maps.
KR20180120061A (en) Artificial neural network model learning method and deep learning system
WO2019102005A1 (en) Object recognition using a convolutional neural network trained by principal component analysis and repeated spectral clustering
Widiyanto et al. Implementation of convolutional neural network method for classification of diseases in tomato leaves
Shadrach et al. RETRACTED ARTICLE: Neutrosophic Cognitive Maps (NCM) based feature selection approach for early leaf disease diagnosis
EP2614470A2 (en) Method for providing with a score an object, and decision-support system
CN113157957A (en) Attribute graph document clustering method based on graph convolution neural network
CN109344898A (en) Convolutional neural networks image classification method based on sparse coding pre-training
CN113642674A (en) Multi-round dialogue classification method based on graph convolution neural network
CN112348090A (en) Neighbor anomaly detection system based on neighbor self-encoder
Yang et al. Classification of medical images with synergic graph convolutional networks
Henriques et al. Spatial clustering using hierarchical SOM
Maddumala A Weight Based Feature Extraction Model on Multifaceted Multimedia Bigdata Using Convolutional Neural Network.
Chander et al. Data clustering using unsupervised machine learning
Parsa et al. Coarse-grained correspondence-based ancient Sasanian coin classification by fusion of local features and sparse representation-based classifier
CN113569920A (en) Second neighbor anomaly detection method based on automatic coding
CN117349494A (en) Graph classification method, system, medium and equipment for space graph convolution neural network
Balaganesh et al. Movie success rate prediction using robust classifier

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (Application publication date: 20200327)