CN111209398B - Text classification method and system based on graph convolution neural network - Google Patents
- Publication number
- CN111209398B (application CN201911393728.4A)
- Authority
- CN
- China
- Prior art keywords
- graph
- text
- neural network
- text classification
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 34
- 238000013528 artificial neural network Methods 0.000 title claims abstract description 29
- 238000013136 deep learning model Methods 0.000 claims abstract description 12
- 238000004364 calculation method Methods 0.000 claims description 8
- 239000011159 matrix material Substances 0.000 claims description 7
- 230000006870 function Effects 0.000 claims description 4
- 238000010276 construction Methods 0.000 claims description 2
- 238000005259 measurement Methods 0.000 claims description 2
- 239000013598 vector Substances 0.000 description 4
- 238000013527 convolutional neural network Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 238000013135 deep learning Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 238000007635 classification algorithm Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000008676 import Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000000547 structure data Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Through a graph neural network modelling approach, the invention provides a text classification method and system based on a graph convolutional neural network. The method comprises three main steps: inputting texts and constructing a similarity graph; constructing a deep learning model; and combining the similarity graph and the deep learning model into an overall text classification model that performs the classification. Built on the text similarity graph, the graph convolutional network and the graph attention network, the method classifies texts automatically given only the source texts as input; combined with hardware, the model forms a complete system.
Description
Technical Field
The invention relates to the field of neural networks, in particular to a text classification method based on a graph convolution neural network.
Background
Recently, a great deal of research has focused on graph-structured data, a data structure with many uses. Graph convolutional networks (GCNs) provide a very efficient way to analyse such data: they are powerful models that exploit neighbourhood information, and this special form of convolution has found wide application in research. Recent work has concentrated on four areas: community detection using graph methods, malware detection, object or saliency detection in video and images, and Internet security. Other directions include computer vision and the study of social networks. In this work, we apply the model to the classification of text.
The invention combines a simplified graph convolutional network with a graph attention network: the attention network improves classification accuracy while the simplified convolution reduces the computational cost of the graph convolutional neural network. A new method is also introduced to construct the graph based on the similarity between its nodes. The graph is then fed into the network to complete the text classification.
Disclosure of Invention
The main problems of existing text classification algorithms are that their text representations are high-dimensional and sparse with weak feature-expression capability, and that feature engineering must be performed manually at high cost. Deep learning first achieved great success on images and speech, which in turn drove its development in NLP, so deep learning models now perform well on text classification.
In order to achieve the purpose, the invention adopts the following technical scheme:
a text classification method based on a graph convolution neural network comprises the following steps:
Step one: input texts and construct a similarity graph. Based on a set of text data, an adjacency matrix is constructed from the multidimensional tuple of basic features carried by each text. Each text is defined as a node, and the existence of an edge between two nodes is determined by their similarity: if the similarity is greater than a specified value, an edge is generated between the nodes, so that the nodes of the graph can be classified in a binary fashion;
step two: constructing a deep learning model by using a simplified graph convolution neural network and a graph attention neural network and combining a Softmax function;
Step three: combine the similarity graph and the deep learning model into an overall text classification model and output the classified texts, the overall model being realized with the simplified graph convolutional neural network and the graph attention network combined with a Softmax function.
In step one, the similarity between two nodes is computed as:

$\mathrm{Sim}(A_v, A_w) = \mathrm{dist}(A_v, A_w)\sum_{h=1}^{H}\gamma\big(M_h(v), M_h(w)\big)$

For categorical information, $\gamma$ is defined as the indicator function $\gamma(M_h(v), M_h(w)) = \mathbb{1}\left[M_h(v) = M_h(w)\right]$; further, for quantitative measurements, $\gamma$ is defined as $\gamma(M_h(v), M_h(w)) = \mathbb{1}\left[\,|M_h(v)-M_h(w)| < \theta\,\right]$.

$\mathrm{dist}(A_v, A_w)$ is defined as:

$\mathrm{dist}(A_v, A_w) = \exp\!\left(-\dfrac{\rho\big(x(v), x(w)\big)^2}{2\sigma^2}\right)$

The simplified graph convolutional neural network is defined as:

$\hat{Y} = \mathrm{softmax}\big(S^{K} X \Theta\big)$

The graph attention network is defined as:

$h_i = \sigma\!\Big(\sum_{j \in N_i} \alpha_{ij}\, W \bar{x}_j\Big)$

wherein $\alpha_{ij}$ is defined as $\alpha_{ij} = \exp(e_{ij}) \big/ \sum_{k \in N_i} \exp(e_{ik})$, and $e_{ij}$ is defined as $e_{ij} = \mathrm{LeakyReLU}\!\left(a^{T}[\,W\bar{x}_i \,\|\, W\bar{x}_j\,]\right)$.

The overall model of the text classification method is:

$Z = \mathrm{softmax}\big(\mathrm{GAT}(S^{K} X \Theta)\big)$
a system for text classification based on a graph-convolution neural network, comprising:
the information input module, which imports source texts acquired from an external database after normalization;
the text classification module based on the graph convolutional neural network, which classifies the input source texts by applying the above text classification method based on a graph convolutional neural network;
and the information output module, which outputs the text classification results generated in the classification module in the form of a new database.
According to the above technical scheme, a novel model called ASGCN is proposed to classify different texts. The model combines a simplified graph convolutional network and a graph attention network to obtain better classification results. Furthermore, a new way of constructing the graph is proposed, so that the graph better fits the proposed model and yields better results, achieving the following effects:
1. the text category can be pre-judged from the correlations among the texts;
2. a new model based on the graph neural network is constructed to improve classification precision.
Detailed Description
The following is a preferred embodiment that further describes the technical solution of the invention; the invention is not limited to this embodiment.
The text classification method based on a graph convolutional neural network in this embodiment comprises three main steps:
Step one: input texts and construct a similarity graph;
Step two: construct the deep learning model;
Step three: combine the similarity graph and the deep learning model into the overall text classification model and output the classification result.
To implement the method, a Python runtime environment is deployed and configured.
Step one: input texts and construct a similarity graph.
as with the pixel neighborhood system processed using CNNs, the GCN network will pass information for each node to its neighborhood rather than processing each feature separately. This is why we have to construct a well organized graph to be able to better reveal the context between the texts. Also, a deep learning model is required for classification. Therefore, it is very important to select a measure capable of explaining the similarity between nodes. Our method is based on a dataset of N texts. Each text in the database has a d-dimensional tuple containing the base features. The construction of the adjacency matrix will use all this information and the result will be G = (V, E). V is a set of nodes, each node representing a text. All nodes in the training set and the test set are included, and the number of elements in V is N. E is the set of edges in the graph, and the existence of an edge between two nodes is determined by the similarity between the two nodes. Our goal is to binary classify the nodes in the graph. We consider each text as a node in the graph, called n i . One text contains a set M of H-type phenotype features, i.e., M = { M = { (M) } h }. The similarity between nodes in the graph is defined as follows:
$\mathrm{Sim}(A_v, A_w) = \mathrm{dist}(A_v, A_w)\sum_{h=1}^{H}\gamma\big(M_h(v), M_h(w)\big)$

where $\mathrm{Sim}(A_v, A_w)$ denotes the similarity between node v and node w; the more similar the two nodes, the larger the value. $M_h(v)$ denotes the h-th feature of the v-th text. For categorical (classification) information, $\gamma$ is the indicator function:

$\gamma\big(M_h(v), M_h(w)\big) = \mathbb{1}\left[M_h(v) = M_h(w)\right]$

For quantitative measures, the definition of $\gamma$ differs slightly:

$\gamma\big(M_h(v), M_h(w)\big) = \mathbb{1}\left[\,|M_h(v)-M_h(w)| < \theta\,\right]$

where $\theta$ is a threshold that influences the result. Finally, $\mathrm{dist}(A_v, A_w)$ is defined as:

$\mathrm{dist}(A_v, A_w) = \exp\!\left(-\dfrac{\rho\big(x(v), x(w)\big)^2}{2\sigma^2}\right)$

In this formula, $\sigma$ determines the width of the kernel, $x(v)$ is the feature vector of the v-th text, and $\rho$ denotes the correlation distance. Whether an edge exists between two nodes is determined by the similarity value: if the similarity is greater than $\lambda$, an edge is generated between the nodes. Our graph is composed of these edges, and its adjacency matrix is denoted A.
Step two: constructing a deep learning model;
A simplified graph convolutional neural network and a graph attention network are used in our model. The layer structure of the simplified graph convolutional network is:

$H^{(k+1)} = S\, H^{(k)}\, \Theta^{(k)}, \qquad H^{(0)} = X$

where $\bar{X}$ denotes the output of the convolutional layer, $\Theta$ is the trained parameter matrix, and X is the input matrix of the network, formed from the feature vector of each text, i.e. $X = [x_1, \ldots, x_n]^{T}$. S denotes the normalized adjacency matrix with self-loops, $S = \tilde{D}^{-1/2} \tilde{A}\, \tilde{D}^{-1/2}$, where $\tilde{A} = A + I$ and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

Collapsing the K linear layers, this formula can be simplified as:

$\bar{X} = S^{K} X \Theta$
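The collapsed propagation $S^{K}X$ above can be sketched as follows; the function name and the dense-matrix implementation are illustrative assumptions (the trainable $\Theta$ is left out, since it is just a final linear map):

```python
import numpy as np

def sgc_features(A, X, K=2):
    """Simplified graph convolution features S^K X, with S the
    symmetrically normalized adjacency matrix with self-loops."""
    A_tilde = A + np.eye(A.shape[0])          # A + I: add self-loops
    d = A_tilde.sum(axis=1)                   # degrees of A_tilde
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # D^{-1/2} (A+I) D^{-1/2}
    out = X.copy()
    for _ in range(K):                        # K propagation steps
        out = S @ out
    return out
```

Because all nonlinearities are removed, $S^{K}X$ can be precomputed once, which is the source of the reduced computational cost the description mentions.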
the structure of the graph attention network is as follows: in this layer, for each vector fed into the layer, the following formula will apply:
wherein,is the feature vector of the jth node after convolutional layer processing, and W is the training parameter. N is a radical of hydrogen i Here the neighborhood of the ith node in the figure, and alpha ij Is defined as:
wherein alpha is ij Is defined as:
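The attention computation can be sketched as below; the function name, the LeakyReLU slope of 0.2, and the inclusion of each node itself in $N_i$ are assumptions for illustration rather than details fixed by the patent:

```python
import numpy as np

def gat_layer(Xbar, A, W, a):
    """One graph-attention layer: e_ij from a shared attention vector
    over [W x_i || W x_j], softmax over each node's neighbourhood N_i
    (here taken as the neighbours plus the node itself)."""
    Z = Xbar @ W.T                             # W x_j for every node
    N = Z.shape[0]
    mask = A + np.eye(N)                       # neighbourhood incl. self
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            pair = np.concatenate([Z[i], Z[j]])
            v = a @ pair                       # a^T [Z_i || Z_j]
            e[i, j] = v if v > 0 else 0.2 * v  # LeakyReLU, slope 0.2
    e = np.where(mask > 0, e, -np.inf)         # restrict to N_i
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)  # softmax over j in N_i
    return alpha @ Z                           # sum_j alpha_ij W x_j
```

Subtracting the row maximum before exponentiating is a standard numerical-stability trick and does not change the softmax result.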
Step three: combine the similarity graph and the deep learning model into the overall text classification model and output the classification result.
In summary, the general formula of the model can be written as:

$Z = \mathrm{softmax}\big(\mathrm{GAT}(S^{K} X \Theta)\big)$

where GAT denotes the graph attention layer defined above. The model then outputs the classification result for each text, yielding the required text categories.
Claims (3)
1. A text classification method based on a graph convolutional neural network, characterized by comprising the following steps:
Step one: input texts and construct a similarity graph: in a dataset containing a plurality of texts, each text has a multidimensional tuple containing basic feature keywords; an adjacency matrix is constructed on the basis of the text dataset, each text is defined as a node, and the existence of an edge between two nodes is determined by the similarity between the two nodes; if the similarity is greater than a certain threshold, an edge is generated between the nodes, so that the node categories in the graph can be pre-judged and the similarity graph constructed;
step two: constructing a deep learning model by using a simplified graph convolution neural network and a graph attention neural network and combining a Softmax function;
Step three: construct an overall text classification model from the similarity graph and the deep learning model and output the classified texts, wherein the overall model is realized with the simplified graph convolutional neural network and the graph attention network combined with Softmax computation; the simplified graph convolution is defined as:

$\hat{Y} = \mathrm{softmax}\big(S^{K} X \Theta\big)$

the graph attention network is defined as:

$h_i = \sigma\!\Big(\sum_{j \in N_i} \alpha_{ij}\, W \bar{x}_j\Big)$

wherein $\alpha_{ij}$ is defined as:

$\alpha_{ij} = \exp(e_{ij}) \big/ \sum_{k \in N_i} \exp(e_{ik})$

and $e_{ij}$ is defined as:

$e_{ij} = \mathrm{LeakyReLU}\!\left(a^{T}[\,W\bar{x}_i \,\|\, W\bar{x}_j\,]\right)$;
in the similarity-graph construction step, the similarity between two nodes is computed as:

$\mathrm{Sim}(A_v, A_w) = \mathrm{dist}(A_v, A_w)\sum_{h=1}^{H}\gamma\big(M_h(v), M_h(w)\big)$

wherein, for categorical features, $\gamma$ is defined as $\gamma(M_h(v), M_h(w)) = \mathbb{1}\left[M_h(v) = M_h(w)\right]$; for quantitative measurements, $\gamma$ is defined as $\gamma(M_h(v), M_h(w)) = \mathbb{1}\left[\,|M_h(v)-M_h(w)| < \theta\,\right]$; and $\mathrm{dist}(A_v, A_w)$ is defined as:

$\mathrm{dist}(A_v, A_w) = \exp\!\left(-\dfrac{\rho\big(x(v), x(w)\big)^2}{2\sigma^2}\right)$.
3. A text classification system based on a graph convolutional neural network, characterized by comprising:
the information input module, which normalizes and imports source texts acquired from an external database;
a text classification module based on the graph convolutional neural network, which classifies the input source texts by applying the text classification method based on a graph convolutional neural network of any one of claims 1-2;
and the information output module, which outputs the text classification results generated in the classification module in the form of a new database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911393728.4A CN111209398B (en) | 2019-12-30 | 2019-12-30 | Text classification method and system based on graph convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111209398A CN111209398A (en) | 2020-05-29 |
CN111209398B true CN111209398B (en) | 2023-01-17 |
Family
ID=70786507
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911393728.4A Active CN111209398B (en) | 2019-12-30 | 2019-12-30 | Text classification method and system based on graph convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111209398B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111737474B (en) * | 2020-07-17 | 2021-01-12 | 支付宝(杭州)信息技术有限公司 | Method and device for training business model and determining text classification category |
CN112085161B (en) * | 2020-08-20 | 2022-12-13 | 清华大学 | Graph neural network method based on random information transmission |
US11562028B2 (en) | 2020-08-28 | 2023-01-24 | International Business Machines Corporation | Concept prediction to create new intents and assign examples automatically in dialog systems |
CN112163069B (en) * | 2020-09-27 | 2024-04-12 | 广东工业大学 | Text classification method based on graph neural network node characteristic propagation optimization |
CN112766376A (en) * | 2021-01-20 | 2021-05-07 | 重庆邮电大学 | Multi-label eye fundus image identification method based on GACNN |
CN112687328B (en) * | 2021-03-12 | 2021-08-31 | 北京贝瑞和康生物技术有限公司 | Method, apparatus and medium for determining phenotypic information of clinical descriptive information |
CN113434668B (en) * | 2021-05-18 | 2022-05-20 | 湘潭大学 | Deep learning text classification method and system based on model fusion |
CN115457531A (en) * | 2021-06-07 | 2022-12-09 | 京东科技信息技术有限公司 | Method and device for recognizing text |
CN116226388B (en) * | 2023-05-08 | 2023-07-21 | 浪潮电子信息产业股份有限公司 | Literature classification method, graphic neural network training method and related components |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977232A (en) * | 2019-03-06 | 2019-07-05 | 中南大学 | A kind of figure neural network visual analysis method for leading figure based on power |
CN110263799A (en) * | 2019-06-26 | 2019-09-20 | 山东浪潮人工智能研究院有限公司 | A kind of image classification method and device based on the study of depth conspicuousness similar diagram |
CN110472003A (en) * | 2019-08-08 | 2019-11-19 | 东北大学 | Social networks text emotion fine grit classification method based on figure convolutional network |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170308790A1 (en) * | 2016-04-21 | 2017-10-26 | International Business Machines Corporation | Text classification by ranking with convolutional neural networks |
US20190251480A1 (en) * | 2018-02-09 | 2019-08-15 | NEC Laboratories Europe GmbH | Method and system for learning of classifier-independent node representations which carry class label information |
- 2019-12-30 CN CN201911393728.4A patent/CN111209398B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN111209398A (en) | 2020-05-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209398B (en) | Text classification method and system based on graph convolution neural network | |
US12099577B2 (en) | Object recognition method and apparatus, electronic device, and readable storage medium | |
US20200311467A1 (en) | Generating multi modal image representation for an image | |
WO2021189922A1 (en) | Method and apparatus for generating user portrait, and device and medium | |
CN109766557B (en) | Emotion analysis method and device, storage medium and terminal equipment | |
WO2023088174A1 (en) | Target detection method and apparatus | |
Petrosyan et al. | Neural network integral representations with the ReLU activation function | |
CN110516950A (en) | A kind of risk analysis method of entity-oriented parsing task | |
Saitulasi et al. | Deep Belief Network and Sentimental analysis for extracting on multi-variable Features to predict Stock market Performance and accuracy | |
CN114154557A (en) | Cancer tissue classification method, apparatus, electronic device, and storage medium | |
Yu et al. | Deep metric learning with dynamic margin hard sampling loss for face verification | |
Lin et al. | Differential privacy protection over deep learning: An investigation of its impacted factors | |
Wu et al. | Heterogeneous representation learning and matching for few-shot relation prediction | |
CN112489689B (en) | Cross-database voice emotion recognition method and device based on multi-scale difference countermeasure | |
CN111858999B (en) | Retrieval method and device based on segmentation difficult sample generation | |
Wang et al. | Instance-aware deep graph learning for multi-label classification | |
Ahan et al. | Social network analysis using data segmentation and neural networks | |
Wang et al. | Variance of the gradient also matters: Privacy leakage from gradients | |
Shim et al. | Fast and accurate interpretation of workload classification model | |
CN116029760A (en) | Message pushing method, device, computer equipment and storage medium | |
Wang et al. | Sequential safe feature elimination rule for l1-regularized regression with kullback–leibler divergence | |
Ahmad et al. | Customer Personality Analysis for Churn Prediction Using Hybrid Ensemble Models and Class Balancing Techniques | |
CN110457543B (en) | Entity resolution method and system based on end-to-end multi-view matching | |
Tomar | A critical evaluation of activation functions for autoencoder neural networks | |
Chin-Purcell et al. | Investigating accuracy disparities for gender classification using convolutional neural networks |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |