CN112529071A - Text classification method, system, computer equipment and storage medium - Google Patents

Text classification method, system, computer equipment and storage medium

Info

Publication number
CN112529071A
CN112529071A (application number CN202011425848.0A)
Authority
CN
China
Prior art keywords
text
graph
classification
corpus
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011425848.0A
Other languages
Chinese (zh)
Other versions
CN112529071B (en)
Inventor
刘勋
宗建华
夏国清
叶和忠
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Institute Of Software Engineering Gu
Original Assignee
South China Institute Of Software Engineering Gu
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Institute Of Software Engineering Gu filed Critical South China Institute Of Software Engineering Gu
Priority to CN202011425848.0A priority Critical patent/CN112529071B/en
Publication of CN112529071A publication Critical patent/CN112529071A/en
Application granted granted Critical
Publication of CN112529071B publication Critical patent/CN112529071B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text classification method, system, computer equipment and storage medium. The method establishes a new high-low order graph convolutional neural network model comprising a high-low order graph convolution layer that simultaneously captures the multi-order neighborhood information of nodes, an information fusion layer that mixes the first-order to high-order features of different neighborhoods, a first-order graph convolution layer and a softmax classification output layer; a training-set text graph network is input to train a text classification model, and a test-set text graph network is then input into the classification model to obtain the classification result. When the embodiment of the invention is used for text classification, it preserves classification efficiency and classification effect while, by simultaneously capturing the multi-order neighborhood information of nodes, solving the problems of complex computation, large parameter quantity, over-smoothing and limited receptive field that arise when existing graph convolutions are applied to text classification, thereby further improving the expression capability of the text classification model, the stability of the model and the precision of the text classification task.

Description

Text classification method, system, computer equipment and storage medium
Technical Field
The invention relates to the technical field of text classification, in particular to a text classification method, a text classification system, computer equipment and a storage medium based on a high-low order graph convolution network.
Background
With the rapid development of Internet technology, social platforms, technical communication platforms, shopping platforms and the like have grown rapidly and continuously generate massive amounts of text data. Because this text contains data of very high value, it has become a favoured object of big-data mining research, and the role of text classification in information processing has become increasingly important. Researchers therefore hope to adopt effective text classification methods to efficiently manage, extract and analyse the useful information in text data, providing strong support for enterprise and social development.
At present, text classification technology has developed from early manual classification, which relied on the prior knowledge of linguistic experts, to deep machine learning. Deep learning models represented by the convolutional neural network (CNN) and the recurrent neural network (RNN) are widely applied to text classification tasks, but these models may ignore global word co-occurrence information in the corpus, and the discontinuous, long-distance semantic information carried by that co-occurrence has an important influence on document classification results. The existing graph convolutional neural network can process data of arbitrary structure and capture global word co-occurrence information, and it can effectively learn a text graph network with rich relationships while preserving the global structural information of the graph during embedding. However, it generally has only two layers: this shallow architecture limits the scale of the receptive field and the expression capability of the model, while a deeper network (more than two layers) drives the values of text nodes of different classes toward a fixed value and thus causes the over-smoothing problem. What is needed is a method that retains the advantages of the conventional graph convolution network for text classification while solving the over-smoothing problem and enlarging the receptive field of the classification model, so as to improve the expression capability of the model and the precision of the text classification task.
Disclosure of Invention
The invention aims to solve the problems of over-smoothing and limited model receptive field that arise when the existing graph convolution network is applied to text classification, and to further improve the expression capability of the text classification model and the precision of the text classification task.
In order to achieve the above objects, it is necessary to provide a text classification method, a system, a computer device and a storage medium for solving the above technical problems.
In a first aspect, an embodiment of the present invention provides a text classification method, where the method includes the following steps:
establishing a high-low order graph convolution neural network model; the high-low order graph convolution neural network model sequentially comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
obtaining a corpus for text classification by the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
preprocessing the corpus set to obtain a training set and a test set;
respectively constructing a training set text graph network and a test set text graph network according to the training set and the test set;
inputting the training set text graph network into a high-low order graph convolutional neural network model, and training by combining a loss function to obtain a text classification model;
and inputting the test set text graph network into the text classification model for testing to obtain a classification result.
Further, if the output of the high-low order graph convolutional neural network model is Z, then:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1)\big)W_2\big)$

where X is the input matrix of the graph, $W_1$ and $W_2$ are respectively the parameter matrix from the input layer to the hidden layer and the parameter matrix from the hidden layer to the output layer, $\hat{A}$ is the regularized adjacency matrix of the graph with self-connections, k is the highest order of graph convolution, $\hat{A}^i X W_1$ ($i = 1, \ldots, k$) denotes the i-order graph convolution, ReLU(·) is the activation function, NMPooling(·) is the information fusion layer, and softmax(·) is the multi-class output function.
Further, the high-low order graph convolution layer includes first-order to k-order graph convolutions based on weight sharing; the order k of the high-low order graph convolution layer is any single order of two or above, or a combination of any plurality of such orders.
Further, the information fusion layer adopts minimum-value-negation information fusion pooling, implemented by the following steps:

calculating the minimum value matrix over the different graph convolutions according to the input matrix X, the parameter matrix $W_1$ and the regularized adjacency matrix $\hat{A}$;

and negating each element value of the minimum value matrix to obtain the pooled graph feature matrix.
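As an illustration only, a minimal NumPy sketch of this minimum-then-negate fusion might look as follows, assuming the k per-order convolution outputs are already available as arrays of identical shape (all names and the toy values are illustrative and not taken from the patent):

```python
import numpy as np

def nm_pooling(order_features):
    """Minimum-negation fusion: element-wise minimum over the k order
    feature matrices, then negation of every element."""
    h_min = np.minimum.reduce(order_features)  # element-wise min across orders
    return -h_min                              # negate to obtain the pooled features

# toy example with k = 2 orders of (nodes x hidden) features
h1 = np.array([[0.2, -0.5], [1.0, 0.3]])
h2 = np.array([[0.1,  0.4], [0.7, -0.2]])
pooled = nm_pooling([h1, h2])   # [[-0.1, 0.5], [-0.7, 0.2]]
```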
Further, the step of preprocessing the corpus to obtain a training set and a test set includes:
performing preprocessing on the titles and documents of the samples in the corpus, including de-duplication, word segmentation, and removal of stop words and special symbols, to obtain the words in the corpus, and forming the words and documents of the corpus into a corpus text group;
and dividing the corpus text group into a training set and a test set according to the quantity proportion.
Further, the step of respectively constructing a training set text graph network and a test set text graph network according to the training set and the test set comprises:
respectively establishing, according to the training set and the test set, a training set text graph and a test set text graph whose feature matrices are identity matrices of the corresponding dimensions;
and determining the adjacency matrixes of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm.
Further, the step of determining the adjacency matrix of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm comprises:
calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the training set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes in the adjacency matrix of the training set text graph according to the PMI algorithm;
and calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the test set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes of the test set text graph according to the PMI algorithm.
In a second aspect, an embodiment of the present invention provides a text classification system, where the system includes:
the classification model establishing module is used for establishing a high-low order graph convolution neural network model; the high-low order graph convolution neural network model sequentially comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
the corpus classification module is used for acquiring a corpus set for text classification by adopting the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
the corpus preprocessing module is used for preprocessing the corpus to obtain a training set and a test set;
the text graph network building module is used for respectively building a training set text graph network and a test set text graph network according to the training set and the test set;
the text classification model training module is used for inputting the training set text graph network into a high-low order graph convolutional neural network model, and training the high-low order graph convolutional neural network model by combining a loss function to obtain a text classification model;
and the text classification test module is used for inputting the test set text graph network into the text classification model for testing to obtain a classification result.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method when executing the computer program.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the above method.
With the method, after the TF-IDF and PMI algorithms are used to construct a training-set text graph network and a test-set text graph network from the preprocessed corpus, the training-set text graph network is input into a high-low order graph convolutional neural network model consisting of an input layer, one high-low order graph convolution layer, one information fusion layer, one first-order graph convolution layer and a softmax output layer; training with the defined loss function determines the parameter matrices of the classification model, and the test-set text is then classified accurately. Compared with the prior art, in text classification applications the method guarantees classification efficiency and classification effect while, by building a two-layer network structure and using high-low order graph convolution to capture the multi-order neighborhood information of nodes, solving the problems of complex computation, large parameter quantity, over-smoothing and limited model receptive field of the existing graph convolution network, thereby further improving the expression capability of the text classification model, the stability of the model and the precision of the text classification task.
Drawings
FIG. 1 is a flow chart illustrating a text classification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the high-low order graph convolutional neural network model structure of FIG. 1;
FIG. 3 is a schematic flow chart of corpus preprocessing of step S13 in FIG. 1;
FIG. 4 is a schematic flow chart illustrating the step S14 in FIG. 1 of constructing a corresponding text graph network according to the training set and the test set;
FIG. 5 is a schematic diagram of the creation of a network of text graphs based on a portion of the data of OH using the method of FIG. 4;
FIG. 6 is a schematic flowchart of the step S142 in FIG. 4 for constructing the adjacency matrix of the text graph according to the TF-IDF algorithm and the PMI algorithm;
FIG. 7 is a schematic structural diagram of a text classification system in an embodiment of the invention;
fig. 8 is an internal structural diagram of a computer device in the embodiment of the present invention.
Detailed Description
In order to make the purpose, technical solution and advantages of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments, and it is obvious that the embodiments described below are part of the embodiments of the present invention, and are used for illustrating the present invention only, but not for limiting the scope of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The text classification method provided by the invention can be applied to a terminal or a server. The adopted high-low order graph convolutional neural network model (NMGC) is an improvement on the existing graph convolutional network model; it can also complete other similar fully-supervised classification tasks, and a text corpus is preferably adopted for training and testing.
In one embodiment, as shown in fig. 1, there is provided a text classification method, including the steps of:
s11, establishing a high-low order graph convolutional neural network model; the high-low order graph convolutional neural network model comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
wherein the high-low order graph convolutional neural network model contains exactly one high-low order graph convolution layer and one first-order graph convolution layer. If the output of the high-low order graph convolutional neural network model is Z, then:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1)\big)W_2\big)$    (1)

where X is the input matrix of the graph, $W_1$ and $W_2$ are respectively the parameter matrix from the input layer to the hidden layer and the parameter matrix from the hidden layer to the output layer, $\hat{A}$ is the regularized adjacency matrix of the graph with self-connections, k is the highest order of graph convolution, $\hat{A}^i X W_1$ ($i = 1, \ldots, k$) denotes the i-order graph convolution, ReLU(·) is the activation function, NMPooling(·) is the information fusion layer, and softmax(·) is the multi-class output function. The specific model structure is shown in FIG. 2.
The high-low order graph convolution layer in the present embodiment includes first-order to k-order graph convolutions based on weight sharing, i.e. $\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1$. The high-low order graph convolution captures the first-order neighborhood information of the text nodes through the first-order graph convolution $\hat{A}XW_1$, and captures the higher-order neighborhood information of the text nodes through the second- to k-order graph convolutions $\hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1$, which enlarges the receptive field of the model and further strengthens its learning ability. The order k of the high-low order graph convolution layer can be any single order of two or above, or a combination of any plurality of such orders. When k = 2, i.e. the adopted model is the NMGC-2 model with a mixed neighborhood of the 1st and 2nd orders, the formula is:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1)\big)W_2\big)$    (2)

When k = 3, i.e. the adopted model is the NMGC-3 model with a mixed neighborhood of the 1st, 2nd and 3rd orders, the formula is:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \hat{A}^3XW_1)\big)W_2\big)$    (3)

When k = n, the adopted model is the NMGC-n model with a mixed neighborhood from the 1st to the n-th order, and the formula is:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^nXW_1)\big)W_2\big)$    (4)
In the model, each order neighborhood within the same graph convolution layer uses the same weight parameters, which realizes weight sharing and reduces the parameter quantity; this is specifically reflected in the choice of the parameters W1 and W2 in formulas (1) to (4).
When the method is actually applied to large-scale text graph network training, $\hat{A}^k X W_1$ needs to be calculated first. Since $\hat{A}$ is usually a sparse matrix with m non-zero elements, and the high-low order graph convolution adopts a weight sharing mechanism, $\hat{A}^k X W_1$ is calculated by multiplication from right to left. For example, when k = 2, $\hat{A}^2 X W_1$ is obtained as $\hat{A}(\hat{A} X W_1)$; in the same way, the k-order graph convolution is calculated by left-multiplying the (k-1)-order graph convolution by $\hat{A}$, i.e. $\hat{A}^k X W_1 = \hat{A}(\hat{A}^{k-1} X W_1)$. This calculation method effectively reduces the computational complexity. In addition, since the graph convolutions of different orders adopt a weight sharing mechanism, the parameters of the high-order graph convolutions are the same as those of the first-order graph convolution. Assuming $\hat{A} \in \mathbb{R}^{n \times n}$ (n nodes), $X \in \mathbb{R}^{n \times r_0}$ ($r_0$ attribute feature dimensions), $W_1 \in \mathbb{R}^{r_0 \times r_1}$ ($r_1$ filters) and $W_2 \in \mathbb{R}^{r_1 \times r_2}$ ($r_2$ filters), the time complexity and the parameter quantity of the high-low order graph convolution model are $O(k \times m \times r_0 \times r_1)$ and $O(r_0 \times r_1)$ respectively, which guarantees the computational efficiency of the high-order graph convolution to a certain extent.
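As an illustration of the right-to-left computation described above, the following SciPy sketch computes the k order terms without ever forming $\hat{A}^i$ explicitly; the function name and the toy dimensions are assumptions, and A_hat stands in for a pre-computed regularized adjacency matrix:

```python
import numpy as np
import scipy.sparse as sp

def high_low_order_conv(A_hat, X, W1, k):
    """Compute [A_hat X W1, A_hat^2 X W1, ..., A_hat^k X W1] by repeated
    left-multiplication with the sparse A_hat, so A_hat^i is never formed."""
    H = A_hat @ (X @ W1)          # first-order term: A_hat X W1
    outputs = [H]
    for _ in range(k - 1):        # each step raises the order by one
        H = A_hat @ H             # A_hat^i X W1 = A_hat (A_hat^(i-1) X W1)
        outputs.append(H)
    return outputs

# usage on a toy graph with 4 nodes, 3 input features, 2 filters, k = 3
A_hat = sp.identity(4, format="csr")   # stands in for the real regularized adjacency
X = np.random.rand(4, 3)
W1 = np.random.rand(3, 2)
H1, H2, H3 = high_low_order_conv(A_hat, X, W1, k=3)
```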
S12, acquiring a corpus of text classification by adopting the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
the text classification corpus can be selected according to actual needs, in the application, supervised text data sets of R52 and R8, 20-Newsgroups (20NG), Ohsated (OH) and Movie Review (MR) of Reuters 21578 are adopted, and specific information of the data sets is as follows: the 20NG data set includes 18846 newsgroup documents, without duplicate documents, divided into 20 different classes, of all the newsgroup documents, 11314 documents were used for training, and the remaining 7532 documents were used as test sets; the OH dataset is a medical dataset from the MEDLINE database, 7400 medical documents were selected, of which 3357 documents were used for training, while the remaining 4043 documents were taken as a test set, dividing into 23 different classes. R52 and R8 are two subsets of Reuters 21578, and are divided into 52, 8 different classes, respectively, where the number of training and test documents of the R52 dataset is 6532 and 2568, respectively, and the number of training and test documents of the R8 dataset is 5485 and 2189, respectively. MR is a movie review data set with 10662 review documents, half as many positive and negative review documents, each containing only one sentence, using 7108 review documents as training and 3554 review documents as testing. 10% of the data in the training set of data above will be used as validation, and the specific information of the supervised text data set is shown in Table 1.
TABLE 1 supervision text data set
Data set Number of documents Number of words Training Testing Categories Number of nodes Average length
R52 9,100 8,892 6,532 2,568 52 17,992 69.82
R8 7,674 7,688 5,485 2,189 8 15,362 65.72
20NG 18,846 42,757 11,314 7,532 20 61,603 221.26
OH 7,400 14,157 3,357 4,043 23 21,557 135.82
MR 10,662 18,764 7,108 3,554 2 29,426 20.39
S13, preprocessing the corpus to obtain a training set and a test set;
wherein, the step S13 of preprocessing the corpus to obtain a training set and a test set includes, as shown in fig. 3:
s131, preprocessing the titles and documents of the samples in the corpus, including de-duplication, word segmentation, and removal of stop words and special symbols, to obtain the words in the corpus, and forming the words and documents of the corpus into a corpus text group;
A corpus collected from the web usually comprises only documents and titles, while the processing of actual text data relies more on the words inside those documents and titles. Therefore, certain preprocessing is carried out before model training: with tools such as python nltk, word segmentation logic can be customized to the user's requirements, word segmentation is performed on the titles and body text of all samples in the corpus, and stop words and special symbols are removed. The words of the corpus are thereby obtained, and the resulting corpus words and documents form a corpus text group for subsequent analysis.
And S132, dividing the corpus text group into a training set and a test set according to the quantity proportion.
When the corpus text group is used to train the graph convolution model, the collected data are divided into a training set and a test set in a certain quantity proportion as required; the training set is used to train and optimize the parameters of the model to determine the final model, and the test set is classified directly with the determined model.
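A possible preprocessing sketch along these lines, using the python nltk tooling mentioned above, is given below; the tokenizer, the English stop-word list and the duplicate-detection rule are assumptions rather than choices prescribed by the patent (nltk's 'punkt' and 'stopwords' resources must be downloaded beforehand):

```python
import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

STOP_WORDS = set(stopwords.words("english"))

def preprocess(text):
    """Tokenise, lower-case, and drop stop words and special symbols."""
    tokens = word_tokenize(text.lower())
    # keep only purely alphabetic tokens that are not stop words
    return [t for t in tokens if t not in STOP_WORDS and re.fullmatch(r"[a-z]+", t)]

def build_corpus_text_group(samples):
    """samples: list of (title, document) pairs; returns de-duplicated
    documents together with the vocabulary extracted from them."""
    docs, vocab, seen = [], set(), set()
    for title, document in samples:
        words = preprocess(title + " " + document)
        key = " ".join(words)
        if key in seen:           # drop duplicate samples
            continue
        seen.add(key)
        docs.append(words)
        vocab.update(words)
    return docs, sorted(vocab)
```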
S14, respectively constructing a training set text graph network and a test set text graph network according to the training set and the test set;
s15, inputting the training set text graph network into a high-low order graph convolutional neural network model, and training by combining a loss function to obtain a text classification model;
wherein the loss function used in the model training is the cross-entropy over the labeled nodes:

$L = -\sum_{l \in \mathcal{Y}_L} \sum_{m=1}^{M} Y_{lm} \ln Z_{lm}$

where $\mathcal{Y}_L$ is the set of labeled vertices (nodes), M is the number of classes, $Y_{lm}$ denotes the real label of labeled node l, and $Z_{lm}$ denotes the probability value between 0 and 1 predicted by softmax for the input labeled node.
When the feature matrix and the regularized adjacency matrix of the training set text graph are input into the high-low order graph convolutional neural network model as input matrices for training and learning, the parameters of the graph convolutions are updated by gradient descent to preliminarily determine the text classification model; the 10% validation set reserved in the training set is then fed into the model and, in combination with the defined loss function, the parameters are adjusted to finally obtain a stable text classification model, after which the test set is used to obtain the classification result, which ensures the classification precision of the classification model.
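A hedged sketch of such a training loop is shown below in PyTorch, assuming an NMGC module of the kind sketched later in this description that outputs log-probabilities; the optimizer, learning rate and epoch count are illustrative and not prescribed by the patent:

```python
import copy
import torch
import torch.nn.functional as F

def train(model, A_hat, X, labels, train_mask, val_mask, epochs=200, lr=0.02):
    """Full-batch training with cross-entropy over the labelled document nodes;
    the parameters that perform best on the validation split are kept."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val, best_state = 0.0, None
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        log_probs = model(A_hat, X)                         # log-softmax over classes
        loss = F.nll_loss(log_probs[train_mask], labels[train_mask])
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            pred = model(A_hat, X).argmax(dim=1)
            val_acc = (pred[val_mask] == labels[val_mask]).float().mean().item()
        if val_acc > best_val:                              # keep the best validation model
            best_val = val_acc
            best_state = copy.deepcopy(model.state_dict())
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```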
And S16, inputting the test set text graph network into the text classification model for testing to obtain a classification result.
In this embodiment of the application, out of consideration for the generalization ability of the model, important benchmark data sets for text classification are adopted for parameter training of the classification model; since these data sets contain no duplicate data, the workload of model training is reduced to a certain extent and training efficiency is improved. Secondly, a high-low order graph convolution network model with only two graph convolution layers is established, which reduces the training parameters while alleviating the over-smoothing phenomenon of the trained model, and thereby improves the generality of the classification model obtained by training.
In one embodiment, the information fusion layer in formula (1) of the invention adopts minimum-value-negation information fusion pooling, which is calculated as follows:

according to the input matrix X, the parameter matrix $W_1$ and the regularized adjacency matrix $\hat{A}$, the minimum value matrix over the different graph convolutions is calculated element-wise:

$H_{\min} = \min\big(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1\big)$

each element value of the minimum value matrix $H_{\min}$ is then negated to obtain the pooled graph feature matrix, i.e. $H_{nm} = -H_{\min}$.
The information fusion in the above embodiment is illustrated with a specific third-order example; the higher-order case is similar. Suppose the order k of the neighborhood is 3, the first-order neighborhood is $H_1$, the second-order neighborhood is $H_2$ and the third-order neighborhood is $H_3$. The information fusion process is then:

(1) compute the element-wise minimum: $H_{\min}(i,j) = \min\big(H_1(i,j),\ H_2(i,j),\ H_3(i,j)\big)$;

(2) negate each element value of $H_{\min}$, giving $H_{nm} = -H_{\min}$.
The implementation process of the NMPooling-based high-low order graph convolution algorithm in this embodiment is as follows:

Input: the regularized adjacency matrix $\hat{A}$, the feature matrix $H^{(0)} = X$ and the weight matrix W

Convolution operation: $H_i = \hat{A}^i X W$ for $i = 1, \ldots, k$

Information fusion: $H_{nm} = \mathrm{NMPooling}(H_1, H_2, \ldots, H_k)$

Nonlinear activation: $H = \mathrm{ReLU}(H_{nm})$
In this embodiment, the text graph network is first fed into the high-low order graph convolution and processed by the above algorithm; NMPooling information fusion then mixes the first-order to high-order features of the different neighborhoods, and after nonlinear activation the result is fed into the classical first-order graph convolution to further learn the representation of the text graph, finally yielding the classification probabilities. This way of learning the global graph topology retains more and richer feature information during learning and thereby further improves the learning effect.
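Putting the pieces together, a minimal PyTorch sketch of the whole forward pass, read from equations (1)-(4) under the assumption of dense matrices and a log-softmax output, might look like this (class and variable names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NMGC(nn.Module):
    """High-low order graph convolution: weight-shared orders 1..k,
    min-negation fusion, ReLU, then a first-order graph convolution."""
    def __init__(self, in_dim, hidden_dim, num_classes, k=2):
        super().__init__()
        self.k = k
        self.W1 = nn.Parameter(torch.empty(in_dim, hidden_dim))
        self.W2 = nn.Parameter(torch.empty(hidden_dim, num_classes))
        nn.init.xavier_uniform_(self.W1)
        nn.init.xavier_uniform_(self.W2)

    def forward(self, A_hat, X):
        H = A_hat @ (X @ self.W1)                 # first-order term, shared weights W1
        orders = [H]
        for _ in range(self.k - 1):
            H = A_hat @ H                         # next order by left-multiplying A_hat
            orders.append(H)
        h_min = torch.stack(orders, dim=0).min(dim=0).values
        h_nm = -h_min                             # NMPooling: element-wise min, then negate
        h = F.relu(h_nm)
        logits = A_hat @ (h @ self.W2)            # first-order graph convolution output layer
        return F.log_softmax(logits, dim=1)       # log of the softmax in eq. (1)
```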
In one embodiment, as shown in fig. 4, the step S14 of constructing a training set text graph network and a test set text graph network respectively according to the training set and the test set includes:
s141, respectively establishing a training set text chart and a test set text chart of which feature matrixes are corresponding dimension unit matrixes according to the training set and the test set;
In text classification training, converting the text corpus into the corresponding text graphs is a necessary step for machine training. In this embodiment, the training set text graph and the test set text graph are both necessary inputs of the high-low order graph convolutional neural network model, so corresponding text graphs need to be established from the text data of the training set and of the test set respectively. For example, the training set text graph is G = (V, E), where V is the vertex set composed of all words and all documents of the training set text (that is, the number of nodes in the text graph network is the number of documents plus the number of words, i.e. the sum of the corpus size and the vocabulary size), and E is the edge set comprising all dependencies between pairs of words in the training set and between words and documents. The text graph of the test set is obtained in the same way.
As shown in FIG. 5, a text graph network is established for a part of the Ohsumed corpus. The nodes beginning with "O" are document nodes and the other nodes are word nodes; the gray lines represent edges between words, and the black lines represent edges between documents and words. Document nodes of the same color belong to the same class, and document nodes of different colors belong to different classes. In this embodiment, the feature matrices corresponding to the training set text graph and the test set text graph are set to identity matrices of the corresponding dimensions, i.e. a one-hot code is used as the model input for every word and document.
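As a small illustration of this construction, and assuming the document and vocabulary lists produced by a preprocessing step like the one sketched earlier, the node index and the one-hot (identity) feature matrix could be assembled as follows (names are illustrative):

```python
import scipy.sparse as sp

def build_nodes_and_features(docs, vocab):
    """Node set = all documents followed by all words; features are one-hot,
    i.e. the identity matrix of dimension (num_docs + num_words)."""
    doc_ids = {f"doc_{i}": i for i in range(len(docs))}
    word_ids = {w: len(docs) + j for j, w in enumerate(vocab)}
    n_nodes = len(docs) + len(vocab)
    X = sp.identity(n_nodes, format="csr")   # one-hot code for every node
    return doc_ids, word_ids, X
```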
And S142, determining the adjacency matrixes of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm.
The adjacency matrix of the text graph comprises the weights of word-document edges, word-word edges and document-document edges. In this embodiment, the edges between words and documents are established according to the number of times a word appears in a document, and the edges between words are established according to word co-occurrence.
As shown in fig. 6, the step S142 of determining the adjacency matrices of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm includes:
s1421, calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the training set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes in the adjacency matrix of the training set text graph according to the PMI algorithm;
s1422, calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the test set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes of the test set text graph according to the PMI algorithm.
The weights of the edges connecting document nodes and word nodes are calculated according to term frequency-inverse document frequency (TF-IDF). The term frequency (TF) represents the number of times a given word appears in a document: the larger its value, the greater the contribution of the given word to the document, while a small value indicates a low, or even negligible, contribution. The term frequency is expressed as:

$tf_{j,k} = n_{j,k} / \sum_i n_{i,k}$

where $n_{j,k}$ is the number of times word j appears in document k and $\sum_i n_{i,k}$ is the total number of occurrences of all words in document k. The inverse document frequency (IDF) reflects the ability of a given word to discriminate between documents: the larger the inverse document frequency, the fewer documents contain the given word and the stronger its discriminating ability. The inverse document frequency is calculated as:

$idf_j = \log\big(D / |\{k : t_j \in d_k\}|\big)$

where D is the number of all documents in the corpus and $|\{k : t_j \in d_k\}|$ is the number of documents containing the given word $t_j$. TF-IDF accounts for the influence of a given word on a particular document through TF and also describes the importance of the given word over the whole document collection through IDF; it is defined as the product of term frequency and inverse document frequency:

$\text{TF-IDF} = TF \times IDF$
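A hedged sketch of the TF-IDF weight between a document node and a word node, following the formulas above, is given below; it applies no smoothing, which is an assumption rather than something the patent specifies:

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """docs: list of token lists. Returns {(doc_index, word): TF-IDF weight}."""
    D = len(docs)
    df = Counter()                              # number of documents containing each word
    for tokens in docs:
        df.update(set(tokens))
    weights = {}
    for k, tokens in enumerate(docs):
        counts = Counter(tokens)
        total = len(tokens)
        for word, n_jk in counts.items():
            tf = n_jk / total                   # term frequency in document k
            idf = math.log(D / df[word])        # inverse document frequency
            weights[(k, word)] = tf * idf
    return weights
```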
to exploit global word co-occurrence information, a sliding window of fixed size is set for all documents in the corpus to aggregate co-occurrence features. The present embodiment utilizes the PMI algorithm to measure the correlation between words, calculates the weight between two word nodes, and the PMI value of the word j and the word k is defined as:
PMI(j,k)=log p(j,k)/p(j)p(k),
where p (j, k) is W (j, k)/W, p (j) is W (j)/W, W (j, k) represents the number of sliding windows including word j and word k, W represents the total number of sliding windows, and W (j) represents the number of sliding windows including word j. The larger the value of PMI, the stronger the semantic relevance of word j to word k, the smaller the value of PMI, the weaker the semantic relevance of word j to word k, when the PMI value is negative, the very weak or no relevance between word j and word k. Establishing edges between words only takes into account the fact that the PMI value is positive.
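A corresponding sketch of the PMI weights between word nodes over fixed-size sliding windows follows; the window size is an assumed hyper-parameter, and as stated above only positive PMI values are kept as edges:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_weights(docs, window_size=20):
    """docs: list of token lists. Returns {(word_j, word_k): PMI} for positive PMI only."""
    single = Counter()     # W(j): number of windows containing word j
    pair = Counter()       # W(j, k): number of windows containing both j and k
    total_windows = 0
    for tokens in docs:
        n = max(len(tokens) - window_size + 1, 1)
        for start in range(n):
            window = set(tokens[start:start + window_size])
            total_windows += 1
            single.update(window)
            pair.update(frozenset(p) for p in combinations(sorted(window), 2))
    weights = {}
    for key, w_jk in pair.items():
        j, k = tuple(key)
        pmi = math.log((w_jk / total_windows) /
                       ((single[j] / total_windows) * (single[k] / total_windows)))
        if pmi > 0:                              # keep only positive correlations as edges
            weights[(j, k)] = pmi
    return weights
```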
To sum up, in this embodiment, the element values in the adjacency matrix corresponding to the text graph, that is, the weights for constructing the network edge of the text graph, are defined as follows:
$A_{ij} = \begin{cases} \mathrm{PMI}(i,j), & i,\ j \text{ are words and } \mathrm{PMI}(i,j) > 0 \\ \text{TF-IDF}_{ij}, & i \text{ is a document and } j \text{ is a word} \\ 1, & i = j \\ 0, & \text{otherwise} \end{cases}$
After the training set text graph network is constructed, the feature matrix and the adjacency matrix of the graph are passed into the high-low order graph convolutional neural network model for training.
In this embodiment, after the training set and test set text corpora are converted into the corresponding text graphs, determining the adjacency matrices of the text graphs with the TF-IDF and PMI algorithms captures global word co-occurrence information while also accounting for discriminative power over documents, thereby providing an accurate weighting of the text graph information; this further improves the model training effect and the precision of model classification.
In this embodiment of the application, classification tests are performed on the supervised text data sets R52, R8, 20NG, OH and MR. It is found that in the text classification task the high-low order graph convolutional neural network models with k = 2 and k = 3 already perform very well in terms of classification accuracy and computational complexity, whereas a k value of 4 or higher reduces the text classification accuracy. Therefore only the comparisons of classification effect, model parameters and computational complexity between the NMGC-2 and NMGC-3 models (i.e. only the cases k = 2 and k = 3) and other existing classification models are given, as shown in Tables 2-4 below:
table 2 NMGC-2 and NMGC-3 comparison of tests based on the same text data set with the existing model
(The data of Table 2 are reproduced as an image in the original publication.)
Table 2 illustrates: the accuracy in the table is expressed as a percentage and the number is the average of 10 runs.
TABLE 3 NMGC-2 and NMGC-3 model classification result comparison table for different hidden neuron numbers
(The data of Table 3 are reproduced as an image in the original publication.)
TABLE 4 comparison of computational complexity and parameter values for NMGC-2, NMGC-3 and GCN models
(The data of Table 4 are reproduced as an image in the original publication.)
Table 4 illustrates: 1. 2 and 3 represent the graph convolution order, and 200, 64, 128 and 32 represent the number of hidden neurons.
Based on the above experimental results, this embodiment provides a high-low order graph convolutional neural network model (NMGC) comprising a high-low order graph convolution that simultaneously captures the correlations between low-order and high-order neighborhood text nodes and an NMPooling information fusion layer that mixes the first-order to high-order features of different neighborhoods. In text classification it retains more and richer feature information and learns the global graph topology, which not only broadens the receptive field but also improves the expression capability of the model. In addition, weight sharing across the different-order convolutions and a small number of hidden neurons reduce the computational complexity and the parameter quantity and avoid overfitting of the model. The experimental results on the five benchmark text data sets show that text classification with the high-low order graph convolutional neural network model, which adopts a classical first-order graph convolution as its output layer, has clear advantages in classification precision, classification performance and parameter quantity, and that the method is the most stable while achieving the highest precision.
It should be noted that, although the steps in the above flowcharts are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence; unless explicitly stated otherwise, the steps are not bound to the exact order shown and described and may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may comprise multiple sub-steps or stages, which are not necessarily performed at the same moment but may be executed at different times, and their order of execution is not necessarily sequential; they may be performed in turn or in alternation with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 7, there is provided a text classification system, the system comprising:
a classification model establishing module 71, configured to establish a high-low level graph convolution neural network model; the high-low order graph convolution neural network model sequentially comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
a corpus classifying module 72, configured to obtain a corpus set for text classification using the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
a corpus preprocessing module 73, configured to preprocess the corpus to obtain a training set and a test set;
a text graph network building module 74, configured to build a training set text graph network and a test set text graph network according to the training set and the test set, respectively;
the text classification model training module 75 is configured to input the training set text graph network into a high-low order graph convolutional neural network model, and perform training in combination with a loss function to obtain a text classification model;
and a text classification test module 76, configured to input the test set text graph network into the text classification model for testing, so as to obtain a classification result.
For the specific definition of the text classification system, reference may be made to the above definition of the text classification method, which is not described herein again. The various modules in the text classification system described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
Fig. 8 shows an internal structure diagram of a computer device in one embodiment; the computer device may specifically be a terminal or a server. As shown in fig. 8, the computer device includes a processor, a memory, a network interface, a display and an input device, which are connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a text classification method. The display screen of the computer device can be a liquid crystal display or an electronic ink display, and the input device of the computer device can be a touch layer covering the display screen, a key, a track ball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
It will be appreciated by those of ordinary skill in the art that the architecture shown in FIG. 8 is merely a block diagram of some of the structures associated with the present solution and is not intended to limit the computing devices to which the present solution may be applied; a particular computing device may include more or fewer components than those shown in the drawing, or combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the steps of the above method being performed when the computer program is executed by the processor.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method.
To sum up, the embodiments of the present invention provide a text classification method, system, computer device and storage medium. On the basis of fully considering the problems of existing approaches to text classification, such as easily ignored global word co-occurrence information, a narrow receptive field, an over-smoothed model and insufficient expression capability, the text classification method based on the high-low order graph convolutional network provides a way of classifying text with a new high-low order graph convolutional neural network model comprising a high-low order graph convolution layer that captures the multi-order neighborhood information of nodes, an NMPooling information fusion layer that mixes the first-order to high-order features of different neighborhoods, a first-order graph convolution layer and a softmax classification output layer. When applied to actual text classification, the method captures both low-order and high-order neighborhood information of the text nodes through the high-low order graph convolution layer, obtaining more and richer text node information so as to broaden the receptive field and improve the expression capability of the model; it also reduces the computational complexity and parameter quantity of the model by sharing weights across the different-order convolutions and setting a small number of hidden neurons, thereby avoiding overfitting and improving the stability of the model and the precision of text classification.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above.
The embodiments in this specification are described in a progressive manner, and all the same or similar parts of the embodiments are directly referred to each other, and each embodiment is described with emphasis on differences from other embodiments. In particular, for embodiments of the system, the computer device, and the storage medium, since they are substantially similar to the method embodiments, the description is relatively simple, and in relation to the description, reference may be made to some portions of the description of the method embodiments. It should be noted that, the technical features of the embodiments may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express some preferred embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for those skilled in the art, various modifications and substitutions can be made without departing from the technical principle of the present invention, and these should be construed as the protection scope of the present application. Therefore, the protection scope of the present patent shall be subject to the protection scope of the claims.

Claims (10)

1. A method of text classification, the method comprising the steps of:
establishing a high-low order graph convolution neural network model; the high-low order graph convolution neural network model sequentially comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
obtaining a corpus for text classification by the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
preprocessing the corpus set to obtain a training set and a test set;
respectively constructing a training set text graph network and a test set text graph network according to the training set and the test set;
inputting the training set text graph network into a high-low order graph convolutional neural network model, and training by combining a loss function to obtain a text classification model;
and inputting the test set text graph network into the text classification model for testing to obtain a classification result.
2. The text classification method of claim 1, wherein, if the output of the high-low order graph convolutional neural network model is Z, then:

$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}\big(\mathrm{NMPooling}(\hat{A}XW_1,\ \hat{A}^2XW_1,\ \ldots,\ \hat{A}^kXW_1)\big)W_2\big)$

where X is the input matrix of the graph, $W_1$ and $W_2$ are respectively the parameter matrix from the input layer to the hidden layer and the parameter matrix from the hidden layer to the output layer, $\hat{A}$ is the regularized adjacency matrix of the graph with self-connections, k is the highest order of graph convolution, $\hat{A}^i X W_1$ ($i = 1, \ldots, k$) denotes the i-order graph convolution, ReLU(·) is a nonlinear activation function, NMPooling(·) is an information fusion layer, and softmax(·) is a multi-class output function.
3. The text classification method of claim 2, wherein the high-low order graph convolution layer includes first-order to k-order graph convolutions based on weight sharing; the order k of the high-low order graph convolution layer is any single order of two or above, or a combination of any plurality of such orders.
4. The text classification method according to claim 2, characterized in that the information fusion layer employs minimum-value-negation information fusion pooling, which is implemented by the steps of:

calculating the minimum value matrix over the different graph convolutions according to the input matrix X, the parameter matrix $W_1$ and the regularized adjacency matrix $\hat{A}$;

and negating each element value of the minimum value matrix to obtain the pooled graph feature matrix.
5. The method for classifying texts according to claim 1, wherein the step of preprocessing the corpus to obtain a training set and a test set comprises:
performing preprocessing on the titles and documents of the samples in the corpus, including de-duplication, word segmentation, and removal of stop words and special symbols, to obtain the words in the corpus, and forming the words and documents of the corpus into a corpus text group;
and dividing the corpus text group into a training set and a test set according to the quantity proportion.
6. The text classification method according to claim 1, wherein the step of constructing a training set text graph network and a test set text graph network from the training set and the test set, respectively, comprises:
respectively establishing, according to the training set and the test set, a training set text graph and a test set text graph whose feature matrices are identity matrices of the corresponding dimensions;
and determining the adjacency matrixes of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm.
7. The text classification method of claim 6, wherein the step of determining the adjacency matrices of the training set text graph and the test set text graph according to the TF-IDF algorithm and the PMI algorithm comprises:
calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the training set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes in the adjacency matrix of the training set text graph according to the PMI algorithm;
and calculating the weights of the edges connecting document nodes and word nodes in the adjacency matrix of the test set text graph according to the TF-IDF algorithm, and calculating the weights of the edges connecting word nodes of the test set text graph according to the PMI algorithm.
8. A text classification system, the system comprising:
the classification model establishing module is used for establishing a high-low order graph convolution neural network model; the high-low order graph convolution neural network model sequentially comprises an input layer, a high-low order graph convolution layer, an information fusion layer, a first order graph convolution layer and an output layer;
the corpus classification module is used for acquiring a corpus set for text classification by adopting the high-low order graph convolutional neural network model; the corpus comprises a plurality of samples, each sample containing a document and a title;
the corpus preprocessing module is used for preprocessing the corpus to obtain a training set and a test set;
the text graph network building module is used for respectively building a training set text graph network and a test set text graph network according to the training set and the test set;
the text classification model training module is used for inputting the training set text graph network into a high-low order graph convolutional neural network model, and training the high-low order graph convolutional neural network model by combining a loss function to obtain a text classification model;
and the text classification test module is used for inputting the test set text graph network into the text classification model for testing to obtain a classification result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202011425848.0A 2020-12-08 2020-12-08 Text classification method, system, computer equipment and storage medium Active CN112529071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011425848.0A CN112529071B (en) 2020-12-08 2020-12-08 Text classification method, system, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011425848.0A CN112529071B (en) 2020-12-08 2020-12-08 Text classification method, system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112529071A true CN112529071A (en) 2021-03-19
CN112529071B CN112529071B (en) 2023-10-17

Family

ID=74996781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011425848.0A Active CN112529071B (en) 2020-12-08 2020-12-08 Text classification method, system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112529071B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200151289A1 (en) * 2018-11-09 2020-05-14 Nvidia Corp. Deep learning based identification of difficult to test nodes
CN110929029A (en) * 2019-11-04 2020-03-27 中国科学院信息工程研究所 Text classification method and system based on graph convolution neural network
CN111159425A (en) * 2019-12-30 2020-05-15 浙江大学 Temporal knowledge graph representation method based on historical relationship and double-graph convolution network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MICHAEL EDWARDS et al.: "Graph convolutional neural network for multi-scale feature learning", Elsevier Science, pages 1-12 *
ZHOU Ajian: "Research on visual tracking based on deep structural feature representation learning", Wanfang Data Knowledge Service Platform thesis database, pages 1-65 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792144A (en) * 2021-09-16 2021-12-14 南京理工大学 Text classification method based on semi-supervised graph convolution neural network
CN113792144B (en) * 2021-09-16 2024-03-12 南京理工大学 Text classification method of graph convolution neural network based on semi-supervision
CN113961708A (en) * 2021-11-10 2022-01-21 北京邮电大学 Power equipment fault tracing method based on multilevel graph convolutional network
CN113961708B (en) * 2021-11-10 2024-04-23 北京邮电大学 Power equipment fault tracing method based on multi-level graph convolutional network
CN114021574A (en) * 2022-01-05 2022-02-08 杭州实在智能科技有限公司 Intelligent analysis and structuring method and system for policy file

Also Published As

Publication number Publication date
CN112529071B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN112529071B (en) Text classification method, system, computer equipment and storage medium
CN111553759A (en) Product information pushing method, device, equipment and storage medium
Khan et al. An unsupervised deep learning ensemble model for anomaly detection in static attributed social networks
CN112529069A (en) Semi-supervised node classification method, system, computer equipment and storage medium
Wu et al. Optimized deep learning framework for water distribution data-driven modeling
CN116821776A (en) Heterogeneous graph network node classification method based on graph self-attention mechanism
Wu et al. EvoNet: A neural network for predicting the evolution of dynamic graphs
CN115577678A (en) Document level event cause and effect relationship identification method, system, medium, equipment and terminal
Rai Advanced deep learning with R: Become an expert at designing, building, and improving advanced neural network models using R
CN113592593A (en) Training and application method, device, equipment and storage medium of sequence recommendation model
CN112905906A (en) Recommendation method and system fusing local collaboration and feature intersection
Xu et al. Collective vertex classification using recursive neural network
EP4064038B1 (en) Automated generation and integration of an optimized regular expression
CN114842247B (en) Characteristic accumulation-based graph convolution network semi-supervised node classification method
CN115689639A (en) Commercial advertisement click rate prediction method based on deep learning
CN114692012A (en) Electronic government affair recommendation method based on Bert neural collaborative filtering
Denli et al. Geoscience language processing for exploration
Zhang An English teaching resource recommendation system based on network behavior analysis
CN117252665B (en) Service recommendation method and device, electronic equipment and storage medium
Zhao et al. Test case classification via few-shot learning
Xiong et al. Bayesian nonparametric regression modeling of panel data for sequential classification
Patil et al. COMPARISON OF DIFFERENT MUSIC RECOMMENDATION SYSTEM ALGORITHMS
CN112509640B (en) Gene ontology item name generation method and device and storage medium
CN117235533B (en) Object variable analysis method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant