CN116304061B - Text classification method, device and medium based on hierarchical text graph structure learning - Google Patents

Text classification method, device and medium based on hierarchical text graph structure learning Download PDF

Info

Publication number
CN116304061B
CN116304061B (Application CN202310551919.9A)
Authority
CN
China
Prior art keywords
text
graph
edge
representing
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310551919.9A
Other languages
Chinese (zh)
Other versions
CN116304061A (en)
Inventor
龙军
王子冬
杨柳
陈庭轩
黄金彩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310551919.9A priority Critical patent/CN116304061B/en
Publication of CN116304061A publication Critical patent/CN116304061A/en
Application granted granted Critical
Publication of CN116304061B publication Critical patent/CN116304061B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text classification method, device and medium based on hierarchical text graph structure learning, wherein the method comprises the following steps: step S1: preprocessing the training set text according to three linguistic features to obtain three graph structure matrices; step S2: performing edge-level graph structure learning to obtain three types of edge vectors; step S3: removing redundancy to obtain three text edge vectors; step S4: performing weighted summation to obtain a text graph structural representation; step S5: processing with a graph convolutional neural network and generating a graph-level text representation through a graph pooling layer; step S6: performing softmax classification, with the highest-probability class as the final classification result. The method preprocesses the training set text with three linguistic features, converting the text classification problem into a graph classification problem; through multi-granularity graph structure learning, the invention integrates different graph structures, preventing semantic loss of the graph structure in the subsequent learning process.

Description

Text classification method, device and medium based on hierarchical text graph structure learning
Technical Field
The invention relates to the field of natural language processing, in particular to a text classification method, device and medium based on hierarchical text graph structure learning.
Background
Text classification is a fundamental technique in natural language processing and is widely applied in real-world scenarios such as knowledge question answering and sentiment analysis. With the development of deep learning, graph neural networks have made remarkable progress in text classification. However, how to represent text as a graph remains difficult. Existing methods that represent text with graphs do not account for the accuracy and integrity of the original text graph. In the graph construction stage, owing to algorithmic errors, a text graph constructed with methods such as entity/relation extraction is likely to contain mistakes, so that erroneous edges in the graph are sparse or redundant, degrading the performance of subsequent text classification tasks. Moreover, limited by human prior knowledge, a predefined graph structure carries only part of the system's information, which prevents understanding of the underlying mechanism of how edges in the graph affect subsequent tasks, thereby limiting the application of graph methods in text classification.
In view of the foregoing, there is an urgent need for a text classification method, apparatus, and medium based on hierarchical text graph structure learning to solve the problems in the prior art.
Disclosure of Invention
The invention aims to overcome the defects of existing text classification technology and provides a text classification method, device and medium based on hierarchical text graph structure learning, thereby realizing updating and error correction of the text graph structure and improving the accuracy and robustness of text graph classification. The method learns the graph structure hierarchically from a local-to-global perspective, enriching the structural representation of the text graph, reducing the errors introduced by the initial graph structure, and modeling the relationships among nodes at fine granularity. The specific technical scheme is as follows:
the text classification method based on hierarchical text graph structure learning comprises the following steps:
step S1: inputting and preprocessing the training set texts to be classified according to three different linguistic features to obtain node sets and edge sets of the training set texts, namely three graph structure matrices; the three linguistic features correspond to a text co-occurrence graph, a text grammar graph and a text semantic graph, respectively;
step S2: adopting a characteristic representation model based on edge level graph structure learning to process the three node sets and the graph structures of the edge sets to obtain three edge vectors;
step S3: removing redundancy of the three types of edge vectors according to the measurement standard of mutual information to obtain three types of text edge vectors;
step S4: carrying out weighted summation on the three text edge vectors to obtain text graph structural representation;
step S5: processing the text graph structural representation obtained in the step S4 and text semantic features corresponding to the text graph structural representation by adopting a graph convolution neural network, and generating graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
Preferably, in step S1, the text co-occurrence graph is constructed as follows: each word $t$ in text $d$ is represented as a node $v_t$ of the text co-occurrence graph $G_{coo}$, and the edge weight between any two word nodes in the graph adopts the pointwise mutual information (PMI) of the word nodes. The edge weight expression of the text co-occurrence graph is as follows:

$$A^{coo}_{ij} = \mathrm{PMI}(v_i, v_j)$$

wherein $A^{coo}_{ij}$ represents the edge weight of the text co-occurrence graph, and $\mathrm{PMI}(v_i, v_j)$ represents the pointwise mutual information of word node $v_i$ and word node $v_j$.
Preferably, in step S1, the text grammar graph is constructed as follows: a parsing tool is used to extract the syntactic dependency $rel_{i,j}$ of any pair of words $(p_i, p_j)$ in text $d$, generating a relation triplet $(p_i, rel_{i,j}, p_j)$; $(p_i, p_j)$ are used as nodes of the text grammar graph, the dependency relationships are used as edges between the nodes, and the edge weights are expressed by the frequency of the dependency relationships in the data set. The edge weight expression of the text grammar graph is as follows:

$$A^{dep}_{ij} = \frac{\#N_{dep}(p_i, p_j)}{\#N_{tot}(p_i, p_j)}$$

wherein $A^{dep}_{ij}$ represents the edge weight of the text grammar graph, $\#N_{dep}(p_i, p_j)$ represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and $\#N_{tot}(p_i, p_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
Preferably, in step S1, the text semantic graph is constructed as follows: a BERT model is used to encode any word $t$ in text $d$ into a feature vector $w_t$; cosine similarity is used to calculate the semantic similarity between feature vectors, and if the semantic similarity is greater than a set threshold $z$, the pair of words is considered to have a semantic relationship. The edge weight expression of the text semantic graph is as follows:

$$A^{sem}_{ij} = \frac{\#N_{sem}(q_i, q_j)}{\#N_{tot}(q_i, q_j)}$$

wherein $A^{sem}_{ij}$ represents the edge weight of the text semantic graph, $\#N_{sem}(q_i, q_j)$ represents the number of times the two words have a semantic relationship in all sentences of the corpus, and $\#N_{tot}(q_i, q_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
Preferably, in step S2, the process of graph structure learning specifically includes: assigning a confidence to the graph structure matrix and optimizing the graph structure matrix based on the confidence; using Laplacian regularization to constrain the node features and using this as the likelihood function of Bayesian estimation; setting a prior function to constrain the learning process of the adjacency matrix; and combining the likelihood function and the prior function to constrain the adjacency matrix of the learned graph through a Bayesian estimation framework;
the above optimization and constraint are performed on the three text graphs respectively, and the final loss function expression is as follows:
wherein,,loss function representing graph structure learning at edge level, +.>Representation->Constraint function for constraining the adjacency matrix of +.>Representing the structure of a learned text semantic graph, +.>Representing a learned text dependency graph structure, +.>Representing the learned text co-occurrence graph structure.
Preferably, in step S3, the redundancy elimination process is specifically as follows: a graph convolutional neural network is used to map the three text graph structure features generated in edge-level graph structure learning, yielding the mapped feature vectors; the mutual information of different nodes in the same text graph is maximized, the mutual information of nodes in different text graphs is minimized, and the three text graphs are estimated based on mutual information. The optimization objective function is as follows:

$$\mathcal{L}_{total} = \mathcal{L}(V_{sem}, V_{dep}) + \mathcal{L}(V_{sem}, V_{coo}) + \mathcal{L}(V_{dep}, V_{coo})$$

wherein $\mathcal{L}_{total}$ represents the optimization objective function, $\mathcal{L}(V_1, V_2)$ represents the mutual information estimate between $V_1$ and $V_2$, $V_{sem}$ represents the edge vector of the text semantic graph, $V_{dep}$ represents the edge vector of the text dependency graph, and $V_{coo}$ represents the edge vector of the text co-occurrence graph.
Preferably, in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure $A^{*}$, with the following expression:

$$A^{*} = aV_{sem} + bV_{dep} + cV_{coo}$$

wherein $a + b + c = 1$.
Preferably, in step S5, the process of generating the graph-level text representation is specifically as follows: a graph convolutional neural network is used to process the final optimized text graph structure $A^{*}$ from step S4 together with its features, updating the text semantic features $H^{1}$; global pooling is then applied to $H^{1}$ to obtain the graph-level text representation, with the following expression:

$$h_G = \mathrm{maxpooling}(H_v, v \in V)$$

wherein $h_G$ represents the graph-level text representation, $H_v$ is the feature representation of node $v$, $\mathrm{maxpooling}(\cdot)$ represents global pooling, and $V$ represents the node set of the text graph.
In addition, the invention also discloses a text classification device based on hierarchical text graph structure learning, which comprises:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method as described above when executing the computer program.
In addition, the invention also discloses a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the text classification method when being executed by a processor.
The technical scheme of the invention has the following beneficial effects:
(1) The invention adopts three different methods to extract three linguistic features of the text, describing the characteristics of the text from different aspects; the three linguistic features are preprocessed, the text classification problem is converted into a graph classification problem, words are converted into nodes in the graph, and the relationships between words are converted into edges in the graph.
(2) The invention adopts edge-level and graph-level graph structure learning to learn the graph structures of the three text graphs at different granularities. The edge level and the graph level capture the structural characteristics of the graph at fine and coarse granularity respectively, and a confidence is assigned to each edge to learn the feature relationships among nodes at fine granularity. At the graph level, graph structures from multiple sources are adaptively integrated to obtain an optimal combination of the graph structures from different sources; through multi-granularity graph structure learning, graph structures with different semantic features are integrated on the basis of removing repeated information, preventing semantic loss of the graph structure in the subsequent learning process and further improving model performance.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of the steps of a text classification method in a preferred embodiment of the invention;
FIG. 2 is a text co-occurrence diagram of a simulation experiment in a preferred embodiment of the present invention;
FIG. 3 is a text semantic graph of a simulation experiment in a preferred embodiment of the present invention;
FIG. 4 is a text grammar diagram of a simulation experiment in a preferred embodiment of the present invention;
fig. 5 is a final text diagram of a simulation experiment in a preferred embodiment of the present invention.
Detailed Description
Embodiments of the invention are described in detail below with reference to the attached drawings, but the invention can be implemented in a number of different ways, which are defined and covered by the claims.
Examples:
referring to fig. 1, the embodiment discloses a text classification method based on hierarchical text graph structure learning, which comprises the following steps:
step S1: inputting and preprocessing the training set texts to be classified according to three different linguistic features to obtain node sets and edge sets of the training set texts, namely three graph structure matrices; the three linguistic features correspond to a text co-occurrence graph, a text grammar graph and a text semantic graph, respectively;
step S2: processing the three node sets and the graph structures of the edge sets by adopting a feature representation model based on graph structure learning of the edge level to obtain three edge vectors containing semantic information;
step S3: removing redundancy from the three edge vectors according to the measurement standard of mutual information to obtain three independent text edge vectors;
step S4: carrying out weighted summation on the three text edge vectors subjected to redundancy elimination to obtain a text graph structural representation integrating the global relationship;
step S5: processing the text graph structure obtained in the step S4 and the text semantic features corresponding to the text graph structure by adopting a graph convolution neural network, and generating graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
The above method is applied to a text data set $D = \{d_1, d_2, \ldots, d_n\}$ for text classification, wherein $d_i$ represents the $i$-th sample (text unit) and $y_i$ represents the corresponding label. For a text $d$, three graph construction methods are used to build a text co-occurrence graph $G_{coo}$, a text grammar graph $G_{dep}$, and a text semantic graph $G_{sem}$, each of the form $G = (V, A)$, wherein $V$ represents the nodes of the text graph and $A$ is the adjacency matrix representing the topology of the text graph. A text $d$ with label $y$ is selected from the data set $D$.
Specifically, in step S1, the text co-occurrence graph is constructed as follows: each word $t$ in text $d$ is represented as a node $v_t$ of the text co-occurrence graph $G_{coo}$, and the edge weight between any two word nodes in the graph adopts the pointwise mutual information (PMI) of the word nodes. The edge weight expression of the text co-occurrence graph is as follows:

$$A^{coo}_{ij} = \mathrm{PMI}(v_i, v_j)$$

wherein $A^{coo}_{ij}$ represents the edge weight of the text co-occurrence graph, and $\mathrm{PMI}(v_i, v_j)$ represents the pointwise mutual information of word node $v_i$ and word node $v_j$.
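For illustration, a minimal Python sketch of PMI-based co-occurrence edge weights follows; the sliding-window size and the positive-PMI filter are assumptions, as the patent only specifies PMI edge weights:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edges(tokens, window=5):
    """Edge weights A_ij = PMI(v_i, v_j), estimated from sliding windows."""
    windows = [tokens[i:i + window]
               for i in range(max(1, len(tokens) - window + 1))]
    n = len(windows)
    word_cnt, pair_cnt = Counter(), Counter()
    for w in windows:
        uniq = sorted(set(w))
        word_cnt.update(uniq)
        pair_cnt.update(combinations(uniq, 2))
    edges = {}
    for (wi, wj), c in pair_cnt.items():
        pmi = math.log((c / n) / ((word_cnt[wi] / n) * (word_cnt[wj] / n)))
        if pmi > 0:  # keep positive associations only (assumed convention)
            edges[(wi, wj)] = pmi
    return edges

print(pmi_edges("take care of my cat offers a slice of asian cinema".split()))
```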
Specifically, in step S1, the text grammar graph is constructed as follows: a parsing tool is used to extract the syntactic dependency $rel_{i,j}$ of any pair of words $(p_i, p_j)$ in text $d$, generating a relation triplet $(p_i, rel_{i,j}, p_j)$; $(p_i, p_j)$ are used as nodes of the text grammar graph, the dependency relationships are used as edges between the nodes, and the edge weights are expressed by the frequency of the dependency relationships in the data set. The edge weight expression of the text grammar graph is as follows:

$$A^{dep}_{ij} = \frac{\#N_{dep}(p_i, p_j)}{\#N_{tot}(p_i, p_j)}$$

wherein $A^{dep}_{ij}$ represents the edge weight of the text grammar graph, $\#N_{dep}(p_i, p_j)$ represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and $\#N_{tot}(p_i, p_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
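For illustration, a Python sketch of the dependency edge weights $\#N_{dep}/\#N_{tot}$ follows, using spaCy as the parsing tool; the patent does not name a specific parser, so spaCy and the en_core_web_sm model are assumptions:

```python
from collections import Counter
from itertools import combinations
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed parser and model

def dep_edge_weights(sentences):
    """Edge weights A_ij = #N_dep(p_i, p_j) / #N_tot(p_i, p_j)."""
    n_dep, n_tot = Counter(), Counter()
    for sent in sentences:
        doc = nlp(sent)
        words = sorted({t.text.lower() for t in doc if t.is_alpha})
        n_tot.update(combinations(words, 2))  # co-presence in the sentence
        for t in doc:
            if t.is_alpha and t.head.is_alpha and t.head is not t:
                pair = tuple(sorted((t.text.lower(), t.head.text.lower())))
                if pair[0] != pair[1]:
                    n_dep[pair] += 1  # a syntactic dependency was observed
    return {p: n_dep[p] / n_tot[p] for p in n_dep if n_tot[p]}
```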
Specifically, in step S1, the text semantic graph is constructed as follows: a pre-trained BERT model is used to process any word $t$ in text $d$, obtaining a feature vector $w_t$; cosine similarity is used to calculate the semantic similarity between feature vectors, and if the semantic similarity is greater than a set threshold $z$, the pair of words is considered to have a semantic relationship. The edge weight expression of the text semantic graph is as follows:

$$A^{sem}_{ij} = \frac{\#N_{sem}(q_i, q_j)}{\#N_{tot}(q_i, q_j)}$$

wherein $A^{sem}_{ij}$ represents the edge weight of the text semantic graph, $\#N_{sem}(q_i, q_j)$ represents the number of times the two words have a semantic relationship in all sentences of the corpus, and $\#N_{tot}(q_i, q_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
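For illustration, a Python sketch of the semantic edges follows: BERT token vectors compared by cosine similarity against a threshold $z$; the model name and the value $z = 0.8$ are assumptions:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # assumed model
bert = AutoModel.from_pretrained("bert-base-uncased")

def semantic_edges(sentence, z=0.8):
    """Edges between token pairs whose cosine similarity exceeds z."""
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        H = bert(**inputs).last_hidden_state[0]   # (seq_len, hidden)
    words = tok.convert_ids_to_tokens(inputs["input_ids"][0])
    H = torch.nn.functional.normalize(H, dim=-1)
    sim = H @ H.T                                 # cosine similarity matrix
    # note: includes [CLS]/[SEP] special tokens; filter them if desired
    return [(words[i], words[j], sim[i, j].item())
            for i in range(len(words))
            for j in range(i + 1, len(words))
            if sim[i, j] > z]
```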
Further, text graph structure learning is performed on the three constructed text graphs, comprising edge-level graph structure learning and graph-level graph structure learning. The edge level and the graph level capture the structural characteristics of the graph at fine and coarse granularity respectively, and a confidence is assigned to each edge to learn the feature relationships among nodes at fine granularity. At the graph level, graph structures from multiple sources are adaptively integrated to obtain an optimal combination of the graph structures from different sources. Modeling feature relationships at the fine granularity of edges enables more precise control of the information flow during learning from the bottom layer, whereas a single-source graph structure describes the graph data from only one perspective, which may bias the classification results.
Specifically, in step S2, the edge-level graph structure learning process is as follows:
because errors exist in the composition process, the original graph structure may be wrong and incomplete, firstly, the graph structure at the edge level is learned to endow confidence to the original graph structure matrix, the graph structure matrix is optimized based on the confidence, and the confidence optimization expression is as follows:
wherein,,for confidence matrix, ++>For the original graph matrix, ">For linear mapping +.>Matrix with all elements 1, +.>Representing the optimized graph structure.
In this embodiment, when defining the confidence matrix, the assumption that the features of adjacent nodes are similar is adopted, and the relationships between nodes are modeled with global graph attention. The element $S_{ij}$ is defined as follows:

$$S_{ij} = \sigma(e(v_i, v_j))$$

wherein $e(v_i, v_j)$ represents the relationship between node $v_i$ and node $v_j$, and $\sigma$ is the activation function.

The element $S_{ij}$ is substituted into the confidence optimization expression to obtain the final adjacency matrix $\hat{A}$ at each iteration.
To further enhance the smoothness of the graph nodes, this embodiment uses Laplacian regularization to constrain the node features and uses it as the likelihood function of Bayesian estimation, with the following expression:

$$p(H \mid A) \propto \exp\left(-\alpha \, \mathrm{tr}(H^{\top} L H)\right)$$

wherein $H$ represents the features of the graph nodes, $L$ represents the normalized Laplacian matrix, and $\alpha$ is a predefined parameter; the smaller $\mathrm{tr}(H^{\top} L H)$ is, the smaller the difference between neighboring nodes of the graph, indicating that the two nodes are more similar.
To further endow the learned graph with symmetry and simplicity properties, the learning process of the adjacency matrix is constrained by a prior function, defined as follows:

$$p(A) \propto \exp\left(-\beta_1 \|A - A^{\top}\|_F^2 - \beta_2 \|A\|_F^2\right)$$

wherein $\|A - A^{\top}\|_F^2$ is the constraint on the symmetry of the graph and $\|A\|_F^2$ is the constraint on the simplicity of the graph; a graph satisfying both constraints possesses symmetry and simplicity. $\beta_1$ and $\beta_2$ are manually adjusted hyperparameters, $A^{\top}$ is the transpose of the learned graph structure $A$, and $\|\cdot\|_F$ denotes the Frobenius norm of the graph structure matrix.
Combining the likelihood function and the prior function, the adjacency matrix of the learned graph is constrained through a Bayesian estimation framework, with the following expression:

$$\mathcal{L}(A) = -\lambda \log\left(p(H \mid A)\, p(A)\right)$$

wherein $\mathcal{L}(A)$ represents the constraint function for constraining the adjacency matrix $A$, $\lambda$ is a manually adjusted hyperparameter, and $\exp(\cdot)$ in the likelihood and prior denotes the exponential function with the natural constant $e$ as its base.
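For illustration, a Python sketch of the edge-level constraint $\mathcal{L}(A)$ follows, combining the Laplacian smoothness (likelihood) term with the symmetry and simplicity (prior) penalties under the reconstruction above; the additive weighting with $\alpha$, $\beta_1$, $\beta_2$ is an assumption:

```python
import torch

def edge_level_constraint(A, H, alpha=1.0, beta1=0.1, beta2=0.01):
    """L(A): Laplacian smoothness + symmetry and simplicity penalties."""
    deg = A.sum(dim=1).clamp(min=1e-8)
    d = deg.pow(-0.5)
    L = torch.eye(A.size(0)) - d[:, None] * A * d[None, :]  # normalized Laplacian
    smooth = torch.trace(H.T @ L @ H)       # tr(H^T L H), the likelihood term
    sym = ((A - A.T) ** 2).sum()            # symmetry constraint ||A - A^T||_F^2
    sparse = (A ** 2).sum()                 # simplicity constraint ||A||_F^2
    return alpha * smooth + beta1 * sym + beta2 * sparse
```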
The above optimization and constraints are applied to the three text graphs respectively, and the final loss function expression is as follows:

$$\mathcal{L}_{edge} = \mathcal{L}(A_{sem}) + \mathcal{L}(A_{dep}) + \mathcal{L}(A_{coo})$$

wherein $\mathcal{L}_{edge}$ represents the loss function of edge-level graph structure learning, $A_{sem}$ represents the learned text semantic graph structure, $A_{dep}$ represents the learned text dependency graph structure, and $A_{coo}$ represents the learned text co-occurrence graph structure.
By learning the graph structures at the edge level for the three text graphs, three optimized graph structures are obtained, each of which contains some unique information. Integration of these three graph structures is required to produce the final graph structure. Since the three text graphs may contain repeated redundant information, the redundant information needs to be removed, so that the independence of the text graphs is improved. The present invention uses mutual information as a measure of text graph independence.
If the correlation between two text graphs is high, the mutual information between them is also large, and vice versa. However, in practical applications, it is difficult to directly calculate the mutual information of the graph, and thus the InfoNCE method is used to estimate the lower bound of the mutual information.
Specifically, in step S3, the redundancy elimination process is as follows: the three text graph structures generated in edge-level graph structure learning are mapped using a graph convolutional neural network (GCN) to obtain the mapped feature vectors. Taking the text semantic graph as an example, the mapped feature vector is $V_{sem} = \mathrm{GCN}(A_{sem}, H)$; the other two text graphs are mapped in the same manner and are not repeated here.
After the feature vectors of the three text graphs are obtained, InfoNCE is used to constrain the relationships between the text graphs: the mutual information of different nodes in the same text graph is maximized, and the mutual information of nodes in different text graphs is minimized. Taking the relationship between the text semantic graph and the text dependency graph as an example, the expression is as follows:

$$\mathcal{L}(V_{sem}, V_{dep}) = -\log \frac{\sum_{i \neq j} \exp\left(\mathrm{sim}(v_i^{sem}, v_j^{sem})/\tau\right)}{\sum_{i,j} \exp\left(\mathrm{sim}(v_i^{sem}, v_j^{dep})/\tau\right)}$$

wherein $\mathrm{sim}(v_i, v_j)$ represents the similarity between node $v_i$ and node $v_j$, and $\tau$ is the temperature coefficient in InfoNCE.
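For illustration, a Python sketch of an InfoNCE-style estimate between two sets of node features follows, with same-graph pairs as positives and cross-graph pairs as negatives, matching the description above; the exact pairing scheme and $\tau = 0.5$ are assumptions:

```python
import torch
import torch.nn.functional as F

def info_nce(V1, V2, tau=0.5):
    """InfoNCE-style estimate: same-graph pairs up, cross-graph pairs down."""
    V1, V2 = F.normalize(V1, dim=-1), F.normalize(V2, dim=-1)
    pos = torch.exp((V1 @ V1.T) / tau)          # within-graph similarities
    pos = pos - torch.diag(torch.diag(pos))     # drop self-pairs
    neg = torch.exp((V1 @ V2.T) / tau)          # cross-graph similarities
    return -torch.log(pos.sum() / (pos.sum() + neg.sum()))
```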
The three text graphs are estimated based on mutual information, and the optimization objective function is as follows:

$$\mathcal{L}_{total} = \mathcal{L}(V_{sem}, V_{dep}) + \mathcal{L}(V_{sem}, V_{coo}) + \mathcal{L}(V_{dep}, V_{coo})$$

wherein $\mathcal{L}_{total}$ represents the optimization objective function, $\mathcal{L}(V_1, V_2)$ represents the mutual information estimate between $V_1$ and $V_2$, $V_{sem}$ represents the edge vector of the text semantic graph, $V_{dep}$ represents the edge vector of the text dependency graph, and $V_{coo}$ represents the edge vector of the text co-occurrence graph.
Specifically, in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure $A^{*}$, with the following expression:

$$A^{*} = aV_{sem} + bV_{dep} + cV_{coo}$$

wherein $a + b + c = 1$.
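For illustration, a Python sketch of the graph-level fusion follows; normalizing learnable weights with a softmax is one possible way (an assumption) of keeping $a + b + c = 1$ during training:

```python
import torch

class GraphFusion(torch.nn.Module):
    """A* = a*V_sem + b*V_dep + c*V_coo with a + b + c = 1."""
    def __init__(self):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(3))

    def forward(self, V_sem, V_dep, V_coo):
        a, b, c = torch.softmax(self.logits, dim=0)  # weights sum to 1
        return a * V_sem + b * V_dep + c * V_coo
```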
Specifically, in step S5, the process of generating the graph-level text representation is as follows: a graph convolutional neural network is used to process the final optimized graph structure $A^{*}$ from step S4 together with its features $H$, updating the text semantic features $H^{1} = \mathrm{GCN}(A^{*}, H)$; global pooling is applied to $H^{1}$ to obtain the graph-level text representation, with the following expression:

$$h_G = \mathrm{maxpooling}(H_v, v \in V)$$

wherein $h_G$ represents the graph-level text representation, $H_v$ is the feature representation of node $v$, and $\mathrm{maxpooling}(\cdot)$ represents the global pooling process.
Specifically, in step S6, the expression of the softmax classification is as follows:

$$\hat{y} = \mathrm{softmax}(W h_G + b)$$

wherein $\hat{y}$ represents the final classification result, $W$ is a learnable mapping matrix, $b$ is a learnable bias, and $\mathrm{softmax}(\cdot)$ represents the softmax function.
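For illustration, a Python sketch covering steps S5 and S6 follows: one GCN layer over the fused structure $A^{*}$, global max pooling to $h_G$, and a softmax classifier; the single-layer depth, the row normalization, and the layer sizes are assumptions:

```python
import torch
import torch.nn.functional as F

class GraphClassifier(torch.nn.Module):
    """One GCN layer over A*, global max pooling, softmax classification."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.W_gcn = torch.nn.Linear(in_dim, hid_dim)
        self.W_out = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, A_star, H):
        deg = A_star.sum(dim=1).clamp(min=1e-8)
        A_norm = A_star / deg[:, None]             # row-normalized adjacency
        H1 = F.relu(self.W_gcn(A_norm @ H))        # H^1 = GCN(A*, H)
        h_G = H1.max(dim=0).values                 # maxpooling over nodes
        return F.softmax(self.W_out(h_G), dim=-1)  # class probabilities
```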
According to an embodiment of the present invention, step S1 is in effect a preprocessing stage for the labeled training set text: in this process, the text classification problem is converted into a graph classification problem, words are converted into nodes in the graph, and the relationships between words, under the three different linguistic features, are converted into edges in the graph. Steps S2, S3 and S4 learn the graph structure at different granularities for the input text graphs and integrate the graph structure representations obtained from different viewing angles. These steps have two advantages over conventional methods: (1) when generating the text graph, multiple linguistic rules can be used to extract graph nodes and edges, effectively improving the accuracy of text classification; (2) through multi-granularity graph structure learning, graph structures with different semantic features are integrated on the basis of removing repeated information, preventing semantic loss of the graph structure in the subsequent learning process and further improving model performance.
Simulation experiment:
the present example performed a simulation experiment on the public dataset MR. MR is a movie rating dataset that contains user reviews of movies and corresponding categories. These categories are classified into positive and negative evaluations. In the embodiment, a simulation experiment is performed by randomly extracting one comment sample in the MR data set, so as to evaluate whether the method disclosed by the invention achieves the effect of relearning the graph structure of the sample. Three text graphs, namely a text co-occurrence graph shown in fig. 2, a text grammar graph shown in fig. 3 and a text semantic graph shown in fig. 4, are constructed according to the step S1 of the invention for the randomly extracted sentence "Take Care of My Cat offers a refreshingly different slice of Asian cinema". From these three text graphs, the final text graph, i.e., the graph-level text representation, as shown in fig. 5 is obtained and text-classified. The abscissa in fig. 2, 3, 4, 5 represents the unique word in this comment sample, the left ordinate also represents the unique word in this comment sample, and the right ordinate represents the strength of the relationship between the words. Color blocks in the matrix are larger than 0 to indicate that the relation between the corresponding words is positive, and the larger the numerical value is, the stronger the positive relation between the words is, namely the positive effect on the classification result is, the smaller the color blocks are, the smaller the numerical value is, the relation between the corresponding words is negative, and the stronger the negative relation between the words is, namely the stronger the negative effect on the classification result is. Color bars equal to 0 indicate that the relationship between the two words in the comment sample has no effect on the classification result. In summary, relearned FIG. 5 facilitates the method disclosed in this example in evaluating whether this comment sample belongs to a positive or negative rating.
The true classification of the randomly extracted review sample "Take Care of My Cat offers a refreshingly different slice of Asian cinema" is a positive evaluation. As can be seen from the finally learned text graph in FIG. 5, the method disclosed in this embodiment relearns the text graph and discards some erroneous relationships between words in the original text graphs, such as (take, care) in the text co-occurrence graph of FIG. 2 and the text grammar graph of FIG. 3; these relationships stem from the movie title "Take Care of My Cat" and contribute nothing toward correctly judging this review text as a positive evaluation. In addition, the learned final text graph in FIG. 5 adds some new relationships, such as the edge involving "refreshingly", which helps the method disclosed in this embodiment judge this text as belonging to the positive evaluation category.
In addition, the embodiment also discloses a text classification device based on hierarchical text graph structure learning, which comprises:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method as described above when executing the computer program.
In addition, the embodiment also discloses a computer readable storage medium, and the computer readable storage medium stores a computer program, and the computer program realizes the text classification method when being executed by a processor.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. The text classification method based on hierarchical text graph structure learning is characterized by comprising the following steps:
step S1: inputting and preprocessing the training set texts to be classified according to three different linguistic features to obtain node sets and edge sets of the training set texts, namely three graph structure matrices; the three linguistic features correspond to a text co-occurrence graph, a text grammar graph and a text semantic graph, respectively;
step S2: adopting a feature representation model based on edge-level graph structure learning to process the graph structures of the three node sets and edge sets to obtain three edge vectors; the process of the graph structure learning specifically comprises: assigning a confidence to the graph structure matrix and optimizing the graph structure matrix based on the confidence; using Laplacian regularization to constrain the node features and using this as the likelihood function of Bayesian estimation; setting a prior function to constrain the learning process of the adjacency matrix; and combining the likelihood function and the prior function to constrain the adjacency matrix of the learned graph through a Bayesian estimation framework;
the above optimization and constraints are applied to the three text graphs respectively, and the final loss function expression is as follows:

$$\mathcal{L}_{edge} = \mathcal{L}(A_{sem}) + \mathcal{L}(A_{dep}) + \mathcal{L}(A_{coo})$$

wherein $\mathcal{L}_{edge}$ represents the loss function of edge-level graph structure learning, $\mathcal{L}(\cdot)$ represents the constraint function for constraining the adjacency matrix, $A_{sem}$ represents the learned text semantic graph structure, $A_{dep}$ represents the learned text dependency graph structure, and $A_{coo}$ represents the learned text co-occurrence graph structure;
step S3: removing redundancy from the three edge vectors according to the measurement standard of mutual information to obtain three text edge vectors; the redundancy elimination process is specifically: mapping the three text graph structure features generated in edge-level graph structure learning using a graph convolutional neural network to obtain mapped feature vectors, maximizing the mutual information of different nodes in the same text graph, minimizing the mutual information of nodes in different text graphs, and estimating the three text graphs based on the mutual information, with the optimization objective function as follows:

$$\mathcal{L}_{total} = \mathcal{L}(V_{sem}, V_{dep}) + \mathcal{L}(V_{sem}, V_{coo}) + \mathcal{L}(V_{dep}, V_{coo})$$

wherein $\mathcal{L}_{total}$ represents the optimization objective function, $\mathcal{L}(V_1, V_2)$ represents the mutual information estimation between $V_1$ and $V_2$, $V_{sem}$ represents the edge vector of the text semantic graph, $V_{dep}$ represents the edge vector of the text dependency graph, and $V_{coo}$ represents the edge vector of the text co-occurrence graph;
step S4: carrying out weighted summation on the three text edge vectors to obtain text graph structural representation;
step S5: processing the text graph structural representation obtained in the step S4 and the text semantic features corresponding to the text structural representation by adopting a graph convolution neural network, and generating a graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
2. The text classification method according to claim 1, wherein in step S1, the text co-occurrence graph construction mode specifically includes: representing any word $t$ of text $d$ as a node $v_t$ of the text co-occurrence graph $G_c$; the edge weight between any two word nodes in the graph is represented by the pointwise mutual information (PMI) of the word nodes, and the edge weight expression of the text co-occurrence graph is as follows:

$$A^{c}_{ij} = \mathrm{PMI}(v_i, v_j)$$

wherein $A^{c}_{ij}$ represents the edge weight of the text co-occurrence graph, and $\mathrm{PMI}(v_i, v_j)$ represents the pointwise mutual information of word node $v_i$ and word node $v_j$.
3. The text classification method according to claim 1, wherein in step S1, the text grammar graph is constructed in the following manner: extracting the syntactic dependency $rel_{i,j}$ of any pair of words $(p_i, p_j)$ in text $d$ using a parsing tool, generating a relation triplet $(p_i, rel_{i,j}, p_j)$; using $(p_i, p_j)$ as nodes of the text grammar graph and the dependency relationships as edges between the nodes, the edge weights are expressed by the frequency of the dependency relationships in the data set, and the edge weight expression of the text grammar graph is as follows:

$$A^{dep}_{ij} = \frac{\#N_{dep}(p_i, p_j)}{\#N_{tot}(p_i, p_j)}$$

wherein $A^{dep}_{ij}$ represents the edge weight of the text grammar graph, $\#N_{dep}(p_i, p_j)$ represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and $\#N_{tot}(p_i, p_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
4. The text classification method according to claim 1, wherein in step S1, the text semantic graph is constructed specifically by: encoding any word $t$ in text $d$ using a BERT model to obtain a feature vector $w_t$; calculating the semantic similarity between feature vectors using cosine similarity, wherein if the semantic similarity is greater than a set threshold $z$, the pair of words has a semantic relationship, and the edge weight expression of the text semantic graph is as follows:

$$A^{sem}_{ij} = \frac{\#N_{sem}(q_i, q_j)}{\#N_{tot}(q_i, q_j)}$$

wherein $A^{sem}_{ij}$ represents the edge weight of the text semantic graph, $\#N_{sem}(q_i, q_j)$ represents the number of times the two words have a semantic relationship in all sentences of the corpus, and $\#N_{tot}(q_i, q_j)$ represents the number of times the two words appear in the same sentence in all sentences of the corpus.
5. The text classification method according to claim 1, wherein in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure $A^{*}$, with the following expression:

$$A^{*} = aV_{sem} + bV_{dep} + cV_{coo}$$

wherein $a + b + c = 1$.
6. The text classification method according to claim 5, wherein in step S5, the process of generating the graph-level text representation is specifically: processing the final optimized text graph structure $A^{*}$ in step S4 and its features using a graph convolutional neural network, updating the text semantic features $H^{1}$; performing global pooling on $H^{1}$ to obtain the graph-level text representation, with the following expression:

$$h_G = \mathrm{maxpooling}(H_v, v \in V)$$

wherein $h_G$ represents the graph-level text representation, $H_v$ is the feature representation of node $v$, $\mathrm{maxpooling}(\cdot)$ represents the global pooling process, and $V$ represents the node set of the text graph.
7. Text classification device based on hierarchical text graph structure study, characterized by comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method according to any of claims 1 to 6 when executing the computer program.
8. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the text classification method according to any of claims 1 to 6.
CN202310551919.9A 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning Active CN116304061B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310551919.9A CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310551919.9A CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Publications (2)

Publication Number Publication Date
CN116304061A CN116304061A (en) 2023-06-23
CN116304061B (en) 2023-07-21

Family

ID=86794469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310551919.9A Active CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Country Status (1)

Country Link
CN (1) CN116304061B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116805059B (en) * 2023-06-26 2024-04-09 重庆邮电大学 Patent classification method based on big data
CN117435747B (en) * 2023-12-18 2024-03-29 中南大学 Few-sample link prediction drug recycling method based on multilevel refinement network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022001333A1 (en) * 2020-06-30 2022-01-06 首都师范大学 Hyperbolic space representation and label text interaction-based fine-grained entity recognition method
CN114186063A (en) * 2021-12-14 2022-03-15 合肥工业大学 Training method and classification method of cross-domain text emotion classification model
CN114528374A (en) * 2022-01-19 2022-05-24 浙江工业大学 Movie comment emotion classification method and device based on graph neural network
CN115858788A (en) * 2022-12-19 2023-03-28 福州大学 Visual angle level text emotion classification system based on double-graph convolutional neural network
CN115858725A (en) * 2022-11-22 2023-03-28 广西壮族自治区通信产业服务有限公司技术服务分公司 Method and system for screening text noise based on unsupervised graph neural network
CN115878800A (en) * 2022-12-12 2023-03-31 上海理工大学 Double-graph neural network fusing co-occurrence graph and dependency graph and construction method thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11526676B2 (en) * 2019-05-17 2022-12-13 Naver Corporation Implicit discourse relation classification with contextualized word representation
US20210248425A1 (en) * 2020-02-12 2021-08-12 Nec Laboratories America, Inc. Reinforced text representation learning
CN114548099B (en) * 2022-02-25 2024-03-26 桂林电子科技大学 Method for extracting and detecting aspect words and aspect categories jointly based on multitasking framework

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022001333A1 (en) * 2020-06-30 2022-01-06 首都师范大学 Hyperbolic space representation and label text interaction-based fine-grained entity recognition method
CN114186063A (en) * 2021-12-14 2022-03-15 合肥工业大学 Training method and classification method of cross-domain text emotion classification model
CN114528374A (en) * 2022-01-19 2022-05-24 浙江工业大学 Movie comment emotion classification method and device based on graph neural network
CN115858725A (en) * 2022-11-22 2023-03-28 广西壮族自治区通信产业服务有限公司技术服务分公司 Method and system for screening text noise based on unsupervised graph neural network
CN115878800A (en) * 2022-12-12 2023-03-31 上海理工大学 Double-graph neural network fusing co-occurrence graph and dependency graph and construction method thereof
CN115858788A (en) * 2022-12-19 2023-03-28 福州大学 Visual angle level text emotion classification system based on double-graph convolutional neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Bingxin Xue et al. The Study on the Text Classification Based on Graph Convolutional Network and BiLSTM. ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence. 2022, full text. *
Research on entropy-based term weighting methods in text classification; Chen Kewen et al.; Journal of Frontiers of Computer Science and Technology; 10(9); full text *
Text graph representation models and their applications in text mining; Li Gang; Mao Jin; Journal of the China Society for Scientific and Technical Information; (12); full text *

Also Published As

Publication number Publication date
CN116304061A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
CN116304061B (en) Text classification method, device and medium based on hierarchical text graph structure learning
CN110807154B (en) Recommendation method and system based on hybrid deep learning model
WO2020062770A1 (en) Method and apparatus for constructing domain dictionary, and device and storage medium
CN109299341A (en) One kind confrontation cross-module state search method dictionary-based learning and system
CN113010693A (en) Intelligent knowledge graph question-answering method fusing pointer to generate network
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
CN107220220A (en) Electronic equipment and method for text-processing
CN113051399B (en) Small sample fine-grained entity classification method based on relational graph convolutional network
CN115099219A (en) Aspect level emotion analysis method based on enhancement graph convolutional neural network
CN111274790A (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN115510226B (en) Emotion classification method based on graph neural network
CN112836051B (en) Online self-learning court electronic file text classification method
CN114722820A (en) Chinese entity relation extraction method based on gating mechanism and graph attention network
CN112597302A (en) False comment detection method based on multi-dimensional comment representation
CN116521882A (en) Domain length text classification method and system based on knowledge graph
CN110929532B (en) Data processing method, device, equipment and storage medium
CN114861636A (en) Training method and device of text error correction model and text error correction method and device
CN117251522A (en) Entity and relationship joint extraction model method based on latent layer relationship enhancement
CN117034916A (en) Method, device and equipment for constructing word vector representation model and word vector representation
Sekiyama et al. Automated proof synthesis for propositional logic with deep neural networks
Sekiyama et al. Automated proof synthesis for the minimal propositional logic with deep neural networks
CN113434698B (en) Relation extraction model establishing method based on full-hierarchy attention and application thereof
CN112380845B (en) Sentence noise design method, equipment and computer storage medium
Wei Recommended methods for teaching resources in public English MOOC based on data chunking
CN114528459A (en) Semantic-based webpage information extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant