CN113297838A - Relationship extraction method based on graph neural network - Google Patents


Info

Publication number
CN113297838A
CN113297838A (application CN202110563551.9A)
Authority
CN
China
Prior art keywords
expression
sentence
obtaining
pooling
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110563551.9A
Other languages
Chinese (zh)
Inventor
莫益军
姚盛楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Ezhou Institute of Industrial Technology Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Ezhou Institute of Industrial Technology Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology, Ezhou Institute of Industrial Technology Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202110563551.9A
Publication of CN113297838A
Legal status: Pending

Classifications

    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/295 Named entity recognition
    • G06F40/30 Semantic analysis
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

A relationship extraction method based on a graph neural network, comprising the steps of: carrying out data processing on the document to be extracted; constructing a model data set of the sentences in the document; obtaining semantic feature vectors of the sentences; generating an inter-entity neighborhood information expression of each sentence according to the data processing result and the semantic feature vector; strengthening the sentence expression according to the inter-entity neighborhood information expression; obtaining the sentence pooling expression and subject-object pooling expression of the sentence according to the data processing result and the sentence expression; cascading the sentence pooling expression and the subject-object pooling expression; and acquiring the relation category representation of the sentence according to the cascade representation. By improving the weight matrix, the method obtains association relations among multi-order words, and it fuses an attention mechanism to model the text content, so as to obtain complete dependency relations among semantics and achieve a better relation classification effect.

Description

Relationship extraction method based on graph neural network
Technical Field
The invention belongs to the technical field of relationship extraction, and particularly relates to a relationship extraction method based on a graph neural network.
Background
Relation extraction aims to capture semantic relationships between pairs of tagged entities in unstructured sentences. It plays an important role in natural language processing tasks, such as creating new structured knowledge bases, enhancing existing knowledge bases and building vertical-domain knowledge graphs, and it also supports upper-level applications such as question-answering systems, relational reasoning and search. A relation extraction task typically involves two or more specific entities and ultimately assigns their relationship to an existing relation class. A good relation extraction model helps in deeply understanding text content.
Most existing relation extraction models are based on deep learning, such as RNNs, CNNs and their improved variants. These models take a text sequence as input, obtain sentence-level and word-level representations through a feature extractor, and finally obtain the relation category between entities through a classifier. The predicate in a sentence is often crucial for extracting the relation, which means that if the distance between an entity and the predicate is too large, key information may be lost. To address this problem, a dependency tree is often adopted to capture long-range dependencies in the sentence, simplify complex sentences and extract core information. Early work applied LSTMs to the word sequence along the shortest dependency path; some researchers proposed DepNN, which applies an RNN to extract subtree features and a CNN to extract shortest-path features. However, these models operate directly on dependency trees, which are often hard to align, so batch training is difficult to implement, parallel training is limited and computational efficiency is low.
Disclosure of Invention
In view of the above, the present invention provides a graph neural network based relationship extraction method that overcomes or at least partially solves the above-mentioned problems.
In order to solve the technical problem, the invention provides a relation extraction method based on a graph neural network, which comprises the following steps:
carrying out data processing on the document to be extracted;
constructing a model data set of sentences in the document;
obtaining semantic feature vectors of the sentences;
generating an inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector;
enhancing sentence expression of the sentence according to the neighborhood information expression between the entities;
obtaining sentence pooling expression and subject-object pooling expression of the sentence according to the data processing result and the sentence expression;
carrying out cascade expression on the sentence pooling expression and the subject-object pooling expression;
and acquiring the relation category representation of the sentence according to the cascade representation.
Preferably, the data processing of the document to be extracted includes the steps of:
acquiring the document to be extracted;
carrying out data cleaning operation on sentences in the document to be extracted;
performing word segmentation operation on the sentence;
extracting dependency syntax relation information of the sentences;
and acquiring the subject and object position information of the sentence.
Preferably, the data washing operation of the sentences in the document to be extracted includes the steps of:
unifying all the sentences into the same preset format;
deleting useless paragraphs in all the sentences;
deleting the alien characters in all the sentences;
deleting the repeated contents in all the sentences;
and deleting useless contents in all the sentences.
Preferably, the constructing of the model data set of sentences in the document comprises the steps of:
obtaining a conditional random field model and a dependency syntax analysis of the graph;
constructing a sentence model of the sentence based on the conditional random field model;
generating a directed graph corresponding to each sentence to be analyzed based on the dependency syntax analysis of the graph;
determining position information and relation category information of entities in the sentence according to the directed graph;
integrating the relevant data information of the sentences;
and storing the related data information into a dictionary.
Preferably, the obtaining the semantic feature vector of the sentence comprises the steps of:
obtaining the model data set;
obtaining word vectors of the sentences in the model data set;
the word vector is input into the RNN,
obtaining sentence vector expression output by the RNN;
adding a position feature dimension in the sentence vector expression;
and acquiring an input feature vector of the graph convolution neural network model.
Preferably, the generating of the inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector includes:
obtaining dependency syntax relation information in the data processing result;
converting the dependency syntax relationship information into an adjacency matrix;
acquiring an input feature vector of a graph convolution neural network model;
inputting the adjacency matrix and the input feature vector into the graph convolution neural network model;
calculating first-order neighborhood dependence corresponding to the graph convolution neural network model;
acquiring a weighted graph convolution network model;
adding a virtual edge of a dependency tree in the weighted graph convolutional network model;
constructing a logical adjacency matrix of the dependency tree;
inputting the logical adjacency matrix into the weighted graph convolutional network model;
and calculating the k-order neighborhood dependence corresponding to the weighted graph convolutional network model.
Preferably, the enhancing the sentence expression of the sentence according to the inter-entity neighborhood information expression includes the steps of:
acquiring a relationship attention module and a position attention module;
obtaining the neighborhood information expression between the entities;
taking the inter-entity neighborhood information expression as an original feature expression of the position attention module;
calculating a position attention matrix of the sentence;
calculating a relational attention matrix of the sentence;
inputting the relationship attention matrix into a neural network model of a graph as an adjacency matrix;
and obtaining a graph convolution feature expression result output by the graph neural network model.
Preferably, the obtaining of the sentence pooling expression and the subject-object pooling expression of the sentence according to the data processing result and the sentence expression comprises the steps of:
obtaining a graph convolution characteristic expression result;
performing sentence pooling on the graph convolution feature expression result;
obtaining sentence pooling expression;
performing subject-object pooling on the graph convolution feature expression result;
and obtaining the subject-object pooling expression.
Preferably, the step of performing cascade representation on the sentence pooling expression and the subject-object pooling expression comprises the steps of:
obtaining the sentence pooling expression;
obtaining the subject-object pooling expression;
obtaining subject pooling expression in the subject-object pooling expression;
obtaining object pooling expression in the subject-object pooling expression;
splicing the sentence pooling expression, the subject pooling expression and the object pooling expression in sequence;
the final cascade expression is obtained.
Preferably, the obtaining of the relation category representation of the sentence according to the cascade representation comprises the steps of:
optimizing the relational representation of the sentences by using distribution reinforcement learning;
obtaining the cascade representation;
inputting the cascade representation into a feedforward neural network model;
obtaining a relation characteristic representation output by the feedforward neural network model;
obtaining the sentence pooling expression;
performing probability prediction on the relation characteristic representation according to the relation characteristic representation and the sentence pooling expression;
and estimating a distribution function of the relational expression by using distribution reinforcement learning.
One or more technical solutions in the embodiments of the present invention have at least the following technical effects or advantages: the relationship extraction method based on a graph neural network can be applied effectively to any dependency tree structure. It also addresses the limitation that a basic GCN only establishes dependency relations among first-order words, while connecting multi-order words requires multiple GCN layers, which causes over-smoothing. The method therefore obtains the association relations of multi-order words by improving the weight matrix, and fuses an attention mechanism to model the text content, obtaining the complete dependency relations among semantics and achieving a better relation classification effect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a relationship extraction method based on a graph neural network according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments and examples, and the advantages and various effects of the present invention will be more clearly apparent therefrom. It will be understood by those skilled in the art that these specific embodiments and examples are for the purpose of illustrating the invention and are not to be construed as limiting the invention.
Throughout the specification, unless otherwise specifically noted, terms used herein should be understood as having meanings as commonly used in the art. Accordingly, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. If there is a conflict, the present specification will control.
Unless otherwise specifically stated, various raw materials, reagents, instruments, equipment and the like used in the present invention are commercially available or can be prepared by existing methods.
Referring to fig. 1, in an embodiment of the present application, the present invention provides a graph neural network-based relationship extraction method, including the steps of:
s1: carrying out data processing on the document to be extracted;
in the embodiment of the present application, the data processing on the document to be extracted in step S1 includes the steps of:
acquiring the document to be extracted;
carrying out data cleaning operation on sentences in the document to be extracted;
performing word segmentation operation on the sentence;
extracting dependency syntax relation information of the sentences;
and acquiring the subject and object position information of the sentence.
In the embodiment of the application, when data processing is performed on the document to be extracted, the document is first cleaned according to a preset standard; word segmentation is then performed so that each sentence is decomposed into a number of phrases; the dependency syntax relation information between the phrases in the sentence is extracted; and the subject and object position information of the sentence is acquired.
In this embodiment of the present application, the performing data cleansing operation on the sentences in the document to be extracted includes:
unifying all the sentences into the same preset format;
deleting useless paragraphs in all the sentences;
deleting the alien characters in all the sentences;
deleting the repeated contents in all the sentences;
and deleting useless contents in all the sentences.
In the embodiment of the present application, when performing data cleansing operation on the sentences in the document to be extracted, specifically, unifying all the sentences in the document into the same preset format, and then deleting useless paragraphs, alien characters, repeated contents, and useless contents in the sentences according to a preset standard.
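The cleaning steps above can be sketched as a small pipeline. The concrete normalization rules below (whitespace collapsing, the character whitelist) are illustrative assumptions, not the patent's exact criteria:

```python
import re

def clean_sentences(sentences):
    """Data cleaning sketch: unify format, drop alien characters,
    repeated content and useless (empty) content."""
    cleaned, seen = [], set()
    for s in sentences:
        s = re.sub(r"\s+", " ", s.strip())  # unify into one preset format
        # drop "alien" characters outside a word/CJK/punctuation whitelist
        s = re.sub(r"[^\w\s\u4e00-\u9fff.,;:!?()'\"-]", "", s)
        if not s or s in seen:              # useless or repeated content
            continue
        seen.add(s)
        cleaned.append(s)
    return cleaned
```

Deduplication here is exact-match only; fuzzier notions of "repeated content" would need a similarity measure.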
S2: constructing a model data set of sentences in the document;
in the embodiment of the present application, the constructing of the model data set of the sentence in the document in step S2 includes the steps of:
obtaining a conditional random field model and a dependency syntax analysis of the graph;
constructing a sentence model of the sentence based on the conditional random field model;
generating a directed graph corresponding to each sentence to be analyzed based on the dependency syntax analysis of the graph;
determining position information and relation category information of entities in the sentence according to the directed graph;
integrating the relevant data information of the sentences;
and storing the related data information into a dictionary.
In the embodiment of the application, when the model data set of the sentences in the document is constructed, the sentence model is first built on a conditional random field model: word segmentation is treated as a binary decision task over the statistical sequence, in which each character is labeled either as the beginning of a word or as the continuation of a word. A Gaussian prior is then used to prevent overfitting, and the quasi-Newton method is used for parameter optimization. Further, for a particular character sequence, the probability that the conditional random field model assigns to a tag sequence is given by:
P(Y \mid X) = \frac{1}{Z(X)} \exp\Big( \sum_{c} \sum_{k} \lambda_k f_k(Y, X, c) \Big)

where Y is the label sequence of the sentence, X is the unsegmented character sequence, Z(X) is the normalization term, f_k is a feature function with weight λ_k, and c indexes the characters of the labeled sequence.
A directed graph is then generated for each sentence to be analyzed using graph-based dependency syntax analysis: the nodes of the directed graph are the words in the sentence, and its edges are the dependency relations between the words. Entity classification, and whether relations exist between the entities, are obtained through models based on BiLSTM and attention, finally yielding the dependency syntax relation information between the entities in the sentence, including parts of speech, the syntactic relations between entities, and the like. The position and relation category of each entity in the sentence are then determined from the directed graph; the related data information of the sentence is integrated, including word segmentation, parts of speech, syntactic relations among entities, entity positions and relation categories; and finally the related data information is stored into a json dictionary to complete construction of the model data set.
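One record of the model data set described above might look as follows. The field names and the toy sentence are illustrative assumptions; the patent only specifies that segmentation, parts of speech, dependency edges, entity positions and the relation category are stored in a json-style dictionary:

```python
import json

def build_example(tokens, pos_tags, dep_edges, subj_span, obj_span, relation):
    """Hedged sketch of one dataset record: segmented words (graph nodes),
    part-of-speech tags, (head, dependent, label) dependency edges,
    entity positions and the relation category."""
    return {
        "tokens": tokens,
        "pos": pos_tags,
        "deps": dep_edges,
        "subj": list(subj_span),
        "obj": list(obj_span),
        "relation": relation,
    }

example = build_example(
    ["Marie", "Curie", "discovered", "radium"],
    ["NNP", "NNP", "VBD", "NN"],
    [(2, 1, "nsubj"), (2, 3, "obj"), (1, 0, "compound")],
    (0, 1), (3, 3), "discovered",
)
serialized = json.dumps(example)   # stored into the json dictionary
```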
S3: obtaining semantic feature vectors of the sentences;
in this embodiment of the present application, the obtaining of the semantic feature vector of the sentence in step S3 includes the steps of:
obtaining the model data set;
obtaining word vectors of the sentences in the model data set;
the word vector is input into the RNN,
obtaining sentence vector expression output by the RNN;
adding a position feature dimension in the sentence vector expression;
and acquiring an input feature vector of the graph convolution neural network model.
In the embodiment of the present application, when obtaining the semantic feature vector of the sentence, a GloVe-based model is first used to obtain the word vectors of the sentence to be detected in the model data set, with the cost function:

J = \sum_{i,j=1}^{N} f(X_{ij}) \big( v_i^{\top} v_j + b_i + b_j - \log X_{ij} \big)^2

where v_i, v_j are the word vectors of word i and word j, b_i, b_j are two scalar bias terms, f is a weighting function, N is the size of the vocabulary, and the co-occurrence matrix has dimension N × N. To take context information into account, the word vectors are fed into the RNN, giving a sentence vector expression with context. Based on the obtained vector expression, and considering the importance of position features, a position feature dimension is added to obtain the input feature vector X of the graph convolutional neural network model. An improved adjacency matrix is then acquired from the dependency syntax information of the data processing result of step S1, and the semantic feature vectors are fed into the GCN for training, thereby generating the inter-entity neighborhood information expression.
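The assembly of the GCN input X can be sketched as follows. The signed-distance position encoding and the tiny vectors are illustrative assumptions; in the method itself the token vectors would come from GloVe fused through the RNN:

```python
def input_features(embeddings, subj_idx, obj_idx):
    """Append position-feature dimensions (the distance of each token
    to the subject and to the object entity) to each contextual
    word vector."""
    feats = []
    for i, vec in enumerate(embeddings):
        feats.append(list(vec) + [i - subj_idx, i - obj_idx])
    return feats

emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # stand-in contextual vectors
X = input_features(emb, subj_idx=0, obj_idx=2)
# X[0] == [0.1, 0.2, 0, -2]: first token is the subject, two hops before the object
```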
S4: generating an inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector;
in this embodiment of the present application, the generating of the inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector in step S4 includes the steps of:
obtaining dependency syntax relation information in the data processing result;
converting the dependency syntax relationship information into an adjacency matrix;
acquiring an input feature vector of a graph convolution neural network model;
inputting the adjacency matrix and the input feature vector into the graph convolution neural network model;
calculating first-order neighborhood dependence corresponding to the graph convolution neural network model;
acquiring a weighted graph convolution network model;
adding a virtual edge of a dependency tree in the weighted graph convolutional network model;
constructing a logical adjacency matrix of the dependency tree;
inputting the logical adjacency matrix into the weighted graph convolutional network model;
and calculating the k-order neighborhood dependence corresponding to the weighted graph convolutional network model.
In the embodiment of the present application, when generating the inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector, the dependency syntax relation information obtained in step S1 is first converted into an adjacency matrix. For a sentence with words a_1, a_2, a_3, ..., a_n, the adjacency matrix A has dimension n × n, with A_ij = 1 if entity a_i is associated with entity a_j and A_ij = 0 otherwise. The adjacency matrix A and the input feature vector X generated in step S3 are then fed into the GCN model. For a graph G = (V, E), the input of the GCN is a feature matrix X of shape N × d, where N denotes the number of nodes in the graph and d is the input feature dimension of each node. Each GCN layer updates the node features as:

H^{(l+1)} = \sigma\big( A H^{(l)} W^{(l)} + b^{(l)} \big)

where H^{(0)} is the feature matrix X, W^{(l)} is a linear transformation, b^{(l)} is a bias term, and σ is a non-linear function, here the ReLU activation. Since feature fusion in a single GCN layer only represents the first-order neighborhood dependency, a Weighted Graph Convolutional Network (WGCN) is used when k-order neighborhood features are further needed to realize multi-hop feature fusion. In this model, virtual edges are added to the dependency tree to construct a Logical Adjacency Matrix (LAM), from which a single-layer WGCN can directly obtain the k-order neighborhood dependency, with the formula:
\tilde{A}_{ij} = \mathrm{weight}(d_{ij})

where d_{ij} is the distance between nodes i and j on the dependency tree and weight(d) gives the weight coefficient for feature fusion between the nodes. The shorter the distance between nodes, the larger the weight, and vice versa; the fusion weight coefficient between adjacent nodes is 1, i.e. the maximum information fusion weight.
Thus updated, the GCN expression is computed as:

H^{(l+1)} = \sigma\big( \tilde{A} H^{(l)} W^{(l)} + b^{(l)} \big)
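The logical adjacency matrix described above can be sketched as follows. The 1/d decay is an illustrative assumption for weight(d); the text only requires weight(1) = 1 and weights that shrink with distance:

```python
from collections import deque

def logical_adjacency(n, edges, k, weight=lambda d: 1.0 / d):
    """LAM sketch: virtual edges connect every pair of words within k hops
    on the dependency tree, weighted so that closer pairs fuse more
    information (adjacent nodes get the maximum weight 1)."""
    adj = {i: [] for i in range(n)}
    for a, b in edges:                      # dependency tree, undirected here
        adj[a].append(b)
        adj[b].append(a)
    lam = [[0.0] * n for _ in range(n)]
    for src in range(n):                    # BFS distances from each node
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for v, d in dist.items():
            if 0 < d <= k:
                lam[src][v] = weight(d)
    return lam

# chain dependency tree 0-1-2-3, second-order neighborhood (k = 2)
lam = logical_adjacency(4, [(0, 1), (1, 2), (2, 3)], k=2)
# lam[0] == [0.0, 1.0, 0.5, 0.0]
```

A single layer applied with this matrix fuses k-hop neighborhoods, which is what lets a 1-layer WGCN replace a stack of plain GCN layers.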
s5: enhancing sentence expression of the sentence according to the neighborhood information expression between the entities;
in this embodiment of the present application, the step S5 of enhancing the sentence expression of the sentence according to the inter-entity neighborhood information expression includes the steps of:
acquiring a relationship attention module and a position attention module;
obtaining the neighborhood information expression between the entities;
taking the inter-entity neighborhood information expression as an original feature expression of the position attention module;
calculating a position attention matrix of the sentence;
calculating a relational attention matrix of the sentence;
inputting the relationship attention matrix into a neural network model of a graph as an adjacency matrix;
and obtaining a graph convolution feature expression result output by the graph neural network model.
In the embodiment of the present application, when the sentence expression is strengthened according to the inter-entity neighborhood information expression, a relation attention module and a position attention module are added, so as to better attend to the dependency information between entities and the certainty of the node expression, improving the recognition rate while the nodes express the dependency relations. The position attention module models the spatial relationship between any two positions. The inter-entity neighborhood information expression obtained in step S4 is first taken as the original feature expressions C, D and E of the position attention module, where C, D, E are feature transforms of the first-layer GCN output h. A matrix multiplication is performed between D and the transpose of C, and the position attention matrix is computed with a softmax layer:

q_{ij} = \frac{\exp(D_i \cdot C_j)}{\sum_{j=1}^{n} \exp(D_i \cdot C_j)}

where q_{ij} represents the effect of the j-th position on the i-th position; the more similar the features of two positions, the greater the correlation between them. A matrix multiplication is then performed between E and q. Finally, the result is multiplied by a learning factor α, which gradually learns to allocate more weight, and is taken as the final result of the first-layer GCN:

h^{p}_{i} = \alpha \sum_{j=1}^{n} q_{ij} E_{j}
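The position-attention step can be sketched as follows. C, D, E stand in for the three feature transforms of the GCN output, and the fixed alpha is an illustrative assumption (in the method itself it is a learned factor):

```python
import math

def position_attention(C, D, E, alpha=0.1):
    """Position attention sketch: q[i][j] = softmax_j(D_i . C_j), then an
    alpha-scaled weighted sum over E."""
    n = len(C)
    out = []
    for i in range(n):
        scores = [sum(d * c for d, c in zip(D[i], C[j])) for j in range(n)]
        m = max(scores)                       # numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        q = [e / z for e in exps]
        out.append([alpha * sum(q[j] * E[j][t] for j in range(n))
                    for t in range(len(E[0]))])
    return out

I2 = [[1.0, 0.0], [0.0, 1.0]]
h_p = position_attention(I2, I2, I2)
# each attention row sums to 1, so each output row sums to alpha
```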
Further, the relation attention matrix is generated from the dependencies of the nodes: the entry for i and j is initially 1 if the two nodes are related and 0 otherwise. The relation features are generated by a self-attention mechanism and computed through a softmax layer:

r_{ij} = \frac{\exp(A_{ij})}{\sum_{j=1}^{n} \exp(A_{ij})}

The tighter the relationship between two nodes, the greater its impact on this value. The attention matrix is then multiplied by the original node feature A. Finally, the result is multiplied by a learning factor β and element-wise summed with the original features:

h^{r}_{i} = \beta \sum_{j=1}^{n} r_{ij} A_{j} + A_{i}
The obtained relation attention expression, as a new feature matrix, replaces the original adjacency matrix and is fed into the next GCN layer to obtain the final graph convolution feature expression result h_r:

h_r = \sigma\big( \tilde{A}^{rel} H W + b \big)
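The relation-attention update and the following GCN layer can be sketched together. The row-wise softmax over the 0/1 dependency adjacency is a simplified stand-in for the self-attention described above, and the bias is omitted:

```python
import math

def row_softmax(adj):
    """Relation-attention sketch: normalize each row of the 0/1 dependency
    adjacency with a softmax; the result replaces the adjacency fed to
    the next GCN layer."""
    out = []
    for row in adj:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        z = sum(exps)
        out.append([e / z for e in exps])
    return out

def gcn_layer(A, H, W):
    """One graph-convolution step h' = ReLU(A H W), bias omitted."""
    n, d_in, d_out = len(H), len(H[0]), len(W[0])
    AH = [[sum(A[i][k] * H[k][j] for k in range(n)) for j in range(d_in)]
          for i in range(n)]
    return [[max(0.0, sum(AH[i][k] * W[k][j] for k in range(d_in)))
             for j in range(d_out)] for i in range(n)]

A_rel = row_softmax([[1.0, 1.0], [0.0, 1.0]])
h_r = gcn_layer(A_rel, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```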
s6: obtaining sentence pooling expression and subject-object pooling expression of the sentence according to the data processing result and the sentence expression;
in the embodiment of the present application, the obtaining of the sentence pooling expression and the subject-object pooling expression of the sentence according to the data processing result and the sentence expression in step S6 includes:
obtaining a graph convolution characteristic expression result;
performing sentence pooling on the graph convolution feature expression result;
obtaining sentence pooling expression;
performing subject-object pooling on the graph convolution feature expression result;
and obtaining the subject-object pooling expression.
In the embodiment of the present application, in order to obtain the important features of a sample and increase the operation speed, the graph convolution feature expression result h_r obtained in step S5 is pooled. The pooling operation is divided into two types: sentence pooling and subject-object pooling. In sentence pooling, a mask is first applied in which the positions of all entities are set to 0, and max pooling is then used to obtain the pooling representation pool_sentence, i.e. the feature expression of all non-entity tokens in the sentence:

pool\_sentence = \mathrm{maxpool}\big( \mathrm{mask}_{sent}(h_r) \big)
in the mask mode of the host-object pooling use position, the positions corresponding to the subject and the object are respectively set to be 0 to obtain a mask _ object and a mask _ subject of the subject and the object, and then the final subject pooling expression pool _ subject and the final object pooling expression pool _ object are obtained by using the maximum pooling, wherein the specific formula is as follows:
Figure BDA0003078692270000122
Figure BDA0003078692270000123
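The masked max-pooling and the cascade that follows can be sketched with one helper. Representing each mask by the set of row indices it keeps is an illustrative simplification:

```python
def max_pool(h, keep):
    """Masked max pooling sketch: take a column-wise max over the rows of
    the graph-convolution output h listed in `keep`. Different index sets
    yield pool_sentence, pool_subject and pool_object."""
    cols = len(h[0])
    return [max(h[i][c] for i in keep) for c in range(cols)]

h = [[1.0, 5.0], [4.0, 2.0], [3.0, 3.0], [0.0, 9.0]]
pool_subject = max_pool(h, keep=[0])        # subject token positions
pool_object = max_pool(h, keep=[3])         # object token positions
pool_sentence = max_pool(h, keep=[1, 2])    # non-entity token positions
h_out = pool_sentence + pool_subject + pool_object   # cascade (concatenation)
# h_out == [4.0, 3.0, 1.0, 5.0, 0.0, 9.0]
```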
s7: carrying out cascade expression on the sentence pooling expression and the subject-object pooling expression;
in this embodiment, the step of cascading the sentence pooling expression and the subject-object pooling expression in step S7 includes the steps of:
obtaining the sentence pooling expression;
obtaining the subject-object pooling expression;
obtaining subject pooling expression in the subject-object pooling expression;
obtaining object pooling expression in the subject-object pooling expression;
splicing the sentence pooling expression, the subject pooling expression and the object pooling expression in sequence;
the final cascade expression is obtained.
In this embodiment of the application, when the sentence pooling expression and the subject-object pooling expression are cascaded, the sentence pooling expression pool_sentence, the subject pooling expression pool_subject, and the object pooling expression pool_object obtained in step S6 are spliced to obtain the final cascade expression, specifically as follows:
hout=cat[pool_sentence;,pool_subject;pool_object]。
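The cascade step itself is ordinary vector concatenation; a minimal sketch with hypothetical pooled vectors:

```python
# Sketch of the cascade (concatenation) step: cat[a; b; c] is plain
# vector concatenation. The three pooled vectors are hypothetical values.
pool_sentence = [0.4, 0.6, 0.8]
pool_subject = [0.1, 0.9, 0.3]
pool_object = [0.7, 0.5, 0.1]

h_out = pool_sentence + pool_subject + pool_object
```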
s8: and acquiring the relation category representation of the sentence according to the cascade representation.
In this embodiment of the present application, the obtaining of the relation category representation of the sentence according to the cascade representation in step S8 includes the steps of:
optimizing the relational representation of the sentences by using distribution reinforcement learning;
obtaining the cascade representation;
inputting the cascade representation into a feedforward neural network model;
obtaining a relation characteristic representation output by the feedforward neural network model;
obtaining the sentence pooling expression;
performing probability prediction on the relation characteristic representation according to the relation characteristic representation and the sentence pooling expression;
and estimating a distribution function of the relational expression by using distribution reinforcement learning.
In the embodiment of the application, when the relation type representation of the sentence is obtained from the cascade representation, the relational representation of the sentence is first optimized with distributional reinforcement learning: the entity to be classified serves as the state, the relation classification serves as the action, and the deviation between expectation and prediction serves as the reward, so that correct behavior is reinforced by the reward. The cascade representation obtained in step S7 is then passed through one layer of a feed-forward neural network (FFNN) to obtain the relation feature representation, specifically as follows:
rij=FFNN(hout),
where rij represents the relation feature representation for entity i and entity j.
Based on the obtained relation feature representation rij and the sentence pooling expression pool_sentence obtained in step S6, probability prediction is performed on the output relation feature using a softmax function, with the specific formula:
P(rij|hobject,hsubject,hsentence)=softmax(MLP(rij)),
wherein MLP (.) is a multi-layer perceptron.
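A minimal sketch of this classification head in plain Python: one FFNN layer produces the relation feature, and a single linear layer stands in for MLP(·) before the softmax. All layer sizes and the random weights are hypothetical.

```python
import math
import random

def linear(x, W, b):
    """y = W x + b, with W as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)                      # subtract max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

random.seed(0)
D_IN, D_HID, N_REL = 9, 6, 4        # assumed dimensions
W1 = [[random.uniform(-0.1, 0.1) for _ in range(D_IN)] for _ in range(D_HID)]
b1 = [0.0] * D_HID
W2 = [[random.uniform(-0.1, 0.1) for _ in range(D_HID)] for _ in range(N_REL)]
b2 = [0.0] * N_REL

h_out = [0.4, 0.6, 0.8, 0.1, 0.9, 0.3, 0.7, 0.5, 0.1]  # cascade expression
r_ij = relu(linear(h_out, W1, b1))       # r_ij = FFNN(h_out)
probs = softmax(linear(r_ij, W2, b2))    # P(...) = softmax(MLP(r_ij))
```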
The probability prediction value is then converted into a state-value matrix Q, and the optimal expected value is obtained through iteration via the Bellman optimality equation:
Q(h, r) = E[R(h, r) + γ max_(r') Q(h', r')]
where h and r denote the entity to be classified and its corresponding relation, Q(h, r) denotes the accumulated return obtained when action r is executed in state h, and γ is the discount factor.
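The Bellman optimality recursion above can be illustrated with a tiny tabular value-iteration sketch; the states, rewards, transitions, and γ below are all invented for illustration.

```python
# Tiny tabular value-iteration sketch of the Bellman optimality update.
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
# R[h][r]: reward for choosing relation r in entity-state h (hypothetical).
R = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# next_state[h][r]: deterministic transition (an assumption for the sketch).
next_state = [[1, 2], [2, 0], [0, 1]]

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
for _ in range(100):  # iterate until (approximately) converged
    Q = [[R[h][r] + GAMMA * max(Q[next_state[h][r]])
          for r in range(N_ACTIONS)]
         for h in range(N_STATES)]
```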
Reinforcement learning focuses on the expectation of future reward; since what is evaluated has not yet happened, the prediction necessarily involves uncertainty, and the magnitude of that uncertainty has a very important influence on the decision. Distributional reinforcement learning is therefore used: by replacing the learned expected return with the learned probability distribution of the return, the entire distribution function can be estimated rather than only its expected value.
The distributional Bellman operator is:
Z(h, r) =_D R(h, r) + γ Z(h', r')
where Z is a random variable representing the return obtained after executing action r in state h. The loss can be computed with cross entropy, with the specific formula:
L = -Σ_(s∈S) log P(r_s | h_object, h_subject, h_sentence)
where S represents a set of sentences and S represents one sentence in the set.
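A minimal sketch of this summed cross-entropy loss over a sentence set, with hypothetical predicted distributions and gold relation labels:

```python
import math

# Each entry pairs a predicted relation distribution for one sentence
# with the index of its gold relation (all values hypothetical).
predictions = [
    ([0.7, 0.2, 0.1], 0),
    ([0.1, 0.8, 0.1], 1),
    ([0.3, 0.3, 0.4], 2),
]

# Cross entropy: negative log-probability of the gold relation, summed
# over the sentence set S.
loss = -sum(math.log(probs[gold]) for probs, gold in predictions)
```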
The relationship extraction method based on the graph neural network can be applied effectively to any dependency tree structure. It also addresses the limitation that a basic GCN only models dependencies between first-order words: connecting words across multiple orders requires stacking GCN layers, which causes over-smoothing. The method therefore obtains the association relationships of multi-order words by improving the weight matrix, and at the same time fuses an attention mechanism to model the text content, obtaining the complete dependency relationships among semantics and achieving a better relation classification effect.
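The k-order neighborhood idea (connecting words within k hops of the dependency tree instead of stacking GCN layers) can be sketched with a boolean logical adjacency matrix; this is a simplified, unweighted reading of the method, with a hypothetical chain-shaped dependency tree.

```python
def matmul_bool(A, B):
    """Boolean matrix product: 1 if any path through an intermediate node."""
    n = len(A)
    return [[1 if any(A[i][k] and B[k][j] for k in range(n)) else 0
             for j in range(n)] for i in range(n)]

def logical_adjacency(A, k):
    """OR together A, A^2, ..., A^k so i and j are linked within k hops."""
    n = len(A)
    power, result = A, [row[:] for row in A]
    for _ in range(k - 1):
        power = matmul_bool(power, A)
        result = [[1 if result[i][j] or power[i][j] else 0
                   for j in range(n)] for i in range(n)]
    return result

# Chain dependency tree 0-1-2-3 (undirected adjacency, self-loops omitted).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
A2 = logical_adjacency(A, 2)   # now 0-2 and 1-3 are also connected
```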
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
In short, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A relationship extraction method based on a graph neural network is characterized by comprising the following steps:
carrying out data processing on the document to be extracted;
constructing a model data set of sentences in the document;
obtaining semantic feature vectors of the sentences;
generating an inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector;
enhancing sentence expression of the sentence according to the neighborhood information expression between the entities;
obtaining sentence pooling expression and subject-object pooling expression of the sentence according to the data processing result and the sentence expression;
carrying out cascade expression on the sentence pooling expression and the subject-object pooling expression;
and acquiring the relation category representation of the sentence according to the cascade representation.
2. The relational extraction method based on the graph neural network according to claim 1, wherein the data processing of the document to be extracted comprises the steps of:
acquiring the document to be extracted;
carrying out data cleaning operation on sentences in the document to be extracted;
performing word segmentation operation on the sentence;
extracting dependency syntax relation information of the sentences;
and acquiring the subject and object position information of the sentence.
3. The relation extraction method based on the graph neural network as claimed in claim 2, wherein the data cleaning operation on the sentences in the document to be extracted comprises the steps of:
unifying all the sentences into the same preset format;
deleting useless paragraphs in all the sentences;
deleting the alien characters in all the sentences;
deleting the repeated contents in all the sentences;
and deleting useless contents in all the sentences.
4. The graph neural network-based relationship extraction method according to claim 1, wherein the constructing of the model data set of the sentences in the document comprises the steps of:
obtaining a conditional random field model and a dependency syntax analysis of the graph;
constructing a sentence model of the sentence based on the conditional random field model;
generating a directed graph corresponding to each sentence to be analyzed based on the dependency syntax analysis of the graph;
determining position information and relation category information of entities in the sentence according to the directed graph;
integrating the relevant data information of the sentences;
and storing the related data information into a dictionary.
5. The method for extracting relationship based on graph neural network according to claim 1, wherein said obtaining semantic feature vector of said sentence comprises the steps of:
obtaining the model data set;
obtaining word vectors of the sentences in the model data set;
inputting the word vector into an RNN;
obtaining sentence vector expression output by the RNN;
adding a position feature dimension in the sentence vector expression;
and acquiring an input feature vector of the graph convolution neural network model.
6. The method for extracting relationship based on graph neural network according to claim 1, wherein the generating the inter-entity neighborhood information expression of the sentence according to the data processing result and the semantic feature vector comprises the steps of:
obtaining dependency syntax relation information in the data processing result;
converting the dependency syntax relationship information into an adjacency matrix;
acquiring an input feature vector of a graph convolution neural network model;
inputting the adjacency matrix and the input feature vector into the graph convolution neural network model;
calculating first-order neighborhood dependence corresponding to the graph convolution neural network model;
acquiring a weighted graph convolution network model;
adding a virtual edge of a dependency tree in the weighted graph convolutional network model;
constructing a logical adjacency matrix of the dependency tree;
inputting the logical adjacency matrix into the weighted graph convolutional network model;
and calculating the k-order neighborhood dependence corresponding to the weighted graph convolutional network model.
7. The method for extracting relationship based on graph neural network of claim 1, wherein said enhancing sentence expression of said sentence according to said inter-entity neighborhood information expression comprises the steps of:
acquiring a relationship attention module and a position attention module;
obtaining the neighborhood information expression between the entities;
taking the inter-entity neighborhood information expression as an original feature expression of the position attention module;
calculating a position attention matrix of the sentence;
calculating a relational attention matrix of the sentence;
inputting the relationship attention matrix into a neural network model of a graph as an adjacency matrix;
and obtaining a graph convolution feature expression result output by the graph neural network model.
8. The graph neural network-based relationship extraction method according to claim 1, wherein the obtaining of the sentence pooling expression and the subject-object pooling expression of the sentence according to the data processing result and the sentence expression comprises:
obtaining a graph convolution characteristic expression result;
performing sentence pooling on the graph convolution feature expression result;
obtaining sentence pooling expression;
performing subject-object pooling on the graph convolution feature expression result;
and obtaining the subject-object pooling expression.
9. The graph neural network-based relationship extraction method according to claim 1, wherein the cascade representation of the sentence pooling expression and the subject-object pooling expression comprises the steps of:
obtaining the sentence pooling expression;
obtaining the subject-object pooling expression;
obtaining subject pooling expression in the subject-object pooling expression;
obtaining object pooling expression in the subject-object pooling expression;
splicing the sentence pooling expression, the subject pooling expression and the object pooling expression in sequence;
the final cascade expression is obtained.
10. The method for extracting relationship based on graph neural network according to claim 1, wherein said obtaining the relationship class representation of the sentence according to the cascade representation comprises the steps of:
optimizing the relational representation of the sentences by using distribution reinforcement learning;
obtaining the cascade representation;
inputting the cascade representation into a feedforward neural network model;
obtaining a relation characteristic representation output by the feedforward neural network model;
obtaining the sentence pooling expression;
performing probability prediction on the relation characteristic representation according to the relation characteristic representation and the sentence pooling expression;
and estimating a distribution function of the relational expression by using distribution reinforcement learning.
CN202110563551.9A 2021-05-21 2021-05-21 Relationship extraction method based on graph neural network Pending CN113297838A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110563551.9A CN113297838A (en) 2021-05-21 2021-05-21 Relationship extraction method based on graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110563551.9A CN113297838A (en) 2021-05-21 2021-05-21 Relationship extraction method based on graph neural network

Publications (1)

Publication Number Publication Date
CN113297838A true CN113297838A (en) 2021-08-24

Family

ID=77324139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110563551.9A Pending CN113297838A (en) 2021-05-21 2021-05-21 Relationship extraction method based on graph neural network

Country Status (1)

Country Link
CN (1) CN113297838A (en)


Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107430600A (en) * 2014-12-12 2017-12-01 慧与发展有限责任合伙企业 Expansible web data extraction
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
CN109165737A (en) * 2018-08-29 2019-01-08 电子科技大学 Porosity prediction method based on condition random field and BP neural network
CN111241295A (en) * 2020-01-03 2020-06-05 浙江大学 Knowledge map relation data extraction method based on semantic syntax interactive network
CN111241294A (en) * 2019-12-31 2020-06-05 中国地质大学(武汉) Graph convolution network relation extraction method based on dependency analysis and key words
CN111651974A (en) * 2020-06-23 2020-09-11 北京理工大学 Implicit discourse relation analysis method and system
CN111831783A (en) * 2020-07-07 2020-10-27 北京北大软件工程股份有限公司 Chapter-level relation extraction method
CN111985245A (en) * 2020-08-21 2020-11-24 江南大学 Attention cycle gating graph convolution network-based relation extraction method and system
CN112001186A (en) * 2020-08-26 2020-11-27 重庆理工大学 Emotion classification method using graph convolution neural network and Chinese syntax
CN112001187A (en) * 2020-08-26 2020-11-27 重庆理工大学 Emotion classification system based on Chinese syntax and graph convolution neural network
CN112001185A (en) * 2020-08-26 2020-11-27 重庆理工大学 Emotion classification method combining Chinese syntax and graph convolution neural network
CN112487807A (en) * 2020-12-09 2021-03-12 重庆邮电大学 Text relation extraction method based on expansion gate convolution neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhijiang Guo et al.: "Attention Guided Graph Convolutional Networks for Relation Extraction", arXiv:1906.07510v8, pages 1-13 *
Maihemuti Maimaiti et al.: "Recognition of Uyghur organization names based on conditional random fields", Computer Engineering and Design, vol. 40, no. 01, pages 273-278 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449084A (en) * 2021-09-01 2021-09-28 中国科学院自动化研究所 Relationship extraction method based on graph convolution
CN116521899A (en) * 2023-05-08 2023-08-01 中国传媒大学 Improved graph neural network-based document-level relation extraction algorithm and system
CN116521899B (en) * 2023-05-08 2024-03-26 中国传媒大学 Improved graph neural network-based document level relation extraction method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination