CN116702755A - Document level relation extraction method based on dependency syntax graph and phrase structure tree
- Publication number: CN116702755A
- Application number: CN202310749338.6A
- Authority: CN (China)
- Prior art keywords: dependency syntax, document, tree, relation, predicted value
- Prior art date
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F18/253—Fusion techniques of extracted features
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, which comprises the following steps: encoding the document and obtaining its character-level embedded representation and attention matrix through a pre-trained language model; constructing phrase structure trees and calculating a predicted value of the relation between entity pairs; constructing a dependency syntax graph comprising two types of nodes and three types of edges, and calculating, from the dependency syntax graph and the character-level embedded representation of the document, a predicted value between entity pairs based on the dependency syntax relation; obtaining a final predicted value from the dependency-syntax-based predicted value and the predicted value of the relation between the entity pairs, deriving a loss function from the final predicted value, training the dependency syntax relation model with the loss function, and processing the documents to be processed with the trained model, realizing document-level relation extraction.
Description
Technical Field
The invention relates to the field of natural language processing, and in particular to a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree.
Background
Relation extraction is a key task in information extraction that aims to model the relation patterns between entities in unstructured text. The task has two settings: sentence-level relation extraction and document-level relation extraction. In traditional sentence-level relation extraction, both entities of a relation appear within a single sentence; document-level relation extraction is not limited to one sentence, better matches real-world scenarios, and is therefore receiving increasing attention.
One major challenge in document-level relation extraction is inferring the relations of multiple entity pairs from long passages that may contain irrelevant or even noisy information. Existing document-level relation extraction methods often perform poorly on complex relation instances surrounded by large amounts of irrelevant information, because they learn such relations from context alone; the grammatical information of the document also needs to be considered.
Disclosure of Invention
Aiming at the above defects in the prior art, the document-level relation extraction method based on a dependency syntax graph and a phrase structure tree provided by the invention solves the problem that existing document-level relation extraction methods perform poorly when they encounter complex relation instances.
To achieve the object of the invention, the following technical scheme is adopted: a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring a character-level embedded representation of the document through a pre-trained language model;
S2, constructing phrase structure trees and calculating a predicted value of the relation between entity pairs using a Tree-LSTM model;
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating a predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
wherein the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
S4, obtaining a final predicted value from the predicted value based on the dependency syntax relation and the predicted value based on the phrase structure relation between the entity pairs, deriving a loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain a trained dependency syntax relation model;
S5, processing the documents whose relations are to be extracted with the trained dependency syntax relation model, realizing document-level relation extraction.
Further: the step S1 comprises the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document.
Further: the step S2 comprises the following sub-steps:
s21, constructing a phrase structure Tree of each sentence of the document, and modeling by using a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
S22, fusing the sentence vector embedded representations of all sentences to obtain a vector representation of the document;
S23, calculating the predicted value of the relation between the entity pair with a bilinear layer, from the embedded representation of the entity pair and the vector representation of the document.
Further: the step S21 includes the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
Further: the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
Further: the step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
S32, inputting each sentence of the document into a dependency syntax analyzer and generating the dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, completing the construction of the dependency syntax relation model, and calculating the predicted value based on the dependency syntax relation between the entity pair through the dependency syntax relation model.
Further: in step S31, the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention.
Further: in step S33, the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively.
Further: the step S4 includes the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
The beneficial effects of the invention are as follows:
1. A dependency graph is constructed to extract the syntactic information within each sentence, supplementing the original text information and enhancing the text representation capability;
2. A phrase structure tree organizes the hierarchical grammatical information of long sentences, realizing a fine-grained division of long sentences;
3. By fusing additional grammatical information and capturing long-sentence dependency information through the dependency graph and the phrase structure tree, the document is represented better and the document-level relation extraction effect is improved.
Drawings
FIG. 1 is a flow chart of the document-level relation extraction method according to the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of these embodiments; to those skilled in the art, any invention that makes use of the inventive concept falls within the protection of the invention, within the spirit and scope defined by the appended claims.
As shown in FIG. 1, one embodiment of the present invention provides a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring the character-level embedded representation and the attention matrix of the document through a pre-trained language model;
in this embodiment, the step S1 includes the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document;
the whole process is represented as follows:

where $H \in \mathbb{R}^{T \times d}$ is the character-level embedded representation of the document, $A \in \mathbb{R}^{T \times T}$ is the attention matrix, $T$ is the number of characters, $d$ is the dimension of the character embedding, $N$ is the total number of sentences in the document, $P_N$ is the number of characters in the $N$-th sentence, and $\mathbb{R}$ denotes a real matrix of the indicated size;
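For concreteness, the following is a minimal sketch of step S1, assuming a HuggingFace-style pre-trained encoder; the model name, the "*" mention marker, and averaging the last layer's attention heads into the matrix A are illustrative choices, not specified by the patent.

```python
# Minimal sketch of step S1 (illustrative, not the patent's reference implementation).
# Assumptions: a HuggingFace-style encoder ("bert-base-chinese"), "*" as the special
# mention marker, and the head-averaged last attention layer as the attention matrix A.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese", output_attentions=True)

def encode_document(sentences, mentions):
    """sentences: list of sentence strings; mentions: (sent_idx, start, end) char spans."""
    marked = []
    for i, sent in enumerate(sentences):
        # S11: insert a special symbol before and after each mention, working right to
        # left so earlier character offsets stay valid.
        spans = sorted(((m[1], m[2]) for m in mentions if m[0] == i), reverse=True)
        for start, end in spans:
            sent = sent[:start] + "*" + sent[start:end] + "*" + sent[end:]
        marked.append(sent)
    # S12: feed all characters of the encoded document into the pre-trained model.
    inputs = tokenizer("".join(marked), return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = encoder(**inputs)
    H = out.last_hidden_state[0]           # character-level embeddings, shape (T, d)
    A = out.attentions[-1][0].mean(dim=0)  # attention matrix, shape (T, T)
    return H, A
```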
s2, constructing a phrase structure Tree, and calculating a predicted value of the relation between entity pairs by adopting a Tree-LSTM model;
the step S2 comprises the following sub-steps:
s21, constructing a phrase structure Tree of each sentence of the document, and modeling by using a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
the step S21 includes the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
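The equations of S2101 to S2105 are those of a Child-Sum Tree-LSTM cell; a minimal PyTorch sketch of one such cell follows. The stacked weight layout, the dimensions d_in/d_h, and the caller-driven bottom-up recursion are illustrative assumptions.

```python
# Minimal sketch of the Child-Sum Tree-LSTM cell of S2101-S2106 (PyTorch).
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, d_in, d_h):
        super().__init__()
        self.W = nn.Linear(d_in, 4 * d_h)                 # W^(i), W^(f), W^(o), W^(u)
        self.U_iou = nn.Linear(d_h, 3 * d_h, bias=False)  # U^(i), U^(o), U^(u)
        self.U_f = nn.Linear(d_h, d_h, bias=False)        # U^(f), applied per child

    def forward(self, x_j, child_h, child_c):
        # x_j: (d_in,); child_h, child_c: (num_children, d_h), empty tensors for leaves
        h_sum = child_h.sum(dim=0)                        # sum over children of h_jl
        w_i, w_f, w_o, w_u = self.W(x_j).chunk(4)
        u_i, u_o, u_u = self.U_iou(h_sum).chunk(3)
        i_j = torch.sigmoid(w_i + u_i)                    # input gate, S2101
        f_jk = torch.sigmoid(w_f + self.U_f(child_h))     # one forget gate per child, S2102
        o_j = torch.sigmoid(w_o + u_o)                    # output gate, S2103
        u_j = torch.tanh(w_u + u_u)                       # candidate state
        c_j = i_j * u_j + (f_jk * child_c).sum(dim=0)     # memory cell, S2104
        h_j = o_j * torch.tanh(c_j)                       # updated hidden state, S2105
        return h_j, c_j
```

Under these assumptions, the sentence vector of S21 would be taken from the hidden state returned at the root of each phrase structure tree.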
S22, fusing the sentence vector embedded representations of all sentences to obtain the vector representation of the document;
s23, calculating a predicted value of the relation between the entity pairs by utilizing a bilinear layer according to the embedded representation of the entity pairs and the vector representation of the document;
the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
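A minimal sketch of this bilinear scorer is given below, with one bilinear form per relation class; all dimensions are assumed for illustration.

```python
# Minimal sketch of the bilinear layer of S23: z_const = pair_{s,o} W_const v_docu + b_const.
import torch
import torch.nn as nn

class BilinearScorer(nn.Module):
    def __init__(self, d_pair, d_doc, num_relations):
        super().__init__()
        # One d_pair x d_doc bilinear form per relation class.
        self.W = nn.Parameter(torch.empty(num_relations, d_pair, d_doc))
        self.b = nn.Parameter(torch.zeros(num_relations))
        nn.init.xavier_uniform_(self.W)

    def forward(self, pair_so, v_docu):
        # pair_so: (d_pair,), v_docu: (d_doc,) -> z_const: (num_relations,)
        return torch.einsum("p,rpd,d->r", pair_so, self.W, v_docu) + self.b
```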
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating the predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
the step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
in step S31, the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention;
s32, inputting each sentence of the document into a dependency syntax analyzer, and generating a dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
in step S33, the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively;
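The sketch below assembles the graph of S31 to S33. Since the printed formula for the unidirectional root-to-root weight did not survive reproduction here, the sigmoid of a scaled dot product between root embeddings is used as an assumed stand-in, and the mention-to-character edges are likewise an assumed edge type.

```python
# Minimal sketch of the dependency-syntax-graph construction of S31-S33.
# Assumptions: the root-to-root weight formula (lost in the source) is replaced by a
# sigmoid of a scaled dot product; mention-character edges are an assumed edge type.
import math
import torch

def build_dependency_graph(H, dep_edges, roots, mentions):
    # H: (T, d) character embeddings; dep_edges: (head, dependent) character index pairs
    # from the dependency syntax analyzer (S32); roots: root character index of each
    # sentence's dependency tree; mentions: list of character-index lists, one per mention.
    T, d = H.shape
    mention_feats = torch.stack([H[idx].mean(dim=0) for idx in mentions])  # S31: char mean
    feats = torch.cat([H, mention_feats])              # character nodes, then mention nodes
    n = feats.size(0)
    G = torch.zeros(n, n)
    for head, dep in dep_edges:                        # bidirectional edges, weight 1 (S33)
        G[head, dep] = G[dep, head] = 1.0
    for a, i in enumerate(roots):                      # unidirectional root-to-root edges
        for j in roots[a + 1:]:
            G[i, j] = torch.sigmoid(H[i] @ H[j] / math.sqrt(d))  # assumed weight formula
    for m, idx in enumerate(mentions):                 # attach mention nodes (assumed)
        for c in idx:
            G[T + m, c] = G[c, T + m] = 1.0
    return feats, G
```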
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, and calculating the predicted value based on the dependency syntax relation between the entity pair from this complete code;
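A minimal sketch of S34 to S36 follows; the single row-normalised graph-convolution layer, the logsumexp fusion of mention embeddings, and the shape of the context vector are illustrative assumptions.

```python
# Minimal sketch of S34-S36: weighted graph convolution, entity fusion, pair coding.
import torch
import torch.nn as nn

class DepGraphRelationModel(nn.Module):
    def __init__(self, d, num_relations):
        super().__init__()
        self.gcn = nn.Linear(d, d)                              # graph convolutional layer (S34)
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))  # S35
        self.classifier = nn.Linear(3 * d, num_relations)       # head + tail + context (S36)

    def forward(self, feats, G, head_nodes, tail_nodes, context):
        A = G + torch.eye(G.size(0))                            # add self-loops
        A = A / A.sum(dim=1, keepdim=True).clamp(min=1e-6)      # row-normalise edge weights
        Z = torch.relu(self.gcn(A @ feats))                     # fused final embeddings (S34)
        head = self.mlp(Z[head_nodes].logsumexp(dim=0))         # fuse the entity's mentions (S35)
        tail = self.mlp(Z[tail_nodes].logsumexp(dim=0))
        pair = torch.cat([head, tail, context])                 # complete code of the pair (S36)
        return self.classifier(pair)                            # z_dep
```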
S4, obtaining the final predicted value from the predicted value of the entity pair based on the dependency syntax relation and the predicted value of the relation between the entity pair, deriving the loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain the trained dependency syntax relation model.
The step S4 includes the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
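A minimal sketch of S41 and S42 follows. The fusion $z_{final} = z_{dep} + \eta z_{const}$ transcribes the text; since the printed loss expression did not survive reproduction here, the margin loss below is one plausible reading of the variable descriptions (correct classes should outscore the "irrelevant" score $z_s$ by the margin $\alpha$), not the patent's exact formula.

```python
# Minimal sketch of S41-S42. The fusion follows the text; the margin loss is an
# assumed reconstruction (the printed formula is unavailable), not the patent's loss.
import torch

def final_score(z_dep, z_const, eta=0.5):
    # S41: z_final = z_dep + eta * z_const (the value of eta is assumed)
    return z_dep + eta * z_const

def margin_loss(z_final, t, s_index=0, alpha=1.0):
    # z_final: (C,) scores; t: (C,) with t_i = 1 for correct categories, 0 otherwise;
    # s_index: position of the "irrelevant" class; alpha: margin hyper-parameter.
    z_s = z_final[s_index]
    pos = t * torch.clamp(alpha + z_s - z_final, min=0)        # correct classes must beat z_s
    neg = (1 - t) * torch.clamp(alpha + z_final - z_s, min=0)  # others must stay below z_s
    return (pos + neg).sum()
```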
S5, processing the document to be subjected to relation extraction by using the trained dependency syntax relation model, and realizing relation extraction at the document level.
In the description of the present invention, it should be understood that terms such as "center," "thickness," "upper," "lower," "horizontal," "top," "bottom," "inner," "outer," and "radial" indicate orientations or positional relationships based on those shown in the drawings; they are used merely to facilitate and simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation or be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be interpreted as indicating or implying relative importance or the number of technical features indicated; a feature defined as "first," "second," or "third" may explicitly or implicitly include one or more such features.
Claims (9)
1. A document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring a character-level embedded representation of the document through a pre-trained language model;
S2, constructing phrase structure trees and calculating a predicted value of the relation between entity pairs using a Tree-LSTM model;
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating a predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
wherein the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
S4, obtaining a final predicted value from the predicted value based on the dependency syntax relation and the predicted value based on the phrase structure relation between the entity pairs, deriving a loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain a trained dependency syntax relation model;
S5, processing the documents whose relations are to be extracted with the trained dependency syntax relation model, realizing document-level relation extraction.
2. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 1, wherein said step S1 comprises the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document.
3. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 2, wherein said step S2 comprises the following sub-steps:
S21, constructing a phrase structure tree for each sentence of the document and modeling with a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
S22, fusing the sentence vector embedded representations of all sentences to obtain a vector representation of the document;
S23, calculating the predicted value of the relation between the entity pair with a bilinear layer, from the embedded representation of the entity pair and the vector representation of the document.
4. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 3, wherein said step S21 comprises the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
5. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 4, wherein the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
6. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 5, wherein said step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
S32, inputting each sentence of the document into a dependency syntax analyzer and generating the dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, completing the construction of the dependency syntax relation model, and calculating the predicted value based on the dependency syntax relation between the entity pair through the dependency syntax relation model.
7. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 6, wherein in step S31 the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention.
8. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 7, wherein in step S33 the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively.
9. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 8, wherein said step S4 comprises the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310749338.6A | 2023-06-21 | 2023-06-21 | Document level relation extraction method based on dependency syntax graph and phrase structure tree
Publications (1)

Publication Number | Publication Date
---|---
CN116702755A | 2023-09-05
Family
ID=87844898

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310749338.6A (CN116702755A, pending) | Document level relation extraction method based on dependency syntax graph and phrase structure tree | 2023-06-21 | 2023-06-21

Country Status (1)

Country | Link
---|---
CN | CN116702755A (en)
Cited By (3)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117807956A | 2023-12-29 | 2024-04-02 | 兰州理工大学 | ICD automatic coding method based on clinical text tree structure
CN117951313A | 2024-03-15 | 2024-04-30 | 华南理工大学 | Document relation extraction method based on entity relation statistics association
CN117951313B | 2024-03-15 | 2024-07-12 | 华南理工大学 | Document relation extraction method based on entity relation statistics association
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |