CN116702755A - Document level relation extraction method based on dependency syntax graph and phrase structure tree
- Publication number: CN116702755A
- Application number: CN202310749338.6A
- Authority: CN (China)
- Prior art keywords: dependency syntax, document, tree, relation, predicted value
- Prior art date
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F18/253—Fusion techniques of extracted features
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, which comprises the following steps: encoding the document and obtaining its character-level embedded representation and attention matrix through a pre-trained language model; constructing phrase structure trees and calculating a predicted value of the relation between entity pairs; constructing a dependency syntax graph comprising two types of nodes and three types of edges, and calculating, from the dependency syntax graph and the character-level embedded representation of the document, a predicted value between entity pairs based on the dependency syntax relation; obtaining a final predicted value from the dependency-syntax-based predicted value and the predicted value of the relation between the entity pairs, deriving a loss function from the final predicted value, training the dependency syntax relation model with the loss function, and processing the documents to be processed with the trained model, realizing document-level relation extraction.
Description
Technical Field
The invention relates to the field of natural language processing, and in particular to a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree.
Background
Relation extraction is a key task in information extraction that aims to model the relation patterns between entities in unstructured text. The task has two settings: sentence-level relation extraction and document-level relation extraction. In traditional sentence-level relation extraction, both entities of a relation appear within a single sentence; document-level relation extraction is not limited to one sentence, better matches real-world scenarios, and is therefore receiving increasing attention.
One major challenge in document-level relation extraction is inferring the relations of multiple entity pairs from long passages that may contain irrelevant or even noisy information. Existing document-level relation extraction methods often perform poorly on complex relation instances surrounded by large amounts of irrelevant information, because they learn such relations from context alone; the grammatical information of the document also needs to be considered.
Disclosure of Invention
Aiming at the above defects in the prior art, the document-level relation extraction method based on a dependency syntax graph and a phrase structure tree provided by the invention solves the problem that existing document-level relation extraction methods perform poorly when they encounter complex relation instances.
To achieve the object of the invention, the following technical scheme is adopted: a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring a character-level embedded representation of the document through a pre-trained language model;
S2, constructing phrase structure trees and calculating a predicted value of the relation between entity pairs using a Tree-LSTM model;
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating a predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
wherein the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
S4, obtaining a final predicted value from the predicted value based on the dependency syntax relation and the predicted value based on the phrase structure relation between the entity pairs, deriving a loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain a trained dependency syntax relation model;
S5, processing the documents whose relations are to be extracted with the trained dependency syntax relation model, realizing document-level relation extraction.
Further: the step S1 comprises the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document.
Further: the step S2 comprises the following sub-steps:
s21, constructing a phrase structure Tree of each sentence of the document, and modeling by using a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
S22, fusing the sentence vector embedded representations of all sentences to obtain a vector representation of the document;
S23, calculating the predicted value of the relation between the entity pair with a bilinear layer, from the embedded representation of the entity pair and the vector representation of the document.
Further: the step S21 includes the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
Further: the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
Further: the step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
S32, inputting each sentence of the document into a dependency syntax analyzer and generating the dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, completing the construction of the dependency syntax relation model, and calculating the predicted value based on the dependency syntax relation between the entity pair through the dependency syntax relation model.
Further: in step S31, the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention.
Further: in step S33, the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively.
Further: the step S4 includes the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
The beneficial effects of the invention are as follows:
1. A dependency graph is constructed to extract the syntactic information within each sentence, supplementing the original text information and enhancing the text representation capability;
2. A phrase structure tree organizes the hierarchical grammatical information of long sentences, realizing a fine-grained division of long sentences;
3. By fusing additional grammatical information and capturing long-sentence dependency information through the dependency graph and the phrase structure tree, the document is represented better and the document-level relation extraction effect is improved.
Drawings
FIG. 1 is a flow chart of the document-level relation extraction method according to the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of these embodiments; to those skilled in the art, any invention that makes use of the inventive concept falls within the protection of the invention, within the spirit and scope defined by the appended claims.
As shown in FIG. 1, one embodiment of the present invention provides a document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring the character-level embedded representation and the attention matrix of the document through a pre-trained language model;
in this embodiment, the step S1 includes the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document;
the whole process is represented as follows:

where $H \in \mathbb{R}^{T \times d}$ is the character-level embedded representation of the document, $A \in \mathbb{R}^{T \times T}$ is the attention matrix, $T$ is the number of characters, $d$ is the dimension of the character embedding, $N$ is the total number of sentences in the document, $P_N$ is the number of characters in the $N$-th sentence, and $\mathbb{R}$ denotes a real matrix of the indicated size;
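For concreteness, the following is a minimal sketch of step S1, assuming a HuggingFace-style pre-trained encoder; the model name, the "*" mention marker, and averaging the last layer's attention heads into the matrix A are illustrative choices, not specified by the patent.

```python
# Minimal sketch of step S1 (illustrative, not the patent's reference implementation).
# Assumptions: a HuggingFace-style encoder ("bert-base-chinese"), "*" as the special
# mention marker, and the head-averaged last attention layer as the attention matrix A.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese", output_attentions=True)

def encode_document(sentences, mentions):
    """sentences: list of sentence strings; mentions: (sent_idx, start, end) char spans."""
    marked = []
    for i, sent in enumerate(sentences):
        # S11: insert a special symbol before and after each mention, working right to
        # left so earlier character offsets stay valid.
        spans = sorted(((m[1], m[2]) for m in mentions if m[0] == i), reverse=True)
        for start, end in spans:
            sent = sent[:start] + "*" + sent[start:end] + "*" + sent[end:]
        marked.append(sent)
    # S12: feed all characters of the encoded document into the pre-trained model.
    inputs = tokenizer("".join(marked), return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = encoder(**inputs)
    H = out.last_hidden_state[0]           # character-level embeddings, shape (T, d)
    A = out.attentions[-1][0].mean(dim=0)  # attention matrix, shape (T, T)
    return H, A
```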
s2, constructing a phrase structure Tree, and calculating a predicted value of the relation between entity pairs by adopting a Tree-LSTM model;
the step S2 comprises the following sub-steps:
s21, constructing a phrase structure Tree of each sentence of the document, and modeling by using a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
the step S21 includes the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
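The equations of S2101 to S2105 are those of a Child-Sum Tree-LSTM cell; a minimal PyTorch sketch of one such cell follows. The stacked weight layout, the dimensions d_in/d_h, and the caller-driven bottom-up recursion are illustrative assumptions.

```python
# Minimal sketch of the Child-Sum Tree-LSTM cell of S2101-S2106 (PyTorch).
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, d_in, d_h):
        super().__init__()
        self.W = nn.Linear(d_in, 4 * d_h)                 # W^(i), W^(f), W^(o), W^(u)
        self.U_iou = nn.Linear(d_h, 3 * d_h, bias=False)  # U^(i), U^(o), U^(u)
        self.U_f = nn.Linear(d_h, d_h, bias=False)        # U^(f), applied per child

    def forward(self, x_j, child_h, child_c):
        # x_j: (d_in,); child_h, child_c: (num_children, d_h), empty tensors for leaves
        h_sum = child_h.sum(dim=0)                        # sum over children of h_jl
        w_i, w_f, w_o, w_u = self.W(x_j).chunk(4)
        u_i, u_o, u_u = self.U_iou(h_sum).chunk(3)
        i_j = torch.sigmoid(w_i + u_i)                    # input gate, S2101
        f_jk = torch.sigmoid(w_f + self.U_f(child_h))     # one forget gate per child, S2102
        o_j = torch.sigmoid(w_o + u_o)                    # output gate, S2103
        u_j = torch.tanh(w_u + u_u)                       # candidate state
        c_j = i_j * u_j + (f_jk * child_c).sum(dim=0)     # memory cell, S2104
        h_j = o_j * torch.tanh(c_j)                       # updated hidden state, S2105
        return h_j, c_j
```

Under these assumptions, the sentence vector of S21 would be taken from the hidden state returned at the root of each phrase structure tree.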
S22, fusing the sentence vector embedded representations of all sentences to obtain the vector representation of the document;
s23, calculating a predicted value of the relation between the entity pairs by utilizing a bilinear layer according to the embedded representation of the entity pairs and the vector representation of the document;
the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
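A minimal sketch of this bilinear scorer is given below, with one bilinear form per relation class; all dimensions are assumed for illustration.

```python
# Minimal sketch of the bilinear layer of S23: z_const = pair_{s,o} W_const v_docu + b_const.
import torch
import torch.nn as nn

class BilinearScorer(nn.Module):
    def __init__(self, d_pair, d_doc, num_relations):
        super().__init__()
        # One d_pair x d_doc bilinear form per relation class.
        self.W = nn.Parameter(torch.empty(num_relations, d_pair, d_doc))
        self.b = nn.Parameter(torch.zeros(num_relations))
        nn.init.xavier_uniform_(self.W)

    def forward(self, pair_so, v_docu):
        # pair_so: (d_pair,), v_docu: (d_doc,) -> z_const: (num_relations,)
        return torch.einsum("p,rpd,d->r", pair_so, self.W, v_docu) + self.b
```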
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating the predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
the step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
in step S31, the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention;
s32, inputting each sentence of the document into a dependency syntax analyzer, and generating a dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
in step S33, the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively;
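The sketch below assembles the graph of S31 to S33. Since the printed formula for the unidirectional root-to-root weight did not survive reproduction here, the sigmoid of a scaled dot product between root embeddings is used as an assumed stand-in, and the mention-to-character edges are likewise an assumed edge type.

```python
# Minimal sketch of the dependency-syntax-graph construction of S31-S33.
# Assumptions: the root-to-root weight formula (lost in the source) is replaced by a
# sigmoid of a scaled dot product; mention-character edges are an assumed edge type.
import math
import torch

def build_dependency_graph(H, dep_edges, roots, mentions):
    # H: (T, d) character embeddings; dep_edges: (head, dependent) character index pairs
    # from the dependency syntax analyzer (S32); roots: root character index of each
    # sentence's dependency tree; mentions: list of character-index lists, one per mention.
    T, d = H.shape
    mention_feats = torch.stack([H[idx].mean(dim=0) for idx in mentions])  # S31: char mean
    feats = torch.cat([H, mention_feats])              # character nodes, then mention nodes
    n = feats.size(0)
    G = torch.zeros(n, n)
    for head, dep in dep_edges:                        # bidirectional edges, weight 1 (S33)
        G[head, dep] = G[dep, head] = 1.0
    for a, i in enumerate(roots):                      # unidirectional root-to-root edges
        for j in roots[a + 1:]:
            G[i, j] = torch.sigmoid(H[i] @ H[j] / math.sqrt(d))  # assumed weight formula
    for m, idx in enumerate(mentions):                 # attach mention nodes (assumed)
        for c in idx:
            G[T + m, c] = G[c, T + m] = 1.0
    return feats, G
```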
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, and calculating the predicted value based on the dependency syntax relation between the entity pair from this complete code;
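A minimal sketch of S34 to S36 follows; the single row-normalised graph-convolution layer, the logsumexp fusion of mention embeddings, and the shape of the context vector are illustrative assumptions.

```python
# Minimal sketch of S34-S36: weighted graph convolution, entity fusion, pair coding.
import torch
import torch.nn as nn

class DepGraphRelationModel(nn.Module):
    def __init__(self, d, num_relations):
        super().__init__()
        self.gcn = nn.Linear(d, d)                              # graph convolutional layer (S34)
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))  # S35
        self.classifier = nn.Linear(3 * d, num_relations)       # head + tail + context (S36)

    def forward(self, feats, G, head_nodes, tail_nodes, context):
        A = G + torch.eye(G.size(0))                            # add self-loops
        A = A / A.sum(dim=1, keepdim=True).clamp(min=1e-6)      # row-normalise edge weights
        Z = torch.relu(self.gcn(A @ feats))                     # fused final embeddings (S34)
        head = self.mlp(Z[head_nodes].logsumexp(dim=0))         # fuse the entity's mentions (S35)
        tail = self.mlp(Z[tail_nodes].logsumexp(dim=0))
        pair = torch.cat([head, tail, context])                 # complete code of the pair (S36)
        return self.classifier(pair)                            # z_dep
```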
S4, obtaining the final predicted value from the predicted value of the entity pair based on the dependency syntax relation and the predicted value of the relation between the entity pair, deriving the loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain the trained dependency syntax relation model.
The step S4 includes the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
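A minimal sketch of S41 and S42 follows. The fusion $z_{final} = z_{dep} + \eta z_{const}$ transcribes the text; since the printed loss expression did not survive reproduction here, the margin loss below is one plausible reading of the variable descriptions (correct classes should outscore the "irrelevant" score $z_s$ by the margin $\alpha$), not the patent's exact formula.

```python
# Minimal sketch of S41-S42. The fusion follows the text; the margin loss is an
# assumed reconstruction (the printed formula is unavailable), not the patent's loss.
import torch

def final_score(z_dep, z_const, eta=0.5):
    # S41: z_final = z_dep + eta * z_const (the value of eta is assumed)
    return z_dep + eta * z_const

def margin_loss(z_final, t, s_index=0, alpha=1.0):
    # z_final: (C,) scores; t: (C,) with t_i = 1 for correct categories, 0 otherwise;
    # s_index: position of the "irrelevant" class; alpha: margin hyper-parameter.
    z_s = z_final[s_index]
    pos = t * torch.clamp(alpha + z_s - z_final, min=0)        # correct classes must beat z_s
    neg = (1 - t) * torch.clamp(alpha + z_final - z_s, min=0)  # others must stay below z_s
    return (pos + neg).sum()
```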
S5, processing the document to be subjected to relation extraction by using the trained dependency syntax relation model, and realizing relation extraction at the document level.
In the description of the present invention, it should be understood that terms such as "center," "thickness," "upper," "lower," "horizontal," "top," "bottom," "inner," "outer," and "radial" indicate orientations or positional relationships based on those shown in the drawings; they are used merely to facilitate and simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation or be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be interpreted as indicating or implying relative importance or the number of technical features indicated; a feature defined as "first," "second," or "third" may explicitly or implicitly include one or more such features.
Claims (9)
1. A document-level relation extraction method based on a dependency syntax graph and a phrase structure tree, comprising the following steps:
S1, encoding the document and acquiring a character-level embedded representation of the document through a pre-trained language model;
S2, constructing phrase structure trees and calculating a predicted value of the relation between entity pairs using a Tree-LSTM model;
S3, constructing a dependency syntax graph comprising three types of nodes and three types of edges, constructing a dependency syntax relation model from the dependency syntax graph and the character-level embedded representation of the document, and calculating a predicted value between entity pairs based on the dependency syntax relation using the dependency syntax relation model;
wherein the nodes of the dependency syntax graph form dependency syntax trees, and these trees together with the edges of the graph constitute the dependency syntax graph;
S4, obtaining a final predicted value from the predicted value based on the dependency syntax relation and the predicted value based on the phrase structure relation between the entity pairs, deriving a loss function from the final predicted value, and training the dependency syntax relation model with the loss function to obtain a trained dependency syntax relation model;
S5, processing the documents whose relations are to be extracted with the trained dependency syntax relation model, realizing document-level relation extraction.
2. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 1, wherein said step S1 comprises the following sub-steps:
S11, inserting a special symbol before and after each mention in the document to complete the encoding;
S12, inputting all characters of the encoded document into the pre-trained language model to obtain the character-level embedded representation of the document.
3. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 2, wherein said step S2 comprises the following sub-steps:
S21, constructing a phrase structure tree for each sentence of the document and modeling with a Tree-LSTM model to obtain a sentence vector embedded representation of each sentence;
S22, fusing the sentence vector embedded representations of all sentences to obtain a vector representation of the document;
S23, calculating the predicted value of the relation between the entity pair with a bilinear layer, from the embedded representation of the entity pair and the vector representation of the document.
4. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 3, wherein said step S21 comprises the following sub-steps:
S2101, calculating the state transition equation of the input gate in the Tree-LSTM model:

$$i_j = \sigma\Big(W^{(i)} x_j + \sum_{l \in N(j)} U^{(i)} h_{jl} + b^{(i)}\Big)$$

where $i_j$ is the output of the input gate of node $j$, $x_j$ is the input vector of node $j$, $h_{jl}$ is the hidden state of the $l$-th child node of node $j$, $W^{(i)}$ is the transformation matrix of the input-gate input features, $U^{(i)}$ is the parameter transformation matrix of the input-gate hidden layer, $b^{(i)}$ is the bias of the input gate, $\sigma(\cdot)$ is the sigmoid function, and $N(j)$ is the set of adjacent (child) nodes of node $j$;

S2102, calculating the state transition equation of the forget gate in the Tree-LSTM model:

$$f_{jk} = \sigma\big(W^{(f)} x_j + U^{(f)} h_{jk} + b^{(f)}\big), \quad k = 1, 2, \ldots, |N(j)|$$

where $f_{jk}$ is the forget-gate output for the $k$-th child node of node $j$, $W^{(f)}$ is the transformation matrix of the forget-gate input features, $U^{(f)}$ is the off-diagonal parameter matrix of the forget-gate hidden layer, and $b^{(f)}$ is the bias of the forget gate;

S2103, calculating the state transition equation of the output gate in the Tree-LSTM model:

$$o_j = \sigma\Big(W^{(o)} x_j + \sum_{l \in N(j)} U^{(o)} h_{jl} + b^{(o)}\Big)$$

where $o_j$ is the output of the output gate, $W^{(o)}$ is the transformation matrix of the output-gate input features, $U^{(o)}$ is the parameter transformation matrix of the output-gate hidden layer, and $b^{(o)}$ is the bias of the output gate;

S2104, calculating the state transition equation of the memory cell in the Tree-LSTM model:

$$u_j = \tanh\Big(W^{(u)} x_j + \sum_{l \in N(j)} U^{(u)} h_{jl} + b^{(u)}\Big), \qquad c_j = i_j \odot u_j + \sum_{l \in N(j)} f_{jl} \odot c_{jl}$$

where $c_j$ is the current cell state of node $j$, $u_j$ is the accepted (candidate) state of the input gate, $\odot$ denotes the element-wise product, $c_{jl}$ is the memory cell of the $l$-th child node of node $j$, $\tanh(\cdot)$ is the activation function, $W^{(u)}$ and $U^{(u)}$ are parameter matrices, and $b^{(u)}$ is the bias;

S2105, calculating the state transition equation of the updated hidden state in the Tree-LSTM model:

$$h_j = o_j \odot \tanh(c_j)$$

where $h_j$ is the updated hidden state;
S2106, constructing the Tree-LSTM model from the state transition equations of the input gate, the forget gate, the output gate, the memory cell, and the updated hidden state;
S2107, constructing a phrase structure tree for each sentence of the document and modeling on each phrase structure tree with the Tree-LSTM model to obtain the sentence vector representation of each sentence.
5. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 4, wherein the formula for calculating the predicted value of the relation between the entity pair in step S23 is:

$$z_{const} = \mathrm{pair}_{s,o}\, W_{const}\, v_{docu} + b_{const}$$

where $z_{const}$ is the predicted value of the relation between the entity pair, $\mathrm{pair}_{s,o}$ is the embedded representation of the entity pair, $v_{docu}$ is the vector representation of the document, and $W_{const}$ and $b_{const}$ are trainable parameters.
6. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 5, wherein said step S3 comprises the following sub-steps:
S31, taking each character in the document as a node to construct the nodes of the dependency syntax graph;
S32, inputting each sentence of the document into a dependency syntax analyzer and generating the dependency syntax tree corresponding to each sentence;
S33, constructing the edges of the dependency syntax graph and assigning each edge a weight through the character-level embedded representation of the document, completing the construction of the dependency syntax graph;
S34, performing feature fusion and encoding on the dependency syntax graph with a graph convolutional network layer to obtain the final embedded representations;
S35, obtaining the entity embedded representation by fusing the final embedded representations of all mentions of the entity, and processing the entity embedded representation with a multi-layer perceptron;
S36, concatenating the entity-pair embedded representation with its context information to form the complete code of the entity pair, completing the construction of the dependency syntax relation model, and calculating the predicted value based on the dependency syntax relation between the entity pair through the dependency syntax relation model.
7. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 6, wherein in step S31 the nodes include character nodes and mention nodes;
the node feature of a character node is the encoding feature of the character;
the node feature of a mention node is the average of the features of all characters in the mention.
8. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 7, wherein in step S33 the edges of the dependency syntax graph include bidirectional edges and unidirectional edges; a bidirectional edge is given weight 1, and the weight of a unidirectional edge is calculated from the character-level embeddings, where $G_{ij}$ is the weight of the unidirectional edge between root nodes $i$ and $j$ of the dependency syntax trees, and $h_i$ and $h_j$ denote the embeddings of root nodes $i$ and $j$, respectively.
9. The document-level relation extraction method based on a dependency syntax graph and phrase structure tree according to claim 8, wherein said step S4 comprises the following sub-steps:
S41, calculating the final predicted value from the predicted value based on the dependency syntax relation between the entity pair and the predicted value of the relation between the entity pair:

$$z_{final} = z_{dep} + \eta\, z_{const}$$

where $z_{final}$ is the final predicted value, $z_{dep}$ is the predicted value based on the dependency syntax relation between the entity pair, $z_{const}$ is the predicted value of the relation between the entity pair, and $\eta$ is a weight parameter that adjusts the proportion of the two predicted values;

S42, obtaining the loss function from the final predicted value and training the dependency syntax relation model with the loss function, where, in the loss function, $\alpha$ is a margin hyper-parameter, $C$ is the number of relation categories, $z_s$ denotes the score of $z_{final}$ for the "irrelevant" (no-relation) class, $z_i$ denotes the score of $z_{final}$ for the $i$-th category, and $\max(\cdot)$ returns the larger of its arguments; $t_i$ is 1 when the relation between the two entities belongs to the correct category, and 0 when it does not.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310749338.6A | 2023-06-21 | 2023-06-21 | Document level relation extraction method based on dependency syntax graph and phrase structure tree
Publications (1)

Publication Number | Publication Date
---|---
CN116702755A | 2023-09-05
Family
ID=87844898

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310749338.6A (CN116702755A, pending) | Document level relation extraction method based on dependency syntax graph and phrase structure tree | 2023-06-21 | 2023-06-21

Country Status (1)

Country | Link
---|---
CN | CN116702755A (en)
Cited By (3)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117807956A | 2023-12-29 | 2024-04-02 | 兰州理工大学 | ICD automatic coding method based on clinical text tree structure
CN117951313A | 2024-03-15 | 2024-04-30 | 华南理工大学 | Document relation extraction method based on entity relation statistics association
CN117951313B | 2024-03-15 | 2024-07-12 | 华南理工大学 | Document relation extraction method based on entity relation statistics association
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |