CN114818682A - Document level entity relation extraction method based on self-adaptive entity path perception - Google Patents
Document level entity relation extraction method based on self-adaptive entity path perception
- Publication number: CN114818682A
- Application number: CN202210749823.9A
- Authority: CN (China)
- Prior art keywords: entity, document, node, graph
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/279 — Handling natural language data; natural language analysis; recognition of textual entities
- G06N3/044 — Neural network architectures; recurrent networks, e.g. Hopfield networks
- G06N3/045 — Neural network architectures; combinations of networks
- G06N3/084 — Neural network learning methods; backpropagation, e.g. using gradient descent
Abstract
The application relates to a document-level entity relation extraction method based on self-adaptive entity path perception. The method comprises the following steps: constructing a document graph according to the context representation sequences of the sentences and the positions of the entities in the document to be extracted; updating the initial representations of the entity nodes in both the breadth and the depth of the document graph by using an adaptive entity-path-aware on-graph message propagation algorithm, to obtain entity node representations of document-level semantics; predicting, with a feed-forward neural network, relation label score values between entities from the document-level entity node representations; calculating a loss value from the relation label score values and the relation labels that actually exist between the entities, and iteratively optimizing the learnable parameters of the deep neural network model with the loss value and the back-propagation algorithm to obtain an entity relation extraction model; and extracting document-level entity relations with the entity relation extraction model. By adopting the method, the accuracy of document-level entity relation extraction can be improved.
Description
Technical Field
The present application relates to the field of data processing technologies, and in particular to a document-level entity relation extraction method and apparatus based on adaptive entity path perception, a computer device, and a storage medium.
Background
Entity relation extraction is a classic task in the field of information extraction. It aims to identify the semantic relations between the entities (concepts) contained in a given unstructured text and to store the results in the structured form of relation triples. For example, given the text "In October 2017, Toutiao announced the acquisition of the music short-video platform Musical.ly at a valuation of one billion dollars", entity relation extraction yields the relation triple (Toutiao, acquired, Musical.ly). As a key technology of information extraction, entity relation extraction plays an important role in many areas of natural language processing, and has great research significance and broad application prospects against the background of massive information on the internet. In terms of theoretical value, entity relation extraction involves the theories and methods of several disciplines, such as machine learning, data mining and natural language processing. In terms of application, entity relation extraction can be used to automatically construct large-scale knowledge bases, in particular knowledge graphs; it provides data support for building information retrieval and automatic question-answering systems, and is also a foundation of natural language understanding. Existing work on entity relation extraction focuses mainly on sentence-level extraction and is limited to entity semantic relations within a single sentence. In real application scenarios, however, the description of entity semantic relations is far more complex: a large number of inter-entity relations are expressed across several sentences, exhibiting complex associations among multiple entities. Statistics on manually annotated data sampled from Wikipedia indicate that at least 40% of entity semantic relation facts can only be captured jointly from multiple sentences.
Therefore, entity relation extraction needs to be pushed to the document level, which better matches real scenarios. Compared with sentence-level extraction, document-level entity relation extraction is more challenging and requires more complex reasoning skills, such as logical reasoning, coreference reasoning and commonsense reasoning. A document may contain multiple entities, each with multiple mentions in different contexts. To identify relations between entities across sentences, a model must capture the complex interactions among the many entities in a document and exploit the context information of each entity's multiple mentions, which is clearly beyond the capability of sentence-level relation extraction methods.
At present, with the deepening research on graph neural networks, researchers have tried to model the various kinds of semantic information in a document with document graphs, taking words, mentions, entities or sentences as nodes and connecting them into a document graph with heuristic rules. These methods focus on how to build better document graphs that retain more semantic information, and on how to propagate information on the graph more effectively. Thanks to the strong representation capability of graph neural networks, such methods achieve good results, but they still have the following problems. a) When aggregating entity representations, existing work aggregates the representations of an entity's multiple mentions indiscriminately and merges them into a single global representation used for semantic relation prediction against all other entities. In fact, since the multiple mentions of an entity occur in different contexts of the document, each node should play a different role when it connects to different types of nodes. b) A graph neural network reasons implicitly through node information propagation. To capture high-order information interactions in the graph, multiple graph network layers (e.g., multiple graph convolutions) are often stacked; the representations of nodes in the same connected component then tend to converge to a subspace unrelated to the input, so the learned node representations are over-smoothed and not accurate enough.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a document level entity relationship extraction method, apparatus, computer device and storage medium based on adaptive entity path awareness, which can improve the accuracy of document level entity relationship extraction.
A document level entity relationship extraction method based on adaptive entity path perception, the method comprises the following steps:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
In one embodiment, constructing a document graph according to the context characterization sequence of sentences and the positions of the entities in the document to be extracted includes:
and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context representation sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations together with the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted, which serve to connect the nodes.
In one embodiment, the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted comprise connections between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes. The mention nodes, entity nodes and sentence nodes form the node set of the document graph; their natural associations in the document to be extracted form the edge set of the document graph. A mention node is initialized as the average of the context representations of the words of the corresponding mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context representations of all words in the sentence.
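A minimal numpy sketch of these averaging rules for the initial node representations; the encoder output is assumed to be a `(num_tokens, d)` array, and `mention_spans`, `entity_mentions` and `sentence_spans` are hypothetical names for the position annotations, not identifiers from the patent:

```python
import numpy as np

def build_initial_nodes(token_reps, mention_spans, entity_mentions, sentence_spans):
    """Compute initial node representations for the document graph.

    token_reps:      (num_tokens, d) context representations from the encoder
    mention_spans:   list of (start, end) token spans, one per mention
    entity_mentions: list of mention-index lists, one per entity
    sentence_spans:  list of (start, end) token spans, one per sentence
    """
    # A mention node is the average of the context representations of its words.
    mentions = np.stack([token_reps[s:e].mean(axis=0) for s, e in mention_spans])
    # An entity node is the average of all its mention node representations.
    entities = np.stack([mentions[idx].mean(axis=0) for idx in entity_mentions])
    # A sentence node is the average of the context representations of all its words.
    sentences = np.stack([token_reps[s:e].mean(axis=0) for s, e in sentence_spans])
    return mentions, entities, sentences
```

The three node types together give the node set of the document graph; the edge set is then added from the natural associations listed above.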
In one embodiment, updating the initial representation of the entity node from both the breadth and the depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprising:
aggregating the neighbor information within N hops of a target node in the document graph by using the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, jointly controlling the message propagation in both breadth and depth, and screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, to obtain entity node representations of document-level semantics.
In one embodiment, aggregating neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, modeling interaction between entity pairs, jointly controlling a message propagation algorithm from both an extent and a depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, and obtaining an entity node representation of document-level semantics, includes:
aggregating the neighbor information within N hops of the target node in the document graph by using the adaptive entity-path-aware on-graph message propagation algorithm and modeling the interaction between entity pairs; in terms of breadth, during the aggregation of each hop's neighbor information, obtaining a breadth temporary aggregation representation of the node in a breadth-adaptive manner;
in terms of depth, based on an LSTM (long short-term memory) network, applying several gating mechanisms to the breadth temporary aggregation representations of the nodes, selectively storing document-level high-order information related to the nodes, and propagating only to neighbors within a certain hop count, to obtain the entity node representations of document-level semantics.
In one embodiment, obtaining the breadth temporary aggregation representation of the nodes in the breadth-adaptive manner includes:
obtaining the breadth temporary aggregation representation of node $u$ as
$$\alpha_{uv} = \operatorname{softmax}_{v \in \mathcal{N}(u)}\big((W_Q h_u)^\top (W_K h_v)\big), \qquad \tilde h_u = \mathrm{FFN}\Big(\sum_{v \in \mathcal{N}(u)} \alpha_{uv}\, W h_v\Big)$$
wherein $\alpha_{uv}$ refers to the weight parameter between node $u$ and neighbor $v$; $W$ is a learnable parameter that linearly transforms the neighbor features; $h_u$ represents the representation of node $u$; $W_Q$ and $W_K$ refer to the query and key matrices in the attention mechanism; $\mathrm{FFN}$ refers to a feed-forward neural network; $\mathcal{N}(u)$ is the neighbor node set of node $u$; $h_v$ represents the representation of neighbor node $v$; and $\top$ represents the transpose operation.
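The breadth-adaptive aggregation described above can be sketched for a single target node as follows; function and parameter names are illustrative, not from the patent, and the final feed-forward layer is omitted for brevity:

```python
import numpy as np

def breadth_adaptive_aggregate(h_u, neighbors, W_q, W_k, W_v):
    """One breadth-adaptive aggregation step for a single target node.

    h_u:       (d,) representation of node u
    neighbors: (n, d) representations of u's neighbor nodes
    W_q, W_k:  (d, d) query/key matrices of the attention mechanism
    W_v:       (d, d) learnable linear transform of the neighbor features
    """
    # Attention logits: (W_q h_u)^T (W_k h_v) for each neighbor v.
    logits = (W_q @ h_u) @ (neighbors @ W_k.T).T
    # Softmax over the neighbor set gives the adaptive weights alpha_uv.
    alpha = np.exp(logits - logits.max())
    alpha /= alpha.sum()
    # Weighted sum of linearly transformed neighbor features.
    agg = (alpha[:, None] * (neighbors @ W_v.T)).sum(axis=0)
    return alpha, agg
```

Because the weights are computed per target node, the same mention can contribute differently to different neighbors, which is exactly the distinction the breadth-adaptive manner is meant to capture.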
In one embodiment, in terms of depth, based on an LSTM (long short-term memory) network, applying several gating mechanisms to the breadth temporary aggregation representations of the nodes to selectively store document-level high-order information related to the nodes, and propagating only to neighbors within a certain hop count, to obtain the entity node representations of document-level semantics, includes:
for the breadth temporary aggregation representation of a node, adding the useful information in it to a memory cell through an update gate, filtering the invalid information in the previous layer's memory cell through a forget gate, and controlling the memory cell through an output gate, which outputs the entity node representation of document-level semantics.
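The gated depth update can be sketched as follows for a single hop of a single node. The exact gate wiring is an assumption modeled on a plain LSTM cell, since the text names the three gates but not their parameterization:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_gated_update(agg, c_prev, h_prev, Wu, Wf, Wo):
    """LSTM-style gated depth update for one propagation hop.

    agg:        (d,) breadth temporary aggregation representation of the node
    c_prev:     (d,) memory cell from the previous hop
    h_prev:     (d,) node representation from the previous hop
    Wu, Wf, Wo: (d, 2d) parameters of the update/forget/output gates
    """
    z = np.concatenate([agg, h_prev])
    u = sigmoid(Wu @ z)   # update gate: admit useful aggregated information
    f = sigmoid(Wf @ z)   # forget gate: filter stale memory from the previous hop
    o = sigmoid(Wo @ z)   # output gate: control what the memory cell exposes
    c = f * c_prev + u * np.tanh(agg)   # new memory cell
    h = o * np.tanh(c)                  # updated node representation
    return h, c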
In one embodiment, predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities includes:
predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain scores of all relation labels among entities
Wherein, ,Andare learnable parameters of classifiers in a feed-forward neural network,refers to the activation function, d refers to the hidden dimension in the feedforward neural network, k is the number of labels,representing different physical node representationsAndand splicing to obtain the characteristics of the entity pairs.
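A sketch of this feed-forward scoring step, under the assumption of a tanh activation (the text only says "the activation function") and optional bias terms:

```python
import numpy as np

def score_relation_labels(e_s, e_o, W1, b1, W2, b2):
    """Score all k relation labels for one entity pair.

    e_s, e_o: (d,) head/tail entity node representations
    W1: (d, 2d), b1: (d,), W2: (k, d), b2: (k,) classifier parameters
    """
    pair = np.concatenate([e_s, e_o])   # entity-pair feature by concatenation
    hidden = np.tanh(W1 @ pair + b1)    # tanh stands in for the activation function
    return W2 @ hidden + b2             # one score (logit) per relation label
```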
In one embodiment, calculating the loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities comprises:
calculating the loss value according to the score values of the relation labels between the entities and the actually existing relation labels between the entities as
$$\mathcal{L} = -\sum_{r \in P_T} \log \frac{\exp(\mathrm{logit}_r)}{\sum_{r' \in P_T \cup \{\mathrm{TH}\}} \exp(\mathrm{logit}_{r'})} \;-\; \log \frac{\exp(\mathrm{logit}_{\mathrm{TH}})}{\sum_{r' \in N_T \cup \{\mathrm{TH}\}} \exp(\mathrm{logit}_{r'})}$$
wherein TH represents the threshold relation label; $P_T$ represents the set of relation labels that actually exist for the entity pair; $N_T$ represents the negative-sample relation label set; logits refer to the scores of all relation labels for the entity pair; $\mathrm{logit}_r$ refers to the score value of relation label $r$; $r'$ represents a relation label; $\mathrm{logit}_{r'}$ represents the score value of relation label $r'$; and $\mathrm{logit}_{\mathrm{TH}}$ represents the score value of the threshold relation label TH.
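A numpy sketch of this adaptive-threshold loss for one entity pair, treating index 0 as the threshold label TH; the positive/negative pools follow the description above (true labels are pushed above TH, and TH above negative labels):

```python
import numpy as np

def adaptive_threshold_loss(logits, positive, negative, th=0):
    """Adaptive-threshold loss over one entity pair's label scores.

    logits:   (k,) scores for all relation labels; index `th` is the TH label
    positive: indices of labels that actually hold for the pair
    negative: indices of labels that do not hold for the pair
    """
    def log_softmax(idx, pool):
        x = logits[pool]
        m = x.max()
        return logits[idx] - (m + np.log(np.exp(x - m).sum()))
    # Push every true label above the threshold label ...
    loss = -sum(log_softmax(r, positive + [th]) for r in positive)
    # ... and push the threshold label above every negative label.
    loss -= log_softmax(th, negative + [th])
    return loss
```

At inference time, a label is then predicted for the pair exactly when its score exceeds the score of TH.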
A document level entity relationship extraction apparatus based on adaptive entity path awareness, the apparatus comprising:
the data preprocessing module is used for acquiring the document to be extracted and the positions of the entities in the document to be extracted, and for performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
the document graph building module is used for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network; carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
the initial representation updating module is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantics;
the prediction module is used for predicting the entity node representation of the document level semantics according to the feedforward neural network to obtain the score value of the relation label between the entities;
the document level entity relation extraction module is used for calculating a loss value according to the score value of the relation labels among the entities and the relation labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by utilizing the loss value and a back propagation algorithm to obtain an entity relation extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
The invention adopts a pre-trained language model to model the complex interactions between different levels of information and to learn deep contextualized lexical representations. It models the semantic information in the document by constructing a fine-grained document graph, then controls the message propagation algorithm in both breadth and depth, screening and aggregating document-level information by learning adaptive perception paths for node message propagation, and selectively aggregates the document-level information that is effective for the target entity. This solves the problem that current entity relation extraction is limited to intra-sentence entity relations, and also solves the problems of document-graph-based document-level methods that treat neighbor nodes indiscriminately during message propagation and that node representations become over-smoothed. The invention thereby improves the performance of document-level entity relation extraction, realizes efficient extraction of entity semantic relations, and provides data support and core algorithm technology for large-scale knowledge base construction, information retrieval, automatic question-answering systems and natural language understanding applications.
Drawings
FIG. 1 is a flowchart illustrating a document-level entity relationship extraction method based on adaptive entity path awareness according to an embodiment;
FIG. 2 is a diagram of adaptive entity path sensing in one embodiment;
FIG. 3 is a block diagram of an embodiment of an apparatus for document-level entity relationship extraction based on adaptive entity path awareness;
FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided a document level entity relationship extraction method based on adaptive entity path perception, including the following steps:
102, acquiring the document to be extracted and the positions of the entities in the document to be extracted; performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document contains a plurality of sentences.
In step 102 of the present invention, the document to be extracted is denoted $D = \{s_1, s_2, \ldots, s_N\}$, i.e., document $D$ consists of $N$ sentences, where $s_i = \{w_1, w_2, \ldots, w_M\}$ means that the $i$th sentence contains $M$ words. The document is annotated with an entity set $E = \{e_1, e_2, \ldots, e_P\}$ containing $P$ entities, where $e_i = \{m_1, m_2, \ldots, m_Q\}$ refers to the $Q$ co-referent mentions of the $i$th entity in the document, each appearing in a different context. The sentences of the document are respectively fed into the WordPiece tokenizer for subword segmentation; for example, after segmentation the $i$th sentence becomes $X_i = \{x_1, x_2, \ldots, x_k\}$, where $k \geq M$, yielding the preprocessed document. Preprocessing the document to be extracted facilitates context encoding by the pre-trained language model.
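A toy sketch of the greedy longest-match-first segmentation that WordPiece applies to a single word, with a hypothetical subword vocabulary (real tokenizers additionally handle casing, special tokens, and a maximum word length):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first WordPiece segmentation of one word.

    vocab is a set of subword pieces; continuation pieces carry a '##' prefix.
    """
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:                  # try the longest substring first
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub            # mark word-internal pieces
            if sub in vocab:
                cur = sub
                break
            end -= 1
        if cur is None:                     # no piece matches: unknown word
            return ["[UNK]"]
        pieces.append(cur)
        start = end
    return pieces
```

Segmenting every sentence this way yields the sequences $X_i$ fed to the pre-trained language model.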
104, constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph includes an initial representation of the entity node.
In order to better model the semantics of the input document, the preprocessed, subword-segmented document is fed into the pre-trained language model BERT, which maps the segmented document sequence into low-dimensional real-valued vectors containing contextual semantics: the input sequence $X_i$ corresponding to the $i$th sentence is mapped to a context representation sequence $H_i = \{h_1, h_2, \ldots, h_k\}$, where $h_j \in \mathbb{R}^d$ and $d$ is the hidden dimension, typically 768. The pre-trained language model BERT is adopted because it can model the complex interactions between different levels of information and learn deep contextualized lexical representations.
Initial representations of the mention nodes, entity nodes and sentence nodes are computed from the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and the document graph is constructed by connecting these nodes according to their natural associations in the document to be extracted. Semantic information within the document is modeled by building this refined document graph. The adaptive entity path-aware on-graph message propagation algorithm improves on previous on-graph message passing algorithms, which make no selection during node aggregation: they aggregate all neighbor information of a target node indiscriminately, without controlling the messages in terms of either breadth or depth.
And 106, updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantics.
The adaptive entity path-aware on-graph message propagation algorithm aggregates the neighbor information within N hops of a target node, models the interaction between entity pairs, and learns adaptive paths for entities on the document graph to improve the entity node representations, thereby obtaining entity node representations with document-level semantics. By learning adaptive perception paths for node message propagation, document-level information is filtered and aggregated, and the effective document-level information of the target entity is selectively aggregated to capture more relational semantic information, which overcomes the limitation of current entity relation extraction to intra-sentence entity relations.
And 108, predicting the entity node representation of the document level semantics according to the feedforward neural network to obtain the score value of the relation label between the entities.
Entities generally refer to proper nouns or concepts such as names of people, places and organizations, and relations refer to the semantic relationships between entities. For example, from the sentence "Toutiao announced the acquisition of the music short-video platform Musical.ly at an estimated valuation of one billion dollars," entity relation extraction yields the relation triple (Toutiao, acquire, Musical.ly).
To predict the semantic relation between a pair of document-level entity nodes, the head and tail entity representations contained in the entity node representations with document-level semantics are concatenated to obtain the entity pair, which is fed to the feedforward neural network for prediction. A loss value is then computed from the prediction result, i.e., the scores of the relation labels between the entities, and the relation labels actually existing between the entities, so that the deep neural network model can be trained into an accurate entity relation extraction model.
A loss value is computed from the scores of the relation labels between the entities and the true relation labels between the entities, where the true relation labels are manually pre-annotated. The loss value is minimized by stochastic gradient descent, and the learnable parameters of the deep neural network model are updated layer by layer via error backpropagation. When the loss function converges during optimization, the entity relation extraction model is obtained; after it is saved, it can be used for document-level entity relation extraction.
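The optimization loop described above, minimizing a loss by gradient descent with parameters updated from backpropagated errors, can be sketched on a tiny stand-in classifier. The data, learning rate and logistic head are toy assumptions, not the patent's full deep model.

```python
import numpy as np

# Toy stand-in for the training loop: minimise a binary cross-entropy
# loss by gradient descent, updating the learnable parameters w from
# the (here analytically computed) gradient.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))                # entity-pair features (toy)
w_true = rng.normal(size=8)
y = (X @ w_true > 0).astype(float)          # "gold" relation labels

w = np.zeros(8)                             # learnable parameters
lr = 0.5
for step in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))      # sigmoid scores
    grad = X.T @ (p - y) / len(y)           # gradient of BCE loss
    w -= lr * grad                          # gradient-descent update

loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(round(float(loss), 3))
```

Once the loss stops decreasing, the parameters are frozen and the model is used for prediction only, mirroring the save-then-extract step in the text.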
In the document-level entity relation extraction method based on adaptive entity path perception, the invention adopts a pretrained language model to model complex interactions among different levels of information and learn deep contextualized lexical representations, and models the semantic information within a document by constructing a refined document graph. The message propagation algorithm is then controlled in terms of both breadth and depth: by learning adaptive perception paths for node message propagation, document-level information is filtered and aggregated, and the effective document-level information of the target entity is selectively aggregated. This overcomes the limitation of current entity relation extraction to intra-sentence relations, and also solves the over-smoothing of node representations caused by document-graph-based extraction methods treating all neighbor nodes indiscriminately during message propagation. The method improves the performance of document-level entity relation extraction, achieves efficient extraction of entity semantic relations, and provides data support and a core algorithmic technique for large-scale knowledge base construction, information retrieval, automatic question answering systems, and natural language understanding applications.
In one embodiment, constructing a document graph according to the context token sequence of the sentence and the position of the entity in the document to be extracted includes:
and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations by connecting the mention, entity and sentence nodes according to their natural associations in the document to be extracted.
In one embodiment, the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted comprise the interconnections between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes. The mention nodes, entity nodes and sentence nodes form the node set of the document graph, and their natural associations in the document to be extracted form the edge set of the document graph. A mention node is the average of the context representations of the words corresponding to the mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context representations of all words in the sentence.
In particular embodiments, mention nodes correspond to the different mentions of each entity in the document. The representation of a mention node is the average of the hidden representations of the words contained in the mention, plus a type embedding of the mention node; a document is assumed to contain N mentions in total. An entity node is formed similarly: its representation is the average of the representations of all mentions corresponding to the entity, plus a type embedding of the entity node; a document contains P entities. The representation of a sentence node is the average of the hidden representations of all words contained in the sentence, plus a type embedding of the sentence node; a document contains T sentences.
Through these three types of node construction, the representation set of the nodes is obtained, where d is the hidden dimension, for a total of N + P + T nodes (N mentions, P entities and T sentences).
After node construction is completed, nodes are connected based on the natural associations between document elements to form the document graph: a) mention node-mention node edges: mentions in the same sentence are connected to each other; b) mention node-sentence node edges: each mention is connected to the sentence in which it occurs; c) mention node-entity node edges: each mention is connected to its corresponding entity; d) entity node-sentence node edges: each entity is connected to the sentences containing its mentions; e) sentence node-sentence node edges: all sentence nodes are connected to each other. Note that no two entity nodes are directly connected in the graph; the purpose is to let the adaptive entity path-aware on-graph message propagation algorithm in the next step aggregate the relation between an entity pair through the multi-hop intermediate nodes between the entity nodes.
In summary, the constructed N + P + T mention, entity and sentence nodes are connected into the document graph G = (V, E) by the natural associations between the different node elements of the document, where V is the set of nodes and E is the set of edges.
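Under toy assumptions (hypothetical token vectors, mention spans and sentence boundaries), the node construction and edge rules a)-e) above can be sketched as:

```python
import numpy as np

# Sketch of the document-graph construction: node vectors are averages
# of token vectors, edges follow rules a)-e); no entity-entity edges.
d = 4
H = np.arange(10 * d, dtype=float).reshape(10, d)    # 10 token vectors (toy)

sents = [range(0, 5), range(5, 10)]                  # sentence -> tokens
mentions = [(0, range(0, 2), 0), (1, range(3, 5), 0),
            (2, range(6, 8), 1)]                     # (mention, tokens, sent)
ment2ent = {0: 0, 1: 0, 2: 1}                        # mention -> entity

mention_vec = {m: H[list(toks)].mean(0) for m, toks, _ in mentions}
entity_vec = {e: np.mean([mention_vec[m] for m in ment2ent
                          if ment2ent[m] == e], axis=0)
              for e in set(ment2ent.values())}
sent_vec = {s: H[list(toks)].mean(0) for s, toks in enumerate(sents)}

edges = set()
for m, _, s in mentions:
    edges.add(("M", m, "S", s))                      # b) mention-sentence
    edges.add(("M", m, "E", ment2ent[m]))            # c) mention-entity
    edges.add(("E", ment2ent[m], "S", s))            # d) entity-sentence
for m1, _, s1 in mentions:                           # a) mentions sharing
    for m2, _, s2 in mentions:                       #    a sentence
        if m1 < m2 and s1 == s2:
            edges.add(("M", m1, "M", m2))
for s1 in range(len(sents)):                         # e) all sentence pairs
    for s2 in range(s1 + 1, len(sents)):
        edges.add(("S", s1, "S", s2))
# by design, no ("E", ·, "E", ·) edges are ever added
```

Type embeddings are omitted here for brevity; the design point illustrated is the deliberate absence of entity-entity edges, forcing entity pairs to interact through multi-hop paths.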
In one embodiment, updating the initial representation of the entity node from both the breadth and the depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprising:
neighbor information within N hops of a target node in the document graph is aggregated by the adaptive entity path-aware on-graph message propagation algorithm, the interaction between entity pairs is modeled, the message propagation is controlled jointly in terms of breadth and depth, and document-level information is filtered and aggregated by automatically learning entity-related adaptive paths on the document graph, yielding the entity node representations with document-level semantics.
In one embodiment, aggregating neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, modeling interaction between entity pairs, jointly controlling a message propagation algorithm from both an extent and a depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, and obtaining an entity node representation of document-level semantics, includes:
neighbor information within N hops of a target node in the document graph is aggregated by the adaptive entity path-aware on-graph message propagation algorithm, and the interaction between entity pairs is modeled; in terms of breadth, during the aggregation of each hop's neighbor information, the temporary breadth aggregation representation of a node is obtained in a breadth-adaptive manner;
in terms of depth, according to an LSTM (long short-term memory) network, several gating mechanisms are applied to the temporary breadth aggregation representations of the nodes to selectively store the document-level high-order information related to the nodes, and only neighbors within a certain number of hops are selected for propagation, yielding the entity node representations with document-level semantics.
In one embodiment, obtaining the extent temporary aggregation of the nodes according to the extent adaptive manner includes:
obtaining the temporary breadth aggregation representation z_u^(l+1) of the node in the breadth-adaptive manner

z_u^(l+1) = Σ_{v∈N(u)} α_uv · W h_v^(l), with α_uv = softmax_{v∈N(u)}( FFN( (Q h_u^(l))^T K h_v^(l) ) ),

where α_uv refers to the weight parameter between node u and neighbor v, W is a learnable parameter that linearly transforms the neighbor features, h_v^(l) represents the layer-l representation of node v, Q and K refer to the query and key matrices in the attention mechanism, FFN refers to a feedforward neural network, N(u) is the set of neighbor nodes of node u, h_u^(l) represents the layer-l representation of node u, v represents a neighbor node of u, and T represents the transpose operation.
In a specific embodiment, for the breadth aspect, a multi-layer graph attention network performs the per-hop neighbor aggregation: the representation of each node at layer l+1 is obtained by first computing the temporary breadth aggregation representation of the node in the breadth-adaptive manner described above. By assigning different weights to different neighbors, the first-order neighbor nodes are treated distinctively.
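A hedged sketch of this breadth-adaptive aggregation, assuming a GAT-style attention score computed from the query/key projections and a softmax over the neighbor set (the patent's exact scoring formula is not reproduced here):

```python
import numpy as np

# Breadth-adaptive step: attention weights over a node's neighbours
# (GAT-style scoring from query/key projections is an assumption),
# then a weighted sum of linearly transformed neighbour features.
rng = np.random.default_rng(1)
d = 4
H = rng.normal(size=(5, d))        # layer-l node representations (toy)
W = rng.normal(size=(d, d))        # linear transform of neighbour features
Q = rng.normal(size=(d, d))        # query matrix
K = rng.normal(size=(d, d))        # key matrix

def breadth_aggregate(u: int, neighbors: list[int]) -> np.ndarray:
    scores = np.array([(Q @ H[u]) @ (K @ H[v]) for v in neighbors])
    scores -= scores.max()                       # numerically stable softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return sum(a * (W @ H[v]) for a, v in zip(alpha, neighbors))

z_u = breadth_aggregate(0, [1, 2, 3])
print(z_u.shape)
```

Because the weights α are normalized per neighbor set, each hop's neighbors contribute unequally, which is exactly the distinctive treatment of first-order neighbors described above.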
In one embodiment, in terms of depth, applying several gating mechanisms of an LSTM (long short-term memory) network to the temporary breadth aggregation representations of the nodes to selectively store the document-level high-order information related to the nodes, and selecting only neighbors within a certain number of hops for propagation to obtain the entity node representations with document-level semantics, includes:
and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forgetting gate, controlling the memory unit by using an output gate, and outputting the entity node representation of the document-level semantics.
In a specific embodiment, LSTM-style long short-term memory is introduced, and several gating mechanisms store and update the neighbor information of each hop, selectively retaining the document-level high-order information related to each node; only neighbors within a certain number of hops are selected for propagation, which effectively prevents propagation overload and over-smoothing. Based on the temporary breadth aggregation representation z_u^(l+1) of a node, the update gate u^(l+1) adds new valid information to the memory cell c^(l+1), and the forget gate f^(l+1) filters the previous layer's memory cell c^(l); the update gate and the forget gate cooperate to selectively extract and filter when exploring farther neighbors. Finally, the output gate o^(l+1) controls the memory cell and outputs the layer l+1 representation of node i. The calculation procedure is as follows:

f^(l+1) = σ(W_f [z_u^(l+1); h_u^(l)]),
u^(l+1) = σ(W_u [z_u^(l+1); h_u^(l)]),
o^(l+1) = σ(W_o [z_u^(l+1); h_u^(l)]),
c^(l+1) = f^(l+1) ⊙ c^(l) + u^(l+1) ⊙ tanh(W_c [z_u^(l+1); h_u^(l)]),
h_u^(l+1) = o^(l+1) ⊙ tanh(c^(l+1)).
where W_f, W_u, W_o and W_c respectively denote the learnable linear-transformation parameters of the forget gate, the update gate, the output gate and the memory cell.
As shown in FIG. 2, the method determines a suitable subgraph for each node by expanding in width (which neighbors of a given hop are important) and in depth (how important the t-th-hop neighbors are), so as to learn the adaptive path of the entity's information propagation on the document graph and selectively aggregate the effective document-level information of the target entity, which solves the problem that entity relation extraction is limited to intra-sentence relations. Through multiple iterations of this message propagation algorithm, a set of entity node representations containing document-level semantics is obtained.
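The depth-adaptive gating described above can be sketched with standard LSTM-style gates, an assumption consistent with the forget/update/output gates and memory cell the text names (weight shapes are illustrative):

```python
import numpy as np

# Depth-adaptive step: LSTM-style gating applied to the breadth
# aggregation z, storing/filtering per-hop information in a memory cell.
rng = np.random.default_rng(2)
d = 4
Wf, Wu, Wo, Wc = (rng.normal(size=(d, 2 * d)) for _ in range(4))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_update(z, h_prev, c_prev):
    x = np.concatenate([z, h_prev])
    f = sigmoid(Wf @ x)                    # forget gate: filter old memory
    u = sigmoid(Wu @ x)                    # update gate: admit new information
    o = sigmoid(Wo @ x)                    # output gate: expose the memory
    c = f * c_prev + u * np.tanh(Wc @ x)   # memory cell
    h = o * np.tanh(c)                     # layer l+1 node representation
    return h, c

h, c = depth_update(rng.normal(size=d), np.zeros(d), np.zeros(d))
print(h.shape, c.shape)
```

Iterating `depth_update` once per hop, with `breadth_aggregate` supplying `z` at each layer, is the two-axis control the method describes: breadth chooses which neighbors matter, depth chooses how far their influence persists.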
In one embodiment, predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities includes:
predicting the entity node representations with document-level semantics by the feedforward neural network to obtain the scores of all relation labels between the entities:

logits = W_2 σ(W_1 [e_h; e_t] + b_1) + b_2,

where W_1, W_2, b_1 and b_2 are learnable parameters of the classifier in the feedforward neural network, σ refers to the activation function, d refers to the hidden dimension of the feedforward neural network, k is the number of labels, and [e_h; e_t] represents the concatenation of the different entity node representations e_h and e_t, i.e., the feature of the entity pair.
In a specific embodiment, to predict the semantic relation contained in an entity pair (e_h, e_t), the head and tail entity representations e_h and e_t contained in the pair are first concatenated to obtain the entity-pair feature [e_h; e_t]. Then, from this entity-pair feature, the feedforward neural network computes the scores of all relation labels for the entity pair: logits = W_2 σ(W_1 [e_h; e_t] + b_1) + b_2, where W_1, W_2, b_1 and b_2 are learnable parameters of the classifier in the feedforward neural network, σ refers to the activation function, d refers to the hidden dimension of the feedforward neural network, and k is the number of labels.
In the prediction stage, normalization with the nonlinear activation function sigmoid yields the probability that relation label r holds for the entity pair: P(r) = sigmoid(logit_r), where logit_r refers to the score of relation label r in logits. Sigmoid converts a score into a value between 0 and 1, which is taken as the probability that the relation label given by the model exists for the target entity pair, enhancing the interpretability of the relation label scores. Since the model is self-learned, the range of its raw outputs cannot be known in advance; for example, it is unclear whether a label score of 100 is high or low. Sigmoid compresses the range into (0, 1), so a score well above 0.5 can be read as very high.
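A minimal sketch of the prediction head: concatenating head and tail entity vectors, scoring all k relation labels with a feedforward classifier, and squashing each score into (0, 1) with sigmoid (weight shapes and the tanh activation are illustrative assumptions):

```python
import numpy as np

# Prediction head sketch: [e_h; e_t] -> feedforward scores -> sigmoid.
rng = np.random.default_rng(3)
d, k = 4, 3
e_head, e_tail = rng.normal(size=d), rng.normal(size=d)
W1, b1 = rng.normal(size=(d, 2 * d)), np.zeros(d)    # classifier params
W2, b2 = rng.normal(size=(k, d)), np.zeros(k)

pair = np.concatenate([e_head, e_tail])              # entity-pair feature
logits = W2 @ np.tanh(W1 @ pair + b1) + b2           # scores of k labels
probs = 1.0 / (1.0 + np.exp(-logits))                # sigmoid -> (0, 1)
print(probs.shape)
```

Because each label gets its own independent probability, the same entity pair can exceed the decision threshold for several labels at once, matching the multi-label setting discussed below.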
In one embodiment, calculating the loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities comprises:
calculating the loss value from the scores of the relation labels between the entities and the relation labels actually existing between the entities as

L = − Σ_{r∈P_T} log( exp(logit_r) / Σ_{r′∈P_T∪{TH}} exp(logit_{r′}) ) − log( exp(logit_TH) / Σ_{r′∈N_T∪{TH}} exp(logit_{r′}) ),

where TH represents the threshold relation label, P_T represents the set of relation labels actually existing between the entities, N_T represents the set of negative-sample relation labels, logits refers to the scores of all relation labels of the entity pair, logit_r refers to the score of relation label r, and logit_TH represents the score of the threshold relation label TH.
In a specific embodiment, to handle more effectively the multi-label problem, i.e., that the same entity pair may hold multiple relation labels, and the sample-imbalance problem, i.e., that most entity pairs are negative samples holding no relation label, the invention adopts an adaptive threshold loss as the loss function and optimizes the model parameters end to end. The adaptive threshold loss introduces an additional threshold relation label TH; the optimization goal is that the scores of the positive-sample relation labels actually existing between the entities are higher than the score of the threshold label TH, while the scores of the negative-sample relation labels not existing between the entities are lower than it. Here a positive-sample label is a relation label that actually holds between the entities, and a negative-sample relation label is one that does not; logits refers to the scores of all relation labels of the entity pair. To obtain the optimal model parameters, the invention computes via this loss function the loss value between the predicted semantic relations and the relation labels actually existing between the entities, minimizes the loss value L by stochastic gradient descent, and updates the learnable parameters of the model layer by layer through error backpropagation. After the loss function converges during optimization, the model is saved and then used for document-level entity relation extraction.
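A hedged sketch of the adaptive threshold loss, assuming the ATLOP-style formulation that this description matches: positives are pushed above the TH label, and TH above the negatives, via two softmax cross-entropy terms.

```python
import numpy as np

# Adaptive-threshold loss sketch (ATLOP-style formulation assumed):
# term 1 ranks each positive label above TH, term 2 ranks TH above
# every negative label.
def adaptive_threshold_loss(logits, positive, negative, th=0):
    """logits: scores of all labels; index `th` is the TH label."""
    def log_softmax_pick(idxs, pool):
        x = logits[pool]
        x = x - x.max()                          # numeric stability
        logZ = np.log(np.exp(x).sum())
        return sum(x[pool.index(i)] - logZ for i in idxs)

    l1 = -log_softmax_pick(positive, positive + [th])   # positives > TH
    l2 = -log_softmax_pick([th], negative + [th])       # TH > negatives
    return l1 + l2

logits = np.array([0.0, 3.0, -2.0, 1.0])    # label 0 = TH (toy scores)
loss = adaptive_threshold_loss(logits, positive=[1], negative=[2, 3])
print(float(loss))
```

At inference, labels scoring above TH are emitted for the pair, so the threshold is learned per instance rather than fixed globally — which is what makes the multi-label and imbalance handling adaptive.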
It should be understood that, although the steps in the flowchart of fig. 1 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a document level entity relationship extraction apparatus based on adaptive entity path perception, including: a data preprocessing module 302, a build document graph module 304, an initial representation updating module 306, a prediction module 308, and a document-level entity relationship extraction module 310, wherein:
the data preprocessing module 302 is configured to obtain the document to be extracted and the positions of the entities in the document to be extracted, and to perform data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain the preprocessed document; the preprocessed document comprises a plurality of sentences;
a build document map module 304 for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
the initial representation updating module 306 is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantic;
the prediction module 308 is configured to predict an entity node representation of document-level semantics according to a feed-forward neural network, so as to obtain a score value of a relationship label between entities;
the document-level entity relationship extraction module 310 is configured to calculate a loss value according to the score values of the relationship labels between the entities and the actually existing relationship labels between the entities, and iteratively optimize learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.
In one embodiment, the construct document map module 304 is further configured to construct a document map according to the context token sequence of the sentence and the position of the entity in the document to be extracted, including:
and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations by connecting the mention, entity and sentence nodes according to their natural associations in the document to be extracted.
In one embodiment, the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted comprise the interconnections between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes. The mention nodes, entity nodes and sentence nodes form the node set of the document graph, and their natural associations in the document to be extracted form the edge set of the document graph. A mention node is the average of the context representations of the words corresponding to the mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context representations of all words in the sentence.
In one embodiment, the initial representation updating module 306 is further configured to update the initial representation of the entity node in both width and depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm, and obtain an entity node representation of document-level semantics, including:
neighbor information within N hops of a target node in the document graph is aggregated by the adaptive entity path-aware on-graph message propagation algorithm, the interaction between entity pairs is modeled, the message propagation is controlled jointly in terms of breadth and depth, and document-level information is filtered and aggregated by automatically learning entity-related adaptive paths on the document graph, yielding the entity node representations with document-level semantics.
In one embodiment, the initial representation updating module 306 is further configured to aggregate neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, model interactions between entity pairs, jointly control a message propagation algorithm in both breadth and depth, filter and aggregate document-level information by automatically learning entity-related adaptive paths on the document graph, and obtain an entity node representation of document-level semantics, including:
neighbor information within N hops of a target node in the document graph is aggregated by the adaptive entity path-aware on-graph message propagation algorithm, and the interaction between entity pairs is modeled; in terms of breadth, during the aggregation of each hop's neighbor information, the temporary breadth aggregation representation of a node is obtained in a breadth-adaptive manner;
in terms of depth, according to an LSTM (long short-term memory) network, several gating mechanisms are applied to the temporary breadth aggregation representations of the nodes to selectively store the document-level high-order information related to the nodes, and only neighbors within a certain number of hops are selected for propagation, yielding the entity node representations with document-level semantics.
In one embodiment, the initial representation updating module 306 is further configured to obtain a temporary aggregation of the extents of the nodes according to an extent adaptive manner, including:
obtaining the temporary breadth aggregation representation z_u^(l+1) of the node in the breadth-adaptive manner

z_u^(l+1) = Σ_{v∈N(u)} α_uv · W h_v^(l), with α_uv = softmax_{v∈N(u)}( FFN( (Q h_u^(l))^T K h_v^(l) ) ),

where α_uv refers to the weight parameter between node u and neighbor v, W is a learnable parameter that linearly transforms the neighbor features, h_v^(l) represents the layer-l representation of node v, Q and K refer to the query and key matrices in the attention mechanism, FFN refers to a feedforward neural network, N(u) is the set of neighbor nodes of node u, h_u^(l) represents the layer-l representation of node u, v represents a neighbor node of u, and T represents the transpose operation.
In one embodiment, the initial representation updating module 306 is further configured to, in terms of depth, apply several gating mechanisms of an LSTM (long short-term memory) network to the temporary breadth aggregation representations of the nodes to selectively store the document-level high-order information related to the nodes, and select only neighbors within a certain number of hops for propagation, so as to obtain the entity node representations with document-level semantics, where the method includes:
and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forgetting gate, controlling the memory unit by using an output gate, and outputting the entity node representation of the document-level semantics.
In one embodiment, the prediction module 308 is further configured to predict the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities, including:
predicting the entity node representations with document-level semantics by the feedforward neural network to obtain the scores of all relation labels between the entities:

logits = W_2 σ(W_1 [e_h; e_t] + b_1) + b_2,

where W_1, W_2, b_1 and b_2 are learnable parameters of the classifier in the feedforward neural network, σ refers to the activation function, d refers to the hidden dimension of the feedforward neural network, k is the number of labels, and [e_h; e_t] represents the concatenation of the different entity node representations e_h and e_t, i.e., the feature of the entity pair.
In one embodiment, the document-level entity relationship extraction module 310 is further configured to calculate a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities, including:
calculating the loss value from the scores of the relation labels between the entities and the relation labels actually existing between the entities as

L = − Σ_{r∈P_T} log( exp(logit_r) / Σ_{r′∈P_T∪{TH}} exp(logit_{r′}) ) − log( exp(logit_TH) / Σ_{r′∈N_T∪{TH}} exp(logit_{r′}) ),

where TH represents the threshold relation label, P_T represents the set of relation labels actually existing between the entities, N_T represents the set of negative-sample relation labels, logits refers to the scores of all relation labels of the entity pair, logit_r refers to the score of relation label r, and logit_TH represents the score of the threshold relation label TH.
For the specific definition of the document-level entity relationship extracting apparatus based on adaptive entity path sensing, refer to the above definition of the document-level entity relationship extracting method based on adaptive entity path sensing, and no further description is given here. The modules in the document level entity relation extraction device based on the adaptive entity path perception can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a document level entity relationship extraction method based on adaptive entity path perception. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor that implements the steps of the method in the above embodiments when executing the computer program.
In an embodiment, a computer storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the method in the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described, but any such combination should be considered within the scope of this specification as long as it contains no contradiction.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (9)
1. A document level entity relation extraction method based on adaptive entity path perception is characterized by comprising the following steps:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to the WordPiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-trained language model, an adaptive entity-path-aware on-graph message propagation algorithm, and a feedforward neural network;
performing context encoding on the preprocessed document by using the pre-trained language model to obtain a context characterization sequence of each sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity nodes in terms of both breadth and depth on the document graph by using the adaptive entity-path-aware on-graph message propagation algorithm to obtain entity node representations with document-level semantics;
predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels among the entities and the relationship labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
2. The method according to claim 1, wherein constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted comprises:
calculating initial representations of mention nodes, entity nodes and sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing a document graph from these initial representations by connecting the mention nodes, entity nodes and sentence nodes according to their natural associations in the document to be extracted.
3. The method according to claim 2, wherein the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted comprise connections between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes; the mention nodes, entity nodes and sentence nodes form the node set of the document graph; the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted form the edge set of the document graph; a mention node is the average of the context tokens of the words of the corresponding mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context tokens of all words in the sentence.
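The node initializations above (a mention node as the average of its words' context tokens, an entity node as the average of its mention nodes, a sentence node as the average of its words' tokens) can be sketched as follows. This is an illustrative numpy sketch, not the patented implementation; the function and argument names are assumed.

```python
import numpy as np

def build_node_initials(context, mention_spans, entity_mentions, sentence_spans):
    """Initial node representations for the document graph (a sketch).

    context: (seq_len, d) contextual token embeddings from the language model.
    mention_spans: list of (start, end) token spans, one per mention.
    entity_mentions: list of mention-index lists, one per entity.
    sentence_spans: list of (start, end) token spans, one per sentence.
    """
    # Mention node: average of the context tokens of the mention's words.
    mentions = np.stack([context[s:e].mean(axis=0) for s, e in mention_spans])
    # Entity node: average of the representations of its mention nodes.
    entities = np.stack([mentions[idx].mean(axis=0) for idx in entity_mentions])
    # Sentence node: average of the context tokens of all words in the sentence.
    sentences = np.stack([context[s:e].mean(axis=0) for s, e in sentence_spans])
    return mentions, entities, sentences
```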
4. The method of claim 1, wherein updating the initial representation of the entity nodes from both breadth and depth to the document graph using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprises:
aggregating neighbor information within N hops of a target node in the document graph by using the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, controlling the message propagation algorithm in terms of both breadth and depth, and screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, thereby obtaining entity node representations with document-level semantics.
5. The method of claim 4, wherein aggregating neighbor information within N hops of a target node in the document graph using the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, controlling the message propagation algorithm in terms of both breadth and depth, and screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph to obtain entity node representations with document-level semantics comprises:
aggregating neighbor information within N hops of the target node in the document graph by using the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, and, in terms of breadth, obtaining a breadth-wise temporary aggregated representation of a node during each hop of neighbor-information aggregation according to a breadth-adaptive mode;
and, in terms of depth, following an LSTM (long short-term memory) network, applying a plurality of gating mechanisms to the breadth-wise temporary aggregated representation of the node, selectively retaining document-level high-order information related to the node, and propagating only to neighbors within a certain number of hops, thereby obtaining entity node representations with document-level semantics.
6. The method of claim 5, wherein obtaining the breadth-wise temporary aggregated representation of a node according to the breadth-adaptive mode comprises:
obtaining the breadth-wise temporary aggregated representation of the node according to the breadth-adaptive mode:

$$\tilde{h}_u = \sum_{v \in \mathcal{N}(u)} \alpha_{uv}\, W h_v, \qquad \alpha_{uv} = \operatorname{softmax}_{v \in \mathcal{N}(u)}\!\left(\operatorname{FFN}\!\left((W_q h_u)(W_k h_v)^{T}\right)\right)$$

wherein $\alpha_{uv}$ refers to the weight parameter of node $u$ and neighbor $v$; $W$ is a learnable parameter that linearly transforms the neighbor features; $h_u$ represents the representation of node $u$; $W_q$ and $W_k$ refer to the query and key matrices in the attention mechanism; $\operatorname{FFN}$ refers to a feed-forward neural network; $\mathcal{N}(u)$ is the neighbor node set of node $u$; $h_v$ represents the representation of neighbor node $v$; and $T$ represents a transpose operation.
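The breadth-adaptive aggregation of claim 6 — attention weights computed from query/key projections, with linearly transformed neighbor features summed under those weights — might be sketched as below. This is a minimal numpy illustration under assumed names and shapes, not the claimed implementation.

```python
import numpy as np

def breadth_aggregate(h, neighbors, W, Wq, Wk):
    """One hop of breadth-adaptive neighbor aggregation (sketch).

    h: (num_nodes, d) node representations.
    neighbors: dict mapping a node index to a list of neighbor indices.
    W, Wq, Wk: (d, d) learnable transform, query, and key matrices.
    """
    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    out = np.zeros_like(h)
    for u, nbrs in neighbors.items():
        q = Wq @ h[u]                                        # query for node u
        scores = np.array([q @ (Wk @ h[v]) for v in nbrs])   # key-based scores
        alpha = softmax(scores)                              # adaptive weights
        out[u] = sum(a * (W @ h[v]) for a, v in zip(alpha, nbrs))
    return out
```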
7. The method of claim 5, wherein, in terms of depth, following an LSTM (long short-term memory) network, applying a plurality of gating mechanisms to the breadth-wise temporary aggregated representation of the node, selectively retaining document-level high-order information related to the node, and propagating only to neighbors within a certain number of hops to obtain entity node representations with document-level semantics comprises:
for the breadth-wise temporary aggregated representation of the node, adding the effective information therein into a memory cell by using an update gate, filtering invalid information in the previous layer's memory cell by using a forget gate, and controlling the memory cell with an output gate to output the entity node representation with document-level semantics.
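The gating in claim 7 follows standard LSTM-style cells. A minimal numpy sketch of one depth step (parameter names and shapes assumed, biases omitted for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_depth_update(h_tmp, c_prev, Wi, Wf, Wo, Wc):
    """One LSTM-style depth step over a breadth-wise aggregate (sketch).

    h_tmp: breadth-wise temporary aggregated representation of the node.
    c_prev: memory cell from the previous layer (hop).
    Wi, Wf, Wo, Wc: learnable gate parameters (assumed d x d).
    """
    i = sigmoid(h_tmp @ Wi)                    # update gate: admit useful info
    f = sigmoid(h_tmp @ Wf)                    # forget gate: filter old memory
    o = sigmoid(h_tmp @ Wo)                    # output gate: control the cell
    c = f * c_prev + i * np.tanh(h_tmp @ Wc)   # updated memory cell
    h = o * np.tanh(c)                         # node representation this hop
    return h, c
```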
8. The method of claim 1, wherein predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities comprises:
predicting the entity node representation of the document-level semantics according to a feedforward neural network to obtain the scores of all relation labels between the entities:

$$\text{logits} = W_2\,\sigma\!\left(W_1 x_{s,o} + b_1\right) + b_2$$

wherein $W_1 \in \mathbb{R}^{d \times 2d}$, $b_1$, $W_2 \in \mathbb{R}^{k \times d}$ and $b_2$ are learnable parameters of the classifier in the feedforward neural network; $\sigma$ refers to the activation function; $d$ refers to the hidden dimension of the feedforward neural network; $k$ is the number of labels; and $x_{s,o} = [e_s ; e_o]$ represents the entity-pair feature obtained by splicing the different entity node representations $e_s$ and $e_o$.
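A feed-forward classifier over a spliced entity pair, as claim 8 describes, might look like the following numpy sketch; the activation choice (tanh) and all names are assumptions, not taken from the patent.

```python
import numpy as np

def relation_logits(e_s, e_o, W1, b1, W2, b2):
    """Score all k relation labels for one entity pair (sketch)."""
    x = np.concatenate([e_s, e_o])      # splice the two entity representations
    hidden = np.tanh(W1 @ x + b1)       # hidden layer; tanh activation assumed
    return W2 @ hidden + b2             # one score (logit) per relation label
```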
9. The method of claim 8, wherein calculating a loss value based on the scoring values of the relationship labels between the entities and the actually existing relationship labels between the entities comprises:
calculating a loss value according to the score values of the relation labels between the entities and the actually existing relation labels between the entities:

$$\mathcal{L} = -\sum_{r \in \mathcal{P}} \log \frac{\exp(\text{logit}_r)}{\sum_{r' \in \mathcal{P} \cup \{TH\}} \exp(\text{logit}_{r'})} \; - \; \log \frac{\exp(\text{logit}_{TH})}{\sum_{r' \in \mathcal{N} \cup \{TH\}} \exp(\text{logit}_{r'})}$$

wherein $TH$ represents the threshold relation label; $\mathcal{P}$ represents the set of relation labels actually existing between the entities; $\mathcal{N}$ represents the negative-sample relation label set; logits refers to the scores of all relation labels for the entity pair; $\text{logit}_r$ refers to the score value of relation label $r$; $r'$ represents a relation label; $\text{logit}_{r'}$ represents the score value of relation label $r'$; and $\text{logit}_{TH}$ represents the score value of the threshold relation label $TH$.
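The loss in claim 9 pushes positive labels above a learnable threshold label TH, and TH above all negative labels. One way to sketch it in numpy (index conventions assumed: label indices into a logits vector, TH at an arbitrary index):

```python
import numpy as np

def adaptive_threshold_loss(logits, positives, negatives, th=0):
    """Two-part loss with a threshold label TH (sketch).

    logits: (k,) scores for all relation labels of one entity pair.
    positives: indices of relation labels that actually exist.
    negatives: indices of negative-sample relation labels.
    th: index of the threshold label TH.
    """
    def nll_of(pool, target):
        # Negative log-softmax of `target` restricted to the label pool.
        sub = logits[pool]
        z = sub - sub.max()
        log_probs = z - np.log(np.exp(z).sum())
        return -log_probs[pool.index(target)]

    # Part 1: each true label competes against {positives, TH}.
    loss = sum(nll_of(positives + [th], r) for r in positives)
    # Part 2: TH competes against {negatives, TH}.
    loss += nll_of(negatives + [th], th)
    return loss
```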
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210749823.9A CN114818682B (en) | 2022-06-29 | 2022-06-29 | Document level entity relation extraction method based on self-adaptive entity path perception |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210749823.9A CN114818682B (en) | 2022-06-29 | 2022-06-29 | Document level entity relation extraction method based on self-adaptive entity path perception |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114818682A true CN114818682A (en) | 2022-07-29 |
CN114818682B CN114818682B (en) | 2022-09-02 |
Family
ID=82523327
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210749823.9A Active CN114818682B (en) | 2022-06-29 | 2022-06-29 | Document level entity relation extraction method based on self-adaptive entity path perception |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114818682B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116522935A (en) * | 2023-03-29 | 2023-08-01 | 北京德风新征程科技股份有限公司 | Text data processing method, processing device and electronic equipment |
CN117648415A (en) * | 2023-10-17 | 2024-03-05 | 北京邮电大学 | Training method of text detection model, text detection method and related equipment |
CN118569267A (en) * | 2024-08-01 | 2024-08-30 | 贵州大学 | Relation extraction method for entity and mention enhancement based on semantic and association modeling |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351749A1 (en) * | 2016-06-03 | 2017-12-07 | Microsoft Technology Licensing, Llc | Relation extraction across sentence boundaries |
US20210019370A1 (en) * | 2019-07-19 | 2021-01-21 | Siemens Aktiengesellschaft | Neural relation extraction within and across sentence boundaries |
CN114090792A (en) * | 2021-11-25 | 2022-02-25 | 润联软件系统(深圳)有限公司 | Document relation extraction method based on comparison learning and related equipment thereof |
CN114298052A (en) * | 2022-01-04 | 2022-04-08 | 中国人民解放军国防科技大学 | Entity joint labeling relation extraction method and system based on probability graph |
CN114398491A (en) * | 2021-12-21 | 2022-04-26 | 成都量子矩阵科技有限公司 | Semantic segmentation image entity relation reasoning method based on knowledge graph |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351749A1 (en) * | 2016-06-03 | 2017-12-07 | Microsoft Technology Licensing, Llc | Relation extraction across sentence boundaries |
US20210019370A1 (en) * | 2019-07-19 | 2021-01-21 | Siemens Aktiengesellschaft | Neural relation extraction within and across sentence boundaries |
CN114090792A (en) * | 2021-11-25 | 2022-02-25 | 润联软件系统(深圳)有限公司 | Document relation extraction method based on comparison learning and related equipment thereof |
CN114398491A (en) * | 2021-12-21 | 2022-04-26 | 成都量子矩阵科技有限公司 | Semantic segmentation image entity relation reasoning method based on knowledge graph |
CN114298052A (en) * | 2022-01-04 | 2022-04-08 | 中国人民解放军国防科技大学 | Entity joint labeling relation extraction method and system based on probability graph |
Non-Patent Citations (1)
Title |
---|
ZHANG, Cui et al.: "Research on Relation Extraction Incorporating Syntactic Dependency Tree Attention", Guangdong Communication Technology * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116522935A (en) * | 2023-03-29 | 2023-08-01 | 北京德风新征程科技股份有限公司 | Text data processing method, processing device and electronic equipment |
CN116522935B (en) * | 2023-03-29 | 2024-03-29 | 北京德风新征程科技股份有限公司 | Text data processing method, processing device and electronic equipment |
CN117648415A (en) * | 2023-10-17 | 2024-03-05 | 北京邮电大学 | Training method of text detection model, text detection method and related equipment |
CN118569267A (en) * | 2024-08-01 | 2024-08-30 | 贵州大学 | Relation extraction method for entity and mention enhancement based on semantic and association modeling |
CN118569267B (en) * | 2024-08-01 | 2024-10-01 | 贵州大学 | Relation extraction method for entity and mention enhancement based on semantic and association modeling |
Also Published As
Publication number | Publication date |
---|---|
CN114818682B (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ribeiro et al. | Anchors: High-precision model-agnostic explanations | |
CN114818682B (en) | Document level entity relation extraction method based on self-adaptive entity path perception | |
Wang et al. | ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning | |
CN114330510B (en) | Model training method, device, electronic equipment and storage medium | |
CN112396185A (en) | Fact verification method, system, computer equipment and storage medium | |
CN112905801A (en) | Event map-based travel prediction method, system, device and storage medium | |
US20240046127A1 (en) | Dynamic causal discovery in imitation learning | |
US20240232519A1 (en) | Automated notebook completion using sequence-to-sequence transformer | |
Xiong et al. | DGI: recognition of textual entailment via dynamic gate matching | |
Liu et al. | Interpretability of computational models for sentiment analysis | |
CN114444515B (en) | Relation extraction method based on entity semantic fusion | |
US12106045B2 (en) | Self-learning annotations to generate rules to be utilized by rule-based system | |
Yang et al. | Generation-based parallel particle swarm optimization for adversarial text attacks | |
CN112015890B (en) | Method and device for generating movie script abstract | |
WO2021139255A1 (en) | Model based method and apparatus for predicting data change frequency, and computer device | |
Guo | [Retracted] Financial Market Sentiment Prediction Technology and Application Based on Deep Learning Model | |
Li | [Retracted] Forecast and Simulation of the Public Opinion on the Public Policy Based on the Markov Model | |
CN116541507A (en) | Visual question-answering method and system based on dynamic semantic graph neural network | |
WO2024098282A1 (en) | Geometric problem-solving method and apparatus, and device and storage medium | |
Yigit et al. | Assessing the impact of minor modifications on the interior structure of GRU: GRU1 and GRU2 | |
CN114579761A (en) | Information security knowledge entity relation connection prediction method, system and medium | |
Eisenstadt et al. | Autocompletion of Architectural Spatial Configurations Using Case-Based Reasoning, Graph Clustering, and Deep Learning | |
Liu et al. | An improved Harris Hawks optimization for Bayesian network structure learning via genetic operators | |
CN118709788B (en) | Social network public opinion situation decision method, device, equipment and medium | |
CN117909492B (en) | Method, system, equipment and medium for extracting unstructured information of power grid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||