CN114818682A - Document level entity relation extraction method based on self-adaptive entity path perception - Google Patents


Info

Publication number
CN114818682A
Authority
CN
China
Prior art keywords
entity
document
node
nodes
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210749823.9A
Other languages
Chinese (zh)
Other versions
CN114818682B (en)
Inventor
蒋林承
张俊丰
张维琦
赵超
邓劲生
曾道建
谭真
李硕豪
乔凤才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202210749823.9A priority Critical patent/CN114818682B/en
Publication of CN114818682A publication Critical patent/CN114818682A/en
Application granted granted Critical
Publication of CN114818682B publication Critical patent/CN114818682B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Abstract

The application relates to a document-level entity relationship extraction method based on adaptive entity path perception. The method comprises the following steps: constructing a document graph from the context characterization sequences of sentences and the positions of the entities in the document to be extracted; updating the initial representations of the entity nodes along both the breadth and the depth of the document graph with an adaptive entity-path-aware on-graph message propagation algorithm, to obtain entity node representations of document-level semantics; predicting from these representations, with a feedforward neural network, the score values of the relationship labels between entities; calculating a loss value from the relationship-label score values and the relationship labels actually existing between the entities, and iteratively optimizing the learnable parameters of the deep neural network model with the loss value and a back-propagation algorithm to obtain an entity relationship extraction model; and extracting document-level entity relationships with the entity relationship extraction model. By adopting the method, the accuracy of document-level entity relationship extraction can be improved.

Description

Document level entity relation extraction method based on self-adaptive entity path perception
Technical Field
The present application relates to the field of data processing technologies, and in particular to a document-level entity relationship extraction method and apparatus based on adaptive entity path perception, a computer device, and a storage medium.
Background
Entity relationship extraction is a classic task in the field of information extraction. It aims at identifying the semantic relationships between entities (concepts) contained in a given unstructured text and storing the results in the structured form of relational triples. For example, given the text "In October 2017, Toutiao announced the acquisition of the music short-video platform Musical.ly at a valuation of one billion dollars", entity relationship extraction yields the relation triple (Toutiao, acquired, Musical.ly). As a key technology of information extraction, entity relationship extraction plays an important role in many fields of natural language processing, and has great research significance and broad application prospects against the background of massive Internet information. In terms of theoretical value, entity relationship extraction involves the theories and methods of multiple disciplines such as machine learning, data mining and natural language processing. In terms of application, entity relationship extraction can be used to automatically construct large-scale knowledge bases, especially knowledge graphs; it provides data support for the construction of information retrieval and automatic question-answering systems, and is also a foundation of natural language understanding. Existing entity relationship extraction work mainly focuses on sentence-level extraction and is limited to entity semantic relationships within a single sentence. However, in real application scenarios, the description of entity semantic relationships is very complex: a large number of relationships between entities are expressed across several sentences, exhibiting complex associations among multiple entities. Statistics on manually annotated data sampled from Wikipedia indicate that at least 40% of entity semantic relationship facts can only be captured jointly from multiple sentences.
Therefore, there is a need to push entity relationship extraction to the document level, which better matches real scenarios. Compared with sentence-level entity relationship extraction, document-level entity relationship extraction is more challenging and requires more complex reasoning skills, such as logical reasoning, co-reference reasoning and common-sense reasoning. A document may contain multiple entities, each with multiple mentions in different contexts. To identify relationships between entities across sentences, a model must be able to capture the complex interactions between multiple entities in a document and leverage the contextual information of each entity's multiple mentions, which is clearly beyond the capability of sentence-level relationship extraction methods.
At present, with intensive research on graph neural networks, researchers have tried to model the various kinds of semantic information in documents with document graphs, in which words, mentions, entities or sentences serve as nodes, and heuristic rules are used to connect the nodes into a document graph. These methods focus on how to build better document graphs that retain more semantic information, and on how to propagate information on the graph more effectively. With the strong representation capability of graph neural networks, such methods achieve good results, but they still have the following problems: a) when aggregating entity representations, existing work aggregates the multiple mention representations without distinction and merges them into a single global representation used for semantic relationship prediction with all other entities. In fact, since the multiple mentions of an entity occur in different contexts in the document, the role each node plays should differ when it connects to different types of nodes. b) Graph neural networks perform reasoning implicitly through node information propagation; to capture the interaction of high-order information in the graph, multi-layer graph network structures (e.g., stacked graph convolutions) are often used, so the representations of nodes within the same connected component tend to converge to a subspace irrelevant to the input, and the learned node representations become over-smoothed and insufficiently accurate.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a document level entity relationship extraction method, apparatus, computer device and storage medium based on adaptive entity path awareness, which can improve the accuracy of document level entity relationship extraction.
A document level entity relationship extraction method based on adaptive entity path perception, the method comprises the following steps:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to a wordpiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
In one embodiment, constructing a document graph according to the context characterization sequence of sentences and the positions of the entities in the document to be extracted includes:
calculating initial representations of the mention nodes, entity nodes and sentence nodes from the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations by connecting the mention nodes, entity nodes and sentence nodes according to their natural associations in the document to be extracted.
In one embodiment, the natural associations of the mention nodes, entity nodes and sentence nodes in the document to be extracted comprise connections between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes. The mention nodes, entity nodes and sentence nodes form the node set of the document graph; their natural associations in the document to be extracted form the edge set of the document graph. A mention node is the average of the context characterizations of the corresponding words of the mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context characterizations of all words in the sentence.
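These node initializations can be sketched with plain NumPy. The spans, dimensions and variable names below are illustrative assumptions, not the patent's actual data:

```python
import numpy as np

# Toy context characterization sequence: 10 words, hidden dim 4.
h = np.arange(40, dtype=float).reshape(10, 4)

# Hypothetical layout: sentence 0 = words 0..4, sentence 1 = words 5..9;
# one entity with two mentions, at word spans [1,3) and [6,8).
mention_spans = [(1, 3), (6, 8)]          # half-open [start, stop)
sentence_spans = [(0, 5), (5, 10)]

# Mention node: average of the context characterizations of its words.
mention_reprs = [h[s:e].mean(axis=0) for s, e in mention_spans]
# Entity node: average of all its mention node representations.
entity_repr = np.mean(mention_reprs, axis=0)
# Sentence node: average of the context characterizations of all its words.
sentence_reprs = [h[s:e].mean(axis=0) for s, e in sentence_spans]
```

The edge set would then connect these nodes along the natural associations named above (mention-mention, mention-sentence, mention-entity, entity-sentence).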
In one embodiment, updating the initial representation of the entity node from both the breadth and the depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprising:
aggregating the neighbor information within N hops of a target node in the document graph with the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, controlling the message propagation jointly from both the breadth and the depth, and screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, to obtain entity node representations of document-level semantics.
In one embodiment, aggregating the neighbor information within N hops of a target node in the document graph with the adaptive entity-path-aware on-graph message propagation algorithm, modeling the interaction between entity pairs, controlling the message propagation jointly from both the breadth and the depth, and screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph to obtain entity node representations of document-level semantics, includes:
aggregating the neighbor information within N hops of the target node in the document graph with the adaptive entity-path-aware on-graph message propagation algorithm and modeling the interaction between entity pairs; in terms of breadth, during each hop of neighbor-information aggregation, obtaining a breadth-wise temporary aggregation representation of the node in a breadth-adaptive manner;

in terms of depth, following the LSTM long short-term memory network, applying several gating mechanisms to the breadth-wise temporary aggregation representations of the nodes so as to selectively store node-related document-level high-order information, and selecting only neighbors within a certain number of hops for propagation, thereby obtaining the entity node representations of document-level semantics.
In one embodiment, obtaining the breadth-wise temporary aggregation representation of a node in a breadth-adaptive manner includes:
obtaining the breadth-wise temporary aggregation representation of the node in a breadth-adaptive manner as

$$\tilde{h}_u = \sum_{v \in \mathcal{N}(u)} \alpha_{uv}\, W h_v, \qquad \alpha_{uv} = \operatorname{softmax}_{v \in \mathcal{N}(u)}\!\Big(\operatorname{FFN}\big((W_Q h_u)^{T}(W_K h_v)\big)\Big),$$

where $\alpha_{uv}$ is the weight parameter of node $u$ and neighbor $v$; $W$ is a learnable parameter that linearly transforms the neighbor features; $h_u$ denotes the representation of node $u$; $W_Q$ and $W_K$ are the query and key matrices in the attention mechanism; $\operatorname{FFN}$ is a feedforward neural network; $\mathcal{N}(u)$ is the neighbor node set of node $u$; $h_v$ denotes the representation of a neighbor node $v$ of node $u$; and $T$ denotes the transpose operation.
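A minimal NumPy sketch of one hop of this breadth-adaptive aggregation: attention scores from query/key projections pass through a score function and a softmax over the neighbor set, then weight the linearly transformed neighbors. The identity projection matrices, the scalar feed-forward score function and the toy inputs are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def breadth_aggregate(h_u, neighbors, W, W_q, W_k, ffn):
    # Score each neighbor v: FFN((W_q h_u)^T (W_k h_v)).
    scores = np.array([ffn((W_q @ h_u) @ (W_k @ h_v)) for h_v in neighbors])
    alpha = softmax(scores)                        # weights alpha_{uv}
    # Weighted sum of linearly transformed neighbor features.
    return sum(a * (W @ h_v) for a, h_v in zip(alpha, neighbors))

d = 2
I = np.eye(d)
h_u = np.array([1.0, 0.0])
neighbors = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
agg = breadth_aggregate(h_u, neighbors, I, I, I, ffn=lambda s: s)
```

With unit-basis neighbors the result is simply the attention weights, so the neighbor most similar to the target node dominates the aggregation.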
In one embodiment, in terms of depth, following the LSTM long short-term memory network, applying several gating mechanisms to the breadth-wise temporary aggregation representations of the nodes to selectively store node-related document-level high-order information, and selecting only neighbors within a certain number of hops for propagation to obtain the entity node representations of document-level semantics, includes:

for the breadth-wise temporary aggregation representation of a node, adding its effective information into the memory cell with an update gate, filtering invalid information out of the previous layer's memory cell with a forget gate, and controlling the memory cell with an output gate, which outputs the entity node representation of document-level semantics.
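The three gates can be sketched as one LSTM-style depth step in NumPy. The parameter shapes and the identity-matrix toy parameters are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_gate_step(agg, c_prev, W_i, W_f, W_o, W_c):
    i = sigmoid(W_i @ agg)                  # update gate: admit effective info
    f = sigmoid(W_f @ agg)                  # forget gate: filter previous cell
    o = sigmoid(W_o @ agg)                  # output gate: control the cell
    c = f * c_prev + i * np.tanh(W_c @ agg)  # new memory cell
    return o * np.tanh(c), c                # node representation, cell state

d = 2
I = np.eye(d)
h, c = depth_gate_step(np.zeros(d), np.ones(d), I, I, I, I)
```

Stacking this step once per hop lets the model keep only the document-level information that survives the gates, instead of aggregating every hop's neighbors indiscriminately.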
In one embodiment, predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities includes:
predicting, from the entity node representations of document-level semantics and via the feedforward neural network, the scores of all relation labels between entities as

$$\text{logits} = W_2\,\sigma(W_1 x + b_1) + b_2,$$

where $W_1 \in \mathbb{R}^{d \times 2d}$, $b_1 \in \mathbb{R}^{d}$, $W_2 \in \mathbb{R}^{k \times d}$ and $b_2 \in \mathbb{R}^{k}$ are the learnable parameters of the classifier in the feedforward neural network, $\sigma$ is the activation function, $d$ is the hidden dimension in the feedforward neural network, $k$ is the number of labels, and $x = [e_h; e_t]$ is the entity-pair feature obtained by splicing the two different entity node representations $e_h$ and $e_t$.
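This classifier can be sketched as a two-layer feedforward scorer over the concatenated entity pair. The dimensions, the tanh activation and the random toy parameters are illustrative assumptions:

```python
import numpy as np

def relation_scores(e_head, e_tail, W1, b1, W2, b2):
    x = np.concatenate([e_head, e_tail])   # entity-pair feature [e_h ; e_t]
    hidden = np.tanh(W1 @ x + b1)          # hidden layer with activation
    return W2 @ hidden + b2                # one score value per relation label

d, k = 4, 3                                # hidden dimension, number of labels
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(d, 2 * d)), np.zeros(d)
W2, b2 = rng.normal(size=(k, d)), np.zeros(k)
scores = relation_scores(np.ones(d), np.ones(d), W1, b1, W2, b2)
```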
In one embodiment, calculating the loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities comprises:
calculating the loss value from the score values of the relation labels between the entities and the relation labels actually existing between the entities as

$$\mathcal{L} = -\sum_{r \in \mathcal{P}_T} \log \frac{\exp(\text{logit}_r)}{\sum_{r' \in \mathcal{P}_T \cup \{TH\}} \exp(\text{logit}_{r'})} - \log \frac{\exp(\text{logit}_{TH})}{\sum_{r' \in \mathcal{N}_T \cup \{TH\}} \exp(\text{logit}_{r'})},$$

where $TH$ denotes the threshold relation label, $\mathcal{P}_T$ denotes the set of relation labels actually existing between the entities, $\mathcal{N}_T$ denotes the negative-sample relation label set, $\text{logits}$ refers to the scores of all relation labels for the entity pair $(e_h, e_t)$, $\text{logit}_r$ is the score value of a relation label $r$, and $\text{logit}_{TH}$ is the score value of the threshold relation label $TH$.
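This adaptive-threshold loss can be sketched in NumPy: positive labels compete with the threshold label TH in one softmax, and TH competes with the negative labels in another. Giving TH label index 0 and the toy scores below are illustrative assumptions:

```python
import numpy as np

def adaptive_threshold_loss(logits, positive, negative, th=0):
    # log-softmax of label `idx` restricted to the label pool `pool`
    def log_prob(idx, pool):
        z = logits[pool]
        m = z.max()
        return logits[idx] - (np.log(np.exp(z - m).sum()) + m)
    # each positive label competes with TH; TH competes with the negatives
    loss_pos = -sum(log_prob(r, positive + [th]) for r in positive)
    loss_neg = -log_prob(th, negative + [th])
    return loss_pos + loss_neg

logits = np.array([0.0, 2.0, -1.0])   # scores for labels TH, r1, r2
loss = adaptive_threshold_loss(logits, positive=[1], negative=[2])
```

The loss is minimized when every actually existing label scores above TH and every negative label scores below it, which is exactly the decision rule a threshold label encodes.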
A document level entity relationship extraction apparatus based on adaptive entity path awareness, the apparatus comprising:
the data preprocessing module is used for acquiring the document to be extracted and the position of the entity in the document to be extracted; performing data preprocessing on the document to be extracted according to a wordpiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
the document graph building module is used for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network; carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
the initial representation updating module is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantics;
the prediction module is used for predicting the entity node representation of the document level semantics according to the feedforward neural network to obtain the score value of the relation label between the entities;
the document level entity relation extraction module is used for calculating a loss value according to the score value of the relation labels among the entities and the relation labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by utilizing the loss value and a back propagation algorithm to obtain an entity relation extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to a wordpiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to a wordpiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;
predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
The invention adopts a pre-training language model to model the complex interactions between different levels of information and learn deep contextualized lexical representations, and models the semantic information within a document by constructing a refined document graph. It then controls the message propagation algorithm from both the breadth and the depth, screening and aggregating document-level information by learning adaptive perception paths for node message propagation, so that effective document-level information about a target entity is aggregated selectively. This solves the problem that current entity relationship extraction is limited to intra-sentence entity relationships, and also addresses the problems of document-graph-based document-level methods that treat neighbor nodes without distinction during message propagation and that node representations become over-smoothed. The invention thus improves the performance of document-level entity relationship extraction, realizes efficient extraction of entity semantic relationships, and provides data support and a core algorithm for large-scale knowledge base construction, information retrieval, automatic question-answering systems and natural language understanding applications.
Drawings
FIG. 1 is a flowchart illustrating a document-level entity relationship extraction method based on adaptive entity path awareness according to an embodiment;
FIG. 2 is a diagram of adaptive entity path sensing in one embodiment;
FIG. 3 is a block diagram of an embodiment of an apparatus for document-level entity relationship extraction based on adaptive entity path awareness;
FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided a document level entity relationship extraction method based on adaptive entity path perception, including the following steps:
Step 102, acquiring the document to be extracted and the position of the entity in the document to be extracted; performing data preprocessing on the document to be extracted according to a wordpiece algorithm to obtain a preprocessed document; the preprocessed document contains a plurality of sentences.
Step 102 of the present invention denotes the document to be extracted as $D = \{s_i\}_{i=1}^{N}$, i.e. document $D$ consists of $N$ sentences, where the $i$-th sentence $s_i = \{w_j\}_{j=1}^{M}$ contains $M$ words. The document is annotated with an entity set $E = \{e_i\}_{i=1}^{P}$ containing $P$ entities, where $e_i = \{m_j\}_{j=1}^{Q}$ refers to the $Q$ co-referent mentions of the $i$-th entity in the document, each appearing in a different context. The sentences of the document are each fed into the wordpiece tokenizer for segmentation; for example, after segmentation the $i$-th sentence becomes $s_i = \{t_j\}_{j=1}^{k}$, where $k \le M$, yielding the preprocessed document. Preprocessing the document to be extracted facilitates the context coding by the pre-training language model.
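The wordpiece segmentation of step 102 can be sketched as greedy longest-match-first subword splitting, as in BERT-style tokenizers. The tiny vocabulary below is an illustrative assumption, not a real tokenizer vocabulary:

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first subword segmentation (WordPiece-style).
    Continuation pieces carry the '##' prefix, as in BERT's tokenizer."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand
            if cand in vocab:        # take the longest matching piece
                piece = cand
                break
            end -= 1
        if piece is None:            # word cannot be segmented
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces

vocab = {"play", "##ing", "##ed", "the"}
tokens = wordpiece_tokenize("playing", vocab)   # ['play', '##ing']
```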
104, constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph includes an initial representation of the entity node.
In order to better model the semantics of the input document, the preprocessed, tokenized document is fed into the pre-training language model BERT, which maps the tokenized document sequence into low-dimensional real-valued vectors containing contextual semantics: the input sequence $\{t_j\}_{j=1}^{k}$ corresponding to the $i$-th sentence is mapped to the context characterization sequence $\{h_j\}_{j=1}^{k}$, where $h_j \in \mathbb{R}^{d}$ and $d$ is the hidden dimension, typically 768. The pre-training language model BERT is adopted because it can model the complex interactions between different levels of information and learn deep contextualized lexical representations.
Initial representations of the mention nodes, entity nodes and sentence nodes are calculated from the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and the document graph is constructed from these initial representations by connecting the nodes according to the natural associations of mentions, entities and sentences in the document to be extracted. The semantic information within the document is modeled by building this refined document graph. The adaptive entity-path-aware on-graph message propagation algorithm improves on previous on-graph message propagation algorithms, which make no selection during node aggregation: they aggregate all the neighbor information of the target node and do not control the message propagation from breadth and depth.
Step 106, updating the initial representations of the entity nodes of the document graph in both breadth and depth by using the adaptive entity path-aware on-graph message propagation algorithm, to obtain entity node representations of document-level semantics.

The adaptive entity path-aware on-graph message propagation algorithm aggregates neighbor information within N hops of a target node, models the interaction between entity pairs, and learns adaptive paths for the entities on the document graph to improve the entity node representations, thereby obtaining entity node representations of document-level semantics. By learning adaptive perception paths for node message propagation, document-level information is screened and aggregated, and the effective document-level information of the target entity is selectively aggregated to capture more effective relational semantic information, which solves the problem that current entity relation extraction is limited to intra-sentence entity relations.
Step 108, predicting over the entity node representations of document-level semantics with the feed-forward neural network to obtain the score values of the relation labels between entities.
Entities generally refer to proper nouns or concepts such as names of people, places and organizations, and relations refer to the semantic relationships between entities. For example, from the sentence "Toutiao announced the acquisition of the music short-video platform Musical.ly at a valuation of 1 billion dollars", entity relation extraction yields the relation triple (Toutiao, acquire, Musical.ly).
In order to predict the semantic relation contained between a pair of entity nodes of document-level semantics, the head and tail entity representations contained in the entity node representations are concatenated to obtain the entity pair containing the semantic relation; the entity pair is then predicted with the feed-forward neural network, and the loss value is computed from the prediction result, i.e., the scores of the relation labels between the entities, together with the relation labels actually existing between the entities, so that the deep neural network model can be trained and an accurate entity relation extraction model obtained.
Step 110, calculating a loss value according to the score value of the relationship labels between the entities and the actually existing relationship labels between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.
The loss value is calculated from the score values of the relation labels between entities and the true relation labels between entities, wherein the true relation labels are manually pre-annotated; the loss value is minimized by stochastic gradient descent, and the learnable parameters in the deep neural network model are updated layer by layer through error back-propagation. When the loss function converges during optimization, the entity relation extraction model is obtained; after being saved, the entity relation extraction model can be used for document-level entity relation extraction.
In the document-level entity relation extraction method based on adaptive entity path perception, the invention adopts a pre-trained language model to model the complex interactions among different levels of information and to learn deep contextualized lexical representations, and models the semantic information within a document by constructing a refined document graph. The message propagation algorithm is then controlled in both breadth and depth: by learning adaptive perception paths for node message propagation, document-level information is screened and aggregated, and the effective document-level information of the target entity is selectively aggregated. This solves the problem that current entity relation extraction is limited to intra-sentence entity relations, and also solves the problem in document-graph-based document-level entity relation extraction methods that neighbor nodes are treated without distinction during message passing, causing node representations to become over-smoothed. The method improves the performance of document-level entity relation extraction, realizes efficient extraction of entity semantic relations, and provides data support and a core algorithmic technique for large-scale knowledge base construction, information retrieval, automatic question answering systems, and natural language understanding applications in natural language processing.
In one embodiment, constructing a document graph according to the context token sequence of the sentence and the position of the entity in the document to be extracted includes:
and calculating initial representations of the mention nodes, entity nodes and sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations by connecting the mention nodes, entity nodes and sentence nodes according to their natural associations in the document to be extracted.
In one embodiment, the natural associations among mention nodes, entity nodes and sentence nodes in the document to be extracted comprise the interconnection between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes; the mention nodes, entity nodes and sentence nodes form the node set of the document graph; the natural associations among mention nodes, entity nodes and sentence nodes in the document to be extracted form the edge set of the document graph. A mention node is the average of the context characterizations of the words of the corresponding mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context characterizations of all words in the sentence.
In particular embodiments, mention nodes denote the different mentions of each entity in the document. The representation of a mention node is the average of the hidden representations of the words contained in the mention; assuming a document contains N mentions in total, the mention nodes are represented as $n_{m_j} = \mathrm{avg}_{w \in m_j}(h_w) + t_m,\ j = 1, \ldots, N$, wherein $t_m$ is the type embedding of mention nodes. Entity nodes take a form similar to mention nodes: the representation of an entity node is the average of the representations of all mentions corresponding to the entity, and assuming a document contains P entities, the entity nodes are represented as $n_{e_k} = \mathrm{avg}_{m_j \in e_k}(n_{m_j}) + t_e,\ k = 1, \ldots, P$, wherein $t_e$ is the type embedding of entity nodes. The representation of a sentence node is the average of the hidden representations of all words contained in the sentence sequence; assuming a document contains T sentences, the sentence nodes are represented as $n_{s_i} = \mathrm{avg}_{w \in s_i}(h_w) + t_s,\ i = 1, \ldots, T$, wherein $t_s$ is the type embedding of sentence nodes.

Through these three types of node construction, the representation set of the nodes $H^{(0)} \in \mathbb{R}^{(N+P+T) \times d}$ is obtained, where d is the hidden dimension, for a total of N + P + T nodes.
After the node construction is completed, nodes are connected based on the natural associations between document node elements to form the document graph: a) mention node - mention node edges: mentions in the same sentence are connected to each other. b) mention node - sentence node edges: a mention is connected to the sentence in which it is located. c) mention node - entity node edges: a mention is connected to its corresponding entity. d) entity node - sentence node edges: an entity is connected to every sentence that contains one of its mentions. e) sentence node - sentence node edges: all sentence nodes are connected to each other. It is noted that two entity nodes are never directly connected in the graph; the purpose is to let the adaptive entity path-aware on-graph message propagation algorithm of the next step aggregate the multi-hop intermediate nodes between entity nodes so as to model the relation between entity pairs.
In conclusion, the constructed N + P + T mention, entity and sentence nodes are connected into the document graph $G = (V, E)$ by utilizing the natural associations among the different node elements of the document, where V is the set of nodes and E is the set of edges.
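As an illustrative sketch only (not the patented implementation), the node averaging and the five edge rules described here can be expressed in plain Python; type embeddings are omitted and all function and variable names are hypothetical:

```python
from itertools import combinations

def average(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def build_document_graph(token_reps, sentences, mentions):
    # token_reps: one d-dim vector per token (from the context encoder);
    # sentences: (start, end) token spans; mentions: (start, end, sent_id, entity_id).
    mention_nodes = [average(token_reps[s:e]) for s, e, _, _ in mentions]
    n_entities = max(m[3] for m in mentions) + 1
    entity_nodes = [average([mention_nodes[j] for j, m in enumerate(mentions) if m[3] == eid])
                    for eid in range(n_entities)]
    sentence_nodes = [average(token_reps[s:e]) for s, e in sentences]

    N, P, T = len(mention_nodes), len(entity_nodes), len(sentence_nodes)
    # node ids: mentions [0, N), entities [N, N+P), sentences [N+P, N+P+T)
    edges = set()
    for (i, mi), (j, mj) in combinations(enumerate(mentions), 2):
        if mi[2] == mj[2]:
            edges.add((i, j))                  # a) mentions in the same sentence
    for i, m in enumerate(mentions):
        edges.add((i, N + P + m[2]))           # b) mention - its sentence
        edges.add((i, N + m[3]))               # c) mention - its entity
        edges.add((N + m[3], N + P + m[2]))    # d) entity - sentence holding a mention
    for s1, s2 in combinations(range(T), 2):
        edges.add((N + P + s1, N + P + s2))    # e) all sentence pairs
    # no direct entity-entity edges, matching the graph described above
    return mention_nodes + entity_nodes + sentence_nodes, edges

token_reps = [[1.0], [3.0], [5.0], [7.0]]
sentences = [(0, 2), (2, 4)]
mentions = [(0, 1, 0, 0), (2, 3, 1, 0), (3, 4, 1, 1)]
nodes, edges = build_document_graph(token_reps, sentences, mentions)
```

With this toy input (two sentences, three mentions of two entities), the graph has N + P + T = 3 + 2 + 2 = 7 nodes, and the two entity nodes remain unconnected to each other.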
In one embodiment, updating the initial representation of the entity node from both the breadth and the depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprising:
neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, a message propagation algorithm is controlled from both the aspect of breadth and depth, and document level information is screened and aggregated by automatically learning self-adaptive paths related to entities on the document graph to obtain entity node representation of document level semantics.
In one embodiment, aggregating neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, modeling interaction between entity pairs, jointly controlling a message propagation algorithm from both an extent and a depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, and obtaining an entity node representation of document-level semantics, includes:
neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, and for the aspect of breadth, in the aggregation process of neighbor information of each hop, a temporary aggregation representation of the breadth of the node is obtained according to a breadth self-adaptive mode;
in the depth aspect, according to an LSTM (long short-term memory) network, several gating mechanisms are applied to the breadth temporary aggregation representation of the nodes, document-level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected for propagation, so that the entity node representation of document-level semantics is obtained.
In one embodiment, obtaining the breadth temporary aggregation representation of the nodes according to the breadth-adaptive manner includes:

obtaining the breadth temporary aggregation representation of a node according to the breadth-adaptive manner

$\tilde{h}_u^{(l)} = \sum_{v \in N(u)} \alpha_{uv} W h_v^{(l)}, \quad \alpha_{uv} = \mathrm{softmax}_{v \in N(u)}\left(\mathrm{FFN}\left((Q h_u^{(l)})^{T} K h_v^{(l)}\right)\right)$

wherein $\alpha_{uv}$ refers to the weight parameter of node u and neighbor v, W is a learnable parameter that linearly transforms the neighbor features, $h_u^{(l)}$ represents the representation of node u at layer l, Q and K refer to the query and key matrices in the attention mechanism, FFN refers to a feed-forward neural network, $N(u)$ is the neighbor node set of node u, $h_v^{(l)}$ represents the representation of neighbor node v, and T represents the transpose operation.
In a specific embodiment, for the breadth aspect, the information aggregation over each hop of neighbors is performed with a multi-layer graph attention network. For the representation of each node at layer l+1, the breadth temporary aggregation representation $\tilde{h}_u^{(l)}$ of the node is first obtained in the breadth-adaptive manner shown by the following formula:

$\tilde{h}_u^{(l)} = \sum_{v \in N(u)} \alpha_{uv} W h_v^{(l)}, \quad \alpha_{uv} = \mathrm{softmax}_{v \in N(u)}\left(\mathrm{FFN}\left((Q h_u^{(l)})^{T} K h_v^{(l)}\right)\right)$

By assigning different weights to different neighbors, the first-order neighbor nodes are treated differentially.
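The breadth-adaptive aggregation can be sketched as follows; this is a simplified toy version in which the learnable parameters W, q, k are scalars and the feed-forward network is taken as the identity, so the names and shapes are illustrative assumptions rather than the patented formulation:

```python
import math

def breadth_aggregate(h_u, neighbors, W, q, k):
    # compatibility score of node u with each neighbor v: (q*h_u)^T (k*h_v),
    # with the FFN taken as the identity in this sketch
    scores = [sum((q * a) * (k * b) for a, b in zip(h_u, h_v)) for h_v in neighbors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]              # softmax over the neighbor set N(u)
    d = len(h_u)
    # attention-weighted sum of linearly transformed neighbor features
    out = [sum(a * (W * h_v[i]) for a, h_v in zip(alphas, neighbors)) for i in range(d)]
    return out, alphas

h_u = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
out, alphas = breadth_aggregate(h_u, neighbors, W=1.0, q=1.0, k=1.0)
```

The neighbor that is more compatible with the target node receives the larger weight, which is exactly the "different weights to different neighbors" behavior described above.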
In one embodiment, in the depth aspect, applying several gating mechanisms of an LSTM (long short-term memory) network to the breadth temporary aggregation representation of the nodes to selectively store document-level high-order information related to the nodes, and selecting only neighbors within a certain hop count for propagation to obtain the entity node representation of document-level semantics, includes:

for the breadth temporary aggregation representation of a node, adding the effective information in the breadth temporary aggregation representation into a memory cell with an update gate, filtering the invalid information out of the previous layer's memory cell with a forget gate, and controlling the memory cell with an output gate to output the entity node representation of document-level semantics.
In the specific embodiment, the long-short memory of the LSTM is introduced: several gating mechanisms store and update the neighbor information of each hop, selectively store the document-level high-order information related to the nodes, and propagate only neighbors within a certain hop count, which effectively prevents propagation overload and over-smoothing. Based on the breadth temporary aggregation representation $\tilde{h}_i^{(l)}$ of a node, the update gate $u_i^{(l)}$ adds the new valid information into the memory cell $c_i^{(l)}$, and the forget gate $f_i^{(l)}$ then filters the previous layer's memory cell $c_i^{(l-1)}$. The update gate and the forget gate cooperate to selectively extract and filter when exploring farther neighbors. Finally, the output gate $o_i^{(l)}$ controls the memory cell $c_i^{(l)}$ and outputs the representation of node i at layer l+1, $h_i^{(l+1)}$. The calculation procedure is as follows:

$f_i^{(l)} = \sigma(W_f \tilde{h}_i^{(l)})$
$u_i^{(l)} = \sigma(W_u \tilde{h}_i^{(l)})$
$o_i^{(l)} = \sigma(W_o \tilde{h}_i^{(l)})$
$c_i^{(l)} = f_i^{(l)} \odot c_i^{(l-1)} + u_i^{(l)} \odot \tanh(W_c \tilde{h}_i^{(l)})$
$h_i^{(l+1)} = o_i^{(l)} \odot \tanh(c_i^{(l)})$

wherein $W_f$, $W_u$, $W_o$ and $W_c$ respectively refer to the learnable parameters of the linear transformations corresponding to the forget gate, the update gate, the output gate and the memory cell.
As shown in FIG. 2, the method determines a suitable subgraph by expanding the breadth (which neighbors of a hop are important) and the depth (how important the t-th hop of neighbors is) of each node, so as to learn the adaptive paths of entity information propagation on the document graph and selectively aggregate the effective document-level information of the target entity, which solves the problem that entity relation extraction is limited to intra-sentence entity relations. Through multiple iterations of this message propagation algorithm, the set of entity node representations containing document-level semantics, $\{e_1, e_2, \ldots, e_P\}$, is obtained.
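The gating procedure in the depth aspect can be sketched as follows; this toy version uses scalar weights per gate in place of the learnable linear transformations, so it is an assumption-laden illustration, not the patented implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_depth_update(h_tilde, c_prev, Wf, Wu, Wo, Wc):
    # h_tilde: breadth temporary aggregation of the node at this layer;
    # c_prev: the previous layer's memory cell.
    d = len(h_tilde)
    f = [sigmoid(Wf * x) for x in h_tilde]      # forget gate: filters the previous memory cell
    u = [sigmoid(Wu * x) for x in h_tilde]      # update gate: admits new valid information
    c = [f[i] * c_prev[i] + u[i] * math.tanh(Wc * h_tilde[i]) for i in range(d)]
    o = [sigmoid(Wo * x) for x in h_tilde]      # output gate: controls the memory cell
    h_next = [o[i] * math.tanh(c[i]) for i in range(d)]
    return h_next, c

h_next, c = gated_depth_update([0.8, -0.3], [0.0, 0.0], Wf=1.0, Wu=1.0, Wo=1.0, Wc=1.0)
```

Because each output dimension passes through tanh gated by a sigmoid, the updated node representation stays bounded, which is the mechanism that limits propagation overload across hops.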
In one embodiment, predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities includes:
predicting over the entity node representations of document-level semantics with the feed-forward neural network to obtain the scores of all relation labels between entities

$\mathrm{logits}_{(h,t)} = W_2\, \sigma(W_1 x_{(h,t)} + b_1) + b_2$

wherein $W_1 \in \mathbb{R}^{d \times 2d}$, $b_1 \in \mathbb{R}^{d}$, $W_2 \in \mathbb{R}^{k \times d}$ and $b_2 \in \mathbb{R}^{k}$ are the learnable parameters of the classifier in the feed-forward neural network, $\sigma$ refers to the activation function, d refers to the hidden dimension of the feed-forward neural network, k is the number of labels, and $x_{(h,t)} = [e_h; e_t]$ represents the feature of the entity pair obtained by concatenating the different entity node representations $e_h$ and $e_t$.
In particular embodiments, first, in order to predict the semantic relation contained in the entity pair $(e_h, e_t)$, the head and tail entity representations $e_h$ and $e_t$ contained in the entity pair are concatenated to obtain the feature of the entity pair

$x_{(h,t)} = [e_h; e_t]$

Then, the feed-forward neural network computes, from the feature $x_{(h,t)}$ of the entity pair, the scores of all relation labels of the entity pair $(e_h, e_t)$

$\mathrm{logits}_{(h,t)} = W_2\, \sigma(W_1 x_{(h,t)} + b_1) + b_2$

wherein $W_1 \in \mathbb{R}^{d \times 2d}$, $b_1 \in \mathbb{R}^{d}$, $W_2 \in \mathbb{R}^{k \times d}$ and $b_2 \in \mathbb{R}^{k}$ are the learnable parameters of the classifier in the feed-forward neural network, $\sigma$ refers to the activation function, d refers to the hidden dimension of the feed-forward neural network, and k is the number of labels.
In the prediction stage, normalization with the nonlinear activation function sigmoid yields the probability that the relation label r holds between the entity pair $(e_h, e_t)$:

$P(r \mid e_h, e_t) = \mathrm{sigmoid}(\mathrm{logits}_r)$

wherein $\mathrm{logits}_r$ refers to the score of relation label r in $\mathrm{logits}_{(h,t)}$. Sigmoid converts a score into a value between 0 and 1, which is taken as the probability that the relation label given by the model holds for the target entity pair, thereby enhancing the interpretability of the relation label scores. Since the model is learned autonomously, the final output range cannot be known in advance (for example, it is unclear whether a label score of 100 is high or low); with sigmoid the range is compressed into (0, 1), so that scores greater than 0.5 are generally known to be high.
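A minimal sketch of the classifier and the sigmoid normalization described above, in plain Python; ReLU is assumed for the activation function $\sigma$, and all parameter values and shapes below are illustrative:

```python
import math

def relation_scores(e_h, e_t, W1, b1, W2, b2):
    x = e_h + e_t                                            # concatenation [e_h; e_t]
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]                     # hidden layer with ReLU
    logits = [sum(w * hi for w, hi in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]                     # one score per relation label
    probs = [1.0 / (1.0 + math.exp(-s)) for s in logits]     # sigmoid normalization
    return logits, probs

e_h, e_t = [1.0], [2.0]
W1, b1 = [[1.0, 1.0]], [0.0]              # hidden dim d = 1, input dim 2d = 2
W2, b2 = [[1.0], [-1.0]], [0.0, 0.0]      # k = 2 relation labels
logits, probs = relation_scores(e_h, e_t, W1, b1, W2, b2)
```

The raw logits are unbounded, while the sigmoid-normalized probabilities fall in (0, 1) and can be read against the 0.5 threshold as described.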
In one embodiment, calculating the loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities comprises:
calculating the loss value from the score values of the relation labels between entities and the actually existing relation labels between entities as

$L = -\sum_{r \in P_T} \log \frac{\exp(\mathrm{logits}_r)}{\sum_{r' \in P_T \cup \{TH\}} \exp(\mathrm{logits}_{r'})} - \log \frac{\exp(\mathrm{logits}_{TH})}{\sum_{r' \in N_T \cup \{TH\}} \exp(\mathrm{logits}_{r'})}$

wherein TH represents the threshold relation label, $P_T$ represents the set of relation labels actually existing between the entities, $N_T$ represents the negative-sample relation label set, logits refers to the scores of all relation labels of the entity pair $(e_h, e_t)$, $\mathrm{logits}_r$ refers to the score value of relation label r, $r'$ represents a relation label, $\mathrm{logits}_{r'}$ represents the score value of relation label $r'$, and $\mathrm{logits}_{TH}$ represents the score value of the threshold relation label TH.
In particular embodiments, in order to handle more effectively the multi-label problem, i.e., that the same entity pair may contain multiple relation labels, and the sample imbalance problem, i.e., that most entity pairs are negative samples containing no relation label, the invention adopts the adaptive threshold loss as the loss function and optimizes the model parameters end to end. The adaptive threshold loss introduces an additional threshold relation label TH; the optimization goal is that the scores of the positive-sample relation label set $P_T$ actually existing between entities are higher than that of the threshold class label TH, while the scores of the negative-sample relation label set $N_T$ not existing between entities are lower than that of the threshold class label TH, wherein a positive-sample label refers to a relation label that actually exists between the entities and a negative-sample relation label refers to a relation that does not exist between the entities. The loss function is calculated as follows:

$L = -\sum_{r \in P_T} \log \frac{\exp(\mathrm{logits}_r)}{\sum_{r' \in P_T \cup \{TH\}} \exp(\mathrm{logits}_{r'})} - \log \frac{\exp(\mathrm{logits}_{TH})}{\sum_{r' \in N_T \cup \{TH\}} \exp(\mathrm{logits}_{r'})}$

wherein logits refers to the scores of all relation labels of the entity pair $(e_h, e_t)$. In order to obtain the optimal model parameters, the invention calculates, through this loss function, the loss value between the semantic relations between entities and the relation labels actually existing between entities, minimizes the loss value L with stochastic gradient descent, and updates the learnable parameters in the model layer by layer through error back-propagation. After the loss function converges during optimization, the model is saved and then used for document-level entity relation extraction.
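The adaptive threshold loss can be sketched for a single entity pair as follows; the function and argument names are hypothetical, and the formulation follows the description above (positive labels ranked above TH, TH ranked above negatives):

```python
import math

def adaptive_threshold_loss(logits, positive, th_index):
    # logits: scores of all labels including the threshold label at th_index;
    # positive: indices of labels that actually hold for this entity pair;
    # every other non-TH label is treated as a negative sample.
    def log_softmax_over(indices, target):
        m = max(logits[i] for i in indices)
        z = sum(math.exp(logits[i] - m) for i in indices)
        return (logits[target] - m) - math.log(z)

    negatives = [i for i in range(len(logits))
                 if i != th_index and i not in positive]
    # each actually-existing label is ranked above the threshold label TH ...
    l_pos = -sum(log_softmax_over(positive + [th_index], r) for r in positive)
    # ... and TH is ranked above every negative label
    l_neg = -log_softmax_over(negatives + [th_index], th_index)
    return l_pos + l_neg

loss_hard = adaptive_threshold_loss([2.0, 0.0, 0.0], positive=[0], th_index=2)
loss_easy = adaptive_threshold_loss([5.0, 0.0, 0.0], positive=[0], th_index=2)
```

Raising the score of a true label relative to TH lowers the loss, and for a pair with no positive labels the loss reduces to pushing TH above all negatives.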
It should be understood that, although the steps in the flowchart of fig. 1 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a document level entity relationship extraction apparatus based on adaptive entity path perception, including: a data preprocessing module 302, a build document graph module 304, an initial representation updating module 306, a prediction module 308, and a document-level entity relationship extraction module 310, wherein:
the data preprocessing module 302 is configured to obtain a document to be extracted and the positions of the entities in the document to be extracted, and to perform data preprocessing on the document to be extracted according to the wordpiece algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
a build document map module 304 for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
the initial representation updating module 306 is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantic;
the prediction module 308 is configured to predict an entity node representation of document-level semantics according to a feed-forward neural network, so as to obtain a score value of a relationship label between entities;
the document-level entity relationship extraction module 310 is configured to calculate a loss value according to the score values of the relationship labels between the entities and the actually existing relationship labels between the entities, and iteratively optimize learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.
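The wordpiece preprocessing performed by the data preprocessing module 302 can be sketched with the standard greedy longest-match-first segmentation; the toy vocabulary and all names below are illustrative assumptions, not the patented implementation:

```python
def wordpiece_tokenize(word, vocab):
    # Greedy longest-match-first segmentation of one word into sub-word pieces.
    tokens, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece          # continuation pieces carry the ## prefix
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:                       # no piece matches: the word is unknown
            return ["[UNK]"]
        tokens.append(cur)
        start = end
    return tokens

vocab = {"doc", "##ument", "entity"}          # toy vocabulary, purely illustrative
pieces = wordpiece_tokenize("document", vocab)
```

Words already in the vocabulary pass through unchanged, unknown words map to the [UNK] token, and longer words are split into a first piece plus ##-prefixed continuations, which is the form BERT-style encoders consume.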
In one embodiment, the construct document map module 304 is further configured to construct a document map according to the context token sequence of the sentence and the position of the entity in the document to be extracted, including:
calculating initial representations of the mention nodes, entity nodes and sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the document to be extracted, and constructing the document graph from these initial representations by connecting the mention nodes, entity nodes and sentence nodes according to their natural associations in the document to be extracted.
In one embodiment, the natural associations among mention nodes, entity nodes and sentence nodes in the document to be extracted comprise the interconnection between mention nodes, between mention nodes and sentence nodes, between mention nodes and entity nodes, and between entity nodes and sentence nodes; the mention nodes, entity nodes and sentence nodes form the node set of the document graph; the natural associations among mention nodes, entity nodes and sentence nodes in the document to be extracted form the edge set of the document graph. A mention node is the average of the context characterizations of the words of the corresponding mention in the document; an entity node is the average of the representations of all mention nodes corresponding to the entity; a sentence node is the average of the context characterizations of all words in the sentence.
In one embodiment, the initial representation updating module 306 is further configured to update the initial representation of the entity node in both width and depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm, and obtain an entity node representation of document-level semantics, including:
neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, a message propagation algorithm is controlled from both the aspect of breadth and depth, and document level information is screened and aggregated by automatically learning self-adaptive paths related to entities on the document graph to obtain entity node representation of document level semantics.
In one embodiment, the initial representation updating module 306 is further configured to aggregate neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, model interactions between entity pairs, jointly control a message propagation algorithm in both breadth and depth, filter and aggregate document-level information by automatically learning entity-related adaptive paths on the document graph, and obtain an entity node representation of document-level semantics, including:
neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, and for the aspect of breadth, in the aggregation process of neighbor information of each hop, a temporary aggregation representation of the breadth of the node is obtained according to a breadth self-adaptive mode;
according to the LSTM long and short memory network in the depth aspect, a plurality of gate control mechanisms are utilized to temporarily aggregate and express the breadth of the nodes, document level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected to be transmitted, so that entity node expression of document level semantics is obtained.
In one embodiment, the initial representation updating module 306 is further configured to obtain the breadth temporary aggregation representation of the nodes according to the breadth-adaptive manner, including:

obtaining the breadth temporary aggregation representation of a node according to the breadth-adaptive manner

$\tilde{h}_u^{(l)} = \sum_{v \in N(u)} \alpha_{uv} W h_v^{(l)}, \quad \alpha_{uv} = \mathrm{softmax}_{v \in N(u)}\left(\mathrm{FFN}\left((Q h_u^{(l)})^{T} K h_v^{(l)}\right)\right)$

wherein $\alpha_{uv}$ refers to the weight parameter of node u and neighbor v, W is a learnable parameter that linearly transforms the neighbor features, $h_u^{(l)}$ represents the representation of node u at layer l, Q and K refer to the query and key matrices in the attention mechanism, FFN refers to a feed-forward neural network, $N(u)$ is the neighbor node set of node u, $h_v^{(l)}$ represents the representation of neighbor node v, and T represents the transpose operation.
In one embodiment, the initial representation updating module 306 is further configured to, in the depth aspect, apply several gating mechanisms of an LSTM (long short-term memory) network to the breadth temporary aggregation representation of the nodes to selectively store document-level high-order information related to the nodes, and to select only neighbors within a certain hop count for propagation to obtain the entity node representation of document-level semantics, including:
and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forgetting gate, controlling the memory unit by using an output gate, and outputting the entity node representation of the document-level semantics.
In one embodiment, the prediction module 308 is further configured to predict the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities, including:
predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain scores of all relation labels among entities
Figure 221831DEST_PATH_IMAGE144
Wherein
Figure 626267DEST_PATH_IMAGE145
,
Figure 209695DEST_PATH_IMAGE146
,
Figure 990570DEST_PATH_IMAGE147
And
Figure 241422DEST_PATH_IMAGE148
are learnable parameters of classifiers in a feed-forward neural network,
Figure 551181DEST_PATH_IMAGE149
refers to the activation function, d refers to the hidden dimension in the feedforward neural network, k is the number of labels,
Figure 621905DEST_PATH_IMAGE150
representing different physical node representations
Figure 940891DEST_PATH_IMAGE151
And
Figure 46250DEST_PATH_IMAGE152
and splicing to obtain the characteristics of the entity pairs.
In one embodiment, the document-level entity relationship extraction module 310 is further configured to calculate a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities, including:
calculating a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities as
Figure 792489DEST_PATH_IMAGE153
Wherein TH represents a threshold relationship label TH,
Figure 350510DEST_PATH_IMAGE154
a set of relationship tags representing the actual existence of entities,
Figure 207607DEST_PATH_IMAGE155
representing negative exemplar relational tag sets, logits refer to pairs of entities
Figure 433052DEST_PATH_IMAGE156
The scores of all of the relationship tags in (c),
Figure 350193DEST_PATH_IMAGE157
finger relation label
Figure 395509DEST_PATH_IMAGE158
The score value of (a) is obtained,
Figure 790718DEST_PATH_IMAGE033
a label representing the relationship between the user and the user,
Figure 870670DEST_PATH_IMAGE034
representing relationship labels
Figure 958712DEST_PATH_IMAGE035
The score value of (a) is calculated,
Figure 491324DEST_PATH_IMAGE036
label for representing threshold relation
Figure 955803DEST_PATH_IMAGE037
The score value of (a).
For the specific definition of the document-level entity relationship extracting apparatus based on adaptive entity path sensing, refer to the above definition of the document-level entity relationship extracting method based on adaptive entity path sensing, and no further description is given here. The modules in the document level entity relation extraction device based on the adaptive entity path perception can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a document level entity relationship extraction method based on adaptive entity path perception. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.
In an embodiment, a computer storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by hardware instructions of a computer program, which may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (9)

1. A document level entity relation extraction method based on adaptive entity path perception is characterized by comprising the following steps:
acquiring a document to be extracted and the position of an entity in the document to be extracted;
performing data preprocessing on the document to be extracted according to a workprovider algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;
constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;
carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;
constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;
updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain an entity node representation of document level semantics;
predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain a relation label score value between entities;
calculating a loss value according to the score value of the relationship labels among the entities and the relationship labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;
and extracting the document-level entity relationship according to the entity relationship extraction model.
2. The method according to claim 1, wherein constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted comprises:
and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the documents to be extracted, and constructing a document graph by using the initial representations of the mention nodes, the entity nodes and the sentence nodes and the natural associated connection nodes of the mention nodes, the entity nodes and the sentence nodes in the documents to be extracted.
3. The method according to claim 2, wherein the natural association of the mention node, the entity node and the sentence node in the document to be extracted comprises interconnection between the mention node and the mention node, interconnection between the mention node and the sentence node, interconnection between the mention node and the entity node and interconnection between the entity node and the sentence node; the mentioned nodes, the entity nodes and the sentence nodes form a node set of the document graph; the mentioned nodes, the entity nodes and the sentence nodes are naturally associated in the document to be extracted to form an edge set of the document graph; the reference node is an average value of context tokens referring to corresponding words in the document; the entity node is an average value represented by all the mentioned nodes corresponding to the entity; the sentence node is the average of the context tokens of all words in the sentence.
4. The method of claim 1, wherein updating the initial representation of the entity nodes from both breadth and depth to the document graph using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprises:
neighbor information in target node N hops in the document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, the message propagation algorithm is controlled from both the aspect of breadth and depth, and document level information is screened and aggregated by automatically learning self-adaptive paths related to entities on the document graph, so that entity node representation of document level semantics is obtained.
5. The method of claim 4, wherein aggregating neighbor information within N hops of a target node in the document graph using an adaptive entity path-aware on-graph message propagation algorithm, modeling interactions between pairs of entities, co-controlling the message propagation algorithm in both breadth and depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, resulting in an entity node representation of document-level semantics, comprises:
aggregating neighbor information in N hops of a target node in the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception, modeling interaction between entity pairs, and obtaining temporary aggregation representation of the node in the aggregation process of neighbor information of each hop according to an extent self-adaptive mode in terms of extent;
and according to the long and short memory network of the LSTM in depth, a plurality of gate control mechanisms are utilized to temporarily aggregate and express the breadth of the nodes, document level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected to be transmitted, so that entity node expression of document level semantics is obtained.
6. The method of claim 5, wherein obtaining the extent temporal aggregation of nodes according to an extent adaptive approach comprises:
obtaining the temporary aggregation of the breadth of the node according to the breadth self-adaptive mode
Figure 138226DEST_PATH_IMAGE001
Wherein the content of the first and second substances,
Figure 55366DEST_PATH_IMAGE002
Figure 366262DEST_PATH_IMAGE003
refers to the weight parameters of node u and neighbor v,
Figure 761471DEST_PATH_IMAGE004
are learnable parameters that linearly transform the neighbor features,
Figure 841422DEST_PATH_IMAGE005
representing nodes
Figure 929464DEST_PATH_IMAGE006
Is shown in the drawing (a) and (b),
Figure 462077DEST_PATH_IMAGE007
and
Figure 926556DEST_PATH_IMAGE008
referring to the query and key matrices in the attention mechanism,
Figure 861014DEST_PATH_IMAGE009
refers to a feed-forward neural network, and the neural network,
Figure 119957DEST_PATH_IMAGE010
is a node
Figure 874286DEST_PATH_IMAGE011
Is determined by the node of the neighbor node set,
Figure 876878DEST_PATH_IMAGE012
representing nodes
Figure 665842DEST_PATH_IMAGE013
Is shown in the drawing (a) and (b),
Figure 361266DEST_PATH_IMAGE013
representing nodes
Figure 868470DEST_PATH_IMAGE014
The neighbor nodes of (a) are,Trepresenting a transpose operation.
7. The method of claim 5, wherein the temporally aggregating the representation of the breadth of the nodes using a plurality of gating mechanisms to selectively save the document level high-order information related to the nodes according to the long and short memory network of the LSTM in terms of depth, and selecting only the neighbors within a certain hop count for propagation to obtain the entity node representation of the document level semantics, comprises:
and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forget gate, controlling the memory unit by an output gate, and outputting the entity node representation of the document-level semantics.
8. The method of claim 1, wherein predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities comprises:
predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain scores of all relation labels among entities
Figure 409173DEST_PATH_IMAGE015
Wherein
Figure 52644DEST_PATH_IMAGE016
,
Figure 653390DEST_PATH_IMAGE017
,
Figure 647890DEST_PATH_IMAGE018
And
Figure 992284DEST_PATH_IMAGE019
are learnable parameters of classifiers in a feed-forward neural network,
Figure 755841DEST_PATH_IMAGE020
refers to the activation function, d refers to the hidden dimension in the feedforward neural network, k is the number of labels,
Figure 793067DEST_PATH_IMAGE021
representing different physical node representations
Figure 9285DEST_PATH_IMAGE022
And
Figure 157369DEST_PATH_IMAGE023
and splicing to obtain the characteristics of the entity pairs.
9. The method of claim 8, wherein calculating a loss value based on the scoring values of the relationship labels between the entities and the actually existing relationship labels between the entities comprises:
calculating a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities
Figure 775432DEST_PATH_IMAGE024
Wherein TH represents a threshold relationship label TH,
Figure 717981DEST_PATH_IMAGE025
a set of relationship tags representing the actual existence of entities,
Figure 421494DEST_PATH_IMAGE026
representing negative exemplar relational tag sets, logits refer to pairs of entities
Figure 107691DEST_PATH_IMAGE027
The scores of all of the relationship tags in (c),
Figure 845839DEST_PATH_IMAGE028
finger relationship label
Figure 224868DEST_PATH_IMAGE029
The score value of (a) is calculated,
Figure 415678DEST_PATH_IMAGE030
a label representing the relationship between the user and the user,
Figure 905565DEST_PATH_IMAGE031
representing relationship labels
Figure 967062DEST_PATH_IMAGE032
The score value of (a) is calculated,
Figure 516992DEST_PATH_IMAGE033
a score value representing a threshold relationship label TH.
CN202210749823.9A 2022-06-29 2022-06-29 Document level entity relation extraction method based on self-adaptive entity path perception Active CN114818682B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210749823.9A CN114818682B (en) 2022-06-29 2022-06-29 Document level entity relation extraction method based on self-adaptive entity path perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210749823.9A CN114818682B (en) 2022-06-29 2022-06-29 Document level entity relation extraction method based on self-adaptive entity path perception

Publications (2)

Publication Number Publication Date
CN114818682A true CN114818682A (en) 2022-07-29
CN114818682B CN114818682B (en) 2022-09-02

Family

ID=82523327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210749823.9A Active CN114818682B (en) 2022-06-29 2022-06-29 Document level entity relation extraction method based on self-adaptive entity path perception

Country Status (1)

Country Link
CN (1) CN114818682B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116522935A (en) * 2023-03-29 2023-08-01 北京德风新征程科技股份有限公司 Text data processing method, processing device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351749A1 (en) * 2016-06-03 2017-12-07 Microsoft Technology Licensing, Llc Relation extraction across sentence boundaries
US20210019370A1 (en) * 2019-07-19 2021-01-21 Siemens Aktiengesellschaft Neural relation extraction within and across sentence boundaries
CN114090792A (en) * 2021-11-25 2022-02-25 润联软件系统(深圳)有限公司 Document relation extraction method based on comparison learning and related equipment thereof
CN114298052A (en) * 2022-01-04 2022-04-08 中国人民解放军国防科技大学 Entity joint labeling relation extraction method and system based on probability graph
CN114398491A (en) * 2021-12-21 2022-04-26 成都量子矩阵科技有限公司 Semantic segmentation image entity relation reasoning method based on knowledge graph

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351749A1 (en) * 2016-06-03 2017-12-07 Microsoft Technology Licensing, Llc Relation extraction across sentence boundaries
US20210019370A1 (en) * 2019-07-19 2021-01-21 Siemens Aktiengesellschaft Neural relation extraction within and across sentence boundaries
CN114090792A (en) * 2021-11-25 2022-02-25 润联软件系统(深圳)有限公司 Document relation extraction method based on comparison learning and related equipment thereof
CN114398491A (en) * 2021-12-21 2022-04-26 成都量子矩阵科技有限公司 Semantic segmentation image entity relation reasoning method based on knowledge graph
CN114298052A (en) * 2022-01-04 2022-04-08 中国人民解放军国防科技大学 Entity joint labeling relation extraction method and system based on probability graph

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张翠等: "融合句法依存树注意力的关系抽取研究", 《广东通信技术》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116522935A (en) * 2023-03-29 2023-08-01 北京德风新征程科技股份有限公司 Text data processing method, processing device and electronic equipment
CN116522935B (en) * 2023-03-29 2024-03-29 北京德风新征程科技股份有限公司 Text data processing method, processing device and electronic equipment

Also Published As

Publication number Publication date
CN114818682B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Ribeiro et al. Anchors: High-precision model-agnostic explanations
US11227121B2 (en) Utilizing machine learning models to identify insights in a document
US20220050967A1 (en) Extracting definitions from documents utilizing definition-labeling-dependent machine learning background
Wang et al. ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning
US20120323558A1 (en) Method and apparatus for creating a predicting model
CN112396185B (en) Fact verification method, system, computer equipment and storage medium
CN112905801A (en) Event map-based travel prediction method, system, device and storage medium
CN113449204B (en) Social event classification method and device based on local aggregation graph attention network
US20190228297A1 (en) Artificial Intelligence Modelling Engine
Bogaerts et al. A framework for step-wise explaining how to solve constraint satisfaction problems
CN115455130B (en) Fusion method of social media data and movement track data
Liu et al. Interpretability of computational models for sentiment analysis
Xiong et al. DGI: recognition of textual entailment via dynamic gate matching
Okawa et al. Predicting opinion dynamics via sociologically-informed neural networks
CN114818682B (en) Document level entity relation extraction method based on self-adaptive entity path perception
CN110489730A (en) Text handling method, device, terminal and storage medium
CN112015890B (en) Method and device for generating movie script abstract
Xu et al. Collective vertex classification using recursive neural network
US20230080424A1 (en) Dynamic causal discovery in imitation learning
CN115599918B (en) Graph enhancement-based mutual learning text classification method and system
US20230111052A1 (en) Self-learning annotations to generate rules to be utilized by rule-based system
Yang et al. Generation-based parallel particle swarm optimization for adversarial text attacks
WO2023107207A1 (en) Automated notebook completion using sequence-to-sequence transformer
US20220051083A1 (en) Learning word representations via commonsense reasoning
Yigit et al. Assessing the impact of minor modifications on the interior structure of GRU: GRU1 and GRU2

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant