CN114818682A

CN114818682A - Document level entity relation extraction method based on self-adaptive entity path perception

Info

Publication number: CN114818682A
Application number: CN202210749823.9A
Authority: CN
Inventors: 蒋林承; 张俊丰; 张维琦; 赵超; 邓劲生; 曾道建; 谭真; 李硕豪; 乔凤才
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2022-06-29
Filing date: 2022-06-29
Publication date: 2022-07-29
Anticipated expiration: 2042-06-29
Also published as: CN114818682B

Abstract

The application relates to a document level entity relation extraction method based on self-adaptive entity path perception. The method comprises the following steps: constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics; predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities; calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model. By adopting the method, the accuracy of extracting the document-level entity relationship can be improved.

Description

Document level entity relation extraction method based on self-adaptive entity path perception

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a document level entity relationship extraction method and apparatus based on adaptive entity path sensing, a computer device, and a storage medium.

Background

Entity relationship extraction is a classic task in the field of information extraction, which aims at identifying semantic relationships between entities (concepts) contained in a given unstructured text and storing the results in a structured form of relational triples. If a given text "10 months in 2017, the head of today announces that 10 hundred million dollars are valued to acquire music short video platform music.ly", and a relationship triple "head of today, acquisition, music.ly" is obtained by extracting entity relationships. The entity relation extraction is used as a key technology of information extraction, can play an important role in multiple fields of natural language processing, and particularly has great research significance and wide application prospect in the era background of internet mass information. From the theoretical value aspect, the entity relation extraction relates to theories and methods of multiple disciplines such as machine learning, data mining, natural language processing and the like. From the application aspect, the entity relation extraction can be used for automatically constructing a large-scale knowledge base, particularly a knowledge graph, provides data support for the construction of an information retrieval and automatic question-answering system, and is also the basis of natural language understanding. The existing entity relation extraction work mainly focuses on sentence-level extraction and is limited to entity semantic relations in a single sentence text. However, in a real application scenario, description of semantic relationships of entities is very complex, a large number of relationships between entities are expressed by a plurality of sentences, and complex associations between a plurality of entities are shown. Statistics based on manual annotation data sampled from wikipedia indicate that at least 40% of entity semantic relationship facts can only be jointly captured from multiple sentences. Therefore, there is a need to push entity relationship extraction to a document level that is more consistent with real scenarios. Compared with sentence-level entity relationship extraction, document-level entity relationship extraction is more challenging, and requires more complex reasoning skills, such as logical reasoning, co-reference reasoning, general knowledge reasoning and the like. A document may contain multiple entities, each with multiple references in different contexts. In order to identify relationships between entities across sentences, it is necessary to be able to model complex interactions between multiple entities in a document and to leverage the multiple-mentioned context information of the entities, which is clearly beyond the capability of sentence-level relationship extraction methods.

At present, with the intensive research of graph neural networks, researchers try to model various semantic information in documents by using document graphs, wherein words, mentions, entities or sentences are used as nodes, and heuristic rules are utilized to connect the nodes into the document graphs. These methods focus on how to build better document graphs to retain more semantic information and how to better propagate information on the graph. With the help of the strong representation capability of the graph neural network, the method achieves good effect, but has the following problems: a) when the existing work is used for aggregating entity representations, multiple reference representations are aggregated without distinction, and then the reference representations are combined into a single global representation to perform semantic relation prediction with all other entities. In fact, since multiple mentions of an entity are in different contexts in a document, the role each node plays should be different when connecting different types of nodes. b) The graph neural network implicitly performs reasoning through node information propagation, in order to capture interaction of high-order information in a graph, a multilayer graph network structure (such as multiple graph convolutions) is often used in an overlapping mode, the representation of nodes in the same connected component in the graph tends to converge to a subspace irrelevant to input, and therefore the representation of the learned nodes is too smooth and not accurate enough.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a document level entity relationship extraction method, apparatus, computer device and storage medium based on adaptive entity path awareness, which can improve the accuracy of document level entity relationship extraction.

A document level entity relationship extraction method based on adaptive entity path perception, the method comprises the following steps:

acquiring a document to be extracted and the position of an entity in the document to be extracted;

performing data preprocessing on a document to be extracted according to a wordpieee algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;

constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network;

carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence;

constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;

updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of document level semantics;

predicting entity node representation of document level semantics according to a feedforward neural network to obtain a relation label score value between entities;

calculating a loss value according to the score value of the relationship labels between the entities and the relationship labels actually existing between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;

and extracting the document-level entity relationship according to the entity relationship extraction model.

In one embodiment, constructing a document graph according to the context characterization sequence of sentences and the positions of the entities in the document to be extracted includes:

and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the documents to be extracted, and constructing a document graph by using the initial representations and the mention nodes of the mention nodes, the entity nodes and the sentence nodes and the natural association connecting nodes of the entity nodes and the sentence nodes in the documents to be extracted.

In one embodiment, the natural association of the mention node, the entity node and the sentence node in the document to be extracted comprises the interconnection between the mention node and the mention node, the interconnection between the mention node and the sentence node, the interconnection between the mention node and the entity node and the interconnection between the entity node and the sentence node; the mention node, the entity node and the sentence node form a node set of the document graph; the natural association of the nodes, the entity nodes and the sentence nodes in the document to be extracted forms an edge set of the document graph; a reference node is an average of the contextual characterizations that reference the corresponding word in the document; the entity node is the average value represented by all the mentioned nodes corresponding to the entity; the sentence node is the average of the context tokens of all words in the sentence.

In one embodiment, updating the initial representation of the entity node from both the breadth and the depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprising:

neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, a message propagation algorithm is controlled from both the aspect of breadth and depth, and document level information is screened and aggregated by automatically learning self-adaptive paths related to entities on the document graph to obtain entity node representation of document level semantics.

In one embodiment, aggregating neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, modeling interaction between entity pairs, jointly controlling a message propagation algorithm from both an extent and a depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, and obtaining an entity node representation of document-level semantics, includes:

neighbor information in target node N hops in a document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, and for the aspect of breadth, in the aggregation process of neighbor information of each hop, a temporary aggregation representation of the breadth of the node is obtained according to a breadth self-adaptive mode;

according to the long and short memory network of the LSTM, the extent temporary aggregation representation of the nodes is performed by utilizing a plurality of gating mechanisms, document level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected to be transmitted, so that the entity node representation of the document level semantics is obtained.

In one embodiment, obtaining the extent temporary aggregation of the nodes according to the extent adaptive manner includes:

obtaining the temporary aggregation of the breadth of the node according to the breadth self-adaptive mode

Wherein the content of the first and second substances,

，

refers to the weight parameters of node u and neighbor v,

are learnable parameters that linearly transform the neighbor features,

representing nodes

Is shown in the drawing (a) and (b),

and

referring to the query and key matrices in the attention mechanism,

refers to a feed-forward neural network, and the neural network,

is a node

Is determined by the node of the neighbor node set,

representing nodes

Is shown in the drawing (a) and (b),

representing nodes

The neighbor nodes of (a) are,Trepresenting a transpose operation.

In one embodiment, in terms of depth, according to a long and short memory network of an LSTM, a plurality of gating mechanisms are used to temporarily aggregate and express the breadth of nodes to selectively store document-level high-order information related to the nodes, and only neighbors within a certain hop count are selected to be propagated to obtain an entity node expression of document-level semantics, including:

for the temporary aggregation representation of the breadth of the nodes, effective information in the temporary aggregation representation of the breadth of the nodes is added into a memory unit by using an update gate, invalid information in a previous layer of memory unit is filtered by a forget gate, an output gate controls the memory unit, and entity node representation of document level semantics is output.

In one embodiment, predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities includes:

predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain scores of all relation labels among entities

Wherein

,

,

And

are learnable parameters of classifiers in a feed-forward neural network,

refers to the activation function, d refers to the hidden dimension in the feedforward neural network, k is the number of labels,

representing different physical node representations

And

and splicing to obtain the characteristics of the entity pairs.

In one embodiment, calculating the loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities comprises:

calculating a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities as

Wherein TH represents a threshold relationship label TH,

a set of relationship tags representing the actual existence of entities,

representing negative exemplar relational tag sets, logits refer to pairs of entities

The scores of all of the relationship tags in (c),

finger relation label

The score value of (a) is calculated,

a label representing the relationship between the user and the user,

representing relationship labels

The score value of (a) is calculated,

label for representing threshold relation

The score value of (a).

A document level entity relationship extraction apparatus based on adaptive entity path awareness, the apparatus comprising:

the data preprocessing module is used for acquiring the document to be extracted and the position of the entity in the document to be extracted; performing data preprocessing on a document to be extracted according to a workprovider algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;

the document graph building module is used for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network; carrying out context coding on the preprocessed document by utilizing a pre-training language model to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;

the initial representation updating module is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantics;

the prediction module is used for predicting the entity node representation of the document level semantics according to the feedforward neural network to obtain the score value of the relation label between the entities;

the document level entity relation extraction module is used for calculating a loss value according to the score value of the relation labels among the entities and the relation labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by utilizing the loss value and a back propagation algorithm to obtain an entity relation extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:

The invention adopts a pre-training language model to model the complex interaction between different layers of information, learns the vocabulary representation of deep contextualization, models the semantic information in the document by constructing a fine document graph, then controls a message transmission algorithm from two aspects of breadth and depth, screens and aggregates the document-level information by learning the self-adaptive sensing path of node message transmission, selectively aggregates the effective document-level information of a target entity, solves the problem that the current entity relation extraction is limited to the entity relation in a sentence, also solves the problem that neighbor nodes and nodes are treated to represent smoothness without distinguishing when the message is transmitted in the document-level entity relation extraction method based on the document graph, and improves the performance of the document-level entity relation extraction, the method realizes the efficient extraction of the entity semantic relation, and provides data support and core algorithm technology for the large-scale knowledge base construction, information retrieval, automatic question answering system and natural language processing application of natural language understanding.

Drawings

FIG. 1 is a flowchart illustrating a document-level entity relationship extraction method based on adaptive entity path awareness according to an embodiment;

FIG. 2 is a diagram of adaptive entity path sensing in one embodiment;

FIG. 3 is a block diagram of an embodiment of an apparatus for document-level entity relationship extraction based on adaptive entity path awareness;

FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, there is provided a document level entity relationship extraction method based on adaptive entity path perception, including the following steps:

102, acquiring a document to be extracted and the position of an entity in the document to be extracted; performing data preprocessing on a document to be extracted according to a wordpieee algorithm to obtain a preprocessed document; the preprocessed document contains a plurality of sentences.

Step 102 of the present invention uses the document to be extracted

Representing that document D is composed of N sentences, wherein

Meaning that the ith sentence contains M words. The document is marked with an entity set containing P entities

Wherein

Refer to the ith entity in the document for Q co-referenced entity mentions, each appearing in a different context. Respectively inputting sentences in the document into the wordpienteThe word segmentation device carries out word segmentation, for example, after the ith sentence is segmented, the word segmentation is carried out

Wherein k is<And = M, obtaining the preprocessed document. And preprocessing the document to be extracted to facilitate the context coding of a pre-training language model.

104, constructing a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph includes an initial representation of the entity node.

In order to better model the semantics of an input document, inputting the preprocessed document after word segmentation into a pre-training language model BERT, mapping a sequence of the document after word segmentation into a low-dimensional real number vector containing context semantics through the pre-training language model BERT, wherein the input sequence corresponding to the ith sentence

Mapping to a context characterization sequence

Wherein

And d is the hidden dimension, typically 768. The pre-training language model BERT is adopted, and the BERT can be used for modeling complex interaction between different levels of information and learning deep contextualized vocabulary representation.

The method comprises the steps of calculating initial representations of a mention node, an entity node and a sentence node according to a context characterization sequence of a sentence and the position of an entity in a document to be extracted, and constructing a document graph by using the initial representations of the mention node, the entity node and the sentence node and the natural association connecting nodes of the entity node and the sentence node in the document to be extracted. Semantic information within a document is modeled by building a refined document map. The message transmission algorithm on the graph of the path perception of the self-adaptive entity is an improvement of the message transmission algorithm on the graph, the previous message transmission algorithm on the graph does not make a choice when carrying out node aggregation, all neighbor information of a target node is aggregated, and the message transmission algorithm is not controlled by the message from the breadth and the depth.

And 106, updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantics.

Neighbor information in a target node N hop is aggregated by using an adaptive entity path-aware on-graph message propagation algorithm, interaction between entity pairs is modeled, an adaptive path of an entity on a document graph is learned to promote representation of the entity node, and further entity node representation of document level semantics is obtained, the document level information is screened and aggregated by learning the adaptive sensing path of node message propagation, effective document level information of the target entity is selectively aggregated to capture more effective relation semantic information, and the problem that the current entity relation extraction is limited to intra-sentence entity relations is solved.

And 108, predicting the entity node representation of the document level semantics according to the feedforward neural network to obtain the score value of the relation label between the entities.

Entities generally refer to proper nouns or concepts such as names of people, places, organizations, and relationships refer to semantic relationships between entities, such as: "the head of this day announces that 10 hundred million dollars estimate values purchase music short video platform music.ly," and the relation triple "head of this day, purchase, music.ly" is obtained through entity relation extraction.

In order to predict the semantic relation included between the entity node pairs of the document-level semantics, head and tail entity representations included in the entity node representations of the document-level semantics are spliced to obtain entity pairs including the semantic relation, the entity pairs including the semantic relation are predicted according to a feedforward neural network, and the loss value is calculated by using the prediction result, namely the score of the relation labels between the entities and the actually existing relation labels between the entities, so that the deep neural network model can be trained, and an accurate entity relation extraction model can be obtained.

Step 110, calculating a loss value according to the score value of the relationship labels between the entities and the actually existing relationship labels between the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.

And calculating a loss value according to the score value of the relationship labels among the entities and the real relationship labels among the entities, wherein the real relationship labels among the entities are artificially pre-labeled real relationship labels among the entities, the loss value is minimized by random gradient descent, and the learnable parameters in the deep neural network model are updated layer by layer according to error back propagation. And when the loss function is converged in the optimization process, an entity relationship extraction model is obtained, and the entity relationship extraction model can be used for document-level entity relationship extraction after being stored.

In the document-level entity relation extraction method based on self-adaptive entity path perception, the invention adopts a pre-training language model to model complex interaction among different layers of information, learns deep contextualized vocabulary representation, models semantic information in a document by constructing a fine document graph, then controls a message transmission algorithm from two aspects of breadth and depth, screens and aggregates the document-level information by learning a self-adaptive perception path of node message transmission, selectively aggregates effective document-level information of a target entity, solves the problem that the current entity relation extraction is limited to intra-sentence entity relation, also solves the problem that neighbor nodes and nodes are treated to represent smoothness without distinguishing when the message is transmitted in the document-level entity relation extraction method based on the document graph, improves the performance of document-level entity relation extraction, realizes the efficient extraction of entity semantic relation, and a data support and core algorithm technology is provided for large-scale knowledge base construction, information retrieval and automatic question answering systems and natural language processing application of natural language understanding.

In one embodiment, constructing a document graph according to the context token sequence of the sentence and the position of the entity in the document to be extracted includes:

In one embodiment, the natural association of the mention node, the entity node and the sentence node in the document to be extracted comprises the interconnection between the mention node and the mention node, the interconnection between the mention node and the sentence node, the interconnection between the mention node and the entity node and the interconnection between the entity node and the sentence node; the mention node, the entity node and the sentence node form a node set of the document graph; the natural association of the nodes, the entity nodes and the sentence nodes in the document to be extracted forms an edge set of the document graph; a reference node is an average of the contextual representations that reference the corresponding word in the document; the entity node is the average value represented by all the mentioned nodes corresponding to the entity; the sentence node is the average of the context tokens of all words in the sentence.

In particular embodiments, references to nodes are intended to mean different references of each entity in the document. The representation of the reference node is the average of the hidden representations corresponding to the words contained in the reference, assuming a documentIf the total number of the nodes contains N references, the representation form of the reference nodes is

Wherein

Mention is made of type embedding of nodes. The form of the entity node is similar to that of the mentioned node, the representation form of the entity node is the average value of all mentioned representations corresponding to the entity, and the representation form of the entity node is that P entities are contained in one document

Wherein

Is a type embedding of the entity node. The expression form of the sentence nodes is the average value of hidden expressions corresponding to all contained words in the sentence sequence, and if a document contains T sentences, the expression form of the sentence nodes is

Wherein

Is type embedding of sentence nodes.

Obtaining the expression set of the nodes through the three types of node structures

Where d is the hidden dimension, for a total of N + R + T nodes.

After the node construction is completed, connecting nodes based on natural association between document node elements to form a document graph: a) mention node-mention node edge: and the mentions in the same sentence are connected with each other. b) Mention of node-sentence node edge: mentions are interconnected with the sentence in which they are located. c) Mention of node-entity node edges: the sentence entities mentioned as corresponding thereto are connected to each other. d) Entity node-sentence node edge: the entity is linked to the sentence that contains its mention. e) Sentence node-sentence node edge: all sentence nodes are connected to each other. It is noted that two entity nodes are not directly connected in the graph, and the purpose is to aggregate the relationship between the multi-hop intermediate node modeling entity pairs between the entity nodes by using the adaptive entity path-aware on-graph message propagation algorithm in the next step.

In conclusion, the constructed N + R + T mention, entity and sentence nodes are connected into the document graph by utilizing the natural association among different node elements of the document

Where V is the set of nodes and E is the set of edges.

according to the LSTM long and short memory network in the depth aspect, a plurality of gate control mechanisms are utilized to temporarily aggregate and express the breadth of the nodes, document level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected to be transmitted, so that entity node expression of document level semantics is obtained.

Wherein the content of the first and second substances,

，

refers to the weight parameters of node u and neighbor v,

are learnable parameters that linearly transform the neighbor features,

representing nodes

Is shown in the drawing (a) and (b),

and

referring to the query and key matrices in the attention mechanism,

refers to a feed-forward neural network, and the neural network,

is a node

Is determined by the node of the neighbor node set,

representing nodes

Is shown in the drawing (a) and (b),

representing nodes

Of the node(s) of (a) is,Trepresenting a transpose operation.

In a specific embodiment, for the breadth aspect, the information aggregation process of each hop neighbor by utilizing the multi-layer graph attention network is used for representing each node at the l +1 th layer

Firstly obtaining the temporary aggregation representation of the breadth of the node in a breadth self-adaptive mode shown by the following formula

：

By giving different rights to different neighboursAnd the first-order neighbor nodes are treated differently.

and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forgetting gate, controlling the memory unit by using an output gate, and outputting the entity node representation of the document-level semantics.

In the specific embodiment, the long-short memory of the LSTM is introduced, a plurality of gating mechanisms are utilized to store and update neighbor information of each hop, document-level high-order information related to nodes is selectively stored, only neighbors within a certain hop number are selected to be transmitted, and the problems of transmission overload and over-smoothness are effectively prevented. Node-based breadth-temporal aggregation representation

Updating door

Adding new valid information to memory cells

Middle and forget door

Then filter out the memory cells of the previous layer

The updating gate and the forgetting gate are matched with each other to play the roles of selective extraction and filtration when a farther neighbor is searched. Finally, an output gate

Control memory unit

Output nodeiTo (1) at+Level 1 node representation

. The calculation procedure is as follows.

Wherein

，

，

，

Respectively refer to learnable parameters of linear transformation corresponding to the forgetting gate, the updating gate, the output gate and the memory unit.

As shown in FIG. 2, the method determines a suitable subgraph by expanding the width (which hop neighbor is important) and the depth (the importance of the t-th hop neighbor) of each node, so as to learn the self-adaptive path of the information propagation of the entity on the document graph and selectively aggregate the effective documents of the target entityLevel information, which solves the problem that the extraction of entity relationship is limited to the entity relationship in the sentence. Obtaining a set of entity node representations containing document-level semantics from multiple iterations through the message propagation algorithm

。

Wherein

,

,

And

are learnable parameters of classifiers in a feed-forward neural network,

representing different physical node representations

And

and splicing to obtain the characteristics of the entity pairs.

In particular embodiments, first, to predict entity pairs

The included semantic relation represents the head and tail entities included in the entity pair

And

splicing to obtain characteristics of entity pairs

，

。

Then, the feedforward neural network is used to determine the characteristics of the entity pairs

Pair of computing entities

Score of all relationship tags in

：

Wherein

,

,

And

are learnable parameters of classifiers in a feed-forward neural network,

refers to the activation function, d refers to the hidden dimension in the feedforward neural network, and k is the number of labels.

In the prediction stage, normalization is carried out by using a nonlinear activation function sigmoid, and an entity pair can be obtained

Has a relationship label therein

The probability value of (a) is determined,

。

wherein

Finger-shaped

The score value of the middle relationship label r. sigmoid is a value which is converted into a value between 0 and 1, and the probability that the relationship label given by the model exists in the target entity pair is taken as the value, so that the interpretability of the relationship label score is enhanced. Since the model is self-learned, the final output range cannot be known in advance, for example, it is unclear whether the score of the label is 100 high or low, the range can be compressed within (0-1) by using sigmoid, and thus it is known that scores generally greater than 0.5 are very high.

Wherein TH represents a threshold relationship label TH,

a set of relationship tags representing the actual existence of entities,

The scores of all of the relationship tags in (c),

finger relationship label

The score value of (a) is calculated,

a label representing the relationship between the user and the user,

representing relationship labels

The score value of (a) is calculated,

label for representing threshold relation

The score value of (a).

In particular embodiments, to more efficiently handle the multi-tag problem, i.e., the same entity pair may contain multiple relationship tags, and the exemplar imbalance problem, i.e., most entity pairs are negative exemplars that do not contain any relationship tags. The invention adopts adaptive threshold loss as a loss function and optimizes the model parameters in an end-to-end mode. The adaptive threshold loss introduces an additional threshold relation label TH, and the optimization goal is the true existing positive sample relation label set between the entities

Is higher than a threshold class label TH, and negative sample relation label sets do not exist among entities

Is lower than a threshold class label TH, wherein a positive exemplar label refers to a relationship label that exists between entities and is actually present between entities, and a negative exemplar relationship label refers to a relationship that does not exist between entities. The loss function is calculated as follows:

wherein logis refers to an entity pair

Scores for all relationship tags in (c). In order to obtain the optimal model parameters, the invention calculates the loss value between the semantic relationship between the entities and the relationship labels actually existing between the entities through the loss function, minimizes the loss value L by using random gradient descent, and updates the learnable parameters in the model layer by layer according to error back propagation. And after the loss function is converged in the optimization process, the model is stored and then used for extracting the document-level entity relationship.

It should be understood that, although the steps in the flowchart of fig. 1 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 3, there is provided a document level entity relationship extraction apparatus based on adaptive entity path perception, including: a data preprocessing module 302, a build document graph module 304, an initial representation updating module 306, a prediction module 308, and a document-level entity relationship extraction module 310, wherein:

the data preprocessing module 302 is configured to obtain a document to be extracted and a position of an entity in the document to be extracted; performing data preprocessing on a document to be extracted according to a wordpieee algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;

a build document map module 304 for building a deep neural network model; the deep neural network model comprises a pre-training language model, an on-graph information propagation algorithm for self-adaptive entity path perception and a feedforward neural network, and the pre-training language model is used for carrying out context coding on the preprocessed document to obtain a context representation sequence of a sentence; constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted; the document graph comprises an initial representation of entity nodes;

the initial representation updating module 306 is used for updating the initial representation of the entity node of the document graph from two aspects of the breadth and the depth by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain the entity node representation of the document level semantic;

the prediction module 308 is configured to predict an entity node representation of document-level semantics according to a feed-forward neural network, so as to obtain a score value of a relationship label between entities;

the document-level entity relationship extraction module 310 is configured to calculate a loss value according to the score values of the relationship labels between the entities and the actually existing relationship labels between the entities, and iteratively optimize learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model; and extracting the document-level entity relationship according to the entity relationship extraction model.

In one embodiment, the construct document map module 304 is further configured to construct a document map according to the context token sequence of the sentence and the position of the entity in the document to be extracted, including:

and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the documents to be extracted, and constructing a document graph by using the initial representations and the mention nodes of the mention nodes, the entity nodes and the sentence nodes and the natural association connection nodes of the entity nodes and the sentence nodes in the documents to be extracted.

In one embodiment, the natural association of the mention node, the entity node and the sentence node in the document to be extracted comprises the interconnection between the mention node and the mention node, the interconnection between the mention node and the sentence node, the interconnection between the mention node and the entity node and the interconnection between the entity node and the sentence node; the mention node, the entity node and the sentence node form a node set of the document graph; the natural association of the nodes, the entity nodes and the sentence nodes in the document to be extracted forms an edge set of the document graph; a reference node is an average of the contextual characterizations that reference the corresponding word in the document; the entity node is the average value represented by all the mentioned nodes corresponding to the entity; the sentence node is the average of the context tokens for all words in the sentence.

In one embodiment, the initial representation updating module 306 is further configured to update the initial representation of the entity node in both width and depth of the document graph by using an adaptive entity path-aware on-graph message propagation algorithm, and obtain an entity node representation of document-level semantics, including:

In one embodiment, the initial representation updating module 306 is further configured to aggregate neighbor information in N hops of a target node in a document graph by using an adaptive entity path-aware on-graph message propagation algorithm, model interactions between entity pairs, jointly control a message propagation algorithm in both breadth and depth, filter and aggregate document-level information by automatically learning entity-related adaptive paths on the document graph, and obtain an entity node representation of document-level semantics, including:

In one embodiment, the initial representation updating module 306 is further configured to obtain a temporary aggregation of the extents of the nodes according to an extent adaptive manner, including:

Wherein the content of the first and second substances,

，

refers to the weight parameters of node u and neighbor v,

are learnable parameters that linearly transform the neighbor features,

representing nodes

Is shown in the drawing (a) and (b),

and

referring to the query and key matrices in the attention mechanism,

refers to a feed-forward neural network, and the neural network,

is a node

Set of neighbor nodes

Representing nodes

Is shown in the drawing (a) and (b),

representing nodes

The neighbor nodes of (a) are,Trepresenting a transpose operation.

In one embodiment, the initial representation updating module 306 is further configured to, in terms of depth, according to a long-short memory network of an LSTM, utilize multiple gating mechanisms to temporarily aggregate representations of the breadth of the nodes to selectively store document-level high-order information related to the nodes, and select only neighbors within a certain hop count for propagation, so as to obtain an entity node representation of document-level semantics, where the method includes:

In one embodiment, the prediction module 308 is further configured to predict the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities, including:

Wherein

,

,

And

are learnable parameters of classifiers in a feed-forward neural network,

representing different physical node representations

And

and splicing to obtain the characteristics of the entity pairs.

In one embodiment, the document-level entity relationship extraction module 310 is further configured to calculate a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities, including:

Wherein TH represents a threshold relationship label TH,

a set of relationship tags representing the actual existence of entities,

The scores of all of the relationship tags in (c),

finger relation label

The score value of (a) is obtained,

a label representing the relationship between the user and the user,

representing relationship labels

The score value of (a) is calculated,

label for representing threshold relation

The score value of (a).

For the specific definition of the document-level entity relationship extracting apparatus based on adaptive entity path sensing, refer to the above definition of the document-level entity relationship extracting method based on adaptive entity path sensing, and no further description is given here. The modules in the document level entity relation extraction device based on the adaptive entity path perception can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a document level entity relationship extraction method based on adaptive entity path perception. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.

In an embodiment, a computer storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by hardware instructions of a computer program, which may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A document level entity relation extraction method based on adaptive entity path perception is characterized by comprising the following steps:

performing data preprocessing on the document to be extracted according to a workprovider algorithm to obtain a preprocessed document; the preprocessed document comprises a plurality of sentences;

updating the initial representation of the entity node from two aspects of the breadth and the depth of the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception to obtain an entity node representation of document level semantics;

predicting the entity node representation of the document level semantics according to a feedforward neural network to obtain a relation label score value between entities;

calculating a loss value according to the score value of the relationship labels among the entities and the relationship labels actually existing among the entities, and iteratively optimizing learnable parameters in the deep neural network model by using the loss value and a back propagation algorithm to obtain an entity relationship extraction model;

2. The method according to claim 1, wherein constructing a document graph according to the context characterization sequence of the sentence and the position of the entity in the document to be extracted comprises:

and calculating initial representations of the mention nodes, the entity nodes and the sentence nodes according to the context characterization sequences of the sentences and the positions of the entities in the documents to be extracted, and constructing a document graph by using the initial representations of the mention nodes, the entity nodes and the sentence nodes and the natural associated connection nodes of the mention nodes, the entity nodes and the sentence nodes in the documents to be extracted.

3. The method according to claim 2, wherein the natural association of the mention node, the entity node and the sentence node in the document to be extracted comprises interconnection between the mention node and the mention node, interconnection between the mention node and the sentence node, interconnection between the mention node and the entity node and interconnection between the entity node and the sentence node; the mentioned nodes, the entity nodes and the sentence nodes form a node set of the document graph; the mentioned nodes, the entity nodes and the sentence nodes are naturally associated in the document to be extracted to form an edge set of the document graph; the reference node is an average value of context tokens referring to corresponding words in the document; the entity node is an average value represented by all the mentioned nodes corresponding to the entity; the sentence node is the average of the context tokens of all words in the sentence.

4. The method of claim 1, wherein updating the initial representation of the entity nodes from both breadth and depth to the document graph using an adaptive entity path-aware on-graph message propagation algorithm to obtain an entity node representation of document-level semantics, comprises:

neighbor information in target node N hops in the document graph is aggregated by using an on-graph message propagation algorithm of self-adaptive entity path perception, interaction between entity pairs is modeled, the message propagation algorithm is controlled from both the aspect of breadth and depth, and document level information is screened and aggregated by automatically learning self-adaptive paths related to entities on the document graph, so that entity node representation of document level semantics is obtained.

5. The method of claim 4, wherein aggregating neighbor information within N hops of a target node in the document graph using an adaptive entity path-aware on-graph message propagation algorithm, modeling interactions between pairs of entities, co-controlling the message propagation algorithm in both breadth and depth, screening and aggregating document-level information by automatically learning entity-related adaptive paths on the document graph, resulting in an entity node representation of document-level semantics, comprises:

aggregating neighbor information in N hops of a target node in the document graph by using an on-graph message propagation algorithm of self-adaptive entity path perception, modeling interaction between entity pairs, and obtaining temporary aggregation representation of the node in the aggregation process of neighbor information of each hop according to an extent self-adaptive mode in terms of extent;

and according to the long and short memory network of the LSTM in depth, a plurality of gate control mechanisms are utilized to temporarily aggregate and express the breadth of the nodes, document level high-order information related to the nodes is selectively stored, and only neighbors within a certain hop count are selected to be transmitted, so that entity node expression of document level semantics is obtained.

6. The method of claim 5, wherein obtaining the extent temporal aggregation of nodes according to an extent adaptive approach comprises:

Wherein the content of the first and second substances,

，

refers to the weight parameters of node u and neighbor v,

are learnable parameters that linearly transform the neighbor features,

representing nodes

Is shown in the drawing (a) and (b),

and

referring to the query and key matrices in the attention mechanism,

refers to a feed-forward neural network, and the neural network,

is a node

Is determined by the node of the neighbor node set,

representing nodes

Is shown in the drawing (a) and (b),

representing nodes

The neighbor nodes of (a) are,Trepresenting a transpose operation.

7. The method of claim 5, wherein the temporally aggregating the representation of the breadth of the nodes using a plurality of gating mechanisms to selectively save the document level high-order information related to the nodes according to the long and short memory network of the LSTM in terms of depth, and selecting only the neighbors within a certain hop count for propagation to obtain the entity node representation of the document level semantics, comprises:

and for the temporary aggregation representation of the breadth of the node, adding effective information in the temporary aggregation representation of the breadth of the node into a memory unit by using an update gate, filtering invalid information in a previous layer of memory unit by using a forget gate, controlling the memory unit by an output gate, and outputting the entity node representation of the document-level semantics.

8. The method of claim 1, wherein predicting the entity node representation of the document-level semantics according to a feed-forward neural network to obtain a relationship label score value between entities comprises:

Wherein

,

,

And

are learnable parameters of classifiers in a feed-forward neural network,

representing different physical node representations

And

and splicing to obtain the characteristics of the entity pairs.

9. The method of claim 8, wherein calculating a loss value based on the scoring values of the relationship labels between the entities and the actually existing relationship labels between the entities comprises:

calculating a loss value according to the score value of the relationship label between the entities and the actually existing relationship label between the entities