CN115982392A - Relationship graph method, device, equipment and medium for multiple entity and relationship extraction - Google Patents
- Publication number
- CN115982392A CN115982392A CN202310272383.7A CN202310272383A CN115982392A CN 115982392 A CN115982392 A CN 115982392A CN 202310272383 A CN202310272383 A CN 202310272383A CN 115982392 A CN115982392 A CN 115982392A
- Authority
- CN
- China
- Prior art keywords
- relationship
- word
- graph
- entity
- relation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Machine Translation (AREA)
Abstract
The invention relates to a method, a device, equipment and a medium for extracting a relation graph of multiple entities and relations, belonging to the technical field of named entity recognition in natural language processing. A bidirectional RNN is first applied to extract sequence features, then a bidirectional GCN is used to further extract regional dependency features, and the relation between each word pair and each word entity is predicted based on the extracted word features. A complete relation-weighted graph is built for each relation r, where the edge (w1, w2) in the relation-weighted graph has weight P_r(w1, w2). A bidirectional GCN is applied on each relation-weighted graph, the different degrees of influence of different relations are considered, and the results are aggregated into comprehensive word features. The invention considers the interaction between entities and relations and predicts it better, in particular for overlapping entities, thereby extracting more comprehensive and accurate triple data from unstructured text to complete the construction of a knowledge graph.
Description
Technical Field
The invention belongs to the technical field of recognition of named entities in natural language processing, and particularly relates to a method, a device, equipment and a medium for extracting a relationship graph of multiple entities and relationships.
Background
The relation extraction task is to judge the semantic relation between any two entities in a text according to the information in the text, where entities refer to specific proper nouns such as person names, place names, times, and the like. The result of the relation extraction task is usually returned in the form of an entity-relation triple, formally represented as (E1, R, E2), where E1 and E2 refer to the head entity and the tail entity respectively, and R represents the relation between them. For example, in the text "A was born in B", there is a person entity "A" and a place entity "B" with a "place of birth" relation between them, so the entity-relation triple ("A", "place of birth", "B") may be returned. In the relation extraction task, the types of entities and relations usually need to be predefined; in the above example, "place of birth" is a predefined relation type.
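As an illustrative sketch (not part of the claimed method), the (E1, R, E2) triple form can be represented as a simple typed record; the name `Triple` and the example values are hypothetical:

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """An entity-relation triple (E1, R, E2)."""
    head: str      # E1, the head entity
    relation: str  # R, the predefined relation type
    tail: str      # E2, the tail entity

# The sentence "A was born in B" yields one triple (illustrative values):
t = Triple(head="A", relation="place of birth", tail="B")
print(t.relation)
```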
There is a large amount of unstructured electronic text on the Web, including news articles, blogs, email communications, documents, chat logs, and the like. Such data is generally understood by converting unstructured text into structured text through the labeling of semantic information. However, because of the volume and heterogeneity of the data, what is often of interest is the relationships between entities. Current advanced named entity recognizers (NER) can automatically label data with high accuracy. In order to make correct annotations, the computer needs to know how to identify a piece of text with the semantic attributes of interest. Therefore, extracting semantic relations between entities in natural language text is a key step in implementing natural language understanding applications.
Traditionally, a pipelined approach first extracts entity mentions using a named entity recognizer and then predicts the relation between each pair of extracted entity mentions. By exploiting the close interaction between these two tasks, joint entity recognition and relation extraction models have been built. While showing the benefits of joint modeling, these complex methods are feature-based structured learning systems and therefore rely heavily on feature engineering. With the success of deep neural networks, automatic feature learning methods based on convolutional neural networks (CNN) have been applied to relation extraction. These methods use CNN, long short-term memory (LSTM), or tree-structured LSTM (TreeLSTM) networks to encode information about the word sequence between each pair of entity mentions, the shortest dependency path between the two mentions, or the smallest constituent subtree covering the two mentions. However, these methods are not end-to-end joint models of entities and relations: they assume that the entity mentions are given, and performance can be expected to drop significantly when a named entity recognizer is needed for real-world use. Another challenge of relation extraction is how to consider the interaction between relations, which is particularly important for overlapping relations, i.e., relations that share common entity mentions. For example, (refrigerator, stores, vegetables) can be inferred from (vegetables, suitable for storage, refrigerator); these two triples are said to exhibit entity-pair overlap. Alternatively, the former triple may be inferred from (cold room, temperature range, 0-10℃) and (0-10℃, suitable for storing, vegetables); the latter two are said to exhibit single-entity overlap.
While such inference is common in knowledge base population, it is particularly difficult for joint entity recognition and relation extraction models, whether through direct derivation or indirect evidence.
Extracting pairs of entities with semantic relations, i.e., relation triples like (cold room, stores, vegetables), is a core task of information extraction and enables the automatic construction of knowledge graphs from unstructured text, but three key aspects have not been adequately handled in a unified framework, namely: (1) end-to-end joint modeling of entity recognition and relation extraction; (2) prediction of overlapping relations, i.e., relations sharing a common entity mention; (3) consideration of the interaction between relations, in particular overlapping relations.
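The two overlap categories described above (entity-pair overlap and single-entity overlap) can be sketched as a small classifier over triples; the function name and the string labels are illustrative assumptions, not terminology from the patent:

```python
def overlap_type(t1, t2):
    """Classify how two (head, relation, tail) triples overlap.

    Returns "entity-pair" when the two triples share both entities
    (possibly in reversed roles), "single-entity" when exactly one
    entity is shared, and "none" otherwise.
    """
    e1 = {t1[0], t1[2]}
    e2 = {t2[0], t2[2]}
    shared = e1 & e2
    if len(shared) == 2:
        return "entity-pair"
    if len(shared) == 1:
        return "single-entity"
    return "none"

# (vegetables, suitable for storage, refrigerator) vs. the inferred
# (refrigerator, stores, vegetables): both entities are shared.
print(overlap_type(("vegetables", "suitable for storage", "refrigerator"),
                   ("refrigerator", "stores", "vegetables")))  # entity-pair
```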
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a deep-learning-based relation graph method for multiple entity and relation extraction, which considers the interaction between named entities and relations through a relation-weighted graph convolutional neural network (GCN) so as to better extract relations. Both the linear structure and the dependency structure are used to extract the sequential and regional features of the text, and furthermore a complete word graph is used to extract the implicit features of all word pairs in the text. Through this graph-based method, the prediction of overlapping relations is much improved over previous sequential methods.
The invention is realized by the following technical scheme:
a relationship graph method for multiple entity and relationship extraction includes the following steps:
step 1, firstly, a bidirectional recurrent neural network (RNN) is applied to extract sequence features, then a bidirectional GCN is used to further extract regional dependency features, and the relation between each word pair and each word entity is predicted based on the extracted word features; for word entities, all words are predicted according to the word features on the layer-1 LSTM, a categorical loss, denoted eloss1p, is applied, and the words are trained; for relation extraction, the dependency edges are removed and all word pairs are predicted; for each relation r, the model learns 3 weight matrices W_r^1, W_r^2, W_r^3 and calculates a relation score S:

S(w1, r, w2) = W_r^3 · ReLU(W_r^1 h_w1 ⊕ W_r^2 h_w2);
wherein S(w1, r, w2) represents the relation score of the word pair (w1, w2) under relation r, and (w1, w2) refers to the word pair; for the word pair (w1, w2), the relation scores of all word pairs, including the non-relation case denoted S(w1, NULL, w2), are calculated, and the Softmax function is applied to S(w1, r, w2) to obtain P_r(w1, w2), the probability of relation r for (w1, w2); h_w1 and h_w2 represent the hidden features of word w1 and word w2, respectively; ReLU denotes the rectified linear unit activation function, and ⊕ denotes vector concatenation;
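The scoring step above can be sketched in plain NumPy; the dimensions, the reading of ⊕ as concatenation, and all variable names are illustrative assumptions rather than the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8          # hidden feature size (assumed)
n_rel = 4      # number of relations; one index stands in for NULL (assumed)

# Per-relation weight matrices W_r^1, W_r^2, W_r^3 (illustrative shapes)
W1 = rng.normal(size=(n_rel, d, d))
W2 = rng.normal(size=(n_rel, d, d))
W3 = rng.normal(size=(n_rel, 2 * d))

def relation_scores(h_w1, h_w2):
    """S(w1, r, w2) = W_r^3 . ReLU(W_r^1 h_w1 (+) W_r^2 h_w2) for every r,
    reading (+) as concatenation of the two projected features."""
    scores = []
    for r in range(n_rel):
        feat = np.concatenate([W1[r] @ h_w1, W2[r] @ h_w2])
        scores.append(W3[r] @ np.maximum(feat, 0.0))  # ReLU, then dot with W_r^3
    return np.array(scores)

def relation_probs(h_w1, h_w2):
    """P_r(w1, w2): Softmax of the scores over all relations, incl. NULL."""
    s = relation_scores(h_w1, h_w2)
    e = np.exp(s - s.max())
    return e / e.sum()

h1, h2 = rng.normal(size=d), rng.normal(size=d)
p = relation_probs(h1, h2)
print(p.shape)  # (4,)
```

Because W_r^1 is applied to h_w1 and W_r^2 to h_w2, the score is naturally direction-sensitive: S(w1, r, w2) differs from S(w2, r, w1).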
step 2, a complete relation-weighted graph is built for each relation r, where the edge (w1, w2) in the relation-weighted graph has weight P_r(w1, w2);
A bidirectional GCN is adopted on each relation-weighted graph, the different degrees of influence of different relations are considered, and the results are aggregated into comprehensive word features, as follows:
h_u^(l+1) = ReLU( Σ_{v∈V} Σ_{r∈R} P_r(u, v) × ( W_r^(l) h_v^(l) + b_r^(l) ) )

wherein h_u^(l+1) denotes the hidden feature of word u at layer l+1, and h_v^(l) denotes the hidden feature of word v at layer l; P_r(u, v), the weight of an edge, is the probability of relation r for the word pair (u, v); W_r^(l) and b_r^(l) denote the GCN weight and bias under relation r at layer l; V includes all words and R includes all relations; the complete bidirectional GCN also takes both incoming and outgoing edges into account;
the bidirectional GCN thus further considers the propagation of the relation weights and extracts more sufficient features for each word; named entity and relation classification are then performed again using the updated word features, so as to obtain more robust relation predictions.
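The step-2 aggregation described above can be sketched as follows; the sizes, the random edge weights standing in for the step-1 probabilities P_r(u, v), and the function name are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, n_rel, d = 5, 3, 4          # toy sizes (assumed)

# P[r, u, v]: edge weight P_r(u, v); here random values normalized so the
# weights over relations sum to 1 per pair, mimicking a Softmax output.
P = rng.random(size=(n_rel, n_words, n_words))
P /= P.sum(axis=0)

W = rng.normal(size=(n_rel, d, d))   # W_r^(l), per-relation GCN weights
b = rng.normal(size=(n_rel, d))      # b_r^(l), per-relation biases
h = rng.normal(size=(n_words, d))    # hidden features h_v^(l)

def relation_weighted_gcn(h):
    """h_u^(l+1) = ReLU( sum_v sum_r P_r(u,v) * (W_r h_v + b_r) )."""
    out = np.zeros_like(h)
    for u in range(n_words):
        acc = np.zeros(d)
        for r in range(n_rel):
            for v in range(n_words):
                acc += P[r, u, v] * (W[r] @ h[v] + b[r])
        out[u] = np.maximum(acc, 0.0)  # ReLU
    return out

h_next = relation_weighted_gcn(h)
print(h_next.shape)  # (5, 4)
```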
The invention also provides a device for extracting the relation graph of the multiple entities and the relations, which comprises a data processing module, wherein the data processing module runs the method for extracting the relation graph of the multiple entities and the relations.
The invention also provides a computer apparatus comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the multiple entity and relationship extracted relationship graph method.
The present invention also provides a computer readable storage medium having stored thereon a computer program adapted to be loaded by a processor and to execute the relationship graph method of the multiple entities and relationship extraction.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention provides an end-to-end relation extraction model that jointly learns named entities and relations based on a GCN. By combining the RNN and the GCN, not only the sequential features but also the regional dependency features of each word are extracted. The invention uses the linear structure and the dependency structure to extract the sequential and regional features of the text, further extracts the implicit features among all word pairs of the text using a complete word graph, predicts a relation for each word pair, and solves the problem of entity overlap; in addition, the invention introduces a new relation-weighted GCN that incorporates the interaction between named entities and relations.
(2) The invention provides an end-to-end joint neural network model for entity recognition and relation extraction, which handles all three key aspects of relation extraction for the first time. In the step-1 prediction, the automatic extraction of hidden features for each word is learned by stacking a BiLSTM sentence encoder and a GCN dependency tree encoder. The entity mentions are then tagged and the relation triples connecting the mentions are predicted. In order to better predict the relation triples while considering the interaction between relations, the invention adds a new relation-weighted GCN in step 2. Step 1, under the dual guidance of entity loss and relation loss, extracts node hidden features along the dependency links and builds a brand-new fully connected graph with relation-weighted edges; then, by operating on this fully connected graph, the step-2 GCN effectively considers the interaction of entities and relations before the final classification of each edge.
(3) In the prior art, a pipeline method is used for extraction: entity recognition is first performed on the sentence, the recognized entities are then combined pairwise, relation classification is then performed, and finally the entity-relation triples are produced. In this method, errors of the entity recognition module affect the performance of the subsequent relation classification, and entities without relations introduce redundant information, which raises the error rate. The joint entity recognition and relation extraction model used in the invention can perform entity recognition and relation extraction on a sentence simultaneously to obtain the related entity triples, while taking the interaction between entities and relations into account.
(4) The purpose of information extraction (IE) is to extract pairs of entities and their relations from a given sentence or document. IE is an important task because it facilitates the automatic construction of a knowledge graph from unstructured text. With the success of deep neural networks, neural-network-based methods have been applied to information extraction. However, these methods tend to ignore the non-local and non-sequential contextual information of the input text. In addition, the prediction of overlapping relations, i.e., relations of entity pairs sharing the same entity, cannot be properly solved. The method of the present application predicts the interaction between entities and relations well, in particular for overlapping entities, so that more comprehensive and accurate triple data can be extracted from unstructured text to complete the construction of a knowledge graph.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a relational weighting graph.
Detailed Description
The present invention will be further described with reference to specific embodiments thereof, it being understood that the embodiments described are only a few, and not all, of the embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment 1, a method for relationship graph extraction of multiple entities and relationships, as shown in fig. 1, the method includes the following steps:
Step 1: in order to take both sequential and regional dependencies into account, the invention first applies a bidirectional RNN to extract sequential features and then uses a bidirectional GCN to further extract regional dependency features. The relation between each word pair and each word entity is then predicted according to the extracted word features.
This embodiment uses the well-known LSTM as the bidirectional RNN unit. For each word, its word embedding and part-of-speech (POS) embedding are taken as the initial features:
h_u^(0) = Word(u) ⊕ POS(u)

wherein h_u^(0) denotes the initial feature of word u, and Word(u) and POS(u) are the word embedding and the part-of-speech embedding of the word, respectively. Pre-trained GloVe word embeddings are used, and the POS embeddings are randomly initialized when training the entire GraphRel model.
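The initial-feature construction can be sketched as below; the embedding sizes and the lookup tables are hypothetical stand-ins for the pre-trained GloVe vectors and the randomly initialized POS embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
word_dim, pos_dim = 6, 3                          # assumed embedding sizes
word_emb = {"apple": rng.normal(size=word_dim)}   # stand-in for GloVe vectors
pos_emb = {"NOUN": rng.normal(size=pos_dim)}      # randomly initialized POS embeddings

def initial_feature(word, pos):
    """h_u^(0) = Word(u) (+) POS(u): the word embedding concatenated
    with the part-of-speech embedding."""
    return np.concatenate([word_emb[word], pos_emb[pos]])

h0 = initial_feature("apple", "NOUN")
print(h0.shape)  # (9,)
```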
Since the original input sentence is a sequence, it has no inherent graph structure. The dependency tree is therefore used as the adjacency matrix of the input sentence, and the GCN is used to extract the regional dependency features. The original GCN was designed for undirected graphs; to consider both incoming and outgoing word features, a bidirectional GCN is implemented as:
→h_u^(l+1) = ReLU( Σ_{v ∈ →N(u)} ( →W^(l) h_v^(l) + →b^(l) ) )
←h_u^(l+1) = ReLU( Σ_{v ∈ ←N(u)} ( ←W^(l) h_v^(l) + ←b^(l) ) )
h_u^(l+1) = →h_u^(l+1) ⊕ ←h_u^(l+1)

wherein h_v^(l) represents the hidden feature of word v at layer l, and →h_u^(l+1) and ←h_u^(l+1) represent the outgoing and incoming hidden features of word u at layer l+1, respectively. →N(u) contains all words to which edges go out from word u, and ←N(u) contains all words from which edges come into word u; both include the word u itself. W and b are all learnable convolution weights: →W^(l), →b^(l) and ←W^(l), ←b^(l) denote the outgoing and incoming weights at layer l, respectively. The outgoing and incoming word features are concatenated as the final word feature.
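A minimal NumPy sketch of one such bidirectional GCN layer over a toy dependency adjacency matrix follows; the sizes, the chain-shaped dependency structure, and the weight initialization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 4, 5                                  # sentence length, feature size (assumed)
A = np.zeros((n, n))                         # dependency arcs u -> v
A[0, 1] = A[1, 2] = A[2, 3] = 1.0            # a toy dependency chain
Wo, bo = rng.normal(size=(d, d)), rng.normal(size=d)   # outgoing weights/bias
Wi, bi = rng.normal(size=(d, d)), rng.normal(size=d)   # incoming weights/bias
h = rng.normal(size=(n, d))                  # layer-l hidden features

def bi_gcn_layer(h):
    """One bidirectional GCN layer: aggregate over outgoing and incoming
    dependency edges separately (each including the word itself), apply
    ReLU, then concatenate the two directions as the final word feature."""
    A_out = A + np.eye(n)        # self-loop: include the word itself
    A_in = A.T + np.eye(n)
    h_out = np.maximum(A_out @ (h @ Wo.T) + bo, 0.0)  # ReLU
    h_in = np.maximum(A_in @ (h @ Wi.T) + bi, 0.0)
    return np.concatenate([h_out, h_in], axis=1)

print(bi_gcn_layer(h).shape)  # (4, 10)
```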
Word entities are predicted using the word features extracted by the bidirectional RNN and the bidirectional GCN, and the relation between each word pair is extracted. For word entities, all words are predicted based on the word features on the layer-1 LSTM and trained with a categorical loss, denoted eloss1p.
For relation extraction, the dependency edges are removed and all word pairs are predicted; for each relation r, the model learns 3 weight matrices W_r^1, W_r^2, W_r^3 and calculates the relation score S:

S(w1, r, w2) = W_r^3 · ReLU(W_r^1 h_w1 ⊕ W_r^2 h_w2)
wherein S(w1, r, w2) represents the relation score of the word pair (w1, w2) under relation r, and h_w1 and h_w2 represent the hidden features of word w1 and word w2, respectively. Note that S(w1, r, w2) should be different from S(w2, r, w1). For the word pair (w1, w2), a relation score is calculated for all relations including the non-relation, denoted S(w1, NULL, w2). The Softmax function is applied to S(w1, r, w2) to obtain P_r(w1, w2), the probability of relation r for (w1, w2).
Since this embodiment extracts a relation for each word pair, the design imposes no limit on the number of triples. By examining the relation of every word pair, the method of this embodiment identifies as many relations as possible. For P_r(w1, w2), a relation loss can also be calculated here and is denoted rloss1p. Note that while neither eloss1p nor rloss1p can be used as a final prediction, they are a good aid in training step 1.
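A plausible form for the relation loss over the per-pair probabilities P_r(w1, w2) is a categorical cross-entropy; this sketch is an assumption standing in for the rloss1p term, with illustrative numbers:

```python
import numpy as np

def relation_loss(P, gold):
    """Categorical cross-entropy over per-pair relation probabilities,
    a stand-in for the rloss1p term.
    P: (n_pairs, n_rel) probabilities; gold: gold relation indices."""
    eps = 1e-12  # guards against log(0)
    return float(-np.mean(np.log(P[np.arange(len(gold)), gold] + eps)))

P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
gold = np.array([0, 1])
loss = relation_loss(P, gold)
print(loss)
```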
The entities and relations extracted in step 1 do not take each other into account. In order to consider the interaction between named entities and relations, and to take into account the implicit features between all word pairs in the text, this embodiment proposes a novel relation-weighted GCN, applied in step 2 for further extraction.
After step 1, a complete relation-weighted graph is built for each relation r, where the edge (w1, w2) of the relation-weighted graph has weight P_r(w1, w2), as shown in FIG. 2.
A bidirectional GCN is adopted on each relation graph, the different degrees of influence of different relations are considered, and the results are aggregated into comprehensive word features. This process is:
h_u^(l+1) = ReLU( Σ_{v∈V} Σ_{r∈R} P_r(u, v) × ( W_r^(l) h_v^(l) + b_r^(l) ) )

wherein h_u^(l+1) denotes the hidden feature of word u at layer l+1, and h_u^(l) and h_v^(l) represent the hidden features of word u and word v at layer l, respectively. P_r(u, v), the weight of an edge, is the probability of relation r for the word pair (u, v); W_r^(l) and b_r^(l) denote the GCN weight and bias under relation r at layer l; V includes all words and R contains all relations. The complete bidirectional GCN also takes both incoming and outgoing edges into account.
The bidirectional GCN thus further considers the propagation of the relation weights and extracts more sufficient features for each word. Named entity and relation classification are performed again using the updated word features to obtain more robust relation predictions.
Embodiment 2: the sentences in Table 1 are studied using the method of Embodiment 1.
TABLE 1 case study of the method of the invention
The first sentence is a simple example, and both step 1 and step 2 extract it accurately. In the second case, although "it" does not belong to a named entity, it carries the hidden semantics of "apple"; thus, step 2 can further predict (apple, function, promoting digestion). The third case is the single-entity-overlap class, where step 2 finds that the two entity mentions refer to the same fruit, longan, so the triple (longan, storage location, cold room) can be extracted.
Claims (4)
1. A method for relational graph extraction of multiple entities and relationships, the method comprising the steps of:
step 1, firstly, a bidirectional recurrent neural network is applied to extract sequence features, then a bidirectional graph convolutional neural network is used to further extract regional dependency features, and the relation between each word pair and each word entity is predicted based on the extracted word features; for word entities, all words are predicted according to the word features on the layer-1 LSTM, a categorical loss, denoted eloss1p, is applied, and the words are trained;
for relation extraction, the dependency edges are removed and all word pairs are predicted; for each relation r, the model learns 3 weight matrices W_r^1, W_r^2, W_r^3 and calculates a relation score S:

S(w1, r, w2) = W_r^3 · ReLU(W_r^1 h_w1 ⊕ W_r^2 h_w2);
wherein S(w1, r, w2) represents the relation score of the word pair (w1, w2) under relation r, and (w1, w2) refers to the word pair; for the word pair (w1, w2), the relation scores of all word pairs, including the non-relation case denoted S(w1, NULL, w2), are calculated, and the Softmax function is applied to S(w1, r, w2) to obtain P_r(w1, w2), the probability of relation r for (w1, w2); h_w1 and h_w2 represent the hidden features of word w1 and word w2, respectively; ReLU represents the rectified linear unit activation function, and ⊕ denotes vector concatenation;
step 2, a complete relation-weighted graph is built for each relation r, where the edge (w1, w2) in the relation-weighted graph has weight P_r(w1, w2); a bidirectional graph convolutional neural network is adopted on each relation-weighted graph, the different degrees of influence of different relations are considered, and the results are aggregated into comprehensive word features, as follows:
h_u^(l+1) = ReLU( Σ_{v∈V} Σ_{r∈R} P_r(u, v) × ( W_r^(l) h_v^(l) + b_r^(l) ) )

wherein h_u^(l+1) denotes the hidden feature of word u at layer l+1, h_u^(l) and h_v^(l) denote the hidden features of word u and word v at layer l, P_r(u, v), the weight of an edge, is the probability of relation r for the word pair (u, v), W_r^(l) and b_r^(l) denote the graph convolutional neural network weight and bias under relation r at layer l, V includes all words, and R includes all relations; the complete bidirectional graph convolutional neural network also takes both incoming and outgoing edges into account; the bidirectional graph convolutional neural network further considers the propagation of the relation weights and extracts more sufficient features for each word, and named entity and relation classification are performed again using the updated word features so as to obtain more robust relation predictions.
2. An apparatus for multiple entity and relationship extraction relationship graph, the apparatus comprising a data processing module, the data processing module operating the multiple entity and relationship extraction relationship graph method of claim 1.
3. A computer arrangement, characterized in that the arrangement comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of the multiple entity and relationship extracted relationship graph method of claim 1.
4. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor and to execute the relationship graph method of multiple entities and relationship extraction as claimed in claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310272383.7A CN115982392A (en) | 2023-03-21 | 2023-03-21 | Relationship graph method, device, equipment and medium for multiple entity and relationship extraction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115982392A true CN115982392A (en) | 2023-04-18 |
Family
ID=85958592
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310272383.7A Withdrawn CN115982392A (en) | 2023-03-21 | 2023-03-21 | Relationship graph method, device, equipment and medium for multiple entity and relationship extraction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115982392A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200065374A1 (en) * | 2018-08-23 | 2020-02-27 | Shenzhen Keya Medical Technology Corporation | Method and system for joint named entity recognition and relation extraction using convolutional neural network |
CN113590784A (en) * | 2021-07-27 | 2021-11-02 | 中国科学技术大学 | Triple information extraction method and device, electronic equipment and storage medium |
- 2023-03-21: CN application CN202310272383.7A filed; patent CN115982392A/en; status: not_active (Withdrawn)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200065374A1 (en) * | 2018-08-23 | 2020-02-27 | Shenzhen Keya Medical Technology Corporation | Method and system for joint named entity recognition and relation extraction using convolutional neural network |
CN113590784A (en) * | 2021-07-27 | 2021-11-02 | 中国科学技术大学 | Triple information extraction method and device, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
XU DAN: "Research and Implementation of Entity Relation Extraction in Knowledge Graphs Based on GCN Technology", CNKI China Journal Full-text Database *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20230418 |