CN113032571A - Entity and relationship extraction method - Google Patents


Info

Publication number: CN113032571A
Application number: CN202110420639.5A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed)
Inventors: 程良伦, 牛伟才, 张伟文
Current and original assignee: Guangdong University of Technology
Other languages: Chinese (zh)
Application filed by Guangdong University of Technology
Priority to CN202110420639.5A, published as CN113032571A
Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis


Abstract

The invention discloses an entity and relation extraction method for addressing the poor entity-recognition and relation-extraction performance of prior-art methods. The method comprises the following steps: extracting multi-granularity feature representation information of each word in a preset text; extracting first node feature representation information of each word based on the multi-granularity feature representation information; constructing adaptive adjacency matrices for a plurality of preset relation types; extracting second node feature representation information of each word according to the adaptive adjacency matrices and the first node feature representation information; determining an entity type of each word based on its second node feature representation information; and calculating the relation category between any two words based on the second node feature representation information of each word.

Description

Entity and relationship extraction method
Technical Field
The invention relates to the technical field of text processing, in particular to an entity and relationship extraction method.
Background
Constructing large-scale knowledge graphs enables better service to fields such as text generation, question answering systems and recommendation systems. However, the triple information required by a knowledge graph is often hidden in massive unstructured Internet text, and labeling it purely by hand wastes a great deal of money and human resources. Therefore, it is important to extract correct entity and relation triples from large amounts of unstructured text.
Entity recognition and relation extraction, as fundamental tasks in natural language processing, have received wide attention from researchers. The earliest work treated the extraction of relational triples as two pipelined subtasks: first, all entities in a sentence are identified; then, the extracted entity pairs are classified into relations according to the semantic information of the sentence.
However, this approach often causes error propagation and accumulation, because if an entity is not correctly identified, the relation classification is necessarily misled by the erroneous information. Moreover, the method ignores the interaction between the two subtasks and therefore loses much important information. To address this problem, the prior art proposes entity-relation joint extraction methods based on feature engineering. These models aim to establish interaction between entities and relations; although they exploit entity and relation information simultaneously, their effectiveness depends heavily on hand-crafted feature engineering. With the development of neural networks, neural-network models are increasingly used in practical applications to automatically learn feature representations of sentences and thereby extract the relational triples implicit in text. Despite the progress made by these models, they still cannot cope with relational structure in complex contexts. For example, when two triples share the same entity, the model must accurately identify the related triples from the semantic information of the context; triples sharing an entity often bear a certain relationship to each other, e.g., there is a strong semantic relationship between the triple (Michael Jackson, BornIn, America) and the triple (State of Indiana, LocatedIn, America). However, existing joint extraction models ignore the semantic interaction between words under different relations, which loses much useful information and results in poor entity recognition and relation extraction.
Disclosure of Invention
The invention provides an entity and relationship extraction method, which is used for solving the technical problem that the entity recognition and relationship extraction effects of words in the prior art are poor.
The invention provides an entity and relationship extraction method, which comprises the following steps:
extracting multi-granularity characteristic representation information of each word in a preset text;
extracting first node feature representation information of the word based on the multi-granularity feature representation information;
constructing adaptive adjacency matrixes of various preset relation types;
extracting second node characteristic representation information of the word according to the self-adaptive adjacency matrix and the first node characteristic representation information;
determining an entity type of each of the words based on the second node feature representation information of each of the words;
calculating a relationship category between any two words based on the second node characteristic representation information of each of the words.
Optionally, the step of extracting multi-granularity feature representation information of each word in the preset text includes:
calculating hidden state representation information of each word in a preset text;
extracting character-level word features and word-level part-of-speech features of each word;
and generating multi-granularity characteristic representation information of each word by adopting the hidden state representation information, the character-level word characteristics and the word-level part-of-speech characteristics.
Optionally, the step of extracting the first node feature representation information of the word based on the multi-granularity feature representation information includes:
creating an adjacency matrix for the word;
extracting incoming node representation information and outgoing node representation information of the word by adopting the adjacency matrix and the multi-granularity feature representation information;
generating first node characteristic representation information of the word using the incoming node representation information and the outgoing node representation information.
Optionally, the step of constructing an adaptive adjacency matrix of multiple preset relationship types includes:
obtaining sentence characteristics of sentences in which each word is located;
acquiring a hidden state dimension and a preset input dimension of the sentence characteristics;
calculating dependency weight initial hidden representation information of the words by adopting the sentence characteristics, the hidden state dimension and the input dimension;
adopting the initial hidden representation information to respectively calculate the respective corresponding dependency information of the words under various preset relationship types;
and constructing self-adaptive adjacency matrixes respectively corresponding to a plurality of preset relation types based on the dependency information.
Optionally, the step of extracting second node feature representation information of the word according to the adaptive adjacency matrix and the first node feature representation information includes:
calculating forward characteristic representation information and backward characteristic representation information of the word by adopting the self-adaptive adjacency matrix and the first node characteristic representation information;
and generating second node feature representation information of the word by using the forward feature representation information and the backward feature representation information.
Optionally, the dependency information includes a query vector and a key-value vector.
Optionally, the step of extracting the character-level word features and the word-level part-of-speech features of each word includes:
and extracting character-level word features and word-level part-of-speech features of each word by adopting a preset bidirectional long short-term memory network.
According to the technical scheme, the invention has the following advantages: the invention provides a method for extracting entities and relations in a combined manner, and particularly discloses the following steps: extracting multi-granularity characteristic representation information of each word in a preset text; extracting first node feature representation information of the word based on the multi-granularity feature representation information; constructing adaptive adjacency matrixes of various preset relation types; extracting second node characteristic representation information of the word according to the self-adaptive adjacency matrix and the first node characteristic representation information; determining an entity type of each word based on the second node feature representation information of each word; a relation category between any two words is calculated based on the second node characteristic representation information of each word. According to the method, the self-adaptive adjacency matrixes of various preset relationship types are constructed, the second node characteristic representation information of the words is calculated based on the self-adaptive adjacency matrixes under different relationship types, so that semantic interaction between the words under different relationship types is captured, and the recognition effect of the entity types of the words and the relationship types among different words is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is an example of different categories of sentences;
FIG. 2 is a flowchart illustrating steps of a method for extracting entities and relationships according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating steps of an entity and relationship extraction method according to another embodiment of the present invention.
Detailed Description
In practical applications, early joint extraction models cannot extract overlapping triples from plain text. As shown in fig. 1, sentences are classified into three types according to the degree of relation overlap: Normal, single entity overlap (SingleEntityOverlap) and entity pair overlap (EntityPairOverlap). The first sentence is of the Normal type, i.e., its triples share no entities. The second sentence is a single-entity-overlap case, i.e., two triples share one entity. The third sentence is an entity-pair-overlap case, i.e., two entities have multiple relations between them. The triple-overlap problem directly plagues sequence-labeling-based joint extraction schemes, which assume only one label per token. To solve the problem of overlapping triples, researchers have proposed a copy mechanism that repeatedly copies entities from the sentence, but its performance is poor because not all entities can be copied at once. To enhance the interaction between entities and relations, researchers have proposed using graph neural networks to model text as a relation-weighted graph and to predict entities and relations simultaneously. However, previous studies have ignored the semantic interaction between words under different relations, which loses much useful information.
In view of this, embodiments of the present invention provide an entity and relationship extraction method, which is used to solve the technical problem that the entity recognition and relationship extraction effects of the word in the prior art are poor.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 2, fig. 2 is a flowchart illustrating steps of an entity and relationship extraction method according to an embodiment of the present invention.
The entity and relationship extraction method provided by the invention specifically comprises the following steps:
step 201, extracting multi-granularity characteristic representation information of each word in a preset text;
in the embodiment of the invention, multi-granularity feature representation information of each word in a sentence can be extracted through a feature mixing layer, which comprises a global context feature encoder, a character feature encoder and a part-of-speech feature encoder.
Step 202, extracting first node feature representation information of a word based on multi-granularity feature representation information;
after obtaining the multi-granularity feature representation information of each word, the feature mixing layer and a first-stage BiGCN (bidirectional graph convolutional network) layer may be stacked to automatically obtain the first node feature representation information of each word.
In a BiGCN, when computing the features of the current node, both the features along edges pointing into the node (incoming features) and the features along edges pointing out of the node (outgoing features) are aggregated.
Step 203, constructing adaptive adjacency matrixes of various preset relationship types;
in order to enhance information interaction among all parts of the triples, the embodiment of the invention provides a node-aware attention mechanism to acquire hidden association information among words. Thus, a complete word association matrix can be established and is more suitable for real data distribution through supervised learning.
Specifically, the dependency weights for words in different relationship spaces are different. To more flexibly predict overlapping triples, the node-aware attention mechanism of embodiments of the present invention dynamically learns the strength of correlation between different words in each relationship space in an end-to-end manner. The original dependency tree is converted into a plurality of fully connected graphs. Each graph contains semantic information under different relation spaces, and a dependency relation adaptive adjacency matrix under each relation type is further constructed.
Step 204, extracting second node characteristic representation information of the word according to the self-adaptive adjacency matrix and the first node characteristic representation information;
after the adaptive adjacency matrix of each predefined relation is acquired, the first node feature representation information is used as initial sentence input information of the BiGCN feature extractor in the second stage. The feature information of the sentences is mapped into different relation spaces, and the dependency correlation strength between the nodes is dynamically learned by using the adaptive adjacency matrix. And then fusing the node dependency relationship information and the first node feature representation information extracted by the BiGCN under all relationship spaces to obtain second node feature representation information of each word.
Step 205, determining the entity type of each word based on the second node characteristic representation information of each word;
and step 206, calculating the relation category between any two words based on the second node characteristic representation information of each word.
In the embodiment of the invention, the entity types and the relation categories in the text can be extracted simultaneously by utilizing the feature mixing layer and the data extracted by the two-stage BiGCN.
According to the method, the self-adaptive adjacency matrixes of various preset relationship types are constructed, the second node characteristic representation information of the words is calculated based on the self-adaptive adjacency matrixes under different relationship types, so that semantic interaction between the words under different relationship types is captured, and the recognition effect of the entity types of the words and the relationship types among different words is improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating steps of an entity and relationship extraction method according to another embodiment of the present invention. The method specifically comprises the following steps:
step 301, calculating hidden state representation information of each word in a preset text;
in an embodiment of the present invention, context information may first be encoded using a pre-trained BERT model. BERT is a serial stack of N identical Transformer modules, each of which can be represented as Trans(x), where x denotes the input vector of a sentence. The BERT model is used as follows:

h^0 = W_s x + W_p

h^l = Trans(h^{l-1}), l ∈ [1, N]

where x indexes the (sub-)words in the vocabulary, W_s is the sub-word embedding matrix, W_p is the position embedding matrix of the input sentence, and h^l is the hidden state representation information of the l-th layer.
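The input-embedding step above can be sketched in a few lines. This is a toy numpy illustration, not the actual pre-trained BERT: W_s and W_p are randomly initialized here (in practice they come from the pre-trained checkpoint), the sizes are illustrative, and the Transformer stack Trans(·) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, max_len, d_model = 100, 16, 8      # toy sizes (assumed)
W_s = rng.normal(size=(vocab_size, d_model))   # sub-word embedding matrix
W_p = rng.normal(size=(max_len, d_model))      # position embedding matrix

def bert_input_embedding(token_ids):
    """h^0 = W_s[ids] + W_p : sub-word embedding plus position embedding."""
    n = len(token_ids)
    return W_s[token_ids] + W_p[:n]

token_ids = np.array([5, 17, 42, 7])
h0 = bert_input_embedding(token_ids)           # shape (4, d_model)
```

In the full model, h^0 would then pass through the N Transformer layers to produce the contextual hidden states h^l.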
Step 302, extracting character-level word characteristics and word-level part-of-speech characteristics of each word;
in addition to the global context representation of the sentence, embodiments of the present invention also introduce character embedding and part-of-speech embedding. A bidirectional Long Short-Term Memory network (BiLSTM) is used as the feature extractor for character-level and part-of-speech information; in a BiLSTM, the output at the current time step is related not only to previous states but also to future states. First, a character embedding matrix and a part-of-speech embedding matrix are randomly initialized, and then fed into the bidirectional LSTM to extract character-level word features and word-level part-of-speech features:

h_c = BiLSTM_c(x_c)

h_p = BiLSTM_p(x_p)

where h_c and h_p are the character and part-of-speech features of the input sentence, and d_c, d_p are the hidden-state dimensions of the corresponding BiLSTMs.
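As a rough sketch of these encoders, the following toy numpy BiLSTM runs one LSTM cell left-to-right and another right-to-left over randomly initialized per-word embeddings and concatenates the two directions. All sizes, the single-layer structure, and the random inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_params(d_in, d_h, rng):
    # One stacked weight matrix for the four gate blocks [input, forget, cell, output].
    return (rng.normal(scale=0.1, size=(4 * d_h, d_in + d_h)),
            np.zeros(4 * d_h))

def lstm_forward(X, params):
    """Run a single LSTM cell over the rows of X, returning all hidden states."""
    W, b = params
    d_h = b.shape[0] // 4
    h, c, out = np.zeros(d_h), np.zeros(d_h), []
    for x in X:
        z = W @ np.concatenate([x, h]) + b
        i, f, g, o = z[:d_h], z[d_h:2*d_h], z[2*d_h:3*d_h], z[3*d_h:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out.append(h)
    return np.stack(out)

def bilstm(X, fwd, bwd):
    h_f = lstm_forward(X, fwd)              # left-to-right pass
    h_b = lstm_forward(X[::-1], bwd)[::-1]  # right-to-left pass, re-aligned
    return np.concatenate([h_f, h_b], axis=-1)   # (n, 2*d_h)

d_c, d_hc = 6, 5                            # char-embedding dim, hidden dim (toy)
X_char = rng.normal(size=(4, d_c))          # per-word character-level embeddings
h_c = bilstm(X_char, lstm_params(d_c, d_hc, rng), lstm_params(d_c, d_hc, rng))
```

The part-of-speech encoder h_p = BiLSTM_p(x_p) works identically on the part-of-speech embedding matrix.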
Step 303, generating multi-granularity characteristic representation information of each word by adopting the hidden state representation information, the character-level word characteristics and the word-level part-of-speech characteristics;
then, the three kinds of feature information, namely the hidden state representation information of each word, the character-level word features and the word-level part-of-speech features, are concatenated to obtain the combined feature representation of the sentence h_s = [h_w; h_c; h_p]. This combined representation is then fed into a BiLSTM to further extract the implicit associations among the three features, yielding the multi-granularity feature representation information of the words. The generated multi-granularity feature representation can greatly alleviate the out-of-vocabulary (OOV) problem and mine latent associated semantic information:

h_u = BiLSTM(h_s)

where h_u is the multi-granularity feature representation information, and d_w, d_c, d_p are the hidden-layer dimensions of the context embedding, character embedding and part-of-speech embedding, respectively.
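A minimal sketch of the feature mixing step, assuming toy dimensions for the three feature streams; the follow-up BiLSTM pass described above is represented only by a comment here.

```python
import numpy as np

n = 4                                   # words in the sentence (toy)
h_w = np.ones((n, 8))                   # contextual (BERT) features
h_c = np.ones((n, 10))                  # character-level BiLSTM features
h_p = np.ones((n, 6))                   # part-of-speech BiLSTM features

# h_s = [h_w; h_c; h_p] : per-word concatenation of the three granularities
h_s = np.concatenate([h_w, h_c, h_p], axis=-1)
# h_s would then be fed to another BiLSTM to mine cross-feature associations.
```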
Step 304, extracting first node feature representation information of the word based on the multi-granularity feature representation information;
step 304 may include the following sub-steps:
s41, creating an adjacency matrix of words;
s42, adopting the adjacency matrix and the multi-granularity characteristic representation information, and extracting the incoming node representation information and the outgoing node representation information of the words;
s43, using the incoming node representing information and the outgoing node representing information, generating first node characteristic representing information of the word.
In practical application, the original data passes through the encoding layer and the feature mixing layer to obtain the representation information of the input sentence. However, because a fixed graph structure is lacking, the invention uses a syntactic dependency parser to create a dependency tree, which is finally used as the adjacency matrix over the input sentence's feature information, and a GCN is used to extract regional dependency features. The original GCN uses an undirected graph to extract features of the input sentence, but since dependency direction information exists in the dependency tree, an undirected graph loses much dependency structure information. The invention considers the information propagated out of a word node and the information propagated into it at the same time, i.e., a bidirectional GCN (BiGCN) is used.
The method comprises the following specific steps:
h_out^l(u) = ReLU( Σ_v A_{uv} (W_out^l h_v^{l-1} + b_out^l) )

h_in^l(u) = ReLU( Σ_v A_{vu} (W_in^l h_v^{l-1} + b_in^l) )

h^l(u) = [ h_out^l(u) ; h_in^l(u) ]

where h^l(u) is the first node feature representation information of the u-th word, comprising the outgoing node representation information h_out propagated from the node to neighboring nodes and the incoming node representation information h_in received from neighboring nodes; W_out and W_in are parameters that can be learned in the model and represent the weights of the node word's outward output and inward input, respectively. The initial input of node u is the output of the feature mixing layer, i.e., the multi-granularity feature representation information, and A_{ij} is the adjacency matrix derived from the dependency tree. Finally, the incoming node representation information and the outgoing node representation information are concatenated as the first node feature representation information of the node.
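One BiGCN layer of this kind can be sketched in numpy: outgoing features are aggregated with A, incoming features with A^T, and the two directions are concatenated. The toy adjacency matrix, the added self-loops, and all dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(x):
    return np.maximum(x, 0.0)

def bigcn_layer(H, A, W_out, b_out, W_in, b_in):
    """One BiGCN layer: aggregate along outgoing edges (A) and
    incoming edges (A^T), then concatenate the two directions."""
    h_out = relu(A @ H @ W_out + b_out)      # node -> neighbours
    h_in  = relu(A.T @ H @ W_in + b_in)      # neighbours -> node
    return np.concatenate([h_out, h_in], axis=-1)

n, d, d_h = 4, 6, 5
# Toy dependency arcs (a chain), plus self-loops so each node keeps its own feature.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]], dtype=float) + np.eye(n)
H = rng.normal(size=(n, d))                  # multi-granularity features (mixing-layer output)
W_out = rng.normal(scale=0.1, size=(d, d_h))
W_in = rng.normal(scale=0.1, size=(d, d_h))
H1 = bigcn_layer(H, A, W_out, np.zeros(d_h), W_in, np.zeros(d_h))
```

H1 here plays the role of the first node feature representation information fed to the later stages.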
Step 305, constructing adaptive adjacency matrixes of various preset relationship types;
in an embodiment of the present invention, step 305 may include the following sub-steps:
s51, obtaining sentence characteristics of the sentence in which each word is located;
s52, acquiring hidden state dimension and preset input dimension of sentence characteristics;
s53, calculating the dependency weight initial hidden representation information of the words by using sentence characteristics, hidden state dimensions and input dimensions;
s54, adopting the initial hidden representation information to respectively calculate the respective corresponding dependency information of the words under various preset relationship types; the dependency information comprises a query vector and a key value vector;
and S55, constructing adaptive adjacency matrixes corresponding to multiple preset relationship types respectively based on the dependency information.
In practical applications, the dependency weights of words in different relationship spaces are different. To more flexibly predict overlapping triples, embodiments of the present invention provide a node-aware attention mechanism to dynamically learn the strength of correlation between different words in each relationship space in an end-to-end manner. The original dependency tree is converted into a plurality of fully connected graphs, wherein each graph contains semantic information under a different relationship space. Specifically, first, a fully-connected layer is run on the output of the Bi-GCN in the first stage to obtain weight-dependent initial hidden representation information:
S = U W_a + b_a

where S is the initial hidden representation information of the dependency weights; W_a and b_a are parameters the model needs to learn; and d_u and d_a are the hidden-state dimension of the sentence feature U and the input dimension in S, respectively.
Since the feature semantic information required to learn each relationship space is different, separate computation of dependency information for word nodes is required for different relationship types. Projecting the feature representation of the sentence to feature subspaces of different relation types, wherein the specific formula is as follows:
Q_m = S W_q^m

K_m = S W_k^m

where Q_m and K_m are the query vector and key-value vector of the m-th relation category; N represents the length of the input sentence; W_q^m and W_k^m are parameters of the model; and d_r is the dimension of each relation space.
Then, the dependency adaptive adjacency matrix A^m under each relation type can be constructed, where A^m_{ij} represents the strength of the dependency relationship between the i-th node and the j-th node under relation type m. The adjacency matrix under a given relation is constructed as follows:

A^m = softmax( Q_m K_m^T / √d_r )

where d_r is the dimension of each relation space and T denotes the transpose.
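The node-aware attention construction above can be sketched directly: project S into per-relation query/key spaces and take a scaled-dot-product softmax to obtain one adjacency matrix per relation type. All sizes and random initializations below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_adjacency(U, W_a, b_a, W_q, W_k):
    """Per-relation adjacency A^m = softmax(Q_m K_m^T / sqrt(d_r))."""
    S = U @ W_a + b_a                    # initial hidden representation of dependency weights
    d_r = W_q.shape[-1]
    A = []
    for m in range(W_q.shape[0]):        # one adjacency matrix per relation type
        Q, K = S @ W_q[m], S @ W_k[m]
        A.append(softmax(Q @ K.T / np.sqrt(d_r)))
    return np.stack(A)                   # shape (M, n, n)

n, d_u, d_a, d_r, M = 4, 10, 8, 6, 3     # M = number of relation types (toy)
U = rng.normal(size=(n, d_u))            # first-stage BiGCN output (stand-in)
A = adaptive_adjacency(U,
                       rng.normal(size=(d_u, d_a)), np.zeros(d_a),
                       rng.normal(size=(M, d_a, d_r)),
                       rng.normal(size=(M, d_a, d_r)))
```

Each row of every A^m is a softmax distribution, so the learned dependency strengths of a node over all other nodes sum to one within each relation space.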
Notably, the number of constructed adaptive adjacency matrices is consistent with the number of relationship types in the dataset.
Step 306, extracting second node characteristic representation information of the word according to the self-adaptive adjacent matrix and the first node characteristic representation information;
in an embodiment of the present invention, step 306 may include the following sub-steps:
s61, calculating forward characteristic representation information and backward characteristic representation information of the word by adopting the self-adaptive adjacent matrix and the first node characteristic representation information;
s62, second node feature representing information of the word is generated using the forward feature representing information and the backward feature representing information.
In a specific implementation, after obtaining the adaptive adjacency matrix for each predefined relationship, the node feature information extracted by the first BiGCN may be used as the initial sentence input information of the second-stage BiGCN. Unlike the first BiGCN feature extractor, in the BiGCN of the second stage, the feature information of sentences is mapped to different relationship feature spaces, and the dependent association strength between nodes is dynamically learned by using an adaptive adjacency matrix, which is helpful for extracting the information of overlapping triples.
In addition, the BiGCN feature extractor at the two stages can also establish information interaction between named entities and relationships, so that the model can extract all triples in a sentence to the maximum extent. The second stage BiGCN feature extractor operates as follows:
h_fwd^{m,l}(u) = ReLU( Σ_j A^m_{uj} (W_fwd^{m,l} h^{m,l-1}(j) + b_fwd^{m,l}) )

h_bwd^{m,l}(u) = ReLU( Σ_j A^m_{ju} (W_bwd^{m,l} h^{m,l-1}(j) + b_bwd^{m,l}) )

h^m(u) = [ h_fwd^m(u) ; h_bwd^m(u) ]

h^{(2)}(u) = [ h^{(1)}(u) ; Σ_{m=1}^{M} h^m(u) ]

where A^m_{ij} represents the dependency relationship strength of node i and node j under the m-th relation; h^{m,l-1}(j) is the feature representation of node j at layer l-1 under relation m, whose initial input is the output h^{(1)} of the first BiGCN; h_fwd^m(u) is the forward feature representation information of BiGCN node u under the m-th relation, and h_bwd^m(u) is the backward feature representation information under the m-th relation. As in the first-stage BiGCN, both the incoming and outgoing information of the node are taken into account. h^{(2)}(u) is the second node feature representation information of the word serving as node u.
It should be noted that, in the second-stage BiGCN feature extractor, the node dependency information extracted by the BiGCN in all relation spaces is fused with the first node feature representation information extracted by the first-stage BiGCN, and named entity recognition and node relation classification are then performed again. Because the node dependency information under the different relation spaces is fused, all entities and relations appearing in the text can be extracted to the greatest extent.
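As a concrete illustration, the second-stage computation described above can be sketched in NumPy. This is a minimal sketch rather than the patented implementation: the ReLU activation, the bias-free linear maps, and concatenation as the fusion step are assumptions, and all function and variable names are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def bigcn_layer(A_hat, H, W_fwd, W_bwd):
    """One second-stage BiGCN layer in a single relation space (sketch).

    A_hat : (n, n) adaptive adjacency matrix for this relation; entry
            [i, j] is the learned dependency strength between nodes i, j.
    H     : (n, d) node features from the previous layer; initially the
            first node feature representation from the first-stage BiGCN.
    W_fwd, W_bwd : (d, k) weights for outgoing / incoming aggregation.
    """
    h_fwd = relu(A_hat @ H @ W_fwd)    # forward: aggregate over outgoing edges
    h_bwd = relu(A_hat.T @ H @ W_bwd)  # backward: aggregate over incoming edges
    return np.concatenate([h_fwd, h_bwd], axis=-1)   # (n, 2k)

def second_node_features(A_hats, H, W_fwd, W_bwd):
    """Run the layer in every relation space and fuse by concatenation."""
    return np.concatenate(
        [bigcn_layer(A, H, W_fwd, W_bwd) for A in A_hats], axis=-1
    )  # (n, num_relations * 2k)
```

For a sentence of 4 words, 3 relation types and d = k = 8, `second_node_features` returns a (4, 48) matrix: one fused second node feature representation per word.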
Step 307, determining an entity type of each word based on the second node feature representation information of each word;
and step 308, calculating the relation category between any two words based on the second node feature representation information of each word.
In the embodiment of the invention, the entity types and the relation categories in the text can be extracted simultaneously by using the feature mixing layer and the features extracted by the two-stage BiGCN.
For entity recognition, the first node feature representation information and the second node feature representation information of each word are fed to a linear connection layer, and the sequence labels of the text are then obtained by using a softmax function:
$$\hat{y}_{i}=\mathrm{softmax}(W_{e}\,h_{e_i}+b_{e})$$

$$L_{e}=-\sum_{i=1}^{n}\log P(y_{i}\mid e_{i},s)$$

where $W_{e}$ and $b_{e}$ are parameters of the model, $y_{i}$ is the true tag, $e_{i}$ is the $i$-th node (the $i$-th word), $L_{e}$ is the loss function, $s$ is the input sentence sequence, $n$ is the number of words in the sentence, and $\hat{y}_{i}$ is the predicted category of the $i$-th word.
From the sequence labels, the entity type of each word can be identified.
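A minimal sketch of this tagging head follows. The BIO-style label set, and treating the fused node features as a single matrix `H`, are illustrative assumptions; all names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def tag_words(H, W_e, b_e, labels):
    """Linear layer + softmax over each word's fused node features (sketch).

    H      : (n, d) first and second node feature representations, concatenated
    W_e    : (d, num_tags), b_e : (num_tags,) -- model parameters
    labels : tag vocabulary, e.g. ["O", "B-PER", "I-PER"] (hypothetical set)
    """
    probs = softmax(H @ W_e + b_e)          # (n, num_tags)
    return [labels[i] for i in probs.argmax(axis=-1)]
```

The entity type of each word is then read off from its predicted sequence label.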
For relation extraction, the node feature representation information of any two words can be fed to separate linear connection layers, and the relation category of the two words can then be obtained by using a softmax function:
$$S(e_{i},r,e_{j})=W^{(2)}_{r}\,\mathrm{ReLU}\big(W^{(1)}_{r}\,[h_{e_i};h_{e_j}]\big)$$

$$Pr(r\mid e_{i},e_{j},s)=\sigma\big(S(e_{i},r,e_{j})\big)$$

$$L_{r}=-\sum_{i,j}\log Pr(r\mid e_{i},e_{j},s)$$

where $W^{(1)}_{r}$ and $W^{(2)}_{r}$ are parameters of the model, and $S(e_{i},r,e_{j})$ is the score of the two words under the relation $r$; note that $S(e_{i},r,e_{j})$ is different from $S(e_{j},r,e_{i})$.
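The asymmetric pairwise scoring can be sketched as follows. The exact form of the score (two linear maps around a ReLU acting on the ordered concatenation of the pair) is an assumption; only the asymmetry requirement is stated explicitly in the text, and all names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relation_score(h_i, h_j, W1_r, W2_r):
    """Score S(e_i, r, e_j) of an ordered word pair under relation r (sketch).

    Because the pair is concatenated in order before the linear maps are
    applied, S(e_i, r, e_j) is generally different from S(e_j, r, e_i).
    """
    pair = np.concatenate([h_i, h_j])                # order matters
    return float(W2_r @ np.maximum(0.0, W1_r @ pair))

def relation_prob(h_i, h_j, W1_r, W2_r):
    """Pr(r | e_i, e_j, s) = sigma(S(e_i, r, e_j))."""
    return sigmoid(relation_score(h_i, h_j, W1_r, W2_r))
```

Swapping the two arguments changes the score, so the model can distinguish the subject from the object of a relation.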
According to the method, adaptive adjacency matrices are constructed for a plurality of preset relation types, and the second node feature representation information of each word is computed from the adaptive adjacency matrices under the different relation types. This captures the semantic interaction between words under different relation types and improves the recognition of the entity types of words and of the relation categories between different words.
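Claims 4 and 6 describe the adaptive adjacency matrices as being built from a query vector and a key-value vector per relation; one natural reading is scaled dot-product attention over the sentence features. The sketch below assumes that form; the softmax normalization and all names are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def adaptive_adjacency(H, W_q, W_k):
    """One adaptive adjacency matrix per preset relation type (sketch).

    H        : (n, d) features of the sentence containing the words
    W_q, W_k : (m, d, d_k) per-relation query / key projections
    Returns (m, n, n); entry [r, i, j] is the learned dependency
    strength between word i and word j under relation r.
    """
    mats = []
    for Wq_r, Wk_r in zip(W_q, W_k):
        Q, K = H @ Wq_r, H @ Wk_r                      # queries and keys
        mats.append(softmax(Q @ K.T / np.sqrt(Q.shape[-1])))
    return np.stack(mats)
```

Each relation space gets its own dense, dynamically learned adjacency matrix, which is what lets the second-stage BiGCN weight word pairs differently under different relation types.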
For ease of understanding, the embodiment of the present invention was evaluated on the two public data sets NYT and WebNLG; the distribution of sentences and triples in the data sets is shown in Table 1, and the experimental results are shown in Table 2. Compared with the most advanced joint extraction models, the experimental results show that the method provided by the invention achieves improvements of 6.5% and 11.4% on the NYT and WebNLG data sets, respectively.
[Table 1 (image in original): distribution of sentences and triples in the NYT and WebNLG data sets]
[Table 2 (image in original): experimental results on the NYT and WebNLG data sets]
Table 3 shows the entity recognition effect (F1 value) of the embodiment of the present invention on NYT and WebNLG. By comparison, the embodiment of the invention improves on the two data sets by 1.8% and 4.9%, respectively. This shows that the method can accurately identify the entities in the text, which in turn greatly improves the triple extraction effect.
Method                       NYT     WebNLG
GraphRel                     0.892   0.919
AntNRE                       0.925   0.916
Embodiment of the invention  0.943   0.965

TABLE 3
Table 4 shows three examples of extraction results on the NYT data set, with one representative sentence selected from each of the Normal, SEO and EPO test sets. The first sentence contains only one triple and can therefore be easily identified. The second sentence has two triples that share one entity; by extracting the feature information of all words in the sentence, the latent connection between the two triples can be inferred, so that all triples in the sentence are extracted. The third sentence is of the entity-pair overlap (EPO) type; since the embodiment of the present invention cannot completely extract all overlapping entity-pair triples, one triple is missed.
[Table 4 (image in original): example extraction results on the NYT data set]
TABLE 4
According to the experimental results, the embodiment of the invention improves the recognition of the entity types of words and of the relation categories between different words.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. An entity and relationship extraction method, comprising:
extracting multi-granularity characteristic representation information of each word in a preset text;
extracting first node feature representation information of the word based on the multi-granularity feature representation information;
constructing adaptive adjacency matrices of a plurality of preset relation types;
extracting second node feature representation information of the word according to the adaptive adjacency matrix and the first node feature representation information;
determining an entity type of each of the words based on the second node feature representation information of each of the words;
calculating a relation category between any two words based on the second node feature representation information of each of the words.
2. The method according to claim 1, wherein the step of extracting multi-granularity feature representation information of each word in the preset text comprises:
calculating hidden state representation information of each word in a preset text;
extracting character-level word features and word-level part-of-speech features of each word;
and generating multi-granularity feature representation information of each word by using the hidden state representation information, the character-level word features and the word-level part-of-speech features.
3. The method of claim 1, wherein the step of extracting the first node feature representation information of the word based on the multi-granularity feature representation information comprises:
creating an adjacency matrix for the word;
extracting incoming node representation information and outgoing node representation information of the word by adopting the adjacency matrix and the multi-granularity feature representation information;
generating first node characteristic representation information of the word using the incoming node representation information and the outgoing node representation information.
4. The method of claim 1, wherein the step of constructing the adaptive adjacency matrices of the plurality of preset relation types comprises:
obtaining sentence features of the sentence in which each word is located;
obtaining a hidden state dimension and a preset input dimension of the sentence features;
calculating initial hidden representation information of dependency weights of the words by using the sentence features, the hidden state dimension and the input dimension;
calculating, by using the initial hidden representation information, the dependency information respectively corresponding to the words under the plurality of preset relation types;
and constructing the adaptive adjacency matrices respectively corresponding to the plurality of preset relation types based on the dependency information.
5. The method of claim 4, wherein the step of extracting the second node feature representation information of the word according to the adaptive adjacency matrix and the first node feature representation information comprises:
calculating forward feature representation information and backward feature representation information of the word by using the adaptive adjacency matrix and the first node feature representation information;
and generating second node feature representation information of the word by using the forward feature representation information and the backward feature representation information.
6. The method of claim 4, wherein the dependency information comprises a query vector and a key-value vector.
7. The method of claim 2, wherein said step of extracting character-level word features and word-level part-of-speech features of each of said words comprises:
and extracting the character-level word features and the word-level part-of-speech features of each word by using a preset bidirectional long short-term memory (BiLSTM) network.
CN202110420639.5A 2021-04-19 2021-04-19 Entity and relationship extraction method Pending CN113032571A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110420639.5A CN113032571A (en) 2021-04-19 2021-04-19 Entity and relationship extraction method


Publications (1)

Publication Number Publication Date
CN113032571A true CN113032571A (en) 2021-06-25

Family

ID=76456886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110420639.5A Pending CN113032571A (en) 2021-04-19 2021-04-19 Entity and relationship extraction method

Country Status (1)

Country Link
CN (1) CN113032571A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160008A (en) * 2019-12-18 2020-05-15 华南理工大学 Entity relationship joint extraction method and system
CN112163425A (en) * 2020-09-25 2021-01-01 大连民族大学 Text entity relation extraction method based on multi-feature information enhancement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
牛伟才等: "GCN2-NAA Two-stage Graph Convolutional Networks with Node-Aware Attention for Joint Entity and Relation Extraction", 《ICMLC\'21》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609846A (en) * 2021-08-06 2021-11-05 首都师范大学 Method and device for extracting entity relationship in statement
CN113609846B (en) * 2021-08-06 2022-10-04 首都师范大学 Method and device for extracting entity relationship in statement

Similar Documents

Publication Publication Date Title
CN111753024B (en) Multi-source heterogeneous data entity alignment method oriented to public safety field
CN112069408B (en) Recommendation system and method for fusion relation extraction
CN109902301B (en) Deep neural network-based relationship reasoning method, device and equipment
CN111814487B (en) Semantic understanding method, device, equipment and storage medium
CN112084335A (en) Social media user account classification method based on information fusion
CN113626589B (en) Multi-label text classification method based on mixed attention mechanism
CN110704576A (en) Text-based entity relationship extraction method and device
CN110674317A (en) Entity linking method and device based on graph neural network
CN113569001A (en) Text processing method and device, computer equipment and computer readable storage medium
CN113268609A (en) Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN110188454A (en) Architectural Equipment and Building Information Model matching process and device
CN111914553B (en) Financial information negative main body judging method based on machine learning
CN113516198A (en) Cultural resource text classification method based on memory network and graph neural network
CN112131345A (en) Text quality identification method, device, equipment and storage medium
CN111581379A (en) Automatic composition scoring calculation method based on composition question-deducting degree
CN113239143B (en) Power transmission and transformation equipment fault processing method and system fusing power grid fault case base
CN114357167A (en) Bi-LSTM-GCN-based multi-label text classification method and system
CN113032571A (en) Entity and relationship extraction method
US20230168989A1 (en) BUSINESS LANGUAGE PROCESSING USING LoQoS AND rb-LSTM
CN112287239B (en) Course recommendation method and device, electronic equipment and storage medium
CN114880991A (en) Knowledge map question-answer entity linking method, device, equipment and medium
CN111538898B (en) Web service package recommendation method and system based on combined feature extraction
CN114491029A (en) Short text similarity calculation method based on graph neural network
CN112948561A (en) Method and device for automatically expanding question-answer knowledge base
CN116484004B (en) Dialogue emotion recognition and classification method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination