CN115510855A - Entity relation joint extraction method of multi-relation word pair label space - Google Patents

Entity relation joint extraction method of multi-relation word pair label space

Info

Publication number
CN115510855A
CN115510855A (application CN202211171497.4A)
Authority
CN
China
Prior art keywords
entity
word
head
tail
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211171497.4A
Other languages
Chinese (zh)
Inventor
王立松
孙明杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202211171497.4A priority Critical patent/CN115510855A/en
Publication of CN115510855A publication Critical patent/CN115510855A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a joint entity-relation extraction method for a multi-relation word-pair label space. An input layer receives English training samples, or samples in the prediction stage; the Tokenize layer tokenizes the sample sentences received by the input layer according to a word list and, after BERT encoding, obtains Token semantic representation vectors and a dictionary recording the starting position of each word in the Token sequence; the Max Pooling layer performs maximum pooling on the Token semantic representation vectors based on the dictionary to obtain a semantic vector representation of each word in the sentence; and the joint extraction layer enumerates all word pairs in the sentence, marks labels for the word pairs in all predefined relation spaces, and finally performs joint extraction according to the label characteristics. The invention further improves the effectiveness and efficiency of joint entity-relation extraction under complex relations and provides a better guarantee for the lower layers of natural language processing.

Description

Entity relation joint extraction method of multi-relation word pair label space
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a method for jointly extracting entity relations of a multi-relation word pair label space.
Background
Entity relationship joint extraction is a basic task of natural language processing, and the existing joint extraction methods have certain limitations. Entity relationship joint extraction aims to extract, from a sentence of unstructured text, all correct triples consisting of a head entity, a relation and a tail entity. In real scenarios, however, the context of a sentence is very complex, and different types of triple overlap are involved.
As shown in fig. 1, four triple-overlap cases are given: EPO (entity pair overlap), SEO (single entity overlap), SOO (head entity and tail entity overlap), and Normal (no overlap). Most current models show shortcomings when extracting entity relations in such complex contexts, and even the more sophisticated models do not handle such triple overlap well. In general, the existing entity relationship extraction models have the following defects:
at present, most models that perform best on the joint entity-relation extraction task can only deal with simple, non-overlapping triples, and their frameworks cannot be applied to overlapping triples in complex contexts.
For overlapping triples, a number of entity relationship extraction studies have emerged. They fall mainly into two categories: the pipeline mode and the joint extraction mode. The pipeline mode completes the subtasks in sequence: first, named entity recognition is performed on the sentence, and then the relations between candidate entities are classified pairwise. The pipeline mode has low coupling and low relevance between tasks, which results in poor information interaction between tasks as well as problems of error propagation and exposure bias.
The joint extraction mode can be further subdivided into multi-task learning and single-step single-module extraction. Multi-task learning divides the joint extraction task into several related sub-modules that work cooperatively. Related methods exist, such as predicting the relation first and then identifying the head and tail entities for that specific relation, or recognizing the head entity first, then predicting the relation, and finally recognizing the tail entity. Compared with the pipeline mode, multi-task learning is an improvement, since some information interaction occurs between modules: for example, the entity recognition module has relation information and the relation prediction module has entity information. However, this information interaction is not completely shared between modules, and, as in the pipeline mode, error propagation and exposure bias still exist between modules owing to their low coupling.
The single-step single-module form is currently regarded as the best model framework. It performs semantic encoding in a single module and extracts all triples in a sentence in one step. Entity and relation information is completely shared within the same module, which avoids the influence of error propagation and exposure bias between modules. However, research on single-step single-module models is still scarce, and the existing ones suffer from low training and inference efficiency and relatively incomplete sharing of entity information.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the defects of the prior art, a joint entity-relation extraction method for a multi-relation word-pair label space.
In order to achieve this technical purpose, the technical scheme adopted by the invention is as follows:
the entity-relation joint extraction method for the multi-relation word-pair label space is implemented on the basis of an entity-relation joint extraction model comprising an input layer, a Tokenize layer, a Max Pooling layer and a joint extraction layer, and comprises the following steps:
step 1, the input layer receives English training samples, or samples in the prediction stage;
step 2, the Tokenize layer tokenizes the sample sentences received by the input layer according to a word list, and after BERT encoding, Token semantic representation vectors and a dictionary recording the starting positions of the words in the Token sequence are obtained;
step 3, the Max Pooling layer, based on the dictionary, performs maximum pooling on the Token semantic representation vectors to obtain a semantic vector representation of each word in the sentence;
step 4, the joint extraction layer enumerates all word pairs in the sentence, scores labels for the word pairs in all predefined relation spaces, and finally performs joint extraction according to the label characteristics.
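As an illustration of these four steps, a minimal PyTorch sketch of the model is given below. It is a sketch under stated assumptions, not the reference implementation of the invention: the class name, the concatenation of head and tail representations, and the single-linear-layer MLPs are illustrative choices.

```python
# Minimal sketch of the four-layer model (illustrative assumptions, not the
# patent's reference implementation).
import torch
import torch.nn as nn
from transformers import BertModel

class JointExtractionModel(nn.Module):
    def __init__(self, num_relations, num_tags=8, entity_dim=50,
                 bert_name="bert-base-cased", dropout=0.1):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)   # semantic encoding (step 2)
        d = self.bert.config.hidden_size                   # d = 768 for BERT-Base
        self.mlp_head = nn.Linear(d, entity_dim)           # head-entity projection
        self.mlp_tail = nn.Linear(d, entity_dim)           # tail-entity projection
        self.dropout = nn.Dropout(dropout)
        # relation-specific projection onto the 8 label scores
        self.rel_proj = nn.Linear(2 * entity_dim, num_relations * num_tags)
        self.num_relations, self.num_tags = num_relations, num_tags

    def forward(self, token_ids, word_spans):
        # token_ids: (1, N); word_spans: [(start, end)] token span per word (Index)
        enc = self.bert(token_ids).last_hidden_state       # (1, N, d)
        # Max Pooling layer: one semantic vector per word (step 3)
        emb = torch.stack([enc[:, s:e].max(dim=1).values
                           for s, e in word_spans], dim=1)  # (1, X, d)
        h, t = self.mlp_head(emb), self.mlp_tail(emb)       # (1, X, d_e)
        X = h.size(1)
        # Joint extraction layer: enumerate all word pairs by broadcasting (step 4)
        pairs = torch.cat([h.unsqueeze(2).expand(-1, X, X, -1),
                           t.unsqueeze(1).expand(-1, X, X, -1)], dim=-1)
        logits = self.rel_proj(self.dropout(torch.relu(pairs)))
        return logits.view(-1, X, X, self.num_relations, self.num_tags)
```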
In order to optimize the technical scheme, the specific measures adopted further comprise:
in the step 2, the Tokenize layer uses the Tokenizer in the PyTorch Keras Bert package to Token the sample sentences received by the input layer according to the vocabulary.
In step 2 above, for a sentence W = {w_1, w_2, ..., w_X}, where w_i denotes the i-th word of the sentence, tokenization followed by BERT encoding yields the Token semantic representation vectors
W_enc = {t_1, t_2, ..., t_N} ∈ ℝ^(N×d),
where N denotes the number of tokens, t_i denotes the i-th token, W_enc denotes the semantic vectors of all tokens in the sentence after encoding, and d is the dimension of the semantic vectors; a dictionary Index recording the starting position of each word in the token sequence is also obtained.
Step 3 above fuses the Token semantic representation vectors into word vector representations by a Max Pooling operation:
Index = [(1, n_1)_1, (n_1 + 1, n_2)_2, ..., (n_(X-1) + 1, N)_X],
Emb_i = Maxpooling(W_enc[Index_i]),
wherein Index is the dictionary, obtained at the Tokenize layer, recording the span of each word in the token sequence; [·] denotes the slicing operation over the sequence; Emb_i denotes the resulting vector representation of the i-th word.
The label strategy adopted in step 4 is specifically as follows:
for an input sentence sample W = {w_1, w_2, ..., w_X} and a set of predefined relations R = {r_1, r_2, ..., r_Q}, a label matrix TM of size Q × X × X is generated, where X denotes the length of the sentence, r_i denotes the i-th relation in the relation set, and Q is the total number of relations;
each of the Q dimensions of the matrix TM corresponds to one relation in R, and each cell holds a label, generated by the model, with a specific meaning;
the rows and columns of the matrix represent the head entity and the tail entity respectively;
decoding extracts all predicted triples from the matrix at once according to the specific label meanings.
The label strategy sets eight labels according to the length characteristics of the entities and the alignment mode of the head and tail entities: SS, SMH, SMT, MSH, MST, MMH, MMT, A;
wherein SS indicates that the head entity and the tail entity each consist of a single word;
SMH indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the head word of the tail entity;
SMT indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the tail word of the tail entity;
MSH indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the head word of the head entity with the tail entity;
MST indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the tail word of the head entity with the tail entity;
MMH indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the head word of the tail entity;
MMT indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the tail word of the tail entity, or the tail word of the head entity with the tail word of the tail entity;
A indicates an empty label.
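For illustration, the eight labels above can be written down as a small constant table (a hypothetical encoding; the integer ids are an assumption, not fixed by the invention):

```python
# The eight word-pair labels; comments paraphrase the definitions above.
# The integer ids are an illustrative assumption.
TAGS = {
    "SS":  0,  # single-word head entity, single-word tail entity
    "SMH": 1,  # single-word head, multi-word tail; aligned with tail's head word
    "SMT": 2,  # single-word head, multi-word tail; aligned with tail's tail word
    "MSH": 3,  # multi-word head, single-word tail; head's head word aligned
    "MST": 4,  # multi-word head, single-word tail; head's tail word aligned
    "MMH": 5,  # multi-word head and tail; head words aligned
    "MMT": 6,  # multi-word head and tail; tail words (or head/tail words) aligned
    "A":   7,  # empty label: no alignment at this cell
}
```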
The joint extraction layer in step 4 enumerates all word pairs (Emb_i, Emb_j) under all predefined relations and assigns each a high-confidence label, on the basis of which decoding is realized.
The joint extraction layer in step 4 above applies two low-dimensional multi-layer perceptrons (MLPs) to map the high-dimensional word semantic vectors to low-dimensional entity representation vectors:
h_i = MLP_head(Emb_i),
t_j = MLP_tail(Emb_j),
wherein MLP denotes a multi-layer perceptron, mapping ℝ^d to ℝ^(d_e);
d_e is the dimension of the entity representations;
head and tail denote the head entity and the tail entity respectively.
The joint extraction layer in step 4 scores each word pair under all predefined relations through a single calculation:
P(y(h_i, r_q, t_j)) = Softmax(W_r · Drop(ReLU([h_i ; t_j]))),
wherein y(h_i, r_q, t_j) is the label annotated in the training set;
[h_i ; t_j] denotes the combination of the head and tail entity representations;
ReLU denotes the activation function;
Drop denotes a dropout strategy;
W_r ∈ ℝ^(8×2d_e) is a trainable relational projection parameter matrix;
and 8 is the number of classification labels.
The invention has the following beneficial effects:
the invention combines the characteristics of the entity and the mapping from the entity to the multi-relation space, establishes the label strategy of the interaction between the entity and the relation and the calculation method of the next modeling of the multi-relation, finally further improves the effect and efficiency of the entity relation joint extraction of the model under the complex relation and provides better guarantee for the bottom layer of the natural language processing.
Drawings
FIG. 1 illustrates an overlapping situation of triples in a conventional entity relationship joint extraction method;
FIG. 2 is a schematic diagram of a method for extracting entity relationships of a multi-relationship word pair label space jointly according to the present invention;
FIG. 3 is a case analysis of label generation results.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in fig. 2, the entity-relation joint extraction method for the multi-relation word-pair label space of the present invention is implemented on the basis of an entity-relation joint extraction model comprising an input layer, a Tokenize layer, a Max Pooling layer and a joint extraction layer, and the method comprises:
step 1, the input layer receives English training sample sentences, or sample sentences in the prediction stage;
step 2, the Tokenize layer tokenizes the sample sentences received by the input layer according to a word list, and after BERT encoding, Token semantic representation vectors and a dictionary recording the starting position of each word in the Token sequence are obtained;
the Tokenize layer uses tokenizers in a PyTorch Keras Bert packet, and the function of the Tokenizer is to perform Token transformation on sample sentences received by the input layer according to a vocabulary. There is also a dictionary that records the starting position of each word in the Token sequence after the sentence has been Tokenize. So that the subsequent Bert pre-training model can carry out the coding of semantic context and the formation of word semantic representation. For experimental contrast fairness, the Bert pre-training model used herein is also at the semantic coding layer.
Step 3, the Max Pooling layer, based on the dictionary, performs maximum pooling on the Token semantic representation vectors to obtain a semantic vector representation of each word in the sentence.
This layer plays an important role in improving the training and inference speed of the whole model and in enhancing information interaction within the joint extraction module.
Step 4, the joint extraction layer enumerates all word pairs in the sentence, scores labels for the word pairs in all predefined relation spaces, and finally performs joint extraction according to the label characteristics.
The joint extraction layer is the most critical step of the task; the related techniques are as follows.
Label strategy and decoding process:
For an input sentence sample W = {w_1, w_2, ..., w_X} and a set of predefined relations R = {r_1, r_2, ..., r_Q}, the entity-relation joint extraction model generates a label matrix TM of size Q × X × X, where X denotes the length of the sentence, r_i denotes the i-th relation in the relation set, and Q is the total number of relations.
Each of the Q dimensions of the matrix TM corresponds to one relation in R, and each cell holds a label, generated by the model, with a specific meaning.
The rows and columns of the matrix represent the head entity and the tail entity respectively.
Decoding extracts all predicted triples from the matrix at once according to the specific label meanings.
The invention sets eight labels according to the length characteristics of the entities and the alignment mode of the head and tail entities: SS, SMH, SMT, MSH, MST, MMH, MMT, A.
S and M in a label indicate that the corresponding entity consists of a single word or of multiple words, respectively.
SS indicates that the head entity and the tail entity each consist of a single word;
SMH indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the head word of the tail entity;
SMT indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the tail word of the tail entity;
MSH indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the head word of the head entity with the tail entity;
MST indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the tail word of the head entity with the tail entity;
MMH indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the head word of the tail entity;
MMT indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the tail word of the tail entity, or the tail word of the head entity with the tail word of the tail entity;
A indicates an empty label.
With this label system, the characteristics of the entities themselves can be fully exploited and decoding is facilitated.
During decoding, only the alignment positions of legal word pairs need to be found in each of the Q dimensions of the matrix TM. A detailed decoding example is shown in fig. 2.
In the dimension of the "Capital" relation, the word pair (Shijiazhuang, Hebei) is found to carry the label SMH, so the head entity is "Shijiazhuang".
The head word of the tail entity is "Hebei"; searching onward along the current row terminates on finding the label SMT, so the tail entity is "Hebei Province".
The triple (Shijiazhuang, Capital, Hebei Province) is extracted.
In the dimension of the "Contains" relation, the label MSH is found, and the tail entity is "Shijiazhuang".
The head word of the head entity is "Hebei"; searching down the current column finds the label MST, so the head entity is "Hebei Province".
The triple (Hebei Province, Contains, Shijiazhuang) is extracted.
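The decoding procedure walked through above can be sketched as follows (a simplified, illustrative implementation; it assumes tag strings as input and that MMT marks the cell aligning the tail words of both entities):

```python
# Sketch of decoding: scan each relation's X-by-X label matrix and recover
# triples according to the label meanings. Simplified for illustration.
def decode(tag_matrix, words, relations):
    # tag_matrix[q][i][j]: label for head word i, tail word j, relation q
    triples, X = [], len(words)
    for q, rel in enumerate(relations):
        tm = tag_matrix[q]
        for i in range(X):
            for j in range(X):
                tag = tm[i][j]
                if tag == "SS":          # both entities are single words
                    triples.append((words[i], rel, words[j]))
                elif tag == "SMH":       # search along the row for SMT
                    for j2 in range(j, X):
                        if tm[i][j2] == "SMT":
                            triples.append((words[i], rel,
                                            " ".join(words[j:j2 + 1])))
                            break
                elif tag == "MSH":       # search down the column for MST
                    for i2 in range(i, X):
                        if tm[i2][j] == "MST":
                            triples.append((" ".join(words[i:i2 + 1]),
                                            rel, words[j]))
                            break
                elif tag == "MMH":       # search the sub-matrix for MMT
                    hit = next(((i2, j2) for i2 in range(i, X)
                                for j2 in range(j, X)
                                if tm[i2][j2] == "MMT"), None)
                    if hit:
                        i2, j2 = hit
                        triples.append((" ".join(words[i:i2 + 1]), rel,
                                        " ".join(words[j:j2 + 1])))
    return triples

# The worked example above:
words = ["Shijiazhuang", "is", "the", "capital", "of", "Hebei", "Province"]
tm = [[["A"] * 7 for _ in range(7)] for _ in range(2)]
tm[0][0][5], tm[0][0][6] = "SMH", "SMT"   # "Capital" relation
tm[1][5][0], tm[1][6][0] = "MSH", "MST"   # "Contains" relation
print(decode(tm, words, ["Capital", "Contains"]))
# [('Shijiazhuang', 'Capital', 'Hebei Province'),
#  ('Hebei Province', 'Contains', 'Shijiazhuang')]
```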
The multi-relation space modeling method comprises the following steps:
In step 2, for a sentence W = {w_1, w_2, ..., w_X}, where w_i denotes the i-th word of the sentence, tokenization followed by BERT encoding yields the Token semantic representation vectors
W_enc = {t_1, t_2, ..., t_N} ∈ ℝ^(N×d),
where N denotes the number of tokens, t_i denotes the i-th token, W_enc denotes the semantic vectors of all tokens in the sentence after encoding, and d is the dimension of the semantic vectors; a dictionary Index recording the starting position of each word in the token sequence is also obtained.
In step 3, the Token semantic representation vectors are fused into word vector representations by a Max Pooling operation, as shown in formula (1):
Index = [(1, n_1)_1, (n_1 + 1, n_2)_2, ..., (n_(X-1) + 1, N)_X],
Emb_i = Maxpooling(W_enc[Index_i]), (1)
wherein Index is the dictionary, obtained at the Tokenize layer, recording the span of each word in the token sequence; [·] denotes the slicing operation over the sequence; Emb_i denotes the resulting vector representation of the i-th word.
For an input sentence, the word vector representation is obtained through the above processing.
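A toy illustration of formula (1) (random numbers, for exposition only):

```python
# Fuse token vectors into word vectors by max pooling over each word's span.
import torch

W_enc = torch.randn(6, 768)               # N = 6 tokens, d = 768 (BERT output)
Index = [(0, 3), (3, 4), (4, 5), (5, 6)]  # token span of each of X = 4 words
Emb = torch.stack([W_enc[s:e].max(dim=0).values for s, e in Index])
print(Emb.shape)                          # torch.Size([4, 768]): one vector per word
```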
In the joint extraction module of the joint extraction layer, all word pairs (Emb_i, Emb_j) are enumerated under all predefined relations and each is assigned a high-confidence label.
Inspired by dependency parsing and knowledge graph representation, the invention combines the ideas of biaffine attention and HolE to achieve the intended goal, as shown in equations (2) and (3):
h_i = MLP_head(Emb_i),
t_j = MLP_tail(Emb_j), (2)
where MLP denotes a multi-layer perceptron, mapping ℝ^d to ℝ^(d_e);
d_e is the dimension of the entity representations;
head and tail denote the head entity and the tail entity respectively.
Applying two low-dimensional multi-layer perceptrons (MLPs) to map the high-dimensional word semantic vectors to low-dimensional entity representation vectors has two advantages:
first, the low-dimensional MLP mapping can remove interfering information from the high-dimensional word semantic vectors;
second, such low-dimensional entity representation vectors speed up subsequent computations.
P(y(h_i, r_q, t_j)) = Softmax(W_r · Drop(ReLU([h_i ; t_j]))), (3)
wherein ReLU denotes the activation function;
Drop denotes a dropout strategy;
[h_i ; t_j] denotes the combination of the head and tail entity representations;
W_r ∈ ℝ^(8×2d_e) is a trainable relational projection parameter matrix;
and 8 is the number of classification labels.
The model will score each word pair under all predefined relationships through one calculation.
The objective function optimized during training is given by equation (4):
L = −(1 / (Q · X²)) Σ_{q=1..Q} Σ_{i=1..X} Σ_{j=1..X} log P(y(h_i, r_q, t_j)), (4)
wherein y(h_i, r_q, t_j) is the label annotated in the training set.
Equation (4) is the loss function; during training, the model parameters are optimized by gradient descent on this loss, so that the model tags word pairs more accurately.
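Under the reconstruction above, equation (4) amounts to a cross-entropy loss over the eight label classes at every (relation, head word, tail word) cell. A sketch follows; the tensor shapes are assumptions consistent with the model sketch given earlier:

```python
# Cross-entropy over the 8 labels for every (relation, head, tail) cell.
import torch
import torch.nn.functional as F

def joint_loss(logits, gold_tags):
    # logits: (B, X, X, Q, 8) label scores; gold_tags: (B, X, X, Q) label ids
    return F.cross_entropy(logits.reshape(-1, 8), gold_tags.reshape(-1))

B, X, Q = 2, 10, 24                       # toy batch: 10 words, 24 relations
logits = torch.randn(B, X, X, Q, 8, requires_grad=True)
gold = torch.randint(0, 8, (B, X, X, Q))
joint_loss(logits, gold).backward()       # gradient descent optimizes parameters
```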
The implementation verification is as follows:
the performance of the entity relationship joint extraction model is evaluated on two reference data sets NYT and WebNLG.
According to the marking strategy of the reference data set, the two versions can be divided into two versions, and the versions are distinguished by NYT, webNLG and WebNLG.
Wherein, the two versions, NYT and WebNLG, mark all the components of the entity, and NYT and WebNLG mark only the last word of the entity.
In addition, the invention divides the test set into several different subsets according to different overlapping modes of the triples. Detailed statistics of the data set are shown in table 1.
Table 1 data set statistics
For fair comparison, model performance is evaluated with the Precision (Prec.), Recall (Rec.) and F1-score indices, as in conventional methods.
The model is implemented in PyTorch, and is trained and deployed on a server with a Tesla V100-PCIe GPU with 32 GB of video memory;
the pre-trained model is the BERT-Base English model; Adam is used as the optimizer of the training parameters, with the learning rate set to 0.00001;
as in previous work, the maximum token length of a sentence is limited to 100;
the output dimension d of BERT is 768;
the mapping dimension of the MLPs is set to 50.
To prevent overfitting, the dropout ratio is set to 0.1.
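These settings map directly onto a standard PyTorch setup (a sketch; the model object stands in for the joint extraction model):

```python
# Reported hyperparameters as a PyTorch configuration sketch.
import torch
import torch.nn as nn

model = nn.Linear(768, 8)                 # stand-in for the joint extraction model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # Adam, lr = 0.00001
MAX_TOKEN_LEN = 100                       # maximum token length of a sentence
BERT_DIM = 768                            # output dimension d of BERT
MLP_DIM = 50                              # mapping dimension of the MLPs
DROPOUT = 0.1                             # dropout ratio against overfitting
```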
The experiment compares against several baseline models; the overall experimental results are shown in Table 2, where '-' denotes unavailable data.
Table 2 experimental results (%)
The experimental results show that the entity-relation joint extraction model achieves good F1 performance on the benchmark data sets, and its advantage is evident on both the Prec. and Rec. indices.
Because NYT (NYT*) has more training samples and fewer relations than WebNLG (WebNLG*), the performance improvement there is less pronounced, which also shows that the size of the training set is very important to the performance of the model.
At the same time, the performance of the model is verified under different triple-overlap conditions, as shown in Table 3.
Table 3. Experimental results (%)
The experimental results show that the performance of the entity-relation joint extraction model improves on almost all triple-overlap types. In NYT and WebNLG, overlapping triples account for more than a quarter of the total; in such complex contexts, the key to the model's superior performance lies in the multi-relation space modeling method and the label strategy of the invention.
Word pairs, rather than token pairs, are used to enrich the contextual information of entities, which is important for the model to generate correct labels. A case analysis is given in fig. 3.
The words "Ampara", "Hospital", "Sri" and "Lanka" in sentence 1 form the token sequence 'Am', '##par', '##a', 'Hospital', 'Sri', 'Lanka' after passing through the Tokenize layer.
If modeling were performed on token pairs, the model would label the token pairs ('Am', 'Sri'), ('Am', 'Lanka') and ('Hospital', 'Lanka') with MMH, MMT and MMT respectively.
Upon decoding, the triple (Ampara Hospital, Country, Sri Lanka) is extracted.
Sentence 2 is a deliberate modification of sentence 1; after the Tokenize layer, its words "Am", "Hospital", "Sri" and "Lanka" map to the token sequence 'Am', 'Hospital', 'Sri', 'Lanka'.
The model would still label the token pairs ('Am', 'Sri'), ('Am', 'Lanka') and ('Hospital', 'Lanka') with MMH, MMT and MMT, which causes erroneous triples to be extracted. This shows that tokens alone do not capture the contextual semantic information completely.
Table 4 compares the efficiency of the entity-relation joint extraction model of the invention (Ours) with the other baseline models.
TABLE 4
The above are only preferred embodiments of the present invention, and the scope of the present invention is not limited to the above examples; all technical solutions falling within the idea of the present invention belong to its scope. It should be noted that those skilled in the art may make modifications and refinements without departing from the principle of the present invention, and these should also be regarded as falling within the scope of the present invention.

Claims (9)

1. An entity-relation joint extraction method for a multi-relation word-pair label space, implemented on the basis of an entity-relation joint extraction model comprising an input layer, a Tokenize layer, a Max Pooling layer and a joint extraction layer, characterized in that the method comprises the following steps:
step 1, the input layer receives English training samples, or samples in the prediction stage;
step 2, the Tokenize layer tokenizes the sample sentences received by the input layer according to a word list, and after BERT encoding, Token semantic representation vectors and a dictionary recording the starting positions of the words in the Token sequence are obtained;
step 3, the Max Pooling layer, based on the dictionary, performs maximum pooling on the Token semantic representation vectors to obtain a semantic vector representation of each word in the sentence;
step 4, based on the processing in step 3, the joint extraction layer enumerates all word pairs in the sentence, scores labels for the word pairs in all predefined relation spaces, and finally performs joint extraction according to the label characteristics.
2. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 1, characterized in that the Tokenize layer in step 2 uses the Tokenizer in the PyTorch Keras Bert package to tokenize the sample sentences received by the input layer according to the vocabulary.
3. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 1, characterized in that in step 2, for a sentence W = {w_1, w_2, ..., w_X}, where w_i denotes the i-th word of the sentence, tokenization followed by BERT encoding yields the Token semantic representation vectors
W_enc = {t_1, t_2, ..., t_N} ∈ ℝ^(N×d),
where N denotes the number of tokens, t_i denotes the i-th token, W_enc denotes the semantic vectors of all tokens in the sentence after encoding, and d is the dimension of the semantic vectors; a dictionary Index recording the starting position of each word in the token sequence is also obtained.
4. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 1, characterized in that step 3 fuses the Token semantic representation vectors into word vector representations by a Max Pooling operation:
Index = [(1, n_1)_1, (n_1 + 1, n_2)_2, ..., (n_(X-1) + 1, N)_X],
Emb_i = Maxpooling(W_enc[Index_i]),
wherein Index is the dictionary, obtained at the Tokenize layer, recording the span of each word in the token sequence; [·] denotes the slicing operation over the sequence; Emb_i denotes the resulting vector representation of the i-th word.
5. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 1, characterized in that the label strategy adopted in step 4 is specifically as follows:
for an input sentence sample W = {w_1, w_2, ..., w_X} and a set of predefined relations R = {r_1, r_2, ..., r_Q}, a label matrix TM of size Q × X × X is generated, where X denotes the length of the sentence, r_i denotes the i-th relation in the relation set, and Q is the total number of relations;
each of the Q dimensions of the matrix TM corresponds to one relation in R, and each cell holds a label, generated by the model, with a specific meaning;
the rows and columns of the matrix represent the head entity and the tail entity respectively;
decoding extracts all predicted triples from the matrix at once according to the specific label meanings.
6. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 5, characterized in that the label strategy sets eight labels according to the length characteristics of the entities and the alignment mode of the head and tail entities:
SS,SMH,SMT,MSH,MST,MMH,MMT,A;
wherein SS indicates that the head entity and the tail entity each consist of a single word;
SMH indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the head word of the tail entity;
SMT indicates that the head entity consists of a single word and the tail entity of multiple words, the current alignment being the head entity with the tail word of the tail entity;
MSH indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the head word of the head entity with the tail entity;
MST indicates that the head entity consists of multiple words and the tail entity of a single word, the current alignment being the tail word of the head entity with the tail entity;
MMH indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the head word of the tail entity;
MMT indicates that the head entity and the tail entity both consist of multiple words, the current alignment being the head word of the head entity with the tail word of the tail entity, or the tail word of the head entity with the tail word of the tail entity;
A indicates an empty label.
7. The method according to claim 1, characterized in that the joint extraction layer in step 4 enumerates all word pairs (Emb_i, Emb_j) under all predefined relations and assigns each a high-confidence label to realize decoding.
8. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 7, characterized in that the joint extraction layer in step 4 applies two low-dimensional multi-layer perceptrons (MLPs) to map the high-dimensional word semantic vectors to low-dimensional entity representation vectors:
h_i = MLP_head(Emb_i),
t_j = MLP_tail(Emb_j),
wherein MLP denotes a multi-layer perceptron, mapping ℝ^d to ℝ^(d_e);
d_e is the dimension of the entity representations;
head and tail denote the head entity and the tail entity respectively.
9. The entity-relation joint extraction method for a multi-relation word-pair label space according to claim 8, characterized in that the joint extraction layer in step 4, based on the low-dimensional entity representation vectors, scores labels for each word pair under all predefined relations through a single calculation, the scoring formula being:
P(y(h_i, r_q, t_j)) = Softmax(W_r · Drop(ReLU([h_i ; t_j]))),
wherein y(h_i, r_q, t_j) is the label annotated in the training set;
[h_i ; t_j] denotes the combination of the head and tail entity representations;
ReLU denotes the activation function;
Drop denotes a dropout strategy;
W_r ∈ ℝ^(8×2d_e) is a trainable relational projection parameter matrix;
and 8 is the number of classification labels.
CN202211171497.4A 2022-09-26 2022-09-26 Entity relation joint extraction method of multi-relation word pair label space Pending CN115510855A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211171497.4A CN115510855A (en) 2022-09-26 2022-09-26 Entity relation joint extraction method of multi-relation word pair label space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211171497.4A CN115510855A (en) 2022-09-26 2022-09-26 Entity relation joint extraction method of multi-relation word pair label space

Publications (1)

Publication Number Publication Date
CN115510855A true CN115510855A (en) 2022-12-23

Family

ID=84505977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211171497.4A Pending CN115510855A (en) 2022-09-26 2022-09-26 Entity relation joint extraction method of multi-relation word pair label space

Country Status (1)

Country Link
CN (1) CN115510855A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12026466B1 (en) * 2023-03-13 2024-07-02 Ailife Diagnostics, Inc. Distant supervision for data entity relation extraction



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination