CN115081392A - Document level relation extraction method based on adjacency matrix and storage device - Google Patents

Document level relation extraction method based on adjacency matrix and storage device

Publication number
CN115081392A
Authority
CN
China
Prior art keywords
matrix
relation
entity
adjacency matrix
order
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210602851.8A
Other languages
Chinese (zh)
Inventor
闾海荣
王天亨
李艳
石顺中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou Institute Of Data Technology Co ltd
Original Assignee
Fuzhou Institute Of Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Institute Of Data Technology Co ltd filed Critical Fuzhou Institute Of Data Technology Co ltd
Priority to CN202210602851.8A priority Critical patent/CN115081392A/en
Publication of CN115081392A publication Critical patent/CN115081392A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/14Tree-structured documents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present application relates to the technical field of document-level relation extraction, and in particular to a document-level relation extraction method based on an adjacency matrix and a storage device. The method comprises the following steps: modeling the document-level long text with a Transformer-XL model; modeling each entity pair that has a relation as a dependency tree; generating an adjacency matrix over the related entity pairs from the dependency trees; fusing the relation features associated with the target relation feature through weighted attention; and generating the probability of each relation for the entity pair from the fused feature matrix. Because a Transformer-XL model is used to model the long text sequence of a document, the text in different segments remains semantically connected, and there is no upper limit on the length of the modeled text.

Description

Document level relation extraction method based on adjacency matrix and storage device
Technical Field
The present application relates to the technical field of document-level relation extraction, and in particular to a document-level relation extraction method based on an adjacency matrix and a storage device.
Background
Document-level relation extraction aims to extract the relations between entity pairs in a document. As an information extraction method, it plays an important role in constructing large-scale knowledge graphs. However, current relation extraction is mainly sentence-level, aiming to extract a relation that exists between an entity pair within a single sentence. In real-world applications, most entity pairs that have a relation lie in different sentences, which makes document-level relation extraction a considerably harder task than sentence-level relation extraction.
The existing document level relation extraction often has the following defects:
1. All relation features within the attention field or convolution field are fused into the target relation feature. Some of these features are closely connected with the target relation feature; they supply rich semantic associations and help it carry out intra-sentence or inter-sentence reasoning. However, this also introduces considerable noise, such as the relation features of entity pairs that have no relation, which are widespread within the convolution field or attention field.
2. Because documents are generally longer than the encoding range of the BERT model, existing methods split a long text sequence into segments and encode each segment separately. This works around BERT's inability to process long text but creates a semantic gap: the encoded segments are only weakly connected semantically.
3. The two existing methods for fusing the features of different entity pairs in the feature matrix (described in the Detailed Description) both perform implicit fusion with little targeting: they fuse the related relation features mainly by fusing elements over a large region of the matrix. This is computationally expensive and fuses poorly, and it is an important factor limiting model performance.
Disclosure of Invention
In view of the above problems, the present application provides a document level relationship extraction method based on adjacency matrix, so as to solve the technical problems mentioned in the background art. The specific technical scheme is as follows:
a document level relation extraction method based on an adjacency matrix comprises the following steps:
modeling the long text of the document level through a Transformer-XL model;
constructing an entity pair relation characteristic matrix;
modeling the entity pairs with the relations into a path dependency tree respectively;
generating an adjacency matrix between a certain relation entity pair according to the dependency tree;
calculating a visibility matrix from the adjacency matrix;
fusing the relation characteristics related to the relation characteristics of the target entity pair through a self-attention mechanism;
and calculating the probability of the corresponding relation of the entity pair according to the fused feature matrix.
Further, calculating the visibility matrix from the adjacency matrix further includes:
repeating the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix until the pairs of relation features whose element in the n-th order adjacency matrix is 1 meet a preset condition;
calculating the visibility matrix V from the first n orders of adjacency matrices:
$V = A + A^{2} + \dots + A^{n}$
where $A$ denotes the first-order adjacency matrix, $A^{2}$ the second-order adjacency matrix, and $A^{n}$ the n-th order adjacency matrix, n being a natural number greater than or equal to 2.
Further, fusing the relation features associated with the target relation feature through weighted attention further includes:
determining different weights according to the number of steps (hops) between relation features, wherein the larger the number of steps, the smaller the weight.
Further, the root node of each dependency tree is the corresponding entity pair, and the first-layer nodes are the entity-pair relation feature representations that are directly connected to that entity pair in the horizontal and vertical directions of the adjacency matrix.
Further, the constructing a relationship feature matrix further includes:
and calculating all entity embedded representations in the document, and constructing a relation characteristic matrix according to the embedded representations.
To solve the above technical problem, a storage device is further provided. The specific technical scheme is as follows:
a storage device having stored therein a set of instructions for performing:
modeling the long text of the document level through a Transformer-XL model;
constructing an entity pair relation characteristic matrix;
modeling the entity pairs with the relations into a path dependency tree respectively;
generating an adjacency matrix between a certain relation entity pair according to the dependency tree;
calculating a visibility matrix from the adjacency matrix;
fusing the relation characteristics related to the relation characteristics of the target entity pair through a self-attention mechanism;
and calculating the probability of the corresponding relation of the entity pair according to the fused feature matrix.
Further, the set of instructions is further used to perform the following. Calculating the visibility matrix from the adjacency matrix further includes:
repeating the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix until the pairs of relation features whose element in the n-th order adjacency matrix is 1 meet a preset condition;
calculating the visibility matrix V from the first n orders of adjacency matrices:
$V = A + A^{2} + \dots + A^{n}$
where $A$ denotes the first-order adjacency matrix, $A^{2}$ the second-order adjacency matrix, and $A^{n}$ the n-th order adjacency matrix, n being a natural number greater than or equal to 2.
Further, the set of instructions is further for performing:
fusing the relation features associated with the target relation feature through weighted attention further comprises:
determining different weights according to the number of steps (hops) between relation features, wherein the larger the number of steps, the smaller the weight.
Further, the root node of each dependency tree is the corresponding entity pair, and the first-layer nodes are the entity-pair relation feature representations that are directly connected to that entity pair in the horizontal and vertical directions of the adjacency matrix.
Further, the set of instructions is further for performing:
the constructing of the relationship feature matrix further comprises:
and calculating all entity embedded representations in the document, and constructing a relation characteristic matrix according to the embedded representations.
The invention has the following beneficial effects. A document-level relation extraction method based on an adjacency matrix comprises the following steps: modeling the document-level long text with a Transformer-XL model; constructing an entity-pair relation feature matrix; modeling each entity pair that has a relation as a path dependency tree; generating an adjacency matrix between the related entity pairs from the dependency trees; calculating a visibility matrix from the adjacency matrix; fusing the relation features associated with the relation feature of the target entity pair through a self-attention mechanism; and calculating the probability of each relation for the entity pair from the fused feature matrix. Because a Transformer-XL model is used to model the long text sequence of the document, the text in different segments remains semantically connected and there is no upper limit on the length of the modeled text, which effectively avoids the semantic gaps caused by BERT modeling. Features at different step numbers are captured with the adjacency matrix method, so the modeling is clearly targeted: only entity pairs that have a relation are modeled, and entity pairs without a relation are discarded, which avoids introducing noise that would degrade model performance. Modeling only the related entity pairs also effectively reduces computational complexity and speeds up training and inference.
The above description of the present invention is only an outline of the present invention, and in order to make the technical solution of the present invention more clearly understood by those skilled in the art, the present invention may be implemented based on the content described in the text and drawings of the present specification, and in order to make the above object, other objects, features, and advantages of the present invention more easily understood, the following description will be made in conjunction with the embodiments of the present application and the drawings.
Drawings
The drawings are only for purposes of illustrating the principles, implementations, applications, features, and effects of particular embodiments of the present application, as well as others related thereto, and are not to be construed as limiting the application.
In the drawings of the specification:
FIG. 1 is a schematic diagram of an embodiment in which Bartolomeo Altomonte and Altomonte are different mentions of the same entity;
FIG. 2 is a diagram illustrating a semantic segmentation task regarding relationship feature fusion according to an embodiment;
FIG. 3 is a diagram illustrating the use of stacked criss-cross attention modules to fuse information in an entity-pair relationship feature matrix according to an embodiment;
FIG. 4 is a flowchart illustrating a document level relationship extraction method based on adjacency matrices according to an embodiment;
FIG. 5 is a schematic diagram of a model applied to a document level relationship extraction method based on an adjacency matrix according to an embodiment;
FIG. 6 is a block diagram of a storage device according to an embodiment.
The reference numerals referred to in the above figures are explained below:
600. a storage device.
Detailed Description
In order to explain in detail possible application scenarios, technical principles, practical embodiments, and the like of the present application, the following detailed description is given with reference to the accompanying drawings in conjunction with the listed embodiments. The embodiments described herein are merely for more clearly illustrating the technical solutions of the present application, and therefore, the embodiments are only used as examples, and the scope of the present application is not limited thereby.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase "an embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or related to other embodiments specifically defined. In principle, in the present application, the technical features mentioned in the embodiments can be combined in any manner to form a corresponding implementable technical solution as long as there is no technical contradiction or conflict.
Unless otherwise defined, technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the use of relational terms herein is intended only to describe particular embodiments and is not intended to limit the present application.
In the description of the present application, the term "and/or" is an expression describing a logical relationship between objects and means that three relationships may exist. For example, "A and/or B" covers three cases: A alone, B alone, and both A and B. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
In this application, terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Without further limitation, in this application, the use of "including," "comprising," "having," or other similar expressions in phrases and expressions of "including," "comprising," or "having," is intended to cover a non-exclusive inclusion, and such expressions do not exclude the presence of additional elements in a process, method, or article that includes the recited elements, such that a process, method, or article that includes a list of elements may include not only those elements but also other elements not expressly listed or inherent to such process, method, or article.
As understood under the Examination Guidelines, expressions such as "greater than", "less than", and "more than" in this application exclude the stated number, while expressions such as "above", "below", and "within" include it. In addition, in the description of the embodiments of the present application, "a plurality" means two or more (including two), and similar expressions such as "a plurality of groups" and "a plurality of times" are understood in the same way, unless specifically defined otherwise.
Document level relationship extraction, as mentioned in the background, is a relatively more difficult task than sentence level relationship extraction.
The current common method is as follows:
First, a document is given (formula image BDA0003669961390000061 in the original publication), where N represents the length of the document. A special symbol is inserted before and after each entity mention in the document, thereby marking the location of the mention. The marked document is then encoded with BERT as the encoder, resulting in embedded representations that carry context information:
$H = [h_1, h_2, \dots, h_n] = \mathrm{BERT}([x_1, x_2, \dots, x_n])$
where n represents the length of the document after the special symbols are inserted.
The embedding of the special symbol preceding each entity mention is taken as the embedding of that mention. Since an entity may have multiple mentions in the document, the different mentions can be used together to generate the representation of the corresponding entity. For example, Bartolomeo Altomonte and Altomonte in FIG. 1 are different mentions of the same entity.
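The marking step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the marker string `"*"`, the half-open `(start, end)` token spans, and the function name `insert_entity_markers` are all assumptions introduced here, since the source only says that "a special symbol" is inserted before and after each mention.

```python
from typing import List, Tuple


def insert_entity_markers(
    tokens: List[str],
    mentions: List[Tuple[int, int]],
    marker: str = "*",  # hypothetical marker; the source does not name the symbol
) -> List[str]:
    """Return a copy of `tokens` with `marker` placed before and after each mention.

    `mentions` holds half-open, non-overlapping token spans (start, end).
    """
    starts = {start for start, _ in mentions}
    ends = {end for _, end in mentions}
    marked: List[str] = []
    for index, token in enumerate(tokens):
        if index in ends:      # close a mention that ended just before this token
            marked.append(marker)
        if index in starts:    # open a mention that begins at this token
            marked.append(marker)
        marked.append(token)
    if len(tokens) in ends:    # a mention that runs to the end of the document
        marked.append(marker)
    return marked


tokens = ["Bartolomeo", "Altomonte", "was", "a", "painter", "."]
marked = insert_entity_markers(tokens, [(0, 2)])
```

The position of the opening marker in `marked` is then the index whose encoder output would serve as the mention embedding.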
After the embedding of each entity mention is obtained, the embedded representation of the corresponding entity is generated with the logsumexp function (formula image BDA0003669961390000071 in the original publication), where the symbol shown in image BDA0003669961390000072 denotes the representation of the j-th entity mention and $h_e$ denotes the embedded representation of the entity to which the mentions refer.
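Because the original formula is only available as an image, the following is a sketch of logsumexp pooling in its usual form, that is, the logarithm of the sum of the exponentials of the mention embeddings, taken per dimension. The function name `logsumexp_pool` and the array layout are assumptions made for illustration.

```python
import numpy as np


def logsumexp_pool(mention_embeddings: np.ndarray) -> np.ndarray:
    """Pool mention embeddings of shape (num_mentions, dim) into one entity embedding.

    Computes log(sum_j exp(h_j)) per dimension, subtracting the per-dimension
    maximum first so that the exponentials cannot overflow.
    """
    m = np.asarray(mention_embeddings, dtype=float)
    peak = m.max(axis=0)
    return peak + np.log(np.exp(m - peak).sum(axis=0))


# Two mentions of the same entity, each with a 2-dimensional embedding.
mentions = np.array([[1.0, 0.0],
                     [1.0, 2.0]])
h_e = logsumexp_pool(mentions)
```

Logsumexp acts as a smooth version of max pooling: the result is dominated by the strongest mention in each dimension while still receiving contributions from the others.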
After all the entity embedded representations in the document are obtained, an entity-pair feature matrix M is generated from them. First, the document representation, the local context information, and the entity embedded representations are combined to obtain semantically enriched entity representations (formula images BDA0003669961390000073 and BDA0003669961390000074 in the original publication), where $W_s$ and $W_o$ denote model parameters, the symbols shown in images BDA0003669961390000075 and BDA0003669961390000076 denote the entity embedded representations of the subject and the object respectively, $h_{doc}$ denotes the embedded representation of the document, and $c_{s,o}$ denotes the local context information, obtained by multiplying the top-layer attention score matrices of the subject and object in the BERT model by the word embedding representations of the document (formula image BDA0003669961390000077 in the original publication).
After the enhanced representations of the subject and the object are obtained, the entity-pair relation feature vector is generated through a feed-forward neural network (FFNN):
$M_{s,o} = \mathrm{FFNN}([u_s, u_o])$
Applying the above operations to every entity pair in the document yields the entity-pair feature matrix M, in which each element represents the relation between the corresponding entity pair; at this point, however, there is no semantic connection between different entity pairs. In document-level relation prediction, information shared between entity pairs plays an important role in cross-sentence reasoning, so fusing the features in the entity-pair relation feature matrix can help improve relation extraction performance.
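A minimal NumPy sketch of building the matrix M from the enhanced representations is given below. The source only states that an FFNN maps the concatenation of $u_s$ and $u_o$ to $M_{s,o}$; the single hidden layer with ReLU, the parameter names, and the random toy inputs are assumptions for illustration.

```python
import numpy as np


def pair_feature_matrix(u_s, u_o, w1, b1, w2, b2):
    """Build M of shape (Ne, Ne, feat_dim) from enhanced subject/object features.

    u_s[s, o] and u_o[s, o] hold the enhanced subject and object vectors
    for the entity pair (s, o); both arrays have shape (Ne, Ne, d).
    """
    pair_input = np.concatenate([u_s, u_o], axis=-1)    # (Ne, Ne, 2d)
    hidden = np.maximum(pair_input @ w1 + b1, 0.0)      # assumed ReLU hidden layer
    return hidden @ w2 + b2                             # (Ne, Ne, feat_dim)


rng = np.random.default_rng(0)
num_entities, d, hidden_dim, feat_dim = 3, 4, 8, 5
u_s = rng.normal(size=(num_entities, num_entities, d))
u_o = rng.normal(size=(num_entities, num_entities, d))
w1 = rng.normal(size=(2 * d, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, feat_dim))
b2 = np.zeros(feat_dim)

M = pair_feature_matrix(u_s, u_o, w1, b1, w2, b2)
```

Each cell `M[s, o]` depends only on its own pair, which is why a separate fusion step is needed to share information between pairs.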
The methods currently used to fuse entity-pair relation features are mainly the following two:
1. Relation feature fusion is treated as a semantic segmentation task (as shown in FIG. 2).
The semantic segmentation task essentially models each pixel of an image to obtain a semantically enriched pixel representation and then assigns a class to each pixel. This resembles the document-level relation extraction task: both model the elements of a matrix and assign a label to each. The relation feature matrix is therefore treated as an image matrix composed of pixels and encoded with U-Net. Specifically, the relation feature matrix is downsampled with convolutional and pooling layers and then upsampled with convolutional and deconvolution layers, yielding a semantically enriched relation feature matrix.
Each element of the semantically enriched relation matrix is then passed through a linear layer and an activation function to obtain the probability of each relation for the entity pair.
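The final classification step can be sketched as a linear layer followed by an activation function. The sigmoid activation, which gives an independent probability per relation type, and the names `relation_probabilities`, `weight`, and `bias` are assumptions; the source does not specify which activation is used here.

```python
import numpy as np


def relation_probabilities(features, weight, bias):
    """Map pair features (Ne, Ne, d) to per-relation probabilities (Ne, Ne, R)."""
    logits = features @ weight + bias
    return 1.0 / (1.0 + np.exp(-logits))   # assumed sigmoid activation


# Toy input: 2 entities, 3-dimensional features, 4 relation types.
features = np.zeros((2, 2, 3))
weight = np.zeros((3, 4))
bias = np.zeros(4)
probs = relation_probabilities(features, weight, bias)
predicted = probs > 0.5   # hypothetical decision threshold
```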
2. Stacked criss-cross attention modules are used to fuse information in the entity-pair relation feature matrix (as shown in FIG. 3).
The Criss-Cross Attention Network can capture the horizontal and vertical information in the relation feature matrix through attention, thereby completing information fusion. However, a single-layer criss-cross attention network can only merge the information in the same row and column into the target entity pair, which is not enough to support relation classification. Therefore, several criss-cross attention layers are stacked so that information from the whole matrix is indirectly fused into the target entity pair, achieving semantic enhancement. The calculation process is as follows:
(formula image BDA0003669961390000091 in the original publication)
where $N_e$ is the size of the relation feature matrix, $A_{(s,o)\to(s,i)}$ denotes the attention score from $M_{s,o}$ to $M_{s,i}$, and $A_{(s,o)\to(i,o)}$ denotes the attention score from $M_{s,o}$ to $M_{i,o}$. The above operations are then repeated on each feature representation in the matrix to achieve feature fusion.
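One criss-cross fusion pass can be sketched in NumPy as below, for illustration only. The original formula survives only as an image, so the scaled dot-product scoring, the single softmax taken jointly over the row and the column, and the function name `criss_cross_fuse` are assumptions. Note that in this simplified version the cell (s, o) itself appears once in its row and once in its column.

```python
import numpy as np


def criss_cross_fuse(M: np.ndarray) -> np.ndarray:
    """Fuse each cell of M (Ne, Ne, d) with the cells in its row and column."""
    M = np.asarray(M, dtype=float)
    num_entities, _, dim = M.shape
    fused = np.empty_like(M)
    for s in range(num_entities):
        for o in range(num_entities):
            # Row s holds the pairs (s, i); column o holds the pairs (i, o).
            neighbours = np.concatenate([M[s, :, :], M[:, o, :]], axis=0)
            scores = neighbours @ M[s, o] / np.sqrt(dim)
            scores = scores - scores.max()          # numerical stability
            attention = np.exp(scores) / np.exp(scores).sum()
            fused[s, o] = attention @ neighbours
    return fused


rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3, 4))
M_fused = criss_cross_fuse(M)
M_fused_twice = criss_cross_fuse(M_fused)   # stacking a second layer
```

After one pass a cell has only seen its own row and column; applying the function again lets information from any cell reach any other, which mirrors the stacking argument in the text.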
After the fused feature matrix $M'_{s,o}$ is obtained, it is fused with the entity representation information, and finally the probability of each relation for the entity pair is generated (formula images BDA0003669961390000092, BDA0003669961390000093, and BDA0003669961390000094 in the original publication), where σ represents the activation function.
As noted in the Background, the above methods have three disadvantages.
FIG. 4 is a flowchart illustrating a document level relationship extraction method based on an adjacency matrix according to the present application, which can be applied to a storage device including, but not limited to: personal computers, servers, general purpose computers, special purpose computers, network appliances, embedded appliances, programmable appliances, intelligent mobile terminals, etc. As shown in fig. 4, a document level relationship extraction method based on an adjacency matrix includes steps S401 to S407.
In step S401, the document-level long text is modeled with a Transformer-XL model. This overcomes BERT's inability to model long sequences and ensures semantic continuity across the long sequence. The encoding process of Transformer-XL is as follows:
$H = [h_1, h_2, \dots, h_n] = \text{Transformer-XL}([x_1, x_2, \dots, x_n])$
In step S402, an entity-pair relation feature matrix is constructed. Specifically, the embedded representations of all entities in the document are calculated, and the relation feature matrix is constructed from them. The specific procedure is the same as in the existing method described above.
After the relation feature matrix M is generated, step S403 is performed: the entity pairs that have relations are each modeled as a path dependency tree. Specifically, each entity pair that has a certain relation is modeled as a dependency tree whose root node is the entity pair itself and whose first-layer child nodes are the relation feature representations of the entity pairs that are directly connected to it in the horizontal and vertical directions of the adjacency matrix. "Having a certain relation" means the following: for a given entity pair $\langle e_s, e_o \rangle$ and a predefined set of relation types R, if a relation type in R holds between the two entities, the entity pair is said to have a certain relation; otherwise it is said to have no relation.
In step S404, an adjacency matrix between pairs of entities having a certain relationship is generated from the dependency tree.
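A minimal sketch of one way to build the first-order adjacency matrix over the related entity pairs follows. It assumes that "directly connected in the horizontal and vertical directions" means that two pairs lie in the same row (same subject) or the same column (same object) of the relation feature matrix; this reading, the function name `build_adjacency`, and the toy pairs are assumptions, not details confirmed by the source.

```python
import numpy as np


def build_adjacency(related_pairs):
    """Return A where A[i, j] = 1 if pairs i and j share a subject or an object.

    `related_pairs` lists only the (subject, object) index pairs that have a
    relation; pairs without a relation are simply left out.
    """
    count = len(related_pairs)
    adjacency = np.zeros((count, count), dtype=int)
    for i, (subj_i, obj_i) in enumerate(related_pairs):
        for j, (subj_j, obj_j) in enumerate(related_pairs):
            if i != j and (subj_i == subj_j or obj_i == obj_j):
                adjacency[i, j] = 1
    return adjacency


# (0, 1) and (0, 2) share a subject; (0, 2) and (3, 2) share an object.
pairs = [(0, 1), (0, 2), (3, 2)]
A = build_adjacency(pairs)
```

Because only related pairs are indexed, the matrix has one row per related pair rather than one per cell of the full relation feature matrix, which is where the reduction in computation described in the text would come from.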
After the adjacency matrix A is established, step S405 is executed: a visibility matrix is calculated from the adjacency matrix. Specifically, the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix is repeated until the pairs of relation features whose element in the n-th order adjacency matrix is 1 meet a preset condition;
the visibility matrix V is then calculated from the first n orders of adjacency matrices:
$V = A + A^{2} + \dots + A^{n}$
where $A$ denotes the first-order adjacency matrix, $A^{2}$ the second-order adjacency matrix, and $A^{n}$ the n-th order adjacency matrix, n being a natural number greater than or equal to 2.
An element equal to 1 in $A$ indicates first-order adjacency between two relation features, which corresponds to being visible in the horizontal and vertical directions of the relation matrix. The second-order adjacency matrix $A^{2}$ is then computed from the first-order adjacency matrix; an element equal to 1 in the second-order matrix indicates second-order adjacency between two relation features and can therefore represent two-hop reasoning in the document. By analogy, the third-order adjacency matrix $A^{3}$ can represent three-hop reasoning. Once the n-th order adjacency matrix has been computed, two relation features that are not linked by an element of 1 in the matrices up to order n are considered to have no obvious association, so higher-order matrices are not computed. Here, no obvious association means that the degree of association between the two relation features is below a preset threshold, whose value is set by the actual application scenario. The value of n must be predefined according to the actual application scenario; n is generally chosen to be less than or equal to 5. From the first n orders of adjacency matrices, the visibility matrix V can be calculated:
$V = A + A^{2} + \dots + A^{n}$
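The sum of matrix powers can be sketched in NumPy as follows. Besides V itself, the sketch records the smallest order at which each pair of relation features first becomes connected, since the later fusion step weights features by hop count. The function name `visibility_matrix`, the extra `hops` output, and the exclusion of the diagonal are assumptions. Note also that the entries of $A^{k}$ count walks and can exceed 1, so treating any positive entry as "visible" is an interpretation of the text rather than something it states.

```python
import numpy as np


def visibility_matrix(adjacency, max_order):
    """Return (V, hops) for V = A + A^2 + ... + A^max_order.

    hops[i, j] is the smallest k <= max_order with (A^k)[i, j] > 0,
    or 0 if features i and j are not connected within max_order hops.
    """
    A = np.asarray(adjacency, dtype=int)
    power = A.copy()
    V = A.copy()
    hops = np.where(A > 0, 1, 0)
    for order in range(2, max_order + 1):
        power = power @ A                      # A^order from A^(order-1)
        V = V + power
        newly_visible = (power > 0) & (hops == 0)
        hops[newly_visible] = order
    np.fill_diagonal(hops, 0)                  # a feature is not its own neighbour
    return V, hops


A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
V, hops = visibility_matrix(A, max_order=2)
```

In this toy graph, features 0 and 2 are not adjacent but share neighbour 1, so they become visible to each other at order 2.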
In step S406, the relation features associated with the relation feature of the target entity pair are fused through a self-attention mechanism. The elements equal to 1 in the visibility matrix V identify all relation features related to the target relation feature within n-hop reasoning. These relation features are considered to influence the classification of the target relation feature, so they can be fused by weighted attention. Because different relation features lie at different step numbers from the target, different weights are applied during fusion: the larger the step number, the smaller the weight (formula image BDA0003669961390000111 in the original publication).
Here $M'_{s,o}$ denotes the relation feature after fusion, $M_{s,o}$ denotes the relation feature before fusion, and $\alpha$, $\beta$, and $\gamma$ denote the weights corresponding to different step numbers, with $\alpha > \beta > \gamma$. $A_1$ denotes the attention coefficients over the first-order visible features, and $M_1$ denotes the first-order visible relation feature representations.
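A hedged NumPy sketch of hop-weighted attention fusion follows. The original formula is only available as an image, so several choices here are assumptions: the residual form (original feature plus weighted attention outputs), scaled dot-product attention within each hop group, and the names `fuse_with_hop_weights` and `hop_weights`. The `hops` matrix is the kind of hop-count matrix that can be derived from the adjacency powers.

```python
import numpy as np


def fuse_with_hop_weights(features, hops, hop_weights):
    """Fuse each relation feature with its visible neighbours, grouped by hop.

    features:    (n, d) relation features of the related entity pairs
    hops:        (n, n) hop counts, 0 meaning "not visible"
    hop_weights: {hop: weight}, with smaller weights for larger hops
    """
    M = np.asarray(features, dtype=float)
    count, dim = M.shape
    scores = M @ M.T / np.sqrt(dim)
    fused = M.copy()                                 # assumed residual term
    for hop, weight in hop_weights.items():
        for i in range(count):
            visible = np.flatnonzero(hops[i] == hop)
            if visible.size == 0:
                continue
            s = scores[i, visible]
            attention = np.exp(s - s.max())
            attention = attention / attention.sum()  # softmax within the hop group
            fused[i] += weight * (attention @ M[visible])
    return fused


features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
hops = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]])
fused = fuse_with_hop_weights(features, hops, {1: 0.5, 2: 0.25})
```

For feature 0, the one-hop neighbour (feature 1) contributes with weight 0.5 and the two-hop neighbour (feature 2) with weight 0.25, matching the rule that more distant features count for less.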
After the fused relation feature $M'_{s,o}$ is generated, step S407 is executed: the probability of each relation for the entity pair is calculated from the fused feature matrix, using the same method as described above (formula images BDA0003669961390000112, BDA0003669961390000113, and BDA0003669961390000114 in the original publication).
FIG. 5 shows the model used by the method. A Transformer-XL model is used to model the long text sequence of the document, so the text in different segments remains semantically connected and there is no upper limit on the length of the modeled text, which effectively avoids the semantic gaps caused by BERT modeling. Features at different step numbers are captured with the adjacency matrix method, so the modeling is clearly targeted: only entity pairs that have a relation are modeled, and entity pairs without a relation are discarded, which avoids introducing noise that would degrade model performance. Modeling only the related entity pairs also effectively reduces computational complexity and speeds up training and inference.
Referring now to FIG. 6, a specific embodiment of a storage device 600 is illustrated:
a storage device 600 having stored therein a set of instructions for performing:
Modeling the document-level long text with a Transformer-XL model. This overcomes the inability of BERT to model long sequences and preserves semantic continuity across the long sequence. The encoding process of Transformer-XL is as follows:
$$H = [h_1, h_2, \ldots, h_n] = \operatorname{Transformer-XL}([x_1, x_2, \ldots, x_n])$$
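To make the idea behind this encoding concrete, the following Python sketch shows segment-level recurrence, the mechanism by which Transformer-XL carries context across segment boundaries. It is a toy under strong assumptions: a single attention layer with no learned projections, no relative positional encoding, and keys reused as values. The function names and the four-token input are hypothetical.

```python
import math


def attend(query, keys):
    """Scaled dot-product attention with the keys doubling as values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w / total * key[i] for w, key in zip(weights, keys))
            for i in range(len(query))]


def encode_with_memory(embeddings, segment_len):
    """Encode a long sequence segment by segment, reusing cached states."""
    memory, hidden = [], []
    for start in range(0, len(embeddings), segment_len):
        segment = embeddings[start:start + segment_len]
        context = memory + segment
        # Position t sees the cached memory plus tokens up to t (causal).
        states = [attend(x, context[:len(memory) + t + 1])
                  for t, x in enumerate(segment)]
        hidden.extend(states)
        memory = states  # cache this segment's states for the next segment
    return hidden


# Hypothetical token embeddings x_1..x_4, split into two segments of two.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
H = encode_with_memory(tokens, segment_len=2)
# Encoding the second segment alone loses the context of the first one.
isolated = encode_with_memory(tokens[2:], segment_len=2)
```

Comparing `H[2]` with `isolated[0]` shows the effect of the cached memory: the same token receives a different hidden state once the previous segment is visible to it.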
Constructing an entity-pair relation feature matrix. Specifically, the embedded representations of all entities in the document are calculated, and the relation feature matrix is constructed from these embedded representations. The specific procedure is the same as in the existing method described above.
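As an illustration of this step, the Python sketch below builds entity embeddings and a pairwise relation feature matrix. The pooling and pairing operations are assumptions made for the example (mean pooling over mention tokens, concatenation of subject and object embeddings); the existing method referred to in the text may use different operations. All names are hypothetical.

```python
def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def relation_feature_matrix(token_states, entity_mentions):
    """Build entity embeddings and the pairwise relation feature matrix.

    token_states    -- encoder outputs h_1..h_n, one vector per token
    entity_mentions -- for each entity, the token indices of its mentions
    """
    entities = [mean_pool([token_states[t] for t in idxs])
                for idxs in entity_mentions]
    count = len(entities)
    # Assumed pair feature: concatenation [e_s ; e_o] of the two embeddings.
    matrix = [[entities[s] + entities[o] for o in range(count)]
              for s in range(count)]
    return entities, matrix


# Hypothetical encoder output for three tokens and two entities.
token_states = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
entities, M = relation_feature_matrix(token_states, [[0, 1], [2]])
```

Here `M[s][o]` plays the role of the relation feature $M_{s,o}$ for the entity pair with subject index `s` and object index `o`.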
After the relation feature matrix M is generated, each entity pair that has a relation is modeled as a path dependency tree. Specifically, for each entity pair that has a certain relation, the root node of its tree is the entity pair itself, and the first-layer child nodes are the relation feature representations in the adjacency matrix that are directly connected to the entity pair in the horizontal and vertical directions. Having a certain relation means: for a given entity pair $\langle e_s, e_o \rangle$ and a predefined set of relation types R, if a relation type in R holds between the entities of the pair, the pair is said to have a certain relation; otherwise the pair is said to have no relation.
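The first layer of these trees can be sketched in Python as follows. The sketch assumes that "directly connected in the horizontal and vertical directions" means that two related entity pairs share a row (same subject index) or a column (same object index) of the relation feature matrix; this reading, the function name and the example pairs are assumptions, not details confirmed by the text.

```python
from itertools import product


def build_pair_adjacency(related_pairs):
    """Return (nodes, A) for the entity pairs that have a relation.

    A[i][j] is 1 when pair i and pair j lie in the same row or the same
    column of the relation feature matrix, i.e. pair j is a first-layer
    child of pair i under the assumption stated above.
    """
    nodes = sorted(set(related_pairs))
    size = len(nodes)
    A = [[0] * size for _ in range(size)]
    for i, j in product(range(size), repeat=2):
        if i == j:
            continue  # a pair is the root of its own tree, not its child
        (s1, o1), (s2, o2) = nodes[i], nodes[j]
        if s1 == s2 or o1 == o2:
            A[i][j] = 1
    return nodes, A


# Hypothetical document: entities 0..3, three entity pairs with a relation.
nodes, A = build_pair_adjacency([(0, 1), (0, 2), (3, 2)])
```

In this example the pair (0, 1) is adjacent to (0, 2) through their shared row, and (0, 2) is adjacent to (3, 2) through their shared column, while (0, 1) and (3, 2) are not directly connected.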
An adjacency matrix A between the entity pairs that have a certain relation is then generated from the dependency trees, and a visible matrix is calculated from the adjacency matrix. Specifically: the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix is repeated until the two relation features marked by an element equal to 1 in the n-th order adjacency matrix meet a preset condition;
the visible matrix V is then calculated from the first n orders of adjacency matrices:
$$V = A + A^2 + \cdots + A^n$$
where $A$ denotes the first-order matrix, $A^2$ the second-order matrix and $A^n$ the n-th order matrix, and n is a natural number greater than or equal to 2.
Since an element equal to 1 in $A$ represents first-order adjacency between two relation features, it appears in the relation matrix as visibility in the horizontal and vertical directions. The second-order adjacency matrix $A^2$ is then calculated from the first-order adjacency matrix; an element equal to 1 in the second-order matrix represents second-order adjacency between two relation features and therefore corresponds to two-hop reasoning in the document. By analogy, the third-order adjacency matrix $A^3$ corresponds to three-hop reasoning. Once the n-th order adjacency matrix has been calculated, no further orders are calculated if two relation features whose element is not 1 in the subsequent adjacency matrices are considered to have no obvious relation. Having no obvious relation means that the degree of relation between the two relation features is below a preset threshold, whose value is set according to the actual application scenario. The value of n must be predefined and is likewise chosen according to the actual application scenario; n is generally chosen to be less than or equal to 5. The visible matrix V can then be calculated from the first n orders of adjacency matrices:
$$V = A + A^2 + \cdots + A^n$$
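The computation of V can be sketched in a few lines of Python. Because the text treats elements of V as 0 or 1, the sketch binarises every power and the running sum; that binarisation, the helper names and the three-node example are assumptions made for illustration.

```python
def matmul_bool(X, Y):
    """Boolean product of two square 0/1 matrices."""
    size = len(X)
    return [[1 if any(X[i][k] and Y[k][j] for k in range(size)) else 0
             for j in range(size)] for i in range(size)]


def visible_matrix(A, n_hops):
    """Return (V, hops): binarised V = A + A^2 + ... + A^n and each A^k."""
    size = len(A)
    power = [row[:] for row in A]
    hops = [power]
    V = [row[:] for row in A]
    for _ in range(2, n_hops + 1):
        power = matmul_bool(power, A)  # k-th order from (k-1)-th order
        hops.append(power)
        V = [[1 if V[i][j] or power[i][j] else 0 for j in range(size)]
             for i in range(size)]
    return V, hops


# Hypothetical chain of three relation features: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
V, hops = visible_matrix(A, n_hops=2)
```

In this chain, features 0 and 2 are not first-order adjacent (`A[0][2]` is 0), but `hops[1][0][2]` is 1, so they become visible to each other in V through two-hop reasoning.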
Fusing the relation features associated with the relation feature of the target entity pair through a self-attention mechanism. An element equal to 1 in the visible matrix V marks a relation feature that is related to the target relation feature within n-hop reasoning. Because these relation features are considered to influence the classification of the target relation feature, they are fused by weighted attention. Different relation features lie at different step counts from the target, so a different weight is applied to each step count during fusion: the larger the step count, the smaller the weight.
Figure BDA0003669961390000131
Here $M'_{s,o}$ denotes the relation feature after fusion and $M_{s,o}$ the relation feature before fusion; $\alpha$, $\beta$ and $\gamma$ denote the weights corresponding to the different step counts, with $\alpha > \beta > \gamma$; $A_1$ denotes the attention coefficient over the first-order visible matrix; and $M_1$ denotes the first-order visible representation of the relation feature.
After the fused relation feature $M'_{s,o}$ is generated, the probability of each relation for the entity pair is calculated from the fused feature matrix, using the same method as described above:
Figure BDA0003669961390000132
Figure BDA0003669961390000133
Figure BDA0003669961390000134
The instructions executed from the instruction set of the storage device 600 use a Transformer-XL model to encode the long text sequence of a document, so that text in different segments remains semantically connected and no upper limit is imposed on the length of the modeled text. This avoids the semantic breaks that BERT-based modeling introduces between segments. Features at different step counts are captured with the adjacency-matrix method, which gives the model a clear modeling target. Only entity pairs that have a certain relation are modeled, and entity pairs without a relation are discarded; this avoids introducing noise that would degrade the performance of the model, and it also reduces computational complexity and speeds up training and inference.
Finally, it should be noted that although the above embodiments have been described in the text and drawings of the present application, the scope of patent protection of the present application is not limited thereby. All technical solutions obtained by equivalent replacement or modification of structures or flows based on the content of the text and drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the scope of protection of the present application.

Claims (10)

1. A document level relation extraction method based on an adjacency matrix is characterized by comprising the following steps:
modeling the long text of the document level through a Transformer-XL model;
constructing an entity pair relation characteristic matrix;
modeling the entity pairs with the relations into a path dependency tree respectively;
generating, according to the dependency tree, an adjacency matrix between the entity pairs that have a certain relation;
calculating a visibility matrix from the adjacency matrix;
fusing the relation characteristics related to the relation characteristics of the target entity pair through a self-attention mechanism;
and calculating the probability of the corresponding relation of the entity pair according to the fused feature matrix.
2. The method for extracting document level relationship based on adjacency matrix according to claim 1, wherein the calculating a visible matrix according to the adjacency matrix further comprises:
repeating the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix until the two relation features marked by an element equal to 1 in the n-th order adjacency matrix meet a preset condition;
calculating a visible matrix V from the first n orders of adjacency matrices:
$$V = A + A^2 + \cdots + A^n$$
wherein $A$ represents the first-order matrix, $A^2$ represents the second-order matrix, $A^n$ represents the n-th order matrix, and n is a natural number greater than or equal to 2.
3. The method according to claim 1, wherein the fusing the relationship features associated with the target relationship features through weighted attention further comprises:
and determining different weights according to the difference of the step numbers among different relation characteristics, wherein the longer the step number is, the smaller the weight is.
4. The method according to claim 1, wherein the root node of the dependency tree is a corresponding entity pair, and the first-level node represents an entity-pair relationship feature representation in the adjacency matrix, which is directly related to the entity pair in the horizontal and vertical directions.
5. The method for extracting document level relationship based on adjacency matrix according to claim 1, wherein the constructing of the relationship feature matrix further comprises:
and calculating all entity embedded representations in the document, and constructing a relation characteristic matrix according to the embedded representations.
6. A storage device having a set of instructions stored therein, the set of instructions being operable to perform:
modeling the long text of the document level through a Transformer-XL model;
constructing an entity pair relation characteristic matrix;
modeling the entity pairs with the relations into a path dependency tree respectively;
generating, according to the dependency tree, an adjacency matrix between the entity pairs that have a certain relation;
calculating a visibility matrix from the adjacency matrix;
fusing the relation characteristics related to the relation characteristics of the target entity pair through a self-attention mechanism;
and calculating the probability of the corresponding relation of the entity pair according to the fused feature matrix.
7. The storage device of claim 6, wherein the set of instructions is further configured to perform: said calculating a visibility matrix from said adjacency matrix further comprising:
repeating the step of calculating the n-th order adjacency matrix from the (n-1)-th order matrix until the two relation features marked by an element equal to 1 in the n-th order adjacency matrix meet a preset condition;
calculating a visible matrix V from the first n orders of adjacency matrices:
$$V = A + A^2 + \cdots + A^n$$
wherein $A$ represents the first-order matrix, $A^2$ represents the second-order matrix, $A^n$ represents the n-th order matrix, and n is a natural number greater than or equal to 2.
8. The storage device of claim 6, wherein the set of instructions is further configured to perform:
the fusing the relationship features associated with the target relationship features through weighted attention further comprises:
and determining different weights according to the difference of the step numbers among different relation characteristics, wherein the longer the step number is, the smaller the weight is.
9. The storage device according to claim 6, wherein the root node of the dependency tree is a corresponding entity pair, and the first-level node represents an entity-pair relationship feature representation in the adjacency matrix, which is directly related to the entity pair in the horizontal and vertical directions.
10. The storage device of claim 6, wherein the set of instructions is further configured to perform:
the constructing of the relationship feature matrix further comprises:
and calculating all entity embedded representations in the document, and constructing a relation characteristic matrix according to the embedded representations.
CN202210602851.8A 2022-05-30 2022-05-30 Document level relation extraction method based on adjacency matrix and storage device Pending CN115081392A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210602851.8A CN115081392A (en) 2022-05-30 2022-05-30 Document level relation extraction method based on adjacency matrix and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210602851.8A CN115081392A (en) 2022-05-30 2022-05-30 Document level relation extraction method based on adjacency matrix and storage device

Publications (1)

Publication Number Publication Date
CN115081392A true CN115081392A (en) 2022-09-20

Family

ID=83250131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210602851.8A Pending CN115081392A (en) 2022-05-30 2022-05-30 Document level relation extraction method based on adjacency matrix and storage device

Country Status (1)

Country Link
CN (1) CN115081392A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521888A (en) * 2023-03-20 2023-08-01 麦博(上海)健康科技有限公司 Method for extracting medical long document cross-sentence relation based on DocRE model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782768A (en) * 2020-06-30 2020-10-16 首都师范大学 Fine-grained entity identification method based on hyperbolic space representation and label text interaction
CN111967258A (en) * 2020-07-13 2020-11-20 中国科学院计算技术研究所 Method for constructing coreference resolution model, coreference resolution method and medium
CN113282750A (en) * 2021-05-27 2021-08-20 成都数之联科技有限公司 Model training method, system, device and medium
CN114153942A (en) * 2021-11-17 2022-03-08 中国人民解放军国防科技大学 Event time sequence relation extraction method based on dynamic attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context", HTTPS://DOI.ORG/10.48550/ARXIV.1901.02860, 2 June 2019 (2019-06-02) *
汤凌燕: "基于深度学习的短文本情感倾向分析综述", 计算机科学与探索, vol. 15, no. 5, 31 May 2021 (2021-05-31) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination