CN117408247B - Intelligent manufacturing triplet extraction method based on relational pointer network - Google Patents


Info

Publication number
CN117408247B
CN117408247B (application CN202311726555.XA)
Authority
CN
China
Prior art keywords
entity
relation
head
text
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311726555.XA
Other languages
Chinese (zh)
Other versions
CN117408247A (en)
Inventor
亓晋
刘晨雅
孙雁飞
郭宇锋
胡筱旋
董振江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202311726555.XA
Publication of CN117408247A
Application granted
Publication of CN117408247B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/288: Entity relationship models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation
    • G06N5/022: Knowledge engineering; Knowledge acquisition
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention belongs to the field of intelligent manufacturing triplet extraction, and discloses an intelligent manufacturing triplet extraction method based on a relation pointer network.

Description

Intelligent manufacturing triplet extraction method based on relational pointer network
Technical Field
The invention belongs to the field of intelligent manufacturing triplet extraction, and particularly relates to an intelligent manufacturing triplet extraction method based on a relation pointer network.
Background
With the rapid development of the new generation of information technology, the amount and complexity of information have increased dramatically. More efficient ways are needed to organize, retrieve and understand large amounts of information, and conventional relational databases and search engines are limited in this regard. A knowledge graph is a semantic knowledge base with a directed graph structure; it can extract useful information from redundant data and knowledge texts and effectively express the internal relations between data. Knowledge graphs allow knowledge from different domains to be integrated into a unified structure, which is very effective for complex problem solving and knowledge discovery spanning multiple fields. In constructing a large-scale domain knowledge graph, triplet extraction is one of the key technologies, used to extract entities and relations from massive texts.
In a knowledge graph, a triplet is a data structure of the form (head entity, relation, tail entity). Triplet extraction is currently divided into pipeline extraction and joint extraction. Pipeline extraction splits the task into two independent sub-tasks: entity recognition is performed first and relation classification second; the two tasks do not interact, so errors produced by entity recognition are propagated to relation classification. Joint extraction models the entity recognition and relation extraction sub-models uniformly; compared with pipeline extraction, it makes further use of the inherent connections and dependencies between the two tasks and thus alleviates error accumulation to a certain extent.
Chinese patent application No. CN2021111821736 discloses a Chinese triplet extraction method based on Bidirectional Encoder Representations from Transformers (BERT), which can fully describe the feature relations among the character, word and sentence levels. However, this technique uses pipeline extraction, identifying entities first and relations second, so the inherent connections and dependencies between the two tasks cannot be fully exploited; recognizing entity pairs that have no relation increases the error rate, wastes computation and introduces redundant information, and the technique cannot handle complex scenes such as overlapping relations and entities sharing a head.
Chinese patent application No. CN202111658767X provides a method for extracting relation triplets based on a cascade binary tagging framework, which models a relation as a mapping from head entity to tail entity within a sentence, i.e., f_relation(head entity) = tail entity, thereby solving the relation overlap problem; instead of assigning discrete relation labels to entity pairs with a single-label model, a multi-label binary tagging framework marks the start and end positions of entities, which alleviates the sample imbalance problem. However, this technique suffers from relation redundancy, wasted computation and extra redundant information, and cannot handle complex scenes such as overlapping relations and entities sharing a head. For example, a relation set may contain hundreds of relations while the relation density is very low, with on average one relation appearing per sentence of text; for each head entity the method must then predict the tail entity information corresponding to hundreds of relations, producing many redundant relation judgments.
In summary, there are the overlap problem of one entity pair corresponding to multiple relations ("one-to-many" entity pairs), the overlap problem of a single entity participating in multiple relations ("many-to-one"), and the nesting of entities themselves in the text, so the triplet extraction task faces great challenges.
Disclosure of Invention
Aiming at the above technical problems and demands, the invention provides an intelligent manufacturing triplet extraction method based on a relation pointer network. It predicts the potential relations in the data with an average pooling network and a fully connected neural network, solving the problem of redundant relation computation in the subsequent entity recognition process; it calculates entity positions with two span-based double-layer pointer networks, solving the problems of entity nesting and relation overlap and improving the accuracy of intelligent manufacturing entity labeling; and it finally matches entity pairs based on the potential relations with the relation pointer network, so that intelligent manufacturing triplets are correctly extracted and the efficiency and quality of the triplet extraction task are improved.
In order to achieve the above purpose, the invention is realized by the following technical scheme:
the invention relates to an intelligent manufacturing triplet extraction method based on a relation pointer network, comprising a context pre-training model encoding module, a potential relation prediction module, a head entity decoding module and a head-tail entity alignment module. The context pre-training model encoding module is used for acquiring the semantic feature representation of the intelligent text data: the context pre-training model learns the context information of each word in the text to obtain the semantic vector of the text. The potential relation prediction module is used for constructing the intelligent manufacturing relation candidate set and predicting the potential relations existing in the intelligent manufacturing text. The head entity decoding module is used for realizing entity span extraction and determining the start and end positions of the entity. The head-tail entity alignment module is used for realizing the matching of intelligent manufacturing triplets: for each head entity, the potential relations in the candidate relation set are traversed, it is checked whether a tail entity related to the head entity exists, and if so, the {head entity, relation, tail entity} triplet is output.
The specific triplet extraction method comprises the following steps:
step 1, preparing text data and defining the relation set R existing in the text data, wherein the input is a text sentence;
step 2, adopting a context pre-training model coding module to code the text data in the step 1;
step 3, predicting the candidate relation set R' in the text data by utilizing the potential relation prediction module, wherein R' is a subset of the relation set R defined in step 1, to obtain p_rel;
Step 4, setting a relation threshold p'_rel; when p_rel obtained in step 3 is greater than p'_rel, the relation is considered to be contained in a triplet, the corresponding relation is marked as 1 and the rest as 0, and the marked relations are recorded into the candidate relation set R'; the candidate relation set R' is smaller than the relation set R;
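Steps 3 and 4 can be sketched as follows; a minimal illustration in Python, assuming p_rel is already a vector of per-relation probabilities (the relation names and the threshold value are hypothetical):

```python
# Threshold per-relation probabilities (step 4): keep only relations whose
# predicted probability exceeds the relation threshold p'_rel.
def build_candidate_relations(relations, p_rel, threshold=0.5):
    """Return the candidate relation set R' and the 0/1 marks over R."""
    marks = [1 if p > threshold else 0 for p in p_rel]             # 1 = kept, 0 = dropped
    candidates = [r for r, m in zip(relations, marks) if m == 1]   # R', a subset of R
    return candidates, marks

# Hypothetical relation set R for an intelligent-manufacturing corpus.
R = ["produces", "located_in", "part_of", "uses_material"]
p_rel = [0.91, 0.12, 0.78, 0.05]

R_prime, marks = build_candidate_relations(R, p_rel, threshold=0.5)
```

Only the relations in R' are traversed later by the head-tail entity alignment module, which is where the reduction in redundant relation judgments comes from.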
Step 5, predicting the start position and the end position of the head entity by using the head entity decoding module, and determining the head entity s existing in the text according to the natural continuity of the head entity span; the probabilities that the i-th token of the input sequence X is the start and end position of the head entity s are denoted p_i^{start_s} and p_i^{end_s};
Step 6, if p_i^{start_s} or p_i^{end_s} obtained in step 5 is greater than a preset threshold, the corresponding token is marked as 1 and regarded as the start or end position of the head entity;
step 7, traversing the relations in the candidate relation set R' by using the head-tail entity alignment module and calculating whether a tail entity o related to the head entity s exists; the probabilities that the i-th token is the start and end position of the tail entity are p_i^{start_o} = σ(W_r^{start}(h'_i + h_avg) + b_r^{start}) and p_i^{end_o} = σ(W_r^{end}(h'_i + h_avg) + b_r^{end}), respectively;
Step 8, if p_i^{start_o} obtained in step 7 is greater than a preset value, the corresponding token is marked as 1 and regarded as the start position of the tail entity; if p_i^{end_o} is greater than the preset value, the corresponding token is marked as 1 and regarded as the end position of the tail entity; the {head entity, relation, tail entity} triplet is then successfully matched and output.
The invention further improves that: the step 2 specifically comprises the following steps:
step 2.1, firstly, each sub-word of the input text sentence X is converted into a word vector and a position vector; the word vectors and position vectors are input into 12 double-layer Transformer blocks for feature extraction;
step 2.2, the Transformer blocks learn information through a multi-head self-attention mechanism, each Transformer block passes the learned information through a fully connected layer, and finally the semantic vector h ∈ ℝ^{n×d} is output, where d represents the dimension of the last hidden layer of the BERT model.
The invention further improves that: the step 3 is specifically as follows:
step 3.1, the potential relation prediction module pushes the semantic vector h of the text output by the context pre-training model encoding module into a global average pooling layer;
step 3.2, inputting a fully-connected neural network;
step 3.3, finally calculating the probability of each relation through an activation function to obtain p_rel.
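A minimal numpy sketch of steps 3.1 to 3.3, with hypothetical dimensions and randomly initialized weights standing in for the trained fully connected layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_relations(h, W, b):
    """h: (n, d) token semantic vectors; returns per-relation probabilities p_rel."""
    pooled = h.mean(axis=0)    # global average pooling over the n tokens -> (d,)
    logits = W @ pooled + b    # fully connected layer -> (num_relations,)
    return sigmoid(logits)     # activation yields the probability of each relation

rng = np.random.default_rng(0)
n, d, num_rel = 10, 16, 4      # hypothetical sentence length / hidden dim / |R|
h = rng.normal(size=(n, d))
W = rng.normal(size=(num_rel, d))
b = np.zeros(num_rel)

p_rel = predict_relations(h, W, b)
```

The pooling collapses the token dimension, so the classifier scores relations for the sentence as a whole rather than per token.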
The invention further improves that: the step 5 specifically comprises the following steps:
step 5.1, the head entity decoding module inputs the semantic vector h output by the context pre-training model encoding module into a GRU to obtain h' = GRU(h);
Step 5.2, inputting two identical double-layer pointer networks for marking the starting position and the ending position of the entity, wherein the nonlinearity of the network is enhanced by using a ReLU activation function in the middle of the two-layer pointer networks;
step 5.3, calculating the probabilities: p_i^{start_s} = σ(W^{start} h'_i + b^{start}) and p_i^{end_s} = σ(W^{end} h'_i + b^{end}), which respectively represent the probabilities that the i-th token of the input sequence is the start and end position of the head entity; W^{(·)} is a learnable weight, b^{(·)} denotes a bias, and σ represents the activation function.
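The double-layer pointer scoring of steps 5.2 and 5.3 can be sketched in numpy as follows; the hidden size is hypothetical, and the ReLU between the two layers is the one named in step 5.2:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pointer_scores(h_prime, W1, b1, W2, b2):
    """h_prime: (n, d) GRU outputs; returns (n,) per-token start (or end) probabilities."""
    hidden = np.maximum(0.0, h_prime @ W1.T + b1)  # first pointer layer + ReLU
    logits = hidden @ W2.T + b2                    # second pointer layer -> (n, 1)
    return sigmoid(logits).ravel()                 # p_i^{start_s} (or p_i^{end_s})

rng = np.random.default_rng(1)
n, d, hidden_dim = 10, 16, 8
h_prime = rng.normal(size=(n, d))
# Two identical networks: one marks start positions, the other end positions.
start_net = [rng.normal(size=(hidden_dim, d)), np.zeros(hidden_dim),
             rng.normal(size=(1, hidden_dim)), np.zeros(1)]
end_net   = [rng.normal(size=(hidden_dim, d)), np.zeros(hidden_dim),
             rng.normal(size=(1, hidden_dim)), np.zeros(1)]

p_start = pointer_scores(h_prime, *start_net)
p_end = pointer_scores(h_prime, *end_net)
```

Tokens whose probability exceeds the preset threshold are then marked 1 as in step 6.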
The invention further improves that: the model is trained jointly by sharing parameters during model training, and the combined objective function is optimized during training; the total loss is divided into three parts, namely the relation prediction loss, the head entity recognition loss and the tail entity recognition loss, and the total loss is the sum of the three parts: L = L_rel + L_head + L_tail.
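With binary cross-entropy for each of the three parts (a common choice for such binary taggers; the patent text does not name the per-part loss), the joint objective can be sketched as:

```python
import numpy as np

def bce(p, y, eps=1e-9):
    """Mean binary cross-entropy between predicted probabilities p and 0/1 labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def total_loss(rel_pred, rel_gold, head_pred, head_gold, tail_pred, tail_gold):
    # Equal weights for the three parts, as in the joint training described above.
    return (bce(rel_pred, rel_gold)
            + bce(head_pred, head_gold)
            + bce(tail_pred, tail_gold))

loss = total_loss(np.array([0.9, 0.1]), np.array([1, 0]),
                  np.array([0.8, 0.2, 0.7]), np.array([1, 0, 1]),
                  np.array([0.6, 0.4]), np.array([1, 0]))
```

Because the encoder parameters are shared, gradients from all three parts flow into the same BERT/GRU representations.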
The beneficial effects of the invention are as follows:
(1) The invention realizes a triplet extraction model based on the relation pointer network, improves the quality of triplet extraction, provides new impetus for the full deployment of knowledge graphs, and helps enterprises achieve resource aggregation and optimization faster.
(2) According to the method, the potential relation in the text data is predicted by utilizing the neural network, and the entity pair matching is performed by the pointer network based on the relation, so that the problem of relation redundancy is effectively solved, and the calculated amount in the head-to-tail entity alignment process is greatly reduced by constructing the candidate relation set.
(3) The invention calculates intelligent manufacturing entity positions by combining a GRU with the double-layer pointer network, improving the accuracy of intelligent manufacturing entity labeling. Through the dimension-raising and dimension-reducing operations of the two fully connected layers, features are combined, the discriminative ability of the model is improved, combined features with low discriminability are removed, and entity representations are learned effectively.
Drawings
FIG. 1 is a flow chart of the intelligent manufacturing triplet extraction method of the present invention based on a potential relationship and pointer network.
FIG. 2 is a block diagram of an intelligent manufacturing triplet extraction method based on a potential relationship and pointer network of the present invention.
Detailed Description
Embodiments of the invention are disclosed in the drawings, and for purposes of explanation, numerous practical details are set forth in the following description. However, it should be understood that these practical details are not to be taken as limiting the invention. That is, in some embodiments of the invention, these practical details are unnecessary.
The invention relates to an intelligent manufacturing triplet extraction method based on a relation pointer network. It uses the BERT pre-training model in place of the existing word2vec (Word Representations in Vector Space) word vector generation model, overcoming the shortcomings of traditional triplet extraction methods: the word vectors obtained from the BERT model generalize better and can fully describe the feature relations among the character, word and sentence levels. Bidirectional gated recurrent units (Bi-GRU) are used as the network structure, combined with an attention mechanism for relation extraction, which not only improves the accuracy of relation extraction but also widens the application range of the BERT pre-training model. The invention also processes the extracted sentences through the BERT pre-training model to obtain encoded vectors of the semantic feature representation of the sentences; decodes the output encoded vectors and identifies the start and end position labels of head entities, thereby obtaining the feature vector matrices of all possible head entities and their corresponding tokens in the sentence; averages the vectors of the tokens in the feature vector matrix to obtain the head entity feature vector, and fuses it with the output decoding vectors to obtain a fused vector. Based on the fused vector and a given set of relations, the start and end position labels of the tail entity for each relation are identified, all relations and tail entities related to the head entity are recognized, and the relation triplets are finally extracted.
Aiming at the problems of entity overlapping and relation redundancy in text data, a relation pointer network-based triplet extraction method is introduced, and the method comprises a context pre-training model coding module, a potential relation prediction module, a head entity decoding module and a head and tail entity alignment module, wherein the whole framework is shown in figure 1.
The context pre-training model coding module is used for obtaining semantic feature representation of intelligent text data, and learning context information of each word in the text by using the context pre-training model to obtain semantic vectors of the text. And mapping the words in the text into corresponding word vectors, and inputting the corresponding word vectors into the BERT pre-training model. The embedded vector obtained through the BERT pre-training model has stronger generalization capability, can fully describe the characteristic relation in the text, has better global expression effect, and the text semantic vector output by the pre-training model coding module is used as the input of the next module.
The potential relation prediction module is used for constructing the intelligent manufacturing relation candidate set and predicting the potential relations existing in the intelligent manufacturing text. The number of potential relations is far smaller than the number of relations in the predefined relation set, so the head-tail entity alignment module performs entity alignment only for the relations in the candidate set rather than computing over all relations, which reduces redundant relation judgments and the redundancy of relation calculation. The text semantic vector h output by the context pre-training model encoding module is input into a global average pooling (GAP) layer, which removes redundant information while compressing features, reduces the number of parameters and the amount of computation, and effectively suppresses overfitting. A binary classifier then yields the probability p_rel of each relation; when p_rel is greater than a preset threshold, the relation is considered to be contained, the corresponding relation is marked as 1 and put into the intelligent manufacturing relation candidate set, and the rest are marked as 0.
The head entity decoding module is used for realizing entity span extraction and determining the start and end positions of the entity. The semantic vector h of the text output by the pre-training model encoding module is input into a gated recurrent unit (GRU), which captures short-term dependencies in the sequence through its reset gate and long-term dependencies through its update gate, yielding h' = GRU(h). The start and end positions of the entity are then calculated using a span-based double-layer pointer network: p_i^{start_s} = σ(W^{start} h'_i + b^{start}) and p_i^{end_s} = σ(W^{end} h'_i + b^{end}), which respectively represent the probabilities that the i-th token of the input sequence is the start and end position of the head entity. If p_i^{start_s} is greater than a preset value, the corresponding token is marked as 1 and regarded as the start position of the head entity; if p_i^{end_s} is greater than the preset value, the corresponding token is marked as 1 and regarded as the end position of the head entity.
The head-tail entity alignment module is used for realizing the matching of intelligent manufacturing triplets: for each head entity, the potential relations in the candidate relation set are traversed and it is checked whether a tail entity related to that head entity exists; that is, the tail entity is sought given the head entity and the relation, and if it exists, the {head entity, relation, tail entity} triplet is output. Head and tail entities are aligned on the basis of the potential relations using the relation pointer network. When decoding and identifying the tail entity, in order to further use the global context information of the text, not only the GRU output h' but also the head entity features obtained in the head entity decoding module must be considered; therefore, the average h_avg of all word vectors of the head entity is added when calculating the tail entity corresponding to that head entity, and the position of each tail entity is calculated from the fused feature vector. The probabilities that the i-th token is the start and end position of the tail entity are p_i^{start_o} = σ(W_r^{start}(h'_i + h_avg) + b_r^{start}) and p_i^{end_o} = σ(W_r^{end}(h'_i + h_avg) + b_r^{end}). If p_i^{start_o} is greater than a preset value, the corresponding token is marked as 1 and regarded as the start position of the tail entity; if p_i^{end_o} is greater than the preset value, the corresponding token is marked as 1 and regarded as the end position of the tail entity.
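A numpy sketch of the fusion used for tail-entity scoring, with hypothetical dimensions: the head entity's averaged word vectors h_avg are added to every GRU output h'_i before the relation-specific pointer layer scores each token.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tail_scores(h_prime, head_span, W_r, b_r):
    """Score each token as a tail-entity start (or end) for one candidate relation.

    h_prime: (n, d) GRU outputs; head_span: inclusive (start, end) indices of the head entity.
    """
    s, e = head_span
    h_avg = h_prime[s:e + 1].mean(axis=0)   # average of the head entity's word vectors
    fused = h_prime + h_avg                 # broadcast h_avg onto every token vector
    return sigmoid(fused @ W_r + b_r)       # (n,) per-token probabilities

rng = np.random.default_rng(2)
n, d = 10, 16
h_prime = rng.normal(size=(n, d))
W_r = rng.normal(size=d)                    # relation-specific pointer weights
b_r = 0.0

p_tail_start = tail_scores(h_prime, head_span=(0, 1), W_r=W_r, b_r=b_r)
```

In the full model there is one such scoring pass per candidate relation and per direction (start and end), each with its own W_r and b_r.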
Specifically, the intelligent manufacturing triplet extraction method based on the relation pointer network comprises the following steps:
step 1, preparing enough text data and defining the relation set existing in the text dataWherein the input is a text sentence, wherein there may be multiple triples, and there may be overlapping situations of entities and relationships.
Step 2, the text data of step 1 is encoded with the context pre-training model encoding module. First, the input sentence X is converted into word vectors and position vectors, which are input into 12 double-layer Transformer blocks for feature extraction; the Transformer blocks learn information through a multi-head self-attention mechanism, each Transformer block passes the learned information through a fully connected layer, and finally the semantic vector h ∈ ℝ^{n×d} is output, where d represents the dimension of the last hidden layer of the BERT model.
Step 3, the candidate relation set R' in the text is predicted with the potential relation prediction module; R' is a subset of the relation set R defined in step 1. The module pushes the semantic vector h output by the encoding module into a global average pooling layer, inputs the result into a fully connected neural network, and finally calculates the probability of each relation through an activation function to obtain p_rel.
Step 4, a relation threshold p'_rel is set; when p_rel obtained in step 3 is greater than p'_rel, the relation is considered to be contained, the corresponding relation is marked as 1 and the rest as 0, and the marked relations are recorded into the candidate relation set R'; the candidate relation set R' is smaller than the relation set R.
Step 5, the start and end positions of the head entity are predicted with the head entity decoding module, and the head entity s existing in the text is determined according to the natural continuity of the head entity span. The module inputs the semantic vector h output by the context pre-training model encoding module into a GRU to obtain h' = GRU(h), then inputs it into two identical double-layer pointer networks that mark the start and end positions of the entity, with a ReLU activation function between the two pointer layers enhancing the nonlinearity of the network, and finally calculates the probabilities: p_i^{start_s} = σ(W^{start} h'_i + b^{start}) and p_i^{end_s} = σ(W^{end} h'_i + b^{end}), which respectively represent the probabilities that the i-th token of the input sequence is the start and end position of the head entity; W^{(·)} is a learnable weight, b^{(·)} denotes a bias, and σ represents the activation function.
Step 6, if p_i^{start_s} or p_i^{end_s} obtained in step 5 is greater than a preset threshold, the corresponding token is marked as 1 and regarded as the start or end position of the head entity.
step 7, the relations in the candidate relation set R' are traversed with the head-tail entity alignment module, and it is calculated whether a tail entity o related to the head entity s exists. Entity pairs are matched with the relation-based pointer network; to further use the global information in the text, both the GRU output h' and the average h_avg of all word vectors of the head entity are considered when decoding and identifying the tail entity. The probabilities that the i-th token is the start and end position of the tail entity are p_i^{start_o} = σ(W_r^{start}(h'_i + h_avg) + b_r^{start}) and p_i^{end_o} = σ(W_r^{end}(h'_i + h_avg) + b_r^{end}), respectively.
Step 8, if p_i^{start_o} obtained in step 7 is greater than a preset value, the corresponding token is marked as 1 and regarded as the start position of the tail entity; if p_i^{end_o} is greater than the preset value, the corresponding token is marked as 1 and regarded as the end position of the tail entity; the {head entity, relation, tail entity} triplet is then successfully matched and output.
Examples
The invention relates to an intelligent manufacturing triplet extraction method based on a relation pointer network, with the following concrete flow. Character vectors of the intelligent manufacturing text are obtained with the pre-trained language model BERT: the intelligent manufacturing text, as a character sequence X = {x_1, ..., x_n}, is input into BERT, whose Transformer encoder uses a self-attention mechanism, residual connections and layer normalization, and a feed-forward neural network to represent each character x_i: Attention(Q, K, V) = softmax(QK^T / √d_k)V, where {Q, K, V} are the input matrices and d_k is the dimension of the input vectors; the output vector h is finally obtained.
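The self-attention step above reduces to the standard scaled dot-product form; a numpy sketch with hypothetical shapes (single head, no learned projections):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (n, n) token-to-token scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(3)
n, d_k = 6, 8
Q, K, V = (rng.normal(size=(n, d_k)) for _ in range(3))
out, weights = attention(Q, K, V)
```

In BERT this is applied per head with learned Q/K/V projections, followed by the residual, layer-norm and feed-forward steps named above.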
h is input into an average pooling layer, and the probability p_rel of each relation is calculated with a fully connected neural network and the sigmoid activation function. The potential relations existing in the sentence are predicted and collected into the candidate relation set. Suppose the sentence contains three potential relations; when entity matching is performed, only the entity pairs corresponding to those three potential relations are calculated, instead of the tail entities corresponding to all relations in the relation set.
The head entity decoding module decodes h: the semantic vector h of the intelligent manufacturing text output by the pre-training model encoding module is input into the GRU to obtain h' = GRU(h), and the start and end positions of head entities are predicted with the double-layer labeling scheme to form the candidate head entity set. Suppose the start-position decoding vector output for a sentence is [1,0,0,0,0,0,1,0,1,0] and the end-position decoding vector is [0,1,1,0,0,0,0,1,0,1]; the head entities then span tokens 1-2, 7-8 and 9-10 (pairing each start with the nearest subsequent end).
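Decoding head-entity spans from the two 0/1 vectors in this example can be sketched as follows; matching each start marker with the nearest subsequent end marker is one common pairing rule, since the patent does not spell the rule out:

```python
def decode_spans(starts, ends):
    """Pair each start marker with the nearest end marker at or after it (0-based)."""
    spans = []
    for i, s in enumerate(starts):
        if s != 1:
            continue
        for j in range(i, len(ends)):
            if ends[j] == 1:
                spans.append((i, j))   # inclusive token span [i, j]
                break
    return spans

start_vec = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]   # start-position decoding vector
end_vec   = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1]   # end-position decoding vector
spans = decode_spans(start_vec, end_vec)
```

The recovered 0-based spans (0,1), (6,7) and (8,9) correspond to the 1-based token ranges 1-2, 7-8 and 9-10 in the worked example.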
For each entity in the candidate head entity set, the relations in the candidate relation set are traversed; similarly, the tail entities related to the head entity are calculated by combining the GRU with the double-layer pointer network labeling scheme, the tail entity under the corresponding relation is determined, and if the matching succeeds the {head entity, relation, tail entity} triplet is output.
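The alignment loop of this step, sketched in pure Python: for each candidate head entity and each candidate relation, a tail entity is sought and a triplet emitted on success. The tail lookup here is a stub table with hypothetical intelligent-manufacturing entities, standing in for the GRU plus pointer-network computation:

```python
def extract_triples(head_entities, candidate_relations, find_tail):
    """Traverse R' for each head entity; emit (head, relation, tail) on a match."""
    triples = []
    for head in head_entities:
        for relation in candidate_relations:
            tail = find_tail(head, relation)   # pointer-network decode in the real model
            if tail is not None:
                triples.append((head, relation, tail))
    return triples

# Stub tail lookup over hypothetical entities and relations.
tail_table = {("CNC lathe", "produces"): "machined part",
              ("CNC lathe", "uses_material"): "steel bar"}
heads = ["CNC lathe", "steel bar"]
relations = ["produces", "uses_material"]

triples = extract_triples(heads, relations, lambda h, r: tail_table.get((h, r)))
```

Because only the candidate relations are traversed, the inner loop runs |R'| times per head entity instead of |R| times, which is the redundancy saving described above.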
The model is trained jointly by sharing parameters during model training, and the combined objective function is optimized during training. The total loss can be divided into three parts, namely the relation prediction loss, the head entity recognition loss and the tail entity recognition loss; the total loss is the sum of the three parts, L = L_rel + L_head + L_tail, and equal weights are assigned here.
In order to verify the effect of the present invention, experiments were performed to evaluate models using Precision (P), recall (R), and F1 values. The calculation methods of the accuracy, recall and F1 values are shown below.
The experiment used a WebNLG dataset containing 246 relationships, with the training set containing 5019 triples and the test set containing 703 triples. The data set has rich relationship types and complex entity and relationship matching scenes. The WebNLG dataset triplet distribution statistics are shown in table 1 and the experimental results are shown in table 2.
TABLE 1
TABLE 2
The experimental results show that the triplet extraction of the invention on the WebNLG dataset achieves a precision of 90.5%, a recall of 92.2%, and an F1 value of 91.4%, an improvement over the other three comparison models.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present invention, should be included in the scope of the claims of the present invention.

Claims (3)

1. An intelligent manufacturing triplet extraction method based on a relation pointer network is characterized by comprising the following steps of: the intelligent manufacturing triplet extraction method specifically comprises the following steps:
step 1, preparing text data, and defining a relation set R existing in the text data, wherein the text data is input into a text sentence;
step 2, adopting a context pre-training model coding module to code the text data in the step 1;
step 3, predicting a candidate relation set R' in the text data by utilizing the potential relation prediction module to obtain p_rel, wherein the candidate relation set R' is a subset of the relation set R defined in step 1;
Step 4, setting a relation threshold p' rel When p in step 3 rel Greater than p' rel When the relationship is considered to be contained in the triples, the corresponding relationship is marked as 1, the rest is marked as 0, the relationship is recorded into a candidate relationship set R ', and the candidate relationship set R' is smaller than the relationship set R;
step 5, predicting the start position and the end position of the head entity by using the head entity decoding module, and determining the head entity s present in the text according to the natural continuity of the head entity span, wherein the probabilities that the i-th token of the input sequence is the start position and the end position of the head entity s are denoted as p_{s,i}^{start} and p_{s,i}^{end} respectively;
Step 6, if obtained in step 5When the value of (2) is larger than a preset threshold value, marking the corresponding token as 1, and regarding the corresponding token as the starting or ending position of the head entity;
step 7, traversing the relations in the candidate relation set R' by using the head-tail entity alignment module, and calculating whether a tail entity o related to the head entity s exists, wherein the probabilities that the i-th token is the start position and the end position of the tail entity are respectively p_{o,i}^{start} = σ(W_{start}(h_i + h_avg) + b_{start}) and p_{o,i}^{end} = σ(W_{end}(h_i + h_avg) + b_{end}),
where h_avg is the average of all word vectors of the head entity;
step 8, when p_{o,i}^{start} obtained in step 7 is greater than a preset value, marking the corresponding token as 1 and regarding it as the start position of the tail entity; when p_{o,i}^{end} is greater than a preset value, marking the corresponding token as 1 and regarding it as the end position of the tail entity, at which point the matching succeeds and the triple is output;
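The span decoding implied by steps 6 and 8 (pairing each start tag with the nearest following end tag, relying on the natural continuity of a span) can be sketched as follows; the tokens and tags are made-up examples:

```python
# Decode entity spans from binary start/end pointer tags: each position
# tagged 1 as a start is paired with the nearest end tag at or after it.
tokens = ["the", "spindle", "motor", "drives", "the", "tool"]
start_tags = [0, 1, 0, 0, 0, 1]  # 1 = predicted span start
end_tags   = [0, 0, 1, 0, 0, 1]  # 1 = predicted span end

def decode_spans(start_tags, end_tags):
    spans = []
    for i, s in enumerate(start_tags):
        if s == 1:
            # pair with the nearest end position at or after the start
            for j in range(i, len(end_tags)):
                if end_tags[j] == 1:
                    spans.append((i, j))
                    break
    return spans

spans = decode_spans(start_tags, end_tags)
entities = [" ".join(tokens[i:j + 1]) for i, j in spans]
print(entities)  # ['spindle motor', 'tool']
```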
the step 5 specifically includes the following steps:
step 5.1, the head entity decoding module inputs the semantic vector h output by the context pre-training model encoding module into the GRU to obtain h' =GRU (h);
step 5.2, inputting h' into two identical double-layer pointer networks for marking the start position and the end position of the entity, wherein a ReLU activation function is used between the two layers of each pointer network to enhance its nonlinearity;
step 5.3, calculating the probabilities p_{s,i}^{start} = σ(W^{start} h'_i + b^{start}) and p_{s,i}^{end} = σ(W^{end} h'_i + b^{end}), which represent the probabilities that the i-th token of the input sequence is the start position and the end position of the head entity, where W^{(·)} is a learnable weight, b^{(·)} represents a bias, and σ represents the activation function;
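A minimal NumPy sketch of the double-layer pointer heads of steps 5.2-5.3, assuming the GRU hidden states h' are already given; all shapes and weights are random placeholders rather than the trained model:

```python
import numpy as np

# Two pointer heads of identical architecture (Linear -> ReLU -> Linear
# -> sigmoid), one marking start positions and one marking end positions.
rng = np.random.default_rng(0)
seq_len, d = 6, 8
h_prime = rng.normal(size=(seq_len, d))  # h' = GRU(h), assumed given

def pointer_head(h, w1, b1, w2, b2):
    hidden = np.maximum(0.0, h @ w1 + b1)           # ReLU between the two layers
    logits = hidden @ w2 + b2
    return (1.0 / (1.0 + np.exp(-logits))).ravel()  # sigmoid: per-token probability

weights = {
    name: (rng.normal(size=(d, d)), np.zeros(d), rng.normal(size=(d, 1)), np.zeros(1))
    for name in ("start", "end")
}
p_start = pointer_head(h_prime, *weights["start"])  # p_{s,i}^{start}
p_end = pointer_head(h_prime, *weights["end"])      # p_{s,i}^{end}
print(p_start.shape)  # (6,)
```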
the step 2 specifically comprises the following steps:
step 2.1, firstly converting each sub-word of the input text sentence S into a word vector and a position vector E = {E_1, E_2, ..., E_n}, and inputting the word vectors and position vectors into 12 double-layer Transformer blocks for feature extraction;
step 2.2, the Transformer blocks learn information through a multi-head self-attention mechanism, each Transformer passes the learned information through a layer of fully-connected network, and finally the semantic vector h is output: h = {h_0, h_1, h_2, ..., h_m, h_{m+1} | h_i ∈ R^{d×1}}, where d represents the dimension of the last hidden layer of the BERT model;
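The multi-head self-attention at the core of each Transformer block can be sketched as follows; this is a bare illustration with random weights, and a real BERT block additionally applies residual connections, layer normalization, and a feed-forward sublayer:

```python
import numpy as np

# Scaled dot-product self-attention with two heads, applied to the
# word + position embeddings E; head outputs are concatenated.
rng = np.random.default_rng(1)
n, d, heads = 5, 8, 2
head_dim = d // heads
E = rng.normal(size=(n, d))  # word + position embeddings

def self_attention_head(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])  # scaled dot-product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v

outputs = []
for _ in range(heads):
    wq, wk, wv = (rng.normal(size=(d, head_dim)) for _ in range(3))
    outputs.append(self_attention_head(E, wq, wk, wv))
h = np.concatenate(outputs, axis=-1)  # concatenated heads -> semantic vectors
print(h.shape)  # (5, 8)
```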
the step 3 specifically comprises the following steps:
step 3.1, the potential relation prediction module feeds the semantic vector h of the text, output by the context pre-training model encoding module, into a global average pooling layer;
step 3.2, inputting a fully-connected neural network;
step 3.3, finally calculating the probability of each relation through the activation function, obtaining p_rel = σ(W(Pooling(h)) + b).
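Steps 3.1-3.3 amount to global average pooling followed by a fully-connected layer and a sigmoid; a minimal sketch with placeholder sizes and random weights:

```python
import numpy as np

# One probability per relation: p_rel = sigmoid(W(Pooling(h)) + b).
rng = np.random.default_rng(2)
seq_len, d, num_relations = 6, 8, 4
h = rng.normal(size=(seq_len, d))  # encoder output (semantic vectors)

pooled = h.mean(axis=0)            # global average pooling over tokens
W = rng.normal(size=(d, num_relations))
b = np.zeros(num_relations)
p_rel = 1.0 / (1.0 + np.exp(-(pooled @ W + b)))  # sigmoid activation
print(p_rel.shape)  # (4,)
```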
2. The intelligent manufacturing triplet extraction method based on a relational pointer network according to claim 1, characterized in that: the model is trained jointly by sharing parameters during model training, and the combined objective function is optimized during training; the total loss is the weighted sum of three parts, namely the relation prediction loss L_rel, the head entity loss L_s, and the tail entity loss L_o: L_total = αL_rel + βL_s + γL_o.
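A minimal sketch of the combined objective, assuming binary cross-entropy for each of the three parts (the patent does not spell out the individual loss functions here, so this choice and all values are illustrative):

```python
import numpy as np

# L_total = alpha*L_rel + beta*L_s + gamma*L_o, with each sub-loss taken
# as a binary cross-entropy over made-up predictions and labels.
def bce(p, y, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

p_rel  = np.array([0.9, 0.1]); y_rel  = np.array([1.0, 0.0])  # relation labels
p_head = np.array([0.8, 0.2]); y_head = np.array([1.0, 0.0])  # head entity tags
p_tail = np.array([0.7, 0.3]); y_tail = np.array([1.0, 0.0])  # tail entity tags

alpha, beta, gamma = 1.0, 1.0, 1.0  # loss weights (hyperparameters)
L_total = (alpha * bce(p_rel, y_rel)
           + beta * bce(p_head, y_head)
           + gamma * bce(p_tail, y_tail))
print(round(L_total, 4))  # 0.6852
```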
3. The intelligent manufacturing triplet extraction method based on a relational pointer network according to claim 1, characterized in that: the extraction method comprises a context pre-training model encoding module, a potential relation prediction module, a head entity decoding module, and a head-tail entity alignment module, wherein
the context pre-training model coding module is used for acquiring semantic feature representation of intelligent manufacturing text data, and learning context information of each word in the text by using the context pre-training model to obtain semantic vectors of the text;
the potential relation prediction module is used for constructing an intelligent manufacturing relation candidate set and predicting potential relation existing in the intelligent manufacturing text;
the header entity decoding module is used for realizing entity span extraction and determining the starting and ending positions of the entity;
the head-tail entity alignment module is used for realizing the matching of intelligent manufacturing triples, traversing potential relations in the relation candidate set for each head entity, checking whether tail entities related to the head entity exist or not, and outputting the triples if the tail entities exist.
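The traversal performed by the head-tail entity alignment module can be sketched as follows; find_tail stands in for the learned pointer-based tail prediction, and all names are made-up examples:

```python
# For each head entity, traverse the potential relations in the candidate
# set and emit a triple whenever a related tail entity is found.
head_entities = ["CNC machine", "spindle motor"]
candidate_relations = ["partOf", "producedBy"]  # candidate set R'

# toy lookup standing in for the learned tail-entity predictor
tail_lookup = {
    ("spindle motor", "partOf"): "CNC machine",
}

def find_tail(head, relation):
    return tail_lookup.get((head, relation))  # None when no tail entity exists

triples = []
for s in head_entities:
    for r in candidate_relations:  # traverse potential relations
        o = find_tail(s, r)
        if o is not None:          # tail entity exists -> output the triple
            triples.append((s, r, o))
print(triples)  # [('spindle motor', 'partOf', 'CNC machine')]
```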
CN202311726555.XA 2023-12-15 2023-12-15 Intelligent manufacturing triplet extraction method based on relational pointer network Active CN117408247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311726555.XA CN117408247B (en) 2023-12-15 2023-12-15 Intelligent manufacturing triplet extraction method based on relational pointer network

Publications (2)

Publication Number Publication Date
CN117408247A CN117408247A (en) 2024-01-16
CN117408247B true CN117408247B (en) 2024-03-29

Family

ID=89498385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311726555.XA Active CN117408247B (en) 2023-12-15 2023-12-15 Intelligent manufacturing triplet extraction method based on relational pointer network

Country Status (1)

Country Link
CN (1) CN117408247B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111078889A (en) * 2019-12-20 2020-04-28 大连理工大学 Method for extracting relationships among medicines based on attention of various entities and improved pre-training language model
KR20210018571A (en) * 2019-08-05 2021-02-18 강원대학교산학협력단 An apparatus extracting a relation among multiple entities by using a dual pointer network and a method thereof
KR20210057983A (en) * 2019-11-13 2021-05-24 창원대학교 산학협력단 Triple Extraction method using Pointer Network and the extraction apparatus
CN113051929A (en) * 2021-03-23 2021-06-29 电子科技大学 Entity relationship extraction method based on fine-grained semantic information enhancement
CN113553850A (en) * 2021-03-30 2021-10-26 电子科技大学 Entity relation extraction method based on ordered structure encoding pointer network decoding
CN113988075A (en) * 2021-10-28 2022-01-28 广东工业大学 Network security field text data entity relation extraction method based on multi-task learning
CN114417839A (en) * 2022-01-19 2022-04-29 北京工业大学 Entity relation joint extraction method based on global pointer network
CN114547230A (en) * 2022-02-24 2022-05-27 山东大学 Intelligent administrative law enforcement case information extraction and case law identification method
CN114691895A (en) * 2022-05-31 2022-07-01 南京航天数智科技有限公司 Criminal case entity relationship joint extraction method based on pointer network
CN114722820A (en) * 2022-03-21 2022-07-08 河海大学 Chinese entity relation extraction method based on gating mechanism and graph attention network
CN114764566A (en) * 2022-04-11 2022-07-19 中国航空综合技术研究所 Knowledge element extraction method for aviation field
CN115659986A (en) * 2022-12-13 2023-01-31 南京邮电大学 Entity relation extraction method for diabetes text
CN116090468A (en) * 2023-01-09 2023-05-09 河南科技大学 Entity relationship joint extraction method and system based on stacked pointer network
CN116186237A (en) * 2023-02-28 2023-05-30 北京石油化工学院 Entity relationship joint extraction method based on event cause and effect inference
CN116595195A (en) * 2023-05-30 2023-08-15 苏州浪潮智能科技有限公司 Knowledge graph construction method, device and medium
CN116663539A (en) * 2023-06-09 2023-08-29 广西大学 Chinese entity and relationship joint extraction method and system based on Roberta and pointer network

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
BGPNRE: a BERT-based global pointer network method for joint entity-relation extraction; Deng Liang et al.; Computer Science; 2023-03-15; Vol. 50, No. 3; 42-48 *
Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction; Tapas Nayak et al.; Proceedings of the AAAI Conference on Artificial Intelligence; 2020-04-03; Vol. 34, No. 5; 8528-8535 *
Relation Extraction Model Based on Global Pointer and Potential Relation Embedding; Han Yang et al.; IIHMSP 2022: Advances in Intelligent Information Hiding and Multimedia Signal Processing; 2023-05-24; 205-216 *
Research on Chinese information extraction and its application in the coal mine safety field; Feng Lin; China Master's Theses Full-text Database, Engineering Science and Technology I; 2023-02-15 (No. 2); pp. 37-46 *
Research on entity-relation extraction based on adversarial learning and a global pointer generation network; Hu Jia; China Master's Theses Full-text Database, Information Science and Technology; 2022-11-15 (No. 11); I138-307 *
BPNRel: a joint extraction model for nested entities and overlapping relations in the tourism domain; Chen Yun et al.; Journal of Northeast Normal University (Natural Science Edition); 2023-09-14; Vol. 55, No. 3; 64-74 *
Research on relation extraction methods for complex text scenarios; Liu Huan; China Master's Theses Full-text Database, Information Science and Technology; 2022-06-15 (No. 6); I138-669 *

Also Published As

Publication number Publication date
CN117408247A (en) 2024-01-16

Similar Documents

Publication Publication Date Title
US11941522B2 (en) Address information feature extraction method based on deep neural network model
CN113177132B (en) Image retrieval method based on depth cross-modal hash of joint semantic matrix
CN110647904A (en) Cross-modal retrieval method and system based on unmarked data migration
CN112818676A (en) Medical entity relationship joint extraction method
CN110888980A (en) Implicit discourse relation identification method based on knowledge-enhanced attention neural network
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
Shi et al. GAEN: graph attention evolving networks
CN115526236A (en) Text network graph classification method based on multi-modal comparative learning
CN115687609A (en) Zero sample relation extraction method based on Prompt multi-template fusion
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN114564950A (en) Electric Chinese named entity recognition method combining word sequence
CN117273134A (en) Zero-sample knowledge graph completion method based on pre-training language model
CN116245107B (en) Electric power audit text entity identification method, device, equipment and storage medium
CN117407532A (en) Method for enhancing data by using large model and collaborative training
CN115600602B (en) Method, system and terminal device for extracting key elements of long text
CN117408247B (en) Intelligent manufacturing triplet extraction method based on relational pointer network
CN111444316A (en) Knowledge graph question-answer oriented composite question analysis method
Xue et al. Hard sample mining for the improved retraining of automatic speech recognition
CN111259106A (en) Relation extraction method combining neural network and feature calculation
CN116680407A (en) Knowledge graph construction method and device
CN115408536A (en) Knowledge graph complementing method based on context information fusion
CN114298052A (en) Entity joint labeling relation extraction method and system based on probability graph
Wen et al. Few-shot named entity recognition with joint token and sentence awareness
CN114996407B (en) Remote supervision relation extraction method and system based on packet reconstruction
Cai et al. Transformer-Based BiLSTM for Aspect-Level Sentiment Classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant