CN115934944A - Entity relation extraction method based on Graph-MLP and adjacent contrast loss - Google Patents


Info

Publication number
CN115934944A
CN115934944A
Authority
CN
China
Prior art keywords
training text
mlp
loss
graph
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211594439.2A
Other languages
Chinese (zh)
Inventor
吴涛
游小琳
先兴平
宋秀丽
姜丰
徐敖远
张浩然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202211594439.2A priority Critical patent/CN115934944A/en
Publication of CN115934944A publication Critical patent/CN115934944A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an entity relation extraction method based on Graph-MLP and adjacent contrast loss, which comprises the following steps: acquiring training text data with label information; performing word segmentation on the training text according to the vocabulary; embedding the words of the training text as vectors to obtain the word sequence vector of the training text; inputting the word sequence vector of the training text into a Bi-LSTM to extract the context semantic feature representation of the training text; creating a Graph-MLP relation classification model; training the Graph-MLP relation classification model with the context semantic feature representations of the training text as training samples; and acquiring the word sequence vector of a target text, inputting it into the trained Graph-MLP relation classification model, and outputting the relation between the two entities in the target text.

Description

Entity relation extraction method based on Graph-MLP and adjacent contrast loss
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to an entity relation extraction method based on Graph-MLP and adjacent contrast loss.
Background
With the rapid development of internet technology, network data on the internet grows exponentially, and this massive data carries rich and important information. Relation extraction, a technology in the field of natural language processing, aims to automatically extract semantic relations from text by modeling textual information, distilling effective semantic knowledge so that subsequent tasks can store and filter data effectively. The task of relation extraction is to extract the relations between entity pairs in natural language text. The relation of an entity pair can be formally described as a relation triple <e_1, r, e_2>, where e_1 and e_2 are entities and r is one of the target relation set R = {r_1, r_2, …, r_M}. Successful relation extraction is a foundation for large-scale relation understanding of unstructured text, and its research results are mainly applied in fields such as information retrieval, automatic question answering, and knowledge graph construction. At present, Graph Neural Networks (GNNs) are increasingly used to solve the relation extraction problem, with state-of-the-art results. The core idea is to use a graph neural network model to integrate the topological structure information and attribute feature information in graph data, and thereby provide more refined feature representations of nodes or substructures.
However, conventional GNN-based relation extraction models often require an additional linguistic tool to convert sequential text into graph-structured data that can serve as the input form for the GNN, which makes data processing during relation extraction computationally expensive and prevents end-to-end training. Meanwhile, conventional GNN-based relation extraction models mainly use neighborhood information to realize message passing among nodes, so that the structural information of the graph is learned explicitly in the feed-forward neural network. Such complex message passing often entails heavy computation, which is also a major source of the latency GNNs incur on large-scale graph structures; it reduces the speed of entity relation extraction and makes GNNs difficult to deploy in large-scale industrial applications that require rapid inference on complex structures.
Disclosure of Invention
In order to solve the problems in the background art, the invention provides an entity relation extraction method based on Graph-MLP and adjacent contrast loss, which comprises the following steps:
S1: acquiring training text data with label information; the label information comprises: the relation category between the two entities in the training text;
S2: performing word segmentation on the training text according to the vocabulary; embedding the words of the training text as vectors with a GloVe model to obtain the word sequence vector of the training text;
S3: inputting the word sequence vector of the training text into the Bi-LSTM and extracting the context semantic feature representation of the training text;
S4: creating a Graph-MLP relation classification model; training the Graph-MLP relation classification model with the context semantic feature representation of the training text as training samples; wherein the Graph-MLP relation classification model comprises: a ReLU activation function, an MLP, and a softmax activation function;
S5: acquiring the word sequence vector of the target text, inputting it into the trained Graph-MLP relation classification model, and outputting the relation between the two entities in the target text.
The invention has at least the following beneficial effects:
the invention adopts simple and light MLP to replace the aggregation operation in GCN, does not need to explicitly transmit neighborhood node information through a Graph structure, achieves message transmission and simultaneously improves the calculation efficiency, and the Graph-MLP classification model can realize performance equivalent to that of the Graph model and is more efficient.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific examples, and those skilled in the art will readily understand other advantages and effects of the present invention from the disclosure of this specification. The invention may also be implemented or applied through other, different embodiments, and the details of this specification may be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that the drawings provided in the following embodiments only illustrate the basic idea of the present invention schematically, and the features of the following embodiments may be combined with one another as long as they do not conflict.
Referring to fig. 1, the present invention provides a method for extracting an entity relationship based on Graph-MLP and adjacent contrast loss, comprising:
S1: acquiring training text data with label information; the label information comprises: the relation category between the two entities in the training text;
In the invention, the TACRED dataset containing 106,264 examples is used, split into 68,124 training examples, 22,631 validation examples, and 15,509 test examples. The dataset covers 41 relation categories plus a special "no_relation" class.
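By way of non-limiting illustration, the splits can be loaded as in the following minimal Python sketch, which assumes the public TACRED JSON release where each example carries a "token" field (the word list) and a "relation" field (the label string); the file paths are placeholders.

```python
import json

def load_tacred(path):
    """Load one TACRED split, keeping only what this method needs."""
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)           # a JSON list of example objects
    # Each example keeps its word sequence and its relation label.
    return [(ex["token"], ex["relation"]) for ex in examples]

train = load_tacred("tacred/train.json")  # expected: 68,124 examples
dev = load_tacred("tacred/dev.json")      # expected: 22,631 examples
test = load_tacred("tacred/test.json")    # expected: 15,509 examples
```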
S2: performing word segmentation on the training text according to the vocabulary; embedding the words of the training text as vectors with a GloVe model to obtain the word sequence vector of the training text;
First, the data is preprocessed. Preprocessing comprises: loading the raw data, cleaning it, mapping word tokens to numeric indices according to the vocabulary, and mapping the various kinds of annotation information to corresponding lists of numeric values according to different rules. Through the preprocessing step, sentence information is converted into numeric form and serves as the input to the embedding model.
Word embedding converts the words of the text into a numeric representation: discrete words are turned into continuous word vectors using a pre-trained GloVe model, where each word is represented by a 300-dimensional real-valued vector.
Word segmentation according to the vocabulary yields n words X = (x_1, x_2, …, x_n); embedding these n words as vectors yields the word sequence vector of the training text, E = (e_1, e_2, …, e_n).
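The token-to-index mapping and embedding lookup can be sketched as follows; the vocabulary `word2idx` and the 300-dimensional GloVe matrix `glove` are assumed to be pre-built and aligned with each other, and the lowercasing and unknown-word handling are assumptions of this sketch.

```python
import numpy as np

def embed_sentence(tokens, word2idx, glove, unk_idx=0):
    """Map word tokens to indices, then look up their GloVe vectors."""
    # Unknown words fall back to a designated index (an assumption here).
    indices = [word2idx.get(tok.lower(), unk_idx) for tok in tokens]
    # glove has shape (|V|, 300); fancy indexing returns (n, 300),
    # i.e. the word sequence vector E = (e_1, ..., e_n).
    return glove[np.asarray(indices)]
```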
Each training text contains two entities, which form an entity pair; the two entities occupy different positions in the text, and both are nouns. For example, for the training text "the registered address of company A is located in region B", company A is entity 1, region B is entity 2, and the relation between entity 1 and entity 2 is a location relation.
S2: inputting the word sequence vector of the training text into Bi-LSTM and extracting to obtain the context semantic feature representation of the training text;
The contextual representation of the text is further captured with a Bi-LSTM model. The input to the Bi-LSTM is E = (e_1, e_2, …, e_n); that is, each item e_1, e_2, …, e_n is fed into the Bi-LSTM layer in turn to compute the forward and backward outputs of the Bi-LSTM at time n, h_n^(f) and h_n^(b) respectively. Adding the forward output and the backward output pointwise gives the result h_n = h_n^(f) + h_n^(b), and the context semantic feature representation of the training text extracted and combined by the Bi-LSTM layer is H = (h_1, h_2, …, h_n).
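A minimal PyTorch sketch of this encoder follows; the hidden size of 300 is an assumption, as the text does not specify it.

```python
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Bi-LSTM whose forward and backward outputs are added pointwise."""
    def __init__(self, emb_dim=300, hidden=300):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)

    def forward(self, E):                # E: (batch, n, emb_dim)
        out, _ = self.bilstm(E)          # (batch, n, 2 * hidden)
        fwd, bwd = out.chunk(2, dim=-1)  # split the two directions
        return fwd + bwd                 # H = (h_1, ..., h_n)
```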
S4: creating a Graph-MLP relation classification model; taking the context semantic feature expression of the training text as a training sample to train the Graph-MLP relation classification model; wherein the Graph-MLP relationship classification model comprises: relu activation function, MLP and softmax activation function;
S41: inputting the context semantic feature representation of the training text into the MLP to compute the feature representation vector of the training text;
preferably, the obtaining of the feature representation vector of the training text by inputting the context semantic feature representation of the training text into MLP includes:
H^(l) = Dropout(LN(σ(H W^(l))))

where σ denotes the ReLU activation function, W^(l) denotes the weight parameter matrix of the l-th layer of the MLP, H denotes the context semantic feature representation of the training text, and H^(l) denotes the feature representation vector of the training text at layer l.
In an l-layer multilayer perceptron, the context semantic feature representation of a sentence is received as input, and a linear transformation is applied to update the feature representation of the sentence. A nonlinear activation function then transforms the feature vectors so that different nonlinear relationships can be learned; LayerNorm is applied for training stability, and finally the output is fed through Dropout to avoid overfitting.
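One such layer can be sketched directly from the formula above; the feature dimension (300) and dropout rate are assumptions.

```python
import torch.nn as nn

class GraphMLPLayer(nn.Module):
    """One Graph-MLP layer: H_out = Dropout(LayerNorm(ReLU(H W)))."""
    def __init__(self, dim=300, p_drop=0.5):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # the weight matrix W^(l)
        self.act = nn.ReLU()               # nonlinear activation sigma
        self.norm = nn.LayerNorm(dim)      # LayerNorm stabilizes training
        self.drop = nn.Dropout(p_drop)     # Dropout guards against overfitting

    def forward(self, H):
        return self.drop(self.norm(self.act(self.linear(H))))
```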
S42: inputting the feature expression vector of the training text into a softmax activation function to predict the relationship between entity pairs in the training text;
preferably, the step of inputting the feature expression vector of the training text into the softmax activation function to predict the relationship between the entity pairs in the training text comprises the following steps:
O = W_o H^(l) + b

P(c | H^(l)) = exp(o_c) / Σ_{k=1}^{M} exp(o_k)

where W_o is the weight matrix of the softmax activation function, b is the bias term of the softmax activation function, H^(l) is the feature representation vector of the training text, P(c | H^(l)) denotes the probability that the relation between the entity pair belongs to class c, o_c and o_k denote the c-th and k-th elements of O, and M is the number of relation classes.
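A sketch of this classification head follows, assuming M = 42 output classes (41 relations plus no_relation) and 300-dimensional features.

```python
import torch
import torch.nn as nn

classifier = nn.Linear(300, 42)       # holds W_o and the bias term b

def predict_relation(H_l):
    """Score all relation classes and normalize with softmax."""
    O = classifier(H_l)               # scores o_1, ..., o_M
    return torch.softmax(O, dim=-1)   # P(c | H^(l)) for every class c
```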
S43: constructing a loss function of the Graph-MLP relational classification model by using a multi-head attention mechanism, a Relu activation function, a cross entropy loss algorithm and an adjacent contrast loss algorithm according to the label information of the training text and the relation prediction result between entity pairs in the training text, and updating parameters of the MLP and softmax activation functions through a back propagation mechanism to complete the training of the Graph-MLP relational classification model;
preferably, the process of constructing the loss function of the Graph-MLP relationship classification model includes:
S441: according to a multi-head attention mechanism, multiplying the context semantic feature representation of the training text by the trainable parameter matrices W_Q and W_K, respectively, to obtain the query Q and the key K, where W_Q, W_K ∈ R^{d×d} and d denotes the dimension of the trainable parameter matrices;
S442: calculating the vector dot product of the training text with the ReLU activation function according to the query Q and the key K;
preferably, the vector dot product of the training text comprises:
M_t = ReLU(Q K^T / √d)

where M_t denotes the vector dot product of the training text, d is the dimension of the feature vectors, ReLU denotes the activation function, the superscript T denotes transposition, and t indexes the multi-head attention heads; the number of heads is usually set to 3.
S443: calculating to obtain a weighting matrix of the training text according to the vector dot product of the training text;
preferably, the weighting matrix of the training text comprises:
A_t(i, j) = exp(M_t(i, j)) / Σ_{u ∈ Num} exp(M_t(i, u))

where A_t denotes the weighting matrix of the training text, u ranges over the word nodes of the training text, Num denotes the set of word nodes in the training text, and t indexes the multi-head attention heads.
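Steps S441–S443 can be sketched together as follows; using one fused d → d·t projection per side (equivalent to t separate d × d matrices) and averaging the heads into a single weighting matrix are assumptions of this sketch.

```python
import torch
import torch.nn as nn

d, t = 300, 3                             # feature dimension, number of heads
W_Q = nn.Linear(d, d * t, bias=False)     # fused per-head query projections
W_K = nn.Linear(d, d * t, bias=False)     # fused per-head key projections

def attention_adjacency(H):               # H: (n, d) word-node features
    n = H.size(0)
    Q = W_Q(H).view(n, t, d).transpose(0, 1)  # (t, n, d) queries per head
    K = W_K(H).view(n, t, d).transpose(0, 1)  # (t, n, d) keys per head
    M = torch.relu(Q @ K.transpose(1, 2) / d ** 0.5)  # M_t = ReLU(QK^T / sqrt(d))
    A = torch.softmax(M, dim=-1)          # normalize over the word-node set Num
    return A.mean(dim=0)                  # average heads into one A_t: (n, n)
```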
S444: constructing a first loss function by using an adjacent contrast loss algorithm according to the weighting matrix of the training text;
preferably, the first loss function comprises:
γ_ij = 1 if a_ij > θ, and γ_ij = 0 otherwise

ℓ_i = −log( Σ_{j=1, j≠i}^{N} γ_ij · exp(sim(v_i, v_j)) / Σ_{k=1, k≠i}^{N} exp(sim(v_i, v_k)) )

loss_NC = (1/N) · Σ_{i=1}^{N} ℓ_i

where loss_NC denotes the first loss function, sim(·) denotes a similarity function, N denotes the number of word nodes in the training text, v_i denotes the i-th word node, v_j denotes the j-th word node, a_ij denotes the value in the i-th row and j-th column of A_t and indicates whether word node v_i and word node v_j are adjacent, and θ denotes a preset threshold.
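A sketch of this adjacent contrast loss follows, taking cosine similarity as sim(·); the threshold value θ = 0.5 is an assumption.

```python
import torch

def nc_loss(V, A, theta=0.5, eps=1e-8):
    """Adjacent contrast loss: pull together word nodes with a_ij > theta."""
    # Pairwise cosine similarities: (N, N).
    sim = torch.cosine_similarity(V.unsqueeze(1), V.unsqueeze(0), dim=-1)
    exp_sim = torch.exp(sim)
    not_self = ~torch.eye(V.size(0), dtype=torch.bool)  # exclude j == i
    positives = (A > theta) & not_self                  # gamma_ij indicator
    numerator = (exp_sim * positives).sum(dim=1)        # adjacent-pair mass
    denominator = (exp_sim * not_self).sum(dim=1)       # all-pair mass
    return -torch.log(numerator / denominator + eps).mean()
```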
S445: constructing a second loss function by using a cross entropy loss algorithm according to the label information of the training text and the relationship prediction result between the entity pairs in the training text;
preferably, the second loss function comprises:
loss_cross-entropy = −(1/N) · Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log p(c | H^(l))

where loss_cross-entropy denotes the second loss function, N denotes the number of word nodes in the training text, M is the number of relation classes, y_ic is an indicator that equals 1 if the true class of sample i is c and 0 otherwise, and p(c | H^(l)) denotes the predicted relation between the entity pairs in the training text.
S446: obtaining a loss function of the Graph-MLP relation classification model according to the first loss function and the second loss function;
preferably, the loss function of the Graph-MLP relationship classification model comprises:
loss_final = loss_NC + loss_cross-entropy

where loss_NC denotes the first loss function, loss_cross-entropy denotes the second loss function, and loss_final denotes the loss function of the Graph-MLP relation classification model.
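One back-propagation training step over the combined loss might look as follows; this reuses nc_loss from the sketch above, stands in a bare linear–ReLU–LayerNorm stack for the MLP, and the Adam optimizer and learning rate are assumptions.

```python
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(300, 300), nn.ReLU(), nn.LayerNorm(300))
head = nn.Linear(300, 42)                 # softmax classifier (W_o, b)
optimizer = torch.optim.Adam(
    list(mlp.parameters()) + list(head.parameters()), lr=1e-3)

def train_step(H, A, labels):             # H: (N, d), A: (N, N), labels: (N,)
    H_l = mlp(H)                          # feature representation vectors
    ce = nn.functional.cross_entropy(head(H_l), labels)  # second loss
    loss = nc_loss(H_l, A) + ce           # loss_final = loss_NC + loss_CE
    optimizer.zero_grad()
    loss.backward()                       # back-propagate through MLP and head
    optimizer.step()                      # update MLP and softmax parameters
    return loss.item()
```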
S5: acquiring the word sequence vector of the target text, inputting it into the trained Graph-MLP relation classification model, and outputting the relation between the two entities in the target text.
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (9)

1. An entity relation extraction method based on Graph-MLP and adjacent contrast loss is characterized by comprising the following steps:
S1: acquiring training text data with label information; the label information comprises: the relation category between the two entities in the training text;
S2: performing word segmentation on the training text according to the vocabulary; embedding the words of the training text as vectors with a GloVe model to obtain the word sequence vector of the training text;
S3: inputting the word sequence vector of the training text into the Bi-LSTM and extracting the context semantic feature representation of the training text;
S4: creating a Graph-MLP relation classification model; training the Graph-MLP relation classification model with the context semantic feature representation of the training text as training samples; wherein the Graph-MLP relation classification model comprises: a ReLU activation function, an MLP, and a softmax activation function;
S5: acquiring the word sequence vector of the target text, inputting it into the trained Graph-MLP relation classification model, and outputting the relation between the two entities in the target text.
2. The method of claim 1, wherein the training of the Graph-MLP relational classification model using the context semantic feature representation of the training text as the training samples comprises:
S41: inputting the context semantic feature representation of the training text into the MLP to compute the feature representation vector of the training text;
S42: inputting the feature representation vector of the training text into the softmax activation function to predict the relationship between entity pairs in the training text;
S43: constructing a loss function of the Graph-MLP relation classification model from the label information of the training text and the relation predictions between entity pairs in the training text, using a multi-head attention mechanism, a ReLU activation function, a cross-entropy loss algorithm, and an adjacent contrast loss algorithm, and updating the parameters of the MLP and softmax activation functions through a back-propagation mechanism to complete the training of the Graph-MLP relation classification model.
4. The entity relation extraction method based on Graph-MLP and adjacent contrast loss according to claim 2, wherein inputting the feature representation vector of the training text into the softmax activation function to predict the relationship between entity pairs in the training text comprises:
O = W_o H^(l) + b

P(c | H^(l)) = exp(o_c) / Σ_{k=1}^{M} exp(o_k)

wherein W_o is the weight matrix of the softmax activation function, b is the bias term of the softmax activation function, P(c | H^(l)) denotes the probability that the relation between an entity pair in the training text belongs to class c, o_c and o_k denote the c-th and k-th elements of O, M is the number of relation classes, and H^(l) denotes the feature representation vector of the training text.
4. The method according to claim 2, wherein the construction process of the loss function of the Graph-MLP relational classification model comprises:
S441: according to a multi-head attention mechanism, multiplying the context semantic feature representation of the training text by the trainable parameter matrices W_Q and W_K, respectively, to obtain the query Q and the key K, where W_Q, W_K ∈ R^{d×d} and d denotes the dimension of the trainable parameter matrices;
S442: calculating the vector dot product of the training text with the ReLU activation function according to the query Q and the key K;
s443: calculating to obtain a weighting matrix of the training text according to the vector dot product of the training text;
s444: constructing a first loss function by using an adjacent contrast loss algorithm according to a weighting matrix of the training text;
s445: constructing a second loss function by using a cross entropy loss algorithm according to the label information of the training text and the relationship prediction result between the entity pairs in the training text;
s446: and obtaining a loss function of the Graph-MLP relation classification model according to the first loss function and the second loss function.
5. The entity relation extraction method based on Graph-MLP and adjacent contrast loss according to claim 4, wherein the vector dot product of the training text comprises:
M_t = ReLU(Q K^T / √d)

wherein M_t denotes the vector dot product of the training text, d is the dimension of the feature vectors, ReLU denotes the activation function, the superscript T denotes transposition, and t indexes the multi-head attention heads.
6. The method as claimed in claim 4, wherein the weighting matrix of the training text comprises:
A_t(i, j) = exp(M_t(i, j)) / Σ_{u ∈ Num} exp(M_t(i, u))

wherein A_t denotes the weighting matrix of the training text, u ranges over the word nodes of the training text, Num denotes the set of word nodes in the training text, and t indexes the multi-head attention heads.
7. The method according to claim 4, wherein the first loss function comprises:
γ_ij = 1 if a_ij > θ, and γ_ij = 0 otherwise

ℓ_i = −log( Σ_{j=1, j≠i}^{N} γ_ij · exp(sim(v_i, v_j)) / Σ_{k=1, k≠i}^{N} exp(sim(v_i, v_k)) )

loss_NC = (1/N) · Σ_{i=1}^{N} ℓ_i

wherein loss_NC denotes the first loss function, sim(·) denotes a similarity function, N denotes the number of word nodes in the training text, v_i denotes the i-th word node, v_j denotes the j-th word node, a_ij denotes the value in the i-th row and j-th column of A_t and indicates whether word node v_i and word node v_j are adjacent, and θ denotes a preset threshold.
8. The method of claim 4, wherein the second loss function comprises:
loss_cross-entropy = −(1/N) · Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log p(c | H^(l))

wherein loss_cross-entropy denotes the second loss function, N denotes the number of word nodes in the training text, M is the number of relation classes, y_ic is an indicator that equals 1 if the true class of sample i is c and 0 otherwise, and p(c | H^(l)) denotes the predicted relation between the entity pairs in the training text.
9. The method according to claim 4, wherein the loss function of the Graph-MLP relational classification model comprises:
loss_final = loss_NC + loss_cross-entropy

wherein loss_NC denotes the first loss function, loss_cross-entropy denotes the second loss function, and loss_final denotes the loss function of the Graph-MLP relation classification model.
CN202211594439.2A 2022-12-13 2022-12-13 Entity relation extraction method based on Graph-MLP and adjacent contrast loss Pending CN115934944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211594439.2A CN115934944A (en) 2022-12-13 2022-12-13 Entity relation extraction method based on Graph-MLP and adjacent contrast loss

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211594439.2A CN115934944A (en) 2022-12-13 2022-12-13 Entity relation extraction method based on Graph-MLP and adjacent contrast loss

Publications (1)

Publication Number Publication Date
CN115934944A true CN115934944A (en) 2023-04-07

Family

ID=86553563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211594439.2A Pending CN115934944A (en) 2022-12-13 2022-12-13 Entity relation extraction method based on Graph-MLP and adjacent contrast loss

Country Status (1)

Country Link
CN (1) CN115934944A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118093846A (en) * 2024-04-26 2024-05-28 华南理工大学 Knowledge retrieval question-answering method based on association modeling



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination