CN115600595A - Entity relationship extraction method, system, equipment and readable storage medium - Google Patents
Entity relationship extraction method, system, equipment and readable storage medium
- Publication number
- CN115600595A (application CN202211027598.4A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- entity relationship
- relationship extraction
- entity
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention relates to an entity relationship extraction method, system, equipment and readable storage medium. The entity relationship extraction method provided by the invention can better process entity feature expression, enables sentence feature information to retain the dependency information before and after the sentence as it is transferred, reduces the degradation of sentence feature values during transfer, better solves the long-term dependency and reverse feature dependency transfer problems, and better identifies entities and relationships.
Description
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method, a system, a device, and a readable storage medium for extracting an entity relationship.
Background
With the rapid development of the modern internet and the industrial application of artificial intelligence and big data, data is generated at an exponential rate. This data contains much valuable information, but because it is highly redundant, voluminous, drawn from many sources, and structurally diverse, mining high-value information out of the disorder has become increasingly urgent. Information extraction, which derives structured information from unstructured text, has therefore become increasingly important; within information extraction, the relation extraction task is the most critical, because it extracts the relationship between two entities.
Relation extraction is widely applied in natural language processing, for example in building knowledge graphs. The original purpose of a knowledge graph is to give a machine recognition and reasoning capability, to store the interconnections between entities, and to establish dependency paths between entities and relations. Knowledge graphs are widely used in industry, academia, and daily life, for example in information recommendation, question answering systems, and intelligent search. In intelligent search, answers to related questions can be obtained from input text, as with today's voice assistants and chat robots. Knowledge graphs are also widely used in medicine, for example to intelligently retrieve similar cases from symptoms and support more precise treatment. The relation extraction task thus has great research significance: it can greatly facilitate daily life and reduce tedious chores. As a result, relation extraction is studied by more and more researchers in natural language processing, who, by improving its performance, extend its application in informatization and in industrial fields such as medicine.
Existing relation extraction methods extract features automatically with a convolutional neural network, which eliminates a great deal of manual feature annotation and saves substantial human effort; however, the convolutional neural network cannot refine pooled features, so semantic information is lost. Recurrent neural network models have also been proposed for entity relationship extraction, but they suffer from vanishing and exploding gradients, easily lose long-distance relations, and handle long-distance dependencies poorly.
In summary, existing relation extraction techniques lose semantic information and have difficulty extracting long-distance relations.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the problems that semantic information is lost when the relationship is extracted and the long-distance relationship is difficult to extract in the prior art.
In order to solve the above technical problem, the present invention provides an entity relationship extraction method, including:
inputting sentences to be subjected to entity relationship extraction into a trained entity relationship extraction model, and extracting feature input vectors of the sentences by using a word2vec network in the entity relationship extraction model;
inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
inputting the local feature vector of the sentence into a BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain a target feature vector of the sentence;
inputting the target feature vector of the sentence into a multi-branch attention mechanism in the entity relation extraction model to calculate a target feature vector weight value of the sentence;
and classifying the entity relation of the sentence according to the target characteristic vector weight of the sentence by using a softmax function in the entity relation extraction model to obtain the probability that each word in the sentence is selected as the entity relation.
In an embodiment of the present invention, the extracting a feature input vector of a sentence by using a word2vec network in the entity relationship extraction model includes:
obtaining an input text sequence S = {w_1, w_2, ..., w_l} of the sentence to be subjected to entity relationship extraction, where w_i represents the code of the i-th word in the sentence and l represents the length of the sentence;
inputting the input text sequence into the word2vec network to obtain a word vector of output dimension d_p, and obtaining a text sequence M = {p_1, p_2, ..., p_m} of distances from the first entity in the sentence to each word in the sentence and a text sequence N = {p_1, p_2, ..., p_n} of distances from the second entity in the sentence to each word in the sentence;
respectively inputting the text sequence M and the text sequence N into the word2vec network, and outputting two word vectors of dimension d_t;
splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain a feature input vector of dimension d_p + 2d_t.
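As an illustrative sketch of this splicing step (the dimension values and random vectors below are assumptions; the patent only names d_p and d_t, not concrete values), each token's word vector is concatenated with its two position-embedding vectors:

```python
import numpy as np

# Illustrative dimensions; the patent does not fix concrete values.
d_p, d_t, sent_len = 100, 5, 8

rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(sent_len, d_p))    # word2vec embedding per token
pos_vecs_e1 = rng.normal(size=(sent_len, d_t))  # distance-to-entity-1 embeddings
pos_vecs_e2 = rng.normal(size=(sent_len, d_t))  # distance-to-entity-2 embeddings

# Splice along the feature axis: each token becomes a (d_p + 2*d_t)-dim vector.
features = np.concatenate([word_vecs, pos_vecs_e1, pos_vecs_e2], axis=1)
```

The resulting matrix has one row per token and d_p + 2d_t columns, ready to feed into the PCNN feature extraction network.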
In an embodiment of the present invention, the local feature vector of the sentence is input into the BiGRU neural network in the entity relationship extraction model for bidirectional learning, and the target feature vector of the sentence is calculated as follows:
z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t * h_{t-1}, x_t]),
h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t,
where r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r, and W are the weight matrices of the reset gate, the update gate, and the candidate hidden state respectively, σ is the sigmoid function, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, and h_{t-1} is the output at the previous time; the feature vector ←h_t obtained by backward learning and the feature vector →h_t obtained by forward learning are concatenated, h_t = [→h_t, ←h_t], to give the target feature vector.
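A minimal numpy sketch of one GRU step following the gate equations above (biases omitted for brevity; the weight values and sizes are illustrative assumptions, not the patent's trained parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, Wz, Wr, W):
    """One GRU step: reset gate, update gate, candidate state, new state."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx)                                      # update gate z_t
    r = sigmoid(Wr @ hx)                                      # reset gate r_t
    h_tilde = np.tanh(W @ np.concatenate([r * h_prev, x_t]))  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde                   # h_t

hidden, feat = 4, 3
rng = np.random.default_rng(1)
Wz, Wr, W = (rng.normal(size=(hidden, hidden + feat)) for _ in range(3))
h = gru_step(np.zeros(hidden), rng.normal(size=feat), Wz, Wr, W)
```

A BiGRU runs this recurrence once left-to-right and once right-to-left over the sequence and concatenates the two hidden states per time step.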
In one embodiment of the present invention, inputting the target feature vector of the sentence into the multi-branch attention mechanism in the entity relationship extraction model to calculate the target feature vector weight value of the sentence includes:
calculating the weight of the target feature vector of the sentence at time t, where the calculation formula is:
Z_t = exp(β_i · h_t) / Σ_{j=1}^{l} exp(β_i · h_j),
where β_i is an initialized feature parameter, h_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
and summing the weighted target feature vectors of the sentence from 1 to t to obtain the weight value of the target feature vector of the sentence, where the calculation formula is:
s = Σ_{j=1}^{t} Z_j h_j, where j = {1, 2, 3, ..., t}.
In one embodiment of the present invention, the calculation formula for the probability of each word in the sentence being selected as the entity relationship is:
P(k) = exp(s_k) / Σ_{j=1}^{l} exp(s_j),
where s_k is the weighted score of the k-th word and P(k) is the probability that the k-th word in the sentence is selected as the entity relationship.
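A small sketch of the softmax step, assuming a vector of per-word scores (the score values themselves are illustrative):

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())  # shift by the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.5, 0.5])  # illustrative per-word scores
probs = softmax(scores)                  # probs[k] plays the role of P(k)
```

The output is a proper probability distribution over the words, so the word with the largest score receives the largest probability of being selected as the entity relationship.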
In an embodiment of the present invention, the training process of the entity relationship extraction model is as follows:
the method comprises the steps of collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into a multi-branch attention system to calculate a target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability of each word in the sample being selected as the entity relationship;
and continuously adjusting the model parameters until the loss function converges, to obtain the trained entity relationship extraction model.
In one embodiment of the present invention, the loss function of the entity relationship extraction model is:
L=BCELoss(P(k),P(k')),
wherein, P (k) is the true probability that the kth word in the sample is used as the entity relationship, and P (k') is the predicted probability that the kth word in the sample is used as the entity relationship.
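A sketch of the BCELoss computation above, implemented directly from its definition rather than from any particular framework (the probability values are illustrative):

```python
import numpy as np

def bce_loss(p_true, p_pred, eps=1e-12):
    """Binary cross-entropy between true and predicted per-word probabilities."""
    p_pred = np.clip(p_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(p_true * np.log(p_pred) + (1 - p_true) * np.log(1 - p_pred))

# P(k): true probabilities; P(k'): predicted probabilities
loss = bce_loss(np.array([1.0, 0.0, 0.0]), np.array([0.9, 0.2, 0.1]))
```

The loss is zero only when prediction and truth agree exactly, and grows as the predicted probabilities drift from the true ones, which is what drives the parameter adjustment during training.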
The invention also provides an entity relationship extraction system, which comprises:
the input module is used for inputting sentences to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain characteristic input vectors of the sentences;
the characteristic extraction module is used for extracting a local characteristic vector of the characteristic input vector by utilizing a PCNN characteristic extraction network in the entity relationship extraction model;
the bidirectional learning module is used for utilizing a BiGRU network in the entity relationship extraction model to enable the local feature vectors to be subjected to forward and backward learning to obtain target feature vectors;
the attention distribution module is used for calculating a weight value of the target feature vector by utilizing a multi-branch attention mechanism in the entity relationship extraction model;
and the classification module is used for classifying the entity relationship of the sentence by utilizing the softmax function in the entity relationship extraction model and calculating the probability of selecting each word in the sentence as the entity relationship.
The invention also provides an entity relationship extraction device, which comprises:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method when executing the computer program.
The invention also provides a computer readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the entity relationship extraction method are realized.
The entity relationship extraction method provided by the invention extracts the word vectors of a sentence with the word2vec network, then introduces the PCNN feature extraction network to effectively collect the extracted sentence feature information, so that the dependency information before and after the sentence is retained as the sentence feature information is transmitted and the degradation of sentence feature values during transmission is reduced. The BiGRU neural network allows the sentence feature information to be fully identified and trained, better solving the long-term dependency and reverse feature dependency transmission problems. Finally, the multi-branch attention mechanism better identifies entities and relationships: it obtains the correlation between each part of the sentence and the relationship, assigns corresponding weights, gives higher weight to the correct relationship and entities, strengthens the positive entities and relationships, and improves the relation extraction performance of the model.
Drawings
In order that the present disclosure may be more readily and clearly understood, reference is now made to the following detailed description of the embodiments of the present disclosure taken in conjunction with the accompanying drawings, in which
FIG. 1 is a flow chart of a method for entity relationship extraction;
FIG. 2 is a diagram of a model PCNN-BiGRU-MulATT;
FIG. 3 is a diagram of an entity relationship extraction system.
Detailed Description
The present invention is further described below in conjunction with the drawings and the embodiments so that those skilled in the art can better understand the present invention and can carry out the present invention, but the embodiments are not to be construed as limiting the present invention.
Relation extraction extracts the relations between entities in unstructured text, providing users with more accurate and comprehensive information. Entity relationship extraction extracts the binary relation between entities in a text and forms a relation triple (Entity1, Relation, Entity2), where Entity1 and Entity2 represent the two entities and Relation represents the relationship between them. For example, given the sentence "Liang Sicheng is a famous Chinese architect", the relation between the two entities "Liang Sicheng" and "architect" can be seen to be "occupation".
Example 1:
referring to fig. 1, an entity relationship extraction method provided in the embodiment of the present invention includes:
s10: inputting a sentence to be subjected to entity relationship extraction into a trained entity relationship extraction model, and extracting a feature input vector of the sentence by using a word2vec network in the entity relationship extraction model, wherein the method specifically comprises the following steps:
s100: acquiring an input text sequence S = {w_1, w_2, ..., w_l} of the sentence to be subjected to entity relationship extraction, where w_i represents the code of the i-th word in the sentence and l represents the length of the sentence;
s101: inputting the input text sequence into the word2vec model to obtain a word vector of output dimension d_p, and obtaining a text sequence M = {p_1, p_2, ..., p_m} of distances from the first entity in the sentence to each word in the sentence and a text sequence N = {p_1, p_2, ..., p_n} of distances from the second entity in the sentence to each word in the sentence;
s102: respectively inputting the text sequence M and the text sequence N into the word2vec model, and outputting two word vectors of dimension d_t;
s103: splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain a feature input vector of dimension d_p + 2d_t.
S20: inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
s30: inputting the local feature vector of the sentence into the BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain the target feature vector of the sentence, where the specific formulas are:
z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t * h_{t-1}, x_t]),
h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t,
where r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r, and W are the weight matrices of the reset gate, the update gate, and the candidate hidden state respectively, σ is the sigmoid function, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, and h_{t-1} is the output at the previous time; the feature vector ←h_t obtained by backward learning and the feature vector →h_t obtained by forward learning are concatenated, h_t = [→h_t, ←h_t], to give the target feature vector;
since a unidirectional GRU network cannot learn backward features and let them influence the forward features, while in ordinary sentences the forward and backward features influence each other, this embodiment uses the BiGRU neural network to fully learn the features in both directions.
S40: inputting the target feature vector of the sentence into the multi-branch attention mechanism in the entity relationship extraction model to calculate the target feature vector weight value of the sentence, which specifically includes:
calculating the weight of the target feature vector of the sentence at time t, where the calculation formula is:
Z_t = exp(β_i · h_t) / Σ_{j=1}^{l} exp(β_i · h_j),
where β_i is an initialized feature parameter, h_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
and summing the weighted target feature vectors of the sentence from 1 to t to obtain the weight value of the target feature vector of the sentence, where the calculation formula is:
s = Σ_{j=1}^{t} Z_j h_j, where j = {1, 2, 3, ..., t}.
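As a non-authoritative sketch of the multi-branch idea, assuming each branch i holds its own parameter vector β_i and attends over the BiGRU outputs, with the branch summaries combined by summation (all shapes, names, and the combination rule are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
T, hidden, branches = 6, 8, 3
H = rng.normal(size=(T, hidden))             # target feature vectors h_t, one per step
betas = rng.normal(size=(branches, hidden))  # one parameter vector beta_i per branch

summary = np.zeros(hidden)
for beta in betas:
    weights = softmax(H @ beta)              # attention weights over the time steps
    summary += weights @ H                   # weighted sum of the h_t for this branch
```

Each branch produces its own weighting over the sentence, so words that matter to one branch but not another still contribute to the final sentence representation.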
The method has the advantage that, through the fusion of the BiGRU neural network and the multi-branch attention mechanism, entities and relations are fully learned and the relations can better identify entity features, while feature degradation during transfer is reduced. Fusing the multi-branch attention mechanism with the BiGRU also increases the weight of positively correlated relations: each attention branch is fused with each BiGRU output and the resulting weights are updated, so that every attention branch is influenced by the BiGRU output. Attention weights are then distributed over each statement so that the words and phrases in each statement affect the entity relations, strengthening the correct relations, entities, and statements and achieving superior relation recognition performance.
For example, in the sentence "Trump delivered a speech at the White House", when the entity relation of the sentence is "president", the words "speech" and "White House" better reflect the entity relation expressed in the sentence, so the multi-branch attention mechanism assigns those words larger weights. In the sentence "Trump is a famous American businessman", no word directly expresses "president", so every word obtains only a small weight when the weight for "president" is calculated; when the sentence expresses "businessman", the word "businessman" obtains a higher weight, thereby influencing the entity. Therefore, the BiGRU network fused with the multi-branch attention mechanism can better reflect the relation between the entity and each word in the sentence, and better obtain the feature expression.
S50: classifying the entity relationship of the sentence according to the target feature vector weight value of the sentence by using the softmax function in the entity relationship extraction model to obtain the probability that each word in the sentence is selected as the entity relationship, where the probability calculation formula is:
P(k) = exp(s_k) / Σ_{j=1}^{l} exp(s_j),
where s_k is the weighted score of the k-th word and P(k) is the probability that the k-th word in the sentence is selected as the entity relationship.
The training process of the entity relationship extraction model comprises the following steps:
the method comprises the steps of collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into a multi-branch attention system to calculate a target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability of each word in the sample being selected as the entity relationship;
continuously adjusting model parameters until a loss function is converged to obtain a trained entity relationship extraction model, wherein the formula of the loss function is as follows:
L=BCELoss(P(k),P(k')),
wherein, P (k) is the true probability that the kth word in the sample is used as the entity relationship, and P (k') is the predicted probability that the kth word in the sample is used as the entity relationship.
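The "adjust parameters until the loss converges" step can be illustrated with a toy stand-in: a single logistic unit trained with BCE by gradient descent on synthetic data (this is an illustrative assumption, not the patent's full PCNN-BiGRU-MulATT model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(3)
X = rng.normal(size=(64, 5))
y = (X[:, 0] > 0).astype(float)  # synthetic binary labels
w = np.zeros(5)

prev_loss = np.inf
for _ in range(500):
    p = sigmoid(X @ w)
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    if prev_loss - loss < 1e-6:                # convergence test on the loss
        break
    prev_loss = loss
    w -= 0.5 * (X.T @ (p - y)) / len(y)        # BCE gradient step
```

The loop stops once successive losses differ by less than a tolerance, the same convergence criterion the training procedure above describes.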
The entity relationship extraction method provided by this embodiment is based on a BiGRU network fused with a PCNN feature extraction network and a multi-branch attention mechanism, so as to better process entity feature expression, reduce feature degradation, and enable better interaction between entities and relations.
Example 2:
based on the entity relationship extraction method provided in the above embodiment, to verify the effect of the method, the entity relationship extraction model is evaluated on four data sets, where decibels of the four data sets are NYT, webNLG, ADE, and SciERC, and the sizes of the training set, the verification set, the test set, and each entity and relationship category corresponding to each data set are shown in table 1:
TABLE 1
The SciERC data set consists of the abstracts of 500 artificial-intelligence-related papers and has 2687 pieces of data in total: the training set has 1861 pieces, the test set has 551 pieces, and the verification set has 275 pieces; the data set has six entity types and 7 relation types.
The ADE data set is compiled from medical reports describing adverse reactions caused by drug use and comprises 4272 pieces of data. The original model was verified with 10-fold cross validation; in this experiment, 403 pieces of data were randomly sampled from the training set as the verification set, consistent with the previous verification method. The data set has only one relation type and two entity types, containing 6821 relations.
In the WebNLG data set, the training set has 5019 pieces of data, the test set has 703, and the verification set has 500, with 170 relation types.
There is no standard entity type in the NYT and WebNLG data sets, so the entity-type label is set to "NONE", and entity types are therefore not predicted on these data sets.
In this embodiment, the model is evaluated with the precision P, the recall R, and the F value; in relation extraction, an extraction is counted as correct only when the relation type between the entities is correct. The calculation formulas for P, R, and F are respectively:
P = TP / (TP + FP),
R = TP / (TP + FN),
F = 2PR / (P + R),
where TP represents the number of positive samples predicted positive, FP represents the number of negative samples predicted positive, and FN represents the number of positive samples predicted negative.
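The three metric formulas can be sketched directly from the TP/FP/FN counts (the counts below are illustrative, not from the experiments):

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f = 2 * p * r / (p + r)   # harmonic mean of precision and recall
    return p, r, f

p, r, f = prf(tp=80, fp=20, fn=20)
```

With equal FP and FN counts, precision and recall coincide and F equals both, which makes the example easy to check by hand.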
In this embodiment, the model parameters extracted by entity relationship are shown in table 2:
TABLE 2
In order to verify the superiority of the method, this embodiment further provides a comparative analysis of the CNN+ONE model and the CNN+ATT model against the PCNN_BiGRU_MulATT model adopted by the present invention on the different verification sets:
CNN+ONE: this model extracts sentence features with a convolutional neural network and then alleviates the label-error problem through multiple instances, where ONE means that only one sentence among the instances is selected for feature representation;
CNN+ATT: this model combines a CNN module with an attention mechanism module into a relation extraction model based on a segmented convolutional neural network and an attention mechanism; the CNN extracts the features, and the attention mechanism then assigns corresponding weight information to the relation between each part of the sentence and the target.
The results of the three models on the NYT data set are shown in table 3:
TABLE 3
| Model | P | R | F |
| --- | --- | --- | --- |
| CNN+ONE | 78.5 | 68.9 | 76.4 |
| CNN+ATT | 80.2 | 75.4 | 78.9 |
| PCNN_BiGRU_MulATT | 84.6 | 81.4 | 82.3 |
The experimental results of the three models on the WebNLG dataset are shown in table 4:
TABLE 4
| Model | P | R | F |
| --- | --- | --- | --- |
| CNN+ONE | 80.5 | 75.7 | 79.3 |
| CNN+ATT | 85.1 | 82.0 | 84.2 |
| PCNN_BiGRU_MulATT | 87.5 | 84.3 | 85.3 |
The results of the three models on the ADE dataset are shown in table 5:
TABLE 5
| Model | P | R | F |
| --- | --- | --- | --- |
| CNN+ONE | 83.1 | 77.6 | 82.9 |
| CNN+ATT | 87.0 | 84.6 | 85.8 |
| PCNN_BiGRU_MulATT | 91.5 | 86.3 | 88.7 |
The results of the three models on the SciERC data set are shown in table 6:
TABLE 6
Model | P | R | F |
---|---|---|---|
CNN+ONE | 55.7 | 52.0 | 54.9 |
CNN+ATT | 67.2 | 61.3 | 66.1 |
PCNN_BiGRU_MulATT | 69.2 | 67.9 | 68.2 |
As can be seen from tables 3, 4, 5 and 6, on every data set the model provided by the invention outperforms the other two models in precision, recall, and F1 score (the harmonic mean of precision and recall) when performing entity relationship extraction. The proposed model thus addresses the imbalance of feature-information interaction between entities and relations, the lack of a direct connection between the relation features extracted later and the entity features extracted earlier, and the propagation of erroneous feature information.
The specific embodiment of the present invention further provides an entity relationship extraction system, as shown in fig. 3, which includes:
the input module 100 is configured to input a sentence to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain a feature input vector of the sentence;
a feature extraction module 200, configured to extract a local feature vector of the feature input vector by using a PCNN feature extraction network in an entity relationship extraction model;
the bidirectional learning module 300 is configured to utilize a BiGRU network in the entity relationship extraction model to perform forward and backward learning on the local feature vectors to obtain target feature vectors;
an attention allocation module 400, configured to calculate a target feature vector weight value by using a multi-branch attention mechanism in an entity relationship extraction model;
the classification module 500 is configured to classify the entity relationship of the sentence by using the softmax function in the entity relationship extraction model, and calculate a probability that each word in the sentence is selected as the entity relationship.
Since the entity relationship extraction system is configured to implement the entity relationship extraction method, its specific implementation can be seen in the foregoing embodiments of the method: the input module 100 implements step S10, the feature extraction module 200 implements step S20, the bidirectional learning module 300 implements step S30, the attention allocation module 400 implements step S40, and the classification module 500 implements step S50.
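The five modules above form a linear pipeline (word2vec input, PCNN feature extraction, BiGRU bidirectional learning, multi-branch attention, softmax classification). An illustrative skeleton, with placeholder callables standing in for the patented stages:

```python
class EntityRelationExtractor:
    """Skeleton of the five-module pipeline; stage internals are
    injected callables, not the patented model implementations."""

    def __init__(self, embed, pcnn, bigru, attention, classify):
        # stage order mirrors modules 100 -> 500
        self.stages = [embed, pcnn, bigru, attention, classify]

    def run(self, sentence):
        x = sentence
        for stage in self.stages:  # each stage feeds the next
            x = stage(x)
        return x
```

With identity functions as stages, `run` simply returns its input, which makes the data flow easy to test before plugging in real models.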
The specific embodiment of the present invention further provides an entity relationship extraction device, including:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method described above when executing the computer program.
The embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the entity relationship extraction method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; an exhaustive enumeration of all embodiments is neither necessary nor possible, and obvious variations or modifications derived therefrom remain within the scope of the invention.
Claims (10)
1. An entity relationship extraction method, comprising:
inputting sentences to be subjected to entity relationship extraction into a trained entity relationship extraction model, and extracting feature input vectors of the sentences by using a word2vec network in the entity relationship extraction model;
inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
inputting the local feature vector of the sentence into a BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain a target feature vector of the sentence;
inputting the target feature vector of the sentence into a multi-branch attention mechanism in the entity relation extraction model to calculate a target feature vector weight value of the sentence;
and classifying the entity relation of the sentence according to the target characteristic vector weight value of the sentence by using the softmax function in the entity relation extraction model to obtain the probability of selecting each word in the sentence as the entity relation.
2. The entity relationship extraction method according to claim 1, wherein the extracting the feature input vector of the sentence by using the word2vec network in the entity relationship extraction model comprises:
acquiring an input text sequence S = {w_1, w_2, ..., w_l} of a sentence to be subjected to entity relationship extraction, where w_i represents the encoding of the i-th word in the sentence and l represents the length of the sentence;
inputting the input text sequence into a word2vec network to obtain a word vector of dimension d_p, and obtaining a text sequence M = {p_1, p_2, ..., p_m} of the distances from the first entity in the sentence to each word in the sentence, and a text sequence N = {p_1, p_2, ..., p_n} of the distances from the second entity in the sentence to each word in the sentence;
respectively inputting the text sequence M and the text sequence N into the word2vec network, and outputting two word vectors of dimension d_t;
splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain a feature input vector of dimension d_p + 2d_t.
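The splicing step of claim 2 can be sketched as a simple concatenation: a d_p-dimensional word vector plus two d_t-dimensional position vectors (distances to the first and second entity) yields a (d_p + 2·d_t)-dimensional input. Plain lists stand in for the word2vec outputs; the function name is an assumption:

```python
# Concatenate the word embedding with the two position embeddings
# to form the feature input vector of dimension d_p + 2*d_t.
def build_feature_vector(word_vec, pos_vec_e1, pos_vec_e2):
    assert len(pos_vec_e1) == len(pos_vec_e2)  # both have dimension d_t
    return word_vec + pos_vec_e1 + pos_vec_e2
```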
3. The entity relationship extraction method according to claim 1, wherein the calculation formula for inputting the local feature vector of the sentence into the BiGRU neural network in the entity relationship extraction model to perform bidirectional learning to obtain the target feature vector of the sentence is as follows:
z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t]),
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t,
H_t = [h_t^f ; h_t^b],
where r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r and W are the weight matrices of the update gate, the reset gate and the candidate hidden state respectively, σ is the sigmoid function, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, h_{t-1} is the output at the previous time, h_t^b is the feature vector obtained by backward learning, h_t^f is the feature vector obtained by forward learning, and H_t is the target feature vector.
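A single-unit GRU step following the gate equations of claim 3 can be sketched as follows (scalar state, no bias terms, illustrative weights; a BiGRU runs this both forward and backward over the sequence and concatenates the two hidden states):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(h_prev, x, w_z, w_r, w_h):
    z = sigmoid(w_z[0] * h_prev + w_z[1] * x)             # update gate z_t
    r = sigmoid(w_r[0] * h_prev + w_r[1] * x)             # reset gate r_t
    h_cand = math.tanh(w_h[0] * r * h_prev + w_h[1] * x)  # candidate state
    return (1 - z) * h_prev + z * h_cand                  # new hidden state h_t
```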
4. The entity relationship extraction method of claim 3, wherein the multi-branch attention mechanism for inputting the target feature vector of the sentence into the entity relationship extraction model calculates the target feature vector weight value of the sentence comprises:
calculating the weight of a target feature input vector of a sentence at the time t, wherein the calculation formula is as follows:
where β_i is an initialized feature parameter, H_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
and summing the weights of the target feature vectors of the sentence from 1 to t to obtain the target feature vector weight value of the sentence, wherein the calculation formula is as follows:
Z = Σ_{j=1}^{t} Z_j,
where j = {1, 2, 3, ..., t} and Z is the target feature vector weight value of the sentence.
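Claim 4's weighting can be sketched as applying per-time-step weights to the target feature vectors and accumulating the result. The per-step weight formula is not reproduced in this text (it survives only as an image in the source), so the weights are taken as given here:

```python
# Accumulate a weighted combination of per-step target feature
# vectors; `weights` are assumed to come from the multi-branch
# attention mechanism described above.
def attention_combine(vectors, weights):
    dim = len(vectors[0])
    out = [0.0] * dim
    for v, w in zip(vectors, weights):
        for i in range(dim):
            out[i] += w * v[i]   # weight each step's feature vector
    return out
```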
5. The entity relationship extraction method according to claim 4, wherein the probability of each word in the sentence being selected as the entity relationship is calculated by the following formula:
where P(k) is the probability that the k-th word in the sentence is selected as the entity relation.
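The softmax step of claim 5 turns per-word scores into a probability distribution over entity-relation candidates. The score computation itself is not spelled out in the text, so raw scores are taken as input in this sketch:

```python
import math

def softmax(scores):
    m = max(scores)                          # shift by max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]         # probabilities sum to 1
```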
6. The entity relationship extraction method according to claim 1, wherein the training process of the entity relationship extraction model is as follows:
collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into a multi-branch attention system to calculate a target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability that each word in the sample is selected as the entity relationship;
and continuously adjusting model parameters until the loss function is converged to obtain the trained entity relationship extraction model.
7. The entity relationship extraction method according to claim 6, wherein the loss function of the entity relationship extraction model is:
L=BCELoss(P(k),P(k')),
where P(k) is the true probability that the k-th word in the sample is the entity relation, and P(k') is the predicted probability that the k-th word in the sample is the entity relation.
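Claim 7's loss L = BCELoss(P(k), P(k')) is the standard binary cross-entropy. A sketch averaged over words, with clamping to avoid log(0); this is not the exact library routine used in training:

```python
import math

def bce_loss(targets, predictions, eps=1e-12):
    total = 0.0
    for y, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)      # clamp prediction away from 0 and 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)              # mean over words
```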
8. An entity relationship extraction system, comprising:
the input module is used for inputting sentences to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain characteristic input vectors of the sentences;
the characteristic extraction module is used for extracting a local characteristic vector of the characteristic input vector by utilizing a PCNN characteristic extraction network in the entity relationship extraction model;
the bidirectional learning module is used for performing forward and backward learning on the local feature vectors by using a BiGRU network in the entity relationship extraction model to obtain target feature vectors;
the attention distribution module is used for calculating a weight value of a target feature vector by utilizing a multi-branch attention mechanism in the entity relationship extraction model;
and the classification module is used for classifying the entity relationship of the sentence by utilizing the softmax function in the entity relationship extraction model and calculating the probability of selecting each word in the sentence as the entity relationship.
9. An entity relationship extraction device, characterized by comprising:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method of any one of claims 1-7 when executing said computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the entity relationship extraction method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211027598.4A CN115600595A (en) | 2022-08-25 | 2022-08-25 | Entity relationship extraction method, system, equipment and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211027598.4A CN115600595A (en) | 2022-08-25 | 2022-08-25 | Entity relationship extraction method, system, equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115600595A true CN115600595A (en) | 2023-01-13 |
Family
ID=84842421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211027598.4A Pending CN115600595A (en) | 2022-08-25 | 2022-08-25 | Entity relationship extraction method, system, equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115600595A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116842128A (en) * | 2023-09-01 | 2023-10-03 | 合肥机数量子科技有限公司 | Text relation extraction method and device, computer equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859912A (en) * | 2020-07-28 | 2020-10-30 | 广西师范大学 | PCNN model-based remote supervision relationship extraction method with entity perception |
CN112084778A (en) * | 2020-08-04 | 2020-12-15 | 中南民族大学 | Entity relation extraction method and device based on novel relation attention mechanism |
CN112256939A (en) * | 2020-09-17 | 2021-01-22 | 青岛科技大学 | Text entity relation extraction method for chemical field |
CN113312907A (en) * | 2021-06-18 | 2021-08-27 | 广东工业大学 | Remote supervision relation extraction method and device based on hybrid neural network |
- 2022-08-25 CN CN202211027598.4A patent/CN115600595A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859912A (en) * | 2020-07-28 | 2020-10-30 | 广西师范大学 | PCNN model-based remote supervision relationship extraction method with entity perception |
CN112084778A (en) * | 2020-08-04 | 2020-12-15 | 中南民族大学 | Entity relation extraction method and device based on novel relation attention mechanism |
CN112256939A (en) * | 2020-09-17 | 2021-01-22 | 青岛科技大学 | Text entity relation extraction method for chemical field |
CN113312907A (en) * | 2021-06-18 | 2021-08-27 | 广东工业大学 | Remote supervision relation extraction method and device based on hybrid neural network |
Non-Patent Citations (3)
Title |
---|
Tang Chao et al.: "A Hybrid Relation Extraction Model Combining ResNet and BiGRU", vol. 34, no. 2, pages 38-45 *
Wang Mingbo; Wang Zheng; Qiu Xiulian: "Person Relation Extraction Based on Bidirectional GRU and PCNN", no. 10 *
Gao Jingpeng: "Deep Learning: Convolutional Neural Network Technology and Practice", vol. 1, China Machine Press, page 280 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116842128A (en) * | 2023-09-01 | 2023-10-03 | 合肥机数量子科技有限公司 | Text relation extraction method and device, computer equipment and storage medium |
CN116842128B (en) * | 2023-09-01 | 2023-11-21 | 合肥机数量子科技有限公司 | Text relation extraction method and device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241524B (en) | Semantic analysis method and device, computer-readable storage medium and electronic equipment | |
CN113011533B (en) | Text classification method, apparatus, computer device and storage medium | |
CN111177374B (en) | Question-answer corpus emotion classification method and system based on active learning | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN107480143B (en) | Method and system for segmenting conversation topics based on context correlation | |
CN110597961A (en) | Text category labeling method and device, electronic equipment and storage medium | |
CN112819023A (en) | Sample set acquisition method and device, computer equipment and storage medium | |
CN113761868B (en) | Text processing method, text processing device, electronic equipment and readable storage medium | |
CN111222318A (en) | Trigger word recognition method based on two-channel bidirectional LSTM-CRF network | |
CN114564563A (en) | End-to-end entity relationship joint extraction method and system based on relationship decomposition | |
CN115130538A (en) | Training method of text classification model, text processing method, equipment and medium | |
CN112925904A (en) | Lightweight text classification method based on Tucker decomposition | |
CN112597285A (en) | Man-machine interaction method and system based on knowledge graph | |
CN115935983A (en) | Event extraction method and device, electronic equipment and storage medium | |
CN113220865B (en) | Text similar vocabulary retrieval method, system, medium and electronic equipment | |
CN111881264B (en) | Method and electronic equipment for searching long text in question-answering task in open field | |
CN115600595A (en) | Entity relationship extraction method, system, equipment and readable storage medium | |
CN116049376B (en) | Method, device and system for retrieving and replying information and creating knowledge | |
CN113065350A (en) | Biomedical text word sense disambiguation method based on attention neural network | |
CN111783464A (en) | Electric power-oriented domain entity identification method, system and storage medium | |
CN113468311B (en) | Knowledge graph-based complex question and answer method, device and storage medium | |
CN115510230A (en) | Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism | |
CN115062123A (en) | Knowledge base question-answer pair generation method of conversation generation system | |
CN115983269A (en) | Intelligent community data named entity identification method, terminal and computer medium | |
CN114239584A (en) | Named entity identification method based on self-supervision learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20230113 |