CN115600595A - Entity relationship extraction method, system, equipment and readable storage medium - Google Patents

Entity relationship extraction method, system, equipment and readable storage medium Download PDF

Info

Publication number
CN115600595A
CN115600595A
Authority
CN
China
Prior art keywords
sentence
entity relationship
relationship extraction
entity
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211027598.4A
Other languages
Chinese (zh)
Inventor
钱雪忠
江旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202211027598.4A priority Critical patent/CN115600595A/en
Publication of CN115600595A publication Critical patent/CN115600595A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention relates to an entity relationship extraction method, system, device, and readable storage medium. The entity relationship extraction method provided by the invention better handles entity feature expression, enables sentence feature information to retain the dependency information before and after the sentence during propagation, reduces degradation of sentence feature values during transfer, better resolves the long-term dependency and reverse feature-dependency transfer problems, and thereby better identifies entities and relations.

Description

Entity relationship extraction method, system, equipment and readable storage medium
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method, a system, a device, and a readable storage medium for extracting an entity relationship.
Background
With the rapid development of the modern internet and the industrial application of artificial intelligence and big data, data is generated at an exponential rate, and this data contains much valuable information. Because of heavy redundancy, large volume, diverse sources, and varied structure, mining the high-value information hidden in this disordered information has become increasingly urgent. Information extraction, which derives structured information from unstructured text, has therefore become increasingly important, and the relation extraction task within information extraction is the most critical because it extracts the relationship between two entities.
Relation extraction is widely applied in natural language processing, for example in building knowledge graphs. The original purpose of the knowledge graph is to give machines recognition and reasoning capability by storing the interconnections between entities and establishing dependency paths between entities and relations. Knowledge graphs are widely used in industry, academia, and daily life, for example in information recommendation, question answering systems, and intelligent search. In intelligent search, answers to related questions can be obtained from input text, as in current voice assistants and chat robots; knowledge graphs have also been widely used in medical fields, for example to intelligently search for similar cases according to symptoms and support more precise treatment. The relation extraction task therefore has great research significance: it can greatly facilitate daily life and reduce tedious tasks. As a result, more and more researchers study relation extraction in natural language processing, and improvements in relation extraction performance broaden its application in informatization and in industries such as the medical field.
Existing relation extraction methods automatically extract features through a convolutional neural network, which reduces a large amount of feature-labeling work and saves considerable human resources; however, the convolutional neural network cannot refine pooled features, so semantic information is lost. Recurrent neural network models have also been proposed for entity relationship extraction, but they suffer from vanishing and exploding gradients, easily lose long-distance relations, and have difficulty handling long-distance dependencies.
In summary, the existing relationship extraction technology has the problems of semantic information loss and difficulty in long-distance relationship extraction.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the problems that semantic information is lost when the relationship is extracted and the long-distance relationship is difficult to extract in the prior art.
In order to solve the above technical problem, the present invention provides an entity relationship extraction method, including:
inputting sentences to be subjected to entity relationship extraction into a trained entity relationship extraction model, and extracting feature input vectors of the sentences by using a word2vec network in the entity relationship extraction model;
inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
inputting the local feature vector of the sentence into a BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain a target feature vector of the sentence;
inputting the target feature vector of the sentence into a multi-branch attention mechanism in the entity relation extraction model to calculate a target feature vector weight value of the sentence;
and classifying the entity relation of the sentence according to the target characteristic vector weight of the sentence by using a softmax function in the entity relation extraction model to obtain the probability that each word in the sentence is selected as the entity relation.
In an embodiment of the present invention, the extracting a feature input vector of a sentence by using a word2vec network in the entity relationship extraction model includes:
obtaining an input text sequence S = {w_1, w_2, ..., w_l} of the sentence to be subjected to entity relationship extraction, wherein w_i represents the code of the i-th word in the sentence and l represents the length of the sentence;
inputting the input text sequence into the word2vec network to output a word vector of dimension d_p, and obtaining from the word vector a text sequence M = {p_1, p_2, ..., p_m} of the distances from the first entity in the sentence to each word in the sentence and a text sequence N = {p_1, p_2, ..., p_n} of the distances from the second entity in the sentence to each word in the sentence;
respectively inputting the text sequence M and the text sequence N into the word2vec network and outputting two word vectors of dimension d_t;
splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain the feature input vector of dimension d_p + 2d_t.
In an embodiment of the present invention, the calculation formulas for inputting the local feature vector of the sentence into the BiGRU neural network in the entity relationship extraction model and performing bidirectional learning to obtain the target feature vector of the sentence are:

z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t]),
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t,
h→_t = GRU(x_t, h→_{t-1}),
h←_t = GRU(x_t, h←_{t+1}),
h_t = h→_t ⊕ h←_t,

wherein r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r, and W are the weight matrices of the reset gate, the update gate, and the candidate hidden state respectively, σ is the sigmoid function, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, h_{t-1} is the output at the previous time, h←_t is the feature vector obtained by backward learning, h→_t is the feature vector obtained by forward learning, and h_t = h→_t ⊕ h←_t is the target feature vector.
In one embodiment of the present invention, inputting the target feature vector of the sentence into the multi-branch attention mechanism in the entity relationship extraction model to calculate the target feature vector weight value of the sentence comprises:
calculating the weight of the target feature vector of the sentence at time t, the calculation formula being:

Z_t = β_i · h_t,

wherein β_i is an initialized feature parameter, h_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
summing the weights of the target feature vectors of the sentence from 1 to t to obtain the target feature vector weight value of the sentence, the calculation formula being:

H_t = Σ_{j=1}^{t} Z_j,

where j = {1, 2, 3, ...}.
In one embodiment of the present invention, the calculation formula for the probability of each word in the sentence being selected as the entity relationship is:

P(k) = exp(o_k) / Σ_j exp(o_j),

wherein o_k is the score of the k-th word obtained from the target feature vector weight values, and P(k) is the probability that the k-th word in the sentence is selected as the entity relationship.
In an embodiment of the present invention, the training process of the entity relationship extraction model is as follows:
the method comprises the steps of collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into the multi-branch attention mechanism to calculate the target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability of each word in the sample being selected as the entity relationship;
and continuously adjusting the model parameters until the loss function converges to obtain the trained entity relationship extraction model.
In one embodiment of the present invention, the loss function of the entity relationship extraction model is:
L=BCELoss(P(k),P(k')),
wherein, P (k) is the true probability that the kth word in the sample is used as the entity relationship, and P (k') is the predicted probability that the kth word in the sample is used as the entity relationship.
The invention also provides an entity relationship extraction system, which comprises:
the input module is used for inputting sentences to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain characteristic input vectors of the sentences;
the characteristic extraction module is used for extracting a local characteristic vector of the characteristic input vector by utilizing a PCNN characteristic extraction network in the entity relationship extraction model;
the bidirectional learning module is used for utilizing a BiGRU network in the entity relationship extraction model to enable the local feature vectors to be subjected to forward and backward learning to obtain target feature vectors;
the attention distribution module is used for calculating a weight value of the target feature vector by utilizing a multi-branch attention mechanism in the entity relationship extraction model;
and the classification module is used for classifying the entity relationship of the sentence by utilizing the softmax function in the entity relationship extraction model and calculating the probability of selecting each word in the sentence as the entity relationship.
The invention also provides an entity relationship extraction device, which comprises:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method when executing the computer program.
The invention also provides a computer-readable storage medium, on which a computer program is stored; when the computer program is executed by a processor, the steps of the entity relationship extraction method are implemented.
The entity relationship extraction method provided by the invention first extracts the word vectors of the sentence with a word2vec network, and then introduces a PCNN feature extraction network to effectively collect the feature information of the sentence, so that the dependency information before and after the sentence is retained during the propagation of sentence feature information and degradation of the sentence feature values during transfer is reduced. The BiGRU neural network then uses this sentence feature information to identify and train on the sentence more fully, better addressing the long-term dependency and reverse feature-dependency transfer problems. Finally, the multi-branch attention mechanism obtains the correlation between each part of the sentence and the relation, assigns corresponding weights, and gives higher weight to the correct relation and entities, strengthening the positive entities and relations and improving the relation extraction performance of the model.
Drawings
In order that the present disclosure may be more readily and clearly understood, reference is now made to the following detailed description of the embodiments of the present disclosure taken in conjunction with the accompanying drawings, in which
FIG. 1 is a flow chart of a method for entity relationship extraction;
FIG. 2 is a diagram of a model PCNN-BiGRU-MulATT;
FIG. 3 is a diagram of an entity relationship extraction system.
Detailed Description
The present invention is further described below in conjunction with the drawings and the embodiments so that those skilled in the art can better understand the present invention and can carry out the present invention, but the embodiments are not to be construed as limiting the present invention.
Relation extraction can extract the relations between entities in unstructured text, thereby providing users with more accurate and comprehensive information. Entity relationship extraction extracts the binary relation between entities in text to form a relation triple (Entity1, Relationship, Entity2), where Entity1 and Entity2 represent two entities and Relationship represents the relation between the two entities. Given the sentence "Liang Sicheng is a famous Chinese architect", the relationship between the two entities can be seen from the sentence.
Example 1:
referring to fig. 1, an entity relationship extraction method provided in the embodiment of the present invention includes:
S10: inputting the sentence to be subjected to entity relationship extraction into the trained entity relationship extraction model and extracting the feature input vector of the sentence by using the word2vec network in the entity relationship extraction model, which specifically includes the following steps:
S100: acquiring an input text sequence S = {w_1, w_2, ..., w_l} of the sentence to be subjected to entity relationship extraction, wherein w_i represents the code of the i-th word in the sentence and l represents the length of the sentence;
S101: inputting the input text sequence into the word2vec model to output a word vector of dimension d_p, and obtaining from the word vector a text sequence M = {p_1, p_2, ..., p_m} of the distances from the first entity in the sentence to each word in the sentence and a text sequence N = {p_1, p_2, ..., p_n} of the distances from the second entity in the sentence to each word in the sentence;
S102: respectively inputting the text sequence M and the text sequence N into the word2vec model and outputting two word vectors of dimension d_t;
S103: splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain the feature input vector of dimension d_p + 2d_t.
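As an illustrative sketch only — the dimensions and the random stand-ins for the word2vec output and position embeddings below are assumptions, not the patent's trained parameters — the splicing of the word vector with the two position vectors amounts to a concatenation along the feature axis:

```python
import numpy as np

def splice_features(word_vecs, pos1_vecs, pos2_vecs):
    """Concatenate the d_p-dimensional word vectors with the two
    d_t-dimensional position vectors, giving d_p + 2*d_t per word."""
    return np.concatenate([word_vecs, pos1_vecs, pos2_vecs], axis=-1)

l, d_p, d_t = 6, 8, 2                  # example sentence length and dims
word_vecs = np.random.randn(l, d_p)    # stand-in for word2vec output
pos1_vecs = np.random.randn(l, d_t)    # distances to the first entity (M)
pos2_vecs = np.random.randn(l, d_t)    # distances to the second entity (N)
feats = splice_features(word_vecs, pos1_vecs, pos2_vecs)
print(feats.shape)  # (6, 12), i.e. (l, d_p + 2*d_t)
```

The per-word feature input vector thus carries both lexical and entity-position information into the PCNN stage.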
S20: inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
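The patent does not detail the internals of the PCNN here; as a hedged sketch of the standard piecewise max pooling that gives PCNN its name (the entity positions and filter count below are illustrative assumptions), the local feature extraction pools the convolution output in three segments split at the two entity positions:

```python
import numpy as np

def piecewise_max_pool(conv_out, e1_pos, e2_pos):
    """Split the per-position convolution output into three segments
    around the two entity positions and max-pool each segment,
    yielding 3 values per filter instead of 1 (plain max pooling)."""
    a, b = sorted((e1_pos, e2_pos))
    segments = [conv_out[:a + 1], conv_out[a + 1:b + 1], conv_out[b + 1:]]
    return np.concatenate([seg.max(axis=0) for seg in segments])

conv_out = np.random.randn(10, 4)      # 10 positions, 4 convolution filters
pooled = piecewise_max_pool(conv_out, e1_pos=2, e2_pos=6)
print(pooled.shape)  # (12,) = 3 segments x 4 filters
```

Pooling each segment separately preserves coarse positional structure relative to the entities, which plain max pooling discards.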
S30: inputting the local feature vector of the sentence into the BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain the target feature vector of the sentence, the specific formulas comprising:

z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t]),
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t,

wherein r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r, and W are the weight matrices of the reset gate, the update gate, and the candidate hidden state respectively, σ is the sigmoid function, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, and h_{t-1} is the output at the previous time;

h→_t = GRU(x_t, h→_{t-1}),
h←_t = GRU(x_t, h←_{t+1}),
h_t = h→_t ⊕ h←_t,

wherein h←_t is the feature vector obtained by backward learning, h→_t is the feature vector obtained by forward learning, and h_t = h→_t ⊕ h←_t is the target feature vector;
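The GRU equations above can be written out directly; this is a minimal numerical illustration of the computation (random weights and toy dimensions, not the trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, Wz, Wr, W):
    """One GRU step: update gate z_t, reset gate r_t, candidate state,
    and the blended output h_t, following the formulas in S30."""
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(Wz @ hx)
    r_t = sigmoid(Wr @ hx)
    h_cand = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))
    return (1 - z_t) * h_prev + z_t * h_cand

def bigru(xs, params_fwd, params_bwd):
    """Run the sequence forward and backward with separate parameters,
    then concatenate per-step hidden states into target feature vectors."""
    hidden = params_fwd[0].shape[0]
    h, fwd = np.zeros(hidden), []
    for x in xs:
        h = gru_step(h, x, *params_fwd)
        fwd.append(h)
    h, bwd = np.zeros(hidden), []
    for x in reversed(xs):
        h = gru_step(h, x, *params_bwd)
        bwd.append(h)
    bwd.reverse()
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

d, hidden, steps = 4, 3, 5
make = lambda: tuple(np.random.randn(hidden, hidden + d) for _ in range(3))
xs = [np.random.randn(d) for _ in range(steps)]
outs = bigru(xs, make(), make())
print(len(outs), outs[0].shape)  # 5 (6,)
```

Each output vector is the concatenation h→_t ⊕ h←_t, so its dimension is twice the hidden size.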
Since a GRU network learning in a single direction cannot let backward features influence forward features, while in ordinary sentences the forward and backward features influence each other, this embodiment uses the BiGRU neural network to fully learn the features in both directions.
S40: inputting the target feature vector of the sentence into the multi-branch attention mechanism in the entity relationship extraction model to calculate the target feature vector weight value of the sentence, which specifically includes:
calculating the weight of the target feature vector of the sentence at time t, the calculation formula being:

Z_t = β_i · h_t,

wherein β_i is an initialized feature parameter, h_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
summing the weights of the target feature vectors of the sentence from 1 to t to obtain the target feature vector weight value of the sentence, the calculation formula being:

H_t = Σ_{j=1}^{t} Z_j,

where j = {1, 2, 3, ...}.
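The exact parameterisation of the multi-branch attention is an image in the original; as a hedged sketch consistent with the surrounding description (one learned scoring vector per branch is an assumption), each branch weights the target feature vectors over the time steps and the branches are then fused:

```python
import numpy as np

def attention_branch(H, beta):
    """One branch: score each target feature vector against the
    initialised parameter vector beta, softmax over time steps,
    and sum the weighted vectors into a sentence representation."""
    scores = H @ beta
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # attention weights over time
    return (alpha[:, None] * H).sum(axis=0)

def multi_branch_attention(H, betas):
    """Fuse several attention branches, each with its own parameters."""
    return np.mean([attention_branch(H, b) for b in betas], axis=0)

H = np.random.randn(5, 6)                        # 5 steps, 6-dim vectors
betas = [np.random.randn(6) for _ in range(3)]   # 3 attention branches
s = multi_branch_attention(H, betas)
print(s.shape)  # (6,)
```

Each branch can emphasise different words, and averaging the branches gives the fused sentence representation passed to the classifier.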
Through the fusion of the BiGRU neural network and the multi-branch attention mechanism, entities and relations are fully learned: the relations can better identify entity features, and feature degradation during transfer is reduced. Fusing the multi-branch attention mechanism with the BiGRU increases the weight of positive relations. In the multi-branch attention mechanism, each attention branch is fused with each BiGRU output and the resulting weights are updated, so each branch is influenced by the BiGRU output and attention weights can be assigned to each sentence. The words and phrases in each sentence thereby influence the entity relations, strengthening correct relations, entities, and sentences and achieving superior relation recognition performance.
For example, in the sentence "Trump held the inauguration ceremony at the White House", when the entity relationship of the sentence is president, the words ceremony and White House best reflect the entity relationship expressed in the sentence, so the multi-branch attention mechanism assigns larger weights to these words. In the sentence "Trump is a famous American businessman", no word directly expresses president, so every word obtains little weight when the weight for president is calculated; when the sentence expresses businessman, the word businessman obtains a higher weight, thereby influencing the entity. Therefore, the BiGRU network fused with the multi-branch attention mechanism can better reflect the relation between the entity and each word in the sentence and obtain a better feature expression.
S50: classifying the entity relationship of the sentence according to the target feature vector weight value of the sentence by using the softmax function in the entity relationship extraction model to obtain the probability that each word in the sentence is selected as the entity relationship, the probability calculation formula being:

P(k) = exp(o_k) / Σ_j exp(o_j),

wherein o_k is the score of the k-th word obtained from the target feature vector weight values, and P(k) is the probability that the k-th word in the sentence is selected as the entity relationship.
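The softmax in S50 can be written out directly (the input scores here are illustrative, computed stably by subtracting the maximum score first):

```python
import numpy as np

def softmax(scores):
    """P(k) = exp(o_k) / sum_j exp(o_j), numerically stabilised."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # example per-word scores
p = softmax(scores)
print(round(p.sum(), 6))  # 1.0: a proper distribution over the words
```

The word with the highest score receives the highest probability, and the outputs sum to one, so they can be compared directly as selection probabilities.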
The training process of the entity relationship extraction model comprises the following steps:
the method comprises the steps of collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into the multi-branch attention mechanism to calculate the target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability of each word in the sample being selected as the entity relationship;
continuously adjusting model parameters until a loss function is converged to obtain a trained entity relationship extraction model, wherein the formula of the loss function is as follows:
L=BCELoss(P(k),P(k')),
wherein, P (k) is the true probability that the kth word in the sample is used as the entity relationship, and P (k') is the predicted probability that the kth word in the sample is used as the entity relationship.
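The BCELoss used for training can be sketched as follows; this is a minimal vectorised version (the real training loop, optimiser, and data are omitted), clipped away from 0 and 1 to avoid log(0):

```python
import numpy as np

def bce_loss(p_true, p_pred, eps=1e-12):
    """Binary cross-entropy L = BCELoss(P(k), P(k')) between the true
    and predicted probabilities of each word being the relation."""
    p_pred = np.clip(p_pred, eps, 1.0 - eps)
    return -np.mean(p_true * np.log(p_pred)
                    + (1.0 - p_true) * np.log(1.0 - p_pred))

perfect = bce_loss(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
poor = bce_loss(np.array([1.0, 0.0]), np.array([0.1, 0.9]))
print(perfect < poor)  # True: the loss shrinks as predictions improve
```

Training adjusts the model parameters to drive this loss toward convergence, as described above.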
The entity relationship extraction method provided by this embodiment is based on a BiGRU network fused with a PCNN feature extraction network and a multi-branch attention mechanism, so as to better handle entity feature expression, reduce the feature degradation problem, and enable better interaction between entities and relations.
Example 2:
based on the entity relationship extraction method provided in the above embodiment, to verify the effect of the method, the entity relationship extraction model is evaluated on four data sets, where decibels of the four data sets are NYT, webNLG, ADE, and SciERC, and the sizes of the training set, the verification set, the test set, and each entity and relationship category corresponding to each data set are shown in table 1:
TABLE 1
[Table 1 appears as an image in the original document.]
The SciERC data set is taken from the abstract sections of 500 artificial-intelligence-related papers and contains 2687 data items in total, of which the training set has 1861 items, the test set has 551 items, and the validation set has 275 items; the data set contains six entity types and 7 relation types.
The ADE data set is compiled from medical reports describing adverse reactions caused by drug use and contains 4272 data items. The original model was validated using 10-fold cross validation; in this experiment, 403 items were randomly sampled from the training set as the validation set, consistent with the previous validation method. The data set has only one relation type and two entity types, containing 6821 relations.
In the WebNLG data set, the training set has 5019 items, the test set has 703 items, the validation set has 500 items, and there are 170 relation types.
There is no standard entity type in the NYT and WebNLG data sets, so the entity type label is set to "NONE" and entity types are not predicted on these two data sets.
In this embodiment, the model is evaluated using the precision P, the recall R, and the F value. In relation extraction, an extraction is counted as correct only when the relation type between the entities is correct. The precision P, recall R, and F value are calculated respectively as:

P = TP / (TP + FP),
R = TP / (TP + FN),
F = 2 × P × R / (P + R),

where TP represents the number of positive samples predicted as positive, FP represents the number of negative samples predicted as positive, and FN represents the number of positive samples predicted as negative.
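The three metrics can be checked with a small helper (the counts below are made up for illustration only):

```python
def precision_recall_f(tp, fp, fn):
    """P = TP/(TP+FP), R = TP/(TP+FN), F = 2PR/(P+R)."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f = 2 * p * r / (p + r)
    return p, r, f

p, r, f = precision_recall_f(tp=8, fp=2, fn=2)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.8 0.8 0.8
```

When precision and recall are equal, the F value (their harmonic mean) equals them both, as the example shows.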
The parameters of the entity relationship extraction model used in this embodiment are shown in Table 2:
TABLE 2
[Table 2 appears as an image in the original document.]
To verify the superiority of the method, this embodiment further provides comparative analysis results of the CNN+ONE model, the CNN+ATT model, and the PCNN_BiGRU_MulATT model adopted by the present invention on the different validation sets:
CNN+ONE: the model uses a convolutional neural network to extract sentence features and then alleviates the label-error problem with multiple instances, where ONE means that only one sentence among the sentence instances is selected for feature representation;
CNN+ATT: the model combines a CNN module with an attention mechanism module, using a relation extraction model that joins a piecewise convolutional neural network with an attention mechanism; the CNN extracts the features, and the attention mechanism then assigns corresponding weight information to the relation between each part of the sentence and the target.
The results of the three models on the NYT data set are shown in table 3:
TABLE 3
Model               P     R     F
CNN+ONE             78.5  68.9  76.4
CNN+ATT             80.2  75.4  78.9
PCNN_BiGRU_MulATT   84.6  81.4  82.3
The experimental results of the three models on the WebNLG dataset are shown in table 4:
TABLE 4
Model               P     R     F
CNN+ONE             80.5  75.7  79.3
CNN+ATT             85.1  82.0  84.2
PCNN_BiGRU_MulATT   87.5  84.3  85.3
The results of the three models on the ADE dataset are shown in table 5:
TABLE 5
Model               P     R     F
CNN+ONE             83.1  77.6  82.9
CNN+ATT             87.0  84.6  85.8
PCNN_BiGRU_MulATT   91.5  86.3  88.7
The results of the three models on the SciERC data set are shown in table 6:
TABLE 6
Model               P     R     F
CNN+ONE             55.7  52.0  54.9
CNN+ATT             67.2  61.3  66.1
PCNN_BiGRU_MulATT   69.2  67.9  68.2
As can be seen from Tables 3, 4, 5 and 6, on every data set the model provided by the invention outperforms the other two models in precision, recall, and the F value (the harmonic mean of precision and recall) for entity relationship extraction. The proposed model can well solve the problem of unbalanced feature-information interaction between entities and relations, as well as the problems that subsequently extracted relation features have no direct connection with previously extracted entity features and that erroneous information features propagate.
The specific embodiment of the present invention further provides an entity relationship extraction system, as shown in fig. 3, which includes:
the input module 100 is configured to input a sentence to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain a feature input vector of the sentence;
a feature extraction module 200, configured to extract a local feature vector of the feature input vector by using a PCNN feature extraction network in an entity relationship extraction model;
the bidirectional learning module 300 is configured to utilize a BiGRU network in the entity relationship extraction model to perform forward and backward learning on the local feature vectors to obtain target feature vectors;
an attention allocation module 400, configured to calculate a target feature vector weight value by using a multi-branch attention mechanism in an entity relationship extraction model;
the classification module 500 is configured to classify the entity relationship of the sentence by using the softmax function in the entity relationship extraction model, and calculate a probability that each word in the sentence is selected as the entity relationship.
The entity relationship extraction system is configured to implement the entity relationship extraction method, and thus a specific implementation of the entity relationship extraction system can be seen in the foregoing embodiment of the entity relationship extraction method, for example, the input module 100 is configured to implement the step S10 in the entity relationship extraction method, the feature extraction module 200 is configured to implement the step S20 in the entity relationship extraction method, the bidirectional learning module 300 is configured to implement the step S30 in the entity relationship extraction method, the attention allocation module 400 is configured to implement the step S40 in the entity relationship extraction method, and the classification module 500 is configured to implement the step S50 in the entity relationship extraction method.
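The data flow through the five modules above can be sketched in a toy skeleton. All class names, dimensions, and the stand-in random embeddings are hypothetical illustrations, and modules 200–400 are collapsed into a single shape-preserving stub; this is not the patented implementation:

```python
import numpy as np

class EntityRelationExtractionPipeline:
    """Toy skeleton mirroring modules 100-500: embed -> feature extraction -> classify."""

    def __init__(self, vocab_size=1000, d_p=50, d_t=5, n_relations=4, seed=0):
        rng = np.random.default_rng(seed)
        self.word_emb = rng.normal(size=(vocab_size, d_p))  # stand-in for word2vec vectors
        self.pos_emb = rng.normal(size=(200, d_t))          # stand-in position embeddings
        self.w_out = rng.normal(size=(d_p + 2 * d_t, n_relations))

    def embed(self, tokens, e1_dist, e2_dist):
        # Module 100: concatenate word vector with the two entity-distance vectors.
        return np.concatenate([self.word_emb[tokens],
                               self.pos_emb[e1_dist],
                               self.pos_emb[e2_dist]], axis=-1)

    def classify(self, feats):
        # Modules 200-500 reduced to a stub: pool, project, softmax over relations.
        scores = feats.mean(axis=0) @ self.w_out
        exp = np.exp(scores - scores.max())
        return exp / exp.sum()

pipe = EntityRelationExtractionPipeline()
probs = pipe.classify(pipe.embed([3, 7, 9], [0, 1, 2], [2, 1, 0]))
print(probs.shape)
```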
The specific embodiment of the present invention further provides an entity relationship extraction device, including:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method described above when executing the computer program.
The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the entity relationship extraction method described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments exhaustively, and obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (10)

1. An entity relationship extraction method, comprising:
inputting sentences to be subjected to entity relationship extraction into a trained entity relationship extraction model, and extracting feature input vectors of the sentences by using a word2vec network in the entity relationship extraction model;
inputting the characteristic input vector of the sentence into a PCNN characteristic extraction network in the entity relationship extraction model to extract a local characteristic vector of the sentence;
inputting the local feature vector of the sentence into a BiGRU neural network in the entity relationship extraction model for bidirectional learning to obtain a target feature vector of the sentence;
inputting the target feature vector of the sentence into a multi-branch attention mechanism in the entity relation extraction model to calculate a target feature vector weight value of the sentence;
and classifying the entity relation of the sentence according to the target characteristic vector weight value of the sentence by using the softmax function in the entity relation extraction model to obtain the probability of selecting each word in the sentence as the entity relation.
2. The entity relationship extraction method according to claim 1, wherein the extracting the feature input vector of the sentence by using the word2vec network in the entity relationship extraction model comprises:
acquiring an input text sequence S = {w_1, w_2, ..., w_l} of the sentence to be subjected to entity relationship extraction, wherein w_i represents the code of the i-th word in the sentence and l represents the length of the sentence;
inputting the input text sequence into the word2vec network to output a word vector of dimension d_p, and obtaining a text sequence M = {p_1, p_2, ..., p_m} of the distances from the first entity in the sentence to each word in the sentence and a text sequence N = {p_1, p_2, ..., p_n} of the distances from the second entity in the sentence to each word in the sentence;
respectively inputting the text sequence M and the text sequence N into the word2vec network to output two word vectors of dimension d_t;
splicing the word vector of dimension d_p with the two word vectors of dimension d_t to obtain a feature input vector of dimension d_p + 2·d_t.
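The splicing step in claim 2 can be sanity-checked with a small sketch. The dimensions d_p = 50 and d_t = 5 and the random vectors are assumptions chosen only to illustrate the shape arithmetic:

```python
import numpy as np

d_p, d_t, sentence_len = 50, 5, 6  # assumed illustrative dimensions
rng = np.random.default_rng(42)

word_vecs = rng.normal(size=(sentence_len, d_p))  # word2vec output, one row per word
dist_e1 = rng.normal(size=(sentence_len, d_t))    # embedded distances to the first entity
dist_e2 = rng.normal(size=(sentence_len, d_t))    # embedded distances to the second entity

# Splice the three vectors per word: final dimension is d_p + 2*d_t.
features = np.concatenate([word_vecs, dist_e1, dist_e2], axis=1)
print(features.shape)  # (6, 60)
```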
3. The entity relationship extraction method according to claim 1, wherein the local feature vector of the sentence is input into the BiGRU neural network in the entity relationship extraction model for bidirectional learning, and the target feature vector of the sentence is obtained according to the following formulas:
z_t = σ(W_z · [h_{t-1}, x_t]),
r_t = σ(W_r · [h_{t-1}, x_t]),
h̃_t = tanh(W · [r_t ⊙ h_{t-1}, x_t]),
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t,
h_t^f = GRU(x_t, h_{t-1}^f),
h_t^b = GRU(x_t, h_{t+1}^b),
H_t = [h_t^f, h_t^b],
wherein r_t is the reset gate, z_t is the update gate, x_t is the input data, W_z, W_r and W are the weight matrices of the reset gate, the update gate and the candidate hidden state respectively, σ is the sigmoid function, ⊙ denotes element-wise multiplication, h̃_t is the output of the candidate hidden state, h_t is the output at the current time, h_{t-1} is the output at the previous time, h_t^f is the feature vector obtained by forward learning, h_t^b is the feature vector obtained by backward learning, and H_t is the target feature vector obtained by splicing the two.
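The claim-3 recurrence can be sketched in plain numpy. The tiny dimensions, random weights, and the `gru_step`/`bigru` names are hypothetical; this illustrates the standard GRU equations as reconstructed above, not the patented code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_z, W_r, W):
    """One GRU step: each weight matrix acts on the concatenation [h_{t-1}; x_t]."""
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ concat)                                 # update gate
    r_t = sigmoid(W_r @ concat)                                 # reset gate
    h_cand = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))   # candidate hidden state
    return (1.0 - z_t) * h_prev + z_t * h_cand                  # current output h_t

def bigru(xs, params_f, params_b, hidden):
    """Run the sequence forward and backward, then splice the two states per step."""
    h_f, h_b = np.zeros(hidden), np.zeros(hidden)
    fwd, bwd = [], []
    for x in xs:                        # forward learning
        h_f = gru_step(x, h_f, *params_f)
        fwd.append(h_f)
    for x in reversed(xs):              # backward learning
        h_b = gru_step(x, h_b, *params_b)
        bwd.append(h_b)
    # reversed(bwd) realigns the backward states with the forward time order.
    return [np.concatenate([f, b]) for f, b in zip(fwd, reversed(bwd))]

rng = np.random.default_rng(0)
d_in, hidden = 4, 3
params = lambda: tuple(rng.normal(size=(hidden, hidden + d_in)) for _ in range(3))
out = bigru([rng.normal(size=d_in) for _ in range(5)], params(), params(), hidden)
print(len(out), out[0].shape)  # one spliced vector of dimension 2*hidden per step
```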
4. The entity relationship extraction method according to claim 3, wherein inputting the target feature vector of the sentence into the multi-branch attention mechanism in the entity relationship extraction model to calculate the target feature vector weight value of the sentence comprises:
calculating the weight of the target feature vector of the sentence at time t, wherein the calculation formula is as follows:
Z_t = softmax(β_i) · H_t,
wherein β_i is the initialized feature parameter, H_t is the target feature vector, and Z_t is the weight of the target feature vector of the sentence at time t;
summing the weights of the target feature vectors of the sentence from 1 to t to obtain the target feature vector weight value of the sentence, wherein the calculation formula is as follows:
Z = Σ_{j=1}^{t} Z_j,
where j = {1, 2, 3, ..., t} and Z is the target feature vector weight value of the sentence.
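The exact claim-4 formula is lost behind image placeholders in the published text; the sketch below assumes a softmax over the initialized parameters β followed by a weighted sum over time steps, which matches the surrounding description. Function name and dimensions are hypothetical:

```python
import numpy as np

def multi_branch_weight_value(H, beta):
    """Weight each time step's target feature vector H_t by softmax(beta),
    then sum over t to obtain the sentence-level weight value Z (claim-4 style)."""
    alpha = np.exp(beta - beta.max())
    alpha = alpha / alpha.sum()        # softmax(beta_i): one weight per time step
    Z_t = alpha[:, None] * H           # per-step weighted target feature vectors
    return Z_t.sum(axis=0)             # Z: summed target feature vector weight value

rng = np.random.default_rng(1)
H = rng.normal(size=(5, 6))            # t = 5 target feature vectors of dimension 6
beta = rng.normal(size=5)
Z = multi_branch_weight_value(H, beta)
print(Z.shape)  # (6,)
```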
5. The entity relationship extraction method according to claim 4, wherein the probability of each word in the sentence being selected as the entity relationship is calculated by the following formula:
P(k) = exp(Z_k) / Σ_{i=1}^{n} exp(Z_i),
wherein P(k) is the probability that the k-th word in the sentence is selected as the entity relationship.
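A dependency-light sketch of the softmax normalization in claim 5 (the input scores are illustrative; the max-subtraction is a standard numerical-stability detail assumed here):

```python
import numpy as np

def relation_probabilities(scores):
    """Softmax turning per-word scores into the probability that each word
    is selected as the entity relationship (claim-5 style)."""
    exp = np.exp(scores - scores.max())  # subtract max for numerical stability
    return exp / exp.sum()

P = relation_probabilities(np.array([2.0, 1.0, 0.1]))
print(P.round(3).tolist())  # probabilities sum to 1; highest score wins
```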
6. The entity relationship extraction method according to claim 1, wherein the training process of the entity relationship extraction model is as follows:
collecting sentences containing different entities and relations as training samples, and inputting the training samples into a word2vec network to obtain characteristic input vectors of the samples;
inputting the characteristic input vector of the sample into a PCNN characteristic extraction network to extract a local characteristic vector of the sample;
inputting the local feature vector of the sample into a BiGRU neural network for bidirectional learning to obtain a target feature vector of the sample;
inputting the target feature vector of the sample into the multi-branch attention mechanism to calculate a target feature vector weight value of the sample;
classifying the entity relationship of the sample according to the target characteristic vector weight of the sample by using a softmax function to obtain the probability that each word in the sample is selected as the entity relationship;
and continuously adjusting model parameters until the loss function is converged to obtain the trained entity relationship extraction model.
7. The entity relationship extraction method according to claim 6, wherein the loss function of the entity relationship extraction model is:
L=BCELoss(P(k),P(k')),
wherein, P (k) is the true probability that the kth word in the sample is used as the entity relationship, and P (k') is the predicted probability that the kth word in the sample is used as the entity relationship.
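Claim 7's loss is a binary cross-entropy between the true and predicted per-word probabilities (PyTorch's BCELoss computes the same quantity). A dependency-free sketch, where the clamping epsilon is an implementation detail assumed for illustration:

```python
import math

def bce_loss(p_true, p_pred, eps=1e-12):
    """Binary cross-entropy between true and predicted probabilities that each
    word is the entity relationship, averaged over words (BCELoss-style)."""
    total = 0.0
    for y, q in zip(p_true, p_pred):
        q = min(max(q, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(q) + (1 - y) * math.log(1 - q))
    return total / len(p_true)

loss = bce_loss([1.0, 0.0, 0.0], [0.9, 0.1, 0.2])
print(round(loss, 4))
```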
8. An entity relationship extraction system, comprising:
the input module is used for inputting sentences to be subjected to entity relationship extraction into a word2vec network in a trained entity relationship extraction model to obtain characteristic input vectors of the sentences;
the characteristic extraction module is used for extracting a local characteristic vector of the characteristic input vector by utilizing a PCNN characteristic extraction network in the entity relationship extraction model;
the bidirectional learning module is used for utilizing a BiGRU network in the entity relationship extraction model to enable the local feature vectors to be subjected to forward and backward learning to obtain target feature vectors;
the attention distribution module is used for calculating a weight value of a target feature vector by utilizing a multi-branch attention mechanism in the entity relationship extraction model;
and the classification module is used for classifying the entity relationship of the sentence by utilizing the softmax function in the entity relationship extraction model and calculating the probability of selecting each word in the sentence as the entity relationship.
9. An entity relationship extraction device, characterized by comprising:
a memory: for storing a computer program;
a processor: for implementing the steps of the entity relationship extraction method of any one of claims 1-7 when executing said computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the entity relationship extraction method of any one of claims 1 to 7.
CN202211027598.4A 2022-08-25 2022-08-25 Entity relationship extraction method, system, equipment and readable storage medium Pending CN115600595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211027598.4A CN115600595A (en) 2022-08-25 2022-08-25 Entity relationship extraction method, system, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN115600595A true CN115600595A (en) 2023-01-13

Family

ID=84842421


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116842128A (en) * 2023-09-01 2023-10-03 合肥机数量子科技有限公司 Text relation extraction method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859912A (en) * 2020-07-28 2020-10-30 广西师范大学 PCNN model-based remote supervision relationship extraction method with entity perception
CN112084778A (en) * 2020-08-04 2020-12-15 中南民族大学 Entity relation extraction method and device based on novel relation attention mechanism
CN112256939A (en) * 2020-09-17 2021-01-22 青岛科技大学 Text entity relation extraction method for chemical field
CN113312907A (en) * 2021-06-18 2021-08-27 广东工业大学 Remote supervision relation extraction method and device based on hybrid neural network


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Tang Chao et al.: "A hybrid relation extraction model combining ResNet and BiGRU", vol. 34, no. 2, pp. 38-45 *
Wang Mingbo; Wang Zheng; Qiu Xiulian: "Person relation extraction based on bidirectional GRU and PCNN", no. 10 *
Gao Jingpeng: "Deep Learning: Convolutional Neural Network Technology and Practice", vol. 1, China Machine Press, p. 280 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230113