CN111428481A

CN111428481A - Entity relation extraction method based on deep learning

Info

Publication number: CN111428481A
Application number: CN202010222471.2A
Authority: CN
Inventors: 路松峰
Original assignee: Nanjing Souwen Information Technology Co ltd
Current assignee: Nanjing Souwen Information Technology Co ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2020-07-17

Abstract

The invention discloses an entity relation extraction method based on deep learning. Entity relations are extracted using a convolutional neural network (CNN) and a recurrent neural network, with word vector features, position features, local features, sequence features and the like of the text supplied to the CNN and LSTM neural networks. In addition to using each network on its own, the convolutional neural network and the recurrent neural network are combined in series and in parallel, and the combined models are used to extract entity relations, so that features are learned from different angles and a more complete learning capability is obtained. Finally, the several previously designed deep-learning relation extraction models are used jointly to extract entity relations, and the better-supported entity relation is selected for each sample.

Description

Entity relation extraction method based on deep learning

Technical Field

The invention relates to the field of entity relationship extraction, in particular to an entity relationship extraction method based on deep learning.

Background

The task of entity relation extraction is to perform semantic recognition on the entity pairs in a text and to judge, from the meaning of each entity pair and of the sentence containing it, whether a relation exists between the entities and, if so, which type of relation it is. Early entity relation extraction mainly adopted pattern matching. Rule-based methods analyze implicit features in the text: domain experts manually define rules and patterns, and relations are discovered by matching against these relation patterns. Such methods require people with specialized domain knowledge to write the rules by hand, which demands a large amount of human effort. Because the rules are highly domain-specific, they are of limited use when applied to other fields. Nevertheless, these early approaches, which relied primarily on rules, achieved good results in a number of specialized domains.

With the development and application of machine learning, entity relation extraction gained a new line of research, and methods that extract entity relations by machine learning have attracted wide attention. Unsupervised, weakly supervised and supervised machine learning methods have all been studied for this task. Unsupervised and weakly supervised methods do not need much manually labeled sample data, so training depends less on sample labels, but the training process is easily disturbed by noise, which degrades extraction performance. Extraction methods based on supervised learning have been studied and applied in many fields and achieve better extraction results. Supervised methods, however, require comparatively more effort: the extraction model depends on large labeled data sets, training proceeds smoothly only when enough labeled data is available, and obtaining labeled data is time-consuming and laborious, so supervised learning generally presupposes a sufficient investment of labor.

Most entity relation extraction methods based on machine learning need to extract features from the text, perform lexical analysis, syntactic analysis and the like, and rely on linguistic knowledge and natural language processing tools to extract entity relations automatically. Analyzing the technical terms and nouns of many specialized fields, and annotating their data, also requires the participation of people with rich professional knowledge, so machine learning methods based on linguistic features cannot simply be transferred to other fields.

Disclosure of Invention

The invention aims to provide an entity relation extraction method based on deep learning aiming at the defects in the prior art.

In order to achieve the above object, the present invention provides an entity relationship extraction method based on deep learning, which includes:

extracting features from the input sample with a CNN neural network and learning them, to obtain a first candidate relation type in this mode;

extracting features from the input sample with a BLSTM neural network and learning them, to obtain a second candidate relation type in this mode;

extracting and learning features from the input sample with a CNN neural network and a BLSTM neural network respectively, then inputting the information learned by the CNN neural network into the BLSTM neural network, and the information learned by the BLSTM neural network into the CNN neural network, for further learning, so as to obtain a third candidate relationship type and a fourth candidate relationship type respectively;

extracting and learning features from the input sample with a CNN neural network and a BLSTM neural network respectively, and splicing the learned features to obtain a fifth candidate relationship type;

and evaluating the first candidate relationship type, the second candidate relationship type, the third candidate relationship type, the fourth candidate relationship type and the fifth candidate relationship type, and selecting the candidate relationship type with the highest score as a final relationship classification result of the sample.

Further, if there is more than one candidate relationship type with the highest score, one type is randomly selected from all the candidate relationship types with the highest score as the classification result.

Furthermore, the CNN neural network extracts local features using sliding windows of multiple sizes, samples the learned features by max pooling, and then trains the entity relationship extraction model through computations such as a fully connected layer, a softmax operation and back propagation.
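The forward pass described above (multi-size sliding windows, max pooling, a fully connected layer and softmax) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the tanh activation, the single filter per window size and the hand-set weights are all assumptions made for illustration, and training by back propagation is not shown.

```python
import math

def conv_max_pool(sentence, filt, bias=0.0):
    """Slide one filter (k x d weights) over the sentence and max-pool the responses.

    sentence: list of d-dimensional token vectors; must contain at least k tokens.
    The tanh activation is an assumption; the source does not name one.
    """
    k = len(filt)
    responses = []
    for start in range(len(sentence) - k + 1):
        window = sentence[start:start + k]
        z = bias + sum(w * x
                       for w_row, x_row in zip(filt, window)
                       for w, x in zip(w_row, x_row))
        responses.append(math.tanh(z))
    return max(responses)  # max pooling over all window positions

def multi_window_features(sentence, filters):
    """One pooled feature per filter; filters may have different window sizes."""
    return [conv_max_pool(sentence, f) for f in filters]

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights, biases):
    """Fully connected layer followed by softmax over relation types."""
    scores = [b + sum(w * f for w, f in zip(row, features))
              for row, b in zip(weights, biases)]
    return softmax(scores)

# Toy input: 4 tokens with 2-dimensional vectors, window sizes 2 and 3.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [[[1.0, 1.0]] * 2, [[1.0, 1.0]] * 3]
features = multi_window_features(sentence, filters)
probs = classify(features, [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0.0, 0.0, 0.0])
print(features, probs)
```

In a real model each window size would have many filters with learned weights, and sentences shorter than the widest window would need padding.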

Further, the sample sentence is input into the model using a bidirectional LSTM (BLSTM). In the entity relationship extraction model, sample data is input into the LSTM neural units in the original order of the words in the sentence, and the state $h_t$ at each time step depends on the state at the previous time step and the state at the current time step. $h_t$ is formed by splicing the forward-order and reverse-order outputs of the BLSTM at time t, specifically:

$$h_t = [\overrightarrow{h_t} ; \overleftarrow{h_t}]$$

where $\overrightarrow{h_t}$ is the output of the BLSTM at time t for the sentence input in forward order and $\overleftarrow{h_t}$ is the output at time t for the sentence input in reverse order, so that each BLSTM node learns features from the forward and backward sequences simultaneously.

Further, the feature extraction from the input sample includes a word vector feature, a position feature, a local feature and a sequence feature.

Advantageous effects: the method exploits the connection between the local, sequential and other characteristics of text sentences and the relation categories of the text. By combining a convolutional neural network with a recurrent neural network, it brings the strengths of both network structures into play, and it uses several models jointly to extract entity relations, strengthening the models' ability to learn the various characteristics of the text on their own. Based on the characteristics of the two kinds of network, several combined models are designed that connect them in series and in parallel. A single neural network model and the series and parallel combined models differ in network structure, so each may emphasize a different aspect of learning, and the features learned by any single model are comparatively narrow. Combining neural network models of different structures lets features be learned from different angles and yields a fuller, more comprehensive learning capability. Thus, besides extracting entity relations with the convolutional neural network and the recurrent neural network individually, the two networks are combined in series and in parallel so that features are extracted from the samples automatically. Finally, the several previously designed deep-learning relation extraction models are used jointly for entity relation extraction, and the better-supported entity relation is selected for each sample.
Neural units are used to simulate the learning process of the human brain, and automatic relation extraction is accomplished by constructing structures such as convolutional and recurrent neural networks. Compared with traditional machine-learning-based methods, the deep-learning-based entity relation extraction method requires no manual feature extraction, because the deep learning algorithm is able to learn features automatically.

Drawings

FIG. 1 is a schematic diagram of the multi-model joint entity relationship extraction process;

FIG. 2 is a schematic diagram of the series combined model in which the CNN is followed by the BLSTM;

FIG. 3 is a schematic diagram of the series combined model in which the BLSTM is followed by the CNN;

FIG. 4 is a schematic diagram of the parallel combined model in which the CNN and the BLSTM are connected in parallel.

Detailed Description

The present invention will be further illustrated with reference to the accompanying drawings and specific examples, which are carried out on the premise of the technical solution of the present invention, and it should be understood that these examples are only for illustrating the present invention and are not intended to limit the scope of the present invention.

As shown in fig. 1 to 4, an embodiment of the present invention provides an entity relationship extraction method based on deep learning, including:

Features are extracted from the input sample and learned using a CNN neural network, to obtain a first candidate relation type in this mode. The CNN considered here uses multiple windows together with the positions of the entity pair: the position feature of each word is its distance from the entities e1 and e2, and the sample input vector is formed by splicing the word features of the sentence with the position features. The input vector is fed to the model through the input layer, the convolutional layer extracts local features using several sliding windows, the learned features are sampled by max pooling, and the entity relation extraction model is then trained through a fully connected layer, a softmax operation, back propagation and related computations.
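As an illustration of the position feature, the sketch below computes each word's signed distance from e1 and e2 and splices those two numbers onto the word vector. The function names, the example sentence and the use of raw distances are assumptions; the source does not say whether distances are used directly or first mapped to learned position embeddings.

```python
def relative_positions(tokens, e1_index, e2_index):
    """Signed distance of every token from entity e1 and from entity e2."""
    return [(i - e1_index, i - e2_index) for i in range(len(tokens))]

def build_inputs(word_vectors, e1_index, e2_index):
    """Splice the two position features onto each word vector."""
    return [vec + [float(i - e1_index), float(i - e2_index)]
            for i, vec in enumerate(word_vectors)]

# Hypothetical example sentence with e1 = "Jobs" (index 0) and e2 = "Apple" (index 2).
tokens = ["Jobs", "founded", "Apple"]
print(relative_positions(tokens, 0, 2))
print(build_inputs([[0.1], [0.2], [0.3]], 0, 2))
```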

Sample sentences are input into the model using a bidirectional LSTM (Bidirectional Long Short-Term Memory, BLSTM). In the entity relation extraction model, sample data is input into the LSTM neural units in the original order of the words in the sentence, and the state $h_t$ at each time step depends on the state at the previous time step and the state at the current time step, specifically:

$$h_t = [\overrightarrow{h_t} ; \overleftarrow{h_t}]$$

where $\overrightarrow{h_t}$ is the output of the BLSTM at time t for the sentence input in forward order and $\overleftarrow{h_t}$ is the output at time t for the sentence input in reverse order. $h_t$ is thus formed by splicing the forward-order and reverse-order outputs of the BLSTM at time t, so that a BLSTM node performs feature learning on the forward and backward sequences simultaneously.
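The splicing of forward and reverse outputs can be sketched with a deliberately tiny LSTM whose input and hidden state are single numbers. The cell equations are the standard LSTM gates; the scalar sizes, the parameter naming and the constant weights are assumptions chosen only to keep the example short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_params(value):
    """Hypothetical helper: input (w), recurrent (u) and bias (b) weights per gate."""
    return {kind + gate: value for kind in "wub" for gate in "ifog"}

def lstm_step(x, h, c, p):
    """One step of a scalar LSTM cell: returns the new hidden and cell states."""
    pre = {g: p["w" + g] * x + p["u" + g] * h + p["b" + g] for g in "ifog"}
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = math.tanh(pre["g"])          # candidate cell value
    c = f * c + i * g                # forget part of the old cell, add new content
    return o * math.tanh(c), c

def run_lstm(xs, p):
    """Hidden state after each input, reading xs from left to right."""
    h, c, out = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        out.append(h)
    return out

def blstm(xs, p_fwd, p_bwd):
    """Pair the forward and reverse hidden states at each time step t."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # realign to the original word order
    return list(zip(fwd, bwd))              # h_t = [forward h_t ; reverse h_t]

states = blstm([1.0, -2.0, 0.5], make_params(0.5), make_params(-0.3))
print(states)
```

With vector-valued states the pairing becomes a concatenation, so each spliced state is twice the size of one direction's hidden state.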

Features are extracted from the input samples and learned using a CNN neural network and a BLSTM neural network respectively; the information learned by the CNN neural network is then input into the BLSTM neural network, and the information learned by the BLSTM neural network is input into the CNN neural network, for further learning, so as to obtain a third candidate relationship type and a fourth candidate relationship type respectively.

Features are extracted from the input samples and learned using a CNN neural network and a BLSTM neural network respectively, and the learned features are spliced to obtain a fifth candidate relationship type.
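The wiring of the two series models and the parallel model can be sketched as function composition versus concatenation. The two encoders below are deliberately simple stand-ins, not a real CNN or BLSTM: `local_encoder` averages each window and `sequence_encoder` keeps running sums in both directions. These stand-ins, the function names and the final max pooling are assumptions made only to show how data flows through each combination.

```python
def local_encoder(seq, k=3):
    """Stand-in for the CNN: per-dimension mean of each zero-padded width-k window (k odd)."""
    d, pad = len(seq[0]), k // 2
    padded = [[0.0] * d] * pad + seq + [[0.0] * d] * pad
    return [[sum(row[j] for row in padded[i:i + k]) / k for j in range(d)]
            for i in range(len(seq))]

def sequence_encoder(seq):
    """Stand-in for the BLSTM: running sums in both directions, spliced per time step."""
    def running(rows):
        acc, out = [0.0] * len(rows[0]), []
        for row in rows:
            acc = [a + x for a, x in zip(acc, row)]
            out.append(acc)
        return out
    fwd = running(seq)
    bwd = running(seq[::-1])[::-1]   # realign to the original order
    return [f + b for f, b in zip(fwd, bwd)]

def max_pool(seq):
    """Largest value of each feature dimension across all positions."""
    return [max(col) for col in zip(*seq)]

def cnn_s_blstm(seq):
    """Series model: the local encoder's output feeds the sequence encoder."""
    return max_pool(sequence_encoder(local_encoder(seq)))

def blstm_s_cnn(seq):
    """Series model: the sequence encoder's output feeds the local encoder."""
    return max_pool(local_encoder(sequence_encoder(seq)))

def cnn_p_blstm(seq):
    """Parallel model: both encoders read the input; their features are spliced."""
    return max_pool(local_encoder(seq)) + max_pool(sequence_encoder(seq))

sample = [[1.0, 2.0], [3.0, 0.0], [0.0, 1.0]]
print(cnn_s_blstm(sample), blstm_s_cnn(sample), cnn_p_blstm(sample))
```

In each case the resulting feature vector would then go to a classifier that outputs the candidate relation type for that model.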

The first, second, third, fourth and fifth candidate relationship types are evaluated, and the candidate relationship type with the highest score is selected as the final relationship classification result of the sample. Specifically, the prediction results of CNN, BLSTM, CNN-S-BLSTM (CNN followed in series by BLSTM), BLSTM-S-CNN (BLSTM followed in series by CNN) and CNN-P-BLSTM (CNN in parallel with BLSTM) are counted: the relationship type that each model predicts with the highest probability receives 1 point, and the relationship type with the highest total score is selected as the final relationship classification result of the sample. For example, if the five models predict R₂, R₂, R₁, R₃ and R₂ respectively, then R₂ scores highest and is taken as the final result of the relationship classification.

In the present embodiment, if more than one candidate relationship type has the highest score, one type is randomly selected from all the candidate relationship types with the highest score as the classification result. For example, if the models predict R₂, R₂, R₃, R₁ and R₃ respectively, then R₂ and R₃ have the same, highest score, and one of R₂ and R₃ is chosen at random as the result of the relationship classification.
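The scoring and tie-breaking rule can be written as a short voting function. This is a minimal sketch: the function name `vote`, the string labels and the optional `rng` argument (added so that the random choice can be made reproducible) are assumptions, not part of the source.

```python
import random
from collections import Counter

def vote(predictions, rng=random):
    """Return the relation type predicted by the most models.

    predictions: one predicted relation label per model.
    Ties for the highest score are broken by a uniform random choice.
    """
    scores = Counter(predictions)   # each model adds 1 point to its predicted type
    best = max(scores.values())
    winners = sorted(label for label, s in scores.items() if s == best)
    return winners[0] if len(winners) == 1 else rng.choice(winners)

print(vote(["R2", "R2", "R1", "R3", "R2"]))   # R2 receives 3 of the 5 points
print(vote(["R2", "R2", "R3", "R1", "R3"]))   # R2 and R3 tie, so either may be returned
```

Passing `rng=random.Random(seed)` with a fixed seed makes the tie-break repeatable across runs, which can help when comparing experiments.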

The features extracted from the input samples by the CNN neural network and the BLSTM neural network respectively include word vector features, position features, local features, sequence features and the like, and the different influences of these features on the performance of the entity relationship extraction model are analyzed.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that other parts not specifically described belong to the prior art or to the common general knowledge of those of ordinary skill in the art. Several improvements and modifications can be made without departing from the principle of the invention, and these improvements and modifications should also be regarded as falling within the scope of protection of the invention.

Claims

1. An entity relationship extraction method based on deep learning is characterized by comprising the following steps:

2. The deep learning-based entity relationship extraction method according to claim 1, wherein, if more than one candidate relationship type has the highest score, one type is randomly selected from all the candidate relationship types with the highest score as the classification result.

3. The entity relationship extraction method based on deep learning of claim 1, wherein the CNN neural network extracts local features using sliding windows of multiple sizes, samples the learned features by max pooling, and then trains the entity relationship extraction model through computations such as a fully connected layer, a softmax operation and back propagation.

4. The entity relationship extraction method based on deep learning of claim 1, wherein the sample sentence is input into the model using a bidirectional LSTM (BLSTM); in the entity relationship extraction model, the sample data is input into the LSTM neural units in the original order of the words in the sentence, and the state $h_t$ at each time step depends on the state at the previous time step and the state at the current time step; $h_t$ is formed by splicing the forward-order and reverse-order outputs of the BLSTM at time t, specifically:

$$h_t = [\overrightarrow{h_t} ; \overleftarrow{h_t}]$$

where $\overrightarrow{h_t}$ is the output of the BLSTM at time t for the sentence input in forward order, and $\overleftarrow{h_t}$ is the output at time t for the sentence input in reverse order.

5. The entity relationship extraction method based on deep learning of claim 1, wherein the feature extraction from the input sample comprises word vector feature, position feature, local feature and sequence feature.