CN111611395B - Entity relationship identification method and device - Google Patents


Publication number
CN111611395B
Authority
CN
China
Prior art keywords: word, feature vector, entity, text, feature
Prior art date
Legal status: Active
Application number
CN201910139176.8A
Other languages
Chinese (zh)
Other versions
CN111611395A (en
Inventor
罗文娟
李奘
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201910139176.8A priority Critical patent/CN111611395B/en
Publication of CN111611395A publication Critical patent/CN111611395A/en
Application granted granted Critical
Publication of CN111611395B publication Critical patent/CN111611395B/en

Abstract

The application provides an entity relationship identification method and device. The method includes: extracting a feature vector of the entity text and a feature vector of the unstructured text contained in a target corpus; determining, based on the feature vector of the unstructured text and a pre-trained first classification model, the prediction probability that the unstructured text belongs to each of a plurality of entity relationships, and forming a probability distribution feature vector from the prediction probabilities corresponding to the plurality of entity relationships; and identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector, and a pre-trained second classification model. With this method, the entity relationship of the target corpus can be predicted accurately.

Description

Entity relationship identification method and device
Technical Field
The application relates to the technical field of knowledge graphs, and in particular to an entity relationship identification method and device.
Background
Knowledge graph technology is widely applied in natural language processing, text retrieval, knowledge reasoning, chatbots, and other fields. In general, a knowledge graph can be described as a number of entities and the relationships between those entities. At present, because manually annotated data is limited, entities and the relationships between them often need to be mined automatically from unstructured text data when constructing a knowledge graph. For example, given the piece of unstructured text data "Zhang San graduated from Beijing University", it is necessary to extract the entity A corresponding to "Zhang San" and the entity B corresponding to "Beijing University", and also to extract the relationship between the two entities, namely "graduated from".
However, it is difficult to mine the relationship between entities from such unstructured text data, and a scheme capable of intelligently identifying the relationship between entities is needed.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a method and apparatus for identifying an entity relationship, so as to intelligently identify the entity relationship between entities.
In a first aspect, an embodiment of the present application provides a method for identifying an entity relationship, including:
extracting feature vectors of entity texts and feature vectors of unstructured texts contained in the target corpus;
based on the feature vector of the unstructured text and a first classification model obtained by pre-training, determining the prediction probability of each entity relationship of the unstructured text belonging to a plurality of entity relationships, and forming probability distribution feature vectors by the prediction probabilities respectively corresponding to the plurality of entity relationships;
and identifying entity relations in the target corpus based on the feature vectors of the entity texts, the probability distribution feature vectors and a second classification model obtained by pre-training.
In a possible implementation manner, the extracting feature vectors of unstructured text included in the target corpus includes:
extracting a word vector of each word in the unstructured text, and extracting, for each word in the unstructured text, a position word vector representing that word's positional relationship in the target corpus.
In a possible implementation manner, the determining, based on the feature vector of the unstructured text and a first classification model obtained by training in advance, a prediction probability that the unstructured text belongs to each entity relationship in a plurality of entity relationships includes:
for each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word;
inputting the first feature vector corresponding to each word into a first classification model obtained through pre-training to extract features, and obtaining a second feature vector corresponding to the unstructured text;
calculating the entity relation correlation degree between the second feature vector and the feature vector of each entity relation in a plurality of entity relations;
adjusting a feature value in the second feature vector based on the second feature vector and each calculated entity relationship correlation;
and classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relation.
In a possible implementation manner, the adjusting the feature value in the second feature vector based on the second feature vector and each calculated entity relationship correlation degree includes:
for the i-th dimension feature value in the second feature vector, where i is a positive integer, calculating the product of the i-th dimension feature value and each entity relationship correlation, and taking the sum of the calculated products as the i-th dimension feature value of the adjusted second feature vector.
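As a rough illustration, this adjustment can be sketched in Python (plain lists stand in for feature vectors; the function name and data are illustrative, not from the patent):

```python
def adjust_second_feature_vector(second_fv, relation_correlations):
    """For each dimension i, multiply the i-th feature value by every
    entity-relation correlation and sum the products; the sum becomes
    the i-th feature value of the adjusted vector."""
    adjusted = []
    for value in second_fv:
        adjusted.append(sum(value * corr for corr in relation_correlations))
    return adjusted

print(adjust_second_feature_vector([1.0, 2.0], [0.5, 0.25]))  # [0.75, 1.5]
```

Note that summing the per-relation products is equivalent to scaling each dimension by the total correlation mass, so the adjustment rescales the vector according to how strongly it relates to the candidate relations overall.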
In a possible implementation manner, the step of inputting the first feature vector corresponding to each word into the first classification model obtained by training in advance to perform feature extraction to obtain the second feature vector corresponding to the unstructured text includes:
for the j-th word in the unstructured text, j is a positive integer, executing the following operations:
transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector;
calculating text relativity between a first transformation feature vector corresponding to the j-th word and a word vector of each word in the unstructured text;
based on the first transformation feature vector corresponding to the jth word and the calculated text correlation corresponding to each word, adjusting the feature value in the first transformation feature vector corresponding to the jth word;
And splicing the adjusted first transformation feature vector corresponding to each word in the unstructured text to obtain a second feature vector corresponding to the unstructured text.
In a possible implementation manner, the adjusting the feature value in the first transformation feature vector corresponding to the jth word based on the first transformation feature vector corresponding to the jth word and the calculated text relevance corresponding to each word includes:
for the k-th dimension feature value in the first transformation feature vector corresponding to the j-th word, where k is a positive integer, calculating the product of the k-th dimension feature value and the text correlation corresponding to each word, and taking the sum of the calculated products as the k-th dimension feature value of the adjusted first transformation feature vector.
In a possible implementation manner, the transforming the first feature vector corresponding to the jth word to obtain a first transformed feature vector includes:
performing bidirectional feature extraction on the first feature vector corresponding to the j-th word to obtain a forward feature vector and a backward feature vector;
and constructing a first transformation feature vector corresponding to the j-th word based on the forward feature vector and the backward feature vector.
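A toy sketch of the bidirectional step, with simple running sums standing in for a real bidirectional recurrent network (e.g. a BiLSTM, which the patent does not name explicitly); it only illustrates how a forward and a backward vector are produced for each word and then concatenated:

```python
def bidirectional_features(word_vectors):
    """Forward state at position j accumulates words 0..j; backward
    state accumulates words j..end; the per-word transformed vector
    is the concatenation of the forward and backward vectors."""
    n, dim = len(word_vectors), len(word_vectors[0])
    forward, state = [], [0.0] * dim
    for vec in word_vectors:
        state = [s + v for s, v in zip(state, vec)]
        forward.append(state)
    backward, state = [None] * n, [0.0] * dim
    for j in range(n - 1, -1, -1):
        state = [s + v for s, v in zip(state, word_vectors[j])]
        backward[j] = state
    # first transformation feature vector for word j = forward ++ backward
    return [f + b for f, b in zip(forward, backward)]

print(bidirectional_features([[1.0], [2.0], [3.0]]))
```

The concatenation doubles the per-word dimensionality, so each transformed vector carries both left-to-right and right-to-left context.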
In a possible implementation manner, the identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector and a second classification model obtained by pre-training includes:
and splicing the feature vector of the entity text with the probability distribution feature vector, inputting the feature vector into the second classification model obtained by pre-training, and identifying the entity relation in the target corpus.
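A minimal sketch of this splicing step; the linear-plus-softmax classifier here is a hypothetical stand-in for the pre-trained second classification model, and all weights and relation names are made up for illustration:

```python
import math

def identify_relation(entity_fv, prob_dist_fv, weights, biases, relations):
    """Concatenate the entity-text feature vector with the probability
    distribution feature vector, score each candidate relation with a
    linear layer, and return the highest-probability relation."""
    x = list(entity_fv) + list(prob_dist_fv)
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    exps = [math.exp(s) for s in scores]
    probs = [e / sum(exps) for e in exps]
    return relations[probs.index(max(probs))], probs

relations = ["graduated_from", "located_in"]
# 2-dim entity vector ++ 2-dim probability vector -> 4 inputs per relation
weights = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]]
best, probs = identify_relation([0.3, 0.1], [0.8, 0.2],
                                weights, [0.0, 0.0], relations)
print(best)
```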
In a possible implementation manner, the extracting the feature vector of the entity text included in the target corpus includes:
extracting word vectors of each word of the entity text, and determining feature vectors of the entity text according to the extracted word vectors of each word.
In a possible implementation manner, the determining the feature vector of the entity text according to the extracted word vector of each word includes:
for the l-th dimension feature value in the word vector of each word, where l is a positive integer, adding the l-th dimension feature values of the word vectors of all the words, and taking the resulting sum as the l-th dimension feature value of the feature vector of the entity text.
In a possible embodiment, the method further comprises:
Acquiring a sample corpus set, wherein the sample corpus set comprises at least one sample corpus marked with entity relations;
extracting sample feature vectors of entity texts and sample feature vectors of unstructured texts contained in each sample corpus;
training the first classification model based on the sample feature vector of the unstructured text corresponding to each sample corpus and the entity relationship labeled for each sample corpus;
after determining that training of the first classification model is complete, determining the probability distribution feature vector of each sample corpus based on the sample feature vector of the unstructured text corresponding to each sample corpus and the first classification model;
and training the second classification model based on the feature vector of the entity text and the probability distribution feature vector corresponding to each sample corpus and the entity relationship labeled for each sample corpus, until it is determined that training of the second classification model is complete.
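The two-stage training procedure can be sketched end to end; the nearest-centroid "models" and tiny hand-made vectors below are stand-ins (the patent's actual classifiers are pre-trained models learned from real sample corpora), and only the data flow mirrors the steps above:

```python
import math

def train_centroids(vectors, labels):
    """Trivial stand-in 'training': one mean vector per relation label."""
    sums, counts = {}, {}
    for v, y in zip(vectors, labels):
        s = sums.setdefault(y, [0.0] * len(v))
        sums[y] = [a + b for a, b in zip(s, v)]
        counts[y] = counts.get(y, 0) + 1
    return {y: [x / counts[y] for x in s] for y, s in sums.items()}

def predict_probs(centroids, v):
    """Softmax over negative squared distances to each relation centroid."""
    scores = {y: -sum((a - b) ** 2 for a, b in zip(c, v))
              for y, c in centroids.items()}
    exps = {y: math.exp(s) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

# sample corpus set: unstructured-text vectors, entity-text vectors, labels
unstructured_fvs = [[0.0, 1.0], [1.0, 0.0]]
entity_fvs = [[1.0], [0.0]]
labels = ["graduated_from", "located_in"]

# 1) train the first classification model on unstructured-text vectors
first_model = train_centroids(unstructured_fvs, labels)

# 2) build a probability distribution feature vector per sample corpus
relations = sorted(first_model)
prob_fvs = [[predict_probs(first_model, v)[r] for r in relations]
            for v in unstructured_fvs]

# 3) train the second model on entity vector ++ probability vector
second_inputs = [e + p for e, p in zip(entity_fvs, prob_fvs)]
second_model = train_centroids(second_inputs, labels)
```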
In a second aspect, an embodiment of the present application further provides an apparatus for identifying an entity relationship, including:
the extraction module is used for extracting the feature vector of the entity text and the feature vector of the unstructured text included in the target corpus;
The determining module is used for determining the prediction probability of each entity relation of the unstructured text belonging to a plurality of entity relations based on the feature vector of the unstructured text and a first classification model obtained through pre-training, and forming probability distribution feature vectors by the prediction probabilities respectively corresponding to the plurality of entity relations;
and the identification module is used for identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector and a second classification model obtained by pre-training.
In one possible design, the extracting module is specifically configured to, when extracting a feature vector of an unstructured text included in the target corpus:
extracting word vectors of each word in the unstructured text, and extracting position word vectors used for representing the position relation of each word in the target corpus in the unstructured text.
In one possible design, the determining module is specifically configured to, when determining, based on the feature vector of the unstructured text and the first classification model obtained by training in advance, a prediction probability that the unstructured text belongs to each entity relationship in the plurality of entity relationships:
For each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word;
inputting the first feature vector corresponding to each word into a first classification model obtained through pre-training to extract features, and obtaining a second feature vector corresponding to the unstructured text;
calculating the entity relation correlation degree between the second feature vector and the feature vector of each entity relation in a plurality of entity relations;
adjusting a feature value in the second feature vector based on the second feature vector and each calculated entity relationship correlation;
and classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relation.
In one possible design, the determining module is specifically configured to, when adjusting the feature value in the second feature vector based on the second feature vector and the calculated correlation of each entity relationship:
for the i-th dimension feature value in the second feature vector, where i is a positive integer, calculating the product of the i-th dimension feature value and each entity relationship correlation, and taking the sum of the calculated products as the i-th dimension feature value of the adjusted second feature vector.
In one possible design, the determining module is specifically configured to, when the first feature vector corresponding to each word is input into a first classification model obtained by training in advance to perform feature extraction, obtain a second feature vector corresponding to an unstructured text:
for the j-th word in the unstructured text, j is a positive integer, executing the following operations:
transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector;
calculating text relativity between a first transformation feature vector corresponding to the j-th word and a word vector of each word in the unstructured text;
based on the first transformation feature vector corresponding to the jth word and the calculated text correlation corresponding to each word, adjusting the feature value in the first transformation feature vector corresponding to the jth word;
and splicing the adjusted first transformation feature vector corresponding to each word in the unstructured text to obtain a second feature vector corresponding to the unstructured text.
In one possible design, the determining module is specifically configured to, when adjusting the feature value in the first transformation feature vector corresponding to the jth word based on the first transformation feature vector corresponding to the jth word and the calculated text relevance corresponding to each word:
for the k-th dimension feature value in the first transformation feature vector corresponding to the j-th word, where k is a positive integer, calculating the product of the k-th dimension feature value and the text correlation corresponding to each word, and taking the sum of the calculated products as the k-th dimension feature value of the adjusted first transformation feature vector.
In one possible design, the determining module is specifically configured to, when transforming the first feature vector corresponding to the jth word to obtain a first transformed feature vector:
performing bidirectional feature extraction on the first feature vector corresponding to the j-th word to obtain a forward feature vector and a backward feature vector;
and constructing a first transformation feature vector corresponding to the j-th word based on the forward feature vector and the backward feature vector.
In one possible design, the identifying module is specifically configured to, when identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector, and a second classification model obtained by training in advance:
and splicing the feature vector of the entity text with the probability distribution feature vector, inputting the feature vector into the second classification model obtained by pre-training, and identifying the entity relation in the target corpus.
In one possible design, the extracting module is specifically configured to, when extracting a feature vector of an entity text included in the target corpus:
extracting word vectors of each word of the entity text, and determining feature vectors of the entity text according to the extracted word vectors of each word.
In one possible design, the extracting module is specifically configured to, when determining the feature vector of the entity text according to the extracted word vector of each word:
for the l-th dimension feature value in the word vector of each word, where l is a positive integer, adding the l-th dimension feature values of the word vectors of all the words, and taking the resulting sum as the l-th dimension feature value of the feature vector of the entity text.
In one possible design, the apparatus further comprises:
the system comprises a sample acquisition module, a data acquisition module and a data processing module, wherein the sample acquisition module is used for acquiring a sample corpus set, and the sample corpus set comprises at least one sample corpus marked with entity relations;
the sample extraction module is used for extracting sample feature vectors of the entity text and sample feature vectors of unstructured text included in each sample corpus;
the first model training module is used for training the first classification model based on the sample feature vectors of the unstructured text corresponding to each sample corpus and the entity relation of each sample corpus label;
The sample determining module is used for determining probability distribution feature vectors of each sample corpus based on the sample feature vectors of unstructured texts corresponding to each sample corpus and the first classification model after the first classification model is determined to be trained;
and the second model training module is used for training the second classification model based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relation marked by each sample corpus until the second classification model training is determined to be completed.
In a third aspect, embodiments of the present application further provide an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the method of identifying an entity relationship of the first aspect, or any of the possible implementations of the first aspect.
In a fourth aspect, the embodiments of the present application further provide a computer readable storage medium, on which a computer program is stored, which when executed by a processor performs the steps of the method for identifying entity relationships described in the first aspect, or any one of the possible implementation manners of the first aspect.
According to the entity relationship identification method and device, the feature vector of the unstructured text in the target corpus is extracted, the prediction probability that the unstructured text belongs to each entity relationship is predicted based on the first classification model, and a probability distribution feature vector is then constructed from these prediction probabilities; the feature vector of the entity text in the target corpus is combined with the probability distribution feature vector, and the entity relationship in the target corpus is predicted based on the second classification model. By using the prediction probabilities of the entity relationships predicted by the first classification model as part of the input of the second classification model, and additionally inputting the feature vector of the entity text associated with the unstructured text, the predicted entity relationship of the target corpus is more accurate.
In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for identifying entity relationships according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of entity relationship prediction based on a first classification model according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating an applicable method for obtaining a second feature vector of unstructured text through a first classification model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a training process for a first classification model and a second classification model according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of training a first classification model according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of training a second classification model according to an embodiment of the present application;
fig. 7 is a schematic architecture diagram of an entity relationship recognition device 700 according to an embodiment of the present application;
fig. 8 shows a schematic structural diagram of an electronic device 800 according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
In general, data in a knowledge graph is stored in the form of triples. For example, "Zhang San" and "Beijing University" are entities, and the relationship between them expressed by "graduated from" is the university of graduation; stored as a triple, this can be expressed as (Zhang San, graduated from, Beijing University). In practical applications, the entities "Zhang San" and "Beijing University" in the target corpus can be identified by semantic segmentation, but predicting the entity relationship expressed by "graduated from" is more difficult, so the prior art struggles to accurately predict the entity relationship expressed by unstructured text.
Aiming at the problems existing in the prior art, the application provides a method and a device for identifying entity relations, which can be applied to the aspects of automatically constructing a knowledge graph and the like, and the relation among entities in a target corpus is determined by identifying the entity relation corresponding to unstructured text in the target corpus. Specifically, in the embodiment of the application, the two classification models are combined to respectively predict the entity relationship of the unstructured text in the target corpus to be identified, so that the prediction result of the entity relationship of the unstructured text is combined with the characteristic information of the entity text, and the combined target corpus is further predicted for the entity relationship, so that the entity relationship of the predicted target corpus is more accurate. In addition, in the specific prediction process, the relevance between each word in the unstructured text and the relevance between the unstructured text and each entity relation are also considered, so that the finally predicted entity relation can be more accurate.
The technical scheme provided by the application is described in detail below with reference to specific embodiments.
Example 1
An embodiment of the present application provides a method for identifying an entity relationship, as shown in fig. 1, which is a flow chart of the method for identifying an entity relationship provided in the embodiment of the present application, and includes the following steps:
and 101, extracting feature vectors of entity texts and feature vectors of unstructured texts included in the target corpus.
An entity text is a description of an objectively existing and mutually distinguishable thing, and unstructured text describes entities and the relationships between them. For example, if the target corpus is "Zhang San graduated from Beijing University", then "Zhang San" and "Beijing University" are the entity texts in the target corpus, and "graduated from" is the unstructured text of the target corpus.
In an example, extraction of entity text and unstructured text from the target corpus may be achieved by semantic segmentation; for example, the target corpus may be input into a pre-trained semantic segmentation model, which outputs the extracted entity text and unstructured text. The specific semantic segmentation method is not described here.
In a specific implementation, when extracting the feature vector of the entity text included in the target corpus, the word vector of each word in the entity text may be first extracted, and then the feature vector of the entity text may be determined according to the extracted word vector of each word.
Illustratively, the word vector of each word in the entity text may be generated by the word2vec toolkit. The word2vec toolkit is trained in advance on a corpus training set to produce a feature extraction model that generates, for each word, a feature vector of a set dimension describing that word's feature information. The feature vectors of different words have the same dimension, but their feature values are not identical: the difference in feature values between synonyms is small, while the difference between non-synonyms is large.
When determining the feature vector of the entity text from the extracted word vectors, the l-th dimension feature values of the word vectors of all the words can be added, and the resulting sum used as the l-th dimension feature value of the feature vector of the entity text, where l is a positive integer ranging over each dimension of the word vector.
In one example, if the word vector of each word is a 3-dimensional vector, the entity text ABCD contains the four words A, B, C, and D, the word vector of A is {A1, A2, A3}, the word vector of B is {B1, B2, B3}, the word vector of C is {C1, C2, C3}, and the word vector of D is {D1, D2, D3}, then the feature vector of the entity text ABCD determined from the four word vectors is {A1+B1+C1+D1, A2+B2+C2+D2, A3+B3+C3+D3}.
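The ABCD example amounts to an element-wise sum of the word vectors; a short sketch (the concrete numbers are illustrative):

```python
def entity_text_feature_vector(word_vectors):
    """Sum the word vectors dimension by dimension; the l-th dimension of
    the result is the sum of the l-th dimensions of all word vectors."""
    dims = len(word_vectors[0])
    return [sum(wv[l] for wv in word_vectors) for l in range(dims)]

# four 3-dimensional word vectors for the words A, B, C, D
fv = entity_text_feature_vector([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(fv)  # [22, 26, 30]
```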
In the embodiment of the application, the feature vector of the unstructured text included in the target corpus can be extracted. In one possible implementation, a word vector for each word in the unstructured text may be extracted, and a positional word vector for each word in the unstructured text may be extracted that is used to represent the positional relationship of each word in the target corpus.
Illustratively, when extracting the word vector for each word in the unstructured text, the word vector for each word in the unstructured text may also be generated by a word2vec toolkit.
By way of example, there may be several cases in extracting a positional word vector for representing the positional relationship of each word in the target corpus in unstructured text:
taking the extraction of the position word vector of the i-th word in the unstructured text as an example, if i is a positive integer, the value of the position word vector of the i-th word may be:
Case one: the i-th word is to the left of the entity text; the feature value of the position word vector of the i-th word is 0.
Case two: the i-th word is to the right of the entity text, and its distance from the rightmost word of the entity text does not exceed the preset distance; the feature value of the position word vector of the i-th word is the position of the i-th word in the target corpus + the preset distance + 1.
Case three: the i-th word is to the right of the entity text, and its distance from the rightmost word of the entity text exceeds the preset distance; the feature value of the position word vector of the i-th word is 2 × (the preset distance + 1).
Combining the three cases, the value of the eigenvalue of the position word vector of the ith character can be calculated by the following formula:
P = 0,            if the i-th word is to the left of the entity text;
P = m + n + 1,    if the i-th word is to the right of the entity text and m − q ≤ n;
P = 2 × (n + 1),  if the i-th word is to the right of the entity text and m − q > n.
wherein P is the value of the characteristic value of the position word vector of the ith word, m is the position of the ith word in the target corpus, n is a preset distance, and q is the position of the rightmost word in the entity text in the target corpus, wherein P, m, n and q are all positive integers.
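This piecewise rule can be sketched in Python as follows. The mapping of the three cases onto (m, n, q) is an interpretation of the cases above; in particular, "to the left of the entity text" is approximated here as m < q, so treat this as an illustrative assumption rather than the patent's exact definition:

```python
def position_value(m, n, q):
    """Position feature value P for a word at position m (1-based) in the
    target corpus, given preset distance n and the position q of the
    rightmost word of the entity text.  Assumption: m < q stands in for
    'word lies to the left of the entity text'."""
    if m < q:               # case one: left of the entity text
        return 0
    if m - q <= n:          # case two: right side, within the preset distance
        return m + n + 1
    return 2 * (n + 1)      # case three: right side, beyond the preset distance

# Worked values with preset distance n = 2 and entity rightmost word at q = 2.
print(position_value(4, 2, 2))   # 7  (within preset distance: 4 + 2 + 1)
print(position_value(10, 2, 2))  # 6  (beyond preset distance: 2 * (2 + 1))
```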
In some embodiments of the present application, the target corpus may include two or more entity texts. For example, in the target corpus "Zhang San graduated from Beijing University", "graduated from" is an unstructured text, "Zhang San" is an entity text, and "Beijing University" is also an entity text. When calculating the position word vector of each word of the unstructured text, the position word vector of each word of the unstructured text relative to the entity text "Beijing University" may be calculated based only on the entity text "Beijing University"; alternatively, the position word vector of each word of the unstructured text relative to the entity text "Zhang San" may be calculated based only on the entity text "Zhang San"; or, taking each of the two entity texts as the reference in turn, the position word vectors of each word of the unstructured text relative to the entity texts "Zhang San" and "Beijing University" may be calculated and then combined to construct the position word vector of the unstructured text.
For example, if the unstructured text between the entity text A and the entity text B is C, the position word vector of the word P in the unstructured text C obtained with the entity text A as the reference is {A1}, and the position word vector of P obtained with the entity text B as the reference is {B1}, then the position word vector of the word P in the unstructured text C can be determined to be {A1, B1}.
For another example, when calculating the position word vectors of the unstructured text "graduated from" in the target corpus "Zhang San graduated from Beijing University", with the preset distance being 2: taking "Zhang San" as the reference, the position of the word "业" in the target corpus is 4, the rightmost word "三" of "Zhang San" is at position 2 in the target corpus, and the position word vector of the "业" word is {0}; taking "Beijing University" as the reference, the leftmost word "北" of "Beijing University" is at position 6 in the target corpus, and the position word vector of the "业" word is {4+2+1} = {7}; if the positions relative to both entity texts are considered, the position word vector of the "业" word is {0, 7}.
Step 102, determining the prediction probability of the unstructured text belonging to each entity relationship in a plurality of entity relationships based on the feature vector of the unstructured text and a first classification model obtained through pre-training.
The first classification model may be a deep neural network model, for example, a convolutional neural network (Convolutional Neural Networks, CNN) model, a recurrent neural network (Recurrent Neural Network, RNN) model, a gated recurrent unit (Gated Recurrent Unit, GRU), and the like.
In a specific implementation, after the feature vector of the unstructured text is obtained, for each word of the unstructured text, the word vector of the word and the corresponding position word vector can be spliced to obtain a first feature vector corresponding to the word. The first feature vectors are then input into the first classification model to obtain the feature vector corresponding to the unstructured text, and the feature values in that feature vector are adjusted based on a multi-level Attention mechanism, so that the feature values reflect the degree of correlation between the unstructured text and each entity relationship. The specific steps for obtaining the prediction probabilities based on the first classification model are described in the second embodiment.
And 103, constructing probability distribution feature vectors according to the prediction probabilities respectively corresponding to the entity relationships.
In an example, if there are N entity relationships, N prediction probabilities X1, X2, X3, …, XN of the unstructured text may be obtained through the first classification model, wherein X1 represents the predicted probability that the unstructured text belongs to the first entity relationship, X2 represents the predicted probability that the unstructured text belongs to the second entity relationship, and XN represents the predicted probability that the unstructured text belongs to the N-th entity relationship. These probabilities form the probability distribution feature vector X = {X1, X2, X3, …, XN}.
And 104, identifying entity relations in the target corpus based on the feature vectors of the entity texts, the probability distribution feature vectors and the pre-trained second classification model.
Wherein the second classification model may be a decision tree model, such as XGBoost.
In a possible implementation manner, in order to highlight the association between the entity text and the unstructured text and improve the accuracy of predicting the entity relationship, when identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector and the second classification model obtained by pre-training, the feature vector of the entity text and the probability distribution feature vector can be spliced first; the spliced feature vector is then input into the second classification model obtained by pre-training, which outputs the identified entity relationship in the target corpus.
For example, if the feature vector of each of the two entity texts is an S1-dimensional vector and the probability distribution feature vector is an S2-dimensional vector, the spliced feature vector is a (2×S1+S2)-dimensional vector.
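The splicing step above is plain concatenation; a minimal sketch with illustrative toy vectors:

```python
def spliced_feature(entity_vec_1, entity_vec_2, prob_dist_vec):
    """Concatenate the two entity-text feature vectors with the probability
    distribution feature vector, giving a (2*S1 + S2)-dimensional vector."""
    return entity_vec_1 + entity_vec_2 + prob_dist_vec

e1 = [0.1, 0.2, 0.3]   # first entity text, S1 = 3
e2 = [0.4, 0.5, 0.6]   # second entity text, S1 = 3
p  = [0.7, 0.2, 0.1]   # probability distribution over 3 relations, S2 = 3

spliced = spliced_feature(e1, e2, p)
print(len(spliced))    # 9, i.e. 2*3 + 3
```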
The training method of the first classification model and the second classification model will be described in the third embodiment, and will not be described in detail here.
In the foregoing embodiments of the present application, by extracting feature vectors of unstructured text in a target corpus, prediction probabilities of the unstructured text belonging to each entity relationship may be predicted based on the first classification model, and these prediction probabilities may form probability distribution feature vectors. Furthermore, the feature vector of the entity text in the target corpus can be combined with the probability distribution feature vector, and the entity relationship in the target corpus can be predicted based on the second classification model. By using the prediction probability of each entity relationship predicted by the first classification model as part of the input of the second classification model, and additionally inputting the feature vector of the entity text associated with the unstructured text, the entity relationship predicted for the target corpus is more accurate.
Example two
In the embodiment of the application, when the prediction probability of each entity relationship corresponding to the unstructured text is predicted through the first classification model, a multi-level Attention mechanism is introduced. The Attention mechanism can be used to adjust the feature values in the feature vector of the unstructured text, and the adjusted feature values can reflect the degree of correlation with each entity relationship, thereby improving the accuracy of the prediction result for each entity relationship corresponding to the unstructured text.
Next, referring to fig. 2, a detailed description will be given of an entity relationship prediction process based on the first classification model, including the following steps:
step 201, for each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word.
In an example, if the word vector of each word is an N-dimensional vector and the corresponding position word vector is an M-dimensional vector, the first feature vector obtained by splicing the word vector of each word and the corresponding position word vector is an (M+N)-dimensional vector, where M and N are positive integers.
Step 202, inputting a first feature vector corresponding to each word into a first classification model obtained through training in advance for feature extraction, and obtaining a second feature vector corresponding to the unstructured text.
In an example, if the unstructured text is "graduated from" (毕业于), then the first feature vectors corresponding to each of its three words "毕", "业" and "于" are input into the first classification model trained in advance to perform feature extraction, and the second feature vector corresponding to the unstructured text "graduated from" can be obtained.
Specifically, referring to fig. 3, a flowchart of a method for obtaining a second feature vector of unstructured text through a first classification model according to an embodiment of the present application is provided, which specifically includes the following steps:
Step 301, for the jth word in the unstructured text, transforming the first feature vector corresponding to the jth word to obtain a first transformed feature vector.
Where j is a positive integer and ranges over each word in the unstructured text. In order to extract the feature information of the j-th word deeply, deep feature extraction may be performed on the j-th word. For example, bidirectional feature extraction may be performed on the first feature vector corresponding to the j-th word, e.g., a forward feature vector and a backward feature vector corresponding to the j-th word may be extracted using a bidirectional RNN model. Then, a first transformation feature vector corresponding to the j-th word is constructed based on the forward feature vector and the backward feature vector. For example, the forward feature vector and the backward feature vector may be spliced to obtain the first transformation feature vector corresponding to the j-th word, or each dimension of the feature values in the forward feature vector and the backward feature vector may be weighted and summed to obtain the first transformation feature vector corresponding to the j-th word.
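The bidirectional extraction described above can be sketched with a toy one-dimensional tanh RNN; the weights and inputs are illustrative assumptions, not the patent's actual model:

```python
import math

def rnn_pass(inputs, w_in=0.5, w_rec=0.3):
    """Toy 1-dimensional tanh RNN: returns the hidden state at each step."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_features(inputs):
    """Per-position splice of a forward pass and a backward pass."""
    fwd = rnn_pass(inputs)               # left-to-right pass
    bwd = rnn_pass(inputs[::-1])[::-1]   # right-to-left pass, re-aligned
    # Splicing variant; the weighted-sum variant would combine f and b instead.
    return [[f, b] for f, b in zip(fwd, bwd)]

features = bidirectional_features([1.0, 2.0, 3.0])
print(len(features), len(features[0]))  # 3 positions, 2 features each
```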
Step 302, calculating a text relevance between the first transformation feature vector corresponding to the j-th word and the word vector of each word in the unstructured text.
The first transformation feature vector of the jth word represents deep features of the jth word, but for the whole unstructured text, the first transformation feature vector of the jth word cannot highlight association of the jth word with other words in the unstructured text, so that in the embodiment of the application, the text relevance between the first transformation feature vector corresponding to the jth word and the word vector of each word in the unstructured text can be calculated, and then the feature value of the first transformation feature vector of the jth word is adjusted according to the calculated text relevance, so that the feature value of the adjusted first transformation feature vector of the jth word can reflect not only the feature information of the independent jth word, but also the feature information between the jth word and other words in the unstructured text.
Specifically, when calculating the text relevance between the first transformation feature vector corresponding to the jth word and the word vector of each word in the unstructured text, for example, the euclidean distance may be calculated, the cosine similarity may be calculated, or the relevance may be calculated by a hash algorithm, which is not further described in the embodiments of the present application.
Step 303, adjusting the feature value in the first transformation feature vector corresponding to the jth word based on the first transformation feature vector corresponding to the jth word and the calculated text relevance corresponding to each word.
In a specific implementation, when adjusting the feature value in the first transformed feature vector corresponding to the jth word, the following manner may be referred to:
mode one: and calculating the product between the k-th dimension characteristic value and the text relativity corresponding to each word aiming at the k-th dimension characteristic value in the first transformation characteristic vector corresponding to the j-th word, wherein k is a positive integer, and taking the sum of the calculated products as the k-th dimension characteristic value of the adjusted first transformation characteristic vector.
For example, if the first transformation feature vector corresponding to the j-th word is {5, 6, 7} and the text correlation degrees of the j-th word with the words of the unstructured text are {0.9, 0.8, 0.7}, then the feature value "5" is adjusted to 5×0.9+5×0.8+5×0.7 = 12, the feature value "6" is adjusted to 6×0.9+6×0.8+6×0.7 = 14.4, and the feature value "7" is adjusted to 7×0.9+7×0.8+7×0.7 = 16.8, so the adjusted first transformation feature vector corresponding to the j-th word is {12, 14.4, 16.8}.
Mode two: the average value of the text relevance may be calculated first, then the product between the kth dimension eigenvalue and the average value of the text relevance corresponding to each word may be calculated, and the calculated product may be used as the kth dimension eigenvalue of the adjusted first transformation eigenvector.
For example, if the first transformation feature vector corresponding to the j-th word is {5, 6, 7} and the text correlation degrees of the j-th word with the words of the unstructured text are {0.9, 0.8, 0.7}, then the feature value "5" is adjusted to 5×[(0.9+0.8+0.7)/3] = 4, the feature value "6" is adjusted to 6×[(0.9+0.8+0.7)/3] = 4.8, and the feature value "7" is adjusted to 7×[(0.9+0.8+0.7)/3] = 5.6, so the adjusted first transformation feature vector corresponding to the j-th word is {4, 4.8, 5.6}.
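The two adjustment modes can be sketched in Python; the vector and correlations reproduce the worked example, and the printed values match 12, 14.4, 16.8 and 4, 4.8, 5.6 up to floating-point rounding:

```python
def adjust_mode_one(vec, correlations):
    """Mode one: each feature value times the sum of the correlations."""
    s = sum(correlations)
    return [v * s for v in vec]

def adjust_mode_two(vec, correlations):
    """Mode two: each feature value times the average of the correlations."""
    avg = sum(correlations) / len(correlations)
    return [v * avg for v in vec]

vec, corr = [5, 6, 7], [0.9, 0.8, 0.7]
print(adjust_mode_one(vec, corr))  # ≈ [12, 14.4, 16.8]
print(adjust_mode_two(vec, corr))  # ≈ [4, 4.8, 5.6]
```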
In another possible implementation manner, the first transformation feature vector corresponding to the jth word is M-dimension, and the text relevance between the jth word and the kth word is N-dimension, and the adjusted first transformation feature vector corresponding to the jth word is m×n-dimension, where the feature value of each dimension of the first transformation feature vector corresponding to the jth word is multiplied by the corresponding text relevance, so that the feature value of each dimension of the adjusted first transformation feature vector can be obtained.
For example, if the first transformation feature vector corresponding to the j-th word is {5, 6, 7} and the text correlation degrees of the j-th word with the words of the unstructured text are {0.9, 0.8, 0.7}, the calculation may be performed as follows:

[5×0.9  5×0.8  5×0.7]   [4.5  4.0  3.5]
[6×0.9  6×0.8  6×0.7] = [5.4  4.8  4.2]
[7×0.9  7×0.8  7×0.7]   [6.3  5.6  4.9]

and the adjusted first transformation feature vector corresponding to the j-th word is {4.5, 4.0, 3.5, 5.4, 4.8, 4.2, 6.3, 5.6, 4.9}.
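The M×N-dimensional variant, in which each dimension of the feature vector is multiplied by each correlation, can be sketched as:

```python
def adjust_outer(vec, correlations):
    """Outer-product style adjustment: an M-dim vector and N correlations
    give an M*N-dim adjusted vector, row by row."""
    return [v * c for v in vec for c in correlations]

out = adjust_outer([5, 6, 7], [0.9, 0.8, 0.7])
print(out)  # ≈ [4.5, 4.0, 3.5, 5.4, 4.8, 4.2, 6.3, 5.6, 4.9]
```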
And 304, splicing the first transformation feature vectors corresponding to each word in the adjusted unstructured text to obtain a second feature vector corresponding to the unstructured text.
For example, the unstructured text may include A, B, C three words, the first transformation feature vectors of A, B, C three words respectively correspond to D, E, F dimensions, and after stitching, the second feature vector corresponding to the unstructured text is d+e+f dimensions.
Step 203, calculating the entity relationship correlation degree between the second feature vector and the feature vector of each entity relationship in the plurality of entity relationships.
In a specific implementation, a corpus set corresponding to each entity relationship may be preset, where the corpus set of each entity relationship includes a plurality of sample corpora with the entity relationship labels. For each entity relation, the feature vector of the unstructured text contained in each sample corpus in the corpus set of the entity relation can be determined, and then the feature value of the feature vector of the entity relation is determined based on the feature value of the feature vector of the unstructured text contained in each sample corpus, for example, the average value among the feature values of the feature vectors of the unstructured text of all sample corpora corresponding to the entity relation can be used as the feature value of the feature vector of the entity relation.
In calculating the correlation degree of the entity relationship between the second feature vector and the feature vector of the entity relationship, for example, the euclidean distance may be calculated, the cosine similarity may be calculated, the correlation degree may be calculated by a hash algorithm, and the like, which will not be described herein.
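The two building blocks above, the relation feature vector as the mean of its samples' unstructured-text feature vectors, and cosine similarity as one of the listed correlation measures, can be sketched as follows (toy 2-dimensional vectors, purely illustrative):

```python
import math

def relation_feature_vector(sample_vectors):
    """Mean, per dimension, of the unstructured-text feature vectors of the
    sample corpora labeled with one entity relationship."""
    dims = len(sample_vectors[0])
    n = len(sample_vectors)
    return [sum(v[d] for v in sample_vectors) / n for d in range(dims)]

def cosine_similarity(a, b):
    """One option for the entity-relationship correlation degree."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

rel = relation_feature_vector([[1.0, 0.0], [3.0, 0.0]])
print(rel)                                  # [2.0, 0.0]
print(cosine_similarity([2.0, 0.0], rel))   # 1.0 (identical direction)
```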
Step 204, adjusting the feature value in the second feature vector based on the second feature vector and the calculated correlation degree of each entity relationship.
In some embodiments of the present application, for the ith dimension feature value in the second feature vector, i is a positive integer, a product between the ith dimension feature value and each entity relationship correlation degree may be calculated, and a sum of the calculated products is used as the ith dimension feature value of the adjusted second feature vector.
Or, for the ith dimension feature value in the second feature vector, i is a positive integer, the average value of all entity relationship relativity can be calculated, then the product between the ith dimension feature value and the average value of the entity relationship relativity corresponding to each word is calculated, and the calculated product is used as the ith dimension feature value of the adjusted first transformation feature vector.
Or the second feature vector of the unstructured text is M-dimensional, the entity relation correlation corresponding to the S-th type of entity relation is N-dimensional, and the adjusted second feature vector of the unstructured text is M x N-dimensional, wherein each dimension of the second feature vector corresponding to the entity relation is multiplied by the corresponding entity relation correlation, and the feature value of the adjusted second feature vector can be obtained.
Step 205, classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relationship.
In a specific implementation, the adjusted second feature vector may be input into a pre-trained classifier, so as to obtain a prediction probability that the unstructured text belongs to each entity relationship, where, for example, the classifier is a softmax model or the like.
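A minimal sketch of the softmax classifier named above; the class scores are illustrative, not outputs of the patent's model:

```python
import math

def softmax(scores):
    """Map per-relation class scores to prediction probabilities."""
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])           # scores for 3 entity relationships
print(probs)                               # probabilities summing to 1
```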
In the above embodiment of the present application, a second feature vector of an unstructured text of a target corpus is obtained through a first classification model, then an entity relationship correlation degree between the second feature vector and a feature vector of each entity relationship in a plurality of entity relationships is calculated, then a feature value in the second feature vector is adjusted according to the second feature vector and the calculated entity relationship correlation degree, and finally a prediction probability that the unstructured text belongs to each entity relationship is obtained based on the adjusted second feature vector. The feature value of the second feature vector is adjusted by using the calculated entity relationship correlation degree, so that the adjusted second feature vector can highlight the association degree between the unstructured text and each entity relationship, and the prediction probability of the unstructured text belonging to each entity relationship is predicted to be more accurate.
Example III
In a third embodiment of the present application, a training process of the first classification model and the second classification model will be described in detail, as shown in fig. 4 below, including the following steps:
step 401, obtaining a sample corpus set.
Specifically, the sample corpus set may include at least one sample corpus labeled with an entity relationship.
Step 402, extracting a sample feature vector of the entity text included in each sample corpus, and a sample feature vector of the unstructured text.
In the step, extracting the sample feature vector of the entity text included in each sample corpus comprises extracting the word vector of each word of the entity text included in each sample corpus, and determining the sample feature vector of the entity text included in each sample corpus according to the extracted word vector of each word; extracting the sample feature vector of the unstructured text included in each sample corpus includes extracting a word vector of each word in the unstructured text included in each sample corpus, and extracting a position word vector in the unstructured text included in each sample corpus for representing a position relationship of each word in the target corpus.
Specifically, the method for extracting the sample feature vector of the entity text and the sample feature vector of the unstructured text included in the sample corpus is similar to the method for extracting the feature vector of the entity text and the feature vector of the unstructured text included in the target corpus in the first embodiment, and will not be described in detail herein.
Step 403, training the first classification model based on the sample feature vector of the unstructured text corresponding to each sample corpus and the entity relationship of each sample corpus label.
Wherein, as shown in fig. 5, the training of the first classification model includes the following steps:
step 4031, selecting a preset number of sample corpora from the sample corpus set, and splicing word vectors and position word vectors of each word of the unstructured text of each selected sample corpus to obtain a first feature vector corresponding to each word.
Step 4032, inputting the first feature vector of each word of each sample corpus to the first classification model to obtain a second feature vector of the unstructured text of each sample corpus.
Step 4033, determining prediction results corresponding to a preset number of sample corpora respectively according to the second feature vector of each unstructured text, wherein the prediction result of each sample corpus represents the prediction probability that the unstructured text of the sample corpus belongs to each entity relationship.
Specifically, the entity relationship correlation degree between the second feature vector of each unstructured text and the feature vector of each entity relationship can be calculated, and then the second feature vector of each unstructured text is adjusted according to the entity relationship correlation degree; based on the adjusted second feature vector, the prediction probability that the unstructured text belongs to each entity relationship is determined, and the specific implementation method is similar to that provided in the second embodiment and will not be described herein.
Step 4034, comparing the predicted result of each sample corpus with the entity relationship marked by the sample corpus, and determining the first accuracy of the current round of prediction process.
Step 4035, judging whether the predicted first accuracy is not less than a first preset accuracy.
If yes, go to step 4036;
if the determination result is no, step 4037 is executed.
Step 4036, determining that the training of the first classification model is completed.
Step 4037, adjusting model parameters of the first classification model, and then returning to execute step 4031 to continue training the first classification model until it is determined that the first accuracy of the entity relationship predicted by the first classification model is greater than or equal to the first preset accuracy.
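The accuracy-threshold loop of steps 4031 through 4037 can be sketched in a framework-agnostic way. `MajorityModel` and `train_until_accurate` are hypothetical stand-ins invented for this sketch; the patent does not fix a concrete model interface:

```python
def train_until_accurate(model, samples, target_accuracy, max_rounds=100):
    """Repeat predict/compare/adjust rounds until the round's accuracy
    reaches the preset threshold (steps 4034-4037)."""
    for _ in range(max_rounds):
        correct = 0
        for features, labeled_relation in samples:
            if model.predict(features) == labeled_relation:
                correct += 1
            else:
                model.update(features, labeled_relation)  # adjust parameters
        if correct / len(samples) >= target_accuracy:     # step 4035's check
            return True                                   # training complete
    return False

class MajorityModel:
    """Toy stand-in model: always predicts the most recently seen label."""
    def __init__(self):
        self.guess = None
    def predict(self, features):
        return self.guess
    def update(self, features, label):
        self.guess = label

samples = [("features_1", "works_at"), ("features_2", "works_at")]
print(train_until_accurate(MajorityModel(), samples, 0.9))  # True
```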
Following step 403, the method further includes:
step 404, after determining that the training of the first classification model is completed, determining a probability distribution feature vector of each sample corpus based on the sample feature vector of the unstructured text corresponding to each sample corpus and the first classification model.
And step 405, training the second classification model based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relation marked by each sample corpus until the second classification model training is determined to be completed.
Specifically, the training process of the second classification model is shown in fig. 6, and includes the following steps:
step 4051, selecting a preset number of sample corpora from the sample corpus set, and splicing the feature vector of the entity text of each selected sample corpus with the probability distribution feature vector to obtain a spliced feature vector.
Step 4052, inputting the spliced feature vectors corresponding to the sample corpus with the preset number into a second classification model, and predicting the entity relationship of the matching of each sample corpus.
Step 4053, determining the second accuracy of the current round of prediction process by comparing the predicted entity relationship matched with each sample corpus with the entity relationship marked with each sample corpus.
Step 4054, judging whether the predicted second accuracy is not less than the second preset accuracy.
When the determination result is yes, step 4055 is executed;
when the determination result is no, step 4056 is executed.
Step 4055, determining that the second classification model training is completed.
Step 4056, adjusting the model parameters of the second classification model, and returning to step 4051, and continuing to train the second classification model until it is determined that the second accuracy predicted by the second classification model is greater than or equal to the second preset accuracy.
By adopting the embodiment, the first classification model is trained by extracting the sample feature vector of each entity text with entity relation labels and the sample feature vector of the unstructured text contained in the sample corpus, then after the training of the first classification model is completed, the probability distribution feature vector of each sample corpus is determined based on the sample feature vector of the unstructured text corresponding to each sample corpus and the trained first classification model, and finally, the second classification model is trained based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relation of each sample corpus label. The output of the trained first classification model is used as the input when the second classification model is trained, and the accuracy rate of the entity relation prediction is higher for the first classification model and the second classification model obtained through training in the mode.
Example IV
Referring to fig. 7, an architecture diagram of an entity relationship recognition apparatus 700 provided in an embodiment of the present application includes an extraction module 701, a determination module 702, and a recognition module 703, specifically:
the extracting module 701 is configured to extract feature vectors of the entity text and feature vectors of unstructured text included in the target corpus;
The determining module 702 is configured to determine, based on the feature vector of the unstructured text and a first classification model obtained by training in advance, a prediction probability of each entity relationship of the unstructured text belonging to multiple entity relationships, and configure the prediction probabilities corresponding to the multiple entity relationships respectively into probability distribution feature vectors;
the identifying module 703 is configured to identify the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector, and a second classification model obtained by training in advance.
In one possible design, the extracting module 701 is specifically configured to, when extracting a feature vector of an unstructured text included in the target corpus:
extracting word vectors of each word in the unstructured text, and extracting position word vectors used for representing the position relation of each word in the target corpus in the unstructured text.
In one possible design, the determining module 702 is specifically configured to, when determining, based on the feature vector of the unstructured text and the first classification model trained in advance, a prediction probability that the unstructured text belongs to each of a plurality of entity relationships:
For each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word;
inputting the first feature vector corresponding to each word into a first classification model obtained through pre-training to extract features, and obtaining a second feature vector corresponding to the unstructured text;
calculating the entity relation correlation degree between the second feature vector and the feature vector of each entity relation in a plurality of entity relations;
adjusting a feature value in the second feature vector based on the second feature vector and each calculated entity relationship correlation;
and classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relation.
In one possible design, the determining module 702 is specifically configured to, when adjusting the feature value in the second feature vector based on the second feature vector and the calculated correlation of each entity relationship:
and aiming at the ith dimension characteristic value in the second characteristic vector, i is a positive integer, calculating the product between the ith dimension characteristic value and each entity relation correlation degree, and taking the sum of the calculated products as the ith dimension characteristic value of the adjusted second characteristic vector.
In one possible design, the determining module 702 is specifically configured to, when the first feature vector corresponding to each word is input into the first classification model obtained by training in advance to perform feature extraction, obtain a second feature vector corresponding to the unstructured text:
for the j-th word in the unstructured text, j is a positive integer, executing the following operations:
transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector;
calculating a text relevance between the first transformed feature vector corresponding to the j-th word and the word vector of each word in the unstructured text;
adjusting feature values in the first transformed feature vector corresponding to the j-th word based on that first transformed feature vector and the calculated text relevance corresponding to each word;
and splicing the adjusted first transformed feature vectors corresponding to all words in the unstructured text to obtain the second feature vector corresponding to the unstructured text.
In one possible design, the determining module 702 is specifically configured to, when adjusting the feature value in the first transformed feature vector corresponding to the jth word based on the first transformed feature vector corresponding to the jth word and the calculated text relevance corresponding to each word:
And for the k-th dimension feature value in the first transformed feature vector corresponding to the j-th word, where k is a positive integer, calculating the product of the k-th dimension feature value and the text relevance corresponding to each word, and taking the sum of the calculated products as the k-th dimension feature value of the adjusted first transformed feature vector.
In one possible design, the determining module 702 is specifically configured to, when transforming the first feature vector corresponding to the jth word to obtain a first transformed feature vector:
performing bidirectional feature extraction on the first feature vector corresponding to the j-th word to obtain a forward feature vector and a backward feature vector;
and constructing a first transformation feature vector corresponding to the j-th word based on the forward feature vector and the backward feature vector.
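The per-word pipeline of this design can be sketched as follows. The forward and backward extractors are hypothetical stand-ins (random linear maps) for whatever bidirectional encoder, e.g. a BiLSTM, is actually trained, and the dot product again stands in for the unspecified relevance measure:

```python
import numpy as np

def encode_unstructured_text(first_vecs, word_vecs, seed=0):
    # first_vecs: (n, d) spliced word-vector + position-vector inputs
    # word_vecs:  (n, d) plain word vectors of the unstructured text
    n, d = first_vecs.shape
    rng = np.random.default_rng(seed)
    # Hypothetical forward/backward extractors; each maps d -> d // 2 so
    # that their concatenation is again d-dimensional (d assumed even).
    Wf = rng.standard_normal((d // 2, d))
    Wb = rng.standard_normal((d // 2, d))
    parts = []
    for j in range(n):
        # First transformed feature vector for the j-th word.
        t = np.concatenate([Wf @ first_vecs[j], Wb @ first_vecs[j]])
        # Text relevance between the transformed vector and every word vector.
        rel = word_vecs @ t
        # Literal reading of the adjustment: every dimension k is multiplied
        # by each word's relevance and the products are summed, i.e. the
        # vector is scaled by the total relevance.
        parts.append(t * rel.sum())
    # Splice the adjusted per-word vectors into the text-level second vector.
    return np.concatenate(parts)
```

The output has length n × d, one adjusted transformed vector per word, matching the "splicing" step of the design.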
In one possible design, the identifying module 703 is specifically configured to, when identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector, and a second classification model obtained by training in advance:
and splicing the feature vector of the entity text with the probability distribution feature vector, inputting the spliced feature vector into the second classification model obtained by pre-training, and identifying the entity relationship in the target corpus.
In one possible design, the extracting module 701 is specifically configured to, when extracting a feature vector of an entity text included in the target corpus:
extracting word vectors of each word of the entity text, and determining feature vectors of the entity text according to the extracted word vectors of each word.
In one possible design, the extracting module 701 is specifically configured to, when determining the feature vector of the entity text according to the extracted word vector of each word:
and for the l-th dimension feature value in the word vector of each word, where l is a positive integer, adding the l-th dimension feature values of the word vectors of all words, and taking the resulting sum as the l-th dimension feature value in the feature vector of the entity text.
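In other words, the feature vector of the entity text is the element-wise sum of its word vectors; a one-line sketch:

```python
import numpy as np

def entity_text_vector(word_vecs):
    # The l-th dimension of the entity vector is the sum of the l-th
    # dimension values across all word vectors.
    return np.sum(np.asarray(word_vecs, dtype=float), axis=0)
```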
In one possible design, the apparatus further comprises:
a sample acquisition module 704, configured to acquire a sample corpus set, where the sample corpus set includes at least one sample corpus labeled with an entity relationship;
a sample extraction module 705, configured to extract a sample feature vector of the entity text and a sample feature vector of the unstructured text included in each sample corpus;
a first model training module 706, configured to train the first classification model based on a sample feature vector of the unstructured text corresponding to each sample corpus and an entity relationship of each sample corpus label;
The sample determining module 707 is configured to determine, after determining that the training of the first classification model is completed, a probability distribution feature vector of each sample corpus based on a sample feature vector of an unstructured text corresponding to each sample corpus and the first classification model;
and a second model training module 708, configured to train the second classification model based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relationship marked by each sample corpus, until it is determined that the training of the second classification model is completed.
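The two-stage training flow above — first model on unstructured-text features, then the second model on the spliced entity vectors and probability-distribution vectors — can be sketched with a tiny softmax classifier standing in for both models (the patent does not fix their architectures beyond the layers described earlier):

```python
import numpy as np

def _softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class TinySoftmaxClassifier:
    """Hypothetical stand-in for either classification model."""

    def __init__(self, n_classes, lr=0.5, steps=300):
        self.n_classes, self.lr, self.steps = n_classes, lr, steps

    def fit(self, X, y):
        n, d = X.shape
        Y = np.eye(self.n_classes)[y]              # one-hot relation labels
        self.W = np.zeros((d, self.n_classes))
        for _ in range(self.steps):                # plain gradient descent
            P = _softmax(X @ self.W)
            self.W -= self.lr * X.T @ (P - Y) / n
        return self

    def predict_proba(self, X):
        return _softmax(X @ self.W)

def train_two_stage(unstructured_feats, entity_feats, labels, n_relations):
    # Stage 1: train the first classification model on the sample feature
    # vectors of the unstructured text against the labeled relations.
    m1 = TinySoftmaxClassifier(n_relations).fit(unstructured_feats, labels)
    # Stage 2: probability-distribution feature vectors from the trained
    # first model, spliced with the entity-text feature vectors, train the
    # second classification model.
    X2 = np.hstack([entity_feats, m1.predict_proba(unstructured_feats)])
    m2 = TinySoftmaxClassifier(n_relations).fit(X2, labels)
    return m1, m2
```

The sketch trains to convergence by a fixed step count; the embodiments leave the actual completion criterion ("until it is determined that training is completed") to the implementation.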
According to the entity relationship identification apparatus provided by the embodiments of the present application, the feature vector of the unstructured text in the target corpus is extracted, the prediction probability that the unstructured text belongs to each entity relationship is determined based on the first classification model, and a probability distribution feature vector is then constructed from those prediction probabilities; the feature vector of the entity text in the target corpus is combined with the probability distribution feature vector, and the entity relationship in the target corpus is predicted based on the second classification model. Because the prediction probabilities produced by the first classification model serve as part of the input of the second classification model, and the feature vector of the entity text associated with the unstructured text is additionally input into the second classification model, the entity relationship predicted for the target corpus is more accurate.
Example five
Based on the same technical concept, an embodiment of the present application further provides an electronic device. Referring to fig. 8, a schematic structural diagram of an electronic device 800 according to an embodiment of the present application includes a processor 801, a memory 802, and a bus 803. The memory 802 is configured to store execution instructions and includes an internal memory 8021 and an external memory 8022. The internal memory 8021 temporarily stores operation data of the processor 801 and data exchanged with the external memory 8022, such as a hard disk; the processor 801 exchanges data with the external memory 8022 through the internal memory 8021. When the electronic device 800 runs, the processor 801 and the memory 802 communicate through the bus 803, so that the processor 801 executes the following instructions:
extracting feature vectors of entity texts and feature vectors of unstructured texts contained in the target corpus;
based on the feature vector of the unstructured text and a first classification model obtained by pre-training, determining the prediction probability of each entity relationship of the unstructured text belonging to a plurality of entity relationships, and forming probability distribution feature vectors by the prediction probabilities respectively corresponding to the plurality of entity relationships;
And identifying entity relations in the target corpus based on the feature vectors of the entity texts, the probability distribution feature vectors and a second classification model obtained by pre-training.
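At inference time, these three instructions amount to one probability pass through each model. A minimal sketch with linear stand-ins, where W1 and W2 are hypothetical placeholders for the two trained classification models:

```python
import numpy as np

def _softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def identify_relation(entity_vec, unstructured_vec, W1, W2):
    # First model: prediction probability over the candidate relations,
    # forming the probability distribution feature vector.
    probs = _softmax(W1 @ unstructured_vec)
    # Splice the entity-text feature vector with the probability
    # distribution vector and let the second model identify the relation.
    spliced = np.concatenate([entity_vec, probs])
    return int(np.argmax(_softmax(W2 @ spliced)))
```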
The specific process flow of the processor 801 may refer to the descriptions of the above method embodiments, and will not be described herein.
Based on the same technical idea, the embodiments of the present application also provide a computer readable storage medium, on which a computer program is stored, which when executed by a processor performs the steps of the above-mentioned entity relationship identification method.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the above entity relationship identification method can be performed, thereby accurately predicting the entity relationship of the target corpus.
Based on the same technical concept, the embodiments of the present application further provide a computer program product, which includes a computer readable storage medium storing program code. The instructions included in the program code may be used to execute the steps of the above entity relationship identification method; for specific implementation, reference may be made to the foregoing method embodiments, which are not described herein again.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system and apparatus may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing an electronic device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes or substitutions are covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (22)

1. A method for identifying an entity relationship, comprising:
extracting feature vectors of entity texts and feature vectors of unstructured texts included in target corpus, wherein the feature vectors of the unstructured texts comprise: a word vector of each word in the unstructured text and a position word vector used for representing the position relation of each word in the target corpus in the unstructured text;
based on the feature vector of the unstructured text and a first classification model obtained by pre-training, determining the prediction probability of each entity relationship of the unstructured text belonging to a plurality of entity relationships, and forming probability distribution feature vectors by the prediction probabilities respectively corresponding to the plurality of entity relationships;
identifying entity relations in the target corpus based on the feature vectors of the entity texts, the probability distribution feature vectors and a second classification model obtained by pre-training;
the determining, based on the feature vector of the unstructured text and a first classification model obtained by training in advance, a prediction probability that the unstructured text belongs to each entity relationship in a plurality of entity relationships includes:
For each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word;
extracting features of the first feature vector corresponding to each word based on a first classification model to obtain a second feature vector corresponding to the unstructured text;
adjusting the feature value of the second feature vector corresponding to the unstructured text by using an Attention mechanism;
and classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relation.
2. The method of claim 1, wherein the adjusting, by using an Attention mechanism, the feature value of the second feature vector corresponding to the obtained unstructured text includes:
calculating the entity relation correlation degree between the second feature vector and the feature vector of each entity relation in a plurality of entity relations;
and adjusting the characteristic value in the second characteristic vector based on the second characteristic vector and each calculated entity relation correlation degree.
3. The method of claim 2, wherein the adjusting feature values in the second feature vector based on the second feature vector and each calculated entity-relationship relevance comprises:
And for the i-th dimension feature value in the second feature vector, where i is a positive integer, calculating the product of the i-th dimension feature value and each entity relationship correlation, and taking the sum of the calculated products as the i-th dimension feature value of the adjusted second feature vector.
4. The method of claim 2, wherein the feature extracting the first feature vector corresponding to each word based on the first classification model to obtain the second feature vector corresponding to the unstructured text comprises:
for the j-th word in the unstructured text, j is a positive integer, executing the following operations:
transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector;
calculating a text relevance between the first transformed feature vector corresponding to the j-th word and the word vector of each word in the unstructured text;
adjusting feature values in the first transformed feature vector corresponding to the j-th word based on that first transformed feature vector and the calculated text relevance corresponding to each word;
and splicing the adjusted first transformed feature vectors corresponding to all words in the unstructured text to obtain the second feature vector corresponding to the unstructured text.
5. The method of claim 4, wherein the adjusting the feature value in the first transformed feature vector corresponding to the jth word based on the first transformed feature vector corresponding to the jth word and the calculated text relevance for each word comprises:
and for the k-th dimension feature value in the first transformed feature vector corresponding to the j-th word, where k is a positive integer, calculating the product of the k-th dimension feature value and the text relevance corresponding to each word, and taking the sum of the calculated products as the k-th dimension feature value of the adjusted first transformed feature vector.
6. The method of claim 4, wherein transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector comprises:
performing bidirectional feature extraction on the first feature vector corresponding to the j-th word to obtain a forward feature vector and a backward feature vector;
and constructing a first transformation feature vector corresponding to the j-th word based on the forward feature vector and the backward feature vector.
7. The method of claim 1, wherein the identifying entity relationships in the target corpus based on feature vectors of the entity text, the probability distribution feature vectors, and a pre-trained second classification model comprises:
And splicing the feature vector of the entity text with the probability distribution feature vector, inputting the spliced feature vector into the second classification model obtained by pre-training, and identifying the entity relationship in the target corpus.
8. The method according to claim 1 or 7, wherein the extracting feature vectors of the entity text included in the target corpus includes:
extracting word vectors of each word of the entity text, and determining feature vectors of the entity text according to the extracted word vectors of each word.
9. The method of claim 8, wherein the determining the feature vector of the entity text based on the extracted word vector of each word comprises:
and for the l-th dimension feature value in the word vector of each word, where l is a positive integer, adding the l-th dimension feature values of the word vectors of all words, and taking the resulting sum as the l-th dimension feature value in the feature vector of the entity text.
10. The method of claim 1, wherein the method further comprises:
acquiring a sample corpus set, wherein the sample corpus set comprises at least one sample corpus marked with entity relations;
extracting sample feature vectors of entity texts and sample feature vectors of unstructured texts contained in each sample corpus;
Training the first classification model based on sample feature vectors of unstructured texts corresponding to each sample corpus and entity relations of labeling of each sample corpus;
after the first classification model training is determined, determining a probability distribution feature vector of each sample corpus based on the sample feature vector of the unstructured text corresponding to each sample corpus and the first classification model;
and training the second classification model based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relation marked by each sample corpus until the second classification model training is determined to be completed.
11. An apparatus for identifying an entity relationship, comprising:
the extraction module is used for extracting feature vectors of entity texts and feature vectors of unstructured texts included in the target corpus, wherein the feature vectors of the unstructured texts comprise: a word vector of each word in the unstructured text and a position word vector used for representing the position relation of each word in the target corpus in the unstructured text;
the determining module is used for determining the prediction probability of each entity relation of the unstructured text belonging to a plurality of entity relations based on the feature vector of the unstructured text and a first classification model obtained through pre-training, and forming probability distribution feature vectors by the prediction probabilities respectively corresponding to the plurality of entity relations;
The recognition module is used for recognizing entity relations in the target corpus based on the feature vectors of the entity texts, the probability distribution feature vectors and a second classification model obtained by pre-training;
the determining module is specifically configured to, when determining the prediction probability that the unstructured text belongs to each entity relationship in the plurality of entity relationships based on the feature vector of the unstructured text and a first classification model obtained by training in advance:
for each word in the unstructured text, splicing the word vector of each word with the corresponding position word vector to obtain a first feature vector corresponding to each word;
extracting features of the first feature vector corresponding to each word based on a first classification model to obtain a second feature vector corresponding to the unstructured text;
adjusting the feature value of the second feature vector corresponding to the unstructured text by using an Attention mechanism;
and classifying the adjusted second feature vector to obtain the prediction probability of the unstructured text belonging to each entity relation.
12. The apparatus of claim 11, wherein the determining module is configured to, when adjusting the feature value of the second feature vector corresponding to the obtained unstructured text by using an Attention mechanism:
Calculating the entity relation correlation degree between the second feature vector and the feature vector of each entity relation in a plurality of entity relations;
and adjusting the characteristic value in the second characteristic vector based on the second characteristic vector and each calculated entity relation correlation degree.
13. The apparatus of claim 12, wherein the determining module, when adjusting the eigenvalue in the second eigenvector based on the second eigenvector and each calculated entity-relationship relevance, is specifically configured to:
and for the i-th dimension feature value in the second feature vector, where i is a positive integer, calculating the product of the i-th dimension feature value and each entity relationship correlation, and taking the sum of the calculated products as the i-th dimension feature value of the adjusted second feature vector.
14. The apparatus of claim 12, wherein the determining module is specifically configured to, when performing feature extraction on the first feature vector corresponding to each word based on the first classification model, obtain the second feature vector corresponding to the unstructured text:
for the j-th word in the unstructured text, j is a positive integer, executing the following operations:
Transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector;
calculating a text relevance between the first transformed feature vector corresponding to the j-th word and the word vector of each word in the unstructured text;
adjusting feature values in the first transformed feature vector corresponding to the j-th word based on that first transformed feature vector and the calculated text relevance corresponding to each word;
and splicing the adjusted first transformed feature vectors corresponding to all words in the unstructured text to obtain the second feature vector corresponding to the unstructured text.
15. The apparatus of claim 14, wherein the determining module, when adjusting the feature value in the first transformed feature vector corresponding to the jth word based on the first transformed feature vector corresponding to the jth word and the calculated text relevance corresponding to each word, is specifically configured to:
and for the k-th dimension feature value in the first transformed feature vector corresponding to the j-th word, where k is a positive integer, calculating the product of the k-th dimension feature value and the text relevance corresponding to each word, and taking the sum of the calculated products as the k-th dimension feature value of the adjusted first transformed feature vector.
16. The apparatus of claim 14, wherein the determining module is configured to, when transforming the first feature vector corresponding to the j-th word to obtain a first transformed feature vector:
performing bidirectional feature extraction on the first feature vector corresponding to the j-th word to obtain a forward feature vector and a backward feature vector;
and constructing a first transformation feature vector corresponding to the j-th word based on the forward feature vector and the backward feature vector.
17. The apparatus of claim 11, wherein the identifying module, when identifying the entity relationship in the target corpus based on the feature vector of the entity text, the probability distribution feature vector, and a pre-trained second classification model, is specifically configured to:
and splicing the feature vector of the entity text with the probability distribution feature vector, inputting the spliced feature vector into the second classification model obtained by pre-training, and identifying the entity relationship in the target corpus.
18. The apparatus of claim 11 or 17, wherein the extracting module is specifically configured to, when extracting a feature vector of an entity text included in the target corpus:
Extracting word vectors of each word of the entity text, and determining feature vectors of the entity text according to the extracted word vectors of each word.
19. The apparatus of claim 18, wherein the extraction module, when determining the feature vector of the entity text based on the extracted word vector of each word, is specifically configured to:
and for the l-th dimension feature value in the word vector of each word, where l is a positive integer, adding the l-th dimension feature values of the word vectors of all words, and taking the resulting sum as the l-th dimension feature value in the feature vector of the entity text.
20. The apparatus of claim 11, wherein the apparatus further comprises:
the system comprises a sample acquisition module, a data acquisition module and a data processing module, wherein the sample acquisition module is used for acquiring a sample corpus set, and the sample corpus set comprises at least one sample corpus marked with entity relations;
the sample extraction module is used for extracting sample feature vectors of the entity text and sample feature vectors of unstructured text included in each sample corpus;
the first model training module is used for training the first classification model based on the sample feature vectors of the unstructured text corresponding to each sample corpus and the entity relation of each sample corpus label;
The sample determining module is used for determining probability distribution feature vectors of each sample corpus based on the sample feature vectors of unstructured texts corresponding to each sample corpus and the first classification model after the first classification model is determined to be trained;
and the second model training module is used for training the second classification model based on the feature vector and the probability distribution feature vector of the entity text corresponding to each sample corpus and the entity relation marked by each sample corpus until the second classification model training is determined to be completed.
21. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the method of identifying an entity relationship of any one of claims 1 to 10.
22. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the method of identifying an entity relationship according to any of claims 1 to 10.
CN201910139176.8A 2019-02-25 2019-02-25 Entity relationship identification method and device Active CN111611395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910139176.8A CN111611395B (en) 2019-02-25 2019-02-25 Entity relationship identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910139176.8A CN111611395B (en) 2019-02-25 2019-02-25 Entity relationship identification method and device

Publications (2)

Publication Number Publication Date
CN111611395A CN111611395A (en) 2020-09-01
CN111611395B true CN111611395B (en) 2023-05-16

Family

ID=72202867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910139176.8A Active CN111611395B (en) 2019-02-25 2019-02-25 Entity relationship identification method and device

Country Status (1)

Country Link
CN (1) CN111611395B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330379B (en) * 2020-11-25 2023-10-31 税友软件集团股份有限公司 Invoice content generation method, invoice content generation system, electronic equipment and storage medium
CN112988979B (en) * 2021-04-29 2021-10-08 腾讯科技(深圳)有限公司 Entity identification method, entity identification device, computer readable medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809176A (en) * 2015-04-13 2015-07-29 中央民族大学 Entity relationship extracting method of Zang language
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8504490B2 (en) * 2010-04-09 2013-08-06 Microsoft Corporation Web-scale entity relationship extraction that extracts pattern(s) based on an extracted tuple

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809176A (en) * 2015-04-13 2015-07-29 中央民族大学 Entity relationship extracting method of Zang language
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Chinese Entity Relation Extraction; Wu Wenya et al.; Computer and Modernization; Aug. 31, 2018 (No. 8); pp. 21-27, 34 *

Also Published As

Publication number Publication date
CN111611395A (en) 2020-09-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant