CN110852066A - Multi-language entity relation extraction method and system based on adversarial training mechanism - Google Patents

Multi-language entity relation extraction method and system based on adversarial training mechanism

Info

Publication number
CN110852066A
CN110852066A (application CN201810827459.7A; granted publication CN110852066B)
Authority
CN
China
Prior art keywords
language
target
semantic space
sentence
relation
Prior art date
Legal status
Granted
Application number
CN201810827459.7A
Other languages
Chinese (zh)
Other versions
CN110852066B (en)
Inventor
刘知远 (Zhiyuan Liu)
王晓智 (Xiaozhi Wang)
韩旭 (Xu Han)
林衍凯 (Yankai Lin)
孙茂松 (Maosong Sun)
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201810827459.7A
Publication of CN110852066A
Application granted
Publication of CN110852066B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/02: Knowledge representation; Symbolic representation
    • G06N 5/022: Knowledge engineering; Knowledge acquisition

Abstract

The invention provides a multilingual entity relation extraction method and system based on an adversarial training mechanism. Target sentences mentioning a target entity pair are encoded, for each relevant language, into both an independent semantic space specific to that language and a consistent semantic space shared by all languages, capturing the language-specific information and the cross-language consistent information contained in the target sentences. An independent attention mechanism for each language and a consistent attention mechanism across languages then measure the attention weight of each target sentence with respect to each relation type. Combining the attention weights of all target sentences yields a global probability for each relation type; the maximum of these global probabilities is selected, and the relation between the target entity pair is predicted as the relation type corresponding to that maximum. The method and system can deeply exploit the complementarity among multiple languages and effectively improve the accuracy of relation extraction results in multilingual scenarios.

Description

Multi-language entity relation extraction method and system based on adversarial training mechanism
Technical Field
The invention relates to the technical field of information processing, in particular to a multi-language entity relation extraction method and system based on an adversarial training mechanism.
Background
A knowledge graph, also referred to as a knowledge base in some scenarios, is a knowledge system formed by structuring human knowledge about the real world. In a knowledge graph, a large amount of knowledge, such as information from open databases and encyclopedias, is expressed in the form of a set of relational data. In a relational data set, basic facts are abstracted into entities, and relevance information such as rules, logic, and reasoning is abstracted into relationships among those entities. If each entity is mapped to a node and each relationship to an edge, the knowledge can further be presented in the form of a graph, so that it can be used efficiently by computers; this is the motivation for studying knowledge graphs. This model of structuring entities and abstractions into multi-relational data sets has been widely advocated in recent years. Knowledge graphs allow the information people encounter, especially knowledge information, to break out of the essentially linear form of plain text strings and exist instead in a network-like form composed of entities and relations.
At present, the knowledge graph is a core foundational technology in the field of artificial intelligence and has been widely introduced into tasks such as information retrieval, question answering systems, and recommendation systems. The high-quality structured knowledge in a graph can, to a certain extent, give an intelligent model deeper object understanding and more accurate query and logical reasoning capabilities, so it plays a vital role in knowledge-driven applications.
Although existing knowledge graphs contain hundreds of millions of facts, they remain far from complete compared to the boundless real world. To further enlarge the scale of a knowledge graph, relation extraction is needed to automatically capture new relational facts from massive text data. The task of relation extraction is to automatically extract features from free text and judge the relation between entity pairs appearing in the text, thereby automatically adding new edges to the knowledge graph and enriching its content.
However, most existing relation extraction methods address only the single-language setting, that is, only one language is considered in both training data and application. Such models ignore the potential complementarity and consistency between different languages. In today's big-data era, information sources are diverse, and the massive free-text resources obtained from the Internet for relation extraction are frequently multilingual. Existing relation extraction models designed for a single-language setting struggle to perform well in practical multilingual application scenarios.
In view of the above, it is desirable to provide a relationship extraction method and system suitable for multi-language scenarios.
Disclosure of Invention
The invention provides a multilingual entity relation extraction method and system based on an adversarial training mechanism, aiming to solve the problem that existing relation extraction models designed for single-language scenarios perform poorly in practical multilingual application scenarios.
In one aspect, the invention provides a multilingual entity relation extraction method based on an adversarial training mechanism, comprising the following steps:
for any one of a plurality of languages, acquiring a preset number of sentences related to target entity pairs in the language as target sentences, and constructing a first sentence vector representation of each target sentence in an independent semantic space corresponding to the language and a second sentence vector representation in a consistent semantic space;
presetting a plurality of relation types; for any relation type, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, according to the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language;
constructing a second relation vector representation of the relation type in a consistent semantic space, and obtaining a second global vector representation of the target entity in the consistent semantic space relative to the relation type according to the second relation vector representation and second sentence vector representations of all target sentences in all languages;
and obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language relative to the relation type, selecting the maximum probability from the global probabilities corresponding to each relation type, and predicting the relation between the target entity pair according to the relation type corresponding to the maximum probability.
Preferably, constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language specifically comprises:
for any target sentence, obtaining all words in the target sentence, and inputting all words into the input layer of a pre-trained encoder of the independent semantic space to obtain a first representation vector corresponding to each word;
combining the first representation vectors corresponding to all the words into a first representation vector sequence, and inputting the first representation vector sequence into the core processing layer of the encoder of the independent semantic space to obtain the first sentence vector representation of the target sentence in the independent semantic space corresponding to the language;
wherein the first representation vector comprises a word vector and a position vector.
Preferably, constructing a second sentence vector representation of each target sentence in the consistent semantic space specifically comprises:
for any target sentence, obtaining all words in the target sentence, and inputting all words into the input layer of a pre-trained encoder of the consistent semantic space to obtain a second representation vector corresponding to each word;
combining the second representation vectors corresponding to all the words into a second representation vector sequence, and inputting the second representation vector sequence into the core processing layer of the encoder of the consistent semantic space to obtain the second sentence vector representation of the target sentence in the consistent semantic space;
wherein the second representation vector comprises a word vector and a position vector.
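The two encoders described above share the same input-layer design. The following sketch illustrates the idea in a minimal form; the toy dimensions, random embeddings, and the tanh-plus-max-pooling core layer are simplifying assumptions standing in for the trained neural encoder (e.g. a CNN or recurrent network), not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, WORD_DIM, POS_DIM, OUT_DIM = 50, 4, 2, 3  # toy sizes (assumed)
MAX_OFFSET = 10  # offsets clipped to [-10, 10] before lookup (assumed)

word_emb = rng.normal(size=(VOCAB, WORD_DIM))
pos_emb = rng.normal(size=(2 * MAX_OFFSET + 1, POS_DIM))
W = rng.normal(size=(WORD_DIM + 2 * POS_DIM, OUT_DIM))  # core-layer weights

def input_layer(word_ids, head_idx, tail_idx):
    """For each word, concatenate its word vector with two position
    vectors encoding its offsets to the head and tail entities."""
    rows = []
    for i, w in enumerate(word_ids):
        ph = pos_emb[np.clip(i - head_idx, -MAX_OFFSET, MAX_OFFSET) + MAX_OFFSET]
        pt = pos_emb[np.clip(i - tail_idx, -MAX_OFFSET, MAX_OFFSET) + MAX_OFFSET]
        rows.append(np.concatenate([word_emb[w], ph, pt]))
    return np.stack(rows)

def core_layer(rep_seq):
    """Stand-in for the encoder's core processing layer:
    a linear map, tanh non-linearity, and max-pooling over positions."""
    return np.tanh(rep_seq @ W).max(axis=0)

# Encode a 5-word toy sentence whose head/tail entities sit at positions 0 and 3.
sentence_vec = core_layer(input_layer([3, 7, 1, 9, 2], head_idx=0, tail_idx=3))
```

The independent-space and consistent-space encoders would each hold their own copies of these parameters; only the training objective differs.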
Preferably, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language is obtained according to the first relation vector representation and the first sentence vector representations of all target sentences in the language, with the specific calculation formula:

$$\mathbf{s}_j=\sum_{i=1}^{m}\alpha_i^{j}\,\mathbf{x}_i^{j},\qquad \alpha_i^{j}=\frac{\exp\left(\mathbf{r}_j\cdot\mathbf{x}_i^{j}\right)}{\sum_{k=1}^{m}\exp\left(\mathbf{r}_j\cdot\mathbf{x}_k^{j}\right)}$$

where $\mathbf{s}_j$ is the first global vector representation of the target entity pair relative to relation type $r$ in the independent semantic space corresponding to language $j$; $\alpha_i^{j}$ is the attention weight of target sentence $i$ relative to relation type $r$ in that space; $\mathbf{r}_j$ is the first relation vector representation of relation type $r$ in the independent semantic space corresponding to language $j$; $\mathbf{x}_i^{j}$ is the first sentence vector representation of target sentence $i$ in that space; $k$ indexes any target sentence in language $j$; and $m$ is the total number of target sentences in language $j$.
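The per-language attention formula above can be sketched numerically as follows; the toy sentence vectors and relation vector are assumptions for illustration:

```python
import numpy as np

def independent_attention(sent_vecs, rel_vec):
    """Per-language attention: weight each sentence vector x_i^j by a
    softmax over its dot product with the relation vector r_j, then sum."""
    scores = sent_vecs @ rel_vec                    # x_i^j . r_j
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # alpha_i^j
    return weights, weights @ sent_vecs             # s_j

sent_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # m = 3 sentences
rel_vec = np.array([1.0, 0.0])                              # r_j
alphas, s_j = independent_attention(sent_vecs, rel_vec)
```

Sentences whose vectors align with the relation vector receive larger attention weights, so informative sentences dominate the global vector.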
Preferably, a second global vector representation of the target entity pair relative to the relation type in the consistent semantic space is obtained according to the second relation vector representation and the second sentence vector representations of all target sentences in all languages, with the specific calculation formula:

$$\hat{\mathbf{s}}=\sum_{l=1}^{n}\sum_{i=1}^{m}\hat{\alpha}_i^{l}\,\hat{\mathbf{x}}_i^{l},\qquad \hat{\alpha}_i^{j}=\frac{\exp\left(\hat{\mathbf{r}}\cdot\hat{\mathbf{x}}_i^{j}\right)}{\sum_{l=1}^{n}\sum_{k=1}^{m}\exp\left(\hat{\mathbf{r}}\cdot\hat{\mathbf{x}}_k^{l}\right)}$$

where $\hat{\mathbf{s}}$ is the second global vector representation of the target entity pair relative to relation type $r$ in the consistent semantic space; $\hat{\alpha}_i^{j}$ is the attention weight, in the consistent semantic space, of target sentence $i$ in language $j$ relative to relation type $r$; $\hat{\mathbf{r}}$ is the second relation vector representation of relation type $r$ in the consistent semantic space; $\hat{\mathbf{x}}_i^{j}$ is the second sentence vector representation of target sentence $i$ of language $j$ in the consistent semantic space; $l$ indexes any one of the languages; $k$ indexes any target sentence in language $l$; $m$ is the total number of target sentences in language $l$; and $n$ is the total number of languages.
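The cross-lingual attention differs from the per-language version only in that the softmax is normalized jointly over the target sentences of all languages. A minimal sketch, with toy data as an assumption:

```python
import numpy as np

def consistent_attention(sent_vecs_by_lang, rel_vec):
    """Cross-lingual attention: the softmax is normalized jointly over
    the target sentences of *all* languages, not per language."""
    all_vecs = np.concatenate(sent_vecs_by_lang, axis=0)
    scores = all_vecs @ rel_vec                    # r-hat . x-hat
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # alpha-hat, sums to 1 globally
    return weights, weights @ all_vecs             # s-hat

# n = 2 languages, each with m = 2 sentences in the shared space (toy data)
by_lang = [np.array([[1.0, 0.0], [0.2, 0.1]]),
           np.array([[0.0, 1.0], [0.3, 0.3]])]
w_all, s_hat = consistent_attention(by_lang, np.array([1.0, 1.0]))
```

Because the weights compete across languages, a highly informative sentence in one language can outweigh weak evidence in another.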
Preferably, the global probability corresponding to the relationship type is obtained based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language relative to the relationship type, specifically:
obtaining a first conditional probability of the relationship type in a consistency semantic space based on the second global vector representation by utilizing a normalization function;
obtaining a second conditional probability of the relation type in the independent semantic space corresponding to each language, based on the first global vector representation of the target entity pair relative to the relation type in that space, using a normalization function;
and obtaining the global probability corresponding to the relationship type according to the first conditional probability and all the second conditional probabilities.
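The text above does not specify how the first conditional probability and the second conditional probabilities are combined into the global probability; the element-wise product used in the sketch below is therefore an assumption, as are the toy relation scores:

```python
import numpy as np

def softmax(scores):
    """Normalization function turning per-relation scores into probabilities."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def predict_relation(consistent_scores, independent_scores_by_lang):
    """consistent_scores: one score per relation type in the shared space.
    independent_scores_by_lang: per-relation score vectors, one per language.
    Each is normalized into a conditional probability; the conditionals are
    combined here by element-wise product (an assumed combination rule)."""
    global_prob = softmax(np.asarray(consistent_scores, dtype=float))
    for scores in independent_scores_by_lang:
        global_prob = global_prob * softmax(np.asarray(scores, dtype=float))
    return int(np.argmax(global_prob)), global_prob

# 3 relation types, 2 languages: relation 0 scores highest in every space.
best, probs = predict_relation([2.0, 0.1, 0.0],
                               [[1.5, 0.2, 0.1], [1.0, 0.9, 0.0]])
```

The predicted relation is the type whose combined global probability is maximal, matching the selection step described above.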
Preferably, constructing a second sentence vector representation of each target sentence in the consistency semantic space further comprises:
initializing an encoder network and a discriminator network, and performing round-by-round adversarial training on the encoder network and the discriminator network, so that the output of the encoder network erases the characteristics of each language and confuses the discriminator network, yielding a trained encoder of the consistent semantic space;
during the round-by-round adversarial training, the parameters of the encoder network are optimized with a stochastic gradient descent algorithm so as to minimize the probability that the discrimination result output by the discriminator network is correct, and the parameters of the discriminator network are optimized with a stochastic gradient descent algorithm so as to maximize that probability.
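A minimal numerical sketch of this round-by-round adversarial training follows. The linear per-language encoders, the logistic-regression discriminator, full-batch gradient descent in place of stochastic gradient descent, and the toy data are all simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, N = 4, 64

# Toy "sentences" from two languages with clearly different distributions.
X_a = rng.normal( 1.0, 0.5, size=(N, DIM))
X_b = rng.normal(-1.0, 0.5, size=(N, DIM))

# Encoder: one linear map per language into the shared space (a stand-in
# for the real neural encoders of the consistent semantic space).
E_a, E_b = np.eye(DIM), np.eye(DIM)
# Discriminator: logistic regression guessing "language a" from a code.
w, b = rng.normal(0, 0.1, DIM), 0.0

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
accuracy = lambda: 0.5 * ((sigmoid(X_a @ E_a @ w + b) > 0.5).mean()
                          + (sigmoid(X_b @ E_b @ w + b) < 0.5).mean())

lr = 0.1
for _ in range(50):            # pre-train the discriminator alone
    p_a, p_b = sigmoid(X_a @ E_a @ w + b), sigmoid(X_b @ E_b @ w + b)
    w -= lr * (((p_a - 1) @ (X_a @ E_a)) + (p_b @ (X_b @ E_b))) / N
    b -= lr * ((p_a - 1).mean() + p_b.mean())
acc_before = accuracy()

for _ in range(200):           # round-by-round adversarial training
    # Discriminator step: maximize its probability of being correct.
    p_a, p_b = sigmoid(X_a @ E_a @ w + b), sigmoid(X_b @ E_b @ w + b)
    w -= lr * (((p_a - 1) @ (X_a @ E_a)) + (p_b @ (X_b @ E_b))) / N
    b -= lr * ((p_a - 1).mean() + p_b.mean())
    # Encoder step: minimize that probability (erase language features).
    p_a, p_b = sigmoid(X_a @ E_a @ w + b), sigmoid(X_b @ E_b @ w + b)
    E_a -= lr * np.outer(X_a.T @ p_a, w) / N
    E_b -= lr * np.outer(X_b.T @ (p_b - 1), w) / N
acc_after = accuracy()
```

After pre-training, the discriminator tells the two languages apart almost perfectly; the adversarial rounds then push the encoders toward codes the discriminator cannot separate.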
In one aspect, the present invention provides a multilingual entity relationship extraction system based on an adversarial training mechanism, comprising:
a sentence coding module, configured to, for any one of multiple languages, obtain a preset number of sentences related to a target entity pair in the language as target sentences, and construct a first sentence vector representation of each target sentence in an independent semantic space corresponding to the language and a second sentence vector representation in a consistent semantic space;
the first attention mechanism module is used for constructing, for any one of a plurality of preset relation types, a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, according to the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair relative to the relation type in the independent semantic space corresponding to the language;
the second attention mechanism module is used for constructing a second relation vector representation of the relation type in the consistent semantic space, and obtaining, according to the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair relative to the relation type in the consistent semantic space;
and the relation extraction module is used for obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language relative to the relation type, selecting the maximum probability from the global probabilities corresponding to the relation types, and predicting the relation between the target entity pair according to the relation type corresponding to the maximum probability.
In one aspect, the present invention provides an electronic device comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor, when executing the program instructions, is capable of performing any of the methods described above.
In one aspect, the invention provides a non-transitory computer readable storage medium storing computer instructions that cause a computer to perform any of the methods described above.
The invention provides a multilingual entity relation extraction method and system based on an adversarial training mechanism. Target sentences mentioning a target entity pair are encoded, for each relevant language, into both an independent semantic space specific to that language and a consistent semantic space shared by all languages, capturing the language-specific information and the cross-language consistent information contained in the target sentences. An independent attention mechanism for each language and a consistent attention mechanism across languages then measure the attention weight of each target sentence with respect to each relation type. Combining the attention weights of all target sentences yields a global probability for each relation type; the maximum of these global probabilities is selected, and the relation between the target entity pair is predicted as the relation type corresponding to that maximum. The method and system can effectively extract both the cross-language semantic information and the language-specific structural information in multilingual text, deeply exploit the complementarity among multiple languages, improve the performance of relation extraction in practical multilingual scenarios, and help to improve the quality of automatically constructed knowledge graphs, giving them broad application prospects.
Drawings
FIG. 1 is a schematic overall flowchart of a multilingual entity relation extraction method based on an adversarial training mechanism according to an embodiment of the present invention;
FIG. 2 is a simulation diagram of a multilingual entity relation extraction method based on an adversarial training mechanism according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the overall structure of a multilingual entity relation extraction system based on an adversarial training mechanism according to an embodiment of the present invention;
fig. 4 is a schematic structural framework diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Fig. 1 is a schematic overall flow chart of a multilingual entity relation extraction method based on an adversarial training mechanism according to an embodiment of the present invention. As shown in Fig. 1, the present invention provides a multilingual entity relation extraction method based on an adversarial training mechanism, comprising:
s1, for any one of a plurality of languages, obtaining a preset number of sentences related to the target entity pairs in the language as target sentences, and constructing a first sentence vector representation of each target sentence in an independent semantic space corresponding to the language and a second sentence vector representation in a consistent semantic space;
specifically, in this embodiment, two entities to be subjected to relationship extraction are used as a target entity pair. When the target entity pair is subjected to relationship extraction, a preset number of sentences related to the target entity pair are obtained in multiple languages, and each sentence related to the target entity pair is used as a target sentence. Wherein, the sentence related to the target entity pair means that the sentence includes the target entity pair. For example, assuming that the target entity pair is "china" and "beijing", a preset number of sentences including "china" and "beijing" can be obtained from multiple languages such as chinese, english, and japanese as target sentences. In addition, the preset number in this embodiment may be set according to actual requirements, and is not specifically limited herein.
After a preset number of target sentences are obtained in each language, an encoder of each independent semantic space is trained in advance in the semantic space corresponding to each language, and an encoder of the consistent semantic space is trained in advance, based on an adversarial training mechanism, in the consistent semantic space shared by all languages. On this basis, for any language, a first sentence vector representation of each target sentence of the language in the independent semantic space corresponding to that language is constructed using the encoder of that independent semantic space, and a second sentence vector representation of each target sentence in the consistent semantic space is constructed using the encoder of the consistent semantic space. Thereby, a first sentence vector representation in the respective independent semantic space and a second sentence vector representation in the consistent semantic space can be obtained for each target sentence in each language. The first sentence vector representation of a target sentence reflects the information unique to the corresponding language; the second sentence vector representation embodies the cross-language consistent information of the target sentence.
S2, presetting a plurality of relation types; for any relation type, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, according to the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair relative to the relation type in the independent semantic space corresponding to the language;
it should be noted that, because different encoders are used in the above technical solutions to separately encode and obtain the first sentence vector representation and the second sentence vector representation of each target sentence, on this basis, different attention mechanisms are needed to measure the information richness of each target sentence. In view of this, in the present embodiment, a plurality of relationship types are preset for the target entity pair. On the basis, firstly, the information richness of each target sentence is measured by adopting an attention mechanism independent of each language, and the specific implementation process is as follows:
For any one of the preset relation types, in the independent semantic space corresponding to a given language, a first relation vector representation of the relation type in that space is first constructed. At the same time, the first sentence vector representations of all target sentences in the language are obtained. On this basis, for a given target sentence in the language, the attention weight of the target sentence can be obtained from the first relation vector representation and the first sentence vector representation of the target sentence, combined with the first sentence vector representations of all other target sentences in the language. Combining the attention weights of all target sentences in the language with their first sentence vector representations then yields the first global vector representation of the target entity pair relative to the relation type in the independent semantic space corresponding to the language. Similarly, the first global vector representations of the target entity pair relative to the relation type in the independent semantic spaces corresponding to the other languages can be obtained; likewise, the first global vector representations of the target entity pair relative to every relation type in the independent semantic space corresponding to every language can be obtained.
S3, constructing a second relation vector representation of the relation type in the consistent semantic space, and obtaining, according to the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair relative to the relation type in the consistent semantic space;
specifically, on the basis of the above technical solution, meanwhile, the information richness of each target sentence is measured by adopting a consistent attention mechanism among languages, and the specific implementation process is as follows:
For any one of the preset relation types, a second relation vector representation of the relation type in the consistent semantic space is first constructed. At the same time, the second sentence vector representations of all target sentences in all languages are obtained. On this basis, for a given target sentence in a given language, the attention weight of the target sentence can be obtained from the second relation vector representation and the second sentence vector representation of the target sentence, combined with the second sentence vector representations of all other target sentences in all languages. Combining the attention weights of all target sentences in all languages with their second sentence vector representations then yields the second global vector representation of the target entity pair relative to the relation type in the consistent semantic space. Similarly, the second global vector representations of the target entity pair relative to the other relation types in the consistent semantic space can be obtained, that is, a second global vector representation relative to every relation type.
S4, obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language relative to the relation type, selecting the maximum probability from the global probabilities corresponding to each relation type, and predicting the relation between the target entity pair according to the relation type corresponding to the maximum probability.
Specifically, on the basis of the above technical solution, the probability that a given relation type holds between the target entity pair is further determined. For a given relation type, a first conditional probability of the relation type in the consistent semantic space is obtained based on the second global vector representation; meanwhile, a second conditional probability of the relation type in the independent semantic space corresponding to each language is obtained based on the first global vector representation of the target entity pair relative to the relation type in that space; finally, the global probability corresponding to the relation type is obtained by combining the first conditional probability and all the second conditional probabilities. In the same way, the global probabilities corresponding to the other relation types can be obtained. The global probability corresponding to a relation type is the probability that the relation type holds between the target entity pair.
On the basis of obtaining the global probabilities corresponding to the relationship types, selecting the maximum probability from the global probabilities corresponding to the relationship types, and finally predicting the relationship between the target entity pair according to the relationship type corresponding to the maximum probability. For example, if the relationship type corresponding to the maximum probability is r, the relationship between the target entity pair can be predicted to be r.
The invention provides a multilingual entity relation extraction method based on an adversarial training mechanism. Target sentences mentioning a target entity pair are encoded, for each relevant language, into both an independent semantic space specific to that language and a consistent semantic space shared by all languages, obtaining the language-specific information and the cross-language consistent information contained in the target sentences. An independent attention mechanism for each language and a consistent attention mechanism across languages then measure the attention weight of each target sentence with respect to each relation type. Combining the attention weights of all target sentences yields a global probability for each relation type; the maximum of these global probabilities is selected, and the relation between the target entity pair is predicted as the relation type corresponding to that maximum. The method can effectively extract both the cross-language semantic information and the language-specific structural information in multilingual text, deeply exploit the complementarity among multiple languages, improve the performance of relation extraction tasks in practical multilingual scenarios, and help to improve the quality of automatically constructed knowledge graphs, giving it broad application prospects.
Based on any of the embodiments above, a multilingual entity relationship extraction method based on an adversarial training mechanism is provided, in which the first sentence vector representation of each target sentence in the independent semantic space corresponding to its language is constructed as follows: for any target sentence, all words in the target sentence are obtained and input to the input layer of a pre-trained encoder of the independent semantic space, yielding a first representation vector for each word; the first representation vectors of all the words are combined into a first representation vector sequence, which is input to the core processing layer of the encoder of the independent semantic space, yielding the first sentence vector representation of the target sentence in the independent semantic space corresponding to its language; wherein each first representation vector comprises a word vector and a position vector.
Specifically, on the basis of the above technical solution, a specific implementation process for constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language is as follows:
for any target sentence in any language, all words in the target sentence are obtained and then input to a pre-trained encoder of the independent semantic space, where this encoder has been pre-trained in the independent semantic space corresponding to that language. In this embodiment, the network structure of the encoder of the independent semantic space includes an input layer and a core processing layer. First, all words are input to the input layer of the pre-trained encoder, which converts each word into its corresponding first representation vector. Each first representation vector is composed of a word vector and a position vector: the word vector describes the grammatical and semantic information of the word, while the position vector describes the word's position information in the sentence, defined as a vector representation of the relative position offsets between the word and the head and tail entities of the sentence. The word vector and position vector of each word are concatenated to obtain the first representation vector corresponding to that word.
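The input layer described above can be sketched in pure Python. The word vectors, position vectors, and the example sentence are all illustrative assumptions; in the actual model these would be learned embeddings:

```python
def input_layer(words, head_idx, tail_idx, word_vecs, pos_vecs):
    # Concatenate each word's word vector with two position vectors that
    # encode its offsets to the head entity and the tail entity.
    seq = []
    for i, _ in enumerate(words):
        seq.append(word_vecs[words[i]]
                   + pos_vecs[i - head_idx]
                   + pos_vecs[i - tail_idx])
    return seq

# toy 2-dim word vectors and 1-dim position vectors (illustrative values)
word_vecs = {"Jobs": [0.1, 0.2], "founded": [0.3, 0.1], "Apple": [0.2, 0.4]}
pos_vecs = {-2: [-0.9], -1: [-0.5], 0: [0.0], 1: [0.5], 2: [0.9]}

# head entity at index 0 ("Jobs"), tail entity at index 2 ("Apple")
seq = input_layer(["Jobs", "founded", "Apple"], 0, 2, word_vecs, pos_vecs)
print(seq[0])  # -> [0.1, 0.2, 0.0, -0.9]
```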
Further, the first representation vectors of all the words are combined into a first representation vector sequence, which is then input to the core processing layer of the encoder of the independent semantic space. In this embodiment, the core processing layer may be a convolutional neural network structure, which converts the input first representation vector sequence into the first sentence vector representation of the target sentence in the independent semantic space corresponding to its language through convolution, pooling and nonlinear operations.
Wherein the convolution operation is defined as an operation between the first representation vector sequence x and the convolution matrix W. The convolution operation extracts local features through a sliding window of length m, and the feature of the target sentence at window position t in dimension i is defined as:

(h_t)_i = [W x_{t-m+1:t} + b]_i

wherein x_{t-m+1:t} is the concatenation of all the first representation vectors inside the tth window, W is the convolution kernel matrix, and b is the bias vector.
Further, pooling and a nonlinear operation are applied to the features of the target sentence. The primary function of pooling is to select the strongest signal value in each dimension over all local sampling outputs, so that a global semantic feature is finally obtained in summary. Since a function fitted by convolution and pooling alone is linear, nonlinear activation by the hyperbolic tangent function is required to obtain stronger representation capability. After the above processing steps, the ith dimension feature of the target sentence in the independent semantic space corresponding to language j can be defined as:

[x^j]_i = tanh(max_t (h_t)_i)

wherein language j is any one of the multiple languages and t ranges over all window positions. On this basis, the first sentence vector representation x^j of the target sentence in the independent semantic space corresponding to language j is obtained by combining all the dimensional features of the target sentence.
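The convolution, max-pooling and tanh pipeline described above can be sketched in pure Python with toy dimensions. W, b, m and the input vectors are illustrative assumptions, not values from the patent:

```python
import math

def cnn_encode(seq, W, b, m):
    # seq: list of d-dim vectors; W: matrix of shape (out_dim, m*d); b: bias.
    d = len(seq[0])
    padded = [[0.0] * d] * (m - 1) + seq  # left-pad so every position has a full window
    feats = []
    for t in range(len(seq)):
        window = sum(padded[t:t + m], [])  # concatenation of the m vectors in window t
        feats.append([sum(row[k] * window[k] for k in range(len(window))) + bi
                      for row, bi in zip(W, b)])
    # max-pool each dimension over all window positions, then tanh nonlinearity
    return [math.tanh(max(f[i] for f in feats)) for i in range(len(b))]

# 2 words of dimension 2, window length m = 2, 2 output feature dimensions
seq = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0, 0.0, 0.0],   # picks the 1st component of the window
     [0.0, 0.0, 0.0, 1.0]]   # picks the last component of the window
b = [0.0, 0.0]
print(cnn_encode(seq, W, b, 2))
```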
In addition, in this embodiment, the core processing layer may also be a recurrent neural network structure, specifically a bidirectional recurrent neural network structure. The bidirectional recurrent neural network encodes the semantic information of the target sentence from two directions:

h_i^f = RNN(x_i, h_{i-1}^f)

h_i^b = RNN(x_i, h_{i+1}^b)

wherein h_i^f and h_i^b respectively denote the sentence encoding vectors obtained in the forward and backward directions, and RNN denotes a recurrent neural network unit. The first sentence vector representation of the target sentence is formed by concatenating the final forward and backward vectors, and can be expressed as:

x = [h^f; h^b]
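The bidirectional recurrent encoding can be sketched in pure Python as follows. For brevity the two directions share the same toy weight matrices here, which the actual model need not do; all values are illustrative:

```python
import math

def rnn_step(x, h, Wx, Wh):
    # one vanilla RNN unit: h' = tanh(Wx·x + Wh·h)
    return [math.tanh(sum(Wx[i][k] * x[k] for k in range(len(x))) +
                      sum(Wh[i][k] * h[k] for k in range(len(h))))
            for i in range(len(h))]

def bi_rnn_encode(seq, Wx, Wh, hidden_dim):
    fwd = [0.0] * hidden_dim
    for x in seq:               # forward pass, left to right
        fwd = rnn_step(x, fwd, Wx, Wh)
    bwd = [0.0] * hidden_dim
    for x in reversed(seq):     # backward pass, right to left
        bwd = rnn_step(x, bwd, Wx, Wh)
    return fwd + bwd            # concatenate the two final states

# 2 input words of dimension 2, hidden size 1, toy weights
out = bi_rnn_encode([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]], [[0.0]], 1)
print(out)  # forward final state followed by backward final state
```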
in addition, in other embodiments, the specific structure of the encoder of the independent semantic space may be set according to actual requirements, and is not specifically limited herein.
The invention provides a multilingual entity relationship extraction method based on an adversarial training mechanism, in which all words in any target sentence are obtained and input to the input layer of a pre-trained encoder of the independent semantic space, yielding a first representation vector for each word; the first representation vectors of all the words are combined into a first representation vector sequence, which is input to the core processing layer of the encoder of the independent semantic space, yielding the first sentence vector representation of the target sentence in the independent semantic space corresponding to its language. The method effectively extracts the structural information unique to each language, facilitates the subsequent extraction of entity relationships in combination with the cross-language semantic information in multi-language text, and effectively improves the accuracy of entity relationship extraction results.
Based on any one of the embodiments above, a multilingual entity relationship extraction method based on an adversarial training mechanism is provided, in which the second sentence vector representation of each target sentence in the consistent semantic space is constructed as follows: for any target sentence, all words in the target sentence are obtained and input to the input layer of a pre-trained encoder of the consistent semantic space, yielding a second representation vector for each word; the second representation vectors of all the words are combined into a second representation vector sequence, which is input to the core processing layer of the encoder of the consistent semantic space, yielding the second sentence vector representation of the target sentence in the consistent semantic space; wherein each second representation vector comprises a word vector and a position vector.
Specifically, on the basis of the above technical solution, the specific implementation process for constructing the second sentence vector representation of each target sentence in the consistent semantic space is as follows:
for any target sentence in any language, all words in the target sentence are obtained and then input to a pre-trained encoder of the consistent semantic space, where this encoder has been pre-trained in the consistent semantic space based on an adversarial training mechanism. In this embodiment, the network structure of the encoder of the consistent semantic space includes an input layer and a core processing layer. First, all words are input to the input layer of the pre-trained encoder, which converts each word into its corresponding second representation vector. Each second representation vector is composed of a word vector and a position vector: the word vector describes the grammatical and semantic information of the word, while the position vector describes the word's position information in the sentence, defined as a vector representation of the relative position offsets between the word and the head and tail entities of the sentence. The word vector and position vector of each word are concatenated to obtain the second representation vector corresponding to that word.
Further, the second representation vectors of all the words are combined into a second representation vector sequence, which is then input to the core processing layer of the encoder of the consistent semantic space. In this embodiment, the core processing layer may be a convolutional neural network structure, which converts the input second representation vector sequence into the second sentence vector representation of the target sentence in the consistent semantic space through convolution, pooling and nonlinear operations. The specific implementation process is the same as the processing steps of the core processing layer of the encoder of the independent semantic space; reference may be made to the corresponding method steps in the above method embodiments, which are not repeated here.
In addition, in this embodiment, the core processing layer may also be a recurrent neural network structure, specifically a bidirectional recurrent neural network structure. The bidirectional recurrent neural network encodes the semantic information of the target sentence from two directions, obtaining forward and backward sentence encoding vectors respectively; finally, the forward and backward sentence encoding vectors are concatenated to obtain the second sentence vector representation of the target sentence in the consistent semantic space. The specific implementation process is the same as the processing steps of the core processing layer of the encoder of the independent semantic space; reference may be made to the corresponding method steps in the above method embodiments, which are not repeated here.
The invention provides a multi-language entity relation extraction method based on an adversarial training mechanism, in which all words in any target sentence are obtained and input to the input layer of a pre-trained encoder of the consistent semantic space, yielding a second representation vector for each word; the second representation vectors of all the words are combined into a second representation vector sequence, which is input to the core processing layer of the encoder of the consistent semantic space, yielding the second sentence vector representation of the target sentence in the consistent semantic space. The method effectively extracts the cross-language semantic information in multi-language text, facilitates the subsequent extraction of entity relationships in combination with the structural information unique to each language, and effectively improves the accuracy of entity relationship extraction results.
Based on any of the embodiments above, a multi-language entity relationship extraction method based on an adversarial training mechanism is provided, in which the first global vector representation of the target entity pair with respect to a relationship type in the independent semantic space corresponding to a language is obtained according to the first relationship vector representation and the first sentence vector representations of all target sentences in that language. The specific calculation formulas are as follows:

α_i^j = exp(r_j · x_i^j) / Σ_{k=1}^{m} exp(r_j · x_k^j)

s_j = Σ_{i=1}^{m} α_i^j x_i^j

wherein s_j is the first global vector representation of the target entity pair with respect to relationship type r in the independent semantic space corresponding to language j; α_i^j is the attention weight of target sentence i with respect to relationship type r in the independent semantic space corresponding to language j; r_j is the first relationship vector representation of relationship type r in the independent semantic space corresponding to language j; x_i^j is the first sentence vector representation of target sentence i in the independent semantic space corresponding to language j; k is any one target sentence in language j; and m is the total number of target sentences in language j.
Specifically, on the basis of the above technical solution, a first global vector representation of the target entity pair in the independent semantic space corresponding to the language with respect to the relationship type is obtained according to the first relationship vector representation and the first sentence vector representation of all target sentences in the language, and the specific implementation process is as follows:
for any one relation type in a plurality of preset relation types, in an independent semantic space corresponding to a certain language, first, a first relation vector representation of the relation type in the independent semantic space corresponding to the language is constructed. At the same time, a first sentence vector representation of all target sentences in the language is obtained. On the basis, for a certain target sentence in the language, according to the first relation vector representation and the first sentence vector representation of the target sentence, and in combination with the first sentence vector representations of all other target sentences in the language, the attention weight of the target sentence in the independent semantic space corresponding to the language relative to the relation type can be obtained. The specific calculation formula is as follows:
α_i^j = exp(r_j · x_i^j) / Σ_{k=1}^{m} exp(r_j · x_k^j)

wherein α_i^j is the attention weight of target sentence i with respect to relationship type r in the independent semantic space corresponding to language j, r_j is the first relationship vector representation of relationship type r in the independent semantic space corresponding to language j, x_i^j is the first sentence vector representation of target sentence i in the independent semantic space corresponding to language j, k is any one target sentence in language j, and m is the total number of target sentences in language j. In the above formula, i denotes one specific target sentence among all target sentences in language j, while k ranges over all target sentences in language j.
On the basis, the attention weights of all target sentences in the language and the first sentence vector representations of all the target sentences are combined, so that the first global vector representation of the target entity in the independent semantic space corresponding to the language relative to the relationship type can be obtained. The specific calculation formula is as follows:
s_j = Σ_{i=1}^{m} α_i^j x_i^j

wherein s_j is the first global vector representation of the target entity pair with respect to relationship type r in the independent semantic space corresponding to language j.
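The attention-weighted aggregation described above can be sketched in pure Python as follows. The relation vector and sentence vectors are toy values; the dot-product scoring and softmax normalization follow the scheme described in the text:

```python
import math

def selective_attention(r, sent_vecs):
    # alpha_i = exp(r · x_i) / sum_k exp(r · x_k)
    scores = [sum(ri * xi for ri, xi in zip(r, x)) for x in sent_vecs]
    z = sum(math.exp(v) for v in scores)
    alphas = [math.exp(v) / z for v in scores]
    # s = sum_i alpha_i * x_i
    s = [sum(a * x[d] for a, x in zip(alphas, sent_vecs)) for d in range(len(r))]
    return s, alphas

r = [1.0, 0.0]                      # relation vector (illustrative)
sents = [[2.0, 0.0], [0.0, 2.0]]    # two sentence vectors (illustrative)
s, alphas = selective_attention(r, sents)
print(alphas)  # the sentence better aligned with r receives the larger weight
```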
Based on any of the embodiments above, a multi-language entity relationship extraction method based on an adversarial training mechanism is provided, in which the second global vector representation of the target entity pair with respect to a relationship type in the consistent semantic space is obtained according to the second relationship vector representation and the second sentence vector representations of all target sentences in all languages. The specific calculation formulas are as follows:

ᾱ_i^j = exp(r̄ · x̄_i^j) / Σ_{l=1}^{n} Σ_{k=1}^{m} exp(r̄ · x̄_k^l)

s̄ = Σ_{j=1}^{n} Σ_{i=1}^{m} ᾱ_i^j x̄_i^j

wherein s̄ is the second global vector representation of the target entity pair with respect to relationship type r in the consistent semantic space; ᾱ_i^j is the attention weight of target sentence i in language j with respect to relationship type r in the consistent semantic space; r̄ is the second relationship vector representation of relationship type r in the consistent semantic space; x̄_i^j is the second sentence vector representation of target sentence i in language j in the consistent semantic space; l is any one of the multiple languages; k is any one target sentence in language l; m is the total number of target sentences in language l; and n is the total number of languages.
Specifically, on the basis of the above technical solution, a second global vector representation of the target entity pair in the consistent semantic space with respect to the relationship type is obtained according to the second relationship vector representation and the second sentence vector representations of all target sentences in all languages, and the specific implementation process is as follows:
for any one relation type in a plurality of preset relation types, a second relation vector representation of the relation type in a consistency semantic space is constructed firstly. At the same time, second sentence vector representations of all target sentences in all languages are obtained. On the basis, for a certain target sentence in a certain language, the attention weight of the target sentence can be obtained according to the second relation vector representation and the second sentence vector representation of the target sentence and by combining the second sentence vector representations of all other target sentences in all languages. The specific calculation formula is as follows:
ᾱ_i^j = exp(r̄ · x̄_i^j) / Σ_{l=1}^{n} Σ_{k=1}^{m} exp(r̄ · x̄_k^l)

wherein ᾱ_i^j is the attention weight of target sentence i in language j with respect to relationship type r in the consistent semantic space, r̄ is the second relationship vector representation of relationship type r in the consistent semantic space, x̄_i^j is the second sentence vector representation of target sentence i in language j in the consistent semantic space, l is any one of the multiple languages, k is any one target sentence in language l, m is the total number of target sentences in language l, and n is the total number of languages.
It should be noted that, in the above calculation formula, j represents a specific language, and i represents a specific target sentence of all target sentences in the language j; l represents any one of a plurality of languages, and k represents any one target sentence in the language l.
On the basis, the attention weights of all target sentences in all languages and the second sentence vector representations of all the target sentences are combined, so that the second global vector representation of the target entity in the consistency semantic space relative to the relationship type can be obtained. The specific calculation formula is as follows:
s̄ = Σ_{j=1}^{n} Σ_{i=1}^{m} ᾱ_i^j x̄_i^j

wherein s̄ is the second global vector representation of the target entity pair with respect to relationship type r in the consistent semantic space.
Based on any of the embodiments above, a multi-language entity relationship extraction method based on an adversarial training mechanism is provided, in which the global probability corresponding to a relationship type is obtained based on the second global vector representation and the first global vector representations of the target entity pair with respect to the relationship type in the independent semantic space of each language, specifically: a first conditional probability of the relationship type in the consistent semantic space is obtained from the second global vector representation using a normalization function; a second conditional probability of the relationship type in the independent semantic space corresponding to each language is obtained, using a normalization function, from the first global vector representation of the target entity pair with respect to the relationship type in that space; and the global probability corresponding to the relationship type is obtained from the first conditional probability and all the second conditional probabilities.
Specifically, on the basis of the above technical solution, the global probability corresponding to the relationship type is obtained based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language, and the specific implementation process is as follows:
and obtaining a first conditional probability of the relationship type in the consistency semantic space based on the second global vector representation by utilizing a normalization function, wherein a specific calculation formula is as follows:
P(r|s̄) = softmax[R̄s̄ + d̄]

wherein P(r|s̄) is the first conditional probability of relationship type r in the consistent semantic space, softmax is the normalization function, s̄ is the second global vector representation, R̄ is the matrix formed by the second relationship vector representations of all relationship types in the consistent semantic space, and d̄ is the bias vector in the consistent semantic space.
Meanwhile, a second conditional probability of the relation type in the independent semantic space corresponding to each language is obtained by using a normalization function based on a first global vector of the target entity pair in the independent semantic space corresponding to each language relative to the relation type, and the specific calculation formula is as follows:
P(r|s_j) = softmax[R_j s_j + d_j]

wherein P(r|s_j) is the second conditional probability of relationship type r in the independent semantic space corresponding to language j, softmax is the normalization function, s_j is the first global vector representation of the target entity pair with respect to relationship type r in the independent semantic space corresponding to language j, R_j is the matrix formed by the first relationship vector representations of all relationship types in the independent semantic space corresponding to language j, and d_j is the bias vector in the independent semantic space corresponding to language j.
Finally, obtaining the global probability corresponding to the relationship type according to the first conditional probability and all the second conditional probabilities, wherein a specific calculation formula is as follows:
P(r|T) = P(r|s̄) · Π_{j=1}^{n} P(r|s_j)

wherein P(r|T) is the global probability corresponding to relationship type r, language j is any one of the multiple languages, and n is the total number of languages.
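The combination of the conditional probabilities can be sketched in pure Python. Note that the exact combination rule is not fully legible in this excerpt; a product of the consistent-space probability and the per-language probabilities is assumed here, and all logits are toy values:

```python
import math

def softmax(logits):
    m = max(logits)                         # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

def global_probability(consistent_logits, per_language_logits, r):
    # Assumed combination: multiply the consistent-space conditional
    # probability of relation r with every language's conditional probability.
    p = softmax(consistent_logits)[r]
    for logits in per_language_logits:
        p *= softmax(logits)[r]
    return p

# two relation types; one consistent space and two language spaces (toy logits)
print(global_probability([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], 0))  # -> 0.125
```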
The invention provides a multi-language entity relation extraction method based on an adversarial training mechanism, which utilizes a normalization function to obtain a first conditional probability of a relation type in a consistent semantic space based on second global vector representation; obtaining a second conditional probability of the relation type in the independent semantic space corresponding to each language based on a first global vector of the target entity in the independent semantic space corresponding to each language relative to the relation type by utilizing a normalization function; and obtaining the global probability corresponding to the relationship type according to the first conditional probability and all the second conditional probabilities. The method combines the semantic information of cross-languages in the multi-language text and the unique structure information of each language to extract the entity relationship, and can effectively improve the accuracy of the entity relationship extraction result.
Based on any of the above embodiments, a multi-language entity relationship extraction method based on an adversarial training mechanism is provided, in which, before the second sentence vector representation of each target sentence in the consistent semantic space is constructed, the following steps are performed: an encoder network and a discriminator network are initialized, and round-by-round adversarial training is performed on them, so that the output of the encoder network erases the characteristics of each individual language and confuses the discriminator network, thereby training a consistent-semantic-space encoder that better encodes cross-language information. During the round-by-round adversarial training, the parameters of the encoder network are optimized by a stochastic gradient descent algorithm so as to minimize the probability that the discrimination result output by the discriminator network is correct, and the parameters of the discriminator network are optimized by a stochastic gradient descent algorithm so as to maximize the probability that the discrimination result output by the discriminator network is correct.
Specifically, in this embodiment, before the second sentence vector representation of each target sentence in the consistent semantic space is constructed, the encoder of the consistent semantic space needs to be trained in advance, so that target sentences in the various languages can be encoded into the consistent semantic space to extract the consistent information shared among the languages. In order to eliminate the mode differences among the sentence vector representations of target sentences of different languages in the consistent semantic space, an adversarial training mechanism is adopted to train the encoder of the consistent semantic space. The specific training process is as follows:
A multi-layer perceptron network is used as the discriminator, which determines from which language a given sentence vector representation x̄ is derived. For a given sentence vector representation x̄, the discrimination function outputs a probability distribution over the languages it may belong to:

D(x̄) = softmax(MLP(x̄))

wherein MLP denotes the output of the discriminator's multi-layer perceptron network.
During the training process, the discriminator needs to maximize the probability of discriminating correctly under the vector distribution given by the encoder, while the encoder needs to confuse the discriminator by changing its own encoding distribution in an attempt to minimize the probability that the discriminator discriminates correctly. The two are trained adversarially round by round, so that the distribution encoded by the encoder erases the characteristics of each individual language and confuses the discriminator, thereby better extracting cross-language semantic information.
In the training process, we achieve the countertraining by alternately minimizing two loss functions of the encoder and the discriminator, and specifically, for a consistent semantic space encoder, our optimization objective is as follows:
Figure BDA0001742812340000202
wherein
Figure BDA0001742812340000203
As a function of the losses of the encoder,
Figure BDA0001742812340000204
all parameters of the encoder for consistent semantic space, l being any one of a plurality of languages, TlFor the set of all sentences in the language l,
Figure BDA0001742812340000205
and (3) an encoding function of an encoder for expressing the consistent semantic space of the j language, namely, inputting an input vector in the language to obtain a semantic vector encoded into the consistent semantic space. It can be seen that the intuitive interpretation of this optimization process is byThe parameters of the coincidence encoder are changed to minimize the probability that the discrimination function discriminates correctly.
For the discriminator, the optimization objective is:

θ_D = argmin_{θ_D} L_D, where L_D = -Σ_{l=1}^{n} Σ_{x∈T_l} log [D(Ē_l(x))]_l

wherein L_D is the loss function of the discriminator and θ_D denotes all parameters of the discriminator network. The intuitive interpretation of this optimization process is that the parameters of the discriminator are changed so as to maximize the probability of discriminating correctly.
The training method adopted throughout the adversarial training process is stochastic gradient descent; the adversarial training effect is achieved by alternately performing gradient descent on the two loss functions, round by round, at a fixed ratio. When the discriminator and the encoder reach equilibrium in training, the sentence vector representations from different languages no longer exhibit clearly distinguishable mode differences in the consistent semantic space, so the ability to extract cross-language consistent information is effectively improved, which in turn improves the performance of multi-language relation extraction.
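The alternating optimization can be made concrete with a self-contained one-dimensional toy in pure Python: "language" 0 yields raw scalar features around -1 and "language" 1 around +1; the encoder adds a learned per-language offset, and a logistic discriminator guesses the language from the encoded value. This is only an illustration of the alternating scheme, not the patent's networks; all values and the scalar setup are invented:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = {0: [-1.2, -0.8, -1.0], 1: [0.9, 1.1, 1.0]}  # raw features per "language"
theta = {0: 0.0, 1: 0.0}      # encoder parameters (per-language offsets)
disc = {"w": 1.0, "c": 0.0}   # discriminator parameters (logistic model)
LR = 0.1

def discriminator_step():
    # gradient descent on the discriminator's negative log-likelihood:
    # maximize the probability of naming the correct language
    gw = gc = 0.0
    for lang, xs in data.items():
        for x in xs:
            p = sigmoid(disc["w"] * (x + theta[lang]) + disc["c"])
            gw += (p - lang) * (x + theta[lang])
            gc += (p - lang)
    disc["w"] -= LR * gw
    disc["c"] -= LR * gc

def encoder_step():
    # gradient ascent on the same loss: minimize the discriminator's
    # probability of being correct by shifting the encoded distributions
    for lang, xs in data.items():
        g = sum((sigmoid(disc["w"] * (x + theta[lang]) + disc["c"]) - lang)
                * disc["w"] for x in xs)
        theta[lang] += LR * g

def encoded_gap():
    means = [sum(x + theta[l] for x in xs) / len(xs) for l, xs in data.items()]
    return abs(means[1] - means[0])

before = encoded_gap()
discriminator_step()   # one adversarial round: discriminator adapts first,
encoder_step()         # then the encoder counteracts it
print(encoded_gap() < before)  # -> True: the encoded language gap shrinks
```

After one round the encoder has already moved the two encoded distributions toward each other; repeating the alternation drives them toward indistinguishability, mirroring the equilibrium described above.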
The invention provides a multi-language entity relation extraction method based on an adversarial training mechanism. Before the second sentence vector representation of each target sentence in the consistent semantic space is constructed, an encoder network and a discriminator network are initialized, and round-by-round adversarial training is performed on them, so that the output of the encoder network erases the characteristics of each individual language and confuses the discriminator network, yielding a trained encoder of the consistent semantic space. During the round-by-round adversarial training, the parameters of the encoder network are optimized by a stochastic gradient descent algorithm so as to minimize the probability that the discrimination result output by the discriminator network is correct, while the parameters of the discriminator network are optimized by a stochastic gradient descent algorithm so as to maximize that probability. Because the encoder of the consistent semantic space is obtained by adversarial training, target sentences in the various languages can be encoded into the consistent semantic space to extract the consistent information shared among the languages, the mode differences among the sentence vector representations of target sentences of different languages in the consistent semantic space are eliminated, the ability to extract cross-language consistent information is effectively improved, and the performance of multi-language relation extraction is further improved.
In order to better understand the processing steps in the above method embodiments, the following examples are specifically described:
FIG. 2 is a schematic diagram of the multi-language entity relation extraction method based on an adversarial training mechanism according to an embodiment of the present invention. As shown in FIG. 2, the diagram simulates the processing procedure of the method in a two-language scenario, and the overall processing proceeds from bottom to top. First, the words of the target sentences in the two languages are converted into representation vectors. After encoding by the encoder networks, the sentence vector representations x1 and x2 of the sentences in the independent semantic space of each language are obtained, together with the sentence vector representations x̃1 and x̃2 in the consistent semantic space; here, the encoder EC of the consistent semantic space obtains the ability to extract cross-language information through adversarial training against a discriminator D. Then, the independent attention mechanisms of the two languages are respectively used to obtain the global vector representation s1 of the target entity pair, relative to each relation type, in the independent semantic space corresponding to the first language, and the global vector representation s2 in the independent semantic space corresponding to the second language. Meanwhile, the consistent attention mechanism shared between the two languages is used to obtain the global vector representation s̃ of the target entity pair, relative to each relation type, in the consistent semantic space. Finally, the global vector representations s1, s2 and s̃ are combined to perform the relation prediction of the target entity pair.
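The attention-and-combine computation described in this embodiment can be illustrated with a small self-contained sketch. The vectors, relation embeddings, and dimensions below are made-up toy values, not data from the patent; the sketch only mirrors the structure of the computation: a softmax attention over sentence vectors in each space, followed by per-space softmax probabilities over relation types that are combined into a global probability (the product used here is one plausible combination; the embodiments only state that the per-space probabilities are combined).

```python
import math

def softmax(scores):
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def global_vector(sentence_vecs, relation_vec):
    """Attention-weighted sum of sentence vectors for one relation type."""
    weights = softmax([dot(relation_vec, x) for x in sentence_vecs])
    dim = len(sentence_vecs[0])
    return [sum(w * x[d] for w, x in zip(weights, sentence_vecs))
            for d in range(dim)]

def relation_probs(global_vecs_per_relation, relation_vecs):
    """Softmax over relation types: score each type's global vector
    against that type's relation embedding."""
    scores = [dot(r, s) for r, s in
              zip(relation_vecs, global_vecs_per_relation)]
    return softmax(scores)

# Toy setup: 2 relation types, 2-dimensional vectors.
relations_lang1 = [[1.0, 0.0], [0.0, 1.0]]   # per-language relation embeddings
relations_cons  = [[0.9, 0.1], [0.1, 0.9]]   # consistent-space relation embeddings
sents_lang1 = [[0.8, 0.1], [0.7, 0.2]]       # sentence vectors, independent space
sents_cons  = [[0.9, 0.0], [0.6, 0.1], [0.8, 0.2]]  # all languages, consistent space

s1 = [global_vector(sents_lang1, r) for r in relations_lang1]
sc = [global_vector(sents_cons, r) for r in relations_cons]

p1 = relation_probs(s1, relations_lang1)     # probabilities, independent space
pc = relation_probs(sc, relations_cons)      # probabilities, consistent space

# Combine per-space probabilities and pick the best relation type.
global_p = [a * b for a, b in zip(p1, pc)]
predicted = max(range(len(global_p)), key=lambda i: global_p[i])
print(predicted, [round(p, 3) for p in global_p])
```

In this toy setup every sentence vector points toward the first relation embedding, so the combined global probability selects relation type 0.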
Fig. 3 is a schematic diagram of the overall structure of a multi-language entity relation extraction system based on an adversarial training mechanism according to an embodiment of the present invention. As shown in fig. 3, based on any of the above embodiments, a multi-language entity relation extraction system based on an adversarial training mechanism is provided, which includes:
a sentence coding module 1, configured to, for any one of multiple languages, acquire a preset number of sentences related to a target entity pair in the language as target sentences, and construct a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language and a second sentence vector representation in the consistent semantic space;
a first attention mechanism module 2, configured to, for any relation type among a preset plurality of relation types, construct a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtain, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language;
a second attention mechanism module 3, configured to construct a second relation vector representation of the relation type in the consistent semantic space, and obtain, from the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space;
and a relation extraction module 4, configured to obtain the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language, select the maximum probability among the global probabilities corresponding to the relation types, and predict the relation of the target entity pair according to the relation type corresponding to the maximum probability.
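For the sentence coding module, the input-layer representation (each word mapped to a representation vector comprising a word vector and a position vector) can be sketched as follows. This is a toy illustration: the dimensions, the random lookup tables, and the use of two relative-position vectors (one per target entity, a common convention in relation extraction) are assumptions of the sketch, not details fixed by the embodiments.

```python
import random

random.seed(0)
WORD_DIM, POS_DIM, MAX_DIST = 4, 2, 10

word_table = {}   # lazily filled toy word-embedding lookup
pos_table = {d: [random.uniform(-0.1, 0.1) for _ in range(POS_DIM)]
             for d in range(-MAX_DIST, MAX_DIST + 1)}

def word_vec(word):
    """Toy word-embedding lookup (random vectors stand in for trained ones)."""
    if word not in word_table:
        word_table[word] = [random.uniform(-0.1, 0.1) for _ in range(WORD_DIM)]
    return word_table[word]

def clip(d):
    """Clip a relative distance into the position-embedding range."""
    return max(-MAX_DIST, min(MAX_DIST, d))

def input_representation(tokens, head_idx, tail_idx):
    """One representation vector per word: the word vector concatenated
    with two position vectors (relative distances to the head and tail
    entities of the target entity pair)."""
    seq = []
    for i, tok in enumerate(tokens):
        vec = (word_vec(tok)
               + pos_table[clip(i - head_idx)]
               + pos_table[clip(i - tail_idx)])
        seq.append(vec)
    return seq

sentence = "Tsinghua University is located in Beijing".split()
seq = input_representation(sentence, head_idx=0, tail_idx=5)
print(len(seq), len(seq[0]))
```

The resulting sequence of representation vectors is what the encoder's core processing layer consumes to produce the sentence vector representation.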
Specifically, the present invention provides a multi-language entity relation extraction system based on an adversarial training mechanism, which includes a sentence coding module 1, a first attention mechanism module 2, a second attention mechanism module 3, and a relation extraction module 4. The method in any one of the above method embodiments is implemented through the cooperation of these modules; for the specific implementation process, reference may be made to the above method embodiments, which are not described herein again.
The multi-language entity relation extraction system based on an adversarial training mechanism provided by the invention encodes the target sentences in each language related to the target entity pair into the independent semantic space corresponding to that language and into the consistent semantic space shared by all languages, thereby obtaining both the language-specific information and the cross-language consistent information contained in the target sentences. The independent attention mechanism of each language and the consistent attention mechanism shared among the languages are then used to measure the attention weight of each target sentence relative to each relation type; the attention weights of all target sentences relative to each relation type are combined to obtain the global probability corresponding to each relation type, the maximum probability is selected among the global probabilities corresponding to the relation types, and the relation of the target entity pair is finally predicted according to the relation type corresponding to the maximum probability. The system can effectively extract both the cross-language semantic information and the language-specific structural information in multi-language text, and can deeply exploit the complementarity among multiple languages. In practical applications, it improves the performance of the relation extraction task in multi-language scenarios, is conducive to improving the quality of automatically constructed knowledge graphs, and has broad application prospects.
Fig. 4 shows a block diagram of an electronic device according to an embodiment of the present invention. Referring to fig. 4, the electronic device includes: a processor (processor) 41, a memory (memory) 42, and a bus 43, wherein the processor 41 and the memory 42 communicate with each other through the bus 43. The processor 41 is configured to call the program instructions in the memory 42 to execute the method provided by any of the above method embodiments, for example, including: for any one of a plurality of languages, acquiring a preset number of sentences related to the target entity pair in the language as target sentences, and constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language and a second sentence vector representation in the consistent semantic space; for any relation type among a preset plurality of relation types, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language; constructing a second relation vector representation of the relation type in the consistent semantic space, and obtaining, from the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space; and obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language, selecting the maximum probability among the global probabilities corresponding to the relation types, and predicting the relation of the target entity pair according to the relation type corresponding to the maximum probability.
The present embodiment discloses a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by any of the above method embodiments, for example, including: for any one of a plurality of languages, acquiring a preset number of sentences related to the target entity pair in the language as target sentences, and constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language and a second sentence vector representation in the consistent semantic space; for any relation type among a preset plurality of relation types, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language; constructing a second relation vector representation of the relation type in the consistent semantic space, and obtaining, from the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space; and obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language, selecting the maximum probability among the global probabilities corresponding to the relation types, and predicting the relation of the target entity pair according to the relation type corresponding to the maximum probability.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method provided by any of the above method embodiments, for example, including: for any one of a plurality of languages, acquiring a preset number of sentences related to the target entity pair in the language as target sentences, and constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language and a second sentence vector representation in the consistent semantic space; for any relation type among a preset plurality of relation types, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language; constructing a second relation vector representation of the relation type in the consistent semantic space, and obtaining, from the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space; and obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language, selecting the maximum probability among the global probabilities corresponding to the relation types, and predicting the relation of the target entity pair according to the relation type corresponding to the maximum probability.
Those of ordinary skill in the art will understand that all or part of the steps for implementing the above method embodiments may be implemented by program instructions executed on related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments; and the aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only preferred embodiments of the present application and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A multi-language entity relation extraction method based on an adversarial training mechanism is characterized by comprising the following steps:
for any one of a plurality of languages, acquiring a preset number of sentences related to target entity pairs in the language as target sentences, and constructing a first sentence vector representation of each target sentence in an independent semantic space corresponding to the language and a second sentence vector representation in a consistent semantic space;
presetting a plurality of relation types, and, for any relation type, constructing a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtaining, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language;
constructing a second relation vector representation of the relation type in a consistent semantic space, and obtaining a second global vector representation of the target entity in the consistent semantic space relative to the relation type according to the second relation vector representation and second sentence vector representations of all target sentences in all languages;
and obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representation of the target entity pair in the independent semantic space corresponding to each language relative to the relation type, selecting the maximum probability from the global probabilities corresponding to each relation type, and predicting the relation between the target entity pair according to the relation type corresponding to the maximum probability.
2. The method according to claim 1, wherein constructing a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language is specifically:
for any target sentence, all words in the target sentence are obtained, all words are input to an input layer of an encoder of a pre-trained independent semantic space, and a first expression vector corresponding to each word is obtained;
combining the first expression vectors corresponding to all the words to obtain a first expression vector sequence, inputting the first expression vector sequence into a core processing layer of an encoder of an independent semantic space, and obtaining a first sentence vector representation of the target sentence in the independent semantic space corresponding to the language;
wherein the first representation vector comprises a word vector and a position vector.
3. The method according to claim 1, wherein constructing a second sentence vector representation of each target sentence in the consistent semantic space specifically comprises:
for any target sentence, all words in the target sentence are obtained, all words are input to an input layer of an encoder of a pre-trained consistent semantic space, and a second expression vector corresponding to each word is obtained;
combining the second expression vectors corresponding to all the words to obtain a second expression vector sequence, inputting the second expression vector sequence into a core processing layer of an encoder of a consistent semantic space, and obtaining a second sentence vector representation of the target sentence in the consistent semantic space;
wherein the second representation vector comprises a word vector and a position vector.
4. The method according to claim 1, wherein the first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language is obtained from the first relation vector representation and the first sentence vector representations of all target sentences in the language by the formula:

$$s_j = \sum_{i=1}^{m} \alpha_i^j x_i^j, \qquad \alpha_i^j = \frac{\exp(r_j \cdot x_i^j)}{\sum_{k=1}^{m} \exp(r_j \cdot x_k^j)}$$

wherein $s_j$ is the first global vector representation of the target entity pair, relative to relation type $r$, in the independent semantic space corresponding to language $j$; $\alpha_i^j$ is the attention weight of target sentence $i$ relative to relation type $r$ in the independent semantic space corresponding to language $j$; $r_j$ is the first relation vector representation of relation type $r$ in the independent semantic space corresponding to language $j$; $x_i^j$ is the first sentence vector representation of target sentence $i$ in the independent semantic space corresponding to language $j$; $k$ is any one target sentence in language $j$; and $m$ is the total number of target sentences in language $j$.
5. The method according to claim 1, wherein the second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space is obtained from the second relation vector representation and the second sentence vector representations of all target sentences in all languages by the formula:

$$\tilde{s} = \sum_{j=1}^{n} \sum_{i=1}^{m} \tilde{\alpha}_i^j \tilde{x}_i^j, \qquad \tilde{\alpha}_i^j = \frac{\exp(\tilde{r} \cdot \tilde{x}_i^j)}{\sum_{l=1}^{n} \sum_{k=1}^{m} \exp(\tilde{r} \cdot \tilde{x}_k^l)}$$

wherein $\tilde{s}$ is the second global vector representation of the target entity pair, relative to relation type $r$, in the consistent semantic space; $\tilde{\alpha}_i^j$ is the attention weight of target sentence $i$ in language $j$, relative to relation type $r$, in the consistent semantic space; $\tilde{r}$ is the second relation vector representation of relation type $r$ in the consistent semantic space; $\tilde{x}_i^j$ is the second sentence vector representation of target sentence $i$ in language $j$ in the consistent semantic space; $l$ is any one of the multiple languages; $k$ is any one target sentence in language $l$; $m$ is the total number of target sentences in language $l$; and $n$ is the total number of languages.
6. The method according to claim 1, wherein obtaining the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language specifically comprises:
obtaining a first conditional probability of the relation type in the consistent semantic space from the second global vector representation by using a normalization function;
obtaining a second conditional probability of the relation type in the independent semantic space corresponding to each language from the first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to that language by using a normalization function;
and obtaining the global probability corresponding to the relation type from the first conditional probability and all the second conditional probabilities.
7. The method of claim 1, wherein constructing a second sentence vector representation of each target sentence in the consistent semantic space further comprises:
initializing an encoder network and a discriminator network, and performing round-by-round adversarial training on the encoder network and the discriminator network, so that the output of the encoder network erases the characteristics of each language in order to confuse the discriminator network, thereby obtaining a trained encoder of the consistent semantic space;
wherein, during the round-by-round adversarial training, the parameters of the encoder network are optimized by a stochastic gradient descent algorithm so as to minimize the probability that the discrimination result output by the discriminator network is correct; and the parameters of the discriminator network are optimized by a stochastic gradient descent algorithm so as to maximize the probability that the discrimination result output by the discriminator network is correct.
8. A multi-language entity relationship extraction system based on an adversarial training mechanism, comprising:
a sentence coding module, configured to, for any one of multiple languages, acquire a preset number of sentences related to a target entity pair in the language as target sentences, and construct a first sentence vector representation of each target sentence in the independent semantic space corresponding to the language and a second sentence vector representation in the consistent semantic space;
a first attention mechanism module, configured to, for any relation type among a preset plurality of relation types, construct a first relation vector representation of the relation type in the independent semantic space corresponding to the language, and obtain, from the first relation vector representation and the first sentence vector representations of all target sentences in the language, a first global vector representation of the target entity pair, relative to the relation type, in the independent semantic space corresponding to the language;
a second attention mechanism module, configured to construct a second relation vector representation of the relation type in the consistent semantic space, and obtain, from the second relation vector representation and the second sentence vector representations of all target sentences in all languages, a second global vector representation of the target entity pair, relative to the relation type, in the consistent semantic space;
and a relation extraction module, configured to obtain the global probability corresponding to the relation type based on the second global vector representation and the first global vector representations of the target entity pair, relative to the relation type, in the independent semantic space corresponding to each language, select the maximum probability among the global probabilities corresponding to the relation types, and predict the relation of the target entity pair according to the relation type corresponding to the maximum probability.
9. An electronic device, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 7.
10. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 7.
CN201810827459.7A 2018-07-25 2018-07-25 Multi-language entity relation extraction method and system based on confrontation training mechanism Active CN110852066B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810827459.7A CN110852066B (en) 2018-07-25 2018-07-25 Multi-language entity relation extraction method and system based on confrontation training mechanism


Publications (2)

Publication Number Publication Date
CN110852066A true CN110852066A (en) 2020-02-28
CN110852066B CN110852066B (en) 2021-06-01

Family

ID=69594332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810827459.7A Active CN110852066B (en) 2018-07-25 2018-07-25 Multi-language entity relation extraction method and system based on confrontation training mechanism

Country Status (1)

Country Link
CN (1) CN110852066B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476035A (en) * 2020-05-06 2020-07-31 中国人民解放军国防科技大学 Chinese open relation prediction method and device, computer equipment and storage medium
CN111597341A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Document level relation extraction method, device, equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010231761A (en) * 2009-03-18 2010-10-14 Nec (China) Co Ltd Multi-language object hierarchy extraction method and system from multi-language web site
CN105808525A (en) * 2016-03-29 2016-07-27 国家计算机网络与信息安全管理中心 Domain concept hypernym-hyponym relation extraction method based on similar concept pairs
CN106055623A (en) * 2016-05-26 2016-10-26 《中国学术期刊(光盘版)》电子杂志社有限公司 Cross-language recommendation method and system
CN106055675A (en) * 2016-06-06 2016-10-26 杭州量知数据科技有限公司 Relation extracting method based on convolution neural network and distance supervision
CN106354710A (en) * 2016-08-18 2017-01-25 清华大学 Neural network relation extracting method
CN107273349A (en) * 2017-05-09 2017-10-20 清华大学 A kind of entity relation extraction method and server based on multilingual
CN107368475A (en) * 2017-07-18 2017-11-21 中译语通科技(北京)有限公司 A kind of machine translation method and system based on generation confrontation neutral net
CN107943784A (en) * 2017-11-02 2018-04-20 南华大学 Relation extraction method based on generation confrontation network


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MICHIHIRO YASUNAGA et al.: "Robust Multilingual Part-of-Speech Tagging via Adversarial Training", Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies *
FENG Chong et al.: "Causal relation extraction with adversarial learning" (融合对抗学习的因果关系抽取), Acta Automatica Sinica (自动化学报) *
HU Yanan: "Research on cross-lingual entity relation extraction" (跨语言实体关系抽取研究), China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476035A (en) * 2020-05-06 2020-07-31 中国人民解放军国防科技大学 Chinese open relation prediction method and device, computer equipment and storage medium
CN111476035B (en) * 2020-05-06 2023-09-05 中国人民解放军国防科技大学 Chinese open relation prediction method, device, computer equipment and storage medium
CN111597341A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Document level relation extraction method, device, equipment and storage medium
CN111597341B (en) * 2020-05-22 2024-01-26 北京慧闻科技(集团)有限公司 Document-level relation extraction method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110852066B (en) 2021-06-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant