CN114254093A - Multi-space knowledge enhanced knowledge graph question-answering method and system - Google Patents

Info

Publication number
CN114254093A (application CN202111552990.6A)
Authority
CN
China
Prior art keywords
space, answer, information, vector, knowledge graph
Prior art date
Legal status
Pending
Application number
CN202111552990.6A
Other languages
Chinese (zh)
Inventor
李博涵
季烨
田佳颖
刘毅
吴佳骏
向宇轩
王高旭
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202111552990.6A
Publication of CN114254093A
Legal status: Pending

Classifications

    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G06F 16/367: Creation of semantic tools; Ontology
    • G06N 3/045: Neural network architectures; Combinations of networks
    • G06N 3/08: Neural networks; Learning methods


Abstract

The invention relates to a multi-space knowledge-enhanced knowledge graph question-answering method and system, belonging to the fields of knowledge graph representation learning and question-answering systems. First, a neural network based on the self-attention mechanism extracts features from the input question and represents it as a vector using both word-level relevance and word-order information, so that the information in the question is retained to a greater extent. Second, the knowledge graph is embedded into Euclidean space, complex vector space and hyperbolic space, and the candidate answer entities are represented from multiple dimensions; by exploiting the differences in information retention among the embedding spaces, the candidate answer entities are represented in multiple vector spaces, the number of vector representations is expanded, and the candidate answers are represented comprehensively and reasonably. Finally, a dual attention network dynamically aggregates the information in the knowledge graph and dynamically computes the scores of the candidate answers, which strengthens the knowledge representation capability of the knowledge graph question-answering model on the answer side and improves its accuracy.

Description

Multi-space knowledge enhanced knowledge graph question-answering method and system
Technical Field
The invention relates to the field of knowledge graph representation learning and question-answering systems, in particular to a knowledge graph question-answering method and a knowledge graph question-answering system with multi-space knowledge enhancement.
Background
The rapid development of knowledge graphs has made their use increasingly important, and knowledge graph question answering, an important downstream research direction, has received wide attention in recent years. The goal of knowledge graph question answering is to automatically find, in the knowledge graph, the answer to an input natural-language question. Unlike the interactive dialogue of dialogue systems and chatbots, the answer in knowledge graph question answering is an entity or an entity relation in the knowledge graph, and it is obtained through a model.
Methods of knowledge graph question answering can be broadly divided into two categories: methods based on semantic parsing and methods based on information extraction. The goal of semantic-parsing-based methods is to build a semantic parser that converts natural language into an intermediate logical form, which the question-answering system then uses to retrieve answers from the knowledge graph. Traditional supervised semantic parsers rely heavily on annotated logical forms for lexicon extraction and model training. Because annotated data are limited and the coverage of semantic parsers is restricted, these methods are difficult to extend to a wide range of domains. In recent years much work has addressed these problems; representative efforts incorporate manually constructed rules and features and adopt weakly supervised and distantly supervised training strategies.
Different from semantic-parsing-based methods, information-extraction-based methods first screen a candidate answer set out of the knowledge graph, then map the question and the candidate answers into a vector space and compute scores between them, and finally pick the final answer according to the scores. Since information-extraction-based methods need no hand-written rules, they are more easily extended to larger or more complex knowledge graphs and have therefore drawn more attention from researchers in recent years. The first step of an information-extraction-based method is to construct the candidate answer set: for an input question, its topic entity is found through an open interface and is then linked to the knowledge graph with an entity-linking tool. Typically, researchers take the one- or two-hop nodes of the topic entity to form the candidate answer set. After the candidate answer set is constructed, information-extraction-based methods can be broadly divided into three parts: vector representation of the question, vector representation of the candidate answers, and score calculation between the question and the candidate answers. For question representation, the earliest methods used a simple bag-of-words model. Researchers then tried more complex deep neural networks, typically multi-dimensional convolutional neural networks and bidirectional recurrent neural networks, in order to exploit the word-order information in the question. Beyond this, some researchers tried to enrich the question with external information. Question representation models are closely tied to progress in natural language processing; in recent years, deep neural networks based on the self-attention mechanism have flourished in that field and achieved excellent performance on many downstream tasks. Therefore, question representation learning needs to follow advanced results in natural language processing and adopt stronger feature-extraction modules. For candidate answer representation, researchers have focused on how to better represent candidate answers in the knowledge graph. Early work represented a candidate answer by a subgraph containing it; later work, considering the subgraph representation insufficiently precise, proposed representations built from the answer entity, answer type, answer context and so on. Recent work associates the question vector with the entity vectors in the knowledge graph, so that each kind of representation in the knowledge graph can contribute information about the candidate answer with a different weight.
Knowledge graph question answering exploits the information in the knowledge graph through a knowledge graph representation learning model, and in this respect existing question-answering methods all use models based on the semantic-translation theory in traditional Euclidean space. However, knowledge graph representation models have developed rapidly in recent years; in particular, many representation learning models are no longer confined to Euclidean space, and the semantic-translation theory has been extended to complex vector space and other spaces. Related research also shows that different embedding spaces differ in how well they retain the information in the knowledge graph: for example, embedding into a complex vector space is more favourable for logical reasoning, while embedding into a hyperbolic space better preserves the structural information of the knowledge graph. Therefore, using only a single knowledge graph embedding space cannot describe the knowledge graph accurately in a question-answering model, which affects question-answering accuracy.
Disclosure of Invention
The invention aims to provide a multi-space knowledge-enhanced knowledge graph question-answering method and system that comprehensively exploit multi-space information through attention networks, strengthen the knowledge representation capability of the knowledge graph question-answering model on the answer side, and improve its accuracy.
In order to achieve the purpose, the invention provides the following scheme:
a multi-spatial knowledge-enhanced knowledge-graph question-answering method, the method comprising:
performing feature extraction on the input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question;
determining a plurality of candidate answers to the input question in the knowledge graph;
respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and representing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space;
constructing an answer side attention network and a multi-space attention network;
according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space represented by the entity information, the path information, the type information and the context information in the knowledge graph under each space, dynamically aggregating multi-aspect information of the candidate answers by using an answer side attention network to obtain an answer vector of each candidate answer under each space;
according to the answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space represented by the entity information, the path information, the type information and the context information in the knowledge graph in each space, the multi-space information of the knowledge graph is dynamically aggregated by using a multi-space attention network, and the one-dimensional vector of each candidate answer is obtained;
determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question;
and determining a final correct answer set according to the score of each candidate answer.
Optionally, the obtaining a one-dimensional vector of the input question by performing feature extraction on the input question through a neural network based on a self-attention mechanism specifically includes:
embedding input words of the input question into a matrix to obtain word vector representation of the input question;
adding a position offset into the word vector representation to obtain a word vector representation with position information;
the word vector with the position information is input into the stacked self-attention neural network to obtain a one-dimensional vector of the input question; the stacked self-attentive neural network includes a multi-headed attention module, a residual module, and a normalization module.
Optionally, the determining a plurality of candidate answers to the input question in the knowledge graph specifically includes:
identifying a subject entity of the input question to obtain the subject entity of the input question;
associating the subject entity of the input question with an entity node in a knowledge graph through an entity linking tool to determine a subject entity node;
identifying entity nodes in two hops of the subject entity node in the knowledge graph, and determining the subject entity node and the entity nodes in the two hops of the subject entity node as candidate answers; and the entity nodes in the two hops are the entity nodes of which the shortest paths with the subject entity node are less than or equal to 2.
Optionally, the knowledge graph is respectively embedded into an euclidean space, a complex vector space, and a hyperbolic space, and specifically includes:
performing representation learning on the knowledge graph with the TransE model in Euclidean space;
performing representation learning on the knowledge graph with the RotatE model in complex vector space;
and performing representation learning on the knowledge graph with the HyperKG model in hyperbolic space.
Optionally, the obtaining an answer vector of each candidate answer in each space by dynamically aggregating, according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph in each space represented by the entity information, the path information, the type information, and the context information in the knowledge graph in each space, the multi-aspect information of the candidate answer by using the attention network at the answer side includes:
based on the one-dimensional vector of the input question and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

α_i = exp(η_i) / Σ_j exp(η_j)

to calculate the weights of the entity information, path information, type information and context information; in the formula, α_i is the weight of information i, where i and j range over the entity information, path information, type information and context information, η_i and η_j are the unnormalized weight values of information i and j, obtained from the one-dimensional question vector q, the first weight matrix W_1 applied to the concatenation [v_{1i}; v_{2i}; v_{3i}] of the representations of information i in Euclidean space, complex vector space and hyperbolic space, and the first offset b_1;

according to the weights of the entity information, path information, type information and context information, and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

γ_{kl} = Σ_i α_i · v^k_{il}

to calculate the answer vector of each candidate answer in each space; in the formula, γ_{kl} is the answer vector of the l-th candidate answer in the k-th space, v^k_{il} is the representation of information i of the l-th candidate answer in the k-th space, and the k-th space is Euclidean space, complex vector space or hyperbolic space.
Optionally, the obtaining a one-dimensional vector of each candidate answer by dynamically aggregating multi-space information of the knowledge graph by using the multi-space attention network according to the answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space represented by entity information, path information, type information, and context information in the knowledge graph in each space specifically includes:
according to the answer vector of each candidate answer in each space, and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

β_k = exp(θ_k) / Σ_n exp(θ_n)

to calculate the weights of Euclidean space, complex vector space and hyperbolic space; in the formula, β_k is the weight of the k-th space, θ_k and θ_n are the unnormalized weight values of the k-th and n-th spaces, obtained from the second weight matrix W_2 applied to the concatenation [v_{1k}; v_{2k}; v_{3k}; v_{4k}] of the entity information, path information, type information and context information representations of the k-th space, and the second offset b_2;

according to the weights of Euclidean space, complex vector space and hyperbolic space and the answer vector of each candidate answer in each space, using the formula

a_l = Σ_k β_k · γ_{kl}

to calculate the one-dimensional vector of each candidate answer; in the formula, a_l is the one-dimensional vector of the l-th candidate answer, and γ_{kl} is the answer vector of the l-th candidate answer in the k-th space.
Optionally, the determining the score of each candidate answer according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question in an inner product manner specifically includes:
according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question, using the inner-product formula S(q, a_l) = h(q, a_l) to determine the score of each candidate answer;
in the formula, S(q, a_l) is the score of the l-th candidate answer, a_l is the one-dimensional vector of the l-th candidate answer, q is the one-dimensional vector of the input question, and h(·, ·) is the inner-product function.
Optionally, the determining a final correct answer set according to the score of each candidate answer specifically includes:
according to the score of each candidate answer, using the formula A_q = { a_l | a_l ∈ C_q and S_max - S(q, a_l) < m } to determine the final correct answer set;
in the formula, A_q is the final correct answer set, a_l is the one-dimensional vector of the l-th candidate answer, C_q is the candidate answer set consisting of the plurality of candidate answers, S_max is the highest score among the candidate answers, S(q, a_l) is the score of the l-th candidate answer, and m is the boundary parameter.
Optionally, the neural network based on the self-attention mechanism, the answer-side attention network and the multi-space attention network are trained as one overall model, with mini-batch stochastic gradient descent as the optimizer during training;
the objective function of the overall model is:

min Σ_{a ∈ R_q} Σ_{a' ∈ W_q} L(q, a, a')

wherein L(q, a, a') = [m - S(q, a) + S(q, a')]_+, R_q is the correct answer set, W_q is the wrong answer set, m is the boundary parameter, [z]_+ denotes the larger of 0 and z, S(q, a) is the score of the question and a correct answer, S(q, a') is the score of the question and a wrong answer, L(q, a, a') is the pairwise training loss, a is the vector representation of a correct answer, a' is the vector representation of a wrong answer, and q is the one-dimensional vector of the input question.
A multi-spatial knowledge-enhanced knowledge-graph question-answering system, the system comprising:
the characteristic extraction module is used for extracting the characteristics of the input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question;
the candidate answer determining module is used for determining a plurality of candidate answers of the input question in the knowledge graph;
the multi-space embedding module is used for respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and expressing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space;
the network construction module is used for constructing an answer side attention network and a multi-space attention network;
the answer vector obtaining module is used for dynamically aggregating multi-aspect information of the candidate answers by using the answer-side attention network according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space, represented by the entity information, path information, type information and context information in the knowledge graph under each space, so as to obtain the answer vector of each candidate answer under each space;
a one-dimensional vector obtaining module, configured to dynamically aggregate multi-space information of the knowledge graph by using a multi-space attention network according to an answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space, which is represented by entity information, path information, type information, and context information in the knowledge graph in each space, so as to obtain a one-dimensional vector of each candidate answer;
the score determining module is used for determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question;
and the correct answer set determining module is used for determining a final correct answer set according to the score of each candidate answer.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention discloses a multi-space knowledge-enhanced knowledge map question-answering method and a multi-space knowledge-enhanced knowledge map question-answering system.A neural network based on a self-attention mechanism is used for extracting the characteristics of input questions, and vector representation is carried out on the questions according to the relevance among word levels and word sequence information in the questions, so that the information in the questions can be retained to a greater extent; secondly, embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, expressing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space, expressing the candidate answer entity into a plurality of vector spaces by using the information retention difference of different embedding spaces, expanding the vector expression quantity, and comprehensively and reasonably expressing the candidate answers; and finally, by means of a double attention network, information in the knowledge graph is dynamically aggregated, scores of candidate answers are dynamically calculated, the knowledge representation capability of the knowledge graph question-answer model on the answer side is enhanced, and the accuracy of the knowledge graph question-answer model is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a flow chart of a multi-spatial knowledge enhanced knowledge-graph question-answering method according to the present invention;
FIG. 2 is a schematic diagram of a multi-spatial knowledge enhanced knowledge-graph question-answering method according to the present invention;
FIG. 3 is a graph comparing F1 scores at different embedding dimensions on a reference dataset as provided by the present invention with a conventional method using only a single knowledge-graph embedding space.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a multi-space knowledge-enhanced knowledge graph question-answering method and system that comprehensively exploit multi-space information through attention networks, strengthen the knowledge representation capability of the knowledge graph question-answering model on the answer side, and improve its accuracy.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Intelligent question answering has always been an important research topic in artificial intelligence, and with the rapid development of knowledge graphs, question-answering systems based on knowledge graphs have received more and more attention. Intelligent question answering combined with a knowledge graph mines and infers potential relations through the direct connections among entities in the graph and returns the most accurate answer to the user. Research on knowledge graph question answering mainly follows two technical routes: one based on rule parsing and the other based on information extraction. At this stage researchers focus mainly on information extraction, representing both questions and answers in a vector space and computing scores to obtain the final answer. For representation learning, the question side mainly draws on advanced techniques from natural language processing, while the answer side relies on the knowledge graph representation learning model. However, because of the limited capability of a single knowledge graph representation learning model to retain graph information, the knowledge representation on the answer side is insufficient. The invention aims to comprehensively exploit knowledge graph representation learning models in multiple spaces through an attention network, strengthen the representation of candidate answers, follow state-of-the-art models on the question side, and improve the overall accuracy of the knowledge graph question-answering system.
The invention provides a multi-space knowledge-enhanced knowledge graph question-answering method, which comprises the following steps:
step 101, performing feature extraction on an input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question.
The method specifically comprises the following steps:
embedding input words of the input question into a matrix to obtain word vector representation of the input question;
adding position offset into the word vector representation to obtain a word vector representation with position information;
the word vector with the position information is input into the stacked self-attention neural network to obtain a one-dimensional vector of the input question; the stacked self-attentive neural network includes a multi-headed attentive module, a residual module, and a normalization module.
The input question is embedded through the word-embedding matrix to obtain its word-vector representation Q = (w_1, w_2, ..., w_n), where n is the length of the question. A position offset pos = (p_1, p_2, ..., p_n) is added to the word vectors so that the word-order information of the question is retained, and the resulting word-vector representation with position information, Q = (x_1, x_2, ..., x_n), is fed into the self-attention neural network to obtain the one-dimensional vector representation q of the question. The stacked self-attention neural network consists of multi-head attention modules, residual modules, normalization modules and so on; the input question is propagated through the multi-layer network and finally represented as a one-dimensional vector.
The input of the self-attention neural network is the word-vector representation with position information Q = (x_1, x_2, ..., x_n). The self-attention mechanism maps each input word vector x_i into three parts of the same size: a query vector, a key vector and a value vector. The query and key vectors are used to compute the weights between words, and the value vector is used for value propagation. Usually several self-attention modules are stacked together to form a multi-head self-attention module; the vectors output by these modules are concatenated and linearly mapped to the final output length of the model. The self-attention computation can be expressed by the following formula, where Q and K are the vectors used to compute attention, V is the vector used for value propagation, and d_k is a scaling factor that keeps the dot products from becoming too large:

Attention(Q, K, V) = softmax(Q K^T / √d_k) · V
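The following is a minimal sketch of such a question encoder, assuming PyTorch; the module names, dimensions, number of layers and the mean-pooling step are illustrative choices, not details taken from the patent.

```python
import torch
import torch.nn as nn

class QuestionEncoder(nn.Module):
    """Word embedding + position offset + stacked multi-head self-attention,
    pooled into a single one-dimensional question vector q."""
    def __init__(self, vocab_size, dim=200, n_heads=4, n_layers=2, max_len=32):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)   # word embedding matrix
        self.pos_emb = nn.Embedding(max_len, dim)       # position offsets p_1 ... p_n
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           dim_feedforward=4 * dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids):                        # token_ids: (batch, n)
        n = token_ids.size(1)
        pos = torch.arange(n, device=token_ids.device)
        x = self.word_emb(token_ids) + self.pos_emb(pos)  # Q = (x_1, ..., x_n)
        h = self.encoder(x)                               # multi-head attention + Add&Norm + feed-forward
        return h.mean(dim=1)                              # one-dimensional question vector q
```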
Step 102, determining a plurality of candidate answers to the input question in the knowledge graph.
The method specifically comprises the following steps:
identifying a subject entity of the input question to obtain the subject entity of the input question;
associating the subject entity of the input question with an entity node in a knowledge graph through an entity linking tool to determine a subject entity node;
identifying entity nodes in two hops of the subject entity node in the knowledge graph, and determining the subject entity node and the entity nodes in the two hops of the subject entity node as candidate answers; the entity node within two hops is an entity node having a shortest path with the subject entity node of less than or equal to 2.
First, the topic entity of the input question is identified using the open-source interface of Freebase, which yields the subject of the question; the topic entity is then associated with an entity node in the knowledge graph through an entity-linking tool. A knowledge graph is composed of triples <entity, relation, entity>, where the entity nodes may be people, organizations, places, concepts and the like. In theory every entity in the knowledge graph should serve as a candidate answer, but this is far too inefficient, so the neighbours within two hops of the topic entity node in the knowledge graph are identified to form the candidate answer set; this covers almost all candidate answers while greatly improving the efficiency of the model. In the knowledge graph, the length of the shortest path between two entities is called the number of hops, and the neighbours within two hops are the entity nodes whose shortest path to the topic entity is at most 2.
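A small sketch of this candidate-construction step is shown below, under the assumption that the knowledge graph is available as a list of (head, relation, tail) triples; the entity-linking step itself is outside the sketch.

```python
from collections import defaultdict

def build_adjacency(triples):
    """triples: iterable of (head, relation, tail) identifiers."""
    adj = defaultdict(set)
    for head, _, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)          # ignore edge direction for neighbourhood search
    return adj

def candidate_answers(topic_entity, adj):
    """Topic entity plus every node whose shortest path to it is at most 2 hops."""
    one_hop = set(adj[topic_entity])
    two_hop = set()
    for node in one_hop:
        two_hop |= adj[node]
    return {topic_entity} | one_hop | two_hop
```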
And 103, respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and representing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space.
Respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, specifically comprising:
performing representation learning on the knowledge graph with the TransE model in Euclidean space; the candidate answer is represented in this space by four aspects v_1, v_2, v_3, v_4, where v_1 is the answer entity itself in Euclidean space, v_2 is the answer entity type information in Euclidean space, v_3 is the answer entity relation information in Euclidean space, and v_4 is the answer entity context information in Euclidean space. In TransE, the relation rel in the knowledge graph is regarded as a translation between the head entity head and the tail entity tail, and its distance function is

s_TransE(head, rel, tail) = ||head + rel - tail||

performing representation learning on the knowledge graph with the RotatE model in complex vector space; the candidate answer is represented in this space by four aspects v_5, v_6, v_7, v_8, where v_5 is the answer entity itself in complex vector space, v_6 is the answer entity type information in complex vector space, v_7 is the answer entity relation information in complex vector space, and v_8 is the answer entity context information in complex vector space. The RotatE model defines the relation rel in the knowledge graph as a rotation in complex vector space from the head entity head to the tail entity tail. Representation learning in complex vector space is helpful for handling the multiple relation patterns in the knowledge graph; the distance is computed through the Hadamard product, and the overall distance function is

s_RotatE(head, rel, tail) = ||head ∘ rel - tail||

performing representation learning on the knowledge graph with the HyperKG model in hyperbolic space. The candidate answer is represented in this space by four aspects v_9, v_10, v_11, v_12, where v_9 is the answer entity itself in hyperbolic space, v_10 is the answer entity type information in hyperbolic space, v_11 is the answer entity relation information in hyperbolic space, and v_12 is the answer entity context information in hyperbolic space. Hyperbolic space is well suited to representing the topological information of the knowledge graph and better reflects its structural characteristics. The distance function in this model is based on the Poincaré-ball model and is as follows, where u and v are two vectors in the hyperbolic space:

s_HyperKG(head, rel, tail) = d_p(head + Π_β tail, rel)

d_p(u, v) = acosh(1 + 2δ(u, v))

δ(u, v) = ||u - v||² / ((1 - ||u||²)(1 - ||v||²))
representing each candidate answer in the candidate answer set generated in the step one from four aspects, wherein the first aspect is the entity of the candidate answer in the knowledge graph; the second aspect is type information of the answer entity; the third aspect is the average value of the path vector from the answer entity to the question entity, that is, the relationship information between the answer entity and the question; the last aspect is context information of the answer entity, specifically, the mean value of the one-hop neighbor vector of the answer entity. The entire knowledge-graph is embedded into three vector spaces, namely euclidean space, complex vector space and hyperbolic space, and the candidate answers have four aspects of representation in each vector space, so that for each candidate answer, 12 space vectors represent the candidate answer.
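The three scoring functions above can be restated compactly as follows, assuming NumPy and that entities and relations are already embedded (complex-valued arrays for RotatE, Poincaré-ball vectors for the hyperbolic distance); this is an illustrative sketch, not the training code of the patent.

```python
import numpy as np

def transe_score(head, rel, tail):
    # Euclidean space: relation as a translation, ||head + rel - tail||
    return np.linalg.norm(head + rel - tail)

def rotate_score(head, rel, tail):
    # Complex vector space: relation as a rotation (|rel_i| = 1), ||head o rel - tail||
    return np.linalg.norm(head * rel - tail)

def poincare_distance(u, v):
    # Hyperbolic distance on the Poincare ball, d_p(u, v) = acosh(1 + 2*delta(u, v))
    delta = np.sum((u - v) ** 2) / ((1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2)))
    return np.arccosh(1.0 + 2.0 * delta)
```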
And 104, constructing an answer side attention network and a multi-space attention network.
A two-layer attention network is constructed. The first layer aggregates the four aspects of answer information: using the vector representation of the question and the dynamic weight matrix W_1, the weights α_1, α_2, α_3, α_4 of the different aspects of each answer are computed, where α_1 is the weight of the answer entity itself, α_2 is the weight of the answer entity type information, α_3 is the weight of the answer entity relation information, and α_4 is the weight of the answer entity context information. Using these weights, the answer information in each space is aggregated into a vector: γ_1 is the answer vector in Euclidean space, γ_2 is the answer vector in complex vector space, and γ_3 is the answer vector in hyperbolic space. The second layer aggregates the answer information of the multiple spaces: using the weight matrix W_2, the weight of each vector space β_1, β_2, β_3 is computed, where β_1 is the weight of the Euclidean space information, β_2 is the weight of the complex vector space information, and β_3 is the weight of the hyperbolic space information. According to the weights β_1, β_2, β_3, the vectors of the multiple spaces are aggregated into a one-dimensional vector a whose length matches the length of the question vector. Finally, the one-dimensional vector representation q of the question and the one-dimensional vector representation a of the candidate answer are input into the score function, and the score of the candidate answer is computed as an inner product.
In FIG. 2, Attention is an Attention module, Weighted Sum is a Weighted Sum module, Add & Norm is a normalization module, Feed Forward represents Feed Forward, Answer aspect Attention represents Answer side Attention network, and Muti-Space Attention represents multi-Space Attention network.
And step 105, dynamically aggregating multi-aspect information of the candidate answers by using the answer-side attention network according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space represented by the entity information, path information, type information and context information in the knowledge graph under each space, and obtaining the answer vector of each candidate answer under each space.
The method specifically comprises the following steps:
based on the one-dimensional vector of the input question and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

α_i = exp(η_i) / Σ_j exp(η_j)

to calculate the weights of the entity information, path information, type information and context information; in the formula, α_i is the weight of information i, where i and j range over the entity information, path information, type information and context information, η_i and η_j are the unnormalized weight values of information i and j, obtained from the one-dimensional question vector q, the first weight matrix W_1 applied to the concatenation [v_{1i}; v_{2i}; v_{3i}] of the representations of information i in Euclidean space, complex vector space and hyperbolic space, and the first offset b_1;

according to the weights of the entity information, path information, type information and context information, and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

γ_{kl} = Σ_i α_i · v^k_{il}

to calculate the answer vector of each candidate answer in each space; in the formula, γ_{kl} is the answer vector of the l-th candidate answer in the k-th space, v^k_{il} is the representation of information i of the l-th candidate answer in the k-th space, the k-th space is Euclidean space, complex vector space or hyperbolic space, and [;] denotes the concatenation of two vectors.
The four aspects of answer information are fused together according to the relevance between the vector representation of the question and the vector representations of the candidate answer. The fusion uses weighted summation, with the weights computed by the attention network. The attention computation is therefore defined as

η_i = q · tanh(W_1 [v_{1i}; v_{2i}; v_{3i}] + b_1), for each aspect i (entity, path, type, context)

α_i = exp(η_i) / Σ_j exp(η_j)

and correspondingly, for each space k,

γ_k = Σ_i α_i · v^k_i, k ∈ {Euclidean, complex, hyperbolic}
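A sketch of this answer-side attention layer is given below, assuming PyTorch; the tanh scoring form matches the formulas above, but the tensor shapes and module names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnswerAspectAttention(nn.Module):
    """Aggregates the four answer aspects (entity, path, type, context) into one
    answer vector per embedding space, with weights conditioned on the question."""
    def __init__(self, dim):
        super().__init__()
        self.W1 = nn.Linear(3 * dim, dim)        # first weight matrix W_1 and offset b_1

    def forward(self, q, aspects):
        # q: (dim,) question vector; aspects: (4, 3, dim) = 4 aspects x 3 spaces
        concat = aspects.reshape(4, -1)          # [v_1i; v_2i; v_3i] for each aspect i
        eta = torch.tanh(self.W1(concat)) @ q    # unnormalised weights eta_i
        alpha = F.softmax(eta, dim=0)            # alpha_i
        gamma = torch.einsum('i,ikd->kd', alpha, aspects)  # gamma_k = sum_i alpha_i v^k_i
        return gamma                             # (3, dim): one answer vector per space
```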
And step 106, dynamically aggregating the multi-space information of the knowledge graph by using the multi-space attention network according to the answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space represented by the entity information, the path information, the type information and the context information in the knowledge graph in each space, and obtaining the one-dimensional vector of each candidate answer.
The method specifically comprises the following steps:
according to the answer vector of each candidate answer in each space, and each candidate answer represented by the entity information, path information, type information and context information of the knowledge graph in each space, using the formula

β_k = exp(θ_k) / Σ_n exp(θ_n)

to calculate the weights of Euclidean space, complex vector space and hyperbolic space; in the formula, β_k is the weight of the k-th space, θ_k and θ_n are the unnormalized weight values of the k-th and n-th spaces, obtained from the second weight matrix W_2 applied to the concatenation [v_{1k}; v_{2k}; v_{3k}; v_{4k}] of the entity, path, type and context information representations of the k-th space, and the second offset b_2;

according to the weights of Euclidean space, complex vector space and hyperbolic space and the answer vector of each candidate answer in each space, using the formula

a_l = Σ_k β_k · γ_{kl}

to calculate the one-dimensional vector of each candidate answer; in the formula, a_l is the one-dimensional vector of the l-th candidate answer, and γ_{kl} is the answer vector of the l-th candidate answer in the k-th space.
After the weights of the different answer aspects are computed, the weights of the different spaces are computed as well, and the candidate answer is finally represented as a one-dimensional vector of the same length as the question. Since the weight of an embedding space is not correlated with the input question, the question vector is not needed when computing the weight of each embedding space; the computation is as follows:

θ_k = tanh(W_2 [v_{1k}; v_{2k}; v_{3k}; v_{4k}] + b_2), k ∈ {Euclidean, complex, hyperbolic}

β_k = exp(θ_k) / Σ_n exp(θ_n)

and correspondingly

a = Σ_k β_k · γ_k

After these weight parameters are obtained, the vectors representing the answer can be weighted and summed into a single vector.
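The second attention layer can be sketched in the same style; as noted above, the space weights do not depend on the question vector. The exact scoring form is again an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiSpaceAttention(nn.Module):
    """Aggregates the Euclidean, complex and hyperbolic answer vectors into
    a single one-dimensional answer representation a."""
    def __init__(self, dim):
        super().__init__()
        self.W2 = nn.Linear(4 * dim, 1)          # second weight matrix W_2 and offset b_2

    def forward(self, gamma, space_aspects):
        # gamma: (3, dim) answer vector per space (output of the first attention layer)
        # space_aspects: (3, 4, dim) = per space: entity, path, type, context representations
        concat = space_aspects.reshape(3, -1)             # [v_1k; v_2k; v_3k; v_4k]
        theta = torch.tanh(self.W2(concat)).squeeze(-1)   # unnormalised weights theta_k
        beta = F.softmax(theta, dim=0)                    # beta_k
        return torch.einsum('k,kd->d', beta, gamma)       # a = sum_k beta_k gamma_k
```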
And step 107, determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question.
The score between the question and the answer is computed as an inner product, specifically:
according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question, using the inner-product formula S(q, a_l) = h(q, a_l) to determine the score of each candidate answer;
in the formula, S(q, a_l) is the score of the l-th candidate answer, a_l is the one-dimensional vector of the l-th candidate answer, q is the one-dimensional vector of the input question, and h(·, ·) is the inner-product function.
And step 108, determining a final correct answer set according to the score of each candidate answer.
When the model picks the correct answers, it would be wrong to take only the highest-scoring candidate as the correct answer, because an input question may have more than one correct answer. A boundary parameter m is therefore introduced: every candidate answer whose score is within m of the highest score is regarded as a correct answer. The final answers are selected as follows:
according to the score of each candidate answer, using the formula A_q = { a_l | a_l ∈ C_q and S_max - S(q, a_l) < m } to determine the final correct answer set;
in the formula, A_q is the final correct answer set, a_l is the one-dimensional vector of the l-th candidate answer, C_q is the candidate answer set consisting of the plurality of candidate answers, S_max is the highest score among the candidate answers, S(q, a_l) is the score of the l-th candidate answer, and m is the boundary parameter.
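A sketch of the scoring and margin-based selection, with illustrative names:

```python
import torch

def score(q, a):
    """S(q, a): inner product between the question vector and the answer vector."""
    return torch.dot(q, a)

def select_answers(q, candidate_vectors, margin=0.5):
    """Keep every candidate whose score is within `margin` of the best score."""
    scores = {cand: score(q, vec).item() for cand, vec in candidate_vectors.items()}
    s_max = max(scores.values())
    return {cand for cand, s in scores.items() if s_max - s < margin}
```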
The neural network based on the self-attention mechanism, the answer-side attention network and the multi-space attention network are trained as one overall model, with mini-batch stochastic gradient descent as the optimizer during training;
the objective function of the overall model is:

min Σ_{a ∈ R_q} Σ_{a' ∈ W_q} L(q, a, a')

wherein L(q, a, a') = [m - S(q, a) + S(q, a')]_+, R_q is the correct answer set, W_q is the wrong answer set, m is the boundary parameter, [z]_+ denotes the larger of 0 and z, S(q, a) is the score of the question and a correct answer, S(q, a') is the score of the question and a wrong answer, L(q, a, a') is the pairwise training loss, a is the vector representation of a correct answer, a' is the vector representation of a wrong answer, and q is the one-dimensional vector of the input question.
During model training, for each candidate answer in the candidate answer set, k wrong answers are randomly sampled as negative examples, and the whole question-answering model is then trained in a pairwise fashion. Mini-batch stochastic gradient descent is used as the optimizer; the parameter matrices of the self-attention network that represents the question vector, as well as the weight matrices W_1 and W_2 and the offsets b_1 and b_2 used to aggregate the answer information, are updated continuously during training to minimize the loss. The number of negative samples k and the boundary parameter m are hyperparameters; testing showed that the best setting is k = 1000 and m = 0.5.
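The pairwise loss and one optimisation step can be sketched as follows, assuming PyTorch; the sampling of negative answers and the surrounding training loop are only indicated in comments because they are not spelled out here.

```python
import torch
import torch.nn.functional as F

def pairwise_hinge_loss(q, pos_answers, neg_answers, margin=0.5):
    """Sum of L(q, a, a') = [m - S(q, a) + S(q, a')]_+ over correct/wrong answer pairs."""
    loss = q.new_zeros(())
    for a in pos_answers:                 # a  in R_q, the correct answers
        for a_neg in neg_answers:         # a' in W_q, k randomly sampled wrong answers
            loss = loss + F.relu(margin - torch.dot(q, a) + torch.dot(q, a_neg))
    return loss

# Illustrative usage (model, batching and negative sampling are assumed):
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# loss = pairwise_hinge_loss(q, pos_answers, neg_answers, margin=0.5)
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```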
The invention discloses a knowledge graph question-answering model that dynamically aggregates multi-space information with attention networks; the performance of the model in different vector spaces is measured and analysed, and the differences in information retention of knowledge graph representation learning models across multiple spaces are exploited. Compared with traditional knowledge graph question-answering models based on representation learning, the model can capture the knowledge in the knowledge graph more effectively and aggregate it more efficiently, so it is more robust to noise. The invention mainly comprises: (1) a question representation learning module, which improves on traditional question representation models based on convolutional and recurrent neural networks and strengthens the representation capability of the question-side neural network with a state-of-the-art self-attention mechanism; (2) a multi-space knowledge graph representation module, which represents the answer side from multiple aspects and embeds the knowledge graph into several vector spaces, so that the answer-side information has multi-space characteristics; (3) an attention network module, which dynamically aggregates the knowledge graph information of different spaces and the multiple aspects of answer-side information, giving the whole model better robustness.
The technical solution of the invention is further described in detail below with reference to the drawings and embodiments. The configuration environment of the embodiments is as follows: CPU 8700K at 3.7 GHz, 16 GB memory, NVIDIA GTX 2080 Ti graphics card, Linux Ubuntu 18.04, programming language Python 3, based on the PyTorch deep learning framework.
The first embodiment is as follows:
First, the topic entity is extracted from the input question and linked to the knowledge graph; the knowledge graph is FB2M, a subset of Freebase. The representation learning models and the question-answering model are trained separately: the FB2M knowledge graph is embedded and trained with the source code of TransE, RotatE and HyperKG, and the trained entity and relation vectors are then used as input to the question-answering model. The input question passes through the word-embedding matrix into the self-attention-based deep neural network; the vectors of the candidate answers are obtained from the knowledge graph according to the designed representation scheme, and the scores are finally computed with the attention networks. The model was evaluated on the WebQuestions data and analysed. The contribution of each sub-module to the overall model was tested first. With the self-attention-based deep neural network on the question side and the Euclidean-space vectors of the candidate entities used directly on the answer side, the model achieves an F1 score of 39.8. After adding the multi-aspect representation of the candidate answers and one layer of attention, the score reaches 42.8. After switching the embedding space of the knowledge graph to complex vector space and to hyperbolic space, the performance improves further, with F1 scores of 43.2 and 43.4 respectively. Finally, after expanding the embedding of the knowledge graph to all three spaces and aggregating their information with the attention network, the performance of the model reaches its highest value of 44.1, surpassing the traditional methods.
Example two:
Following the method of the first embodiment, the performance of the model is compared under different embedding dimensions. The embedding dimension is set to 50, 100, 150, 200 and 250, and the different embedding spaces are compared in each dimension. Referring to FIG. 3, the experiments show that the performance of the model improves as the embedding dimension of the knowledge graph and the question increases, and that comprehensively using multiple embedding spaces gives the highest F1 score at every dimension, which indicates that multi-space knowledge enhancement with the attention mechanism is effective in the question-answering system. In FIG. 3, the abscissa Embedding dimension is the embedding dimension, the ordinate Macro F1 is the F1 index, and Multi-space denotes the multi-space model.
The invention creatively performs knowledge enhancement using the differences among multi-space knowledge representations based on representation learning to solve the knowledge graph question-answering problem. Compared with the prior art, the technical method has the following beneficial effects:
1) The invention uses a state-of-the-art deep neural network model to build a self-attention-based feature extraction module for the input question. The question is represented as a vector according to the relevance between word levels and the word-order information in the question; this retains the information in the question to a greater extent and benefits the later use of the question vector by the model.
2) The invention strengthens the representation of candidate answers by constructing a multi-space knowledge graph representation learning model. The candidate answer entities are represented in multiple vector spaces, the number of vector representations is expanded, and the representation of the entity is redesigned in each space, so that the candidate answers are represented comprehensively and reasonably.
3) The invention dynamically aggregates the information in the knowledge graph and dynamically computes the scores of the candidate answers through a dual attention network. The weight of each kind of information is adjusted by relating the input question to the attributes and other information of the candidate answers, making information aggregation more efficient; adjusting the weight of each space's information improves the accuracy and robustness of the whole model.
The invention also provides a knowledge-graph question-answering system with multi-space knowledge enhancement, which comprises:
the characteristic extraction module is used for extracting the characteristics of the input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question;
the candidate answer determining module is used for determining a plurality of candidate answers of the input question in the knowledge graph;
the multi-space embedding module is used for respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and expressing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space;
the network construction module is used for constructing an answer side attention network and a multi-space attention network;
the answer vector obtaining module is used for dynamically aggregating multi-aspect information of the candidate answers by using the answer-side attention network according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space, represented by the entity information, path information, type information and context information in the knowledge graph under each space, so as to obtain the answer vector of each candidate answer under each space;
a one-dimensional vector obtaining module, configured to dynamically aggregate multi-space information of the knowledge graph by using a multi-space attention network according to an answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space, which is represented by entity information, path information, type information, and context information in the knowledge graph in each space, so as to obtain a one-dimensional vector of each candidate answer;
the score determining module is used for determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question;
and the correct answer set determining module is used for determining a final correct answer set according to the score of each candidate answer.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments can be referred to one another. Since the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant details can be found in the description of the method.
The principles and embodiments of the present invention are described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, there may be changes in the specific embodiments and the scope of application according to the concept of the present invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A multi-space knowledge-enhanced knowledge graph question-answering method is characterized by comprising the following steps:
performing feature extraction on the input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question;
determining a plurality of candidate answers to the input question in the knowledge graph;
respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and representing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space;
constructing an answer side attention network and a multi-space attention network;
according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space represented by the entity information, the path information, the type information and the context information in the knowledge graph under each space, dynamically aggregating multi-aspect information of the candidate answers by using an answer side attention network to obtain an answer vector of each candidate answer under each space;
according to the answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space represented by the entity information, the path information, the type information and the context information in the knowledge graph in each space, the multi-space information of the knowledge graph is dynamically aggregated by using a multi-space attention network, and the one-dimensional vector of each candidate answer is obtained;
determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question;
and determining a final correct answer set according to the score of each candidate answer.
2. The method for multi-space knowledge-enhanced knowledge-graph question-answering according to claim 1, wherein the obtaining of the one-dimensional vector of the input question through feature extraction of the input question by a neural network based on a self-attention mechanism specifically comprises:
embedding input words of the input question into a matrix to obtain word vector representation of the input question;
adding a position offset into the word vector representation to obtain a word vector representation with position information;
inputting the word vector representation with the position information into a stacked self-attention neural network to obtain the one-dimensional vector of the input question; the stacked self-attention neural network comprises a multi-head attention module, a residual module and a normalization module.
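By way of illustration only (not part of the claims), the following is a minimal sketch of the feature-extraction step of claim 2, implemented as a standard stacked self-attention encoder in PyTorch; the layer sizes, the number of heads and the mean pooling used to obtain the one-dimensional question vector are assumptions for the example, not details disclosed by the application.

```python
import torch
import torch.nn as nn

class QuestionEncoder(nn.Module):
    """Stacked self-attention encoder: multi-head attention + residual connection + normalization."""
    def __init__(self, vocab_size, dim=100, heads=4, layers=2, max_len=64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)   # input word embedding matrix
        self.pos_emb = nn.Embedding(max_len, dim)       # position offset added to the word vectors
        self.blocks = nn.ModuleList([
            nn.ModuleDict({
                "attn": nn.MultiheadAttention(dim, heads, batch_first=True),
                "norm": nn.LayerNorm(dim),
            }) for _ in range(layers)
        ])

    def forward(self, token_ids):                        # token_ids: (batch, seq_len)
        pos = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.word_emb(token_ids) + self.pos_emb(pos)  # word vectors with position information
        for blk in self.blocks:
            attn_out, _ = blk["attn"](x, x, x)            # multi-head self-attention
            x = blk["norm"](x + attn_out)                 # residual + normalization
        return x.mean(dim=1)                              # pooled one-dimensional question vector

q_vec = QuestionEncoder(vocab_size=10000)(torch.randint(0, 10000, (1, 12)))
print(q_vec.shape)  # torch.Size([1, 100])
```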
3. The method according to claim 1, wherein the determining a plurality of candidate answers to the input question in the knowledge-graph specifically comprises:
identifying a subject entity of the input question to obtain the subject entity of the input question;
associating the subject entity of the input question with an entity node in a knowledge graph through an entity linking tool to determine a subject entity node;
identifying entity nodes within two hops of the subject entity node in the knowledge graph, and determining the subject entity node and the entity nodes within two hops of the subject entity node as the candidate answers; the entity nodes within two hops are the entity nodes whose shortest path to the subject entity node is less than or equal to 2.
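By way of illustration only (not part of the claims), a minimal sketch of the candidate-answer generation of claim 3, assuming the knowledge graph is available as an adjacency list and that entity linking has already produced the subject entity node; a breadth-first search is one straightforward way to collect all entities within two hops.

```python
from collections import deque

def candidate_answers(adjacency, subject_entity, max_hops=2):
    """Return the subject entity and all entities whose shortest path to it is <= max_hops."""
    seen = {subject_entity: 0}
    queue = deque([subject_entity])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:          # first visit gives the shortest-path length
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return set(seen)

# Toy example: Q -> {A, B}, A -> {C}, C -> {D}; D is three hops away and therefore excluded.
graph = {"Q": ["A", "B"], "A": ["C"], "C": ["D"]}
print(candidate_answers(graph, "Q"))  # {'Q', 'A', 'B', 'C'}
```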
4. The multi-space knowledge-enhanced knowledge graph question-answering method according to claim 1, wherein the embedding of the knowledge graph into euclidean space, complex vector space and hyperbolic space respectively comprises:
performing representation learning on the knowledge graph by adopting a TransE model of Euclidean space;
performing representation learning on the knowledge graph by adopting a RotatE model of complex vector space;
and performing representation learning on the knowledge graph by adopting a HyperKG model of hyperbolic space.
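By way of illustration only (not part of the claims), sketches of the triple scoring functions that characterize the three embedding models named in claim 4: TransE in Euclidean space, RotatE in complex vector space, and, for HyperKG, the standard Poincaré-ball distance on which hyperbolic embeddings are built; the latter is only an approximation of HyperKG's actual objective.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE (Euclidean space): h + r should lie close to t."""
    return -np.linalg.norm(h + r - t)

def rotate_score(h, r_phase, t):
    """RotatE (complex vector space): the relation acts as a rotation, t ≈ h ∘ e^{i·phase}."""
    rotation = np.exp(1j * r_phase)
    return -np.linalg.norm(h * rotation - t)

def poincare_distance(u, v, eps=1e-9):
    """Distance in the Poincaré ball, the hyperbolic geometry underlying models such as HyperKG."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2)) + eps
    return np.arccosh(1 + 2 * sq / denom)

h, r, t = np.random.randn(3, 50) * 0.1
print(transe_score(h, r, t))
print(rotate_score(h.astype(complex), np.random.uniform(-np.pi, np.pi, 50), t.astype(complex)))
print(poincare_distance(h * 0.1, t * 0.1))
```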
5. The method according to claim 1, wherein the obtaining of the answer vector of each candidate answer in each space by dynamically aggregating multifaceted information of the candidate answers according to the one-dimensional vector of the input question and each candidate answer in each knowledge graph under the space represented by entity information, path information, type information and context information in the knowledge graph under the respective space by using an answer-side attention network specifically comprises:
according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space represented by the entity information, path information, type information and context information in the knowledge graph under the respective space, using the formula

α_i = exp(η_i) / Σ_j exp(η_j)

to calculate the weights of the entity information, the path information, the type information and the context information; in the formula, α_i is the weight of information i, i and j each denote entity information, path information, type information or context information, and η_i, η_j are the unnormalized weight values of information i and information j, which are computed from the one-dimensional vector q of the input question, the representations v_1i, v_2i, v_3i of information i in the Euclidean space, the complex vector space and the hyperbolic space, a first weight matrix W_1 and a first offset b_1;
according to the weights of the entity information, the path information, the type information and the context information and each candidate answer in the knowledge graph under each space represented by the entity information, the path information, the type information and the context information in the knowledge graph under the respective space, using the formula

γ_kl = Σ_i α_i · v_ki

to calculate the answer vector of each candidate answer in each space; in the formula, γ_kl is the answer vector of the l-th candidate answer in the k-th space, v_ki is the representation of information i of the l-th candidate answer in the k-th space, and the k-th space is the Euclidean space, the complex vector space or the hyperbolic space.
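By way of illustration only (not part of the claims), a numerical sketch of the answer-side aggregation of claim 5. It assumes, consistently with the formula above, that the unnormalized weights η_i are softmax-normalized and that each η_i is produced by a linear layer over the question vector and the three spatial representations of information i; the exact form of that scoring layer is an assumption, not a disclosed detail.

```python
import numpy as np

def answer_side_attention(q, info_by_space, W1, b1):
    """info_by_space[k][i]: representation of information i (entity/path/type/context) in space k."""
    n_info = len(info_by_space[0])
    etas = []
    for i in range(n_info):
        feats = np.concatenate([q] + [info_by_space[k][i] for k in range(len(info_by_space))])
        etas.append(W1 @ feats + b1)                     # unnormalized weight of information i
    etas = np.array(etas)
    alphas = np.exp(etas - etas.max()) / np.exp(etas - etas.max()).sum()   # softmax normalization
    # answer vector of the candidate in each space: weighted sum over the four kinds of information
    return [sum(alphas[i] * info_by_space[k][i] for i in range(n_info))
            for k in range(len(info_by_space))]

dim = 8
q = np.random.randn(dim)
spaces = [[np.random.randn(dim) for _ in range(4)] for _ in range(3)]   # 3 spaces x 4 information types
W1, b1 = np.random.randn(4 * dim), 0.0
gammas = answer_side_attention(q, spaces, W1, b1)
print(len(gammas), gammas[0].shape)   # 3 (8,)
```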
6. The method according to claim 1, wherein the obtaining of the one-dimensional vector of each candidate answer according to the answer vector of each candidate answer in each space and each candidate answer in each knowledge graph in each space represented by entity information, path information, type information and context information in the knowledge graph in each space by dynamically aggregating the multi-space information of the knowledge graph using a multi-space attention network comprises:
according to the answer vector of each candidate answer in each space and each candidate answer in the knowledge graph under each space represented by the entity information, path information, type information and context information in the knowledge graph under the respective space, using the formula

β_k = exp(θ_k) / Σ_n exp(θ_n)

to calculate the weights of the Euclidean space, the complex vector space and the hyperbolic space; in the formula, β_k is the weight of the k-th space, and θ_k, θ_n are the unnormalized weight values of the k-th space and the n-th space, which are computed from the entity information representation v_1k, the path information representation v_2k, the type information representation v_3k and the context information representation v_4k of the k-th space, a second weight matrix W_2 and a second offset b_2;
according to the weights of the Euclidean space, the complex vector space and the hyperbolic space and the answer vector of each candidate answer in each space, using the formula

a_l = Σ_k β_k · γ_kl

to calculate the one-dimensional vector of each candidate answer; in the formula, a_l is the one-dimensional vector of the l-th candidate answer, and γ_kl is the answer vector of the l-th candidate answer in the k-th space.
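By way of illustration only (not part of the claims), the corresponding sketch of the multi-space aggregation of claim 6, with the same caveat: the softmax normalization of θ_k and the linear scoring layer over the four information representations of each space are assumptions consistent with, but not guaranteed by, the formula above.

```python
import numpy as np

def multi_space_attention(space_infos, gammas, W2, b2):
    """space_infos[k]: the four information representations of space k; gammas[k]: answer vector in space k."""
    thetas = np.array([W2 @ np.concatenate(space_infos[k]) + b2 for k in range(len(space_infos))])
    betas = np.exp(thetas - thetas.max()) / np.exp(thetas - thetas.max()).sum()   # weight of each space
    return sum(betas[k] * gammas[k] for k in range(len(gammas)))                  # one-dimensional vector a_l

dim = 8
space_infos = [[np.random.randn(dim) for _ in range(4)] for _ in range(3)]
gammas = [np.random.randn(dim) for _ in range(3)]
W2, b2 = np.random.randn(4 * dim), 0.0
a_l = multi_space_attention(space_infos, gammas, W2, b2)
print(a_l.shape)  # (8,)
```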
7. The method according to claim 1, wherein the determining the score of each candidate answer by an inner product according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question specifically comprises:
according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question, determining the score of each candidate answer in the form of an inner product using the formula S(q, a_l) = h(q, a_l);
in the formula, S(q, a_l) is the score of the l-th candidate answer, a_l is the one-dimensional vector of the l-th candidate answer, q is the one-dimensional vector of the input question, and h(·) is the inner product function.
8. The method according to claim 1, wherein the determining a final correct answer set according to the score of each candidate answer specifically comprises:
according to the score of each candidate answer, determining the final correct answer set using the formula A_q = {a_l | a_l ∈ C_q and S_max − S(q, a_l) < m};
in the formula, A_q is the final correct answer set, a_l is the one-dimensional vector of the l-th candidate answer, C_q is the candidate answer set consisting of the plurality of candidate answers, S_max is the highest score among the candidate answers, S(q, a_l) is the score of the l-th candidate answer, and m is a boundary parameter.
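By way of illustration only (not part of the claims), a short sketch covering the inner-product scoring of claim 7 and the margin-based answer selection of claim 8; the candidate vectors, the question vector and the boundary parameter m are illustrative values.

```python
import numpy as np

def select_answers(q, candidate_vectors, m=0.5):
    """Score each candidate with an inner product, then keep those within margin m of the best score."""
    scores = {name: float(np.dot(q, vec)) for name, vec in candidate_vectors.items()}
    s_max = max(scores.values())
    return {name for name, s in scores.items() if s_max - s < m}

q = np.array([0.2, 0.9, 0.1])
candidates = {"a1": np.array([0.1, 1.0, 0.0]),   # score 0.92
              "a2": np.array([0.2, 0.8, 0.3]),   # score 0.79
              "a3": np.array([1.0, 0.0, 0.0])}   # score 0.20
print(select_answers(q, candidates))             # {'a1', 'a2'}
```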
9. The multi-space knowledge-enhanced knowledge graph question-answering method according to claim 1, wherein the neural network based on the self-attention mechanism, the answer-side attention network and the multi-space attention network are trained as an integral model, and a mini-batch stochastic gradient descent algorithm is adopted as the optimizer during training;
the objective function of the overall model is:

min Σ_q Σ_{a ∈ R_q} Σ_{a' ∈ W_q} L(q, a, a')

wherein L(q, a, a') = [m − S(q, a) + S(q, a')]_+, R_q is the correct answer set, W_q is the wrong answer set, m is the boundary parameter, [z]_+ denotes taking the larger of 0 and z, S(q, a) is the score of the question and a correct answer, S(q, a') is the score of the question and a wrong answer, L(q, a, a') is the loss value of pairwise training, a is the vector representation of a correct answer, a' is the vector representation of a wrong answer, and q is the one-dimensional vector of the input question.
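By way of illustration only (not part of the claims), a sketch of the pairwise hinge loss L(q, a, a') = [m − S(q, a) + S(q, a')]_+ of claim 9, with S taken as the inner-product score; the mini-batch SGD loop that would minimize the summed loss is omitted, and the margin value is illustrative.

```python
import numpy as np

def pairwise_hinge_loss(q, correct_vec, wrong_vec, m=0.5):
    """[m - S(q, a) + S(q, a')]_+ : zero once the correct answer outscores the wrong one by margin m."""
    s_pos = np.dot(q, correct_vec)   # score of the correct answer
    s_neg = np.dot(q, wrong_vec)     # score of the wrong answer
    return max(0.0, m - s_pos + s_neg)

q = np.array([0.2, 0.9, 0.1])
print(pairwise_hinge_loss(q, np.array([0.1, 1.0, 0.0]), np.array([1.0, 0.0, 0.0])))  # 0.0
print(pairwise_hinge_loss(q, np.array([0.2, 0.8, 0.3]), np.array([0.1, 1.0, 0.0])))  # ≈ 0.63
```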
10. A multi-spatial knowledge-enhanced knowledge-graph question-answering system, the system comprising:
the characteristic extraction module is used for extracting the characteristics of the input question through a neural network based on a self-attention mechanism to obtain a one-dimensional vector of the input question;
the candidate answer determining module is used for determining a plurality of candidate answers of the input question in the knowledge graph;
the multi-space embedding module is used for respectively embedding the knowledge graph into Euclidean space, complex vector space and hyperbolic space, and expressing each candidate answer in the knowledge graph under each space by using entity information, path information, type information and context information in the knowledge graph under each space;
the network construction module is used for constructing an answer side attention network and a multi-space attention network;
the answer vector obtaining module is used for dynamically aggregating multi-aspect information of the candidate answers by using an answer side attention network according to the one-dimensional vector of the input question and each candidate answer in the knowledge graph under each space, which is represented by the entity information, the path information, the type information and the context information in the knowledge graph under each space, so as to obtain the answer vector of each candidate answer under each space;
a one-dimensional vector obtaining module, configured to dynamically aggregate multi-space information of the knowledge graph by using a multi-space attention network according to an answer vector of each candidate answer in each space and each candidate answer in the knowledge graph in each space, which is represented by entity information, path information, type information, and context information in the knowledge graph in each space, so as to obtain a one-dimensional vector of each candidate answer;
the score determining module is used for determining the score of each candidate answer in an inner product mode according to the one-dimensional vector of each candidate answer and the one-dimensional vector of the input question;
and the correct answer set determining module is used for determining a final correct answer set according to the score of each candidate answer.
CN202111552990.6A 2021-12-17 2021-12-17 Multi-space knowledge enhanced knowledge graph question-answering method and system Pending CN114254093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111552990.6A CN114254093A (en) 2021-12-17 2021-12-17 Multi-space knowledge enhanced knowledge graph question-answering method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111552990.6A CN114254093A (en) 2021-12-17 2021-12-17 Multi-space knowledge enhanced knowledge graph question-answering method and system

Publications (1)

Publication Number Publication Date
CN114254093A true CN114254093A (en) 2022-03-29

Family

ID=80795676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111552990.6A Pending CN114254093A (en) 2021-12-17 2021-12-17 Multi-space knowledge enhanced knowledge graph question-answering method and system

Country Status (1)

Country Link
CN (1) CN114254093A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114925186A (en) * 2022-05-24 2022-08-19 华侨大学 Knowledge graph question generation method, device, equipment and storage medium
CN114936293A (en) * 2022-06-08 2022-08-23 哈尔滨理工大学 Knowledge graph question-answering method based on improved EmbedKGQA model, electronic equipment and storage medium
CN114936293B (en) * 2022-06-08 2023-01-31 哈尔滨理工大学 Knowledge graph question-answering method based on improved EmbedKGQA model, electronic equipment and storage medium
CN116049695A (en) * 2022-12-20 2023-05-02 中国科学院空天信息创新研究院 Group perception and standing analysis method, system and electronic equipment crossing social network
CN116049695B (en) * 2022-12-20 2023-07-04 中国科学院空天信息创新研究院 Group perception and standing analysis method, system and electronic equipment crossing social network

Similar Documents

Publication Publication Date Title
CN111753024B (en) Multi-source heterogeneous data entity alignment method oriented to public safety field
CN114254093A (en) Multi-space knowledge enhanced knowledge graph question-answering method and system
CN111897944B (en) Knowledge graph question-answering system based on semantic space sharing
CN113191357B (en) Multilevel image-text matching method based on graph attention network
CN112988917B (en) Entity alignment method based on multiple entity contexts
CN109857457B (en) Function level embedding representation method in source code learning in hyperbolic space
CN112015868A (en) Question-answering method based on knowledge graph completion
CN111931505A (en) Cross-language entity alignment method based on subgraph embedding
CN113590799B (en) Weak supervision knowledge graph question-answering method based on multi-view reasoning
CN109145083B (en) Candidate answer selecting method based on deep learning
CN111598252B (en) University computer basic knowledge problem solving method based on deep learning
CN114048295A (en) Cross-modal retrieval method and system for data processing
CN112632250A (en) Question and answer method and system under multi-document scene
Lai et al. Transconv: Relationship embedding in social networks
CN116127099A (en) Combined text enhanced table entity and type annotation method based on graph rolling network
CN114743029A (en) Image text matching method
CN114942998A (en) Entity alignment method for sparse knowledge graph neighborhood structure fusing multi-source data
Bai et al. Bilinear Semi-Tensor Product Attention (BSTPA) model for visual question answering
CN114444694A (en) Open world knowledge graph complementing method and device
CN111339258B (en) University computer basic exercise recommendation method based on knowledge graph
CN110334204B (en) Exercise similarity calculation recommendation method based on user records
CN117521812B (en) Automatic arithmetic text question solving method and system based on variational knowledge distillation
Kolosov et al. Correction of vector representations of words to improve the semantic similarity
WO2024007119A1 (en) Training method for text processing model, and text processing method and device
Liu Application Optimization of University Aesthetic Education Resources Based on Few-Shot Learning from the Perspective of Ecological Aesthetic Education

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination