CN113688217B

CN113688217B - Intelligent question and answer method oriented to search engine knowledge base

Info

Publication number: CN113688217B
Application number: CN202110972592.3A
Authority: CN
Inventors: 舒明雷; 刘浩; 周书旺; 高天雷; 许继勇
Original assignee: Qilu University of Technology; Shandong Institute of Artificial Intelligence
Current assignee: Qilu University of Technology; Shandong Institute of Artificial Intelligence
Priority date: 2021-08-24
Filing date: 2021-08-24
Publication date: 2022-04-22
Anticipated expiration: 2041-08-24
Also published as: CN113688217A

Abstract

An intelligent question-answering method oriented to a search engine knowledge base. Based on the search engine knowledge base, the method constructs an inference path satisfying the Markov property through dynamic search-path reasoning and deep reinforcement learning, encodes path information and calculates the probability distribution over the action space with an LSTM and a feedforward neural network, and sets a judgment condition according to the vector representation of the search inference path. Knowledge-base-backed intelligent question answering for the search engine is realized from the judgment condition and the inference path followed when searching for the answer to a question. The method requires neither hand-defined rules nor a limit on the length of the inference path, can be applied to complex question-answer reasoning over a search engine knowledge base, and achieves efficient and accurate intelligent question answering for search engines.

Description

Intelligent question and answer method oriented to search engine knowledge base

Technical Field

The invention relates to the field of intelligent question answering for search engines, and in particular to an intelligent question-answering method oriented to a search engine knowledge base.

Background

Intelligent question answering for a search engine means that, given a natural language question posed to the search engine, the corresponding entity is retrieved from an existing knowledge base as the answer to the question. Specifically, entity recognition and relation extraction are performed on the question, the question is linked to the corresponding entities and relations in the knowledge base, and candidate answers are queried, matched and reasoned over to obtain the target answer. At present, the reasoning process of intelligent question answering in the search engine field mainly faces the following problems:

1) Traditional semantic parsing and information retrieval methods require a large number of templates to be written by hand or a large number of rules to be defined.

2) Some deep learning methods can only handle simple questions, cannot be applied to complex reasoning processes, and consume a large amount of computing resources.

Disclosure of Invention

In order to overcome the deficiencies of the above technologies, the invention provides an efficient and accurate intelligent question-answering method oriented to a search engine knowledge base.

The technical solution adopted by the invention to overcome the above technical problem is as follows:

An intelligent question-answering method oriented to a search engine knowledge base comprises the following steps:

a) A question searched in the search engine and its corresponding answer are set; the entity e_s of the question and the query relation r_q of the question are obtained through a natural language processing tool, and the answer corresponding to the question is set as e_o; using natural language processing Embedding, the question entity e_s, the query relation r_q and the answer e_o are mapped into a dense low-dimensional vector space, giving the vector representation ε_s of the question entity, the vector representation γ_q of the query relation and the vector representation ε_o of the answer entity;

b) For the entity e_s and the relation r_q extracted from the search question, the path for inferring the answer to the question based on the search engine knowledge base is defined as ((r₀,e_s),(r₁,e₁),…,(r_n,e_n)), where e_i (i = 1,…,n) denotes the i-th entity in the path, n is the maximum inference path length, r_i (i = 1,…,n) denotes the i-th relation in the inference path, and r₀ is an introduced redundant relation that forms an action together with the question entity e_s; the tuple (e, r) formed by an entity e and a relation r traversed while searching the search engine knowledge base is defined as an action a, and the set of all actions is defined as the action space A;

c) The action a on the search path in the search engine knowledge base is mapped into a dense low-dimensional vector space through the Embedding technique of natural language processing, giving the vector representation α = (γ; ε) of the action a, where ";" is the vector concatenation operation, γ is the vector representation of the relation r and ε is the vector representation of the entity e; the vector α_t of the action a at time step t is input into a long short-term memory network (LSTM), and the vector representation h_t of the historical memory information at time step t is obtained by the formula h_t = LSTM(h_{t-1}, α_t), where h_{t-1} is the vector representation of the historical memory information at time step t-1;

d) The action score π_θ(a_t) for the action space A_t at time step t is calculated by the formula

[formula not reproduced]

in which the vector representation of the action space A_t is obtained through natural language processing Embedding mapping, W₁ and W₂ are weights of the network model, ReLU(·) is the ReLU function, softmax(·) is the softmax function, the product used is the matrix product, and ε_t is the vector representation corresponding to time step t of the path for inferring the answer in the search engine knowledge base; the action with the maximum action score is selected and recorded as a tuple consisting of the relation and the entity corresponding to that action;

e) Based on the search engine knowledge base, steps c) to d) are repeated, and the inference search path obtained is formally defined as

[formula not reproduced]

whose i-th element (i = 1,…,n) is the action tuple with the maximum action score at time step i, consisting of the relation and the entity corresponding to the maximum action score at time step i;

f) Based on the search path in the search engine knowledge base, the reward value R_t of the inferred answer corresponding to the current inference search path is set: if the inferred answer is equal to the answer e_o corresponding to the question, the reward value R_t = 1; if the inferred answer is not equal to the answer e_o corresponding to the question, the reward value R_t is calculated by the formula

[formula not reproduced]

where d is the cosine similarity function and the vector representations appearing in the formula are obtained by Embedding;

g) The maximum reward value R_u over the whole path for inferring the answer to the question in the search engine knowledge base is calculated by the formula

[formula not reproduced];

h) The parameter gradient is defined as

[formula not reproduced]

where the gradient is taken with respect to the network model parameter θ and R = R_u; the parameters of the LSTM network and the feedforward neural network are optimized through gradient backpropagation;

i) Steps a) to h) are repeated over the whole data set of questions and corresponding answers based on the search engine knowledge base to complete model training, yielding a multi-hop inference model capable of prediction and reasoning on questions;

j) A question is input into the multi-hop inference model; if the question has a definite reference answer, the inferred answer is obtained through steps a) to f) and it is judged whether the inferred answer is equal to the true answer; if the question has no reference answer to compare against, the judgment condition applied by the search inference path to the answer is set as

[formula not reproduced]

where λ is a hyperparameter; if the judgment condition is satisfied, the predicted answer entity e_t corresponding to the predicted answer entity vector ε_t is taken as the answer to the search question, and the reasoning process exits.
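By way of illustration only, the action scoring of step d) may be sketched as follows. The scoring formula itself is not reproduced in this text, so the sketch assumes one common form: a two-layer feedforward network (weights W1, W2) applied to the concatenation of h_t, ε_t and γ_q, whose output is multiplied with the stacked action vectors and normalized by softmax. The concatenation order, the dimensions, the random weights and all function names are assumptions made for this example, and the LSTM output h_t is simply taken as given.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(scores):
    m = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def action_scores(A_t, h_t, eps_t, gamma_q, W1, W2):
    """Assumed policy: softmax(A_t x W2 ReLU(W1 [h_t; eps_t; gamma_q]))."""
    state = h_t + eps_t + gamma_q          # vector concatenation
    hidden = relu(matvec(W1, state))
    query = matvec(W2, hidden)             # same size as one action vector
    logits = matvec(A_t, query)            # one logit per candidate action
    return softmax(logits)

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
dim = 4                                    # hypothetical embedding size
h_t = [rng.uniform(-1, 1) for _ in range(dim)]
eps_t = [rng.uniform(-1, 1) for _ in range(dim)]
gamma_q = [rng.uniform(-1, 1) for _ in range(dim)]
# Three candidate actions, each alpha = (gamma; epsilon) of size 2 * dim.
A_t = random_matrix(3, 2 * dim, rng)
W1 = random_matrix(8, 3 * dim, rng)
W2 = random_matrix(2 * dim, 8, rng)

pi = action_scores(A_t, h_t, eps_t, gamma_q, W1, W2)
best = max(range(len(pi)), key=lambda i: pi[i])   # index of the selected action
```

Here `pi` is a probability distribution over the three candidate actions and `best` is the index of the action with the maximum score, corresponding to the selection described in step d).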

Further, the natural language processing tool in step a) is a HanLP natural language processing tool or a deep natural language processing tool.
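Once the tool has extracted e_s and r_q, the Embedding mapping of steps a) and c) may be pictured with the following minimal sketch. The `EmbeddingTable` class, the embedding size and the example question are hypothetical; a real system would train these vectors rather than draw them at random.

```python
import random

class EmbeddingTable:
    """Maps symbols (entities or relations) to fixed dense vectors."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self._vectors = {}

    def __call__(self, symbol):
        # Create the vector on first use so repeated lookups are stable.
        if symbol not in self._vectors:
            self._vectors[symbol] = [self._rng.gauss(0.0, 0.1) for _ in range(self.dim)]
        return self._vectors[symbol]

entity_emb = EmbeddingTable(dim=4, seed=1)
relation_emb = EmbeddingTable(dim=4, seed=2)

# Hypothetical question: "Which city is the capital of France?"
e_s, r_q, e_o = "France", "capital", "Paris"
eps_s, gamma_q, eps_o = entity_emb(e_s), relation_emb(r_q), entity_emb(e_o)

def action_vector(relation, entity):
    """alpha = (gamma; epsilon): relation vector concatenated with entity vector."""
    return relation_emb(relation) + entity_emb(entity)

alpha = action_vector("capital", "Paris")
```

The resulting `alpha` has twice the embedding size, its first half being the relation vector γ and its second half the entity vector ε.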

Preferably, λ in step j) is 0.5.

The beneficial effects of the invention are as follows: based on a search engine knowledge base, an inference path satisfying the Markov property is constructed through dynamic search-path reasoning and deep reinforcement learning, path information encoding and calculation of the probability distribution over the action space are realized with an LSTM and a feedforward neural network, and a judgment condition is set according to the vector representation of the search inference path. Knowledge-base-backed intelligent question answering for the search engine is realized from the judgment condition and the inference path followed when searching for the answer to a question. The method requires neither hand-defined rules nor a limit on the length of the inference path, can be applied to complex question-answer reasoning over a search engine knowledge base, and achieves efficient and accurate intelligent question answering for search engines.
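As a non-limiting illustration of the reward used in the reinforcement learning of steps f) and g), the sketch below assumes that, when the inferred answer differs from e_o, the reward R_t is the cosine similarity d between the two embedded entities, and that R_u is the largest R_t along the path. The reward formulas are not reproduced in this text, so both choices, the function names and the two-dimensional example embeddings are assumptions.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def step_reward(pred_entity, gold_entity, embed):
    """R_t = 1 on an exact match, else (assumed) cosine similarity of embeddings."""
    if pred_entity == gold_entity:
        return 1.0
    return cosine_similarity(embed[pred_entity], embed[gold_entity])

def max_path_reward(path_entities, gold_entity, embed):
    """Return (R_u, u): the largest step reward and its 1-based time step."""
    rewards = [step_reward(e, gold_entity, embed) for e in path_entities]
    u = max(range(len(rewards)), key=lambda t: rewards[t])
    return rewards[u], u + 1

# Hypothetical 2-d embeddings for three entities.
embed = {"Lyon": [1.0, 0.0], "Marseille": [0.0, 1.0], "Paris": [1.0, 1.0]}
R_u, u = max_path_reward(["Marseille", "Lyon", "Paris"], "Paris", embed)
```

In this example the third entity on the path matches the reference answer, so the maximum reward is reached at the third time step.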

Drawings

FIG. 1 is a flow diagram of a multi-hop inference model of the present invention.

Detailed Description

The invention is further described below with reference to fig. 1.

b) For the entity e_s and the relation r_q extracted from the search question, the path for inferring the answer to the question based on the search engine knowledge base is defined as ((r₀,e_s),(r₁,e₁),…,(r_n,e_n)), where e_i (i = 1,…,n) denotes the i-th entity in the path, n is the maximum inference path length, r_i (i = 1,…,n) denotes the i-th relation in the inference path, and r₀ is an introduced redundant relation that forms an action together with the question entity e_s; the tuple (e, r) formed by an entity e and a relation r traversed while searching the search engine knowledge base is defined as an action a, and the set of all actions is defined as the action space A;

c) The action a on the search path in the search engine knowledge base is mapped into a dense low-dimensional vector space through the Embedding technique of natural language processing, giving the vector representation α = (γ; ε), where ";" is the vector concatenation operation, γ is the vector representation of the relation r and ε is the vector representation of the entity e; the vector α_t of the action a at time step t is input into a long short-term memory network (LSTM), and the vector representation h_t of the historical memory information at time step t is obtained by the formula h_t = LSTM(h_{t-1}, α_t), where h_{t-1} is the vector representation of the historical memory information at time step t-1;

d) The action score for the action space at time step t is calculated by the formula

[formula not reproduced]

in which the product used is the matrix product and ε_t is the vector representation corresponding to time step t of the path for inferring the answer in the search engine knowledge base; the action with the maximum action score is selected and recorded as a tuple consisting of the relation and the entity corresponding to that action;

e) Based on the search engine knowledge base, steps c) to d) are repeated, and the inference search path obtained is formally defined as

[formula not reproduced]

whose i-th element consists of the relation and the entity corresponding to the maximum action score at time step i;

f) If the inferred answer is not equal to the answer e_o corresponding to the question, the reward value R_t is calculated by the formula

[formula not reproduced]

where d is the cosine similarity function and the vector representations appearing in the formula are obtained by Embedding;

g) The maximum reward value R_u over the whole path for inferring the answer to the question in the search engine knowledge base is calculated by the formula

[formula not reproduced]

where R = R_u and T = u, T being the time step corresponding to the maximum reward value;

h) The parameter gradient is defined as

[formula not reproduced];

j) A question is input into the multi-hop inference model; if the question has a definite reference answer, the inferred answer is obtained through steps a) to f) and it is judged whether the inferred answer is equal to the true answer.
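For illustration, the path structure of step b) and the repeated selection of step e) may be sketched on a toy knowledge base. The knowledge base contents, the `score` function (a hand-written stand-in for the learned action score) and the `max_steps` parameter are all hypothetical.

```python
# Toy knowledge base: entity -> list of outgoing (relation, entity) edges.
KB = {
    "France": [("capital", "Paris"), ("neighbor", "Spain")],
    "Spain": [("capital", "Madrid")],
    "Paris": [("located_in", "France")],
    "Madrid": [("located_in", "Spain")],
}

def score(action, query_relation):
    """Hypothetical stand-in for the learned action score."""
    relation, _entity = action
    return 1.0 if relation == query_relation else 0.0

def infer_path(e_s, r_q, max_steps):
    """Greedy search: start from (r_0, e_s) and follow the best-scoring action."""
    path = [("r_0", e_s)]                  # r_0 is the introduced redundant relation
    current = e_s
    for _ in range(max_steps):
        actions = KB.get(current, [])      # action space available at this time step
        if not actions:
            break
        best = max(actions, key=lambda a: score(a, r_q))
        path.append(best)
        current = best[1]
    return path

path = infer_path("France", "capital", max_steps=1)
```

The returned list has the same shape as the path of step b): it begins with the pair formed by the redundant relation and the question entity, followed by one (relation, entity) pair per time step.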

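The gradient formula of step h) is not reproduced in this text. For orientation only, a textbook REINFORCE-style policy gradient that is consistent with the surrounding definitions (the action score π_θ(a_t), the network parameter θ and R = R_u) would read as follows; this is an assumed standard form, not necessarily the formula of the invention, and the objective symbol J is introduced here for the example.

```latex
\nabla_\theta J(\theta) \approx R \sum_{t=1}^{n} \nabla_\theta \log \pi_\theta(a_t), \qquad R = R_u
```

Under this form, a path whose best step earns a larger reward R_u increases the log-probability of every action selected along that path more strongly when the parameters of the LSTM and the feedforward network are updated.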

Preferably, λ in step j) is 0.5.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. An intelligent question-answering method oriented to a search engine knowledge base is characterized by comprising the following steps:

c) mapping the action a on the search path in the search engine knowledge base into a dense low-dimensional vector space through the Embedding technique of natural language processing to obtain the vector representation α = (γ; ε); inputting the vector α_t of the action a at time step t into a long short-term memory network (LSTM), and obtaining the vector representation h_t of the historical memory information at time step t by the formula h_t = LSTM(h_{t-1}, α_t), where h_{t-1} is the vector representation of the historical memory information at time step t-1;

d) calculating the action score by the formula

[formula not reproduced]

and selecting the action with the maximum action score, recorded as a tuple consisting of the relation and the entity corresponding to that action;

e) formally defining the inference search path obtained as

[formula not reproduced]

whose i-th element (i = 1,…,n) is the action tuple with the maximum action score at time step i, consisting of the relation and the entity corresponding to the maximum action score at time step i;

f) if the inferred answer is not equal to the answer e_o corresponding to the question, calculating the reward value R_t by the formula

[formula not reproduced]

where d is the cosine similarity function and the vector representations appearing in the formula are obtained by Embedding;

g) calculating the maximum reward value by the formula

[formula not reproduced];

h) defining the parameter gradient as

[formula not reproduced];

j) judging whether the inferred answer is equal to the true answer.

2. The intelligent question-answering method oriented to the search engine knowledge base according to claim 1, characterized in that: the natural language processing tool in the step a) is a HanLP natural language processing tool or a deep natural language processing tool.

3. The intelligent question-answering method oriented to the search engine knowledge base according to claim 1, characterized in that: λ = 0.5 in step j).