Disclosure of Invention
In order to solve the above problems, it is an object of the present invention to provide an answer selection algorithm based on an enhanced question importance representation.
In order to achieve this object, the invention provides the following technical solution: an answer selection algorithm based on an enhanced question importance representation, comprising the steps of:
S1, coding the question and the answer through a BiLSTM coding layer;
S2, re-weighting the encoded question with a self-attention mechanism to regenerate the question vector and obtain a new question vector;
S3, aligning the question and the answer at the word level by using a word-level similarity matrix;
S4, capturing semantic information at multiple granularities, and comparing vectors of different granularities;
and S5, extracting the fused features through a multi-layer CNN to obtain the best answer.
Preferably, in step S1, the question is denoted Q and the answer is denoted A, and $H_q = \{h_{q1}, \ldots, h_{qm}\}$ and $H_a = \{h_{a1}, \ldots, h_{an}\}$ represent the question sentence vector and the answer sentence vector, where $h_{qi}$ is the embedding of the i-th word of sentence $H_q$, and m and n represent the lengths of the question and the answer, respectively.
The question and the answer capture sentence-context information through the BiLSTM coding layer. The hidden-layer dimension of the LSTM is u; the word embedding at time t is $x_t$, the hidden state and the memory cell at the previous time step are $h_{t-1}$ and $c_{t-1}$, and the hidden state $h_t$ and the memory cell $c_t$ at the current time step are calculated as follows:

$g_t = \phi(W_g x_t + V_g h_{t-1} + b_g)$,
$i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)$,
$f_t = \sigma(W_f x_t + V_f h_{t-1} + b_f)$,
$o_t = \sigma(W_o x_t + V_o h_{t-1} + b_o)$,
$c_t = g_t \odot i_t + c_{t-1} \odot f_t$,
$h_t = c_t \odot o_t$,

where the W, V, and b terms are trainable weight matrices and bias vectors, $\sigma$ and $\phi$ are the sigmoid function and the tanh function respectively, and $\odot$ denotes element-wise multiplication of two vectors. The input gate i, the forget gate f, and the output gate o automatically control the flow of information, the memory cell $c_t$ can remember long-distance information, and $h_t$ is the vector representation at time t.
Preferably, in step S2, the question sentence $T_q = \{t_{q1}, \ldots, t_{qm}\}$ and the answer sentence $T_a = \{t_{a1}, \ldots, t_{an}\}$ obtained in step S1 are processed: the weight of each word in the question is calculated as $\alpha_q = \mathrm{sigmoid}(v)$, and the question representation is updated with these weights to generate a new question vector representation.
Preferably, the word-level matrix is calculated as

$M(i,j) = U_q(i)\, T_a(j)^{\top}$,

where each row of the word-level matrix gives the influence of a word in the question on each word in the answer. The rows and the columns of the word-level matrix are normalized by a softmax function to obtain the mutual influence factors $\lambda_q(i,j)$ and $\lambda_a(i,j)$, whose values all lie in $[0,1]$; the question vector and the answer vector are multiplied by the corresponding influence factors to obtain two new vectors $E_q$ and $E_a$.
Preferably, in step S4, the original question vector is denoted Q and its representation after the attention alignment layer is denoted $Q'$; the original answer vector is denoted A and its representation after the attention alignment layer is denoted $A'$. Vector subtraction reflects the Euclidean distance between two vectors, while vector multiplication approximates their cosine distance; the original vectors are therefore compared with the aligned vectors by element-wise subtraction and multiplication, and the comparison results are fused into the contents $K_q$ and $K_a$.
Preferably, the calculation in step S5 is

$u = \mathrm{CNN}(\mathrm{Fuse})$,

where Fuse represents the fusion content $K_q$ or the fusion content $K_a$. The output u of the CNN is passed through maximum pooling and average pooling to obtain $S_{q,max}$, $S_{a,max}$, $S_{q,mean}$ and $S_{a,mean}$, which are then spliced into a vector S; a final prediction vector is derived from S by a multi-layer perceptron (MLP) and converted into a score vector, and training reduces the difference between the probability distribution of the predicted values and the probability distribution of the label values.
Compared with the prior art, the invention has the following beneficial effects: aiming at noise words in sentences, the answer selection algorithm based on an enhanced question importance representation uses a self-attention mechanism to reassign different weights to different words, thereby generating a "clean" question sentence vector, and uses a word-level interaction matrix to capture fine-grained semantic information between the question sentence and the answer sentence, thereby mitigating the influence of noise words in the answer sentence.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-2, the present invention provides a technical solution: an answer selection algorithm based on an enhanced question importance representation, comprising the steps of:
S1, coding the question and the answer through a BiLSTM coding layer. In step S1, the question is denoted Q and the answer is denoted A, and $H_q = \{h_{q1}, \ldots, h_{qm}\}$ and $H_a = \{h_{a1}, \ldots, h_{an}\}$ represent the question sentence vector and the answer sentence vector, where $h_{qi}$ is the embedding of the i-th word of sentence $H_q$, and m and n represent the lengths of the question and the answer, respectively.
The question and the answer capture sentence-context information through the BiLSTM coding layer. The hidden-layer dimension of the LSTM is u; the word embedding at time t is $x_t$, the hidden state and the memory cell at the previous time step are $h_{t-1}$ and $c_{t-1}$, and the hidden state $h_t$ and the memory cell $c_t$ at the current time step are calculated as follows:

$g_t = \phi(W_g x_t + V_g h_{t-1} + b_g)$,
$i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)$,
$f_t = \sigma(W_f x_t + V_f h_{t-1} + b_f)$,
$o_t = \sigma(W_o x_t + V_o h_{t-1} + b_o)$,
$c_t = g_t \odot i_t + c_{t-1} \odot f_t$,
$h_t = c_t \odot o_t$,

where the W, V, and b terms are trainable weight matrices and bias vectors, $\sigma$ and $\phi$ are the sigmoid function and the tanh function respectively, and $\odot$ denotes element-wise multiplication of two vectors. The input gate i, the forget gate f, and the output gate o automatically control the flow of information, the memory cell $c_t$ can remember long-distance information, and $h_t$ is the vector representation at time t.
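For illustration only, the following is a minimal PyTorch sketch of the step S1 BiLSTM encoding; the embedding dimension d, the hidden size u, and the random tensors standing in for real word embeddings are assumptions, not values fixed by the embodiment:

```python
# Minimal sketch of the S1 BiLSTM coding layer (PyTorch).
# d and u are assumed values; random tensors stand in for word embeddings.
import torch
import torch.nn as nn

d, u = 300, 150  # assumed embedding dimension and LSTM hidden size
bilstm = nn.LSTM(input_size=d, hidden_size=u,
                 batch_first=True, bidirectional=True)

H_q = torch.randn(2, 20, d)  # (batch, m, d) question word embeddings
H_a = torch.randn(2, 40, d)  # (batch, n, d) answer word embeddings

T_q, _ = bilstm(H_q)  # (batch, m, 2u) context-aware question vectors
T_a, _ = bilstm(H_a)  # (batch, n, 2u) context-aware answer vectors
```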
S2, the encoded question is re-weighted through a self-attention mechanism to regenerate the question vector and obtain a new question vector. In step S2, the question sentence $T_q = \{t_{q1}, \ldots, t_{qm}\}$ and the answer sentence $T_a = \{t_{a1}, \ldots, t_{an}\}$ obtained in step S1 are processed: the weight of each word in the question is calculated as $\alpha_q = \mathrm{sigmoid}(v)$, and the question representation is updated with these weights to generate a new question vector representation.
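Since the embodiment only states that $\alpha_q = \mathrm{sigmoid}(v)$, the derivation of the score v below through a learned linear layer over each word vector is an assumption made for illustration:

```python
# Sketch of the S2 self-attention re-weighting. The linear scorer that
# produces v is a hypothetical choice; the patent does not define v.
import torch
import torch.nn as nn

u = 150
score = nn.Linear(2 * u, 1)  # assumed scorer producing v per word

def reweight_question(T_q):
    # T_q: (batch, m, 2u) encoded question from step S1
    v = score(T_q).squeeze(-1)          # (batch, m) one score per word
    alpha_q = torch.sigmoid(v)          # per-word weight in [0, 1]
    U_q = T_q * alpha_q.unsqueeze(-1)   # "clean" re-weighted question
    return U_q
```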
S3, establishing a word-level similarity matrix for the question and the answer and aligning them at the word level. The word-level matrix is calculated as

$M(i,j) = U_q(i)\, T_a(j)^{\top}$,

where each row of the word-level matrix gives the influence of a word in the question on each word in the answer. The rows and the columns of the word-level matrix are normalized by a softmax function to obtain the mutual influence factors $\lambda_q(i,j)$ and $\lambda_a(i,j)$, whose values all lie in $[0,1]$; the question vector and the answer vector are multiplied by the corresponding influence factors to obtain two new vectors $E_q$ and $E_a$.
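A sketch of the step S3 alignment follows; reading "multiplying with the influence factors" as a weighted sum over the other sentence's word vectors, and the pairing of $\lambda_q$ and $\lambda_a$ with the two normalization directions, are assumptions:

```python
# Sketch of the S3 word-level similarity matrix and alignment. The
# interpretation of lambda_q / lambda_a as soft-alignment weights over
# the opposite sentence is an assumption.
import torch
import torch.nn.functional as F

def word_level_align(U_q, T_a):
    # U_q: (batch, m, 2u) re-weighted question; T_a: (batch, n, 2u) answer
    M = torch.bmm(U_q, T_a.transpose(1, 2))  # M(i, j) = U_q(i) . T_a(j)
    lam_q = F.softmax(M, dim=2)  # rows: each question word over answer words
    lam_a = F.softmax(M, dim=1)  # cols: each answer word over question words
    E_q = torch.bmm(lam_q, T_a)                  # (batch, m, 2u)
    E_a = torch.bmm(lam_a.transpose(1, 2), U_q)  # (batch, n, 2u)
    return E_q, E_a
```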
S4, capturing semantic information at multiple granularities, and fusing and comparing vectors of different granularities. The original question vector is denoted Q and its representation after the attention alignment layer is denoted $Q'$; the original answer vector is denoted A and its representation after the attention alignment layer is denoted $A'$. Vector subtraction reflects the Euclidean distance between two vectors, while vector multiplication approximates their cosine distance; the original vectors are therefore compared with the aligned vectors by element-wise subtraction and multiplication, and the comparison results are fused into the contents $K_q$ and $K_a$.
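The step S4 comparison and fusion can be sketched as below, under the assumption that the fusion content concatenates the original vector, the aligned vector, and their element-wise difference and product:

```python
# Sketch of the S4 multi-granularity comparison and fusion. The
# concatenation layout of K is an assumed fusion scheme.
import torch

def fuse(orig, aligned):
    # orig, aligned: (batch, len, 2u) original and attention-aligned vectors
    diff = orig - aligned  # subtraction: Euclidean-distance cue
    prod = orig * aligned  # multiplication: cosine-distance cue
    return torch.cat([orig, aligned, diff, prod], dim=-1)  # (batch, len, 8u)

# K_q = fuse(Q, Q_aligned); K_a = fuse(A, A_aligned)
```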
S5, extracting the fused features through a multi-layer CNN to obtain the best answer. The calculation is

$u = \mathrm{CNN}(\mathrm{Fuse})$,

where Fuse represents the fusion content $K_q$ or the fusion content $K_a$. The output u of the CNN is passed through maximum pooling and average pooling to obtain $S_{q,max}$, $S_{a,max}$, $S_{q,mean}$ and $S_{a,mean}$, which are then spliced into a vector S; a final prediction vector is derived from S by a multi-layer perceptron (MLP) and converted into a score vector, and training reduces the difference between the probability distribution of the predicted values and the probability distribution of the label values.
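Finally, a sketch of the step S5 feature extraction and scoring; the kernel sizes, channel counts, number of convolutional layers, and the two-class score vector are assumptions, as the embodiment does not fix them:

```python
# Sketch of the S5 multi-layer CNN, max/mean pooling, and MLP scorer.
# All layer hyperparameters are assumed values.
import torch
import torch.nn as nn

class Scorer(nn.Module):
    def __init__(self, in_dim, hid=128):
        super().__init__()
        self.cnn = nn.Sequential(  # "multiple layers of CNNs"
            nn.Conv1d(in_dim, hid, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hid, hid, kernel_size=3, padding=1), nn.ReLU())
        self.mlp = nn.Sequential(nn.Linear(4 * hid, hid), nn.ReLU(),
                                 nn.Linear(hid, 2))  # assumed 2-way score

    def forward(self, K_q, K_a):
        # K_q, K_a: (batch, len, in_dim); Conv1d wants (batch, channels, len)
        u_q = self.cnn(K_q.transpose(1, 2))
        u_a = self.cnn(K_a.transpose(1, 2))
        S = torch.cat([u_q.max(dim=2).values, u_a.max(dim=2).values,
                       u_q.mean(dim=2), u_a.mean(dim=2)], dim=-1)
        return self.mlp(S)

# Training would then minimize a divergence (e.g. cross-entropy) between
# the predicted distribution and the label distribution, per the text above.
```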
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes and modifications can be made to the embodiments, and that equivalents may be substituted for some of their features, without departing from the spirit and scope of the invention.