CN108304587B - Community question-answering platform answer sorting method - Google Patents


Info

Publication number: CN108304587B
Authority: CN (China)
Prior art keywords: vector, question, answer, answers, vector sequence
Legal status: Active (granted; the legal status listed is an assumption, not a legal conclusion)
Application number: CN201810186972.2A
Other languages: Chinese (zh)
Other versions: CN108304587A (en)
Inventors: 陈恩红 (Enhong Chen), 刘淇 (Qi Liu), 金斌斌 (Binbin Jin), 赵洪科 (Hongke Zhao), 童世炜 (Shiwei Tong)
Current Assignee: University of Science and Technology of China (USTC)
Original Assignee: University of Science and Technology of China (USTC)
Filing date: 2018-03-07
Priority date: 2018-03-07
Publication date (grant): 2020-10-27
Application filed by University of Science and Technology of China (USTC); priority to CN201810186972.2A

Classifications

    • G06F16/951 — Information retrieval; Retrieval from the web; Indexing; Web crawling techniques
    • G06F16/3329 — Information retrieval of unstructured textual data; Querying; Query formulation; Natural language query formulation or dialogue systems
    • G06N20/00 — Machine learning
    • G06N3/045 — Neural networks; Architecture; Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an answer ranking method for community question-answering platforms, which solves the answer ranking problem by fully exploiting the rich metadata of questions and answers (such as topics, timestamps, and the text content of questions and answers). The method ranks answers with an enhanced attention-based recurrent neural network (EARNN) model, which uses more information than traditional models; the prediction results are improved to some extent on multiple evaluation metrics.

Description

Community question-answering platform answer sorting method
Technical Field
The invention relates to the fields of machine learning and question-answering systems, and in particular to an answer ranking method for community question-answering platforms.
Background
Community question-answering platforms, such as Baidu Zhidao and Sogou Wenwen, provide online question-and-answer services that help Internet users quickly obtain high-quality answers to everyday or professional questions. As community question answering has grown in popularity, problems have gradually emerged on these platforms. One of them is that answers vary widely in quality, and low-quality answers greatly degrade the user experience. Although these platforms now provide a "like" (upvote) mechanism, the like counts of new questions and answers are still unstable because little time has passed since they were posted, so they do not yet reflect answer quality. How to effectively measure answer quality is therefore a pressing research problem for community question-answering platforms.
Around this research problem, researchers have proposed many approaches, among which "answer ranking" is an effective way to help users quickly pick out high-quality answers from a set of answers of varying quality. Related research has mainly focused on lexical, syntactic, or semantic matching between questions and answers, while ignoring the positive influence of metadata such as topics and timestamps on the answer ranking problem.
Disclosure of Invention
The invention aims to provide an answer ranking method for community question-answering platforms that solves the answer ranking problem by fully exploiting the rich metadata of questions and answers (such as topics, timestamps, and the text content of questions and answers).
The purpose of the invention is realized by the following technical scheme:
a community question-answering platform answer sorting method comprises the following steps:
crawling a certain amount of data from a community question-answering platform website, wherein the data crawled for a question comprises: the text content of the question, the topic to which the question belongs, the text content of the series of answers corresponding to the question, the timestamp of each answer, and the number of likes for each answer;

constructing an enhanced attention-mechanism recurrent neural network model based on the text content of each crawled question, the topic to which the question belongs, and the text content of the series of answers corresponding to the question, and then combining the timestamp of each answer to compute ranking scores for the quality of the answers; training the enhanced attention-mechanism recurrent neural network model by combining the ranking scores with a preset time-sensitive objective function and using a question-dependent pairwise training strategy;

for a new question and its series of answers, constructing a series of instances from the text content of the new question, the topic to which the new question belongs, the text content of the series of answers corresponding to the new question, and the timestamp of each answer; inputting the instances in turn into the trained enhanced attention-mechanism recurrent neural network model to obtain a series of ranking scores; and ranking the corresponding answers from front to back according to the ranking scores.
According to the technical scheme provided by the invention, the answer ranking problem is solved with an enhanced attention-mechanism recurrent neural network (EARNN), which uses more information than traditional models; the prediction results are improved to some extent on multiple evaluation metrics.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a flowchart of a community question-answering platform answer sorting method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiment of the invention provides a community question-answering platform answer ranking method which, as shown in Fig. 1, mainly comprises the following steps:
Step 1: crawling a certain amount of data from a community question-answering platform website, wherein the data crawled for a question comprises: the text content of the question, the topic to which the question belongs, the text content of the series of answers corresponding to the question, the timestamp of each answer, and the number of likes for each answer.

Step 2: constructing an enhanced attention-mechanism recurrent neural network model based on the text content of each crawled question, the topic to which the question belongs, and the text content of the series of answers corresponding to the question, and then combining the timestamp of each answer to compute ranking scores for the quality of the answers. The enhanced attention-mechanism recurrent neural network model is trained by combining the ranking scores with a preset time-sensitive objective function and using a question-dependent pairwise training strategy.

Step 3: for a new question and its series of answers, constructing a series of instances from the text content of the new question, the topic to which it belongs, the text content of its corresponding series of answers, and the timestamp of each answer; inputting the instances in turn into the trained enhanced attention-mechanism recurrent neural network model to obtain a series of ranking scores; and ranking the corresponding answers from front to back according to the ranking scores.
For ease of understanding, the above-described process is described in detail below.
1. Crawling of data.
In the embodiment of the invention, a certain amount of data is crawled from a community question-answering platform website, and the data crawled for a question comprises: the text content of the question, the topic to which the question belongs (a question may have multiple topics), the text content of the series of answers to the question, the timestamp of each answer, and the number of likes for each answer.
2. Preprocessing of data.
The crawled data are preprocessed before constructing the enhanced attention-mechanism recurrent neural network model to ensure model performance. The preprocessing mainly comprises the following steps:

1) Removing questions and answers whose text content contains fewer words than a set number.

In the embodiment of the invention, some low-quality questions and answers need to be removed; questions and answers whose text content contains fewer words than the set number are generally considered to be of low quality. Illustratively, the set number here may be 10.

2) Removing questions and answers whose number of likes fluctuates beyond a preset range within a period of time.

In the embodiment of the invention, if the number of likes fluctuates beyond the preset range within a period of time, the like count is considered not yet stable; such data would bias model evaluation, so questions and answers whose like counts have not yet stabilized are removed.

3) Performing word segmentation on the text content of the questions and answers remaining after the above two steps, so that the data for each question becomes: the word-segmentation result of the question's text content, the topic to which the question belongs, the word-segmentation result of each answer's content, the timestamp of each answer, and the number of likes of each answer. The number of likes of each answer is used to validate model quality; the remaining information is used as model input for the later evaluation of each answer's quality.
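For illustration, this preprocessing can be sketched in Python as follows; the record schema, the `jieba` tokenizer, and the fluctuation threshold are assumptions made for the sketch, not part of the invention:

```python
import jieba  # assumed Chinese word-segmentation library; any tokenizer would do

MIN_WORDS = 10            # illustrative "set number" of words from the text above
MAX_LIKE_FLUCTUATION = 5  # hypothetical stability threshold (not given in the source)

def preprocess(records):
    """Filter and word-segment crawled Q&A records.

    Each record is assumed to look like:
    {'question': str, 'topics': [str], 'answers':
     [{'text': str, 'timestamp': float, 'like_history': [int, ...]}]}
    """
    cleaned = []
    for rec in records:
        q_words = list(jieba.cut(rec['question']))
        if len(q_words) < MIN_WORDS:          # step 1: drop too-short questions
            continue
        answers = []
        for ans in rec['answers']:
            a_words = list(jieba.cut(ans['text']))
            if len(a_words) < MIN_WORDS:      # step 1: drop too-short answers
                continue
            likes = ans['like_history']       # like counts sampled over a period
            if max(likes) - min(likes) > MAX_LIKE_FLUCTUATION:
                continue                      # step 2: like count not yet stable
            answers.append({'words': a_words,
                            'timestamp': ans['timestamp'],
                            'likes': likes[-1]})
        if answers:                           # step 3: keep the segmented data
            cleaned.append({'q_words': q_words,
                            'topics': rec['topics'],
                            'answers': answers})
    return cleaned
```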
3. Construction of the enhanced attention-mechanism recurrent neural network model.
The enhanced attention-mechanism recurrent neural network model comprises four parts: an input layer, a long short-term memory network layer, an attention layer, and an evaluation layer.
1) Input layer: an answer is considered to consist of multiple sentences, each sentence consisting of multiple words; the corresponding question is considered to consist of one sentence composed of multiple words; the topic to which the question belongs is considered to consist of multiple words. Word Embedding is used to represent each word appearing in the text by a fixed-length vector, so that every word appearing in the question text, the answer text, and the topic is replaced by a $K$-dimensional vector. The vector sequence $TQ$ of the question's text content consists of $N$ vectors, denoted $TQ = \{x_1, x_2, \ldots, x_N\}$, $x_p \in \mathbb{R}^K$, $p = 1, 2, \ldots, N$. The vector sequence $TA$ of an answer's text content consists of $M$ sentences, each consisting of $D$ vectors: $TA = \{s_1, s_2, \ldots, s_M\}$, $s_m = \{y_{m1}, y_{m2}, \ldots, y_{mD}\}$, $y_{md} \in \mathbb{R}^K$, $m = 1, 2, \ldots, M$, $d = 1, 2, \ldots, D$. The topic $TC$ consists of $C$ vectors, denoted $TC = \{z_1, z_2, \ldots, z_C\}$, $z_q \in \mathbb{R}^K$, $q = 1, 2, \ldots, C$. Here $N$, $M$, $D$, $C$ are not fixed values and vary from input instance to input instance.
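As an illustration of this representation, a minimal Python sketch with a toy vocabulary and randomly initialized vectors standing in for trained Word Embeddings:

```python
import numpy as np

K = 128                          # embedding dimension (example value)
rng = np.random.default_rng(0)
vocab = {'<unk>': 0}             # in practice, built from the segmented corpus
emb = rng.normal(size=(50000, K)).astype(np.float32)  # stand-in for trained embeddings

def to_vectors(words):
    """Map a segmented word list to a sequence of K-dimensional vectors (len(words) x K)."""
    ids = [vocab.get(w, 0) for w in words]
    return emb[ids]

TQ = to_vectors(['如何', '挑选', '优质', '答案'])  # question text -> N x K matrix
```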
2) Long short-term memory network layer: the vector sequences of the question and an answer are modeled with two long short-term memory networks, LSTM_Q for the question's vector sequence $TQ$ and LSTM_A for an answer's vector sequence $TA$, and the last cell vector of LSTM_Q is used to initialize the cell vector of LSTM_A. This yields the vector sequences of the question and the answer after passing through the long short-term memory networks, denoted $MQ$ and $MA$ respectively; each vector in $MQ$ and $MA$ contains contextual semantic information.
In the embodiment of the invention, for the question's vector sequence $TQ = \{x_1, x_2, \ldots, x_N\}$, LSTM_Q updates the cell vector sequence $c = \{c_1, c_2, \ldots, c_N\}$ over time steps $t = 1, 2, \ldots, N$ and obtains the hidden vector sequence $h = \{h_1, h_2, \ldots, h_N\}$, computed as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i);$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f);$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tau(W_{xc} x_t + W_{hc} h_{t-1} + b_c);$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o);$$
$$h_t = o_t \odot \tau(c_t);$$

The time-step values correspond one-to-one to the sequence numbers of the vectors in the question's vector sequence $TQ$; that is, the first time step $t = 1$ corresponds to the vector $x_1$ with sequence number 1 in $TQ$. In particular, the cell vector $c_0$ and hidden vector $h_0$ are initialized to zero vectors. $i_t$, $f_t$, $o_t$ are the input gate, forget gate, and output gate respectively; $\sigma(\cdot)$ and $\tau(\cdot)$ are the sigmoid($\cdot$) and tanh($\cdot$) nonlinear activation functions respectively; $\odot$ is element-wise multiplication; $\{W_{xi}, W_{hi}, W_{xf}, W_{hf}, W_{xc}, W_{hc}, W_{xo}, W_{ho}\}$ are parameter matrices to be optimized in the model, and $\{b_i, b_f, b_c, b_o\}$ are parameter vectors to be optimized in the model.
Similarly, for each sentence in the answer with vector sequence $s_m = \{y_{m1}, y_{m2}, \ldots, y_{mD}\}$, LSTM_A updates the cell vector sequence $c' = \{c'_1, c'_2, \ldots, c'_D\}$ over time steps $t' = 1, 2, \ldots, D$ and obtains the hidden vector sequence $h'_m = \{h'_{m1}, h'_{m2}, \ldots, h'_{mD}\}$. In particular, to model the relationship between question and answer, so that the resulting hidden vectors $h'_m$ vary depending on the question, the cell vector is initialized as $c'_0 = c_N$ while the hidden vector $h'_{m0}$ is initialized to a zero vector. Likewise, the time-step values correspond to the sequence numbers of the vectors in each sentence's vector sequence.
From this layer, the corresponding numbers of hidden vectors $h$ and $h'_m$ are obtained from the question and answer vector sequences $TQ$ and $TA$. Although the number of hidden vectors and their dimension remain the same as the input, the hidden vectors contain contextual semantic information. The hidden vector sequence $h$ is the vector sequence $MQ$, and the hidden vector sequences $h'_m$ form the vector sequence $MA$.
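A plain-NumPy sketch of the above update equations follows; the dimension, initialization scale, and zero inputs are assumptions, while seeding LSTM_A's cell state with $c_N$ follows the initialization described above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTM:
    """LSTM cell implementing the update equations above (gates i, f, o and cell c)."""

    def __init__(self, K, rng):
        self.Wx = {g: rng.normal(scale=0.1, size=(K, K)) for g in 'ifco'}  # {W_x*}
        self.Wh = {g: rng.normal(scale=0.1, size=(K, K)) for g in 'ifco'}  # {W_h*}
        self.b = {g: np.zeros(K) for g in 'ifco'}                          # {b_*}

    def run(self, X, c0=None):
        """X: (T, K) input vector sequence; returns hidden sequence and last cell vector."""
        h = np.zeros(X.shape[1])
        c = np.zeros(X.shape[1]) if c0 is None else c0  # c'_0 = c_N for answer sentences
        H = []
        for x in X:                                     # time steps t = 1, ..., T in order
            i = sigmoid(self.Wx['i'] @ x + self.Wh['i'] @ h + self.b['i'])
            f = sigmoid(self.Wx['f'] @ x + self.Wh['f'] @ h + self.b['f'])
            c = f * c + i * np.tanh(self.Wx['c'] @ x + self.Wh['c'] @ h + self.b['c'])
            o = sigmoid(self.Wx['o'] @ x + self.Wh['o'] @ h + self.b['o'])
            h = o * np.tanh(c)                          # h_t = o_t ⊙ τ(c_t)
            H.append(h)
        return np.stack(H), c

rng = np.random.default_rng(0)
lstm_q, lstm_a = LSTM(128, rng), LSTM(128, rng)
MQ, c_N = lstm_q.run(np.zeros((7, 128)))               # question -> MQ and last cell c_N
MA_m, _ = lstm_a.run(np.zeros((5, 128)), c0=c_N)       # one answer sentence, seeded with c_N
```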
3) Attention layer: a sentence-level attention mechanism can make the question's vector sequence $MQ$ and an answer's vector sequence $MA$ interact effectively to obtain a vector for the question and a vector for the answer. Alternatively, on top of the sentence-level attention mechanism, a deeper word-level attention mechanism can be designed that turns the topic's vector sequence $TC$ into a single vector $FC$; the topic vector $FC$, the question's vector sequence $MQ$, and an answer's vector sequence $MA$ are then fused to finally obtain a vector for the question and a vector for the answer.
In the embodiment of the invention, either the sentence-level or the word-level attention mechanism may be used to process $MQ$ and $MA$ to obtain the corresponding vectors. For clarity, in the following description the question vector and answer vector obtained with the sentence-level attention mechanism are denoted $FQ_1$ and $FA_1$, and those obtained with the word-level attention mechanism are denoted $FQ_2$ and $FA_2$.
The following is a detailed description of the sentence-level attention mechanism and the word-level attention mechanism.
a. The process by which the sentence-level attention mechanism makes the question's vector sequence $MQ$ interact with an answer's vector sequence $MA$ to obtain the question vector and the answer vector is as follows:

For the question's vector sequence $MQ$, an average pooling operation expresses it as a $K$-dimensional vector $FQ_1$:

$$FQ_1 = \frac{1}{N} \sum_{p=1}^{N} MQ_p,$$

where $MQ_p$ is the vector of the $p$-th word in the vector sequence $MQ$.

For the answer's vector sequence $MA$, each sentence in $MA$ is first represented as a $K$-dimensional vector using average pooling, giving the sentence representations $r'_m$, $m = 1, 2, \ldots, M$; then a distance function is used to compute their respective attention scores $\alpha'_m$, $m = 1, 2, \ldots, M$, and a weighted average yields the answer's vector representation $FA_1$:

$$r'_m = \frac{1}{D} \sum_{d=1}^{D} MA_{md}, \quad \alpha'_m = f(FQ_1, r'_m), \quad FA_1 = \frac{\sum_{m=1}^{M} \alpha'_m \, r'_m}{\sum_{m=1}^{M} \alpha'_m},$$

where $MA_{md}$ is the vector of the $d$-th word in the $m$-th sentence of the vector sequence $MA$, and $f(\cdot)$ is the cosine similarity function.
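A Python sketch of the sentence-level attention mechanism under these formulas; normalizing the weighted average by the sum of the cosine scores is an assumption consistent with the "weighted average" wording above:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def sentence_level_attention(MQ, MA_sentences):
    """MQ: (N, K) question vectors; MA_sentences: list of (D_m, K) sentence vectors."""
    FQ1 = MQ.mean(axis=0)                            # average pooling over question words
    r = [S.mean(axis=0) for S in MA_sentences]       # r'_m: pooled sentence representations
    alpha = np.array([cosine(FQ1, rm) for rm in r])  # α'_m = f(FQ1, r'_m)
    FA1 = (alpha[:, None] * np.stack(r)).sum(axis=0) / (alpha.sum() + 1e-8)
    return FQ1, FA1
```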
b. The word-level attention mechanism turns the topic from the vector sequence $TC$ into a vector $FC$; the topic vector $FC$, the question's vector sequence $MQ$, and an answer's vector sequence $MA$ are then fused to finally obtain the question vector and the answer vector, as follows:

For the topic $TC = \{z_1, z_2, \ldots, z_C\}$ of a given question, an average pooling operation transforms it into a fixed-length vector $FC$:

$$FC = \frac{1}{C} \sum_{q=1}^{C} z_q.$$

After the vector $FC$ is obtained, the word-level attention mechanism uses it to compute an attention score for each word in the question and the answer.

For the question's vector sequence $MQ$, which consists of a series of semantic representation vectors, a transformation matrix $W$ is used to compute the distance between each word vector in the question and the topic vector $FC$, after which a softmax operation computes the attention score $\beta_p$ of the $p$-th word in the question. The resulting question vector $FQ_2$ is expressed as:

$$\beta_p = \frac{\exp\left(h(FC, MQ_p)\right)}{\sum_{i=1}^{N} \exp\left(h(FC, MQ_i)\right)}, \quad FQ_2 = \sum_{p=1}^{N} \beta_p \, MQ_p,$$

where $MQ_p$ and $MQ_i$ are the $p$-th and $i$-th vectors in the vector sequence $MQ$, and $h(a, b) = a^T W b$ is used to compute the distance between vectors in different spaces; here $(a, b)$ corresponds to $(FC, MQ_p)$ and $(FC, MQ_i)$ in the above formula.

For the answer's vector sequence $MA$, a similar method computes the attention score $\beta_{md}$ of the $d$-th word in the $m$-th sentence of the answer and yields the vector representation $r_m$ of the $m$-th sentence; then, analogously to $\alpha'_m$ and $FA_1$, the word-level attention mechanism computes the attention score $\alpha_m$ of the $m$-th sentence and the answer's vector representation $FA_2$:

$$\beta_{md} = \frac{\exp\left(h(FC, MA_{md})\right)}{\sum_{l=1}^{D} \exp\left(h(FC, MA_{ml})\right)}, \quad r_m = \sum_{d=1}^{D} \beta_{md} \, MA_{md},$$

$$\alpha_m = f(FQ_2, r_m), \quad FA_2 = \frac{\sum_{m=1}^{M} \alpha_m \, r_m}{\sum_{m=1}^{M} \alpha_m},$$

where $MA_{md}$ and $MA_{ml}$ are the vectors of the $d$-th and $l$-th words in the $m$-th sentence of the vector sequence $MA$.
In the embodiment of the invention, the sentence-level attention mechanism mainly uses the question's vector sequence $MQ$ and an answer's vector sequence $MA$ to distinguish the importance of each sentence in the answer. On top of this, the word-level attention mechanism further exploits the additional information of the topic $TC$ to capture the deep semantic relationships among topic, question, and answer, and further distinguishes the importance of the words and sentences in the question and the answer.
Through the above methods, vector representations of the question and the answer are obtained, where $FQ_1, FA_1, FQ_2, FA_2 \in \mathbb{R}^K$.
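A Python sketch of the word-level attention mechanism with the bilinear distance $h(a, b) = a^T W b$; computing the sentence weights $\alpha_m$ by cosine against $FQ_2$, analogously to $\alpha'_m$, is an assumption drawn from the description above:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def word_level_attention(MQ, MA_sentences, TC, W):
    """MQ: (N, K); MA_sentences: list of (D_m, K); TC: (C, K) topic vectors; W: (K, K)."""
    FC = TC.mean(axis=0)                                 # topic vector via average pooling
    beta = softmax(np.array([FC @ W @ q for q in MQ]))   # β_p over question words
    FQ2 = (beta[:, None] * MQ).sum(axis=0)

    r = []
    for S in MA_sentences:
        b = softmax(np.array([FC @ W @ y for y in S]))   # β_md over words of sentence m
        r.append((b[:, None] * S).sum(axis=0))           # r_m
    alpha = np.array([cosine(FQ2, rm) for rm in r])      # α_m, assumed analogous to α'_m
    FA2 = (alpha[:, None] * np.stack(r)).sum(axis=0) / (alpha.sum() + 1e-8)
    return FQ2, FA2
```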
4) Evaluation layer: a deep language matching score for the answer is computed by combining the question vector and an answer vector, and the time effect is taken into account with the answer's timestamp to obtain the ranking score.
In the embodiment of the invention, the question vector and an answer vector are first combined to compute the answer's deep language matching score $S_0(Q, A)$:

$$S_0(Q, A) = \sigma\left(W_2 \, \tau\left(W_1 \left(FQ_y \oplus FA_y\right) + b_1\right) + b_2\right),$$

where $\sigma(\cdot)$ and $\tau(\cdot)$ are the sigmoid($\cdot$) and tanh($\cdot$) nonlinear activation functions respectively; $\oplus$ denotes the splicing (concatenation) operation; $\{W_1, W_2\}$ are parameter matrices to be optimized in the model, and $\{b_1, b_2\}$ are parameter vectors to be optimized in the model. In the above formula, $y$ is 1 or 2: the vectors $FQ_y$ and $FA_y$ may be the results of the sentence-level attention mechanism or of the word-level attention mechanism.
Combining the answer's timestamp $T$, the time effect is also taken into account to obtain the ranking score $S(Q, A)$:

[equation image not preserved in the source: $S(Q, A)$ is computed from the matching score $S_0(Q, A)$, the answer's timestamp $T$, the timestamp $T_0$ of the first answer, and a hyper-parameter $H$]

where $T_0$ denotes the timestamp of the first answer and $H$ is a hyper-parameter.
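A sketch of the matching-score computation; the exact wiring of $\sigma$ and $\tau$ around $\{W_1, W_2\}$ is a reconstruction from the listed symbols, and the time-decay step from $S_0(Q, A)$ to $S(Q, A)$ is omitted because its formula is not recoverable from the source:

```python
import numpy as np

def matching_score(FQ, FA, W1, W2, b1, b2):
    """S0 = σ(W2 · τ(W1 · (FQ ⊕ FA) + b1) + b2), a scalar in (0, 1)."""
    x = np.concatenate([FQ, FA])                    # ⊕: splicing (concatenation)
    hidden = np.tanh(W1 @ x + b1)                   # τ(·)
    return float(1.0 / (1.0 + np.exp(-(W2 @ hidden + b2))))  # σ(·)

# usage sketch (shapes are assumptions): W1: (d, 2K), b1: (d,), W2: (d,), b2: scalar
```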
4. Training of model parameters.
This step trains all the parameter matrices and vectors in the enhanced attention-mechanism recurrent neural network model established above, including $\{W_{xi}, W_{hi}, W_{xf}, W_{hf}, W_{xc}, W_{hc}, W_{xo}, W_{ho}\}$, $\{b_i, b_f, b_c, b_o\}$, the transformation matrix $W$, $\{W_1, W_2\}$, and $\{b_1, b_2\}$. Specifically, the evaluation results (ranking scores) are combined with a preset time-sensitive objective function, and the model is trained using a question-dependent pairwise training strategy.
For each question $Q$, two answers $A^+$ and $A^-$ are extracted from the corresponding series of answers, where $A^+$ has more likes than $A^-$, thus forming a triplet $(Q, A^+, A^-)$.
A stochastic gradient descent (SGD) algorithm is used to minimize the time-sensitive objective function:

$$L = \max\left(0,\; m + S(Q, A^-) - S(Q, A^+)\right),$$

where $S(Q, A^+)$ and $S(Q, A^-)$ denote the ranking scores of the two answers $A^+$ and $A^-$ for question $Q$, and $m$ is a margin hyper-parameter.
In addition, during training the whole data set can be divided into a training set and a test set in a 4:1 ratio; the training set is used to optimize the model parameters, and the test set is used to measure the quality of the final model.
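A sketch of the question-dependent pairwise strategy and hinge objective; the margin value and the triplet enumeration are assumptions (the source only requires that $A^+$ has more likes than $A^-$), and the SGD parameter updates themselves are omitted:

```python
MARGIN = 0.1  # hyper-parameter m; illustrative value, not given in the source

def hinge_loss(s_pos, s_neg, margin=MARGIN):
    """Time-sensitive pairwise objective L = max(0, m + S(Q,A-) - S(Q,A+))."""
    return max(0.0, margin + s_neg - s_pos)

def make_triplets(question):
    """Pair the answers of a single question so that A+ has more likes than A-.

    `question['answers']` is assumed to be the preprocessed list with a
    'likes' field; the loss is then minimized over the triplets with SGD."""
    ans = question['answers']
    return [(pos, neg) for pos in ans for neg in ans if pos['likes'] > neg['likes']]
```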
5. Predicting the value of a series of answers to a new question.
This step essentially predicts the value of a series of answers to a new question and ranks the answers according to the magnitude of the predicted value (i.e., the ranking score).
In the embodiment of the invention, a new question $X$, the topic $B$ corresponding to the new question $X$, the text contents of its series of answers $A = \{A_1, A_2, \ldots, A_G\}$, and the timestamp of each answer $T = \{T_1, T_2, \ldots, T_G\}$ are used to construct a series of instances $(X, B, A_g, T_g)$, $1 \le g \le G$. These instances are input in turn into the trained enhanced attention-mechanism recurrent neural network model to obtain a series of ranking scores $\{S_1, S_2, \ldots, S_G\}$, and the corresponding answers are ranked from front to back according to the ranking scores; that is, the higher the ranking score, the higher the quality of the corresponding answer is considered to be, and the nearer the front it is ranked.
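A sketch of the final ranking step; `model_score` stands in for the trained EARNN forward pass and is a placeholder, not the patented implementation:

```python
def rank_answers(model_score, instances):
    """Score instances (X, B, A_g, T_g) and return answer indices, best first."""
    scores = [model_score(inst) for inst in instances]
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order, scores

# usage: order, scores = rank_answers(trained_earnn, instances)
```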
According to the scheme of the embodiment of the invention, the fusion of multiple kinds of metadata captures the deep semantic relationship between questions and answers, so that the key points of questions and answers can be effectively distinguished, answer ranking is realized, and readers are helped to quickly find valuable and attractive answers.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A community question-answering platform answer sorting method is characterized by comprising the following steps:
crawling a certain amount of data from a community question-answering platform website, wherein the data crawled for a question comprises: the text content of the question, the topic to which the question belongs, the text content of the series of answers corresponding to the question, the timestamp of each answer, and the number of likes for each answer;

constructing an enhanced attention-mechanism recurrent neural network model based on the text content of each crawled question, the topic to which the question belongs, and the text content of the series of answers corresponding to the question, and then combining the timestamp of each answer to compute ranking scores for the quality of the answers; training the enhanced attention-mechanism recurrent neural network model by combining the ranking scores with a preset time-sensitive objective function and using a question-dependent pairwise training strategy;

for a new question and its series of answers, constructing a series of instances from the text content of the new question, the topic to which the new question belongs, the text content of the series of answers corresponding to the new question, and the timestamp of each answer, inputting the instances in turn into the trained enhanced attention-mechanism recurrent neural network model to obtain a series of ranking scores, and ranking the corresponding answers from front to back according to the ranking scores;
the method for constructing the enhanced attention mechanism recurrent neural network model comprises the following four parts: an input layer, a long-short term memory network layer, an attention layer and an evaluation layer;
the input layer: an answer is considered to consist of multiple sentences, each sentence consisting of multiple words; the corresponding question is considered to consist of one sentence composed of multiple words; the topic to which the question belongs is considered to consist of multiple words; Word Embedding is used to represent each word appearing in the text by a fixed-length vector, so that every word appearing in the question text, the answer text, and the topic is replaced by a $K$-dimensional vector; the vector sequence $TQ$ of the question's text content consists of $N$ vectors, denoted $TQ = \{x_1, x_2, \ldots, x_N\}$, $x_p \in \mathbb{R}^K$, $p = 1, 2, \ldots, N$; the vector sequence $TA$ of an answer's text content consists of $M$ sentences each consisting of $D$ vectors, so $TA = \{s_1, s_2, \ldots, s_M\}$, $s_m = \{y_{m1}, y_{m2}, \ldots, y_{mD}\}$, $y_{md} \in \mathbb{R}^K$, $m = 1, 2, \ldots, M$, $d = 1, 2, \ldots, D$; the topic $TC$ consists of $C$ vectors, denoted $TC = \{z_1, z_2, \ldots, z_C\}$, $z_q \in \mathbb{R}^K$, $q = 1, 2, \ldots, C$;
the long short-term memory network layer: the vector sequences of the question and an answer are modeled with two long short-term memory networks, LSTM_Q for the question's vector sequence $TQ$ and LSTM_A for an answer's vector sequence $TA$, with the last cell vector of LSTM_Q used to initialize the cell vector of LSTM_A; the vector sequences of the question and the answer after passing through the long short-term memory networks, $MQ$ and $MA$ respectively, are then obtained, and each vector in $MQ$ and $MA$ contains contextual semantic information;
the attention layer: a sentence-level attention mechanism makes the question's vector sequence $MQ$ interact with an answer's vector sequence $MA$ to obtain the question vector $FQ_1$ and the answer vector $FA_1$; or a word-level attention mechanism turns the topic from the vector sequence $TC$ into a vector $FC$, and the topic vector $FC$, the question's vector sequence $MQ$, and an answer's vector sequence $MA$ are then fused to finally obtain the question vector $FQ_2$ and the answer vector $FA_2$;
the evaluation layer: a deep language matching score for the answer is computed by combining the question vector and an answer vector, and the time effect is taken into account with the answer's timestamp to obtain the ranking score.
2. The community question-answering platform answer sorting method according to claim 1, further comprising a step of preprocessing the crawled data before constructing the enhanced attention-mechanism recurrent neural network model, the step comprising:

removing questions and answers whose text content contains fewer words than a set number;

removing questions and answers whose number of likes fluctuates beyond a preset range within a period of time;

performing word segmentation on the text content of the remaining questions and answers, so that the data for each question becomes: the word-segmentation result of the question's text content, the topic to which the question belongs, the word-segmentation result of each answer's content, the timestamp of each answer, and the number of likes of each answer; the number of likes of each answer is used to validate model quality, and the remaining information is used as model input for the later evaluation of each answer's quality.
3. The community question-answering platform answer sorting method according to claim 1, wherein:
for the question's vector sequence $TQ = \{x_1, x_2, \ldots, x_N\}$, LSTM_Q updates the cell vector sequence $c = \{c_1, c_2, \ldots, c_N\}$ over time steps $t = 1, 2, \ldots, N$ and obtains the hidden vector sequence $h = \{h_1, h_2, \ldots, h_N\}$, computed as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i);$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f);$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tau(W_{xc} x_t + W_{hc} h_{t-1} + b_c);$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o);$$
$$h_t = o_t \odot \tau(c_t);$$

where the time-step values correspond one-to-one to the sequence numbers of the vectors in the question's vector sequence $TQ$; $i_t$, $f_t$, $o_t$ are the input gate, forget gate, and output gate respectively; the cell vector $c_0$ and hidden vector $h_0$ are initialized to zero vectors; $\sigma(\cdot)$ and $\tau(\cdot)$ are the sigmoid($\cdot$) and tanh($\cdot$) nonlinear activation functions respectively; $\odot$ is element-wise multiplication; $\{W_{xi}, W_{hi}, W_{xf}, W_{hf}, W_{xc}, W_{hc}, W_{xo}, W_{ho}\}$ are parameter matrices to be optimized in the model, and $\{b_i, b_f, b_c, b_o\}$ are parameter vectors to be optimized in the model;

for the vector sequence $s_m = \{y_{m1}, y_{m2}, \ldots, y_{mD}\}$ of each sentence in the answer, LSTM_A updates the cell vector sequence $c' = \{c'_1, c'_2, \ldots, c'_D\}$ over time steps $t' = 1, 2, \ldots, D$ and obtains the hidden vector sequence $h'_m = \{h'_{m1}, h'_{m2}, \ldots, h'_{mD}\}$; the cell vector is initialized as $c'_0 = c_N$ and the hidden vector $h'_{m0}$ is initialized to a zero vector.
4. The community question-answering platform answer sorting method according to claim 1, wherein:
the process of interacting the vector sequence MQ of the question with the vector sequence MA of one answer with the sentence-level attention mechanism to obtain the vector of the question and the vector of one answer is as follows:
for the vector sequence MQ of the problem, it is expressed as a K-dimensional vector FQ using an average pooling operation1
Figure FDA0002562774540000031
Wherein MQpA vector representing the p-th word in the vector sequence MQ;
for the vector sequence of answers MA, each of the vector sequence of answers MA is first processed using an averaging pooling operationThe sentences are represented by a vector of K dimension to obtain several semantic representations r'mM1, 2., M; then, a distance function is used to calculate their respective attention scores α'mM1, 2, M, and then using a weighted average to obtain a vector representation FA of the answer1
Figure FDA0002562774540000032
Wherein, MAmdA vector representing the d word in the m sentence in the vector sequence MA, wherein f (-) represents a cosine similarity function;
changing the theme from a vector sequence TC into a vector FC by using a word-level attention mechanism; then fusing the vector FC of the subject, the vector sequence MQ of the question and the vector sequence MA of an answer to finally obtain the vector of the question and the vector of the answer as follows:
subject TC ═ z for a given problem1,z2,...,zCIt is transformed into a fixed-length vector FC using an average pooling operation:
Figure FDA0002562774540000033
after the vector FC is obtained, the word-level attention mechanism uses it to calculate a score for the attention of each word in the question and answer;
for a vector sequence MQ of the question, which is composed of a series of semantically represented vectors, the distance between the vector of each word in the question and the vector FC of the topic is calculated using the transformation matrix W, after which the attention score β for the p-th word in the question is calculated using the softmax operationp(ii) a Resulting vector FQ of the problem2Expressed as:
Figure FDA0002562774540000041
wherein MQp、MQiCorresponding representation of the p, i-th vector, h in the vector sequence MQ(a,b)=aTWb is used to calculate the distance of different space vectors, where (a, b) corresponds to the above equation (FC, MQ)p)、(FC,MQi);
For the vector sequence MA of answers, an attention score β is calculated for the vector of the d word in the m sentence of the answermdAnd obtaining a vector representation r of the mth sentencem(ii) a Then the attention score alpha of the mth sentence in the attention mechanism of the word level is calculatedmVector of sum answers FA2
Figure FDA0002562774540000042
Figure FDA0002562774540000043
Wherein, MAmd、MAmlThe corresponding vector represents the d, l word in the m sentence in the vector sequence MA.
5. The community question-answering platform answer sorting method according to claim 1, wherein:
the answer's deep language matching score $S_0(Q, A)$ is computed by combining the question vector and an answer vector:

$$S_0(Q, A) = \sigma\left(W_2 \, \tau\left(W_1 \left(FQ_y \oplus FA_y\right) + b_1\right) + b_2\right), \quad y = 1 \text{ or } 2;$$

where $\sigma(\cdot)$ and $\tau(\cdot)$ are the sigmoid($\cdot$) and tanh($\cdot$) nonlinear activation functions respectively; $\oplus$ denotes the splicing (concatenation) operation; $\{W_1, W_2\}$ are parameter matrices to be optimized in the model, and $\{b_1, b_2\}$ are parameter vectors to be optimized in the model;
combining the answer's timestamp $T$, the time effect is also taken into account to obtain the ranking score $S(Q, A)$:

[equation image not preserved in the source: $S(Q, A)$ is computed from the matching score $S_0(Q, A)$, the answer's timestamp $T$, the timestamp $T_0$ of the first answer, and a hyper-parameter $H$]

where $T_0$ denotes the timestamp of the first answer and $H$ is a hyper-parameter.
6. The community question-answering platform answer sorting method according to claim 1, 2, 3, 4 or 5, wherein the process of training the enhanced attention-mechanism recurrent neural network model by combining the evaluation results with a preset time-sensitive objective function and using a question-dependent pairwise training strategy is as follows:
for each question Q, two answers A are extracted from the corresponding series of answers+And A-Wherein A is+Has a praise number greater than A-Thus constituting a triplet (Q, A)+,A-);
A random gradient descent algorithm is used to minimize the time-sensitive objective function:
L=max(0,m+S(Q,A-)-S(Q,A+));
wherein, S (Q, A)+) And S (Q, A)-) Corresponding two answers A in the presentation question Q+And A-The ranking score of (1).
7. The community question-answering platform answer sorting method according to claim 1, 2, 3, 4 or 5, wherein a new question $X$, the topic $B$ of the new question $X$, the text contents of a series of answers $A = \{A_1, A_2, \ldots, A_G\}$, and the timestamp of each answer $T = \{T_1, T_2, \ldots, T_G\}$ are used to construct a series of instances $(X, B, A_g, T_g)$, $1 \le g \le G$; these instances are input in turn into the trained enhanced attention-mechanism recurrent neural network model to obtain a series of ranking scores $\{S_1, S_2, \ldots, S_G\}$, and the corresponding answers are ranked from front to back according to the ranking scores.
CN201810186972.2A 2018-03-07 2018-03-07 Community question-answering platform answer sorting method Active CN108304587B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810186972.2A CN108304587B (en) 2018-03-07 2018-03-07 Community question-answering platform answer sorting method


Publications (2)

Publication Number Publication Date
CN108304587A (en) 2018-07-20
CN108304587B (en) 2020-10-27

Family ID: 62849405




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant