CN110969010A - Question generation method based on relation guidance and a dual-channel interaction mechanism - Google Patents

Question generation method based on relation guidance and a dual-channel interaction mechanism

Info

Publication number
CN110969010A
CN110969010A (application CN201911238302.1A)
Authority
CN
China
Prior art keywords
answer
article
representation
articles
context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911238302.1A
Other languages
Chinese (zh)
Inventor
赵洲
潘启璠
王禹潼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201911238302.1A priority Critical patent/CN110969010A/en
Publication of CN110969010A publication Critical patent/CN110969010A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a question generation method based on relation guidance and a dual-channel interaction mechanism, comprising the following steps: 1) For a set of answers, supporting articles and irrelevant articles, a relation director is constructed that captures the relations between the answers and the corresponding articles. 2) The relation director obtained in step 1) is pre-trained to obtain an answer-related encoder. 3) For a set of answers and articles, word embedding is applied to the articles and answers to obtain their representations. The article and answer representations are fed into a context encoder and into the answer-related encoder obtained in step 2) to obtain context representations of the article, the answer and the answer-related article. These context representations are fed into the dual-channel interaction module to obtain a joint representation of the article. 4) The joint representation of the article obtained in step 3) is input to the question generator and decoded to generate a question. 5) After training, the resulting question generation network can generate a corresponding question given an answer and an article.

Description

Question generation method based on relation guidance and a dual-channel interaction mechanism
Technical Field
The invention relates to question generation in natural language processing, and in particular to a question generation method based on relation guidance and a dual-channel interaction mechanism.
Background
Article-based question generation is a challenging task that requires generating a correct and fluent question given an answer and a passage of text. Question generation is receiving increasing attention in natural language processing and has many successful applications, such as supplying valuable questions to question answering systems, automatically generating exercise questions, and automatically posing questions in dialogue systems to elicit feedback.
Neural-network-based question generation methods mainly comprise two steps: first, a supervised neural network or hand-crafted rules extract several important sentences from an article; second, an encoder-decoder framework generates a question from the extracted sentences. The prior art mainly has the following shortcomings: labeling a dataset for training a supervised neural network requires substantial manual effort, while existing question-answering datasets differ strongly in question type, language style and other respects; hand-crafted rules are inefficient and difficult to transfer to new domains; and although existing encoder-decoder neural networks achieve good results, they extract only a few answer-relevant sentences to generate questions, neglecting the connection between the answer and the full text. To overcome these deficiencies, the present invention utilizes full-text information to generate answer-related questions.
Disclosure of Invention
The invention aims to overcome the shortcoming of the prior art that a question is generated from only a few answer-relevant sentences while the connection between the answer and the full text is neglected. Specifically, the invention designs a discriminator, termed the relation director, to determine whether an article provides sufficient information to obtain the answer. Viewed from another angle, the relation director aims to find content in the articles that helps to understand the answer, and to identify deceptive articles that are irrelevant to obtaining the answer. The invention then designs a dual-channel interaction module, which performs multi-step modeling of the representations from the original channel and from the relation director channel and uses a control gate to determine the information flow through the two channels. Finally, the invention incorporates the relation director and the dual-channel interaction module into an encoder-decoder framework to predict the question.
The invention adopts the following specific technical scheme:
The question generation method based on relation guidance and a dual-channel interaction mechanism comprises the following steps:
1. For a set of answers, supporting articles and irrelevant articles, construct a relation director that captures the relations between the answers and the corresponding articles; the relation director comprises an answer-related encoder and a question-related score function;
2. Pre-train the relation director obtained in step 1 to obtain a trained relation director, and fix the weight parameters of the answer-related encoder within it to obtain the parameter-fixed answer-related encoder;
3. Construct a question generation network comprising a word embedder, a context encoder, a dual-channel interaction module, a question generator and the parameter-fixed answer-related encoder obtained in step 2;
4. For a set of answers and articles, obtain context representations of the article, the answer and the answer-related article through the word embedder and context encoder of the question generation network and the parameter-fixed answer-related encoder obtained in step 2;
5. Input the context representations of the article, the answer and the answer-related article obtained in step 4 into the dual-channel interaction module of the question generation network to obtain the joint representation G of the article;
6. Input the joint representation of the article obtained in step 5 into the question generator of the network and decode it to generate a question. Question generation is a decoding process: the question generator uses an attention-based long short-term memory (LSTM) decoder that also addresses the out-of-vocabulary problem. The encoder attention memory is the joint representation G of the article obtained in step 5; in each decoding step, the attention result and the previous word are fed into the decoding unit, and the attention result output by the encoder is used to initialize the hidden state of the LSTM in the decoder;
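As an illustrative sketch of this decoding step, the following numpy code computes an attention result over the joint representation G and uses it both to initialize the decoder state and as part of each step's input. The single projection matrix, the dimensions and the mean-pooled query for initialization are assumptions; the LSTM cell itself and the out-of-vocabulary (copy) mechanism are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(h, G, W_att):
    # h: (d,) decoder query; G: (n, d) joint article representation (attention memory)
    scores = G @ (W_att @ h)           # (n,) alignment scores
    alpha = softmax(scores)            # attention distribution over article positions
    return alpha @ G                   # (d,) attention result (context vector)

rng = np.random.default_rng(0)
n, d = 6, 8
G = rng.standard_normal((n, d))        # joint representation from step 5
W_att = rng.standard_normal((d, d))    # illustrative attention projection

# the encoder attention result initializes the decoder hidden state
h0 = attend(G.mean(axis=0), G, W_att)

# each decoding step consumes the attention result and the previous word
prev_word_emb = rng.standard_normal(d)
step_input = np.concatenate([attend(h0, G, W_att), prev_word_emb])
print(step_input.shape)  # (16,)
```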
7. Train the network to obtain the final question generation model, which generates a corresponding question given an answer and a passage of text.
The invention has the following beneficial effects:
Unlike existing encoder-decoder models, the invention uses weakly supervised labels to model the relations between related articles and the answer span, and transfers the learned relations to the question generation system. Specifically, the invention designs a discriminator, termed the relation director, to determine whether an article provides sufficient information to obtain the answer. Viewed from another angle, the relation director aims to find content in the articles that helps to understand the answer, and to identify deceptive articles that are irrelevant to obtaining the answer. The invention further designs a dual-channel interaction module, which performs multi-step modeling of the representations from the original channel and from the relation director channel and uses a control gate to determine the information flow through the two channels. Finally, the invention incorporates the relation director and the dual-channel interaction module into an encoder-decoder framework to predict the question.
Existing question generation methods extract only a few answer-relevant sentences to generate a question, neglecting the connection between the answer and the full text. The invention uses full-text information to generate answer-related questions, overcoming this shortcoming of existing methods.
Drawings
FIG. 1 is a schematic diagram of the relation director;
FIG. 2 is a schematic diagram of the overall structure of the question generation network.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and the detailed description.
As shown in FIG. 1 and FIG. 2, the question generation method based on relation guidance and a dual-channel interaction mechanism of the present invention comprises the following steps:
step one, aiming at a group of answers, supporting articles and irrelevant articles, a relation director containing the relation between the answers and the corresponding articles is constructed, the structural schematic diagram of the relation director is shown in figure 1 and comprises word embedding, position embedding, an answer-related question encoder and a question-related score function, wherein the answer-related question encoder consists of a self-attention unit, a gated convolutional neural network unit, a bidirectional attention module, a fully-connected feedforward layer and a ReLU activation function; the specific workflow of the relationship director is as follows:
the method comprises the steps of forming a triple group by an answer, a support article and an irrelevant article from a search engine as an input sequence, embedding and representing each word in the triple group by a pre-trained word, coding the position of the word by position embedding, and adding the results of the word embedding and the position embedding to obtain a final representation (p) of the triple group+,a,p-) Where a is the final representation of the answer, p+To support the final representation of the article, p-Is the final representation of the irrelevant article;
the final representation (p) of the triplet+,a,p-) An input answer-dependent encoder comprising a self-attention unit, a gated convolutional neural network unit, and a bi-directional attention module; final representation of the triplet (p)+,a,p-) Encoded by a gated convolutional neural network and a self-attention mechanism according to the following formula:
Figure BDA0002305465850000031
Figure BDA0002305465850000032
Figure BDA0002305465850000033
wherein [ ·]MAnd [ ·]NThe repetition times of the self-attention unit or the gated convolutional neural network unit are respectively M times and N times, each self-attention unit SelfAtt (·) and the gated convolutional neural network unit GatedCNN (·) utilize a residual error mechanism and a layer normalization function, f (·) is a fully-connected feedforward layer, and ReLU is taken as an activation function;
Figure BDA0002305465850000034
for the final representation of the encoded support article,
Figure BDA0002305465850000035
for the final representation of the encoded unrelated article,
Figure BDA0002305465850000036
is the final representation of the encoded answer;
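The encoding stage can be sketched in numpy as follows; the 1×1-projection stand-in for the gated convolution, the omitted layer normalization and feedforward weights, and the small weight scale are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cnn(x, Wa, Wb):
    # gated linear unit: a linear branch modulated by a sigmoid gate
    # (a 1x1 projection stands in for the actual convolution)
    return (x @ Wa) * (1.0 / (1.0 + np.exp(-(x @ Wb))))

def self_att(x):
    # scaled dot-product self-attention with a residual connection
    # (layer normalization omitted for brevity)
    d = x.shape[-1]
    return x + softmax(x @ x.T / np.sqrt(d)) @ x

rng = np.random.default_rng(2)
n, d, N, M = 5, 8, 2, 2                 # N gated-CNN units, M self-attention units
p = rng.standard_normal((n, d))         # final representation of an article
Wa = rng.standard_normal((d, d)) * 0.1
Wb = rng.standard_normal((d, d)) * 0.1

h = p
for _ in range(N):
    h = h + gated_cnn(h, Wa, Wb)        # residual mechanism
for _ in range(M):
    h = self_att(h)
encoded = np.maximum(h, 0.0)            # feedforward layer with ReLU (weights omitted)
print(encoded.shape)  # (5, 8)
```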
p̃⁺ and ã are input to the bidirectional attention module, and the similarity matrix S ∈ R^(n×m) between the supporting article and the answer is obtained according to the following formula:

S_ij = W_s [p̃⁺_i ; ã_j ; p̃⁺_i ⊙ ã_j]

where W_s is a trainable matrix, p̃⁺_i is the i-th vector in the final representation of the encoded supporting article, ã_j is the j-th vector in the final representation of the encoded answer, [;] denotes vector concatenation, ⊙ denotes element-wise multiplication of vectors, and S_ij is the similarity between the i-th vector of the encoded supporting article and the j-th vector of the encoded answer;

The attention-weighted vector â_i of ã with respect to p̃⁺_i is computed as:

α_i = Softmax(S_i,:),  â_i = Σ_j α_ij ã_j

The attention-weighted vector g of p̃⁺ with respect to ã is computed as:

β_i = Softmax(max_row(S_ij)),  g = Σ_i β_i p̃⁺_i

p̃⁺, â and g are combined according to the following formula to obtain the bidirectionally attention-enhanced supporting-article representation p̂⁺:

p̂⁺_i = [p̃⁺_i ; â_i ; p̃⁺_i ⊙ â_i ; p̃⁺_i ⊙ g]
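A numpy sketch of this bidirectional attention computation; the trilinear similarity feature [p_i ; a_j ; p_i ⊙ a_j] and the four-way enhanced combination follow common bidirectional-attention practice and are assumptions consistent with the quantities named in the text.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(3)
n, m, d = 4, 3, 6
p_tilde = rng.standard_normal((n, d))   # encoded supporting article
a_tilde = rng.standard_normal((m, d))   # encoded answer
Ws = rng.standard_normal(3 * d)         # trainable weights (vector form assumed)

# similarity matrix: S_ij scores [p_i ; a_j ; p_i * a_j]
S = np.empty((n, m))
for i in range(n):
    for j in range(m):
        S[i, j] = Ws @ np.concatenate([p_tilde[i], a_tilde[j], p_tilde[i] * a_tilde[j]])

# article-to-answer attention: weighted answer vectors a_hat
alpha = softmax(S, axis=1)
a_hat = alpha @ a_tilde                 # (n, d)

# answer-to-article attention: beta = Softmax(max_row(S)), g = sum_i beta_i p_i
beta = softmax(S.max(axis=1))
g = beta @ p_tilde                      # (d,)

# enhanced supporting-article representation (four-way combination assumed)
p_enh = np.concatenate([p_tilde, a_hat, p_tilde * a_hat, p_tilde * g[None, :]], axis=1)
print(S.shape, p_enh.shape)  # (4, 3) (4, 24)
```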
In the same way, p̃⁻ and ã are input to the bidirectional attention module to obtain the bidirectionally attention-enhanced irrelevant-article representation p̂⁻.
1.3) A feedforward layer is adopted to score the enhanced representation of the supporting article and the enhanced representation of the irrelevant article, computed with the following question-related score functions:

s⁺ = Sigmoid(W_(p⁺,a) MeanPooling(p̂⁺))
s⁻ = Sigmoid(W_(p⁻,a) MeanPooling(p̂⁻))

where W_(p⁺,a) and W_(p⁻,a) are trainable weight matrices, s⁺ is the relevance score of the supporting article-answer combination, s⁻ is the relevance score of the irrelevant article-answer combination, MeanPooling is the average pooling operation, and Sigmoid is the sigmoid activation function;
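The question-related score function can be sketched as mean pooling followed by a linear layer and a sigmoid; the vector form of the trainable weights and the dimensions are assumptions.

```python
import numpy as np

def relevance_score(p_enh, W):
    # MeanPooling over positions, then a linear layer and the sigmoid activation
    pooled = p_enh.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(W @ pooled)))

rng = np.random.default_rng(5)
p_enh_pos = rng.standard_normal((4, 24))   # enhanced supporting-article representation
p_enh_neg = rng.standard_normal((4, 24))   # enhanced irrelevant-article representation
W = rng.standard_normal(24)                # trainable weights (vector form assumed)

s_pos = relevance_score(p_enh_pos, W)      # score of the supporting article-answer pair
s_neg = relevance_score(p_enh_neg, W)      # score of the irrelevant article-answer pair
print(0.0 < s_pos < 1.0, 0.0 < s_neg < 1.0)  # True True
```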
step two, pre-training the relation director obtained in the step one to obtain a trained relation director, and fixing the weight parameters of the answer related encoders in the relation director to obtain the answer related encoders with fixed parameters;
step three, constructing a question generation network, as shown in fig. 2, wherein the question generation network comprises a word embedding device, a context encoder, a two-channel interaction module, a question generator and an answer related encoder with fixed parameters obtained in the step two; the specific workflow of the problem generation network is as follows:
(1) Apply word embedding to a set of answers and articles to obtain the representation of the article and the representation of the answer;
(2) Input the representation of the article and the representation of the answer into the context encoder, where two weight-sharing bidirectional long short-term memory modules perform context encoding, and output the context representation of the article and the context representation of the answer;
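The weight-sharing context encoding can be sketched as follows; a plain tanh recurrence stands in for the bidirectional LSTM (gates omitted), and the dimensions and weight scales are illustrative.

```python
import numpy as np

def bi_rnn(x, W, U):
    # minimal bidirectional recurrence; a tanh RNN stands in for the LSTM
    def run(seq):
        h, out = np.zeros(W.shape[0]), []
        for t in seq:
            h = np.tanh(W @ h + U @ t)
            out.append(h)
        return np.stack(out)
    fwd = run(x)
    bwd = run(x[::-1])[::-1]
    return np.concatenate([fwd, bwd], axis=1)   # forward and backward states

rng = np.random.default_rng(6)
d, hdim = 8, 5
W = rng.standard_normal((hdim, hdim)) * 0.3
U = rng.standard_normal((hdim, d)) * 0.3        # ONE set of weights ...

article_rep = rng.standard_normal((7, d))
answer_rep = rng.standard_normal((3, d))
article_ctx = bi_rnn(article_rep, W, U)          # ... shared by both inputs,
answer_ctx = bi_rnn(answer_rep, W, U)            # as the weight sharing describes
print(article_ctx.shape, answer_ctx.shape)  # (7, 10) (3, 10)
```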
(3) Input the article representation and the answer representation into the parameter-fixed answer-related encoder obtained in step two, and output the context representation of the answer-related article.
(4) Input the context representations of the article, of the answer and of the answer-related article into the dual-channel interaction module. The dual-channel interaction module comprises two channels: an original interaction channel and a transfer interaction channel. The original interaction channel consists of an original interaction unit followed by a context encoder, the transfer interaction channel consists of a transfer interaction unit followed by a context encoder, and the output sides of the original interaction unit, the transfer interaction unit and the context encoders are each followed by a linear layer and a ReLU activation function.
The context representation of the article and the context representation of the answer are input into the original interaction channel, and the relevance score of the article-answer combination together with the context representation of the answer are input into the transfer interaction channel. The original interaction unit, the transfer interaction unit and the two context encoders are repeated K times with a residual mechanism, yielding the output x of the original interaction channel and the output y of the transfer interaction channel. Both the original interaction unit and the transfer interaction unit use bidirectional attention modules.
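The two channels can be sketched as follows; the bidirectional-attention interaction unit is reduced to a simple attention read, the per-step context encoder is elided, and feeding the answer-related context representation (rather than the raw relevance score) into the transfer channel is a simplifying assumption.

```python
import numpy as np

def interact(x, y):
    # stand-in for the bidirectional attention interaction unit: attention read of y by x
    att = np.exp(x @ y.T)
    att /= att.sum(axis=1, keepdims=True)
    return att @ y

def channel(x, y, K, Wl):
    # interaction unit -> linear layer + ReLU, repeated K times with residuals
    # (the per-step context encoder is elided)
    for _ in range(K):
        x = x + np.maximum(interact(x, y) @ Wl, 0.0)
    return x

rng = np.random.default_rng(7)
n, m, d, K = 4, 3, 6, 2
article_ctx = rng.standard_normal((n, d))
answer_ctx = rng.standard_normal((m, d))
answer_related_ctx = rng.standard_normal((n, d))
Wl = rng.standard_normal((d, d)) * 0.1

x = channel(article_ctx, answer_ctx, K, Wl)          # original interaction channel
y = channel(answer_related_ctx, answer_ctx, K, Wl)   # transfer interaction channel
print(x.shape, y.shape)  # (4, 6) (4, 6)
```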
(5) The output x of the original interaction channel and the output y of the transfer interaction channel are combined by a control gate according to the following formulas to obtain the joint representation G of the article:

g = σ(W_g [x; y] + b_g)
G = g · x + (1 − g) · y

where W_g and b_g are trainable parameters, σ is the activation function, g is the control gate, G is the joint representation of the article, · denotes element-wise multiplication, and [;] denotes vector concatenation.
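The control gate above, written directly in numpy with illustrative dimensions and randomly initialized parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(8)
n, d = 4, 6
x = rng.standard_normal((n, d))         # output of the original interaction channel
y = rng.standard_normal((n, d))         # output of the transfer interaction channel
Wg = rng.standard_normal((d, 2 * d))    # trainable parameters
bg = rng.standard_normal(d)

g = sigmoid(np.concatenate([x, y], axis=1) @ Wg.T + bg)  # g = sigma(Wg [x; y] + bg)
G = g * x + (1.0 - g) * y                                # G = g*x + (1-g)*y
print(G.shape)  # (4, 6)
```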
(6) Input the joint representation G of the article obtained in step (5) into the question generator and decode it to generate a question. Question generation is a decoding process that uses an attention-based long short-term memory decoder and addresses the out-of-vocabulary problem. The encoder attention memory is the joint representation G of the article obtained in step (5); in each decoding step, the attention result and the previous word are fed into the decoding unit, and the attention result output by the encoder is used to initialize the hidden state of the long short-term memory in the decoder.
The method is applied in the following embodiment to demonstrate its technical effects; the detailed steps are as described above and are not repeated.
Examples
Experiments were performed on the MS MARCO and SQuAD datasets. MS MARCO is a large-scale dataset collected from Bing, containing 1,010,916 questions and 8,841,823 related articles extracted from 3,563,535 web documents. The invention uses MS MARCO to train the relation director. In the MS MARCO dataset, each question-answer combination is accompanied by 10 articles obtained from the search engine, and some of these articles are insufficient to derive an answer to the question; the invention labels the irrelevant articles as negative examples and the supporting articles as positive examples. SQuAD is one of the most influential reading comprehension datasets, containing over 100,000 questions posed on 536 Wikipedia articles, with the answer spans contained in the articles. The invention splits the whole SQuAD dataset at the article level into a training set (80%), a development set (10%) and a test set (10%).
To objectively evaluate the performance of the algorithm, the effect of the invention is evaluated automatically on the selected test set with six metrics, namely BLEU-1, BLEU-2, BLEU-3, BLEU-4, ROUGE-L and METEOR, and three manual evaluation criteria, fluency, accuracy and Turing@1, are introduced to verify the model. Following the steps described above, the automatic evaluation results on the six metrics are shown in Table 1, and the experimental results for the fluency, accuracy and Turing@1 criteria, compared with existing methods, are shown in Tables 2, 3 and 4 respectively:

Table 1 Automatic evaluation results on six metrics
BLEU-1: 32.65  BLEU-2: 22.14  BLEU-3: 15.86  BLEU-4: 12.03  ROUGE-L: 32.36  METEOR: 20.25

Table 2 Experimental results for the fluency criterion
Model:    Seq2Seq  Seq2Seq+Attention  Seq2Seq+Attention+Copy  Ours
Fluency:  1.09     2.83               3.84                    4.14

Table 3 Experimental results for the accuracy criterion
Model:     Seq2Seq  Seq2Seq+Attention  Seq2Seq+Attention+Copy  Ours
Accuracy:  0.01     0.21               0.33                    0.53

Table 4 Experimental results for the Turing@1 criterion
Model:     Seq2Seq  Seq2Seq+Attention  Seq2Seq+Attention+Copy  Ours
Turing@1:  0.8%     7.8%               32.6%                   58.9%

Claims (7)

1. A question generation method based on relation guidance and a dual-channel interaction mechanism, characterized by comprising the following steps:
1) For a set of answers, supporting articles and irrelevant articles, construct a relation director that captures the relations between the answers and the corresponding articles; the relation director comprises an answer-related encoder and a question-related score function;
2) Pre-train the relation director obtained in step 1) to obtain a trained relation director, and fix the weight parameters of the answer-related encoder within it to obtain the parameter-fixed answer-related encoder;
3) Construct a question generation network comprising word embedding, a context encoder, a dual-channel interaction module, a question generator and the parameter-fixed answer-related encoder obtained in step 2);
4) For a set of answers and articles, obtain context representations of the article, the answer and the answer-related article through the word embedding and context encoder of the question generation network and the parameter-fixed answer-related encoder obtained in step 2);
5) Input the context representations of the article, the answer and the answer-related article obtained in step 4) into the dual-channel interaction module of the question generation network to obtain the joint representation of the article;
6) Input the joint representation of the article obtained in step 5) into the question generator of the network and decode it to generate a question;
7) Train the network to obtain the final question generation model, which generates a corresponding question given an answer and a passage of text.
2. The question generation method based on relation guidance and a dual-channel interaction mechanism according to claim 1, wherein step 1) is specifically as follows:
1.1) An answer, a supporting article and an irrelevant article from a search engine form a triple serving as the input sequence; each word in the triple is represented by a pre-trained word embedding, word positions are encoded by position embeddings, and the two are summed to obtain the final representation (p⁺, a, p⁻) of the triple, where a is the final representation of the answer, p⁺ that of the supporting article and p⁻ that of the irrelevant article;
1.2) The final representation (p⁺, a, p⁻) of the triple is input to the answer-related encoder, which comprises a self-attention unit, a gated convolutional neural network unit and a bidirectional attention module; the triple is encoded by the gated convolutional neural network and the self-attention mechanism according to the following formulas:

p̃⁺ = f([SelfAtt([GatedCNN(p⁺)]_N)]_M)
ã = f([SelfAtt([GatedCNN(a)]_N)]_M)
p̃⁻ = f([SelfAtt([GatedCNN(p⁻)]_N)]_M)

where [·]_M and [·]_N denote M repetitions of the self-attention unit and N repetitions of the gated convolutional neural network unit respectively, each self-attention unit SelfAtt(·) and each gated convolutional neural network unit GatedCNN(·) uses a residual mechanism and layer normalization, and f(·) is a fully connected feedforward layer with ReLU as the activation function; p̃⁺ is the final representation of the encoded supporting article, p̃⁻ the final representation of the encoded irrelevant article, and ã the final representation of the encoded answer;
p̃⁺ and ã are input to the bidirectional attention module, and the similarity matrix S ∈ R^(n×m) between the supporting article and the answer is obtained according to the following formula:

S_ij = W_s [p̃⁺_i ; ã_j ; p̃⁺_i ⊙ ã_j]

where W_s is a trainable matrix, p̃⁺_i is the i-th vector in the final representation of the encoded supporting article, ã_j is the j-th vector in the final representation of the encoded answer, [;] denotes vector concatenation, ⊙ denotes element-wise multiplication of vectors, and S_ij is the similarity between the i-th vector of the encoded supporting article and the j-th vector of the encoded answer;

The attention-weighted vector â_i of ã with respect to p̃⁺_i is computed as:

α_i = Softmax(S_i,:),  â_i = Σ_j α_ij ã_j

The attention-weighted vector g of p̃⁺ with respect to ã is computed as:

β_i = Softmax(max_row(S_ij)),  g = Σ_i β_i p̃⁺_i

p̃⁺, â and g are combined according to the following formula to obtain the bidirectionally attention-enhanced supporting-article representation p̂⁺:

p̂⁺_i = [p̃⁺_i ; â_i ; p̃⁺_i ⊙ â_i ; p̃⁺_i ⊙ g]
In the same way, p̃⁻ and ã are input to the bidirectional attention module to obtain the bidirectionally attention-enhanced irrelevant-article representation p̂⁻.
1.3) A feedforward layer is adopted to score the enhanced representation of the supporting article and the enhanced representation of the irrelevant article, computed with the following question-related score functions:

s⁺ = Sigmoid(W_(p⁺,a) MeanPooling(p̂⁺))
s⁻ = Sigmoid(W_(p⁻,a) MeanPooling(p̂⁻))

where W_(p⁺,a) and W_(p⁻,a) are trainable weight matrices, s⁺ is the relevance score of the supporting article-answer combination, s⁻ is the relevance score of the irrelevant article-answer combination, MeanPooling is the average pooling operation, and Sigmoid is the sigmoid activation function.
3. The question generation method based on relation guidance and a dual-channel interaction mechanism according to claim 1, wherein step 2) is specifically as follows:
Design the loss function L_r:

L_r = Σ_{(p⁺,a,p⁻)∈R} max(0, c − s⁺ + s⁻)

where s⁺ and s⁻ denote the relevance score of the supporting article-answer combination and the relevance score of the irrelevant article-answer combination respectively, c is a predefined hyperparameter, and R is the set of supporting article-answer-irrelevant article combinations;
Pre-train the relation director obtained in step 1) with this loss to obtain a trained relation director, and fix the parameters of the answer-related encoder within it to obtain the parameter-fixed answer-related encoder.
4. The question generation method based on relation guidance and a dual-channel interaction mechanism according to claim 1, wherein step 4) is specifically as follows:
4.1) Apply word embedding to a set of answers and articles to obtain the representation of the article and the representation of the answer;
4.2) Input the representation of the article and the representation of the answer into the context encoder, where two weight-sharing bidirectional long short-term memory modules perform context encoding, and output the context representation of the article and the context representation of the answer;
4.3) Input the article representation and the answer representation into the parameter-fixed answer-related encoder obtained in step 2), and output the context representation of the answer-related article.
5. The question generation method based on relation guidance and a dual-channel interaction mechanism according to claim 1, wherein step 5) is specifically as follows:
5.1) inputting the context representation of the article, the context representation of the answer and the context representation of the article related to the answer obtained in the step 4) into a dual-channel interaction module; the dual-channel interaction module comprises two channels: an original interaction channel and a transfer interaction channel; the original interactive channel is formed by sequentially connecting an original interactive unit and a context encoder, the transfer interactive channel is formed by sequentially connecting a transfer interactive unit and a context encoder, and the output sides of the original interactive unit, the transfer interactive unit and the context encoder are also respectively connected with a linear layer and a ReLU activation function;
inputting the context representation of the article and the context representation of the answer into the original interaction channel of the dual-channel interaction module, inputting the relevance score of the article-answer combination and the context representation of the answer into the transfer interaction channel of the dual-channel interaction module, and repeating the above steps K times with a residual mechanism over the original interaction unit, the transfer interaction unit and the two context encoders in the dual-channel interaction module to obtain the output x of the original interaction channel and the output y of the transfer interaction channel;
5.2) combining the output x of the original interaction channel and the output y of the transfer interaction channel with a control gate according to the following formulas to obtain the joint representation G of the article:
g = σ(W_g[x; y] + b_g)
G = g · x + (1 − g) · y
where W_g and b_g are trainable parameters, σ is the sigmoid activation function, g is the control gate, G is the joint representation of the article, and [;] denotes vector concatenation.
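The gated combination of step 5.2) can be sketched directly from the two formulas above. The dimensions and the random initialization below are illustrative toy choices; in the patented method W_g and b_g are learned during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(x, y, W_g, b_g):
    """Combine the original-channel output x and the transfer-channel
    output y with a control gate: g = sigmoid(W_g[x; y] + b_g),
    G = g * x + (1 - g) * y."""
    g = sigmoid(np.concatenate([x, y], axis=-1) @ W_g + b_g)  # gate in (0, 1)
    return g * x + (1.0 - g) * y                              # joint representation G

# toy dimensions (hypothetical): sequence length 4, hidden size 3
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # original interaction channel output
y = rng.normal(size=(4, 3))            # transfer interaction channel output
W_g = rng.normal(size=(6, 3)) * 0.1    # maps [x; y] (dim 6) to gate (dim 3)
b_g = np.zeros(3)

G = gated_fusion(x, y, W_g, b_g)
```

Because g lies strictly between 0 and 1, each element of G is a convex combination of the corresponding elements of x and y, so the gate smoothly interpolates between the two channels rather than selecting one outright.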
6. The problem generation method based on the relationship guidance and the dual-channel interaction mechanism as claimed in claim 5, wherein the original interaction unit and the transfer interaction unit both use a bidirectional attention module.
7. The method of claim 1, wherein the problem generator in step 6) uses an attention-based long short-term memory decoder to address the out-of-vocabulary problem.
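Claim 7 specifies an attention-based LSTM decoder that addresses out-of-vocabulary words. One common realization of this idea, not necessarily the one used in the patent, is a copy (pointer) distribution that lets the decoder emit source tokens whose ids fall outside the fixed vocabulary. The sketch below shows a single decoding step under that assumption; all names, weights and dimensions are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pointer_step(h_dec, H_src, src_ids, W_att, vocab_logits, p_gen):
    """One decoding step mixing a fixed-vocabulary distribution with a
    copy distribution over source positions (a pointer-style sketch).
    Source tokens with id >= V extend the output space, which is how
    such decoders can emit out-of-vocabulary words."""
    att = softmax(H_src @ W_att @ h_dec)       # attention over source positions
    V = vocab_logits.shape[0]
    V_ext = max(V, int(src_ids.max()) + 1)     # extended vocabulary incl. OOV ids
    p = np.zeros(V_ext)
    p[:V] = p_gen * softmax(vocab_logits)      # generate from the fixed vocabulary
    for pos, tok in enumerate(src_ids):        # copy source tokens via attention
        p[tok] += (1.0 - p_gen) * att[pos]
    return p

rng = np.random.default_rng(3)
H_src = rng.normal(size=(6, 4))                # encoder states for 6 source tokens
h_dec = rng.normal(size=4)                     # current decoder hidden state
W_att = rng.normal(scale=0.1, size=(4, 4))
src_ids = np.array([2, 7, 1, 9, 9, 0])         # ids 7 and 9 are OOV (V = 5)
p = pointer_step(h_dec, H_src, src_ids, W_att, rng.normal(size=5), p_gen=0.7)
```

Even though the fixed vocabulary has only 5 entries, the resulting distribution assigns nonzero probability to the OOV ids 7 and 9 that appear in the source, so the decoder can output those words by copying.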
CN201911238302.1A 2019-12-06 2019-12-06 Problem generation method based on relationship guidance and dual-channel interaction mechanism Withdrawn CN110969010A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911238302.1A CN110969010A (en) 2019-12-06 2019-12-06 Problem generation method based on relationship guidance and dual-channel interaction mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911238302.1A CN110969010A (en) 2019-12-06 2019-12-06 Problem generation method based on relationship guidance and dual-channel interaction mechanism

Publications (1)

Publication Number Publication Date
CN110969010A true CN110969010A (en) 2020-04-07

Family

ID=70033105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911238302.1A Withdrawn CN110969010A (en) 2019-12-06 2019-12-06 Problem generation method based on relationship guidance and dual-channel interaction mechanism

Country Status (1)

Country Link
CN (1) CN110969010A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763444A (en) * 2018-05-25 2018-11-06 杭州知智能科技有限公司 The method for solving video question and answer using hierarchical coding decoder network mechanism
CN108846130A (en) * 2018-06-29 2018-11-20 北京百度网讯科技有限公司 A kind of question text generation method, device, equipment and medium
CN109684452A (en) * 2018-12-25 2019-04-26 中科国力(镇江)智能技术有限公司 A kind of neural network problem generation method based on answer Yu answer location information
US20190251168A1 (en) * 2018-02-09 2019-08-15 Salesforce.Com, Inc. Multitask Learning As Question Answering
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems


Non-Patent Citations (1)

Title
YUTONG WANG ET AL.: "Weak Supervision Enhanced Generative Network for Question Generation", 28th International Joint Conference on Artificial Intelligence (IJCAI-19) *

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN112287978A (en) * 2020-10-07 2021-01-29 武汉大学 Hyperspectral remote sensing image classification method based on self-attention context network
CN112287978B (en) * 2020-10-07 2022-04-15 武汉大学 Hyperspectral remote sensing image classification method based on self-attention context network
US11783579B2 (en) 2020-10-07 2023-10-10 Wuhan University Hyperspectral remote sensing image classification method based on self-attention context network
CN115080715A (en) * 2022-05-30 2022-09-20 重庆理工大学 Span extraction reading understanding method based on residual error structure and bidirectional fusion attention

Similar Documents

Publication Publication Date Title
CN109657041B (en) Deep learning-based automatic problem generation method
CN107239446B (en) A kind of intelligence relationship extracting method based on neural network Yu attention mechanism
CN109766427B (en) Intelligent question-answering method based on collaborative attention for virtual learning environment
CN110390397B (en) Text inclusion recognition method and device
CN109992657B (en) Dialogue type problem generation method based on enhanced dynamic reasoning
CN107562792A (en) A kind of question and answer matching process based on deep learning
CN112559702B (en) Method for generating natural language problem in civil construction information field based on Transformer
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN105843801A (en) Multi-translation parallel corpus construction system
CN111125333B (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN111563146A (en) Inference-based difficulty controllable problem generation method
CN116596347B (en) Multi-disciplinary interaction teaching system and teaching method based on cloud platform
CN110969010A (en) Problem generation method based on relationship guidance and dual-channel interaction mechanism
CN114548053A (en) Text comparison learning error correction system, method and device based on editing method
CN114328853B (en) Chinese problem generation method based on Unilm optimized language model
CN110321568A (en) The Chinese-based on fusion part of speech and location information gets over convolutional Neural machine translation method
CN112668344B (en) Complexity-controllable diversified problem generation method based on mixed expert model
CN114692615A (en) Small sample semantic graph recognition method for small languages
Chen A deep learning-based intelligent quality detection model for machine translation
Liu et al. Semantic Repeatability Screening Mechanism of Intelligent Learning Platform Based on Bi-LSTM.
CN111428499A (en) Idiom compression representation method for automatic question-answering system by fusing similar meaning word information
Mu Gated Recurrent Unit Framework for Ideological and Political Teaching System in Colleges
CN110929265B (en) Multi-angle answer verification method for reading, understanding, asking and answering
Alissa et al. Text simplification using transformer and BERT
Nie et al. Predicting Reading Comprehension Scores of Elementary School Students.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200407
