CN110162613A - Question generation method, apparatus, device and storage medium - Google Patents

Question generation method, apparatus, device and storage medium

Info

Publication number
CN110162613A
Authority
CN
China
Prior art keywords
answer
text
vector
question
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910447602.4A
Other languages
Chinese (zh)
Other versions
CN110162613B (en)
Inventor
高一帆
李丕绩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010169926.9A priority Critical patent/CN111414464B/en
Priority to CN201910447602.4A priority patent/CN110162613B/en
Publication of CN110162613A publication Critical patent/CN110162613A/en
Application granted granted Critical
Publication of CN110162613B publication Critical patent/CN110162613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

This application provides a question generation method, apparatus, device, and storage medium. The method includes: encoding, through a first encoding model, a first word vector corresponding to a reference text, an answer information vector, and a text position vector corresponding to the current question-and-answer round number, to obtain a first semantic vector sequence; encoding, through a second encoding model, a second word vector corresponding to the historical question-and-answer text, to obtain a second semantic vector sequence; decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model, to obtain the question text corresponding to the current round number; and outputting the question text. With this application, coherent and highly conversational questions can be generated in combination with historical dialogue content.

Description

Question generation method, apparatus, device and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a question generation method, apparatus, device, and storage medium.
Background
With the rapid development of artificial intelligence technology, natural language processing, one of the most important research fields of artificial intelligence, covers increasingly rich content, including machine translation, automatic summarization, question generation, and the like. Question Generation (QG) is a technique for automatically generating a corresponding question from a segment of text, and can be regarded as an advanced retrieval form in information retrieval. Question generation technology can be used for knowledge testing in education scenarios; for example, an Intelligent Tutor System can actively pose questions about a reading-comprehension passage to test students' understanding of the article. The technology can also be applied to chat robots and voice assistants, enabling the chat system to actively ask questions to enhance the interactivity and continuity of the conversation. In addition, question generation technology can be applied to the medical field, for example in an automatic inquiry system that makes a diagnosis through dialogue with a patient.
Existing question generation methods mainly focus on generating questions for reading comprehension of a single sentence of text; because the previous dialogue content is not taken into account, the resulting conversation is often incoherent, which leads to a poor user experience.
Disclosure of Invention
The embodiments of the present application provide a question generation method, apparatus, device, and storage medium, which can generate coherent and highly conversational questions in combination with historical dialogue content.
The technical scheme of the embodiment of the application is realized as follows:
The embodiment of the present application provides a question generation method, which comprises the following steps:
encoding, through a first encoding model, a first word vector corresponding to a reference text, an answer information vector, and a text position vector corresponding to the current question-and-answer round number, to obtain a first semantic vector sequence;
encoding, through a second encoding model, a second word vector corresponding to the historical question-and-answer text, to obtain a second semantic vector sequence;
decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model, to obtain a question text corresponding to the current question-and-answer round number;
and outputting the question text.
An embodiment of the present application provides a question generation apparatus, including:
the first coding module is used for coding a first word vector corresponding to the reference text, an answer information vector and a text position vector corresponding to the current question-and-answer round number through a first coding model to obtain a first semantic vector sequence;
the second coding module is used for coding a second word vector corresponding to the historical question-answer text through a second coding model to obtain a second semantic vector sequence;
the decoding module is used for decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model to obtain a question text corresponding to the current number of the question and answer rounds;
and the output module is used for outputting the question text.
In the above solution, the apparatus further comprises:
a training module, configured to perform joint training on the decoding model according to at least a first optimization objective function and a second optimization objective function, so as to adjust parameters of the decoding model;
the first optimization objective function is used for focusing attention distribution of the decoding model on entity nouns when pronouns need to be generated, and the second optimization objective function is used for optimizing chapter position attention distribution corresponding to each round of question-answer dialog, so that chapter position attention distribution of the decoding model is focused on texts corresponding to current question-answer rounds.
An embodiment of the present application provides a question generation device, including:
a memory for storing executable instructions;
and the processor is used for realizing the method provided by the embodiment of the application when executing the executable instructions stored in the memory.
The embodiment of the application provides a storage medium, which stores executable instructions and is used for causing a processor to execute the executable instructions so as to realize the method provided by the embodiment of the application.
The embodiment of the application has the following beneficial effects:
when the method provided by the embodiment of the present application is used to generate questions, the relevant vectors of the reference text and the relevant vectors of the historical question-and-answer text are separately encoded and then decoded to obtain the generated question; because the historical dialogue content is taken into account, the generated question connects better with the historical dialogue content. Moreover, the vectors related to the reference text also include the text position vector corresponding to the current question-and-answer round number, so the focus of the question can be concentrated on the text corresponding to the current round number, which makes the generated question more targeted.
Drawings
FIG. 1A is a schematic diagram of a gated coreference-knowledge neural question generation model in the related art;
fig. 1B is a schematic diagram of a network architecture of a question generation method according to an embodiment of the present application;
fig. 1C is a schematic diagram of another network architecture of a question generation method according to an embodiment of the present application;
fig. 1D is a schematic diagram of another network architecture of a question generation method according to an embodiment of the present application;
fig. 2 is an alternative structural diagram of a terminal 400 according to an embodiment of the present application;
fig. 3A is a schematic flow chart of an implementation of a question generation method according to an embodiment of the present application;
fig. 3B is a schematic flow chart of an implementation of the question generation method according to the embodiment of the present application;
fig. 4 is a schematic flow chart of an implementation of obtaining word vectors of reference texts according to an embodiment of the present application;
fig. 5 is a system architecture diagram of a neural network model of a question generation method according to an embodiment of the present application.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the attached drawings, the described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first \ second \ third" are used only to distinguish similar objects and do not denote a particular order; it should be understood that "first \ second \ third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the application described herein can be implemented in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Question generation: the system automatically generates a corresponding question from a segment of text;
2) Conversationality refers to how well a sentence fits when it appears in a conversation.
3) Conversation Flow refers to a plurality of conversations occurring successively according to a time axis, wherein the topics of the conversations may transition one or more times according to the time axis.
4) The intelligent tutor system is a system which can automatically issue questions, evaluate the student level and give feedback in an education scene;
5) The Loss Function, also called cost function, is a function that maps the value of a random event or its related random variables to non-negative real numbers to represent the "risk" or "loss" of the random event. In application, the loss function is usually associated with an optimization problem as a learning criterion, i.e., the model is solved and evaluated by minimizing the loss function. For example, it is used for parameter estimation of models in statistics and machine learning, and serves as the optimization goal of a machine learning model.
6) Bilingual Evaluation Understudy (BLEU) is an evaluation criterion for measuring the quality of machine translation. BLEU is a weighted geometric average of the precision of N-grams (sets of N words); the end result is the ratio of the number of correctly matched N-grams to the number of occurrences of all N-grams in the machine-translated output.
7) ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics for evaluating automatic summarization and machine translation. It measures the "similarity" between an automatically generated summary or translation and a set of reference summaries (usually manually produced) by comparing them and computing a corresponding score.
L in Rouge-L is the initial letter of LCS (Longest Common Subsequence) because Rouge-L uses the Longest Common Subsequence.
8) The attention mechanism determines the attention distribution of the decoder's output according to the degree of matching between the current input sequence of the decoding model and the output vector; a higher degree of matching means a relatively higher score at the point where attention is concentrated.
9) The word vector, also called word embedding or word space embedding representation, is the representation of a natural-language token in a word space, that is, a vector obtained by mapping a word into a semantic space.
10) The coding model, which may also be referred to as an encoder or an encoder model, may be a Recurrent Neural Network (RNN) model, and is capable of reading an entire source sequence and encoding it into a fixed-length representation.
11) The decoding model, which may also be called a decoder or a decoder model, may likewise be an RNN model; the decoding model may be any of various RNNs with gating/memory, such as an RNN based on a Long Short-Term Memory network (LSTM), a Transformer model, or an RNN based on a Gated Recurrent Unit (GRU). The decoding model decodes the input sequence obtained after coding so as to output the target sequence.
12) Attention distribution: when the decoder model decodes each word vector of the word vector sequence to output a question, the distribution of the probability that each word in the historical question-and-answer text (each word in the sequence currently input to the decoder model) is output as the decoding result, as determined by the attention mechanism. For example, when decoding the current word vector, if the probability distribution over the outputs "he" and "Clinton" is (0.6, 0.4), the attention distribution of the decoding is focused on "he".
13) Chapter position attention distribution: when the decoder model decodes the output question, the distribution of probabilities, over a plurality of different texts, that each text is the one containing the correct answer for the current question-and-answer round number. For example, when texts 1, 2, and 3 exist, assuming that the probability distribution is (0.6, 0.2, 0.2), the chapter position attention distribution is concentrated on text 1.
In order to better understand the method provided in the embodiments of the present application, a gated coreference-knowledge neural question generation model used for question generation in the related art is first described.
Fig. 1A is a schematic diagram of the gated coreference-knowledge neural question generation model. As shown in fig. 1A, the model includes at least an Encoder 101 and a Decoder 102, that is, question generation is based on an encoder-decoder neural network framework. In implementation, a segment of short English text is first converted from a discrete source word representation into a continuous word space embedding representation (word embedding), answer position information is converted into corresponding answer features, and coreference position information is converted into coreference position features; the three features are concatenated and input into the encoder 101. The encoder 101 encodes the input into a semantic vector sequence (h_1, h_2, ...) and then feeds the encoded semantic vector sequence into the decoder 102. The decoder 102 reads this semantic vector sequence and generates a question word by word through an attention mechanism and a recurrent neural network.
The coreference position feature can be obtained through the following steps:
Step 11: obtaining the coreference pairs (mention pairs) and the corresponding confidence scores in the text by means of an existing coreference resolution tool.
Taking FIG. 1A as an example, the coreference resolution tool associates the pronoun "they" in the sentence with the most likely antecedent noun "the Panthers" above, while obtaining the score of this coreference pair (mention-pair score).
Step 12: inserting the antecedent noun (the Panthers) in front of the pronoun (they), and converting this into the coreference position feature f_c = (c_1, c_2, ..., c_n) to indicate whether a coreference phenomenon occurs at the current position.
Step 13: generating a refined coreference position feature f_d = (d_1, d_2, ..., d_n) from the confidence scores of the coreference pairs obtained in step 11 through a gating mechanism, where:
d_i = c_i ⊙ g_i, g_i = MLP(c_i, score_i), in which ⊙ denotes the element-wise product (the product at corresponding positions of two vectors), and MLP is a Multi-Layer Perceptron neural network.
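As a concrete illustration of the gating in step 13, the following sketch implements d_i = c_i ⊙ g_i with g_i = MLP(c_i, score_i) in PyTorch; the feature dimension, hidden size, and two-layer MLP are illustrative assumptions rather than the exact network of the related-art model.

```python
import torch
import torch.nn as nn

class CorefGate(nn.Module):
    """Refined coreference position feature: d_i = c_i * g_i, g_i = MLP(c_i, score_i)."""
    def __init__(self, feat_dim: int, hidden_dim: int = 32):
        super().__init__()
        # The MLP reads c_i concatenated with the mention-pair confidence score
        # and produces a gate of the same size as c_i (sizes are assumptions).
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, feat_dim),
            nn.Sigmoid(),
        )

    def forward(self, c: torch.Tensor, score: torch.Tensor) -> torch.Tensor:
        # c:     (seq_len, feat_dim) coreference position features
        # score: (seq_len, 1)        confidence scores of the coreference pairs
        g = self.mlp(torch.cat([c, score], dim=-1))  # gate g_i
        return c * g                                 # d_i = c_i ⊙ g_i

d = CorefGate(feat_dim=16)(torch.randn(6, 16), torch.rand(6, 1))
print(d.shape)  # torch.Size([6, 16])
```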
The above technique has the following problems: 1) question generation can only consider a short English text; in a conversation, the previous dialogue content is not modeled, so the generated question is not coherent with the dialogue history; 2) the conversation flow is not modeled, so the focus shift between multiple questions cannot be planned when testing students' understanding of an English passage.
Based on this, the embodiments of the present application provide a question generation method that generates questions in combination with historical dialogue content, thereby ensuring coherence between question-and-answer dialogues; in addition, when generating questions, the text position information corresponding to the current round number is also considered, so that the focus can be shifted among multiple questions.
An exemplary application of the apparatus implementing the embodiment of the present application is described below, and the apparatus provided in the embodiment of the present application may be implemented as a terminal, and may also be implemented as a server. In the following, exemplary applications covered when the device is implemented as a terminal and a server will be described, respectively.
Referring to fig. 1B, fig. 1B is a schematic diagram of a network architecture of a question generation method according to an embodiment of the present application. To support an exemplary application, a first terminal 400 and a second terminal 100 establish a communication connection through a network 300, where the network 300 may be a wide area network or a local area network, or a combination of the two, and data transmission is implemented using a wireless link. The question generation method provided by the embodiment of the present application can be applied to an online education scenario; assuming that the second terminal 100 serves as a teacher terminal and the first terminal 400 is a student terminal, two first terminals 400-1 and 400-2 are exemplarily shown in fig. 1B.
Under this network architecture, the second terminal 100 may first send an article to the first terminal 400-1 and the first terminal 400-2; after the users corresponding to the first terminal 400-1 and the first terminal 400-2 finish studying the article, they may be tested on it. At this time, the second terminal 100 may generate a question based on the article and send the question to the first terminal 400-1 and the first terminal 400-2, and after the users corresponding to the first terminal 400-1 and the first terminal 400-2 answer, continue to generate questions according to the users' answer results and the article.
Fig. 1C is a schematic diagram of another network architecture of the question generation method according to the embodiment of the present application. To support an exemplary application, a third terminal 500 is connected to the server 200 through the network 300, where the third terminal 500 may be an intelligent terminal on which an application program (App) capable of conversation and chat is installed. The third terminal 500 may also be an intelligent chat robot. The network 300 may be a wide area network or a local area network, or a combination of both, using wireless links for data transmission.
The third terminal 500 may collect voice conversation information between the user and the third terminal 500, send the collected voice conversation information to the server 200, the server 200 generates a question based on the voice conversation information of the user, and sends the question to the third terminal 500, and the third terminal 500 outputs the question, for example, the question may be output in a voice manner. During the subsequent conversation, the server 200 will continue to generate questions based on the previous conversation content, so that the conversation communication between the third terminal 500 and the user is coherent and smooth, thereby giving the user a good communication experience.
Fig. 1D is a schematic diagram of another network architecture of the question generation method according to the embodiment of the present application. As shown in fig. 1D, the network architecture only includes a third terminal 500, and the third terminal 500 may be a smart phone, a tablet computer, a notebook computer, or the like, or may be a chat robot. The third terminal 500 is exemplarily illustrated in fig. 1D in the form of a chat robot. After collecting the voice dialogue information of the user, the third terminal 500 generates a question according to the collected voice dialogue information and outputs the question. In the subsequent dialogue process, the third terminal 500 may continue to generate questions based on the previous dialogue content, so that the dialogue communication between the third terminal 500 and the user is coherent and smooth, thereby giving the user a good communication experience.
The apparatus provided in the embodiments of the present application may be implemented as hardware or a combination of hardware and software, and various exemplary implementations of the apparatus provided in the embodiments of the present application are described below.
Referring to fig. 2, fig. 2 is a schematic diagram of an alternative structure of a terminal 400 according to an embodiment of the present application. The terminal 400 may be a mobile phone, a computer, a digital broadcast terminal, an information transceiver device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like. An exemplary structure of the apparatus when implemented as a server can be envisaged from the structure of the terminal 400; therefore, the structure described here should not be regarded as limiting. For example, some components described below may be omitted, or components not described below may be added to meet the specific requirements of certain applications.
The terminal 400 shown in fig. 2 includes: at least one processor 410, memory 440, at least one network interface 420, and a user interface 430. The components in the terminal 400 are coupled together by a bus system 450. It is understood that the bus system 450 is used to enable connected communication between these components. The bus system 450 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as the bus system 450 in fig. 2.
The user interface 430 may include a display, keyboard, mouse, trackball, click wheel, keys, buttons, touch pad or touch screen, etc.
Memory 440 may be either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM). The volatile Memory may be a Random Access Memory (RAM). The memory 440 described in embodiments herein is intended to comprise any suitable type of memory.
The memory 440 in the embodiment of the present application is capable of storing data to support the operation of the terminal 400. Examples of such data include: any computer program for operating on the terminal 400, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program may include various application programs.
As an example of the method provided by the embodiment of the present application implemented by a combination of hardware and software, the method provided by the embodiment of the present application can be directly embodied as a combination of software modules executed by the processor 410, the software modules can be located in a storage medium located in the memory 440, the processor 410 reads executable instructions included in the software modules in the memory 440, and the method provided by the embodiment of the present application is completed in combination with necessary hardware (for example, including the processor 410 and other components connected to the bus 450).
By way of example, the Processor 410 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor or the like.
Methods of implementing embodiments of the present application will be described in connection with the foregoing exemplary application and implementations of apparatus implementing embodiments of the present application.
Referring to fig. 3A, fig. 3A is a schematic flow chart of an implementation of the question generation method according to the embodiment of the present application. The question generation method according to the embodiment of the present application can be applied to the second terminal 100 shown in fig. 1B, the server 200 shown in fig. 1C, and the third terminal 500 shown in fig. 1D, and is implemented by running a neural network model for question generation.
In some embodiments, the neural network model for question generation comprises: a first coding model, a second coding model, and a decoding model, where the first coding model is used to encode the reference text to obtain a first semantic vector sequence, the second coding model is used to encode the historical dialogue text to obtain a second semantic vector sequence, and the decoding model is used to decode the first semantic vector sequence and the second semantic vector sequence to obtain the generated question.
The first coding model and the second coding model may be the same type of coding model, e.g., both RNN models, but the parameters of the first coding model and the second coding model may be different. The first coding model may correspond to the chapter encoder 501 shown in fig. 5, and the second coding model may correspond to the paragraph encoder 502 shown in fig. 5.
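For orientation, a minimal PyTorch sketch of such a two-encoder, one-decoder layout is shown below. The GRU choice, the dimensions, and the mean-pooled context (standing in for the attention mechanism) are simplifying assumptions, not the patent's exact network.

```python
import torch
import torch.nn as nn

class QuestionGenerator(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # first coding model: word vector + answer flag + round-position flag
        self.passage_enc = nn.GRU(emb_dim + 2, hid_dim, batch_first=True)
        # second coding model: dialogue-history word vectors
        self.history_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim + 2 * hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, passage_ids, answer_flags, pos_flags, history_ids, question_ids):
        p = torch.cat([self.embed(passage_ids), answer_flags, pos_flags], dim=-1)
        h1, _ = self.passage_enc(p)                        # first semantic vector sequence
        h2, _ = self.history_enc(self.embed(history_ids))  # second semantic vector sequence
        # crude stand-in for attention: mean-pool each memory and feed it to every step
        ctx = torch.cat([h1.mean(1), h2.mean(1)], dim=-1)  # (B, 2*hid_dim)
        q = self.embed(question_ids)                       # teacher-forced question tokens
        ctx = ctx.unsqueeze(1).expand(-1, q.size(1), -1)
        d, _ = self.decoder(torch.cat([q, ctx], dim=-1))
        return self.out(d)                                 # per-step vocabulary logits

model = QuestionGenerator()
logits = model(
    torch.randint(0, 5000, (2, 40)),             # reference-text token ids
    torch.rand(2, 40, 1), torch.rand(2, 40, 1),  # answer / round-position flags
    torch.randint(0, 5000, (2, 20)),             # history token ids
    torch.randint(0, 5000, (2, 10)),             # question tokens (teacher forcing)
)
print(logits.shape)  # torch.Size([2, 10, 5000])
```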
The question generation method provided by the embodiment of the present application will be described below with reference to the steps shown in fig. 3A.
In step S201, a first word vector corresponding to the reference text, an answer information vector, and a text position vector corresponding to the current number of question and answer rounds are encoded by a first encoding model, so as to obtain a first semantic vector sequence.
Here, the first word vector may be a vector representation of all words in the reference text in a continuous word space, or may be a vector representation of words in the reference text that are not answers in the continuous word space.
The answer information vector is a vector representation, in a continuous word space, of the words marked as answers in the reference text. In the embodiment of the present application, the reference text may be divided into several parts in advance; for example, the reference text may be divided into parts by sentences, with each sentence or every two sentences forming one part, or divided into parts by word count, with every ten or fifteen words forming one part. Each part has position information within the reference text, which may include the position of the first word of the part in the reference text and the position of the last word of the part in the reference text. The question-and-answer round number corresponds to the parts into which the reference text is divided: the text content of one part may correspond to one round number, or one part may correspond to several round numbers. The text position vector corresponding to the current question-and-answer round number is generated according to the text position information corresponding to the current question-and-answer round number.
When step S201 is implemented, the first word vector corresponding to the reference text, the answer information vector, and the text position vector corresponding to the current question-and-answer round number are first obtained and used as the inputs of the coding model. When the coder encodes the first word vector, the answer information vector, and the text position vector, the three are first spliced; the spliced vector of indefinite length is then encoded and converted into an intermediate variable of fixed length, and the sequence information of the spliced vector is encoded into the intermediate variable, thereby obtaining the first semantic vector sequence.
In step S202, a second word vector corresponding to the historical question and answer text is encoded by a second encoding model, so as to obtain a second semantic vector sequence.
In some embodiments, step S202 described above may be implemented in such a way: the second coding model transforms the second word vector into a context variable having a fixed length and codes the second word vector in the context variable, thereby obtaining a second semantic vector sequence.
In some embodiments, before step S202, a second word vector corresponding to the historical question-and-answer text is obtained. In implementation, the original word vectors of all words in the historical question-and-answer text are obtained first, and then each original word vector is mapped to a second word vector. The original word vectors may be one-hot vectors; when mapping each original word vector to a second word vector, each original word vector may be used as the input of a word-to-vector (word2vec) model, and the low-dimensional second word vector is obtained through the word2vec model.
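The mapping from a high-dimensional one-hot vector to a low-dimensional second word vector amounts to a table lookup; the toy sketch below uses a random matrix in place of a trained word2vec embedding table.

```python
import numpy as np

vocab = ["what", "was", "clinton", "ineligible", "to", "serve"]
vocab_size, emb_dim = len(vocab), 4

one_hot = np.eye(vocab_size)                            # original one-hot word vectors
embedding_table = np.random.randn(vocab_size, emb_dim)  # stand-in for a trained word2vec table

def to_word_vector(word: str) -> np.ndarray:
    # multiplying a one-hot vector by the table selects that word's dense row
    return one_hot[vocab.index(word)] @ embedding_table

print(to_word_vector("clinton").shape)  # (4,)
```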
In step S203, the first semantic vector sequence and the second semantic vector sequence are decoded by a decoding model to obtain a question text corresponding to the current number of question-answering rounds.
When the first semantic vector sequence is generated, the text position corresponding to the current question-and-answer round number is used as one of the inputs, and the second semantic vector sequence is obtained by encoding the second word vector corresponding to the historical question-and-answer content. Therefore, when the decoding model decodes the first semantic vector sequence and the second semantic vector sequence word by word to obtain a question, the question is linked with the historical question-and-answer content, which ensures the fluency of the question answering, and the question is targeted at the text content corresponding to the current round number. When the value of the current question-and-answer round number changes, questions related to the text content of the new round number can be generated, so that the topic can be gradually shifted within one multi-turn conversation, and questions in multiple rounds of question answering can be generated better.
In step S204, the question text is output.
Here, if step S204 is implemented by the second terminal in fig. 1B or the server in fig. 1C, outputting the question text may be sending the generated question text to the first terminal shown in fig. 1B or to the third terminal shown in fig. 1C, so that the first terminal or the third terminal displays the question text in the form of characters through a display device, or of course, the first terminal or the third terminal plays the voice information corresponding to the question text in the form of voice.
If step S204 is implemented by the third terminal in fig. 1D, outputting the question text may be that the third terminal displays the question text in the form of characters through a display device, or the third terminal plays the voice message corresponding to the question text in the form of voice.
When the method provided by the embodiment of the present application is used to generate questions, the relevant vectors of the reference text and the relevant vectors of the historical question-and-answer text are separately encoded and then decoded to obtain the generated question; because the historical dialogue content is taken into account, the generated question connects better with the historical dialogue content. Moreover, the vectors related to the reference text also include the text position vector corresponding to the current question-and-answer round number, so the focus of the question can be concentrated on the text corresponding to the current round number, which makes the generated question more targeted.
In some embodiments, before step S201, a first word vector and an answer information vector corresponding to the reference text need to be obtained first. Referring to fig. 4, fig. 4 is a schematic view of an implementation flow for obtaining a word vector of a reference text according to an embodiment of the present application, and as shown in fig. 4, the implementation flow includes the following steps:
step S111, obtaining original word vectors and attribute information corresponding to each participle in the reference text.
When the first word vector and the answer information vector corresponding to the reference text are obtained, the original word vector automatically generated for each participle in the reference text is obtained first. The original word vector may be a one-hot vector, whose dimensionality is generally high; therefore, after the original word vector of each participle is obtained, each original word vector is mapped into a low-dimensional continuous word vector space.
The attribute information represents whether the participle is an answer or not.
Step S112, determining whether the attribute information of the participle indicates that the participle is an answer.
Here, if the attribute information of the participle indicates that the participle is an answer, the process proceeds to step S114; if the attribute information of the participle indicates that the participle is not an answer, the process proceeds to step S113.
Step S113, when the attribute information of the participle indicates that the participle is not an answer, mapping the original word vector corresponding to the participle into a first word vector.
For example, for the sentence "Incumbent Democratic President Bill Clinton was ineligible to serve a third term due to term limits in the 22nd Amendment of the Constitution", the attributes of "Democratic" and "term limits in the 22nd Amendment of the Constitution" are answers, and the attributes of the remaining words are non-answers. Then the first word vector at least includes the low-dimensional word vectors obtained by mapping the original word vectors of the words Incumbent, President, Bill, Clinton, was, ineligible, to, serve, a, third, term, due, and to.
Step S114, when the attribute information of the participle indicates that the participle is an answer, mapping an original word vector corresponding to the participle into an answer information vector.
For example, the answer information vector at least includes the low-dimensional word vectors mapped from the original word vectors of the words Democratic, term, limits, in, the, 22nd, Amendment, of, the, and Constitution, and in some embodiments may also include a position information component of the answer.
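A small sketch of steps S111 to S114 follows: each participle carries an "is answer" attribute; tokens flagged as answers are mapped into answer information vectors and the rest into first word vectors. The token list and random embedding lookup are illustrative stand-ins for the real mapping.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = {tok: rng.standard_normal(4) for tok in
       ["incumbent", "democratic", "president", "bill", "clinton", "was"]}

# (participle, attribute) pairs; attribute True means the word is part of an answer
tokens = [("incumbent", False), ("democratic", True), ("president", False),
          ("bill", False), ("clinton", False), ("was", False)]

first_word_vectors = [emb[t] for t, is_answer in tokens if not is_answer]  # step S113
answer_info_vectors = [emb[t] for t, is_answer in tokens if is_answer]     # step S114
print(len(first_word_vectors), len(answer_info_vectors))  # 5 1
```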
In some embodiments, the above step S201 may be implemented in the following manner: the first coding model encodes the acquired first word vector corresponding to the reference text, the answer information vector, and the text position vector corresponding to the current question-and-answer round number, so as to obtain the first semantic vector sequence, where:
Step S2011: the first word vector, the answer information vector, and the text position vector are spliced to obtain a spliced vector.
Here, when the first word vector, the answer information vector, and the text position vector are spliced, they may be spliced according to a preset order. For example, the preset order may place the text position vector first, the answer information vector second, and the first word vector third; in that case, the text position vector, the answer information vector, and the first word vector are spliced in sequence to obtain the spliced vector.
Step S2012: the spliced vector is converted into an intermediate vector having a fixed length by the first coding model.
Here, the intermediate vector may be a Context Vector, which is a weighted context vector over the words; the context vector captures the contextual relationship between one word and another.
Step S2013: the sequence information of the spliced vector is encoded into the intermediate vector to obtain the first semantic vector sequence.
Since the lengths of the first word vector, the answer information vector, and the text position vector are generally different for different reference texts, the lengths of the resulting spliced vectors differ; when encoding the spliced vector, the first coding model encodes it into a first semantic vector sequence with a fixed length.
In steps S2011 to S2013, the first word vector, the answer information vector, and the text position vector are first spliced, and then the spliced vector is encoded, which ensures that the first semantic vector sequence includes the information of all three aspects and improves the encoding efficiency.
In some embodiments, as shown in fig. 3B, after step S204, the following process may also be performed:
step S205, acquiring an answer text corresponding to the question text.
Here, the answer text may be input by the user through an input device or may be uttered by the user through voice. Therefore, when step S205 is implemented, the answer text input by the user through the input device may be obtained, or the answer voice information of the user may be obtained, and then the answer voice information is subjected to voice recognition to obtain the answer text.
Step S206, determining whether the answer text and a preset answer text satisfy a matching relationship.
If the answer text and the preset answer text satisfy the matching relationship, it indicates that the answer text matches the preset answer text, that is, the answer is correct; in this case, questions can continue to be generated from the subsequent part of the article, and the process proceeds to step S207. If the answer text and the preset answer text do not satisfy the matching relationship, it indicates that the answer text does not match the preset answer text, that is, the answer is wrong, and the process proceeds to step S208.
And step S207, updating the value of the current question-answer turn number to the value of the next question-answer turn number.
For example, assuming that the current question-and-answer round number is 2, if the answer text and the preset answer text satisfy the matching relationship, the current question-and-answer round number is updated to 3, and then step S201 is performed, until the current question-and-answer round number reaches the preset final round number.
And step S208, keeping the value of the current question-answering round number.
For example, assuming that the current question-and-answer round number is 2, if the answer text and the preset answer text do not satisfy the matching relationship, the current question-and-answer round number remains 2, and then step S201 is performed.
In some embodiments, when the answer text and the preset answer text do not satisfy the matching relationship, 1 is added to the number of wrong answers for the question, and when the number of wrong answers exceeds a specified threshold, the value of the current question-and-answer round number is updated to the value of the next round number. For example, if the threshold is 3, then when the user has answered a question incorrectly more than 3 times, questions continue to be generated from the subsequent part of the article.
Note that the initial value of the number of times of error response for each question is 0.
Through steps S205 to S208, whether to perform subsequent question generation can be determined according to whether the user's answer text is correct, so that poorly grasped knowledge points can be further consolidated during the question-and-answer process while the continuity of the conversation is ensured.
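The round-keeping logic of steps S205 to S208, including the wrong-answer counter described above, can be sketched procedurally as follows; the helper name and the exact-match comparison are assumptions made for illustration.

```python
def next_round(current_round: int, answer: str, preset_answer: str,
               wrong_counts: dict, max_wrong: int = 3) -> int:
    if answer.strip().lower() == preset_answer.strip().lower():
        return current_round + 1                  # step S207: advance to the next round
    wrong_counts[current_round] = wrong_counts.get(current_round, 0) + 1
    if wrong_counts[current_round] > max_wrong:
        return current_round + 1                  # move on after too many wrong answers
    return current_round                          # step S208: keep the current round

counts = {}
print(next_round(2, "Bill Clinton", "bill clinton", counts))  # 3 (correct answer)
print(next_round(2, "Obama", "bill clinton", counts))         # 2 (wrong answer, round kept)
```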
In some embodiments, step S203 described above may be implemented in such a way:
The first semantic vector sequence and the second semantic vector sequence are decoded word by word through the decoding model. During decoding, the chapter position attention distribution of the decoding model is concentrated on the text corresponding to the current round number, and when a pronoun is decoded and output, the attention distribution of the decoding model is concentrated on the entity nouns in the second semantic vector sequence, so that a coreference-aligned question text corresponding to the current round number is generated.
In implementation, step S203 performs word-by-word decoding of the first semantic vector sequence and the second semantic vector sequence using the trained decoding model. When the decoding model is trained, its parameters are adjusted according to a loss function that achieves coreference alignment and a loss function that concentrates the attention distribution on the text position corresponding to the current question-and-answer round number, so that the question generated by decoding the first semantic vector sequence and the second semantic vector sequence word by word not only achieves coreference alignment, but is also related to the text content corresponding to the current round number, which improves the conversationality of the question answering.
In some embodiments, the method further comprises:
and step 31, acquiring a third semantic vector sequence corresponding to the initial decoding model and the training text.
Here, the training text may be a question-answering dialog text, and the third semantic vector sequence corresponding to the training text further includes part-of-speech information representing each word to determine whether pronouns need to be generated in the decoding process.
And 32, decoding the third semantic vector sequence through the decoding model, and adjusting parameters of the decoding model according to a first optimization objective function when pronouns are determined to be required to be generated, so that the attention distribution of the decoding model is concentrated on entity nouns.
Here, when the decoding model decodes the third semantic vector sequence and determines that pronouns need to be generated, the parameters of the decoding model may be adjusted through the first optimization objective function, so that the attention distribution of the decoding model is concentrated on the entity nouns corresponding to the pronouns, thereby ensuring that the pronouns and the entity nouns in the generated question correspond to each other, and further improving the dialogue of question answering.
In some embodiments, the method further comprises:
step 41, performing joint training on the decoding model according to at least a first optimization objective function and a second optimization objective function to adjust parameters of the decoding model;
the first optimization objective function is used for focusing attention distribution of the decoding model on entity nouns when pronouns need to be generated, and the second optimization objective function is used for optimizing chapter position attention distribution corresponding to each round of question-answer dialog, so that chapter position attention distribution of the decoding model is focused on texts corresponding to current question-answer rounds.
In an actual implementation process, in addition to jointly training the decoding model according to the first optimization objective function and the second optimization objective function, the decoding model may also be jointly trained according to the formula (1-1) to adjust parameters of the decoding model:
L = L_nll + L_coref + L_flow    (1-1);
where L_nll is the optimization objective of the classical encoder-decoder model, L_coref is the first optimization objective function, and L_flow is the second optimization objective function.
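Formula (1-1) amounts to summing the three loss terms before backpropagation, as the following minimal snippet illustrates (the tensor values are dummies standing in for the computed losses).

```python
import torch

l_nll = torch.tensor(2.3, requires_grad=True)    # dummy negative log-likelihood loss
l_coref = torch.tensor(0.4, requires_grad=True)  # dummy coreference alignment loss
l_flow = torch.tensor(0.7, requires_grad=True)   # dummy conversation flow loss

loss = l_nll + l_coref + l_flow                  # L = L_nll + L_coref + L_flow
loss.backward()
```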
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
Fig. 5 is a schematic diagram of a system architecture of a question generation method according to an embodiment of the present application. As shown in fig. 5, the system architecture at least includes a chapter encoder (passage encoder) 501, a paragraph encoder (conversation encoder) 502, and a decoder 503. The chapter encoder 501 encodes the short English text, and the paragraph encoder 502 encodes the historical dialogue content, thereby converting discrete text into continuous vectorized semantic representations.
The chapter encoder 501 encodes the short English text; in fact, it encodes a series of vectors related to the short English text. Before encoding, the short English text needs to be converted from a discrete source word representation into a continuous word space embedding representation (word embedding), and answer position information needs to be converted into a corresponding answer position embedding representation (answer position embedding). In addition, in the embodiment of the present application, a dialogue flow representation (turn number embedding & chunk embedding) is also separately designed. The word space embedding representation, the answer position representation, and the dialogue flow representation are then encoded by the chapter encoder 501, resulting in a vectorized semantic representation. The vectorized semantic representation encoded by the chapter encoder 501 corresponds to the first semantic vector sequence in the other embodiments.
The paragraph encoder 502 encodes the contents of the historical dialog, i.e., encodes the previous questions and answers. In implementation, the paragraph encoder 502 actually encodes the word space embedded representation corresponding to the historical dialog content, thereby obtaining a vectorized semantic representation. The vectorized semantic representation encoded by the paragraph encoder 502 corresponds to the second semantic vector sequence in other embodiments.
After obtaining the vectorized semantic representations, the decoder 503 converts them into a question, and adopts the techniques of coreference alignment and conversation flow modeling during word-by-word decoding, so that the model can generate conversational questions.
The coreference alignment technique and the conversation flow module are described below.
In this embodiment, the coreference alignment module, which adopts the coreference alignment technique, can convert an entity noun in the historical dialogue content into a pronoun in the question to be generated, for example:
english short text fragment: an incorporated, reconstructed, predicted, bit, product to product limits in the 22-dimensional of the constraint.
Conversation history: what is the white polar waters Clinton a member of?
Result of the related-art solution: What was Bill Clinton ineligible to serve?
Result of the coreference alignment module: What was he ineligible to serve?
As can be seen from the comparison between the question generated by the related-art scheme and the question generated with coreference alignment, the coreference alignment module can generate questions rich in conversationality, so that the question is coherent with the historical dialogue content. The related-art solution, by contrast, can only simply copy the entity noun from the dialogue history and cannot achieve coreference alignment.
When the coreference alignment module is implemented, whenever the model needs to generate a pronoun (such as "he" in the example), the attention distribution of the model is encouraged to concentrate on the entity noun in the conversation history (such as "Clinton" in the example), so that the generated pronoun is aligned to the entity noun in the historical conversation content. This optimization goal is represented by the loss function L_coref described in formula (2-1):
where β_{i-k,j} is the sentence attention distribution, and the corresponding term gives the attention probability on the entity noun being referred to; p_coref is the probability of generating the pronoun in the output probability distribution; s_c is the confidence score of the coreference pair; and λ_1 and λ_2 are hyper-parameters for adjusting the optimization objective, with an empirical value of 1.
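The expression of formula (2-1) itself is not reproduced above. The sketch below shows one plausible form consistent with the description: it penalizes low attention mass on the coreferent entity noun and a low probability of emitting the pronoun, weighted by the coreference confidence s_c, with λ_1 = λ_2 = 1 as stated. It is an illustrative assumption, not the patent's exact loss.

```python
import torch

def coref_alignment_loss(attn, coref_positions, p_pronoun, s_c, lam1=1.0, lam2=1.0):
    # attn:            (hist_len,) attention over the conversation history at this step
    # coref_positions: indices of the entity-noun tokens being referred to
    # p_pronoun:       probability of generating the pronoun in the output distribution
    attn_on_entity = attn[coref_positions].sum()
    return -s_c * (lam1 * torch.log(attn_on_entity + 1e-12)
                   + lam2 * torch.log(p_pronoun + 1e-12))

attn = torch.softmax(torch.randn(8), dim=0)
print(coref_alignment_loss(attn, [2, 3], torch.tensor(0.6), 0.9))
```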
The conversation flow module realizes focus transfer across successive questions: the questions in the first rounds of the dialogue ask about the content of the first sentences of the short text, and as the dialogue deepens the questions gradually attend to the later content of the short text until the dialogue ends.
The conversation flow module first designs an embedded representation of the dialogue flow (flow embedding), i.e., the dialogue flow representation input into the chapter encoder 501, which includes two parts: a current round number representation (turn number embedding) and a chapter relative position representation (chunk embedding), where:
The turn number embedding maps the number of turns of the current conversation into a continuous vectorized representation and adds it to the chapter encoder input, so that the model can learn the current turn number of the conversation and thereby attend to the appropriate position in the text.
The chunk embedding expresses the relative position within the chapter (passage): in this embodiment, the chapter is assumed to be evenly divided into 10 parts according to sentences, and the model learns the relative relationship between the round number and the chapter position. Intuitively, in the initial dialogue rounds the model should attend to the sentences at the front of the chapter and generate corresponding questions for those front sentences, and as the dialogue deepens the model should attend to the sentences at the back of the chapter and generate questions for the later sentences. In this way, the model generates questions following the order of the chapter's narration, which is more consistent with the rhythm of questioning in a dialogue.
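An illustrative sketch of the dialogue flow representation follows: the current turn number and each token's chunk index (with the passage split into 10 chunks) are mapped to learned embeddings that are attached to the chapter encoder input; the embedding sizes are assumptions.

```python
import torch
import torch.nn as nn

max_turns, num_chunks, dim = 12, 10, 16
turn_emb = nn.Embedding(max_turns, dim)    # turn number embedding
chunk_emb = nn.Embedding(num_chunks, dim)  # chunk (chapter relative position) embedding

seq_len, current_turn = 30, 3
chunk_ids = torch.arange(seq_len) * num_chunks // seq_len  # token index -> chunk index
flow = torch.cat([turn_emb(torch.full((seq_len,), current_turn, dtype=torch.long)),
                  chunk_emb(chunk_ids)], dim=-1)
print(flow.shape)  # torch.Size([30, 32]); attached to every passage token fed to the encoder
```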
Secondly, the conversation flow module optimizes the chapter attention (passage attention) distribution of each turn of the dialogue, and constrains the model's attention distribution α_j through labels marked on the sentences of the chapter, so that the model attends only to the relevant content of the short text. This optimization objective is represented by the loss function L_flow described in formula (2-2):
where λ_3 and λ_4 are hyper-parameters for adjusting the optimization objective, with an empirical value of 1; α_j is the chapter attention probability; CES refers to sentences that the current question to be asked is about, and HES refers to sentences that previously asked questions were about.
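As with formula (2-1), the expression of formula (2-2) is not reproduced above. The sketch below shows one plausible form consistent with the description, rewarding passage attention mass on the sentences of the current round (CES) and penalizing mass on already-asked sentences (HES), with λ_3 = λ_4 = 1 as stated; it is an assumption, not the patent's exact loss.

```python
import torch

def flow_loss(alpha, ces_mask, hes_mask, lam3=1.0, lam4=1.0):
    # alpha:    (passage_len,) passage attention distribution at this decoding step
    # ces_mask: 1 for tokens in sentences the current question should ask about
    # hes_mask: 1 for tokens in sentences that earlier questions already asked about
    ces = (alpha * ces_mask).sum()
    hes = (alpha * hes_mask).sum()
    return -lam3 * torch.log(ces + 1e-12) + lam4 * hes

alpha = torch.softmax(torch.randn(20), dim=0)
ces_mask = torch.zeros(20); ces_mask[5:10] = 1
hes_mask = torch.zeros(20); hes_mask[:5] = 1
print(flow_loss(alpha, ces_mask, hes_mask))
```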
In the embodiment of the present application, the reference alignment module and the dialogue flow modeling module are jointly trained, and the optimization objective of the joint training can be represented by formula (2-3):
L = L_nll + L_coref + L_flow    (2-3);
wherein L_nll is the negative log-likelihood optimization objective of the classical encoder-decoder model.
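By way of example, a minimal sketch of one joint-training step is shown below, assuming the model exposes the three losses of formula (2-3); the interface model.forward_losses is hypothetical:

```python
def train_step(model, batch, optimizer):
    """One joint-training step combining the objectives of formula (2-3)."""
    # Hypothetical interface: the model returns the generation loss together
    # with the two auxiliary losses computed from its attention distributions.
    l_nll, l_coref, l_flow = model.forward_losses(batch)
    loss = l_nll + l_coref + l_flow   # L = L_nll + L_coref + L_flow
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```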
The method provided by the embodiment of the present application can be applied to an intelligent tutoring system. In this application scenario, given a piece of English short text, the system asks students a series of questions in the form of a conversation to test the students' understanding of the short text. Compared with previous models, which mainly focus on posing questions for individual sentences in isolation, the system proposed in this application is more interactive through its conversational mode.
The results of the evaluation and comparison with the prior art are shown in table 1:
TABLE 1
                     BLEU1    BLEU2    BLEU3    ROUGE_L
Existing solutions   28.84    13.74     8.16    39.18
This scheme          37.38    22.81    16.25    46.90
As can be seen from Table 1, under the BLEU1, BLEU2, BLEU3, and ROUGE_L evaluation metrics, the method provided in the embodiment of the present application is superior to the existing scheme.
The method and device of the embodiments of the present application can be applied to the scenario in which an intelligent tutoring system asks continuous questions in a conversation to test a student's understanding of a piece of English short text. The conversation history is modeled by the multi-input encoder and the reference alignment model, so that the generated question is coherent with the conversation history and an entity name (entity) in the conversation history can be converted into the corresponding pronoun in the generated question, which improves the conversational quality of the questions. In addition, through the dialogue flow module provided by the embodiment of the present application, the topic can be gradually transferred within one multi-turn conversation, so that questions in multiple rounds of question and answer are better generated.
It should be noted that the question generation method provided by the embodiment of the present application is not limited to a specific network model; it may be used in an encoder-decoder model or in a transformer model. Likewise, the question generation method provided by the embodiment of the present application is not limited to intelligent tutoring systems: the reference alignment method can be used in any model that needs to convert entities in the input into pronouns, and the dialogue flow module can be used in general conversation scenarios such as chat robots and customer service robots, where the chat conversation history serves as the basis for generating the chapter encoder input.
An exemplary structure of software modules is described below, and in some embodiments, as shown in FIG. 2, the software modules 80 in the apparatus 440 may include:
the first encoding module 81 is configured to encode a first word vector corresponding to the reference text, an answer information vector, and a text position vector corresponding to the current number of question and answer rounds through a first encoding model to obtain a first semantic vector sequence;
the second coding module 82 is configured to perform coding processing on a second word vector corresponding to the historical question-answer text through a second coding model to obtain a second semantic vector sequence;
the decoding module 83 is configured to perform decoding processing on the first semantic vector sequence and the second semantic vector sequence through a decoding model to obtain a question text corresponding to the current number of question-answering rounds;
and an output module 84 for outputting the question text.
In some embodiments, the apparatus further comprises:
the device comprises a first obtaining module, a second obtaining module and a third obtaining module, wherein the first obtaining module is used for obtaining original word vectors and attribute information corresponding to all participles in a reference text, and the attribute information represents whether the participles are answers or not;
the first mapping module is used for mapping the original word vector corresponding to the participle into a first word vector when the attribute information of the participle indicates that the participle is not an answer;
and the second mapping module is used for mapping the original word vector corresponding to the participle into an answer information vector when the attribute information of the participle indicates that the participle is an answer.
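By way of illustration only, a minimal sketch of such an answer-aware mapping is given below; the use of two learned linear maps selected by the attribute flag is an assumption of the sketch, not a statement of the concrete mapping used in the embodiment:

```python
import torch
import torch.nn as nn

class AnswerAwareMapping(nn.Module):
    """Sketch: map the original word vector of each word segment either to a
    first word vector (when the segment is not part of the answer) or to an
    answer information vector (when it is), as selected by the attribute flag."""
    def __init__(self, in_dim=300, out_dim=300):
        super().__init__()
        self.to_word_vec = nn.Linear(in_dim, out_dim)    # used for non-answer segments
        self.to_answer_vec = nn.Linear(in_dim, out_dim)  # used for answer segments

    def forward(self, original_vec, is_answer):
        # original_vec: (batch, seq_len, in_dim); is_answer: (batch, seq_len) holding 0/1 flags
        mask = is_answer.unsqueeze(-1).float()           # 1.0 where the segment is an answer
        return mask * self.to_answer_vec(original_vec) + (1.0 - mask) * self.to_word_vec(original_vec)
```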
In other embodiments, the apparatus further comprises:
the splicing module is used for splicing the first word vector, the answer information vector and the text position vector to obtain a spliced vector;
the first encoding module 81 is further configured to convert the spliced vector into an intermediate vector with a fixed length through the first encoding model, and encode sequence information of the spliced vector in the intermediate vector to obtain a first semantic vector sequence.
In other embodiments, the apparatus further comprises:
the second acquisition module is used for acquiring an answer text corresponding to the question text;
the round number updating module is used for updating the value of the current question and answer round number to the value of the next question and answer round number when the answer text and the preset answer text meet the matching relationship;
and the round number keeping module is used for keeping the value of the current question and answer round number when the answer text and the preset answer text do not meet the matching relationship.
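By way of illustration only, a minimal sketch of this round-number update logic is given below; the concrete matching rule (case-insensitive containment of the preset answer) is an assumption of the sketch:

```python
def update_round(current_round, answer_text, preset_answer_text):
    """Advance to the next question-answer round only when the received answer
    matches the preset answer; otherwise keep the current round number."""
    # Assumed matching rule: case-insensitive containment of the preset answer.
    if preset_answer_text.strip().lower() in answer_text.strip().lower():
        return current_round + 1
    return current_round
```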
In other embodiments, the decoding module 83 is further configured to perform word-by-word decoding on the first semantic vector sequence and the second semantic vector sequence through a decoding model;
and, when a pronoun needs to be generated, concentrating the attention distribution of the decoding model on the entity nouns in the second semantic vector sequence, so that the question text with aligned references corresponding to the current number of rounds is generated.
In other embodiments, the apparatus further comprises:
the third acquisition module is used for acquiring an initial decoding model and a third semantic vector sequence corresponding to the training text;
the decoding module 83 is further configured to, when the third semantic vector sequence is decoded by the decoding model and it is determined that a pronoun needs to be generated, adjust parameters of the decoding model according to a first optimization objective function, so that the attention distribution of the decoding model is focused on the entity noun.
In other embodiments, the apparatus further comprises:
a training module, configured to perform joint training on the decoding model according to at least a first optimization objective function and a second optimization objective function, so as to adjust parameters of the decoding model;
the first optimization objective function is used for focusing attention distribution of the decoding model on entity nouns when pronouns need to be generated, and the second optimization objective function is used for optimizing chapter position attention distribution corresponding to each round of question-answer dialog, so that chapter position attention distribution of the decoding model is focused on texts corresponding to current question-answer rounds.
As an example of the method provided by the embodiment of the present application being implemented by hardware, the method provided by the embodiment of the present application may be directly executed by the processor 410 in the form of a hardware decoding processor, for example, implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
Embodiments of the present application provide a storage medium having stored therein executable instructions, which when executed by a processor, will cause the processor to perform the methods provided by embodiments of the present application, for example, the methods as illustrated in fig. 3 and 4.
In some embodiments, the storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example, in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A method of question generation, the method comprising:
coding, through a first coding model, a first word vector corresponding to a reference text, an answer information vector, and a text position vector corresponding to the current number of question and answer rounds, to obtain a first semantic vector sequence;
coding a second word vector corresponding to the historical question-answer text through a second coding model to obtain a second semantic vector sequence;
decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model to obtain a question text corresponding to the current number of the question-answering rounds;
and outputting the question text.
2. The method of claim 1, further comprising:
acquiring an original word vector and attribute information corresponding to each word segment in the reference text, wherein the attribute information represents whether the word segment is an answer;
when the attribute information of the word segment indicates that the word segment is not an answer, mapping the original word vector corresponding to the word segment into a first word vector;
and when the attribute information of the word segment indicates that the word segment is an answer, mapping the original word vector corresponding to the word segment into an answer information vector.
3. The method according to claim 1, wherein the coding, through the first coding model, a first word vector corresponding to the reference text, an answer information vector, and a text position vector corresponding to the current number of question and answer rounds to obtain a first semantic vector sequence comprises:
splicing the first word vector, the answer information vector and the text position vector to obtain a spliced vector;
converting the spliced vector into an intermediate vector having a fixed length through the first coding model;
and coding the sequence information of the spliced vector in the intermediate vector to obtain a first semantic vector sequence.
4. The method of claim 1, wherein after outputting the question text, the method further comprises:
acquiring an answer text corresponding to the question text;
when the answer text and the preset answer text meet the matching relationship, updating the value of the current question-answer turn number to the value of the next question-answer turn number;
and when the answer text and the preset answer text do not meet the matching relationship, keeping the value of the current number of question and answer rounds.
5. The method according to claim 1, wherein the decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model to obtain the question text corresponding to the current round number comprises:
performing word-by-word decoding on the first semantic vector sequence and the second semantic vector sequence through a decoding model;
in the process of word-by-word decoding, concentrating the chapter position attention distribution of the decoding model on the text corresponding to the current round number, and
when a pronoun is decoded and output, concentrating the attention distribution of the decoding model on the entity nouns in the second semantic vector sequence, so that the question text with aligned references corresponding to the current round number is generated.
6. The method according to any one of claims 1 to 5, further comprising:
acquiring an initial decoding model and a third semantic vector sequence corresponding to a training text;
and when the third semantic vector sequence is decoded by the decoding model and it is determined that a pronoun needs to be generated, adjusting parameters of the decoding model according to a first optimization objective function, so that the attention distribution of the decoding model is concentrated on entity nouns.
7. The method according to any one of claims 1 to 5, further comprising:
jointly training the decoding model according to at least a first optimization objective function and a second optimization objective function to adjust parameters of the decoding model;
the first optimization objective function is used for focusing attention distribution of the decoding model on entity nouns when pronouns need to be generated, and the second optimization objective function is used for optimizing chapter position attention distribution corresponding to each round of question-answer dialog, so that chapter position attention distribution of the decoding model is focused on texts corresponding to current question-answer rounds.
8. A question generation apparatus, comprising:
the first coding module is used for coding, through a first coding model, a first word vector corresponding to the reference text, an answer information vector, and a text position vector corresponding to the current number of question and answer rounds, to obtain a first semantic vector sequence;
the second coding module is used for coding a second word vector corresponding to the historical question-answer text through a second coding model to obtain a second semantic vector sequence;
the decoding module is used for decoding the first semantic vector sequence and the second semantic vector sequence through a decoding model to obtain a question text corresponding to the current number of the question and answer rounds;
and the output module is used for outputting the question text.
9. The apparatus of claim 8, further comprising:
the first obtaining module is used for obtaining an original word vector and attribute information corresponding to each word segment in a reference text, wherein the attribute information represents whether the word segment is an answer;
the first mapping module is used for mapping the original word vector corresponding to the word segment into a first word vector when the attribute information of the word segment indicates that the word segment is not an answer;
and the second mapping module is used for mapping the original word vector corresponding to the word segment into an answer information vector when the attribute information of the word segment indicates that the word segment is an answer.
10. The apparatus of claim 8, further comprising:
the splicing module is used for splicing the first word vector, the answer information vector and the text position vector to obtain a spliced vector;
the first coding module is further configured to convert the spliced vector into an intermediate vector with a fixed length through the first coding model, and encode sequence information of the spliced vector in the intermediate vector to obtain a first semantic vector sequence.
11. The apparatus of claim 8, further comprising:
the second acquisition module is used for acquiring an answer text corresponding to the question text;
the round number updating module is used for updating the value of the current question and answer round number to the value of the next question and answer round number when the answer text and the preset answer text meet the matching relationship;
and the round number keeping module is used for keeping the value of the current question and answer round number when the answer text and the preset answer text do not meet the matching relationship.
12. The apparatus of claim 8,
the decoding module is further configured to perform word-by-word decoding on the first semantic vector sequence and the second semantic vector sequence through a decoding model;
in the process of word-by-word decoding, the chapter position attention distribution of the decoding model is concentrated on the text corresponding to the current round number, and
when a pronoun is decoded and output, the attention distribution of the decoding model is concentrated on the entity nouns in the second semantic vector sequence, so that the question text with aligned references corresponding to the current round number is generated.
13. The apparatus of any one of claims 8 to 12, further comprising:
the third acquisition module is used for acquiring an initial decoding model and a third semantic vector sequence corresponding to the training text;
and the decoding module is further configured to, when the third semantic vector sequence is decoded by the decoding model and it is determined that a pronoun needs to be generated, adjust parameters of the decoding model according to a first optimization objective function, so that the attention distribution of the decoding model is focused on entity nouns.
14. A question generation apparatus, comprising:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 7 when executing executable instructions stored in the memory.
15. A storage medium having stored thereon executable instructions for causing a processor to perform the method of any one of claims 1 to 7 when executed.
CN201910447602.4A 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium Active CN110162613B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010169926.9A CN111414464B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium
CN201910447602.4A CN110162613B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910447602.4A CN110162613B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202010169926.9A Division CN111414464B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110162613A true CN110162613A (en) 2019-08-23
CN110162613B CN110162613B (en) 2023-12-01

Family

ID=67629063

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910447602.4A Active CN110162613B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium
CN202010169926.9A Active CN111414464B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010169926.9A Active CN111414464B (en) 2019-05-27 2019-05-27 Question generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (2) CN110162613B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674279A (en) * 2019-10-15 2020-01-10 腾讯科技(深圳)有限公司 Question-answer processing method, device, equipment and storage medium based on artificial intelligence
CN110750677A (en) * 2019-10-12 2020-02-04 腾讯科技(深圳)有限公司 Audio and video recognition method and system based on artificial intelligence, storage medium and server
CN110795549A (en) * 2019-10-31 2020-02-14 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN110909142A (en) * 2019-11-20 2020-03-24 腾讯科技(深圳)有限公司 Question and sentence processing method and device of question-answer model, electronic equipment and storage medium
CN111061851A (en) * 2019-12-12 2020-04-24 中国科学院自动化研究所 Given fact-based question generation method and system
CN111291169A (en) * 2020-01-16 2020-06-16 中国平安人寿保险股份有限公司 Template editing reply method, device, equipment and storage medium
CN111382563A (en) * 2020-03-20 2020-07-07 腾讯科技(深圳)有限公司 Text relevance determining method and device
CN111414464A (en) * 2019-05-27 2020-07-14 腾讯科技(深圳)有限公司 Question generation method, device, equipment and storage medium
CN111414737A (en) * 2020-03-23 2020-07-14 腾讯科技(深圳)有限公司 Story generation model training method, device, equipment and storage medium
CN111428467A (en) * 2020-02-19 2020-07-17 平安科技(深圳)有限公司 Method, device, equipment and storage medium for generating reading comprehension question topic
CN111444399A (en) * 2020-03-30 2020-07-24 腾讯科技(深圳)有限公司 Reply content generation method, device, equipment and readable storage medium
CN111553159A (en) * 2020-04-24 2020-08-18 中国科学院空天信息创新研究院 Question generation method and system
CN112560398A (en) * 2019-09-26 2021-03-26 百度在线网络技术(北京)有限公司 Text generation method and device
CN112597777A (en) * 2021-01-05 2021-04-02 网易(杭州)网络有限公司 Multi-turn dialogue rewriting method and device
CN112905754A (en) * 2019-12-16 2021-06-04 腾讯科技(深圳)有限公司 Visual conversation method and device based on artificial intelligence and electronic equipment
CN113204964A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Data processing method, system, electronic equipment and storage medium
CN113239160A (en) * 2021-04-29 2021-08-10 桂林电子科技大学 Question generation method and device and storage medium
CN113255928A (en) * 2021-04-29 2021-08-13 支付宝(杭州)信息技术有限公司 Model training method and device and server
CN113268561A (en) * 2021-04-25 2021-08-17 中国科学技术大学 Problem generation method based on multi-task joint training
WO2021217935A1 (en) * 2020-04-29 2021-11-04 深圳壹账通智能科技有限公司 Method for training question generation model, question generation method, and related device
CN113822016A (en) * 2020-06-19 2021-12-21 阿里巴巴集团控股有限公司 Text data processing method and device, electronic equipment and readable storage medium
CN115081431A (en) * 2021-03-10 2022-09-20 华为技术有限公司 Grammar error correction method and device
CN116863935A (en) * 2023-09-04 2023-10-10 深圳有咖互动科技有限公司 Speech recognition method, device, electronic equipment and computer readable medium
CN118052222A (en) * 2024-04-15 2024-05-17 北京晴数智慧科技有限公司 Method and device for generating multi-round dialogue data

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021051091A1 (en) 2019-09-13 2021-03-18 Rad Al, Inc. Method and system for automatically generating a section in a radiology report
CN111967224A (en) * 2020-08-18 2020-11-20 深圳市欢太科技有限公司 Method and device for processing dialog text, electronic equipment and storage medium
CN112149426B (en) * 2020-09-27 2024-02-09 腾讯科技(深圳)有限公司 Reading task processing method and related equipment
CN112199481B (en) * 2020-09-30 2023-06-16 中国人民大学 Single-user personalized dialogue method and system adopting PCC dialogue model
CN112364665A (en) * 2020-10-11 2021-02-12 广州九四智能科技有限公司 Semantic extraction method and device, computer equipment and storage medium
US11615890B2 (en) 2021-03-09 2023-03-28 RAD AI, Inc. Method and system for the computer-assisted implementation of radiology recommendations
CN113051371B (en) * 2021-04-12 2023-02-07 平安国际智慧城市科技股份有限公司 Chinese machine reading understanding method and device, electronic equipment and storage medium
CN113282722B (en) * 2021-05-07 2024-03-29 中国科学院深圳先进技术研究院 Machine reading and understanding method, electronic device and storage medium
CN113239173B (en) * 2021-06-09 2023-12-12 深圳集智数字科技有限公司 Question-answer data processing method and device, storage medium and electronic equipment
CN115600587B (en) * 2022-12-16 2023-04-07 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Mathematics application question generation system and method, intelligent terminal and readable storage medium
CN117473076B (en) * 2023-12-27 2024-03-08 广东信聚丰科技股份有限公司 Knowledge point generation method and system based on big data mining

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6952665B1 (en) * 1999-09-30 2005-10-04 Sony Corporation Translating apparatus and method, and recording medium used therewith
CN106776578A (en) * 2017-01-03 2017-05-31 竹间智能科技(上海)有限公司 Method and device for improving the dialogue performance of a conversational system
CN108846130A (en) * 2018-06-29 2018-11-20 北京百度网讯科技有限公司 A kind of question text generation method, device, equipment and medium
CN109063174A (en) * 2018-08-21 2018-12-21 腾讯科技(深圳)有限公司 Method and device for generating query answers, computer storage medium, and electronic device
KR20190023316A (en) * 2017-08-28 2019-03-08 주식회사 솔트룩스 Question-answering system based dialogue model
CN109766424A (en) * 2018-12-29 2019-05-17 安徽省泰岳祥升软件有限公司 Filtering method and device for reading understanding model training data

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2446427A (en) * 2007-02-07 2008-08-13 Sharp Kk Computer-implemented learning method and apparatus
US20140108321A1 (en) * 2012-10-12 2014-04-17 International Business Machines Corporation Text-based inference chaining
CN108509411B (en) * 2017-10-10 2021-05-11 腾讯科技(深圳)有限公司 Semantic analysis method and device
CN109271496B (en) * 2018-08-30 2021-12-24 广东工业大学 Natural question-answering method based on text, knowledge base and sequence-to-sequence
CN109670029B (en) * 2018-12-28 2021-09-07 百度在线网络技术(北京)有限公司 Method, apparatus, computer device and storage medium for determining answers to questions
CN109766423A (en) * 2018-12-29 2019-05-17 上海智臻智能网络科技股份有限公司 Neural-network-based answering method and device, storage medium, and terminal
CN110162613B (en) * 2019-05-27 2023-12-01 腾讯科技(深圳)有限公司 Question generation method, device, equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6952665B1 (en) * 1999-09-30 2005-10-04 Sony Corporation Translating apparatus and method, and recording medium used therewith
CN106776578A (en) * 2017-01-03 2017-05-31 竹间智能科技(上海)有限公司 Method and device for improving the dialogue performance of a conversational system
KR20190023316A (en) * 2017-08-28 2019-03-08 주식회사 솔트룩스 Question-answering system based dialogue model
CN108846130A (en) * 2018-06-29 2018-11-20 北京百度网讯科技有限公司 A kind of question text generation method, device, equipment and medium
CN109063174A (en) * 2018-08-21 2018-12-21 腾讯科技(深圳)有限公司 Method and device for generating query answers, computer storage medium, and electronic device
CN109766424A (en) * 2018-12-29 2019-05-17 安徽省泰岳祥升软件有限公司 Filtering method and device for reading understanding model training data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
尹伊淳; 张铭: "A neural network machine reading comprehension model based on data reconstruction and rich features", Journal of Chinese Information Processing (中文信息学报), no. 11, pages 112-116 *

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111414464A (en) * 2019-05-27 2020-07-14 腾讯科技(深圳)有限公司 Question generation method, device, equipment and storage medium
CN111414464B (en) * 2019-05-27 2023-04-07 腾讯科技(深圳)有限公司 Question generation method, device, equipment and storage medium
CN112560398A (en) * 2019-09-26 2021-03-26 百度在线网络技术(北京)有限公司 Text generation method and device
CN110750677A (en) * 2019-10-12 2020-02-04 腾讯科技(深圳)有限公司 Audio and video recognition method and system based on artificial intelligence, storage medium and server
CN110750677B (en) * 2019-10-12 2023-11-14 腾讯科技(深圳)有限公司 Audio and video identification method and system based on artificial intelligence, storage medium and server
CN110674279A (en) * 2019-10-15 2020-01-10 腾讯科技(深圳)有限公司 Question-answer processing method, device, equipment and storage medium based on artificial intelligence
CN110795549B (en) * 2019-10-31 2023-03-17 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN110795549A (en) * 2019-10-31 2020-02-14 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN110909142A (en) * 2019-11-20 2020-03-24 腾讯科技(深圳)有限公司 Question and sentence processing method and device of question-answer model, electronic equipment and storage medium
CN110909142B (en) * 2019-11-20 2023-03-31 腾讯科技(深圳)有限公司 Question and sentence processing method and device of question-answer model, electronic equipment and storage medium
CN111061851A (en) * 2019-12-12 2020-04-24 中国科学院自动化研究所 Given fact-based question generation method and system
CN111061851B (en) * 2019-12-12 2023-08-08 中国科学院自动化研究所 Question generation method and system based on given facts
CN112905754A (en) * 2019-12-16 2021-06-04 腾讯科技(深圳)有限公司 Visual conversation method and device based on artificial intelligence and electronic equipment
CN111291169B (en) * 2020-01-16 2024-05-28 中国平安人寿保险股份有限公司 Method, device, equipment and storage medium for template editing reply
CN111291169A (en) * 2020-01-16 2020-06-16 中国平安人寿保险股份有限公司 Template editing reply method, device, equipment and storage medium
CN111428467B (en) * 2020-02-19 2024-05-07 平安科技(深圳)有限公司 Method, device, equipment and storage medium for generating problem questions for reading and understanding
CN111428467A (en) * 2020-02-19 2020-07-17 平安科技(深圳)有限公司 Method, device, equipment and storage medium for generating reading comprehension question topic
CN111382563A (en) * 2020-03-20 2020-07-07 腾讯科技(深圳)有限公司 Text relevance determining method and device
CN111382563B (en) * 2020-03-20 2023-09-08 腾讯科技(深圳)有限公司 Text relevance determining method and device
CN111414737B (en) * 2020-03-23 2022-03-08 腾讯科技(深圳)有限公司 Story generation model training method, device, equipment and storage medium
CN111414737A (en) * 2020-03-23 2020-07-14 腾讯科技(深圳)有限公司 Story generation model training method, device, equipment and storage medium
CN111444399A (en) * 2020-03-30 2020-07-24 腾讯科技(深圳)有限公司 Reply content generation method, device, equipment and readable storage medium
CN111553159B (en) * 2020-04-24 2021-08-06 中国科学院空天信息创新研究院 Question generation method and system
CN111553159A (en) * 2020-04-24 2020-08-18 中国科学院空天信息创新研究院 Question generation method and system
WO2021217935A1 (en) * 2020-04-29 2021-11-04 深圳壹账通智能科技有限公司 Method for training question generation model, question generation method, and related device
CN113822016B (en) * 2020-06-19 2024-03-22 阿里巴巴集团控股有限公司 Text data processing method and device, electronic equipment and readable storage medium
CN113822016A (en) * 2020-06-19 2021-12-21 阿里巴巴集团控股有限公司 Text data processing method and device, electronic equipment and readable storage medium
CN112597777A (en) * 2021-01-05 2021-04-02 网易(杭州)网络有限公司 Multi-turn dialogue rewriting method and device
CN115081431A (en) * 2021-03-10 2022-09-20 华为技术有限公司 Grammar error correction method and device
CN113268561B (en) * 2021-04-25 2021-12-14 中国科学技术大学 Problem generation method based on multi-task joint training
CN113268561A (en) * 2021-04-25 2021-08-17 中国科学技术大学 Problem generation method based on multi-task joint training
CN113239160A (en) * 2021-04-29 2021-08-10 桂林电子科技大学 Question generation method and device and storage medium
CN113255928A (en) * 2021-04-29 2021-08-13 支付宝(杭州)信息技术有限公司 Model training method and device and server
CN113239160B (en) * 2021-04-29 2022-08-12 桂林电子科技大学 Question generation method and device and storage medium
CN113255928B (en) * 2021-04-29 2022-07-05 支付宝(杭州)信息技术有限公司 Model training method and device and server
CN113204964A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Data processing method, system, electronic equipment and storage medium
CN113204964B (en) * 2021-05-31 2024-03-08 平安科技(深圳)有限公司 Data processing method, system, electronic equipment and storage medium
CN116863935A (en) * 2023-09-04 2023-10-10 深圳有咖互动科技有限公司 Speech recognition method, device, electronic equipment and computer readable medium
CN116863935B (en) * 2023-09-04 2023-11-24 深圳有咖互动科技有限公司 Speech recognition method, device, electronic equipment and computer readable medium
CN118052222A (en) * 2024-04-15 2024-05-17 北京晴数智慧科技有限公司 Method and device for generating multi-round dialogue data

Also Published As

Publication number Publication date
CN111414464B (en) 2023-04-07
CN110162613B (en) 2023-12-01
CN111414464A (en) 2020-07-14

Similar Documents

Publication Publication Date Title
CN111414464B (en) Question generation method, device, equipment and storage medium
Zhang et al. A review on question generation from natural language text
Tuan et al. Capturing greater context for question generation
CN112765345A (en) Text abstract automatic generation method and system fusing pre-training model
CN111651589B (en) Two-stage text abstract generation method for long document
US20200302023A1 (en) Generation of natural language text from structured data using a fusion model
Luz et al. Semantic parsing natural language into SPARQL: improving target language representation with neural attention
CN111858914A (en) Text abstract generation method and system based on sentence-level evaluation
CN117972434B (en) Training method, training device, training equipment, training medium and training program product for text processing model
Liu et al. GEEF: A neural network model for automatic essay feedback generation by integrating writing skills assessment
Liao et al. Question generation through transfer learning
CN117009456A (en) Medical query text processing method, device, equipment, medium and electronic product
Ghasemi et al. Farsick: A persian semantic textual similarity and natural language inference dataset
CN115964475A (en) Dialogue abstract generation method for medical inquiry
Lv et al. StyleBERT: Chinese pretraining by font style information
CN114492464A (en) Dialog generation method and system based on bidirectional asynchronous sequence
Huang et al. Flexible entity marks and a fine-grained style control for knowledge based natural answer generation
CN112257461A (en) XML document translation and evaluation method based on attention mechanism
Kurisinkel et al. Graph to coherent text: Passage generation from knowledge graphs by exploiting edge representations in sentential contexts
Liang et al. Knowledge graph enhanced transformer for generative question answering tasks
Lim et al. Orthography-phonology consistency in English: Theory-and data-driven measures and their impact on auditory vs. visual word recognition
Makhnytkina et al. Conversational question generation in Russian
Sun Machine reading comprehension: challenges and approaches
CN118093837B (en) Psychological support question-answering text generation method and system based on transform double decoding structure
Zhou et al. Enhancing Question Generation with Syntactic Details and Multi-Level Attention Mechanism

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant