US11294942B2 - Question generation - Google Patents

Question generation


Publication number
US11294942B2
Authority
US
United States
Prior art keywords
question
textual content
fact
decoder
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/334,135
Other versions
US20200183963A1 (en
Inventor
Reza Ghaeini
Sheikh Sadid Al Hasan
Oladimeji Feyisetan Farri
Kathy Mi Young Lee
Vivek Varma Datla
Ashequl Qadir
Junyi Liu
Adi Prakash
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips NV
Application filed by Koninklijke Philips NV
Priority to US16/334,135
Assigned to KONINKLIJKE PHILIPS N.V. Assignment of assignors interest (see document for details). Assignors: FARRI, OLADIMEJI FEYISETAN; PRAKASH, Adi; GHAEINI, Reza; AL HASAN, Sheikh Sadid; DATLA, Vivek Varma; LEE, Kathy Mi Young; LIU, JUNYI; QADIR, ASHEQUL
Publication of US20200183963A1
Application granted
Publication of US11294942B2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0454
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • Embodiments described herein generally relate to systems and methods for generating questions and, more particularly but not exclusively, to systems and methods for generating questions from free text using deep learning networks.
  • a QA system must receive a concise and well-described question as an input to generate the best possible answer as an output.
  • Prior studies have revealed that humans do not always ask succinct questions on a specific topic of interest. For example, variations in expressive abilities among users may impact the ability of QA systems to “understand” the input queries and therefore the QA system may not truly understand the information that a user is seeking. Accordingly, the performance of the QA system may be adversely affected.
  • embodiments relate to a method of generating a question from text.
  • the method includes receiving textual content using an interface, receiving a factual statement associated with the textual content using the interface, and generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement.
  • the method further includes receiving a question type related to the textual content using the interface.
  • the factual statement consists of a plurality of words
  • the method further comprises mapping, using a processor executing instructions stored on a memory to provide a fact embedder module, a sequence of the plurality of words to word embeddings.
  • the method further includes processing the word embeddings of the fact embedder module using a first gated recurrent unit to result in a set of computed weights.
  • the method further includes providing the received textual content to a second bidirectional fact-based gated recurrent unit whose weighting is determined by the set of computed weights.
  • the method further includes providing output of the second bidirectional fact-based gated recurrent unit to at least one attention generator, each attention generator computing normalized weights for all sequences of the second gated recurrent unit. In some embodiments, the method further includes using the computed normalized weights and a third unidirectional fact-based gated recurrent unit to generate a plurality of words forming the question. In some embodiments, the at least one attention generator utilizes the set of computed weights in determining the normalized weights. In some embodiments, the second bidirectional fact-based gated recurrent unit comprises a convolutional neural network feeding forward into a plurality of recurrent neural networks.
  • embodiments relate to a system for generating a question from text.
  • the system includes an interface for receiving textual content and a factual statement associated with the textual content, and a processor executing instructions stored on a memory to provide a question generator module configured to generate a question from the textual content relating to the factual statement.
  • the interface is further configured to receive a question type related to the textual content.
  • the factual statement consists of a plurality of words
  • the system further includes a processor executing instructions stored on a memory to provide a fact embedder module configured to map a sequence of the plurality of words to word embeddings.
  • the system further includes a first gated recurrent unit configured to process the word embeddings into a set of computed weights.
  • the system further includes a second bidirectional fact-based gated recurrent unit configured to receive the textual content whose weighting is determined by the set of computed weights.
  • the system further includes at least one attention generator configured to compute normalized weights for sequences outputted by the second bidirectional fact-based gated recurrent unit.
  • the system further includes a third unidirectional fact-based gated recurrent unit configured to generate a plurality of words forming the question using the normalized weights.
  • the at least one attention generator utilizes the set of computed weights in computing the normalized weights.
  • the second bidirectional fact-based gated recurrent unit comprises a convolutional neural network feeding forward into a plurality of recurrent neural networks.
  • the question generator module receives input that is a concatenation of the factual statement and a paragraph.
  • the system further includes an attention generator that considers the factual statement and previous hidden states of the question generator module.
  • the question generator module includes an encoder and a decoder, and the question generator module uses a single representation of the factual statement for the encoder and the decoder.
  • the question generator module includes an encoder and a decoder, and an input to the decoder is the element-wise product of an encoder output and an extracted fact representation from the encoder.
  • embodiments relate to a computer readable medium containing computer-executable instructions for performing a method of generating a question from text.
  • the medium includes computer-executable instructions for receiving textual content using an interface; computer-executable instructions for receiving a factual statement associated with the textual content using the interface, and computer-executable instructions for generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement.
  • FIG. 1 illustrates a system for generating a question from text in accordance with one embodiment
  • FIG. 2 illustrates the workflow of operation of the system of FIG. 1 in accordance with one embodiment
  • FIG. 3 illustrates a more detailed view of operation of the system of FIG. 1 in accordance with one embodiment
  • FIG. 4 illustrates recurrent neural network architectures used in one embodiment of the invention
  • FIG. 5 illustrates the attention mechanism of FIG. 4 in accordance with one embodiment
  • FIG. 6 depicts a flowchart of a method of generating a question from text in accordance with one embodiment
  • FIG. 7 depicts a flowchart of a method of generating a question from text in accordance with another embodiment.
  • the present disclosure also relates to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each may be coupled to a computer system bus.
  • the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
  • the proposed models may use a novel attention-based recurrent neural network (RNN) encoder-decoder architecture with control gates for focused facts and question types.
  • the deep RNN in accordance with various embodiments can look back on more than just previously generated words and/or characters to decide which word and/or characters to generate next as part of the generated question.
  • Various embodiments described herein may also use a convolutional neural network (CNN) with multiple window sizes to generate phrase embeddings based on, for example, the context span of the underlying sentence in the source paragraph.
  • Another novel component of various embodiments described herein is the use of a language model on top of a softmax layer. This enables the model to find an appropriate combination of words rather than just using the most probable word at each position. Accordingly, the language model can generate natural language questions.
  • features of the systems and methods described herein can create question suggestions in search engines or conversational dialogue systems. They can also be used to generate clarification questions/decompositions from a complex question or scenario (e.g., to better understand patient complaints, etc.). They can further be used to generate question answering corpus and perform educational assessments (e.g., as part of a tutoring system).
  • Methods and systems of various embodiments described herein may receive, as an input, a 3-tuple comprising a source text (e.g., a paragraph of information), a focused factual statement describing the topic of the question to be generated, and an indication of the question type.
  • the method and system may then create embeddings of both the source text and the focused fact.
  • These embeddings, along with the question type, are fed as inputs into a question generator module that includes a trained RNN that generates a sequence of word embeddings that represent the output question.
  • a language module may also translate the generated embeddings into natural language.
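  • The 3-tuple input described above can be sketched as a small data structure. This is only an illustration: the class name `QuestionRequest`, its field names, and the whitespace tokenization are assumptions made here, not part of the described system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QuestionRequest:
    """Hypothetical container for the (source text, focused fact, question type) tuple."""

    source_text: str    # paragraph of free text
    focused_fact: str   # factual statement the generated question should target
    question_type: str  # e.g. "what", "when", "how"

    def fact_tokens(self) -> List[str]:
        # Whitespace tokenization is an assumption made for illustration only.
        return self.focused_fact.split()


request = QuestionRequest(
    source_text="The Eiffel Tower was completed in 1889 in Paris.",
    focused_fact="completed in 1889",
    question_type="when",
)
```

  • A multi-word fact such as the one above is what would be passed to a fact embedder, while a single-word fact could be embedded directly.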
  • FIG. 1 illustrates a system 100 for generating a question from text in accordance with one embodiment.
  • the system 100 includes a processor 120 , memory 130 , a user interface 140 , a network interface 150 , and storage 160 interconnected via one or more system buses 110 .
  • FIG. 1 constitutes, in some respects, an abstraction, and the actual organization of the system 100 and the components thereof may differ from what is illustrated.
  • the processor 120 may be any hardware device capable of executing instructions stored on memory 130 or storage 160 or otherwise capable of processing data.
  • the processor 120 may include a microprocessor, field programmable gate array (FPGA), application-specific integrated circuit (ASIC), or other similar devices.
  • the memory 130 may include various memories such as, for example L1, L2, or L3 cache or system memory. As such, the memory 130 may include static random access memory (SRAM), dynamic RAM (DRAM), flash memory, read only memory (ROM), or other similar memory devices.
  • the user interface 140 may include one or more devices for enabling communication with a user.
  • the user interface 140 may include a display, a mouse, and a keyboard for receiving user commands.
  • the user interface 140 may include a command line interface or graphical user interface that may be presented to a remote terminal via the network interface 150 .
  • the user interface 140 may present an agent in the form of an avatar to communicate with a user. If the user is a child, for example, the avatar may be presented as a cartoon character to make the user feel more comfortable.
  • the displayed agent may of course vary and depend on the application.
  • the network interface 150 may include one or more devices for enabling communication with other hardware devices.
  • the network interface 150 may include a network interface card (NIC) configured to communicate according to the Ethernet protocol.
  • the network interface 150 may implement a TCP/IP stack for communication according to the TCP/IP protocols.
  • Various alternative or additional hardware or configurations for the network interface 150 will be apparent.
  • the storage 160 may include one or more machine-readable storage media such as read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, or similar storage media.
  • the storage 160 may store instructions for execution by the processor 120 or data upon which the processor 120 may operate.
  • the storage 160 may include a question generator module 161 .
  • the question generator module 161 may include a fact embedder module 162 , a paragraph embedder 163 , a question generator 164 , and a first gated recurrent unit 165 .
  • the question generator module 161 may further include an encoder 166 with a second bidirectional fact-based gated recurrent unit 167 and a decoder 168 with attention generator(s) 169 , and a third gated recurrent unit 170 .
  • This illustration of the question generator module 161 is merely exemplary and it is contemplated that the system 100 , as well as the question generator module 161 , may include components in addition to or in lieu of those shown in FIG. 1 .
  • the system 100 and, namely, the question generator module 161 may be trained on any appropriate corpus of language data.
  • the system 100 may use the Stanford Question Answering Dataset (SQuAD).
  • SQuAD was originally used as a reading comprehension dataset consisting of over 100,000 question-answer pairs.
  • the questions were derived from over 23,000 short paragraphs curated from a collection of over 5,000 Wikipedia articles, and the answer to each question is a segment of text from the corresponding paragraph.
  • the paragraphs may be used as the source, the answers may be used as the focused factual statements, and the corresponding question may be used as the target to train the question generator module 161 .
  • the SQuAD dataset also contains different question types (e.g., “what” questions, “when” questions, “how” questions, etc.) about domains such as math, sports, biographies, events, etc.
  • tuples of (paragraphs, questions, and answers) from SQuAD may be used. From the entire SQuAD dataset, randomly selected portions of the data may be used as the training set, development set, and testing set, respectively. It follows that the same paragraph may have different corresponding facts and target questions in the training and testing sets.
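  • The mapping from question-answering records to training examples, followed by a random split, might look like the sketch below. The record keys (`paragraph`, `answer`, `question`), the split fractions, and the seed are assumptions for illustration; they are not the actual SQuAD file schema or the proportions used in any embodiment.

```python
import random
from typing import Dict, List, Tuple

# (source paragraph, focused fact, target question)
Example = Tuple[str, str, str]


def to_examples(records: List[Dict[str, str]]) -> List[Example]:
    # The answer plays the role of the focused fact; the question is the target.
    return [(r["paragraph"], r["answer"], r["question"]) for r in records]


def split_examples(
    examples: List[Example],
    dev_fraction: float = 0.1,
    test_fraction: float = 0.1,
    seed: int = 0,
) -> Tuple[List[Example], List[Example], List[Example]]:
    # Shuffle a copy so the caller's list is left untouched.
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    n_test = int(len(shuffled) * test_fraction)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


# Toy records: two questions per paragraph, so a paragraph can land in several sets.
records = [
    {"paragraph": f"paragraph {i // 2}", "answer": f"answer {i}", "question": f"question {i}?"}
    for i in range(10)
]
train, dev, test = split_examples(to_examples(records))
```

  • Because the split is made over examples and not over paragraphs, the same paragraph can appear in the training and testing sets with different facts and target questions, as noted above.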
  • FIG. 2 illustrates the workflow 200 of the question generator module 161 of FIG. 1 in accordance with one embodiment.
  • a dataset 202 or other input mechanism may provide a tuple 204 comprising a source paragraph, a focused fact, and a question type.
  • the focused fact may be communicated to the fact embedder module 162 for embedding since the focused fact may be represented by more than one word.
  • the question type may indicate the type of question to be generated (e.g., a “what” question, a “where” question, a “when” question, etc.).
  • the paragraph (or other free text) may be communicated to the paragraph embedder 163 for embedding.
  • the paragraph embedder 163 may generate a sequence of embeddings for words from the paragraph.
  • the output of the paragraph embedder 163 may be a concatenation of each window's output.
  • the question generator 164 may include or otherwise be configured with the encoder 166 and decoder 168 that implement a bi-directional recurrent neural network (RNN).
  • the output of the question generator 164 is the generated question 206 about the focused fact.
  • FIG. 3 depicts a more detailed workflow 300 of the question generator module 161 in accordance with one embodiment.
  • the input into the question generator module 161 is a tuple including a paragraph 302 (i.e., free text), a focused-fact 304 , and a question type 306 .
  • the paragraph 302 may be fed into a convolutional neural network (CNN) 308 .
  • the CNN 308 may use windows of various sizes $W_s$ to capture sequences of words from the paragraph 302.
  • a CNN window $W_s$ may capture all words of the source text by capturing smaller sequences of words individually.
  • the embeddings may therefore be generated based on the context span of the underlying sentence. Accordingly, the CNN 308 provides a rich embedding solution.
  • the output of the CNN may be the concatenation of each CNN window's output. This output may then be fed into a plurality of recurrent neural networks 310 implemented by the question generator 164.
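  • A rough sketch of a multi-window convolution over word embeddings, with the per-window outputs concatenated at each word position, is given below. The window sizes, the number of filters, the ReLU activation, the zero padding, and the random filter weights (standing in for trained ones) are all assumptions made for illustration.

```python
import random
from typing import List, Sequence

Vector = List[float]


def conv1d_same(embeddings: Sequence[Vector], window: int, filters: Sequence[Vector]) -> List[Vector]:
    """Apply every filter to each zero-padded window of `window` consecutive words."""
    dim = len(embeddings[0])
    left = (window - 1) // 2
    right = window - 1 - left
    zero = [0.0] * dim
    padded = [zero] * left + list(embeddings) + [zero] * right
    outputs = []
    for start in range(len(embeddings)):
        # Flatten the words inside the window into one long vector.
        flat = [x for word in padded[start:start + window] for x in word]
        # One ReLU-activated feature per filter.
        outputs.append([max(0.0, sum(w * x for w, x in zip(f, flat))) for f in filters])
    return outputs


def phrase_embeddings(
    embeddings: Sequence[Vector],
    window_sizes: Sequence[int] = (1, 2, 3),
    filters_per_window: int = 4,
    seed: int = 0,
) -> List[Vector]:
    rng = random.Random(seed)  # random weights stand in for trained filters
    dim = len(embeddings[0])
    per_window = []
    for window in window_sizes:
        filters = [[rng.uniform(-1.0, 1.0) for _ in range(window * dim)]
                   for _ in range(filters_per_window)]
        per_window.append(conv1d_same(embeddings, window, filters))
    # Concatenate the outputs of all window sizes at each word position.
    return [[v for feats in per_window for v in feats[pos]]
            for pos in range(len(embeddings))]


paragraph = [[0.1 * i, 0.2] for i in range(5)]  # five words, two-dimensional embeddings
features = phrase_embeddings(paragraph)
```

  • With three window sizes and four filters each, every word position ends up with a 12-dimensional phrase embedding in this toy setting.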
  • the focused fact 304 may also be embedded in various embodiments (e.g., if the focused factual statement consists of more than one word).
  • the embedded focused fact may be communicated to a squeezer module 312 that generates a vector representation from the sequence of word embeddings.
  • the squeezer module 312 may implement any one of various tools known in the art such as those listed in FIG. 3 .
  • the embedded focused fact 304 and the question type 306 may be fed into a concatenator module 314 .
  • the concatenator module 314 may be configured to concatenate the focused-fact 304 and the question type 306 and feed them into one or more layers of a bi-directional recurrent neural network (RNN) 310 implemented by the question generator 164.
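  • The squeezer and concatenator steps can be illustrated as follows. Since the tools listed in FIG. 3 are not reproduced in this text, mean pooling is assumed here as the squeezing operation; the one-hot encoding of the question type and the `QUESTION_TYPES` inventory are likewise hypothetical.

```python
from typing import List, Sequence

Vector = List[float]

# Assumed inventory of question types, used only for this illustration.
QUESTION_TYPES = ("what", "when", "where", "who", "why", "how")


def squeeze(fact_embeddings: Sequence[Vector]) -> Vector:
    """Collapse a sequence of word embeddings into one vector by mean pooling."""
    count = len(fact_embeddings)
    return [sum(column) / count for column in zip(*fact_embeddings)]


def one_hot(question_type: str) -> Vector:
    return [1.0 if question_type == name else 0.0 for name in QUESTION_TYPES]


def control_vector(fact_embeddings: Sequence[Vector], question_type: str) -> Vector:
    # Concatenate the squeezed fact representation with the question-type encoding.
    return squeeze(fact_embeddings) + one_hot(question_type)


vec = control_vector([[1.0, 2.0], [3.0, 4.0]], "when")
```

  • Other pooling choices (for example max pooling or the final state of a recurrent unit) would slot into `squeeze` without changing the concatenation step.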
  • FIG. 4 illustrates an RNN 310 implemented by the question generator 164 in accordance with one embodiment.
  • $\text{In-W}_i$, $\text{F-W}_j$, and $\text{Q-W}_k$ stand for the $i$-th, $j$-th, and $k$-th word of the input paragraph (received from the paragraph embedder 163), the focused fact, and the generated question, respectively.
  • n, m, and q are the length of the paragraph, the focused fact, and the generated question, respectively.
  • RNN 310 shown in FIG. 4 is only one embodiment, which is referred to as a “deep curious” RNN model. There may be several other types of RNN models that consider different types of input. These different models are discussed below.
  • the gated recurrent units 165a and 165b receive the same input ($\{\text{F-W}_1, \text{F-W}_2, \ldots, \text{F-W}_m\}$).
  • the output of the second bidirectional fact-based GRU 167 may be the concatenation of forward and backward outputs of the GRU 165a and the output of the paragraph embedder 163.
  • the bidirectional fact-based gated recurrent unit 167 receives the words of the input paragraph $\text{In-W}_1, \text{In-W}_2, \ldots, \text{In-W}_k, \ldots, \text{In-W}_{n-1}, \text{In-W}_n$ and outputs the concatenation of the forward and backward representation of the input paragraph.
  • the forward representation of an input word $\text{In-W}_k$ may refer to analyzing the word(s) in the supplied text by reading the words in order from left to right.
  • the backward representation reverses the order and reads the word(s) backwards in order (right to left). Analyzing both representations recognizes the dependency between the words and allows for a better understanding of how words relate to one another.
  • the second bidirectional fact-based GRU 167 then outputs sequences 402 for each input word $\text{In-W}_k$.
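  • The forward and backward passes, and their concatenation per input word, can be sketched generically. The recurrent cell below (`toy_step`) is a deliberately simple stand-in chosen for illustration, not the fact-based gated recurrent unit itself.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Step = Callable[[Vector, Vector], Vector]


def run_rnn(inputs: Sequence[Vector], step: Step, initial: Vector) -> List[Vector]:
    """Feed the inputs through `step` in order, collecting the state after each word."""
    state = initial
    outputs = []
    for x in inputs:
        state = step(x, state)
        outputs.append(state)
    return outputs


def bidirectional(inputs: Sequence[Vector], step: Step, initial: Vector) -> List[Vector]:
    forward = run_rnn(inputs, step, initial)  # left to right
    # Right to left, then re-aligned so index k again refers to word k.
    backward = run_rnn(list(reversed(inputs)), step, initial)[::-1]
    # Concatenate the two representations for each word.
    return [f + b for f, b in zip(forward, backward)]


def toy_step(x: Vector, h: Vector) -> Vector:
    # Stand-in recurrent cell (an assumption), not the fact-based GRU.
    return [math.tanh(xi + 0.5 * hi) for xi, hi in zip(x, h)]


sequences = bidirectional([[0.1], [0.2], [0.3]], toy_step, [0.0])
```

  • Each entry of `sequences` holds the forward state followed by the backward state for the corresponding word, so a word's representation reflects both its left and right context.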
  • Equations 1-4 may be referred to as the operational stages of the GRU 165a.
  • W, U, and F are weighting parameters
  • $x_t$ is the input at time $t$
  • $h_{t-1}$ is the state and the output of the second fact-based gated recurrent unit 167 at time $t-1$.
  • $d_{enc}$ is the time-independent embedding of the fact that is extracted from the GRU 165a and is the same for all time sequences.
  • $\odot$ is the element-wise product operation
  • $r_t$ and $z_t$ are the reset gate and the update gate at time $t$, respectively.
  • $\tilde{h}_t$ is the new candidate state at time $t$
  • $h_t$ is the final state and output of the GRU 165a at time $t$.
  • Equation 1 calculates the reset gate and determines the importance of $h_{t-1}$ (the state at $t-1$) in calculating the summarization $\tilde{h}_t$.
  • Equation 2 calculates the update signal and determines how much of $h_{t-1}$ should be considered in calculating the next state $h_t$ at time $t$. For example, if $z_t$ is approximately equal to 1, then $h_{t-1}$ is almost entirely copied to $h_t$. On the other hand, if $z_t$ is approximately equal to 0, then mostly the new memory $\tilde{h}_t$ is forwarded to calculate the next hidden state.
  • Equation 3 calculates the new memory $\tilde{h}_t$, which is the consolidation of a new input $x_t$ with the past hidden state $h_{t-1}$. This equation essentially combines a newly observed word with a previous state $h_{t-1}$ to summarize the new word in the context of the previous state. Finally, equation 4 calculates the final state $h_t$, which is the output of the gated recurrent unit 165a.
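  • Equations 1-4 themselves are not reproduced in this text. Assuming a standard gated recurrent unit in which each stage receives an additional fact term weighted by F, and the update-gate convention described above (a value of $z_t$ near 1 copies $h_{t-1}$), they would take roughly the following form. The per-stage subscripts on W, U, and F, the placement of the reset gate inside Equation 3, and the choice of sigmoid and tanh nonlinearities are assumptions of this sketch.

```latex
\begin{align}
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + F_r d_{enc}\right) && \text{(Eq. 1, reset gate)} \\
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + F_z d_{enc}\right) && \text{(Eq. 2, update gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + F_h d_{enc}\right) && \text{(Eq. 3, new memory)} \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t && \text{(Eq. 4, final state)}
\end{align}
```

  • Under this reading, the fact embedding $d_{enc}$ shifts every gate by the same amount at each time step, which is one way the factual statement could determine the weighting of the unit.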
  • the decoder 168 utilizes the focused fact (received from GRU 165b) in a similar manner as the encoder 166.
  • the equations 5-8 are similar to equations 1-4 except $d_{dec}$ stands for the independent embedding of the focused fact that is extracted from the GRU 165b.
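  • A single step of a fact-gated recurrent unit can be sketched as one function shared by the encoder and decoder, with the fact embedding passed in as `d` (standing for $d_{enc}$ or $d_{dec}$). The gate equations used here assume a standard GRU with an added fact term per stage; the parameter names (`Wr`, `Ur`, `Fr`, and so on) and the exact placement of the reset gate are assumptions, since the equations are not reproduced in this text.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def fact_gru_step(x: Vector, h_prev: Vector, d: Vector, params: Dict[str, Matrix]) -> Vector:
    """One time step; `d` is the time-independent fact embedding."""

    def stage(name: str, h_in: Vector, activation: Callable[[float], float]) -> Vector:
        # Sum of the input term, the recurrent term, and the fact term.
        pre = [a + b + c for a, b, c in zip(
            matvec(params["W" + name], x),
            matvec(params["U" + name], h_in),
            matvec(params["F" + name], d),
        )]
        return [activation(v) for v in pre]

    r = stage("r", h_prev, sigmoid)                 # reset gate
    z = stage("z", h_prev, sigmoid)                 # update gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]  # r_t (element-wise) h_{t-1}
    candidate = stage("h", gated, math.tanh)        # new memory
    # z near 1 copies the previous state; z near 0 forwards the new memory.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, candidate)]


zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
names = ("Wr", "Ur", "Fr", "Wz", "Uz", "Fz", "Wh", "Uh", "Fh")
zero_params = {name: zeros for name in names}
fact_params = dict(zero_params, Fh=identity)

h_plain = fact_gru_step([1.0, 1.0], [1.0, -1.0], [1.0, 0.0], zero_params)
h_fact = fact_gru_step([1.0, 1.0], [1.0, -1.0], [1.0, 0.0], fact_params)
```

  • With all weights at zero both gates sit at 0.5, so the state is simply halved; switching on only the fact weights of the candidate stage changes the result, which shows the fact embedding steering the state independently of the input word.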
  • the decoder 168 is trained to predict the next word $y_t$ (referred to as $\text{Q-W}_t$ in FIG. 4) based on the context vector $c$ from the encoder 166 and the previously predicted words ($y_1, \ldots, y_{t-1}$).
  • each conditional probability is modeled as follows:
    $$p\left(y_t \mid \{y_1, \ldots, y_{t-1}\}, c_t, d_{dec}\right) = f\left(y_{t-1}, s_t, c_t, d_{dec}\right) \quad \text{(Eq. 10)}$$
  • FIG. 5 illustrates the outputted concatenated sequences 402 of FIG. 4 being communicated to the attention generator 169 of FIG. 4 .
  • the attention generator(s) 169 may be configured to compute normalized weights for each sequence (i.e., hidden representation) outputted by the encoder 166 .
  • the attention generator 169 may compute the weight and importance of each encoder output according to the previous decoder hidden state using equation 14. Referring back to FIG. 3 , this step may be performed by the softmax layer 314 .
  • the softmax function mitigates the effect of extreme values or outliers in a dataset without entirely removing them from the dataset.
  • the attention generator 169 may normalize the computed weights for all sequences and communicate the normalized weights to the third fact-based GRU 170 .
  • the third fact-based GRU 170 may use the weighted sum of the encoder's hidden representations to generate a possible word for the question.
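  • The attention step (scoring each encoder output against the previous decoder state, normalizing with a softmax, and forming a weighted sum) can be sketched as below. Equation 14 is not reproduced in this text, so a plain dot-product score is assumed here purely for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    # Subtract the maximum for numerical stability; the result sums to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states: Sequence[Vector], decoder_state: Vector) -> Tuple[List[float], Vector]:
    # Assumed scoring function: dot product of each encoder state with the decoder state.
    scores = [sum(e * d for e, d in zip(state, decoder_state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted sum of the encoder's hidden representations.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


weights, context = attend([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 2.0])
```

  • In this toy call the second and third encoder states align equally well with the decoder state and receive equal weight, while the first receives a smaller but non-zero weight.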
  • the output of the decoder 168 may be communicated to a language model 316 .
  • the language model 316 may be configured to at least assess proposed questions.
  • the top k probable words will be communicated to the language model 316 .
  • the language model 316 may execute a beam search and/or an n-gram based language model (as well as any other suitable type of language model).
  • the score of each word would be the product of the softmax value (as determined by equation 14) and the n-gram probabilities.
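  • Rescoring the top k candidate words with a language model can be illustrated with a bigram table. The probabilities below are made-up toy values, and the small `floor` probability for unseen bigrams is an assumption of this sketch.

```python
from typing import Dict, Tuple


def rescore(
    candidates: Dict[str, float],
    previous_word: str,
    bigram_probs: Dict[Tuple[str, str], float],
    floor: float = 1e-6,
) -> Tuple[str, Dict[str, float]]:
    """Multiply each candidate's softmax value by its bigram probability."""
    scored = {
        word: softmax_value * bigram_probs.get((previous_word, word), floor)
        for word, softmax_value in candidates.items()
    }
    best = max(scored, key=scored.get)
    return best, scored


# Toy values for illustration only.
top_k = {"is": 0.5, "was": 0.3, "the": 0.2}
bigrams = {("what", "is"): 0.2, ("what", "was"): 0.4, ("what", "the"): 0.1}
best, scores = rescore(top_k, "what", bigrams)
```

  • Here the word with the highest softmax value ("is") is not the one selected: the language model favors "was" after "what" strongly enough to change the ranking, which is the effect of combining the two scores.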
  • the generated question may be supplied to a user via a user interface 140 .
  • the encoder-decoder model may be built on a simple attention-based RNN encoder-decoder framework. The model does not use a factual statement and the RNN generates questions directly from the source text or paragraph.
  • the deep curious model may be a fact-based RNN that is built on an attention-based encoder-decoder framework.
  • the factual statement affects the gating of the encoder and the decoder RNN.
  • Another implementation may be referred to as an “augmented deep curious” model, which is an augmented version of the deep curious model discussed above.
  • in this model, the attention generator considers the focused fact (i.e., the fact representation that is extracted for the decoder part, $d_{dec}$) in addition to the previous hidden state of the decoder and the hidden states of the encoder in accordance with equation 16 below.
  • Another implementation may be referred to as a “simplified deep curious” model. This model is similar to the deep curious model discussed above, but uses one single fact representation for both the encoder and decoder.
  • the elementary deep curious model is a fact-based RNN that is built on an attention-based encoder-decoder framework.
  • the final output from the encoder that passes through the decoder is the element-wise product of the encoder output and the extracted fact representation from the encoder GRU ($d_{enc}$).
  • Another implementation may be referred to as a “separate deep curious” model.
  • This model is similar to the elementary deep curious model.
  • the separate deep curious model uses a separate GRU to extract a fact representation called $d_{elem}$.
  • the model uses $d_{elem}$ instead of $d_{enc}$ in the element-wise product.
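  • The element-wise product used by the elementary and separate deep curious models can be shown in a few lines. The function name and the toy vectors are illustrative assumptions; `fact_repr` stands for either fact representation, depending on the model.

```python
from typing import List, Sequence

Vector = List[float]


def fact_filtered(encoder_outputs: Sequence[Vector], fact_repr: Vector) -> List[Vector]:
    """Scale each encoder output, dimension by dimension, by the fact representation."""
    return [[o * f for o, f in zip(output, fact_repr)] for output in encoder_outputs]


filtered = fact_filtered([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.0])
```

  • Dimensions where the fact representation is zero are suppressed entirely, so only the parts of the encoder output that the fact "selects" reach the decoder.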
  • Another implementation may be referred to as a “fact+ encoder-decoder” model.
  • This model is similar to the encoder-decoder model discussed above, but its input is the concatenation of the factual statement and input paragraph.
  • Another implementation may be referred to as a “fact+ deep curious” model. This model is similar to the deep curious model, but its input is the concatenation of the factual statement and input paragraph.
  • Another implementation may be referred to as an “augmented+ deep curious” model.
  • This model is similar to the augmented deep curious model, but its input is the concatenation of the factual statement and paragraph.
  • FIG. 6 depicts a flowchart of a method 600 of generating a question from text in accordance with one embodiment.
  • Step 602 involves receiving textual content using an interface.
  • the textual content may be in the form of a paragraph of free text, for example, and may be received through any suitable interface such as those discussed previously.
  • Step 604 involves receiving a factual statement associated with the textual content using the interface.
  • the factual statement or “focused fact” indicates to what the generated question should be directed. Accordingly, the question generator module 161 may suggest more relevant questions based on the desires of the user(s).
  • Step 606 involves generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement.
  • the processor may rely on convolutional neural networks that feed into one or more recurrent neural networks to find the best combination of words to generate the most relevant question. Then, a generated question may be outputted to a user.
  • FIG. 7 depicts a flowchart of a method 700 of generating a question from text in accordance with another embodiment. Steps 702 and 704 are similar to steps 602 and 604 of FIG. 6 , respectively, and are not repeated here. Step 706 involves receiving a question type related to the textual content.
  • the question type may refer to whether the generated question should be a “what” question, a “why” question, a “when” question, etc.
  • Step 708 involves mapping a sequence of the factual statement to word embeddings.
  • the factual statement consists of a plurality of words.
  • the processor may execute instructions stored on a memory to provide a fact embedder module.
  • the fact embedder module may be similarly configured to the fact embedder 162 of FIG. 2 , for example, and may map the sequence of words of the factual statement to word embeddings.
  • Step 710 involves processing the word embeddings of the fact embedder module using a first gated recurrent unit to result in a set of computed weights.
  • the first gated recurrent unit may be similar to the GRU 165a of FIG. 4, for example.
  • Step 712 involves providing the received textual content to a second bidirectional fact-based gated recurrent unit whose weighting is determined by the set of computed weights.
  • the second bidirectional fact-based gated recurrent unit may be similar to the bidirectional fact-based gated recurrent unit 167 of FIG. 4 , for example.
  • Step 714 involves providing output of the second bidirectional fact-based gated recurrent unit to at least one attention generator, each attention generator computing normalized weights for all sequences of the second gated recurrent unit.
  • the attention generators may be similar to the attention generators 169 of FIG. 4 .
  • Step 716 involves using the computed normalized weights and a third unidirectional fact-based gated recurrent unit to generate a plurality of words forming the question.
  • the third unidirectional fact-based gated recurrent unit may be similar to the third fact-based GRU 170 of FIG. 4 .
  • Step 718 involves generating a question relating to the factual statement using the words generated in step 716 .
  • the question may then be outputted to a user.
  • Embodiments of the present disclosure are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the present disclosure.
  • the functions/acts noted in the blocks may occur out of the order as shown in any flowchart.
  • two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
  • not all of the blocks shown in any flowchart need to be performed and/or executed. For example, if a given flowchart has five blocks containing functions/acts, it may be the case that only three of the five blocks are performed and/or executed. In this example, any of the three of the five blocks may be performed and/or executed.
  • a statement that a value exceeds (or is more than) a first threshold value is equivalent to a statement that the value meets or exceeds a second threshold value that is slightly greater than the first threshold value, e.g., the second threshold value being one value higher than the first threshold value in the resolution of a relevant system.
  • a statement that a value is less than (or is within) a first threshold value is equivalent to a statement that the value is less than or equal to a second threshold value that is slightly lower than the first threshold value, e.g., the second threshold value being one value lower than the first threshold value in the resolution of the relevant system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

Methods and systems for generating a question from free text. The system is trained on a corpus of data and receives a tuple consisting of a paragraph (free text), a focused fact, and a question type. The system implements a language model to find the most optimal combination of words to return a question for the paragraph about the focused fact.

Description

CROSS-REFERENCE TO PRIOR APPLICATIONS
This application is the U.S. National Phase application under 35 U.S.C. § 371 of International Application No. PCT/EP2017/074818, filed on Sep. 12, 2017, which claims the benefit of Provisional Application Ser. No. 62/401,293, filed Sep. 29, 2016. These applications are hereby incorporated by reference herein, for all purposes.
TECHNICAL FIELD
Embodiments described herein generally relate to systems and methods for generating questions and, more particularly but not exclusively, to systems and methods for generating questions from free text using deep learning networks.
BACKGROUND
Search engines cannot always satisfy an end user's desire to have more direct access to relevant documents or information. Question answering (QA) systems have attempted to improve this experience by directly retrieving relevant answers to natural language questions.
However, one of the main requirements of a QA system is that it must receive a concise and well-described question as an input to generate the best possible answer as an output. Prior studies have revealed that humans do not always ask succinct questions on a specific topic of interest. For example, variations in expressive abilities among users may impact the ability of QA systems to “understand” the input queries and therefore the QA system may not truly understand the information that a user is seeking. Accordingly, the performance of the QA system may be adversely affected.
Using existing search engines is therefore time consuming and often unsatisfying as they are unable to handle complex queries well. Accordingly, accurate answers may not be returned. While existing QA systems have addressed some of these limitations, they generally do not perform well with questions that are related to many topics of interest and that may be unrelated.
A need exists, therefore, for question generation systems and methods that overcome the disadvantages of existing techniques.
SUMMARY
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description section. This summary is not intended to identify or exclude key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In one aspect, embodiments relate to a method of generating a question from text. The method includes receiving textual content using an interface, receiving a factual statement associated with the textual content using the interface, and generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement.
In some embodiments, the method further includes receiving a question type related to the textual content using the interface.
In some embodiments, the factual statement consists of a plurality of words, and the method further comprises mapping, using a processor executing instructions stored on a memory to provide a fact embedder module, a sequence of the plurality of words to word embeddings. In some embodiments, the method further includes processing the word embeddings of the fact embedder module using a first gated recurrent unit to result in a set of computed weights. In some embodiments, the method further includes providing the received textual content to a second bidirectional fact-based gated recurrent unit whose weighting is determined by the set of computed weights. In some embodiments, the method further includes providing output of the second bidirectional fact-based gated recurrent unit to at least one attention generator, each attention generator computing normalized weights for all sequences of the second gated recurrent unit. In some embodiments, the method further includes using the computed normalized weights and a third unidirectional fact-based gated recurrent unit to generate a plurality of words forming the question. In some embodiments, the at least one attention generator utilizes the set of computed weights in determining the normalized weights. In some embodiments, the second bidirectional fact-based gated recurrent unit comprises a convolutional neural network feeding forward into a plurality of recurrent neural networks.
According to yet another aspect, embodiments relate to a system for generating a question from text. The system includes an interface for receiving textual content and a factual statement associated with the textual content, and a processor executing instructions stored on a memory to provide a question generator module configured to generate a question from the textual content relating to the factual statement.
In some embodiments, the interface is further configured to receive a question type related to the textual content.
In some embodiments, the factual statement consists of a plurality of words, and the system further includes a processor executing instructions stored on a memory to provide a fact embedder module configured to map a sequence of the plurality of words to word embeddings. In some embodiments, the system further includes a first gated recurrent unit configured to process the word embeddings into a set of computed weights. In some embodiments, the system further includes a second bidirectional fact-based gated recurrent unit configured to receive the textual content whose weighting is determined by the set of computed weights. In some embodiments, the system further includes at least one attention generator configured to compute normalized weights for sequences outputted by the second bidirectional fact-based gated recurrent unit. In some embodiments, the system further includes a third unidirectional fact-based gated recurrent unit configured to generate a plurality of words forming the question using the normalized weights. In some embodiments, the at least one attention generator utilizes the set of computed weights in computing the normalized weights. In some embodiments, the second bidirectional fact-based gated recurrent unit comprises a convolutional neural network feeding forward into a plurality of recurrent neural networks.
In some embodiments, the question generator module receives input that is a concatenation of the factual statement and a paragraph. In some embodiments, the system further includes an attention generator that considers the factual statement and previous hidden states of the question generator module. In some embodiments, the question generator module includes an encoder and a decoder, and the question generator module uses a single representation of the factual statement for the encoder and the decoder.
In some embodiments, the question generator module includes an encoder and a decoder, and an input to the decoder is the element-wise product of an encoder output and an extracted fact representation from the encoder.
According to yet another aspect, embodiments relate to a computer readable medium containing computer-executable instructions for performing a method of generating a question from text. The medium includes computer-executable instructions for receiving textual content using an interface; computer-executable instructions for receiving a factual statement associated with the textual content using the interface, and computer-executable instructions for generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement.
BRIEF DESCRIPTION OF DRAWINGS
Non-limiting and non-exhaustive embodiments of the invention are described with reference to the following figures, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified.
FIG. 1 illustrates a system for generating a question from text in accordance with one embodiment;
FIG. 2 illustrates the workflow of operation of the system of FIG. 1 in accordance with one embodiment;
FIG. 3 illustrates a more detailed view of operation of the system of FIG. 1 in accordance with one embodiment;
FIG. 4 illustrates recurrent neural network architectures used in one embodiment of the invention;
FIG. 5 illustrates the attention mechanism of FIG. 4 in accordance with one embodiment;
FIG. 6 depicts a flowchart of a method of generating a question from text in accordance with one embodiment; and
FIG. 7 depicts a flowchart of a method of generating a question from text in accordance with another embodiment.
DETAILED DESCRIPTION
Various embodiments are described more fully below with reference to the accompanying drawings, which form a part hereof, and which show specific exemplary embodiments. However, the concepts of the present disclosure may be implemented in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided as part of a thorough and complete disclosure, to fully convey the scope of the concepts, techniques and implementations of the present disclosure to those skilled in the art. Embodiments may be practiced as methods, systems or devices. Accordingly, embodiments may take the form of a hardware implementation, an entirely software implementation or an implementation combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one example implementation or technique in accordance with the present disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
Some portions of the description that follow are presented in terms of symbolic representations of operations on non-transient signals stored within a computer memory. These descriptions and representations are used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. Such operations typically require physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic or optical signals capable of being stored, transferred, combined, compared and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. Furthermore, it is also convenient at times, to refer to certain arrangements of steps requiring physical manipulations of physical quantities as modules or code devices, without loss of generality.
However, all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices. Portions of the present disclosure include processes and instructions that may be embodied in software, firmware or hardware, and when embodied in software, may be downloaded to reside on and be operated from different platforms used by a variety of operating systems.
The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each may be coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
The processes and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform one or more method steps. The structure for a variety of these systems is discussed in the description below. In addition, any particular programming language that is sufficient for achieving the techniques and implementations of the present disclosure may be used. A variety of programming languages may be used to implement the present disclosure as discussed herein.
In addition, the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the disclosed subject matter. Accordingly, the present disclosure is intended to be illustrative, and not limiting, of the scope of the concepts discussed herein.
Features of various embodiments of systems and methods described herein utilize a novel deep learning-based question generation solution to generate relevant questions from textual content (e.g., a paragraph about specific topic(s) of interest). Various embodiments use associated focused factual statements and question types to generate the relevant question(s).
Specifically, the proposed models may use a novel attention-based recurrent neural network (RNN) encoder-decoder architecture with control gates for focused facts and question types. The deep RNN in accordance with various embodiments can look back on more than just previously generated words and/or characters to decide which word and/or characters to generate next as part of the generated question. Various embodiments described herein may also use a convolutional neural network (CNN) with multiple window sizes to generate phrase embeddings based on, for example, the context span of the underlying sentence in the source paragraph.
Another novel component of various embodiments described herein is the use of a language model on top of a softmax layer. This enables the model to find an appropriate combination of words rather than just using the most probable word at each position. Accordingly, the language model can generate natural language questions.
Features of various embodiments described herein can be implemented in a variety of applications. For example, features of the systems and methods described herein can create question suggestions in search engines or conversational dialogue systems. They can also be used to generate clarification questions/decompositions from a complex question or scenario (e.g., to better understand patient complaints, etc.). They can further be used to generate a question answering corpus and to perform educational assessments (e.g., as part of a tutoring system).
Methods and systems of various embodiments described herein may receive, as an input, a 3-tuple comprising a source text (e.g., a paragraph of information), a focused factual statement describing the topic of the question to be generated, and an indication of the question type. In some embodiments, the method and system may then create embeddings of both the source text and the focused fact. These embeddings, along with the question type, are fed as inputs into a question generator module that includes a trained RNN that generates a sequence of word embeddings that represent the output question. A language module may also translate the generated embeddings into natural language.
FIG. 1 illustrates a system 100 for generating a question from text in accordance with one embodiment. As shown, the system 100 includes a processor 120, memory 130, a user interface 140, a network interface 150, and storage 160 interconnected via one or more system buses 110. It will be understood that FIG. 1 constitutes, in some respects, an abstraction and that the actual organization of the system 100 and the components thereof may differ from what is illustrated.
The processor 120 may be any hardware device capable of executing instructions stored on memory 130 or storage 160 or otherwise capable of processing data. As such, the processor 120 may include a microprocessor, field programmable gate array (FPGA), application-specific integrated circuit (ASIC), or other similar devices.
The memory 130 may include various memories such as, for example L1, L2, or L3 cache or system memory. As such, the memory 130 may include static random access memory (SRAM), dynamic RAM (DRAM), flash memory, read only memory (ROM), or other similar memory devices.
The user interface 140 may include one or more devices for enabling communication with a user. For example, the user interface 140 may include a display, a mouse, and a keyboard for receiving user commands. In some embodiments, the user interface 140 may include a command line interface or graphical user interface that may be presented to a remote terminal via the network interface 150.
The user interface 140 may present an agent in the form of an avatar to communicate with a user. If the user is a child, for example, the avatar may be presented as a cartoon character to make the user feel more comfortable. The displayed agent may of course vary and depend on the application.
The network interface 150 may include one or more devices for enabling communication with other hardware devices. For example, the network interface 150 may include a network interface card (NIC) configured to communicate according to the Ethernet protocol. Additionally, the network interface 150 may implement a TCP/IP stack for communication according to the TCP/IP protocols. Various alternative or additional hardware or configurations for the network interface 150 will be apparent.
The storage 160 may include one or more machine-readable storage media such as read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, or similar storage media. In various embodiments, the storage 160 may store instructions for execution by the processor 120 or data upon which the processor 120 may operate.
For example, the storage 160 may include a question generator module 161. The question generator module 161 may include a fact embedder module 162, a paragraph embedder 163, a question generator 164, and a first gated recurrent unit 165. The question generator module 161 may further include an encoder 166 with a second bidirectional fact-based gated recurrent unit 167 and a decoder 168 with attention generator(s) 169, and a third gated recurrent unit 170. This illustration of the question generator module 161 is merely exemplary and it is contemplated that the system 100, as well as the question generator module 161, may include components in addition to or in lieu of those shown in FIG. 1.
The system 100 and, namely, the question generator module 161 may be trained on any appropriate corpus of language data. For example, one embodiment of the system 100 may use the Stanford Question Answering Dataset (SQuAD). SQuAD was originally used as a reading comprehension dataset consisting of over 100,000 question-answer pairs. The questions were derived from over 23,000 short paragraphs curated from a collection of over 5,000 Wikipedia articles, and the answer to each question is a segment of text from the corresponding paragraph.
The paragraphs may be used as the source, the answers may be used as the focused factual statements, and the corresponding questions may be used as the targets to train the question generator module 161. The SQuAD dataset also contains different question types (e.g., “what” questions, “when” questions, “how” questions, etc.) about domains such as math, sports, biographies, events, etc.
To train the question generator module 161, tuples of (paragraph, question, answer) from SQuAD may be used. Randomly selected portions of the entire SQuAD dataset may be used as the training set, development set, and testing set, respectively. It follows that the same paragraph may have different corresponding facts and target questions in the training and testing sets.
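The random partitioning described above can be sketched as follows. This is a minimal illustration only: the function name `split_examples`, the 10% development and test fractions, and the toy rows are assumptions, since the disclosure does not state split sizes. Because the split is made per tuple rather than per paragraph, one paragraph may appear in more than one set with different facts and questions.

```python
import random

def split_examples(examples, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle (paragraph, fact, question) tuples and split them three ways.

    The fractions are illustrative assumptions, not values from the disclosure.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_frac)
    n_test = int(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test

# Toy stand-in for SQuAD rows: (paragraph, answer used as focused fact, question).
# Two rows share each paragraph, so a paragraph can land in different sets.
examples = [(f"paragraph {i // 2}", f"fact {i}", f"question {i}") for i in range(20)]
train, dev, test = split_examples(examples)
```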
FIG. 2 illustrates the workflow 200 of the question generator module 161 of FIG. 1 in accordance with one embodiment. As mentioned previously, a dataset 202 or other input mechanism may provide a tuple 204 comprising a source paragraph, a focused fact, and a question type.
The focused fact may be communicated to the fact embedder module 162 for embedding since the focused fact may be represented by more than one word. The question type may indicate the type of question to be generated (e.g., a “what” question, a “where” question, a “when” question, etc.).
The paragraph (or other free text) may be communicated to the paragraph embedder 163 for embedding. The paragraph embedder 163 may generate a sequence of embeddings for words from the paragraph. The output of the paragraph embedder 163 may be a concatenation of each window's output.
The question generator 164 may include or otherwise be configured with the encoder 166 and decoder 168 that implement a bi-directional recurrent neural network (RNN). The output of the question generator 164 is the generated question 206 about the focused fact.
FIG. 3 depicts a more detailed workflow 300 of the question generator module 161 in accordance with one embodiment. As stated previously, the input into the question generator module 161 is a tuple including a paragraph 302 (i.e., free text), a focused-fact 304, and a question type 306.
The paragraph 302 may be fed into a convolutional neural network (CNN) 308. The CNN 308 may use windows of various sizes Ws to capture sequences of words from the paragraph 302. In other words, a CNN window Ws may capture all words of the source text by capturing smaller sequences of words individually. The embeddings may therefore be generated based on the context span of the underlying sentence. Accordingly, the CNN 308 provides a rich embedding solution.
The output of the CNN may be the concatenation of each CNN's window's output. This output may then be fed into a plurality of recurrent neural networks 310 implemented by the question generator 164.
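The multi-window convolution and the concatenation of each window's output may be illustrated with the following sketch. It is not the disclosed implementation: the window sizes (1, 3, 5), the number of filters, the tanh nonlinearity, zero padding, and the random (untrained) filter weights are all assumptions chosen to keep the example small.

```python
import math
import random

def conv_window(embeddings, filters, width):
    """Apply 1-D filters of a given width at every word position (zero padded)."""
    dim = len(embeddings[0])
    pad = [[0.0] * dim for _ in range(width // 2)]
    padded = pad + embeddings + pad
    outputs = []
    for pos in range(len(embeddings)):
        # Flatten the `width` consecutive word vectors under the window.
        window = [v for vec in padded[pos:pos + width] for v in vec]
        outputs.append([math.tanh(sum(w * x for w, x in zip(f, window)))
                        for f in filters])
    return outputs

def phrase_embeddings(embeddings, window_sizes=(1, 3, 5), num_filters=2, seed=0):
    """Concatenate, per word position, the outputs of every window size."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    per_window = []
    for width in window_sizes:
        # Random weights stand in for trained filters (assumption).
        filters = [[rng.uniform(-0.5, 0.5) for _ in range(width * dim)]
                   for _ in range(num_filters)]
        per_window.append(conv_window(embeddings, filters, width))
    return [sum((out[pos] for out in per_window), [])
            for pos in range(len(embeddings))]

# Six words with 4-dimensional toy embeddings.
paragraph = [[0.1 * (i + j) for j in range(4)] for i in range(6)]
phrases = phrase_embeddings(paragraph)
```

Each word position ends up with one vector whose length is the number of window sizes times the number of filters, which is the sequence that would then be passed to the recurrent layers.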
The focused fact 304 may also be embedded in various embodiments (e.g., if the focused factual statement consists of more than one word). The embedded focused fact may be communicated to a squeezer module 312 that generates a vector representation from the sequence of word embeddings. The squeezer module 312 may implement any one of various tools known in the art such as those listed in FIG. 3.
The embedded focused fact 304 and the question type 306 may be fed into a concatenator module 314. The concatenator module 314 may be configured to concatenate the focused fact 304 and the question type 306 and feed them into one or more layers of a bi-directional recurrent neural network (RNN) 310 implemented by the question generator 164.
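One possible reading of the squeezer and concatenator stages is sketched below. The tools listed in FIG. 3 are not reproduced in this text, so mean pooling is used purely as an assumed example of a squeezer; the one-hot encoding of the question type and the `QUESTION_TYPES` inventory are likewise hypothetical.

```python
QUESTION_TYPES = ("what", "when", "where", "why", "how")  # hypothetical inventory

def squeeze_mean(word_embeddings):
    """Collapse a sequence of word vectors into one vector by averaging.

    Mean pooling is an assumed squeezer, not necessarily one named in FIG. 3.
    """
    n = len(word_embeddings)
    return [sum(column) / n for column in zip(*word_embeddings)]

def one_hot(question_type):
    """Encode the question type as a one-hot vector (assumed encoding)."""
    return [1.0 if question_type == t else 0.0 for t in QUESTION_TYPES]

def concatenate(fact_vector, question_type):
    """Join the squeezed fact vector and the question-type vector."""
    return fact_vector + one_hot(question_type)

# A two-word focused fact with 2-dimensional toy embeddings.
fact = [[1.0, 2.0], [3.0, 4.0]]
control = concatenate(squeeze_mean(fact), "when")
```

The resulting `control` vector is a single fixed-length input that can condition every time step of the recurrent layers.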
FIG. 4 illustrates an RNN 310 implemented by the question generator 164 in accordance with one embodiment. In the context of FIG. 4, In-WI, F-WJ, and Q-WK stand for the Ith, Jth, and Kth word of the input paragraph (received from the paragraph embedder 163), the focused fact, and the generated question, respectively. Also, n, m, and q are the lengths of the paragraph, the focused fact, and the generated question, respectively.
It is noted that the RNN 310 shown in FIG. 4 is only one embodiment, which is referred to as a “deep curious” RNN model. There may be several other types of RNN models that consider different types of input. These different models are discussed below.
The gated recurrent units 165 a and 165 b receive the same input ({F-W1, F-W2, . . . , F-Wm}). The output of the second bidirectional fact-based GRU 167 may be the concatenation of forward and backward outputs of the GRU 165 a and the output of the paragraph embedder 163.
That is, the bidirectional fact-based gated recurrent unit 167 receives the words of the input paragraph In-W1, In-W2, . . . , In-Wk, . . . , In-Wn-1, In-Wn and outputs the concatenation of the forward and backward representations of the input paragraph. The forward representation of an input word In-Wk may refer to analyzing the word(s) in the supplied text by reading the words in order from left to right. The backward representation reverses the order and reads the word(s) from right to left. Analyzing both representations recognizes the dependency between the words and allows for a better understanding of how words relate to one another. The second bidirectional fact-based GRU 167 then outputs sequences 402 for each input word In-Wk.
To compute the hidden state ht, the encoder 166 performs the following calculations:
$r_t = \sigma(W_r x_t + U_r h_{t-1} + F_r d_{enc})$  (Eq. 1)
$z_t = \sigma(W_z x_t + U_z h_{t-1} + F_z d_{enc})$  (Eq. 2)
$\hat{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$  (Eq. 3)
$h_t = (1 - z_t) h_{t-1} + z_t \hat{h}_t$  (Eq. 4)
In the encoder-decoder model shown in FIG. 4, the encoder 166 reads the input sequence of word embeddings X=(x1, x2, . . . , xT) obtained from the paragraph embedder 163 into the sequence of embeddings H=(h1, h2, . . . , hT), while the focused fact received from the GRU 165 a is used to compute the gating for all sequences.
Equations 1-4 may be referred to as the operational stages of the GRU 165 a. W, U, and F are weighting parameters, xt is the input at time t, and ht-1 is the state and the output of the second fact-based gated recurrent unit 167 at time t−1. denc is the time-independent embedding of the fact that is extracted from the GRU 165 a and is the same for all time sequences. ⊙ is the element-wise product operation, and rt and zt are the reset gate and the update gate at time t, respectively. ĥt is the new candidate state at time t, and ht is the final state and output of the GRU 165 a at time t.
Equation 1 calculates the reset gate and determines the importance of ht-1 (the state at t−1) in calculating the summarization ĥt. Equation 2 calculates the update signal and determines how much of ht-1 should be considered in calculating the next state ht at time t. For example, under equation 4, if zt is approximately equal to 0, then ht-1 is almost entirely copied to ht. On the other hand, if zt is approximately equal to 1, then mostly the new memory ĥt is forwarded to calculate the next hidden state.
Equation 3 calculates the new memory ĥt, which is the consolidation of a new input xt with the past hidden state ht-1. This equation essentially combines a newly observed word with a previous state ht-1 to summarize the new word in the context of the previous state. Finally, equation 4 calculates the final state ht, which is the output of the gated recurrent unit 165 a.
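Equations 1-4 and the bidirectional reading of the paragraph may be sketched as follows. This is an illustrative reconstruction, not the disclosed implementation: the class name `FactGRUCell`, the helper functions, the random (untrained) weights, the zero initial state, and the toy dimensions are all assumptions. The fact vector `d` plays the role of denc and is held fixed across time steps.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def add(*vectors):
    return [sum(values) for values in zip(*vectors)]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class FactGRUCell:
    """GRU cell whose reset and update gates also see a fixed fact vector d."""

    def __init__(self, input_dim, hidden_dim, fact_dim, seed=0):
        rng = random.Random(seed)  # random weights stand in for trained ones
        self.hidden_dim = hidden_dim
        self.W_r, self.W_z, self.W = (rand_matrix(rng, hidden_dim, input_dim) for _ in range(3))
        self.U_r, self.U_z, self.U = (rand_matrix(rng, hidden_dim, hidden_dim) for _ in range(3))
        self.F_r, self.F_z = (rand_matrix(rng, hidden_dim, fact_dim) for _ in range(2))

    def step(self, x_t, h_prev, d):
        r_t = [sigmoid(v) for v in add(matvec(self.W_r, x_t), matvec(self.U_r, h_prev),
                                       matvec(self.F_r, d))]                          # Eq. 1
        z_t = [sigmoid(v) for v in add(matvec(self.W_z, x_t), matvec(self.U_z, h_prev),
                                       matvec(self.F_z, d))]                          # Eq. 2
        gated = [r * h for r, h in zip(r_t, h_prev)]
        h_hat = [math.tanh(v) for v in add(matvec(self.W, x_t), matvec(self.U, gated))]  # Eq. 3
        return [(1 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_hat)]            # Eq. 4

def encode_bidirectional(cell_fwd, cell_bwd, xs, d):
    """Concatenate left-to-right and right-to-left states for every word."""
    def run(cell, sequence):
        h, states = [0.0] * cell.hidden_dim, []  # zero initial state (assumption)
        for x in sequence:
            h = cell.step(x, h, d)
            states.append(h)
        return states
    forward = run(cell_fwd, xs)
    backward = run(cell_bwd, xs[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]

rng = random.Random(1)
paragraph = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 4 words, dim 3
fact_vector = [0.5, -0.25]                                              # stand-in for d_enc
states = encode_bidirectional(FactGRUCell(3, 5, 2, seed=2), FactGRUCell(3, 5, 2, seed=3),
                              paragraph, fact_vector)
```

Each entry of `states` has twice the hidden size because the forward and backward states are concatenated, and changing `fact_vector` changes the gates at every time step even though the paragraph is unchanged.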
The functions executed by the decoder 168 are similar to equations 1-4 above and are:
$r_t = \sigma(W_r x_t + U_r h_{t-1} + F_r d_{dec})$  (Eq. 5)
$z_t = \sigma(W_z x_t + U_z h_{t-1} + F_z d_{dec})$  (Eq. 6)
$\hat{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$  (Eq. 7)
$h_t = (1 - z_t) h_{t-1} + z_t \hat{h}_t$  (Eq. 8)
The decoder 168 utilizes the focused fact (received from GRU 165 b) in a similar manner as the encoder 166. The equations 5-8 are similar to equations 1-4 except ddec stands for the independent embedding of the focused fact that is extracted from the GRU 165 b.
The decoder 168 is trained to predict the next word yt (referred to as Q-Wt in FIG. 4) based on the context vector c from the encoder 166 and the previously predicted words (y1, . . . , yt-1). The decoder 168 may define a probability over the prediction sequence Y=(y1, . . . , yT) by decomposing the joint probability that a particular sequence of words appears in the dataset into ordered conditionals:
$p(Y) = \prod_{t=1}^{T} p(y_t \mid \{y_1, \ldots, y_{t-1}\}, c_t, d_{dec})$  (Eq. 9)
In RNNs, each conditional probability is modeled as follows:
$p(y_t \mid \{y_1, \ldots, y_{t-1}\}, c_t, d_{dec}) = f(y_{t-1}, s_t, c_t, d_{dec})$  (Eq. 10)
    • where:
      $s_t = g(y_{t-1}, s_{t-1}, c_t, d_{dec})$,  (Eq. 11)
      $c_t = q(\{h_1, \ldots, h_T\}, s_{t-1})$  (Eq. 12)
      and f, g, and q are nonlinear and potentially multilayered functions, st is the hidden state of the decoder 168 at time t, and ct is the context vector from the encoder 166 at time t which is generated based on the function q. The context vector ct is computed as a weighted sum of the outputs H=(h1, . . . , hT) from the encoder 166 using the equation:
      $c_t = \sum_{i=1}^{T} a_{ti} h_i$  (Eq. 13)
    • where:
      $e_{ti} = \phi(s_{t-1}, h_i)$,  (Eq. 14)
      $a_{ti} = \frac{\exp(e_{ti})}{\sum_{j=1}^{T} \exp(e_{tj})}$  (Eq. 15)
and ϕ is a feedforward neural network that is jointly trained with other components of the model.
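By way of non-limiting illustration, the attention computation of equations 13-15 may be sketched as follows. The specific form chosen for ϕ (a single tanh hidden layer followed by a linear projection), the names `phi`, `attend`, and `init_attention`, and the random weights are assumptions of this sketch; the disclosure only requires that ϕ be a jointly trained feedforward network.

```python
import numpy as np


def init_attention(dec_dim, enc_dim, attn_dim, seed=0):
    """Random weights for the scoring network (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return {
        "W_s": rng.normal(scale=0.1, size=(attn_dim, dec_dim)),
        "W_h": rng.normal(scale=0.1, size=(attn_dim, enc_dim)),
        "v": rng.normal(scale=0.1, size=attn_dim),
    }


def phi(s_prev, h_i, p):
    """Score one encoder output against the previous decoder state (Eq. 14)."""
    return float(p["v"] @ np.tanh(p["W_s"] @ s_prev + p["W_h"] @ h_i))


def attend(s_prev, H, p):
    """Return the normalized weights a_t (Eq. 15) and the context c_t (Eq. 13)."""
    e = np.array([phi(s_prev, h_i, p) for h_i in H])
    w = np.exp(e - e.max())  # subtracting the max leaves the softmax unchanged
    a = w / w.sum()
    c = a @ H                # weighted sum of the encoder outputs
    return a, c


# Toy example: six encoder outputs of width 4, decoder state of width 3.
rng = np.random.default_rng(1)
H = rng.normal(size=(6, 4))
a, c = attend(np.zeros(3), H, init_attention(dec_dim=3, enc_dim=4, attn_dim=5))
print(a.shape, c.shape)  # (6,) (4,)
```

The weights sum to one across the encoder positions, so the context vector is a convex combination of the encoder outputs.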
FIG. 5 illustrates the outputted concatenated sequences 402 of FIG. 4 being communicated to the attention generator 169 of FIG. 4. The attention generator(s) 169 may be configured to compute normalized weights for each sequence (i.e., hidden representation) outputted by the encoder 166.
Specifically, in order to generate the t-th word of the question (Q-Wt), the attention generator 169 may compute the weight and importance of each encoder output according to the previous decoder hidden state using equation 14. Referring back to FIG. 3, this step may be performed by the softmax layer 314. The softmax function mitigates the effect of extreme values or outliers in a dataset without entirely removing them from the dataset.
Then, the attention generator 169 may normalize the computed weights for all sequences and communicate the normalized weights to the third fact-based GRU 170. The third fact-based GRU 170 may use the weighted sum of the encoder's hidden representations to generate a possible word for the question.
Referring back to FIG. 3, the output of the decoder 168, as well as the softmax layer 314, may be communicated to a language model 316. The language model 316 may be configured to at least assess proposed questions.
For example, for each word position in a sequence, the top k probable words will be communicated to the language model 316. The language model 316 may execute a beam search and/or an n-gram based language model (as well as any other suitable type of language model). The score of each word would be the product of the softmax value (as determined by equation 14) and the n-gram probabilities. Finally, the generated question may be supplied to a user via a user interface 140.
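By way of non-limiting illustration, the rescoring of the top k candidate words may be sketched as follows. The bigram table, the candidate words and their softmax values, the `floor` value used for unseen bigrams, and the name `rescore` are all hypothetical; the sketch uses a bigram model as one simple instance of an n-gram model and does not implement a beam search.

```python
def rescore(candidates, prev_word, bigram_prob, floor=1e-6):
    """Rank candidate words by softmax value times bigram probability.

    candidates: list of (word, softmax_value) pairs for one word position.
    bigram_prob: dict mapping (previous_word, word) to a probability.
    floor: probability assumed for bigrams absent from the table.
    """
    scored = [
        (word, p_softmax * bigram_prob.get((prev_word, word), floor))
        for word, p_softmax in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical top-3 candidates following the word "what".
toy_bigrams = {("what", "is"): 0.4, ("what", "the"): 0.1}
toy_candidates = [("is", 0.5), ("the", 0.3), ("what", 0.2)]
ranking = rescore(toy_candidates, "what", toy_bigrams)
print(ranking[0][0])  # is
```

A candidate with a high softmax value but an implausible n-gram context, such as the repeated word in this example, is pushed to the bottom of the ranking.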
As mentioned previously, there may be several configurations of the recurrent neural networks. One implementation may be referred to as an “encoder-decoder” model. In this configuration, the encoder-decoder model may be built on a simple attention-based RNN encoder-decoder framework. The model does not use a factual statement and the RNN generates questions directly from the source text or paragraph.
Another implementation may be referred to as simply a “deep curious” model. The deep curious model may be a fact-based RNN that is built on an attention-based encoder-decoder framework. In this embodiment, the factual statement affects the gating of the encoder and the decoder RNN.
Another implementation may be referred to as an “augmented deep curious” model which is an augmented version of the deep curious model discussed above. In this type of model, the focused fact (i.e., the fact representation that is extracted for the decoder part, ddec) also affects the attention generator of the network. In other words, the attention generator considers the factual statement in addition to the previous hidden state of the decoder and the hidden states of the encoder in accordance with equation 16 below.
$$e_{ti} = \phi(s_{t-1}, h_i) \tag{Eq. 16}$$
Another implementation may be referred to as a “simplified deep curious” model. This model is similar to the deep curious model discussed above, but uses one single fact representation for both the encoder and decoder.
Another implementation may be referred to as an “elementary deep curious” model. The elementary deep curious model is a fact-based RNN that is built on an attention-based encoder-decoder framework. However, the final output from the encoder that passes through the decoder is the element-wise product of the encoder output and the extracted fact representation from the encoder GRU (denc).
Another implementation may be referred to as a “separate deep curious” model. This model is similar to the elementary deep curious model. However, the separate deep curious model uses a separate GRU to extract a fact representation called delem. Then, the model uses delem instead of denc in the element-wise product.
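By way of non-limiting illustration, the element-wise product used by the elementary and separate deep curious models may be sketched as follows. The sketch assumes that the fact representation has the same width as each encoder output, which the description does not state explicitly; the name `gate_encoder_outputs` is hypothetical.

```python
import numpy as np


def gate_encoder_outputs(H, d):
    """Multiply every encoder output element-wise by a fact representation.

    H: array of shape (T, dim), one encoder output per time step.
    d: array of shape (dim,), e.g. denc (elementary) or delem (separate).
    """
    if H.ndim != 2 or d.shape != (H.shape[1],):
        raise ValueError("fact representation must match the encoder output width")
    return H * d  # broadcasting applies d to each time step


H = np.array([[1.0, 2.0], [3.0, 4.0]])
d = np.array([2.0, 0.5])
print(gate_encoder_outputs(H, d))
# [[2. 1.]
#  [6. 2.]]
```

The only difference between the two models in this sketch is which vector is passed as `d`.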
Another implementation may be referred to as a “fact+ encoder-decoder” model. This model is similar to the encoder-decoder model discussed above, but its input is the concatenation of the factual statement and input paragraph.
Another implementation may be referred to as a “fact+ deep curious” model. This model is similar to the deep curious model, but its input is the concatenation of the factual statement and input paragraph.
Another implementation may be referred to as an “augmented+ deep curious” model. This model is similar to the augmented deep curious model, but its input is the concatenation of the factual statement and paragraph.
FIG. 6 depicts a flowchart of a method 600 of generating a question from text in accordance with one embodiment. Step 602 involves receiving textual content using an interface. The textual content may be in the form of a paragraph of free text, for example, and may be received through any suitable interface such as those discussed previously.
Step 604 involves receiving a factual statement associated with the textual content using the interface. The factual statement or “focused fact” indicates to what the generated question should be directed. Accordingly, the question generator module 161 may suggest more relevant questions based on the desires of the user(s).
Step 606 involves generating, using a processor executing instructions stored on a memory to provide a question generator module, a question from the textual content relating to the factual statement. The processor may rely on convolutional neural networks that feed into one or more recurrent neural networks to find the most optimal combination of words to generate the most relevant question. Then, a generated question may be outputted to a user.
FIG. 7 depicts a flowchart of a method 700 of generating a question from text in accordance with another embodiment. Steps 702 and 704 are similar to steps 602 and 604 of FIG. 6, respectively, and are not repeated here. Step 706 involves receiving a question type related to the textual content. The question type may refer to whether the generated question should be a “what” question, a “why” question, a “when” question, etc.
Step 708 involves mapping a sequence of the factual statement to word embeddings. In some embodiments the factual statement consists of a plurality of words. In these embodiments, the processor may execute instructions stored on a memory to provide a fact embedder module. The fact embedder module may be similarly configured to the fact embedder 162 of FIG. 2, for example, and may map the sequence of words of the factual statement to word embeddings.
Step 710 involves processing the word embeddings of the fact embedder module using a first gated recurrent unit to result in a set of computed weights. The first gated recurrent unit may be similar to the GRU 165 a of FIG. 4, for example.
Step 712 involves providing the received textual content to a second bidirectional fact-based gated recurrent unit whose weighting is determined by the set of computed weights. The second bidirectional fact-based gated recurrent unit may be similar to the bidirectional fact-based gated recurrent unit 167 of FIG. 4, for example.
Step 714 involves providing output of the second bidirectional fact-based gated recurrent unit to at least one attention generator, each attention generator computing normalized weights for all sequences of the second gated recurrent unit. The attention generators may be similar to the attention generators 169 of FIG. 4.
Step 716 involves using the computed normalized weights and a third unidirectional fact-based gated recurrent unit to generate a plurality of words forming the question. The third unidirectional fact-based gated recurrent unit may be similar to the third fact-based GRU 170 of FIG. 4.
Finally, the method 700 ends with step 718, which involves generating a question relating to the factual statement using the words generated in step 716. The question may then be outputted to a user.
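By way of non-limiting illustration, steps 708 through 712 may be sketched as follows. Several choices here are assumptions of the sketch rather than features of the disclosure: the fact representation is taken to be the final hidden state of the first gated recurrent unit, the first unit is run with a zero fact vector so that it behaves as an ordinary GRU, the forward and backward outputs are concatenated, and all weights and token ids are random or hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def make_gru(input_dim, hidden_dim, fact_dim, rng):
    """Random GRU weights for illustration only (hypothetical helper)."""
    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))
    p = {"W": mat(hidden_dim, input_dim), "U": mat(hidden_dim, hidden_dim)}
    for gate in ("r", "z"):
        p["W_" + gate] = mat(hidden_dim, input_dim)
        p["U_" + gate] = mat(hidden_dim, hidden_dim)
        p["F_" + gate] = mat(hidden_dim, fact_dim)
    return p


def run_gru(p, xs, d):
    """Apply the fact-based GRU to a sequence; returns states of shape (T, hidden)."""
    h = np.zeros(p["U"].shape[0])
    states = []
    for x in xs:
        r = sigmoid(p["W_r"] @ x + p["U_r"] @ h + p["F_r"] @ d)
        z = sigmoid(p["W_z"] @ x + p["U_z"] @ h + p["F_z"] @ d)
        h_hat = np.tanh(p["W"] @ x + p["U"] @ (r * h))
        h = (1.0 - z) * h + z * h_hat
        states.append(h)
    return np.stack(states)


def encode(fact_ids, paragraph_ids, E, fact_gru, fwd_gru, bwd_gru):
    fact_vecs = E[fact_ids]                                # step 708: word embeddings
    no_fact = np.zeros(fact_gru["F_r"].shape[1])           # zero vector disables the F terms
    d_enc = run_gru(fact_gru, fact_vecs, no_fact)[-1]      # step 710: fact representation
    para_vecs = E[paragraph_ids]
    forward = run_gru(fwd_gru, para_vecs, d_enc)           # step 712: forward pass
    backward = run_gru(bwd_gru, para_vecs[::-1], d_enc)[::-1]  # backward pass, re-aligned
    return np.concatenate([forward, backward], axis=1)


rng = np.random.default_rng(0)
E = rng.normal(size=(10, 4))                 # toy vocabulary of 10 words, width 4
fact_gru = make_gru(4, 3, 1, rng)
fwd_gru, bwd_gru = make_gru(4, 5, 3, rng), make_gru(4, 5, 3, rng)
H = encode([1, 2, 3], [4, 5, 6, 7, 8, 9, 0], E, fact_gru, fwd_gru, bwd_gru)
print(H.shape)  # (7, 10)
```

Each row of `H` combines a forward and a backward representation of one paragraph position, and these rows are what an attention generator would weight in step 714.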
The methods, systems, and devices discussed above are examples. Various configurations may omit, substitute, or add various procedures or components as appropriate. For instance, in alternative configurations, the methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Also, features described with respect to certain configurations may be combined in various other configurations. Different aspects and elements of the configurations may be combined in a similar manner. Also, technology evolves and, thus, many of the elements are examples and do not limit the scope of the disclosure or claims.
Embodiments of the present disclosure, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the present disclosure. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Additionally, or alternatively, not all of the blocks shown in any flowchart need to be performed and/or executed. For example, if a given flowchart has five blocks containing functions/acts, it may be the case that only three of the five blocks are performed and/or executed. In this example, any three of the five blocks may be performed and/or executed.
A statement that a value exceeds (or is more than) a first threshold value is equivalent to a statement that the value meets or exceeds a second threshold value that is slightly greater than the first threshold value, e.g., the second threshold value being one value higher than the first threshold value in the resolution of a relevant system. A statement that a value is less than (or is within) a first threshold value is equivalent to a statement that the value is less than or equal to a second threshold value that is slightly lower than the first threshold value, e.g., the second threshold value being one value lower than the first threshold value in the resolution of the relevant system.
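By way of non-limiting illustration, the equivalence may be checked for a system whose resolution is one integer unit. The threshold of 5 and the tested range are arbitrary values chosen for this sketch.

```python
def exceeds(value, threshold):
    return value > threshold


def meets_or_exceeds(value, threshold):
    return value >= threshold


def is_less_than(value, threshold):
    return value < threshold


def is_at_most(value, threshold):
    return value <= threshold


# With integer resolution, "exceeds 5" matches "meets or exceeds 6",
# and "less than 5" matches "less than or equal to 4".
first_threshold = 5
values = range(-10, 11)
upper_ok = all(exceeds(v, first_threshold) == meets_or_exceeds(v, first_threshold + 1) for v in values)
lower_ok = all(is_less_than(v, first_threshold) == is_at_most(v, first_threshold - 1) for v in values)
print(upper_ok, lower_ok)  # True True
```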
Specific details are given in the description to provide a thorough understanding of example configurations (including implementations). However, configurations may be practiced without these specific details. For example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the configurations. This description provides example configurations only, and does not limit the scope, applicability, or configurations of the claims. Rather, the preceding description of the configurations will provide those skilled in the art with an enabling description for implementing described techniques. Various changes may be made in the function and arrangement of elements without departing from the spirit or scope of the disclosure.
Having described several example configurations, various modifications, alternative constructions, and equivalents may be used without departing from the spirit of the disclosure. For example, the above elements may be components of a larger system, wherein other rules may take precedence over or otherwise modify the application of various implementations or techniques of the present disclosure. Also, a number of steps may be undertaken before, during, or after the above elements are considered.
Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate embodiments falling within the general inventive concept discussed in this application that do not depart from the scope of the following claims.

Claims (15)

What is claimed is:
1. A method of generating a question from text, the method comprising:
receiving textual content using an interface;
receiving a factual statement associated with the textual content using the interface; and
generating, using a processor executing instructions stored on a memory to provide a question generator module including an encoder and a decoder, the question from the textual content relating to the factual statement, wherein the factual statement indicates to what the generated question should be directed, a language model is configured to assess proposed questions based upon output of the decoder and a softmax layer, and analysis of the textual content combines both a forward representation of the textual content and a backward representation of the textual content;
wherein the softmax layer computes weight and importance of each output of the encoder according to a preceding hidden state of the decoder.
2. The method of claim 1, further comprising:
receiving a question type related to the textual content using the interface.
3. The method of claim 1, wherein the factual statement consists of a plurality of words, and the method further comprises:
mapping, using a processor executing instructions stored on a memory to provide a fact embedder module, a sequence of the plurality of words to word embeddings.
4. The method of claim 3, further comprising:
processing the word embeddings of the fact embedder module using a first gated recurrent unit to result in a set of computed weights.
5. The method of claim 4, further comprising:
providing the received textual content to a second bidirectional fact-based gated recurrent unit whose weighting is determined by the set of computed weights.
6. The method of claim 5, further comprising:
providing output of the second bidirectional fact-based gated recurrent unit to at least one attention generator, each attention generator computing normalized weights for all sequences of the second gated recurrent unit.
7. The method of claim 6, further comprising:
using the computed normalized weights and a third unidirectional fact-based gated recurrent unit to generate a plurality of words forming the question.
8. The method of claim 6, wherein the at least one attention generator utilizes the set of computed weights in determining the normalized weights.
9. The method of claim 5, wherein the second bidirectional fact-based gated recurrent unit comprises a convolutional neural network feeding forward into a plurality of recurrent neural networks.
10. A system for generating a question from text, the system comprising:
an interface configured to receive textual content and a factual statement associated with the textual content; and
a processor configured to execute instructions stored on a memory to provide a question generator module including an encoder and a decoder that is configured to generate the question from the textual content relating to the factual statement, wherein the factual statement indicates to what the generated question should be directed, a language model is configured to assess proposed questions based upon output of the decoder and a softmax layer, and analysis of the textual content combines both a forward representation of the textual content and a backward representation of the textual content;
wherein the softmax layer computes weight and importance of each output of the encoder according to a preceding hidden state of the decoder.
11. The system of claim 10, wherein the question generator module is configured to receive input that is a concatenation of the factual statement and a paragraph.
12. The system of claim 11, further comprising:
an attention generator configured to consider the factual statement and previous hidden states of the question generator module.
13. The system of claim 11, wherein the question generator module is configured to use a single representation of the factual statement for the encoder and the decoder.
14. The system of claim 10, wherein an input to the decoder is the element-wise product of an encoder output and an extracted fact representation from the encoder.
15. A non-transitory computer readable medium containing computer-executable instructions for performing a method of generating a question from text, the non-transitory computer readable medium comprising:
computer-executable instructions for receiving textual content using an interface;
computer-executable instructions for receiving a factual statement associated with the textual content using the interface; and
computer-executable instructions for generating, using a processor executing instructions stored on a memory to provide a question generator module including an encoder and a decoder, the question from the textual content relating to the factual statement, a language model is configured to assess proposed questions based upon output of the decoder and a softmax layer, and analysis of the textual content combines both a forward representation of the textual content and a backward representation of the textual content;
wherein the softmax layer computes weight and importance of each output of the encoder according to a preceding hidden state of the decoder.
US16/334,135 2016-09-29 2017-09-29 Question generation Active 2038-02-28 US11294942B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/334,135 US11294942B2 (en) 2016-09-29 2017-09-29 Question generation

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662401293P 2016-09-29 2016-09-29
PCT/EP2017/074818 WO2018060450A1 (en) 2016-09-29 2017-09-29 Question generation
US16/334,135 US11294942B2 (en) 2016-09-29 2017-09-29 Question generation

Publications (2)

Publication Number Publication Date
US20200183963A1 US20200183963A1 (en) 2020-06-11
US11294942B2 true US11294942B2 (en) 2022-04-05

Family

ID=59997363

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/334,135 Active 2038-02-28 US11294942B2 (en) 2016-09-29 2017-09-29 Question generation

Country Status (2)

Country Link
US (1) US11294942B2 (en)
WO (1) WO2018060450A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210375404A1 (en) * 2019-06-05 2021-12-02 Boe Technology Group Co., Ltd. Medical question-answering method, medical question-answering system, electronic device, and computer readable storage medium

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110383299B (en) * 2017-02-06 2023-11-17 渊慧科技有限公司 Memory enhanced generation time model
US10902205B2 (en) * 2017-10-25 2021-01-26 International Business Machines Corporation Facilitating automatic detection of relationships between sentences in conversations
US11422996B1 (en) * 2018-04-26 2022-08-23 Snap Inc. Joint embedding content neural networks
CN109033073B (en) * 2018-06-28 2020-07-28 中国科学院自动化研究所 Text inclusion recognition method and device based on vocabulary dependency triple
CN110147532B (en) * 2019-01-24 2023-08-25 腾讯科技(深圳)有限公司 Encoding method, apparatus, device and storage medium
US11526557B2 (en) 2019-11-27 2022-12-13 Amazon Technologies, Inc. Systems, apparatuses, and methods for providing emphasis in query results
US11475067B2 (en) * 2019-11-27 2022-10-18 Amazon Technologies, Inc. Systems, apparatuses, and methods to generate synthetic queries from customer data for training of document querying machine learning models
US11366855B2 (en) 2019-11-27 2022-06-21 Amazon Technologies, Inc. Systems, apparatuses, and methods for document querying
CN112115687B (en) * 2020-08-26 2024-04-26 华南理工大学 Method for generating problem by combining triplet and entity type in knowledge base
US11687539B2 (en) 2021-03-17 2023-06-27 International Business Machines Corporation Automatic neutral point of view content generation
US11704499B2 (en) 2021-04-13 2023-07-18 Microsoft Technology Licensing, Llc Generating questions using a resource-efficient neural network
TW202314579A (en) * 2021-09-17 2023-04-01 財團法人資訊工業策進會 Machine reading comprehension apparatus and method
CN115600587B (en) * 2022-12-16 2023-04-07 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Mathematics application question generation system and method, intelligent terminal and readable storage medium

Also Published As

Publication number Publication date
WO2018060450A1 (en) 2018-04-05
US20200183963A1 (en) 2020-06-11

Similar Documents

Publication Publication Date Title
US11294942B2 (en) Question generation
Uc-Cetina et al. Survey on reinforcement learning for language processing
US11087092B2 (en) Agent persona grounded chit-chat generation framework
CN109844743B (en) Generating responses in automated chat
EP3516591B1 (en) Neural machine translation systems
CN106997370B (en) Author-based text classification and conversion
US20200042597A1 (en) Generating question-answer pairs for automated chatting
US20180329884A1 (en) Neural contextual conversation learning
CN108595629B (en) Data processing method and application for answer selection system
US20210248450A1 (en) Sorting attention neural networks
US11741190B2 (en) Multi-dimensional language style transfer
CN109313650A (en) Response is generated in automatic chatting
Gale et al. Experiments in Character-Level Neural Network Models for Punctuation.
Xiong et al. DGI: recognition of textual entailment via dynamic gate matching
CN110245349A (en) A kind of syntax dependency parsing method, apparatus and a kind of electronic equipment
JP2022503812A (en) Sentence processing method, sentence decoding method, device, program and equipment
CN113807512B (en) Training method and device for machine reading understanding model and readable storage medium
US20200364543A1 (en) Computationally efficient expressive output layers for neural networks
US12125271B2 (en) Image paragraph description generating method and apparatus, medium and electronic device
US20240232572A1 (en) Neural networks with adaptive standardization and rescaling
CN113177393B (en) Method and apparatus for pre-training language model for improved understanding of web page structure
CN111767720B (en) Title generation method, computer and readable storage medium
WO2024119831A1 (en) Question generation method, and generation apparatus, computer device and storage medium
US20200134023A1 (en) Learning method and generating apparatus
CN112905754A (en) Visual conversation method and device based on artificial intelligence and electronic equipment

Legal Events

Date Code Title Description
AS Assignment

Owner name: KONINKLIJKE PHILIPS N.V., NETHERLANDS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GHAEINI, REZA;AL HASAN, SHEIKH SADID;FARRI, OLADIMEJI FEYISETAN;AND OTHERS;SIGNING DATES FROM 20181204 TO 20190112;REEL/FRAME:048623/0534

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE