WO2019244803A1 - Answer learning device, answer learning method, answer generation device, answer generation method, and program - Google Patents

Answer learning device, answer learning method, answer generation device, answer generation method, and program

Info

Publication number
WO2019244803A1
WO2019244803A1 PCT/JP2019/023755 JP2019023755W WO2019244803A1 WO 2019244803 A1 WO2019244803 A1 WO 2019244803A1 JP 2019023755 W JP2019023755 W JP 2019023755W WO 2019244803 A1 WO2019244803 A1 WO 2019244803A1
Authority
WO
WIPO (PCT)
Prior art keywords
answer
unit
sentence
question
polarity
Prior art date
Application number
PCT/JP2019/023755
Other languages
English (en)
Japanese (ja)
Inventor
光甫 西田
京介 西田
淳史 大塚
いつみ 斉藤
久子 浅野
準二 富田
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP2019032127A external-priority patent/JP2019220142A/ja
Application filed by 日本電信電話株式会社 filed Critical 日本電信電話株式会社
Priority to US17/254,187 priority Critical patent/US20210125516A1/en
Publication of WO2019244803A1 publication Critical patent/WO2019244803A1/fr


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present invention relates to an answer learning device, an answer learning method, an answer generation device, an answer generation method, and a program, and in particular to an answer learning device, an answer learning method, an answer generation device, an answer generation method, and a program for answering a question sentence with a polarity.
  • As background art, there exist machine reading comprehension models such as BiDAF (Non-Patent Document 1) and data sets for machine reading comprehension such as SQuAD (Non-Patent Document 2).
  • SQuAD is a data set for an extraction-type task, in which a one-paragraph sentence is linked to each question and the answer, written in the sentence, is extracted as it is.
  • The present invention has been made in view of the above points, and has as its object to provide an answer generation device, an answer generation method, and a program that can accurately answer, with a polarity, a question that can be answered with a polarity.
  • The present invention also has as its object to provide an answer learning device, an answer learning method, and a program that can accurately learn a model for answering, with a polarity, a question that can be answered with a polarity.
  • An answer generation device according to the present invention includes: a machine reading comprehension unit that estimates, based on an input sentence and question sentence, the start and end of the range serving as the basis of the answer to the question sentence in the sentence, using a reading comprehension model trained in advance for estimating the range; and a determination unit that determines the polarity of the answer to the question sentence, based on information obtained by the processing of the machine reading comprehension unit, using a determination model trained in advance for determining whether the polarity of the answer to the question sentence is positive.
  • In an answer generation method according to the present invention, the machine reading comprehension unit estimates, based on the input sentence and question sentence, the start and end of the range using the reading comprehension model trained in advance for estimating the range serving as the basis of the answer to the question sentence in the sentence; and the determination unit determines the polarity of the answer to the question sentence, based on information obtained by the processing of the machine reading comprehension unit, using the determination model trained in advance for determining whether the polarity of the answer to the question sentence is positive.
  • In this way, by estimating the start and end of the range using the reading comprehension model for estimating the range serving as the basis of the answer to the question sentence in the sentence, and determining the polarity of the answer to the question sentence based on the information obtained by the estimation processing, using the determination model trained in advance for determining whether the polarity of the answer is positive, a question that can be answered with a polarity can be accurately answered with a polarity.
  • The reading comprehension model and the determination model of the answer generation device may be neural networks. In that case, the machine reading comprehension unit receives the sentence and the question sentence as input, generates a reading matrix using the reading comprehension model based on the results of encoding the sentence and the question sentence, and estimates the start and end of the range using the reading matrix; and the determination unit determines the polarity of the answer to the question sentence using the determination model, which determines whether the polarity of the answer is positive based on the reading matrix generated by the machine reading comprehension unit.
  • The answer generation device may further include a question determination unit that determines whether the question sentence is a question that can be answered with a polarity. When the question determination unit determines that the question sentence is a question that can be answered with a polarity, the determination unit can determine the polarity of the answer to the question sentence using the determination model.
  • In the answer generation device according to the present invention, the polarity of the answer may be Yes or No, or OK or NG.
  • The answer generation device may further include an output unit and a basis extraction unit that extracts basis information serving as the basis of the answer to the question sentence, using an extraction model and based on the information obtained by the processing of the machine reading comprehension unit. The output unit can output the polarity of the answer determined by the determination unit, the basis information extracted by the basis extraction unit, and the answer.
  • In the answer generation device, the determination model may be a model for determining whether the answer to the question sentence is of positive polarity, of non-positive polarity, or not answerable with a polarity. In that case, the determination unit determines, using the determination model, whether the answer to the question sentence is of positive polarity, of non-positive polarity, or not answerable with a polarity, and when it is determined that the answer is not answerable with a polarity, the output unit can output the basis information extracted by the basis extraction unit as the answer.
  • An answer learning device according to the present invention includes: an input unit that receives input of learning data including a sentence, a question sentence, a correct answer indicating the polarity of the answer to the question sentence in the sentence, and the start and end of the range serving as the basis of the answer in the sentence; a machine reading comprehension unit that estimates the start and end of the range using a reading comprehension model for estimating the range, based on the sentence and the question sentence; a determination unit that determines the polarity of the answer to the question sentence using a determination model that determines whether the polarity of the answer to the question sentence is positive, based on information obtained by the processing of the machine reading comprehension unit; and a parameter learning unit that learns the parameters of the reading comprehension model and the determination model so that the correct answer included in the learning data coincides with the result determined by the determination unit, and the start and end included in the learning data coincide with the start and end estimated by the machine reading comprehension unit.
  • In an answer learning method according to the present invention, the input unit receives input of learning data including a sentence, a question sentence, a correct answer indicating the polarity of the answer to the question sentence in the sentence, and the start and end of the range serving as the basis of the answer in the sentence; the machine reading comprehension unit estimates the start and end of the range using a reading comprehension model for estimating the range, based on the sentence and the question sentence; the determination unit determines the polarity of the answer to the question sentence using a determination model that determines whether the polarity of the answer to the question sentence is positive, based on information obtained by the estimation processing; and the parameter learning unit learns the parameters of the reading comprehension model and the determination model so that the correct answer included in the learning data coincides with the result determined by the determination unit, and the start and end included in the learning data coincide with the start and end estimated by the machine reading comprehension unit.
  • The machine reading comprehension unit of the answer learning device may further include a basis extraction unit that extracts basis information of the answer to the question sentence, using an extraction model that extracts the basis information serving as the basis of the answer to the question sentence, based on the information obtained by the processing. In that case, the learning data further includes basis information of the answer in the sentence, and the parameter learning unit can further learn the parameters of the extraction model so that the basis information of the answer in the sentence included in the learning data matches the basis information extracted by the basis extraction unit.
  • A program according to the present invention is a program for causing a computer to function as each unit of the above-described answer learning device or answer generation device.
  • According to the answer generation device, the answer generation method, and the program of the present invention, a question that can be answered with a polarity can be accurately answered with a polarity.
  • According to the answer learning device, the answer learning method, and the program of the present invention, a model for answering, with a polarity, a question that can be answered with a polarity can be learned accurately.
  • FIG. 1 is a functional block diagram showing the configuration of the answer learning device according to the first embodiment of the present invention. FIG. 2 is a flowchart illustrating the answer learning processing routine of the answer learning device according to the first embodiment of the present invention. FIG. 3 is a functional block diagram showing the configuration of the answer generation device according to the first embodiment of the present invention. FIG. 4 is a flowchart illustrating the answer generation processing routine of the answer generation device according to the first embodiment of the present invention. FIG. 5 is a functional block diagram showing the configuration of the answer learning device according to the second embodiment of the present invention. FIG. 6 is a flowchart showing the answer learning processing routine of the answer learning device according to the second embodiment of the present invention.
  • The first embodiment of the present invention provides a new task setting in which, for a question that can be answered with a polarity such as Yes or No, the answer to the input question is output with that polarity, in a format that is not described in the text. In the present embodiment, the case where the polarity of the answer is Yes or No will be described as an example. The task of answering with Yes or No in this setting is a completely new task for which no existing research exists.
  • As a typical data set for machine reading comprehension, MS-MARCO (Reference 1) exists in addition to SQuAD (Non-Patent Document 2). MS-MARCO is a data set in which nearly ten paragraphs are linked to one question, and a human generates an answer from the paragraph group. Such a task, which outputs an answer to a question in a format not described in the text, is called a generation-type task. The generation-type task is more difficult than the extraction-type task because of the requirement of outputting the answer in a format not written in the text. As a method for the generation-type task, S-Net (Reference 2) exists. [Reference 2] Chuanqi Tan, Furu Wei, Nan Yang, Bowen Du, Weifeng Lv, Ming Zhou, "S-NET: From Answer Extraction to Answer Generation for Machine Reading Comprehension", 2017.
  • In outline, the answer learning device converts the sentence P and the question sentence Q, which are word sequences, into vector sequences; the machine reading comprehension unit converts the sentence P into an answer range score (s_d : s_e) using machine reading comprehension technology; the determination unit, which is a new technique, converts the vector sequences and the answer range score into a determination score; and learning is performed using both the answer range score and the determination score.
  • Since the neural networks of the machine reading comprehension unit and the determination unit share layers, learning can proceed from both sides: machine reading comprehension informed by the Yes/No determination, and Yes/No determination informed by the reading comprehension.
  • FIG. 1 is a block diagram showing a configuration of an answer learning device 10 according to the first embodiment of the present invention.
  • The answer learning device 10 is configured by a computer including a CPU, a RAM, and a ROM storing a program for executing an answer learning processing routine described later, and is functionally configured as follows. As shown in FIG. 1, the answer learning device 10 according to the present embodiment includes an input unit 100, an analysis unit 200, and a parameter learning unit 300.
  • The input unit 100 receives input of a plurality of pieces of learning data, each including a sentence P, a question sentence Q, a correct answer Y indicating the polarity of the answer to the question sentence in the sentence P, and the start D and end E of the range serving as the basis of the answer in the sentence P. That is, each piece of learning data includes the sentence P and the question sentence Q composed of text data, the correct answer Y indicating whether the answer is Yes or No, and the range (D:E). D and E are represented by position numbers of words in the sentence P: D is the position number of the word at the start of the range that is the basis of the answer, and E is the position number of the word at the end of that range.
  • The sentence P and the question sentence Q, which are text data, are represented as token sequences by an existing tokenizer. Note that any unit can be used as the token; in the present embodiment, a token is described as a word. The lengths of the sentence P and the question sentence Q expressed as word sequences are defined by their numbers of words: the number of words of the sentence P is L_P and that of the question sentence Q is L_Q.
  • A plurality of pieces of learning data may be processed collectively for each mini-batch, or processed one piece at a time.
  • The input unit 100 passes the sentence P and question sentence Q of the received learning data to the machine reading comprehension unit 210, and the learning data to the parameter learning unit 300.
  • The analysis unit 200 includes a machine reading comprehension unit 210 and a determination unit 220.
  • The machine reading comprehension unit 210 estimates the start s_d and end s_e of the range D:E serving as the basis of the answer in the sentence P, based on the sentence P and the question sentence Q, using a reading comprehension model for estimating the range.
  • Specifically, the machine reading comprehension unit 210 includes a word encoding unit 211, a word database (DB) 212, a first context encoding unit 213, an attention unit 214, a second context encoding unit 215, and a ground search unit 216.
  • The word encoding unit 211 generates word vector sequences P1 and Q1 based on the sentence P and the question sentence Q. Specifically, the word encoding unit 211 extracts from the word DB 212 the vector corresponding to each word of the sentence P and the question sentence Q, and generates the word vector sequences P1 and Q1. The word vector sequence P1 is an L_P × d matrix, and the word vector sequence Q1 is an L_Q × d matrix. The word encoding unit 211 passes the generated word vector sequences P1 and Q1 to the first context encoding unit 213.
  • The word DB 212 stores a plurality of word vectors. A word vector is a real-valued vector of a predetermined dimension representing a word.
  • The word DB 212 uses a plurality of word vectors (word embeddings) learned in advance by a neural network; an existing set such as word2vec or GloVe may be used. A newly learned word vector may also be concatenated to a word vector extracted from a plurality of existing word vectors, and any word-embedding technique, such as a technique that encodes the character information of a word (Reference 3), can be used. Word vectors can also be learned from the gradients computed by the backpropagation method.
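  • As a concrete illustration of the word encoding unit 211 and the word DB 212, the following is a minimal sketch in Python/PyTorch. It assumes the word DB is a simple dict from token to pretrained vector and that unknown words map to a zero vector; these choices, and all names and dimensions, are illustrative, not specified by the patent.

```python
import torch

# Minimal sketch of the word encoding unit (211) with the word DB (212)
# held as a dict from token to pretrained vector (e.g., word2vec/GloVe).
# The zero-vector fallback for unknown words is an assumption.
def encode_words(tokens, word_db, dim=300):
    vectors = []
    for tok in tokens:
        vec = word_db.get(tok)
        if vec is None:
            vec = torch.zeros(dim)   # unknown word -> zero vector
        vectors.append(vec)
    return torch.stack(vectors)      # shape (len(tokens), dim), i.e. L_P x d

word_db = {"paris": torch.randn(300), "france": torch.randn(300)}
P1 = encode_words(["paris", "is", "in", "france"], word_db)
print(P1.shape)  # torch.Size([4, 300])
```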
  • The first context encoding unit 213 converts the word vector sequences P1 and Q1 into vector sequences P2 and Q2 using an RNN. An existing technique such as LSTM can be used for the structure of the RNN. Specifically, the first context encoding unit 213 uses a bidirectional RNN that combines two types of RNNs: an RNN that processes the vector sequence in the forward direction and an RNN that processes it in the reverse direction. The vector sequence P2 converted by the first context encoding unit 213 is a matrix of size L_P × d1, and the vector sequence Q2 is a matrix of size L_Q × d1. The first context encoding unit 213 passes the converted vector sequences P2 and Q2 to the attention unit 214, and the vector sequence Q2 to the input conversion unit 221.
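  • The following is a minimal sketch of the first context encoding unit 213 as a bidirectional LSTM, assuming PyTorch; the hidden size is an illustrative choice (so d1 = 2 × hidden), not a value fixed by the patent.

```python
import torch
from torch import nn

# Sketch of the first context encoding unit (213): a bidirectional LSTM
# maps word vectors P1 (L_P x d) and Q1 (L_Q x d) to contextual
# sequences P2 and Q2 of width d1 = 2 * hidden.
class ContextEncoder(nn.Module):
    def __init__(self, d=300, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(d, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):            # x: (batch, length, d)
        out, _ = self.rnn(x)
        return out                   # (batch, length, 2 * hidden)

enc = ContextEncoder()
P1 = torch.randn(1, 40, 300)         # one sentence, L_P = 40
Q1 = torch.randn(1, 12, 300)         # one question, L_Q = 12
P2, Q2 = enc(P1), enc(Q1)
print(P2.shape, Q2.shape)            # (1, 40, 256) (1, 12, 256)
```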
  • The attention unit 214 uses a neural network to generate, based on the vector sequences P2 and Q2, a reading matrix B, a sequence of vectors representing the attention between the sentence P and the question sentence Q.
  • Specifically, the attention unit 214 first calculates an attention matrix A of size L_P × L_Q from the vector sequences P2 and Q2; for the attention matrix A, for example, the following equation (1) can be used.
  • Next, the attention unit 214 calculates, based on the attention matrix A, an attention vector from the sentence P to the question sentence Q and an attention vector from the question sentence Q to the sentence P (equation (3) below). Here, softmax denotes the softmax function, and softmax_i means that softmax is taken in the i direction. For the attention vector from the question sentence Q to the sentence P, a weight vector of length L_P is first obtained by taking, as its i-th element (1 ≤ i ≤ L_P), the maximum value of the i-th row of the attention matrix A (the max in the j direction) and applying softmax; in equation (3), a vector of length d1 is then obtained by taking each component of this weight vector as a weight and summing the weighted rows of P2.
  • The attention unit 214 then determines, based on the vector sequence P2 and the two attention vectors, a reading matrix B of length L_P representing the result of the attention, where ',' is an operator that concatenates vectors and matrices horizontally.
  • The attention unit 214 passes the reading matrix B to the input conversion unit 221 and the second context encoding unit 215.
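  • Since equations (1) to (3) are given in the drawings and not reproduced here, the following sketch shows one plausible BiDAF-style instantiation of the attention unit 214 (dot-product attention matrix, sentence-to-question and question-to-sentence attention, horizontal concatenation into B); it should be read as an assumption-laden illustration, not the patent's exact formulas.

```python
import torch
import torch.nn.functional as F

# Sketch of the attention unit (214) in the BiDAF style the text cites.
def bidaf_attention(P2, Q2):
    # P2: (L_P, d1), Q2: (L_Q, d1)
    A = P2 @ Q2.t()                          # attention matrix A: (L_P, L_Q)
    # sentence-to-question attention: each word of P attends over Q
    q_att = F.softmax(A, dim=1) @ Q2         # (L_P, d1)
    # question-to-sentence attention: weights from the row-wise max of A
    beta = F.softmax(A.max(dim=1).values, dim=0)   # (L_P,)
    p_att = (beta.unsqueeze(1) * P2).sum(dim=0)    # (d1,) weighted sum of rows of P2
    p_att = p_att.expand_as(P2)              # tiled back to length L_P
    # reading matrix B: horizontal concatenation (the "," operator)
    return torch.cat([P2, q_att, P2 * q_att, P2 * p_att], dim=1)  # (L_P, 4*d1)

B = bidaf_attention(torch.randn(40, 256), torch.randn(12, 256))
print(B.shape)  # torch.Size([40, 1024])
```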
  • The second context encoding unit 215 converts the reading matrix B generated by the attention unit 214 into a reading matrix M, a sequence of vectors, using a neural network. Specifically, the second context encoding unit 215 converts the reading matrix B into the reading matrix M by an RNN; as in the first context encoding unit 213, an existing technique such as LSTM can be used for the structure of the RNN. The second context encoding unit 215 passes the converted reading matrix M to the input conversion unit 221 and the ground search unit 216.
  • The ground search unit 216 estimates the start s_d and end s_e of the range D:E serving as the basis of the answer in the sentence P, based on the reading matrix M, using the reading comprehension model.
  • Specifically, the ground search unit 216 is composed of two neural networks: a start RNN for estimating the start s_d of the range serving as the basis of the answer, and an end RNN for estimating the end s_e.
  • The ground search unit 216 first inputs the reading matrix M to the start RNN to obtain a vector sequence M1, and obtains the start s_d of the range serving as the basis of the answer using the following equation (4).
  • The start s_d is a score relating to the start of the range serving as the basis of the answer, and is represented as a vector; that is, it represents the probability (score) that the word corresponding to each dimension of the vector is the start of the answer range.
  • Similarly, the ground search unit 216 inputs the reading matrix M to the end RNN to obtain a vector sequence M2, and obtains the end s_e of the range serving as the basis of the answer using the following equation (5).
  • The end s_e is a score relating to the end of the range serving as the basis of the answer, and is represented as a vector; that is, it represents the probability (score) that the word corresponding to each dimension of the vector is the end of the answer range.
  • The ground search unit 216 passes the estimated answer range score to the input conversion unit 221 and the parameter learning unit 300.
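  • A minimal sketch of the ground search unit 216 follows: two RNN-plus-linear heads over the reading matrix M yield per-word start and end scores, normalized by softmax. The hidden sizes and the softmax normalization are assumptions consistent with equations (4) and (5) being score vectors over words.

```python
import torch
from torch import nn

# Sketch of the ground search unit (216): start RNN and end RNN over M,
# each followed by a linear head producing one score per word.
class SpanScorer(nn.Module):
    def __init__(self, d_m=1024, hidden=128):
        super().__init__()
        self.start_rnn = nn.LSTM(d_m, hidden, batch_first=True, bidirectional=True)
        self.end_rnn = nn.LSTM(d_m, hidden, batch_first=True, bidirectional=True)
        self.start_head = nn.Linear(2 * hidden, 1)
        self.end_head = nn.Linear(2 * hidden, 1)

    def forward(self, M):                    # M: (batch, L_P, d_m)
        M1, _ = self.start_rnn(M)            # sequence feeding the start scores
        M2, _ = self.end_rnn(M)              # sequence feeding the end scores
        s_d = self.start_head(M1).squeeze(-1).softmax(dim=-1)  # cf. eq. (4)
        s_e = self.end_head(M2).squeeze(-1).softmax(dim=-1)    # cf. eq. (5)
        return s_d, s_e                      # each (batch, L_P)

scorer = SpanScorer()
s_d, s_e = scorer(torch.randn(1, 40, 1024))
print(s_d.shape, s_e.shape)                  # (1, 40) (1, 40)
```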
  • The determination unit 220 determines the polarity of the answer to the question sentence Q using a determination model that determines whether the polarity of the answer to the question sentence Q is positive, based on the information obtained by the processing of the machine reading comprehension unit 210.
  • Specifically, the determination unit 220 includes an input conversion unit 221 and a score calculation unit 222.
  • The input conversion unit 221 generates vector sequences P3 and Q3 based on the result of encoding the sentence P and the result of encoding the question sentence Q by the machine reading comprehension unit 210.
  • Specifically, the input conversion unit 221 first receives input of information obtained by the processing of the machine reading comprehension unit 210. The information whose input is accepted can be classified into four types: (1) a vector sequence of length L_P that is the encoding result of the sentence P taking the question sentence Q into consideration (for example, the reading matrix B or M); (2) a vector sequence of length L_Q that is the encoding result of the question sentence Q (for example, the vector sequence Q2); (3) vectors of length L_P that are information about the answer range (for example, the estimated start s_d and end s_e); and (4) a matrix of size L_P × L_Q that is the result of semantic matching between the sentence P and the question sentence Q (for example, the attention matrix A).
  • The object of the present embodiment can be achieved as long as at least one input of type (1) (the reading matrix B or M) is used; inputs of types (2), (3), and (4) may additionally be received, either one of them or several. Below, a simple form that accepts (1) the reading matrix B and (2) the vector sequence Q2 will be described as an example.
  • The input conversion unit 221 calculates, based on the accepted reading matrix B and vector sequence Q2, a vector sequence P3 of length L_P and a vector sequence Q3 of length L_Q. Any neural network can be used to obtain the vector sequences P3 and Q3; for example, the following equations (6) and (7) can be used. The number of dimensions d3 can be set arbitrarily; for example, d3 may be set to match the dimension of Q2. The input conversion unit 221 passes the generated vector sequences P3 and Q3 to the score calculation unit 222.
  • The score calculation unit 222 determines the polarity of the answer to the question sentence Q using the determination model, which determines whether the polarity of the answer to the question sentence Q is positive.
  • Specifically, the score calculation unit 222 performs, based on the vector sequences P3 and Q3 and using the framework of an arbitrary sentence-pair classification task, binary classification of whether the answer to the question sentence Q is Yes or No, and obtains a judgment score k (a real number from 0 to 1).
  • For the classification problem, for example, the framework after the decoder LSTM of ESIM (Reference 4), a representative model of textual entailment recognition, which is one of the sentence-pair classification tasks, can be used. [Reference 4] Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen, "Enhanced LSTM for Natural Language Inference", arXiv:1609.06038, 2017.
  • Specifically, average pooling (an operation taking the mean in the column direction) or max pooling (an operation taking the maximum value in the column direction) is applied to the vector sequences P3 and Q3 to obtain a vector J. The vector J is converted into a real number (a one-dimensional vector) by a multilayer perceptron, and a sigmoid transformation is applied to obtain the judgment score k.
  • Note that the problem need not be the binary Yes/No classification; it may be a three-way classification into Yes, No, or unknown. In that case, the result of a softmax transformation may be used as the judgment score k.
  • The score calculation unit 222 passes the judgment score k to the parameter learning unit 300.
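  • The following is a minimal sketch of the score calculation in the determination unit 220, using the ESIM-style pooling described above (mean and max pooling of P3 and Q3, a multilayer perceptron, and a sigmoid). Layer sizes are illustrative assumptions.

```python
import torch
from torch import nn

# Sketch of the determination model: pool P3 and Q3 in the column
# direction, concatenate into J, then MLP + sigmoid gives the judgment
# score k in [0, 1] (closer to 1 is interpreted as "Yes").
class JudgmentModel(nn.Module):
    def __init__(self, d3=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4 * d3, d3), nn.ReLU(), nn.Linear(d3, 1))

    def forward(self, P3, Q3):               # P3: (L_P, d3), Q3: (L_Q, d3)
        J = torch.cat([P3.mean(dim=0), P3.max(dim=0).values,
                       Q3.mean(dim=0), Q3.max(dim=0).values])
        return torch.sigmoid(self.mlp(J))    # judgment score k

model = JudgmentModel()
k = model(torch.randn(40, 256), torch.randn(12, 256))
print(float(k))
```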
  • The parameter learning unit 300 learns the parameters of the reading comprehension model and the determination model so that the correct answer Y included in the learning data coincides with the result determined by the determination unit 220, and the start D and end E included in the learning data coincide with the start s_d and end s_e estimated by the machine reading comprehension unit 210.
  • Specifically, the parameter learning unit 300 takes as the objective function of the optimization problem a linear sum of the objective function L_C of the reading comprehension model used by the machine reading comprehension unit 210 and the objective function L_J of the determination model used by the determination unit 220 (equation (8) below). The parameters of the models appearing in the objective function can be learned by the learning device, and the coefficient of the linear sum is set to an appropriate value, such as 1 or 1/2, so that the learning proceeds.
  • For the objective function L_C, for example, the cross-entropy function proposed in Non-Patent Document 1 and represented by the following equation (9) can be used. Here, D and E represent the positions of the true start and end, respectively, s_{d,D} is the value of the D-th element of the vector s_d, and s_{e,E} is the value of the E-th element of the vector s_e.
  • The objective function L_J may be any objective function; for example, the binary cross entropy of the following equation (10), where Y is the correct answer indicating the polarity of the true answer and k is the judgment score of Yes, can be used.
  • The parameter learning unit 300 calculates the gradient of the objective function represented by the above equation (8) by the error backpropagation method, and updates the parameters using an arbitrary optimization method.
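  • Since equations (8) to (10) appear only in the drawings, the following sketch writes out one consistent reading: L_C as the negative log-likelihood of the true span (D, E) and L_J as binary cross entropy on the judgment score k, combined as a weighted linear sum. The weight and the exact forms are assumptions matching the surrounding text.

```python
import torch

def combined_loss(s_d, s_e, k, D, E, Y, lam=1.0):
    # L_C (cf. eq. (9)): cross entropy of the true start D and end E
    L_C = -(torch.log(s_d[D]) + torch.log(s_e[E]))
    # L_J (cf. eq. (10)): binary cross entropy of polarity Y (1=Yes, 0=No)
    L_J = -(Y * torch.log(k) + (1 - Y) * torch.log(1 - k))
    return L_C + lam * L_J                   # linear sum, cf. eq. (8)

s_d = torch.softmax(torch.randn(40), dim=0)
s_e = torch.softmax(torch.randn(40), dim=0)
print(float(combined_loss(s_d, s_e, torch.tensor(0.7), D=3, E=7, Y=1.0)))
```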
  • FIG. 2 is a flowchart illustrating an answer learning processing routine according to the first embodiment of the present invention.
  • The answer learning device according to the present embodiment learns using mini-batches, but a general neural network learning method may be used. The size of the mini-batch is set to 1 below for simplicity.
  • the answer learning device 10 executes an answer learning processing routine shown in FIG.
  • In step S100, the input unit 100 receives input of a plurality of pieces of learning data, each including a sentence P, a question sentence Q, a correct answer Y indicating the polarity of the answer to the question sentence in the sentence P, and the start D and end E of the range serving as the basis of the answer in the sentence P.
  • In step S110, the input unit 100 divides the learning data received in step S100 into mini-batches. A mini-batch is a set of a fixed number of pieces of learning data obtained by randomly dividing the plurality of pieces of learning data, the number being a natural number of 1 or more.
  • In step S120, the word encoding unit 211 selects the first mini-batch.
  • In step S130, the word encoding unit 211 generates word vector sequences P1 and Q1 based on the sentence P and the question sentence Q included in the selected mini-batch.
  • In step S140, the first context encoding unit 213 converts the word vector sequences P1 and Q1 generated in step S130 into vector sequences P2 and Q2, respectively, using a neural network.
  • In step S150, the attention unit 214 generates, using a neural network and based on the vector sequences P2 and Q2, a reading matrix B representing the attention between the sentence P and the question sentence Q.
  • In step S160, the second context encoding unit 215 converts the reading matrix B generated in step S150 into a reading matrix M using a neural network.
  • In step S170, the ground search unit 216 estimates the start s_d and end s_e of the range D:E serving as the basis of the answer in the sentence P, based on the reading matrix M, using the reading comprehension model.
  • In step S180, the input conversion unit 221 generates the vector sequences P3 and Q3 based on the result of encoding the sentence P and the result of encoding the question sentence Q by the machine reading comprehension unit 210.
  • In step S190, the score calculation unit 222 determines the polarity of the answer to the question sentence Q, based on the vector sequences P3 and Q3, using the determination model that determines whether the polarity of the answer to the question sentence Q is positive.
  • In step S200, the parameter learning unit 300 updates the parameters of the reading comprehension model and the determination model so that the correct answer Y included in the learning data coincides with the result determined by the determination unit 220, and the start D and end E included in the learning data coincide with the start s_d and end s_e estimated by the machine reading comprehension unit 210.
  • In step S210, the parameter learning unit 300 determines whether processing has been performed for all mini-batches. If not (NO in step S210), in step S220 the next mini-batch is selected and the process returns to step S130.
  • In step S230, the parameter learning unit 300 determines whether the learning has converged. If the learning has not converged (NO in step S230), the process returns to step S110, and the processes from step S110 to step S230 are performed again. If the learning has converged (YES in step S230), in step S240 the parameter learning unit 300 stores the learned parameters in a memory (not shown).
  • When the mini-batch size is 2 or more, a step of selecting the first sentence P and question sentence Q may be added after step S120, and a step of determining whether all sentences P and question sentences Q in the mini-batch have been processed may be added before step S210; if that determination is negative, the next sentence P and question sentence Q are selected and the process returns to step S130, and if it is positive, the process proceeds to step S210. A code sketch of this training flow follows.
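  • The sketch below assumes a model object that bundles the networks above and exposes a per-example loss returning equation (8); the optimizer choice is illustrative.

```python
import random
import torch

# Sketch of the answer learning routine: random mini-batching (S110),
# per-batch loss, backpropagation (S200), repeated until convergence (S230).
def train(model, data, batch_size=8, epochs=3, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):                  # stand-in for the convergence check
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # model.loss(example) is an assumed interface returning eq. (8)
            loss = torch.stack([model.loss(ex) for ex in batch]).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
```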
  • As described above, the answer learning device according to the first embodiment receives input of learning data including a sentence, a question sentence, a correct answer indicating the polarity of the answer to the question sentence in the sentence, and the start and end of the range serving as the basis of the answer in the sentence; determines the polarity of the answer to the question sentence based on information obtained by estimating the start and end of the range using the reading comprehension model, using the determination model that determines whether the polarity of the answer is positive; and learns the parameters of the reading comprehension model and the determination model so that the correct answer included in the learning data coincides with the determined result and the start and end included in the learning data coincide with the estimated start and end. The answer learning device can thereby accurately learn a model for answering, with a polarity, a question that can be answered with a polarity.
  • FIG. 3 is a block diagram illustrating a configuration of the answer generation device 20 according to the first embodiment of the present invention.
  • Note that components having the same configurations as those of the answer learning device 10 according to the first embodiment are denoted by the same reference numerals, and detailed description thereof is omitted.
  • The answer generation device 20 is configured by a computer including a CPU, a RAM, and a ROM storing a program for executing an answer generation processing routine described below, and is functionally configured as follows. As illustrated in FIG. 3, the answer generation device 20 according to the present embodiment includes an input unit 400, an analysis unit 200, and an output unit 500. The analysis unit 200 uses the parameters learned by the answer learning device 10.
  • the input unit 400 receives the input of the sentence P and the question sentence Q.
  • the input unit 400 passes the received sentence P and the question sentence Q to the machine reading unit 210.
  • The output unit 500 outputs an answer based on the judgment score k obtained by the score calculation unit 222 of the determination unit 220, together with the basis of the answer based on the answer range score obtained by the ground search unit 216 of the machine reading comprehension unit 210.
  • The output unit 500 can select an arbitrary output format, such as outputting, of the Yes and No scores of the judgment score k, the determination result with the larger score as the answer, or outputting a determination result only when its score exceeds a threshold.
  • The output unit 500 can likewise select an arbitrary output format for the answer range score. Since the answer range score includes the start s_d and the end s_e, various methods are conceivable for computing the output. For example, as in Non-Patent Document 1, a method can be used that outputs the word string of the range maximizing the product of the start s_d and the end s_e, under the constraint that the start s_d comes before the end s_e (see the sketch below).
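  • The sketch scans all (start, end) pairs with start ≤ end and keeps the pair maximizing the product of the start and end scores.

```python
import torch

# Sketch of the output unit's span decoding: argmax of s_d[i] * s_e[j]
# under the constraint i <= j, as in the BiDAF-style decoding cited above.
def best_span(s_d, s_e):
    best, best_score = (0, 0), -1.0
    for i in range(len(s_d)):
        for j in range(i, len(s_e)):         # start must not come after end
            score = float(s_d[i] * s_e[j])
            if score > best_score:
                best, best_score = (i, j), score
    return best, best_score

s_d = torch.softmax(torch.randn(10), dim=0)
s_e = torch.softmax(torch.randn(10), dim=0)
print(best_span(s_d, s_e))                   # e.g. ((2, 5), 0.0317)
```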
  • FIG. 4 is a flowchart illustrating an answer generation processing routine according to the first embodiment of the present invention. Note that the same processes as those of the answer learning process routine according to the first embodiment are denoted by the same reference numerals, and detailed description is omitted.
  • In step S300, the input unit 400 accepts the input of the sentence P and the question sentence Q.
  • In step S400, the output unit 500 generates the basis of the answer from the answer range score obtained in step S170 using a predetermined method, and generates the answer from the judgment score k obtained in step S190 using a predetermined method.
  • In step S430, the output unit 500 outputs all of the answers obtained in step S400 together with their bases.
  • The answer generation device according to the first embodiment described above answers using the reading comprehension model for estimating the range that is the basis of the answer to the question sentence in the sentence. However, the knowledge required to answer a question is not always described in one place: the required knowledge may be written in multiple places in a sentence, or may need to be supplemented from world knowledge.
  • Supplementation from world knowledge is realized by obtaining text containing the necessary knowledge, for example by searching the Web, concatenating the obtained text to the sentence to be questioned, and performing question answering on the result. Normally, simply concatenating sentences makes matching difficult, because the part of the original sentence necessary for the answer and the newly joined text are located in separate places. In the present embodiment, however, by extracting these parts as ground sentences, matching can be performed even when the ground sentences are in distant places.
  • FIG. 5 is a block diagram showing a configuration of the answer learning device 30 according to the second embodiment of the present invention. Note that the same components as those of the answer learning device 10 according to the above-described first embodiment are denoted by the same reference numerals, and detailed description is omitted.
  • The answer learning device 30 is configured by a computer including a CPU, a RAM, and a ROM storing a program for executing an answer learning processing routine described later, and is functionally configured as follows. As shown in FIG. 5, the answer learning device 30 according to the present embodiment includes an input unit 100, an analysis unit 600, and a parameter learning unit 700.
  • The analysis unit 600 includes a machine reading comprehension unit 610 and a determination unit 220.
  • The machine reading comprehension unit 610 estimates the start s_d and end s_e of the range D:E serving as the basis of the answer in the sentence P, based on the sentence P and the question sentence Q, using the reading comprehension model for estimating the range.
  • Specifically, the machine reading comprehension unit 610 includes a word encoding unit 211, a word database (DB) 212, a first context encoding unit 213, an attention unit 214, a second context encoding unit 215, a basis extraction unit 617, and a ground search unit 216.
  • The basis extraction unit 617 extracts basis information of the answer to the question sentence Q, using an extraction model that extracts the basis information serving as the basis of the answer to the question sentence, based on information obtained by the processing of the machine reading comprehension unit 610.
  • Specifically, the basis extraction unit 617 first receives the reading matrix M converted by the second context encoding unit 215 (or the reading matrix B before conversion), and extracts, using a neural network, a vector sequence H representing the meaning of each sentence in the sentence P.
  • The basis extraction unit 617 can use, for example, a unidirectional RNN as this neural network.
  • The basis extraction unit 617 defines the operation of extracting one ground sentence as one time step, and generates a state z_t by an RNN extraction model. That is, the basis extraction unit 617 generates the state z_t by inputting to the RNN extraction model the element of the vector sequence H corresponding to the ground sentence extracted at time t-1, where s_{t-1} denotes the index of the ground sentence extracted at time t-1.
  • Next, the basis extraction unit 617 generates a glimpse vector e_t (equation (13) below), a question sentence vector that takes the importance at time t into consideration, by performing a glimpse operation (Reference 5) on the question sentence Q. By using the glimpse operation, the extraction result of the ground sentences can include content corresponding to the entire question.
  • Note that the initial value of the RNN of the extraction model is the vector obtained by max pooling the vector sequence obtained by applying an affine transformation to the vector sequence H.
  • The basis extraction unit 617 then selects, by the extraction model and based on the state z_t, the glimpse vector e_t, and the vector sequence H, a sentence at time t according to the probability distribution represented by the following equation (14); the selected sentence s_t is the ground sentence extracted at time t.
  • The basis extraction unit 617 passes the extracted ground sentences to the ground search unit 216 and the parameter learning unit 700.
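  • The following is a minimal sketch of one extraction step of the basis extraction unit 617: an RNN cell produces the state z_t from the previously extracted sentence vector, a glimpse (attention) over the question encoding produces e_t, and a pointer produces a distribution over the sentence vectors H. Since equations (13) and (14) are in the drawings, the attention and pointer forms here are assumptions in the style of the cited extractive summarization model.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Sketch of one step of the basis extraction unit (617).
class SentenceExtractor(nn.Module):
    def __init__(self, d=256):
        super().__init__()
        self.rnn = nn.GRUCell(d, d)          # RNN of the extraction model
        self.glimpse = nn.Linear(d, d)       # attention over the question Q
        self.pointer = nn.Linear(2 * d, d)   # combines state z_t and glimpse e_t

    def step(self, h_prev, z_prev, H, Q):
        # h_prev: (1, d) vector of the ground sentence extracted at t-1
        # z_prev: (1, d) previous state; H: (S, d); Q: (L_Q, d)
        z_t = self.rnn(h_prev, z_prev)                          # state z_t
        a = F.softmax(Q @ self.glimpse(z_t).squeeze(0), dim=0)  # attention over Q
        e_t = a @ Q                                             # glimpse e_t, cf. eq. (13)
        query = self.pointer(torch.cat([z_t.squeeze(0), e_t]))
        p_t = F.softmax(H @ query, dim=0)                       # cf. eq. (14)
        return p_t, z_t

ext = SentenceExtractor()
H = torch.randn(5, 256)                      # one vector per sentence of P
Q = torch.randn(12, 256)                     # encoded question sentence
p_t, z_t = ext.step(H[0:1], torch.zeros(1, 256), H, Q)
print(p_t)                                   # extraction probability of each sentence
```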
  • The parameter learning unit 700 learns the parameters of the reading comprehension model, the determination model, and the extraction model so that the correct answer Y included in the learning data coincides with the result determined by the determination unit 220; the start D and end E included in the learning data coincide with the start s_d and end s_e estimated by the machine reading comprehension unit 610; and the correct basis information in the sentence P included in the learning data matches the basis information extracted by the basis extraction unit 617.
  • Specifically, the parameter learning unit 700 takes as the objective function of the optimization problem a linear sum (equation (15) below) of the objective function L_C of the reading comprehension model used by the machine reading comprehension unit 610, the objective function L_J of the determination model used by the determination unit 220, and the objective function L_S of the extraction model used by the basis extraction unit 617.
  • The objective functions L_C and L_J are the same as in the first embodiment.
  • The objective function L_S is an objective function with coverage regularization (Reference 6); for example, an objective function such as the following equation (16) can be used. [Reference 6] A. See, P. J. Liu and C. D. Manning, "Get to the point: Summarization with pointer-generator networks", ACL, 2017, pp. 1073-1083.
  • The parameter learning unit 700 calculates the gradient of the objective function represented by the above equation (15) by the error backpropagation method, and updates each parameter using an arbitrary optimization method.
  • FIG. 6 is a flowchart showing an answer learning processing routine according to the second embodiment of the present invention.
  • The answer learning device according to the present embodiment learns using mini-batches, but a general neural network learning method may be used. The size of the mini-batch is set to 1 below for simplicity.
  • The same processes as those in the answer learning processing routine according to the first embodiment described above are denoted by the same reference numerals, and detailed description is omitted.
  • In the present embodiment, the basis extraction unit 617 additionally performs basis information extraction processing.
  • In step S600, the parameter learning unit 700 updates the parameters of the reading comprehension model, the determination model, and the extraction model so that the correct answer Y included in the learning data coincides with the result determined by the determination unit 220; the start D and end E included in the learning data coincide with the start s_d and end s_e estimated by the machine reading comprehension unit 610; and the basis information of the answer in the sentence P included in the learning data matches the basis information extracted by the basis extraction unit 617.
  • FIG. 7 is a flowchart showing a basis information extraction processing routine in the answer learning device according to the second embodiment of the present invention.
  • Through the basis information extraction processing, the basis extraction unit 617 extracts the basis information of the answer to the question sentence Q, using the extraction model that extracts the basis information serving as the basis of the answer to the question sentence, based on the information obtained by the processing of the machine reading comprehension unit 610.
  • In step S510, the basis extraction unit 617, defining the operation of extracting one ground sentence as one time step, generates the state z_t at time t by the RNN extraction model.
  • In step S520, the basis extraction unit 617 generates the glimpse vector e_t, a question sentence vector that takes the importance at time t into consideration, by performing the glimpse operation on the question sentence Q.
  • In step S540, the basis extraction unit 617 determines whether the end condition is satisfied. If the end condition is not satisfied (NO in step S540), the basis extraction unit 617 adds 1 to t in step S550 and returns to step S510. If the end condition is satisfied (YES in step S540), the basis extraction unit 617 ends the routine and returns.
  • As described above, the answer learning device according to the second embodiment extracts the basis information of the answer to the question sentence using the extraction model that extracts the basis information serving as the basis of the answer to the question sentence, and learns the parameters of the extraction model so that the basis information of the answer in the sentence included in the learning data matches the basis information extracted by the basis extraction unit. This makes it possible to learn, with still higher accuracy, a model for answering with a polarity a question that can be answered with a polarity.
  • FIG. 8 is a block diagram showing a configuration of the answer generation device 40 according to the second embodiment of the present invention.
  • The answer generation device 40 is configured by a computer including a CPU, a RAM, and a ROM storing a program for executing an answer generation processing routine described later, and is functionally configured as follows. As shown in FIG. 8, the answer generation device 40 according to the second embodiment includes an input unit 400, an analysis unit 600, and an output unit 800.
  • the output unit 800 outputs, as a response, the polarity of the answer determined by the determination unit 220 and the basis information extracted by the basis extraction unit 617.
  • FIG. 9 is a flowchart illustrating an answer generation processing routine according to the second embodiment of the present invention. Note that the same processes as those in the answer generation processing routine according to the first embodiment and the answer learning processing routine according to the second embodiment are denoted by the same reference numerals, and detailed description is omitted.
  • In step S700, the output unit 800 outputs the answers and their bases obtained in step S400, together with the basis information obtained in step S555.
  • In the example, the configuration shown in FIG. 10 is used as the configuration of each unit of the answer generation device.
  • The determination unit 220 is configured using an RNN and a linear transformation; it determines whether to answer with Yes, No, or an extraction-type answer, and outputs a ternary determination of Yes / No / extraction-type answer.
  • The ground search unit 216 is configured using two sets of an RNN and a linear transformation, one set of which outputs the end of the answer and the other the start of the answer.
  • The basis extraction unit 617 is configured using an RNN and the extraction model 617A.
  • The second context encoding unit 215 is configured using an RNN and self-attention, and the attention unit 214 is configured with bidirectional attention.
  • The first context encoding unit 213 is configured using two RNNs, and the word encoding unit 211 is configured using two sets of word embeddings and character embeddings.
  • In the example, the configuration shown in FIG. 11 is used as the configuration of the extraction model 617A. This configuration is based on the extractive sentence summarization model proposed in Reference 7. [Reference 7] Y.-C. Chen and M. Bansal, "Fast abstractive summarization with reinforce-selected sentence rewriting", ACL, 2018, pp. 675-686.
  • The method of Reference 7 extracts sentences from the summarization source text while attending to that source text; in the present embodiment, sentences in the sentence P are extracted while attending to the question sentence Q. In the extraction model 617A, by performing the glimpse operation on the question sentence Q, the extraction result is made to include content corresponding to the entire question.
  • In the experiment, the prediction accuracy of the answer type T, the answer A, and the grounds S was evaluated.
  • The answer type T consists of the three labels Yes / No / extraction in the HotpotQA task setting. Both the answer and the ground-sentence extraction were evaluated by exact match (EM) and partial match; the partial-match metric is the harmonic mean (F1) of precision and recall. The answer was evaluated on the match of the answer type T and, in the case of extraction, also on the match of the answer A. The partial match of the ground-sentence extraction was measured by matching the extracted sentence ids against the true ground-sentence ids; word-level partial matches are therefore not considered.
  • The answer accuracy when limited to Yes/No questions is denoted YN.
  • Joint EM and joint F1 are used as metrics that take both the accuracy of the answer and that of the grounds into account.
  • [Reference 8] Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. W. Cohen, R. Salakhutdinov and C. D. Manning, "HotpotQA: A dataset for diverse, explainable multi-hop question answering", EMNLP, 2018, pp. 2369-2380.
  • The distractor setting assumes that it is possible, with existing technology, to narrow a large amount of text down to a small amount related to the question. The fullwiki setting is a setting in which this narrowing down to a small amount of text is performed by TF-IDF similarity search.
  • In both settings, the present embodiment greatly exceeded the baseline model and achieved state-of-the-art accuracy. In particular, the exact match of ground sentences improved greatly, to 37.5 points (+185%) in the distractor setting and 10.3 points (+268%) in the fullwiki setting. It can therefore be said that the present embodiment is an excellent technique for extracting ground sentences without excess or deficiency.
  • Table 3 shows the experimental results in the distractor setting with the development data.
  • The improvement in the accuracy of the Yes/No determination can be interpreted as the multitask learning with the extraction model 617A training the lower RNN so as to acquire features that also contribute to the answer. As a result, the accuracy of the Joint metrics is improved.
  • As a comparison, a technique that performs only sentence extraction by the RNN, without using the glimpse operation, was also tested.
  • For this technique, the EM of the ground sentences is 6.5 points higher than the baseline model, but the F1 is lower than the baseline model. The answer improved by 0.9 points in EM and 0.8 points in F1, and the determination accuracy of Yes/No improved by 3.0 points; this can be interpreted as the extraction model 617A advancing the learning of the lower RNN, which in turn improves the accuracy of the Joint metrics.
  • The present example exceeded the method without the glimpse operation on all metrics.
  • As described above, the answer generation device according to the second embodiment extracts the basis information of the answer to the question sentence using the extraction model that extracts the basis information serving as the basis of the answer to the question sentence, and outputs the determined polarity of the answer together with the extracted basis information as the response. A question that can be answered with a polarity can thereby be answered with a polarity with still higher accuracy.
  • In the above embodiments, the case has been described where the input conversion unit 221 generates the vector sequences P3 and Q3 based on the result of encoding the sentence P and the result of encoding the question sentence Q by the machine reading comprehension unit 210. However, the polarity of the answer to the question sentence Q may also be determined using the determination model with, as further inputs, at least one of the start s_d and end s_e of the range serving as the basis of the answer estimated by the machine reading comprehension unit 210, and the attention matrix A representing the relationship between the sentence P and the question sentence Q.
  • In this case, the second context encoding unit 215 passes the converted reading matrix M, and the ground search unit 216 passes the estimated answer range score, to the input conversion unit 221.
  • The input conversion unit 221 can use, as a method of calculating the vector sequence P3, the following equation (17) or equation (18), where Linear() denotes a linear transformation, and can use, as a method of calculating the vector sequence Q3, the following equation (19).
  • The score calculation unit 222 can also use an existing framework of the sentence-pair classification task with modifications. For example, the method of inputting the vector sequence Q3 to an LSTM and using the final state of its output can be replaced by an operation that takes a weighted average, such as attentive pooling using the end s_e.
  • The information received by the input conversion unit 221 may also be (1) the reading matrix B alone; in that case, the vector J is defined from the vector sequence P3 alone.
  • The answer learning device 10 may further include a question determination unit that determines whether the input question sentence Q is a question that can be answered with Yes or No.
  • As the determination method of the question determination unit, a conventional method such as a rule base or determination by machine learning may be used.
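  • As an illustration of the rule-based option, the following is a minimal sketch that flags English questions answerable with Yes/No by their leading auxiliary verb; the pattern list is an assumption, since the patent leaves the rules unspecified.

```python
import re

# Sketch of a rule-based question determination unit.
POLAR_LEADS = re.compile(
    r"^(is|are|was|were|do|does|did|can|could|will|would|has|have|should)\b",
    re.IGNORECASE)

def is_polar_question(question: str) -> bool:
    return bool(POLAR_LEADS.match(question.strip()))

print(is_polar_question("Is Paris the capital of France?"))  # True
print(is_polar_question("Where is Paris?"))                  # False
```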
  • When the question determination unit determines that the question sentence Q is not a question that can be answered with Yes or No, it is also possible to configure the device so that the output (Yes / No) from the determination unit 220 is not performed and only the output from the machine reading comprehension unit 210 is performed.
  • By providing the question determination unit, when the output of the determination unit 220 is the binary Yes / No, answering with Yes or No when such an answer is inappropriate can be prevented. In addition, questions for which a Yes or No answer is inappropriate can be excluded from the learning data, so that more appropriate learning can be performed.
  • Further, when the output of the determination unit 220 is the ternary Yes / No / unknown, the meaning of "unknown" becomes clearer. Without the question determination unit, "unknown" may mean either "it is inappropriate to answer Yes or No" or "the answer cannot be determined (for example, because there is no relevant description)"; with the determination by the question determination unit, the meaning of "unknown" can be narrowed down to the latter.
  • The question determination unit can also be provided in the answer generation device 20. When the answer generation device 20 includes the question determination unit and the output of the determination unit 220 is the binary Yes / No, answering with Yes or No when such an answer is inappropriate can likewise be prevented.
  • In the second embodiment, the case has been described where the determination unit 220 outputs the ternary Yes / No / extraction-type answer; the present invention is not limited to this. When the answer is an extraction-type answer, the output unit may output, as the extraction-type answer, the ground sentences output by the basis extraction unit 617 or the range serving as the basis of the answer output by the ground search unit 216.
  • The case where the answer polarity is Yes or No has been described as an example; however, the present invention is not limited to this, and the answer polarity may be, for example, OK or NG.
  • In the above description, the program has been described as being installed in advance; however, the program may also be provided stored in a computer-readable recording medium.
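
A minimal sketch, in Python with PyTorch, of the kind of determination model described above: the vector sequences P3 and Q3 are pooled into fixed-size vectors and scored for whether the answer polarity is positive. The layer sizes, the mean pooling, and the feature combination are illustrative assumptions, not the exact architecture of the determination unit 220.

    # Sketch of a polarity determination head over P3 and Q3 (assumed design).
    import torch
    import torch.nn as nn

    class PolarityDeterminationHead(nn.Module):
        def __init__(self, hidden_dim: int = 128):
            super().__init__()
            # Two-way score: answer polarity positive (Yes) vs. negative (No).
            self.classifier = nn.Linear(4 * hidden_dim, 2)

        def forward(self, p3: torch.Tensor, q3: torch.Tensor) -> torch.Tensor:
            # p3: (batch, sentence length T, hidden); q3: (batch, question length J, hidden).
            # Pool each sequence into one vector (mean pooling here; the text
            # above also allows attentive pooling or the end position s_e).
            p_vec = p3.mean(dim=1)
            q_vec = q3.mean(dim=1)
            features = torch.cat([p_vec, q_vec, p_vec * q_vec, p_vec - q_vec], dim=-1)
            return self.classifier(features)  # logits over {positive, negative}

    head = PolarityDeterminationHead(hidden_dim=128)
    p3 = torch.randn(2, 50, 128)  # batch of 2 sentences, 50 tokens each
    q3 = torch.randn(2, 12, 128)  # batch of 2 questions, 12 tokens each
    logits = head(p3, q3)         # shape (2, 2)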
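The bodies of equations (17) to (19) are not reproduced above, so the following is an illustrative sketch only: one plausible reading in which the input conversion unit 221 applies a linear transformation (Linear(·)) to the reading matrix M, weighted by the answer range score, to obtain P3, and another linear transformation to the encoded question to obtain Q3. The dimensions and the weighting are assumptions, not the patent's exact formulas.

    # Sketch of the input conversion (assumed form, not equations (17)-(19) themselves).
    import torch
    import torch.nn as nn

    hidden = 128
    to_p3 = nn.Linear(2 * hidden, hidden)  # Linear(.) in the text
    to_q3 = nn.Linear(2 * hidden, hidden)

    m = torch.randn(1, 50, 2 * hidden)                      # reading matrix M from unit 215
    span_score = torch.softmax(torch.randn(1, 50), dim=-1)  # answer range score from unit 216
    q_enc = torch.randn(1, 12, 2 * hidden)                  # encoded question Q

    p3 = to_p3(m * span_score.unsqueeze(-1))  # one possible form of eq. (17)/(18)
    q3 = to_q3(q_enc)                         # one possible form of eq. (19)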
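The two pooling strategies mentioned for the vector sequence Q3, namely the final state of an LSTM run over Q3 and attentive pooling (a learned weighted average), can be contrasted as follows; the dimensions are illustrative.

    # Final LSTM state vs. attentive pooling over Q3.
    import torch
    import torch.nn as nn

    hidden = 128
    q3 = torch.randn(1, 12, hidden)  # vector sequence Q3

    # (a) Final state of an LSTM run over Q3.
    lstm = nn.LSTM(hidden, hidden, batch_first=True)
    _, (h_n, _) = lstm(q3)
    q_vec_lstm = h_n[-1]  # (1, hidden)

    # (b) Attentive pooling: score each position, then take the weighted average.
    attn = nn.Linear(hidden, 1)
    weights = torch.softmax(attn(q3).squeeze(-1), dim=-1)  # (1, 12)
    q_vec_pool = (weights.unsqueeze(-1) * q3).sum(dim=1)   # (1, hidden)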
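As an example of the rule-based determination mentioned above, the following toy check treats an English question as answerable with Yes or No when it begins with an auxiliary or modal verb. The word list is an illustrative assumption; a practical question determination unit would use richer rules or a learned classifier.

    # Toy rule-based check for "question that can be answered with Yes or No".
    POLAR_LEADS = {
        "is", "are", "was", "were", "do", "does", "did", "can", "could",
        "will", "would", "should", "has", "have", "had",
    }

    def is_yes_no_question(question: str) -> bool:
        tokens = question.strip().lower().rstrip("?").split()
        return bool(tokens) and tokens[0] in POLAR_LEADS

    print(is_yes_no_question("Is the earth round?"))    # True
    print(is_yes_no_question("Where is the station?"))  # False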
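Finally, a minimal sketch (with an assumed interface, not taken from the specification) of an output unit that switches among Yes, No, and an extraction-type answer: for the extraction type it returns the basis sentence if one is available, and otherwise the range of the basis of the answer.

    # Sketch of three-way answer output: Yes / No / extraction type.
    from typing import Optional, Tuple

    def format_answer(label: str,
                      basis_sentence: Optional[str] = None,
                      basis_range: Optional[Tuple[int, int]] = None) -> str:
        if label in ("yes", "no"):
            return label.capitalize()
        # Extraction-type answer: prefer the basis sentence, else the basis range.
        if basis_sentence is not None:
            return basis_sentence
        return "span {}..{}".format(basis_range[0], basis_range[1])

    print(format_answer("yes"))                                 # Yes
    print(format_answer("extract", basis_sentence="Example."))  # Example.
    print(format_answer("extract", basis_range=(3, 9)))         # span 3..9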


Abstract

The present invention makes it possible to answer a question that can be answered with a polarity, accurately and with a polarity. A machine reading unit estimates, on the basis of an input sentence and a question sentence, the start and the end of a range using a reading model trained in advance so as to estimate the range that is in the sentence and serves as the basis of an answer to the question sentence. A determination unit 220 determines, on the basis of information obtained from the processing by the machine reading unit 210, the polarity of the answer to the question sentence using a pre-trained determination model for determining whether the polarity of the answer to the question sentence is positive.
PCT/JP2019/023755 2018-06-18 2019-06-14 Answer learning device, answer learning method, answer generation device, answer generation method, and program WO2019244803A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/254,187 US20210125516A1 (en) 2018-06-18 2019-06-14 Answer training device, answer training method, answer generation device, answer generation method, and program

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2018-115166 2018-06-18
JP2018115166 2018-06-18
JP2019-032127 2019-02-25
JP2019032127A JP2019220142A (ja) 2018-06-18 2019-02-25 Answer learning device, answer learning method, answer generation device, answer generation method, and program

Publications (1)

Publication Number Publication Date
WO2019244803A1 true WO2019244803A1 (fr) 2019-12-26

Family

ID=68984027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/023755 WO2019244803A1 (fr) 2019-06-14 Answer learning device, answer learning method, answer generation device, answer generation method, and program

Country Status (1)

Country Link
WO (1) WO2019244803A1 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006163623A (ja) * 2004-12-03 2006-06-22 Nippon Hoso Kyokai <Nhk> 質問応答装置及び質問応答プログラム、並びに、テレビ受像機
JP2014120053A (ja) * 2012-12-18 2014-06-30 Nippon Telegr & Teleph Corp <Ntt> 質問応答装置、方法、及びプログラム
JP2017049681A (ja) * 2015-08-31 2017-03-09 国立研究開発法人情報通信研究機構 質問応答システムの訓練装置及びそのためのコンピュータプログラム

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051496A (zh) * 2019-12-27 2021-06-29 China Telecom Co., Ltd. Method and system for training a classifier for classifying uniform resource locators
CN113051496B (zh) * 2019-12-27 2024-01-26 China Telecom Co., Ltd. Method and system for training a classifier for classifying uniform resource locators
US11481445B2 (en) 2020-03-05 2022-10-25 Fujifilm Business Innovation Corp. Answer generating device and non-transitory computer readable medium storing program
WO2022079826A1 (fr) * 2020-10-14 2022-04-21 Nippon Telegraph And Telephone Corporation Learning device, information processing device, learning method, information processing method, and program

Similar Documents

Publication Publication Date Title
WO2020174826A1 (fr) Answer generation device, answer learning device, answer generation method, and answer generation program
JP7247878B2 (ja) Answer learning device, answer learning method, answer generation device, answer generation method, and program
CN108959396B (zh) Machine reading model training method and device, and question answering method and device
CN108875807B (zh) Image description method based on multiple attention mechanisms and multiple scales
CN110781680B (zh) Semantic similarity matching method based on a Siamese network and a multi-head attention mechanism
CN108875074B (zh) Answer selection method and device based on a cross-attention neural network, and electronic device
US20190197109A1 (en) System and methods for performing nlp related tasks using contextualized word representations
CN110110062B (zh) Machine intelligence question answering method and device, and electronic device
KR20180125905A (ko) Method and apparatus for classifying the class to which a sentence belongs using a deep neural network
WO2019244803A1 (fr) Answer learning device, answer learning method, answer generation device, answer generation method, and program
KR101939209B1 (ko) Apparatus for classifying text categories based on a neural network, method therefor, and computer-readable recording medium storing a program for performing the method
JP7139626B2 (ja) Phrase generation relationship estimation model learning device, phrase generation device, method, and program
CN110457718B (zh) Text generation method and device, computer equipment, and storage medium
CN113435211B (zh) Implicit sentiment analysis method for text incorporating external knowledge
CN108536735B (zh) Multimodal word representation method and system based on a multi-channel autoencoder
CN111027292B (zh) Method and system for generating text sequences with restricted sampling
CN112905772B (zh) Semantic relevance analysis method and device, and related products
CN112926655B (zh) Image content understanding and visual question answering (VQA) method, storage medium, and terminal
CN113628059A (zh) Associated user identification method and device based on a multi-layer graph attention network
CN114492451B (zh) Text matching method and device, electronic device, and computer-readable storage medium
CN114332565A (zh) Text-to-image generation method using a conditional generative adversarial network based on distribution estimation
CN112732879B (zh) Downstream task processing method and model for question answering tasks
CN112560440A (zh) Syntactic dependency method for aspect-level sentiment analysis based on deep learning
US20240037335A1 (en) Methods, systems, and media for bi-modal generation of natural languages and neural architectures
CN114168769B (zh) Visual question answering method based on GAT relational reasoning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19821772

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19821772

Country of ref document: EP

Kind code of ref document: A1