CN112380326B - Question answer extraction method based on multilayer perception and electronic device - Google Patents

Question answer extraction method based on multilayer perception and electronic device

Info

Publication number
CN112380326B
CN112380326B (application CN202011079727.5A)
Authority
CN
China
Prior art keywords
representation
answer
question
document
inference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011079727.5A
Other languages
Chinese (zh)
Other versions
CN112380326A (en
Inventor
林政
付鹏
刘欢
王伟平
孟丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN202011079727.5A priority Critical patent/CN112380326B/en
Publication of CN112380326A publication Critical patent/CN112380326A/en
Application granted granted Critical
Publication of CN112380326B publication Critical patent/CN112380326B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Abstract

The invention provides a question answer extraction method based on multilayer perception, which comprises the following steps: splicing a question with a number of target documents, inputting the result into a pre-trained language model to obtain a representation Q of the question and a context representation P of the target documents, and letting the representation Q and the context representation P interact to obtain a document-related question representation u and a document representation h fused with question information; performing multilayer perceptron classification on the question representation u to obtain the inference type of the question, generating a sub-question ct from the representation Q, and obtaining the answer attention distribution of the question over the target documents according to the inference type, the question representation u, the document representation h and the sub-question ct, where t is the number of times sub-questions are generated; and obtaining an answer prediction result of the question according to the answer attention distribution. The invention answers questions by splitting them into sub-questions, introduces an inference-type classifier to control the splitting, answers the questions with shared sub-task modules, and improves the effectiveness of inferential reading comprehension.

Description

Question answer extraction method based on multilayer perception and electronic device
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a question answer extraction method based on multilayer perception and an electronic device.
Background
Inferential reading comprehension is the task of finding, from a number of documents related to a given user question, the answer to the question together with the relevant evidence sentences. It requires a model to reason over the semantics of the text in combination with the question and to locate the relevant evidence sentences and the final answer. Inferential reading models can be divided into three broad categories of methods. The first is the memory-network approach, which simulates the reasoning process by continuously and iteratively updating a reasoning state; the second is based on graph neural networks, where reasoning is carried out through updates of the graph neural network; there are also other deep-learning-based methods. The framework of a graph-neural-network-based inferential reading comprehension model can be divided into three parts: 1) a semantic coding stage; 2) a reasoning modeling stage; 3) an evidence and answer prediction stage. The semantic coding stage encodes the question and the documents into word vectors carrying contextual semantic information, the reasoning modeling stage models the reasoning process with graph neural network techniques, and the answer prediction stage predicts the relevant evidence sentences and answer spans from the obtained word representations. For data with many candidate paragraphs, paragraph selection is also needed: the paragraph selection stage selects relevant paragraphs from the candidates as input to the subsequent semantic coding.
A typical memory-network-based method is the Dynamic Coattention Network (Caiming Xiong, Victor Zhong, Richard Socher: Dynamic Coattention Networks For Question Answering. ICLR 2017). The method divides the model into an encoding part and a decoding part. On the one hand, a coattention mechanism is used in the encoding stage to encode the question and the document and obtain a question-related representation; on the other hand, the decoding stage iterates over the answer prediction result: in each round the answer is predicted from the current state value, the current round's state value is then updated according to the answer prediction result, the iteration is repeated, and the result of the last round is taken as the final answer.
A typical graph-neural-network-based method is the DFGN model (Lin Qiu, Yunxuan Xiao, Yanru Qu, Hao Zhou, Lei Li, Weinan Zhang, Yong Yu: Dynamically Fused Graph Network for Multi-hop Reasoning. ACL 2019: 6140-). The DFGN model first uses BERT to independently classify documents and select paragraphs; the semantic coding stage uses BERT to obtain contextual word representations of the documents and the question; the reasoning modeling stage is implemented with a GAT graph neural network and uses a BiLSTM to model the bidirectional fusion process between the graph and the word representations, fusing the node information obtained from graph reasoning back into the word representations; by continuously iterating the graph reasoning process, bidirectional fusion of graph information and text information is completed and extractive answers are predicted. In addition, DFGN also models the role of the question in graph construction: BiAttention is used to update the question representation, a dynamic graph is built according to the matching degree between the question representation and the node representations, and the question representation is continuously updated during the iteration.
Among other non-graph-neural-network methods, Jianxing Yu, Zheng-Jun Zha, Jian Yin et al. designed an inference neuron (Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text. ACL 2019: 2241-) that simulates the reasoning process by recursively linking inference neurons. An inference neuron comprises a memory vector, a read unit, a write unit and a controller unit: the controller unit generates a series of attention-based operations from the question, the read unit reads the relevant content according to the controller's operation instructions, and the write unit produces a new result from the controller operations and the read unit's results and updates the memory vector; the inference neurons are linked together recursively, with the output of one step serving as the input of the next. In addition, because the inference depth differs across samples, the termination of the reasoning process is decided dynamically, and the whole network is trained with reinforcement learning.
The model proposed by Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi et al. decomposes the question into several simple sub-questions (Multi-hop Reading Comprehension through Question Decomposition and Rescoring. ACL 2019: 6097-). To obtain labelled data for sub-question decomposition cheaply, the sub-questions are restricted to spans of the original question, which turns sub-question generation into a span prediction problem; with only 400 manually labelled examples, training this component can be as effective as full human annotation. A method of rescoring the sub-questions and answers to select the best answer is also proposed. The Self-Assembling Modular Networks proposed by Yichen Jiang, Mohit Bansal et al. (Self-Assembling Modular Networks for Interpretable Multi-Hop Reasoning. EMNLP/IJCNLP 2019: 4473-4483) use a neural network to simulate a stack and build a self-assembling modular neural network that can fully automatically decompose and recombine sub-questions.
However, most current models do not handle the different categories of inference separately, and the reasoning processes they model are mostly complex.
The main feature of the present method is that, according to the characteristics of the different reasoning question types in the dataset, different sub-questions are generated and the question is split into different sub-tasks, which are completed in a hierarchical, progressive manner to predict the answer.
Disclosure of Invention
In order to solve the above problems, the invention provides a question answer extraction method based on multilayer perception and an electronic device, which control different sub-modules to combine in a hierarchical structure through a simple question classification mechanism; the approach is simpler to implement and is convenient to combine with other components in a modular fashion.
In order to achieve the purpose, the invention adopts the following technical scheme:
a question answer extraction method based on multilayer perception comprises the following steps:
1) splicing a question with a number of target documents, inputting the result into a pre-trained language model to obtain a representation Q of the question and a context representation P of the target documents, and letting the representation Q and the context representation P interact to obtain a document-related question representation u and a document representation h fused with question information;
2) performing multilayer perceptron classification on the question representation u to obtain the inference type of the question, generating a sub-question ct from the representation Q, and obtaining the answer attention distribution of the question over the target documents according to the inference type, the question representation u, the document representation h and the sub-question ct, where t is the number of times sub-questions are generated;
3) and obtaining an answer prediction result of the question according to the answer attention distribution.
Further, the target document is obtained by the following steps:
1) inputting a plurality of original documents into a paragraph selection model consisting of a BERT model and a layer of linear classifiers;
2) selecting the paragraphs related to the question in each original document according to a threshold to obtain a number of target documents.
Further, the pre-trained language model includes a BERT model.
Further, the method of interacting the representation Q with the context representation P comprises: using a bidirectional attention mechanism; the step of generating the sub-questions comprises:
1) inputting the representation Q into a BiLSTM network to obtain a question vector qv;
2) obtaining the sub-question ct from the question vector qv, the sub-question c(t-1) and the question representation u.
Further, the inference types include: bridging-entity questions or comparison questions.
Further, if the inference type is the bridging entity class, obtaining the answer attention distribution by the following steps:
1) calling a Find function according to the question representation u, the document representation h and the sub-question to generate an intermediate bridging entity att1;
2) calling a Transfer function according to the intermediate bridging entity att1, the question representation u, the document representation h and the sub-question ct to obtain the answer attention distribution.
Further, if the inference type is a comparison type question, obtaining an answer attention distribution by the following steps:
1) calling the Find function twice according to the question representation u, the document representation h and the sub-question ct to generate intermediate bridging entities att1 and att2;
2) calling a Compare function to compare the intermediate bridging entities att1 and att2 to obtain the answer attention distribution.
Further, the method for obtaining the answer prediction result of the question comprises: inputting the context representation C(t) into a plurality of LSTM layers which are stacked layer by layer and do not share parameters; the answer prediction result comprises: one or more of a related evidence sentence, an answer start position, an answer end position, and an answer type.
A storage medium having stored therein a computer program, wherein the computer program is arranged to perform the above-mentioned method when executed.
An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method as described above.
Compared with the prior art, the invention has the following positive effects:
1) A simple form of sub-question splitting is provided to answer questions in a hierarchical, progressive manner; the question splitting requires no supervision, and the effectiveness of inferential reading comprehension is improved.
2) An inference-type classifier is introduced to control the splitting, and shared sub-task modules are used to answer the questions, further improving the effectiveness of inferential reading comprehension.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic diagram of the decomposition framework for bridging-entity questions according to the present invention.
FIG. 3 is a schematic diagram of the decomposition framework for comparison questions according to the present invention.
Detailed Description
In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.
The invention classifies the inference categories in the HotpotQA dataset. There are two main classes: the bridging-entity class and the comparison class. If the question is of the bridging-entity type, the model treats answer prediction as two hierarchically stacked sub-tasks: the first layer searches for an intermediate bridging entity with a search module, and the second layer locks onto the final answer with a transfer module, using the bridging-entity representation output by the first layer together with the question and the context. If the question is of the comparison type, the model likewise treats answer prediction as two hierarchically stacked sub-tasks: the first layer finds the two related entities with two search modules, and the second layer compares the entity representations output by the first layer with a comparison module to predict the final answer. The sub-tasks are realized through three functions: the Find function performs the search, the Transfer function locks onto the answer through the bridging entity, and the Compare function compares two entity representations to obtain the answer. The question decomposition frameworks are illustrated in the figures.
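A minimal control-flow sketch of this routing is given below; the classifier, sub-question generator and the three sub-task functions are assumed to be implemented elsewhere, and their names and signatures here are illustrative only:

```python
# Illustrative routing only: classifier, sub_question_gen, find_fn, transfer_fn
# and compare_fn are hypothetical callables standing in for the modules above.
def predict_answer_attention(u, h, classifier, sub_question_gen,
                             find_fn, transfer_fn, compare_fn):
    """Route a question to the bridging-entity or comparison decomposition."""
    inference_type = classifier(u).argmax(dim=-1)   # 0: bridging entity, 1: comparison
    c1 = sub_question_gen(step=1)                   # first sub-question representation
    att1 = find_fn(u, h, c1)                        # layer 1: locate the first entity
    c2 = sub_question_gen(step=2)                   # second sub-question representation
    if inference_type == 0:
        # bridging-entity question: Find -> Transfer
        return transfer_fn(u, h, c2, att1)
    # comparison question: Find + Find -> Compare
    att2 = find_fn(u, h, c2)
    return compare_fn(h, c2, att1, att2)
```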
As shown in fig. 1, the framework adopted by the present invention is divided into three parts: 1) a paragraph selection module; 2) a semantic coding module; 3) a hierarchical answer prediction module. The paragraph selection module screens the documents to filter out irrelevant ones and avoid an overly long input. The semantic coding module encodes the question and the documents into vector representations with contextual semantic information. The hierarchical answer prediction module handles questions of different reasoning types separately and predicts the final evidence sentences and answer. The core of the invention is the hierarchical answer prediction module, which can be divided into a classification controller, a sub-question generator and three sub-task executors.
Process 1: the paragraph selection module.
The paragraph selection module fine-tunes a BERT model (Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT 2019: 4171-4186) with one linear classification layer to obtain a paragraph selection model, which independently judges whether each paragraph is related to the question; a threshold of 0.3 is set to select the more relevant paragraphs. This selection is made under the condition of ensuring recall, and the total length of the recalled relevant documents essentially fits within the maximum input length of 512 for the next stage.
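A minimal sketch of this paragraph selection step is shown below, assuming a fine-tuned BERT sequence classifier via the Hugging Face transformers library; the checkpoint name and tokenization settings are illustrative, and only the 0.3 threshold comes from the text above:

```python
# Sketch: score each paragraph against the question and keep those above a threshold.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

def select_paragraphs(question: str, paragraphs: list[str], threshold: float = 0.3):
    """Keep paragraphs whose relevance probability to the question exceeds the threshold."""
    selected = []
    for para in paragraphs:
        inputs = tokenizer(question, para, truncation=True, max_length=512,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        prob_relevant = torch.softmax(logits, dim=-1)[0, 1].item()
        if prob_relevant > threshold:
            selected.append(para)
    return selected
```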
Process 2: the semantic coding module.
The semantic coding layer encodes the question and the context documents into vector representations with contextual semantic information. The question and all of its relevant documents are spliced together to form the input to the coding module, which uses a pre-trained BERT model. After encoding, the representations Q ∈ R^(L×d1) and P ∈ R^(N×d1) are obtained, where R denotes the set of real numbers, L and N are the lengths of the question and the context respectively, and d1 is the dimension of the BERT hidden layer.
Then, a bidirectional attention mechanism (Min Joon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi: Bidirectional Attention Flow for Machine Comprehension. ICLR 2017) is used to interactively model the question and the context. The model learns the document-related question representation u ∈ R^(L×d2) and the question-related document representation h ∈ R^(N×d2), where d2 is the dimension of the output word representations.
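A minimal sketch of the encoding step is given below, assuming the question and its selected documents are joined into one BERT input and split back into Q and P by segment id; the checkpoint name is illustrative, and the bidirectional-attention interaction that then produces u and h is sketched later together with the Find function:

```python
# Sketch: joint BERT encoding of question + documents, split into Q and P.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def encode(question: str, documents: list[str]):
    """Return the question representation Q (L x d1) and context representation P (N x d1)."""
    context = " ".join(documents)
    enc = tokenizer(question, context, truncation=True, max_length=512,
                    return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]   # (seq_len, d1)
    token_types = enc["token_type_ids"][0]
    Q = hidden[token_types == 0]                    # question tokens (incl. special tokens)
    P = hidden[token_types == 1]                    # document tokens
    return Q, P
```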
Process 3: the hierarchical answer prediction module.
The input to the question inference-type discriminator is the question representation u; the question word representations obtained in the coding stage are passed through a multilayer perceptron for binary classification to obtain the inference type of the question.
Further, if the inference type is the bridging-entity class, as shown in fig. 2, the model first calls the Find function to generate an intermediate bridging entity att1, and then calls the Transfer function to obtain the attention distribution of the relevant answer from the bridging entity. If the inference type is a comparison question, as shown in fig. 3, the model calls the Find function twice to obtain two related entities att1 and att2, and then calls the Compare function to obtain the attention distribution of the final answer by comparing the two related entities.
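A minimal sketch of the inference-type discriminator follows; mean pooling over the question word representations and the hidden size are assumptions not specified above:

```python
# Sketch: MLP binary classifier over the pooled question representation u.
import torch
import torch.nn as nn

class InferenceTypeClassifier(nn.Module):
    def __init__(self, d2: int, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d2, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        """u: (L, d2) question representation -> logits over {bridging entity, comparison}."""
        pooled = u.mean(dim=0)            # pool the question word representations (assumption)
        return self.mlp(pooled)
```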
The Find function first injects the sub-question ct into the question-related document representation h to obtain h' = h ⊙ ct; a question-related contextual representation is then obtained through a bidirectional attention mechanism. The specific process is as follows:
Mj,s = W1·uj + W2·h's + W3·(uj ⊙ h's)
pcq j,s = softmaxj(Mj,s)
cqs = Σj pcq j,s · uj
ms = maxj Mj,s
pqc s = softmaxs(ms)
qc = Σs pqc s · h's
h̃s = [h's ; cqs ; h's ⊙ cqs ; h's ⊙ qc]
where M is the similarity matrix in the bidirectional attention mechanism, the W are trainable parameters, cq and qc are respectively the question-related context attention and the question attention computed in the bidirectional attention mechanism, p denotes the computed attention weights, s indexes the context words, j indexes the question words (J being the length of the question sequence's hidden representation), m is the maximum of M taken over one dimension, and uj is the j-th word representation in the question representation.
Finally, a linear transformation compresses the result back to the original dimension of the input question as the final output, which is taken as the attention distribution over the related entity and denoted att1.
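A minimal sketch of the Find sub-task following this bidirectional-attention form is given below; the exact output concatenation and the final linear scoring head are assumptions:

```python
# Sketch of the Find sub-task: inject the sub-question, run bidirectional
# attention, and read off an attention distribution over context words.
import torch
import torch.nn as nn

class Find(nn.Module):
    def __init__(self, d2: int):
        super().__init__()
        self.w1 = nn.Linear(d2, 1, bias=False)
        self.w2 = nn.Linear(d2, 1, bias=False)
        self.w3 = nn.Linear(d2, 1, bias=False)
        self.out = nn.Linear(4 * d2, 1)   # compress the fused representation to a score

    def forward(self, u, h, c_t):
        """u: (L, d2) question, h: (N, d2) context, c_t: (d2,) sub-question."""
        h_prime = h * c_t                                  # h' = h ⊙ ct
        # similarity matrix M[j, s] = W1·u_j + W2·h'_s + W3·(u_j ⊙ h'_s)
        M = (self.w1(u) + self.w2(h_prime).transpose(0, 1)
             + self.w3(u.unsqueeze(1) * h_prime.unsqueeze(0)).squeeze(-1))
        cq = torch.softmax(M, dim=0).transpose(0, 1) @ u            # question-aware context (N, d2)
        qc = torch.softmax(M.max(dim=0).values, dim=0) @ h_prime    # question-weighted context summary (d2,)
        fused = torch.cat([h_prime, cq, h_prime * cq,
                           h_prime * qc.expand_as(h_prime)], dim=-1)  # (N, 4*d2)
        att = torch.softmax(self.out(fused).squeeze(-1), dim=0)       # attention over context words
        return att
```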
The Transfer function first computes a bridging-entity representation b through an attention calculation, then injects the bridging-entity representation into the context representation to obtain hb, and then reuses the Findtrans function inside Transfer to locate the position of the final answer; this function is designed identically to the Find function, so the relevant question-related contextual representation is obtained in the same way. The corresponding sub-question ct here likewise has to be produced by the sub-question generator. The specific process is as follows:
b = Σs att1,s · hs
hb = h ⊙ b
attans = Findtrans(u, hb, ct)
where att1 is the attention distribution of the bridging entity and hs is the s-th word in the context representation h.
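A minimal sketch of the Transfer sub-task follows, reusing the Find module sketched above as Findtrans; treating att1 as a weight vector over context words is an assumption consistent with the formulas:

```python
# Sketch of the Transfer sub-task: pool a bridging-entity representation from att1,
# inject it into the context, and reuse a Find-style module to locate the answer.
import torch
import torch.nn as nn

class Transfer(nn.Module):
    def __init__(self, d2: int):
        super().__init__()
        self.find_trans = Find(d2)   # same design as the Find module sketched above

    def forward(self, u, h, c_t, att1):
        """att1: (N,) attention over context words marking the bridging entity."""
        b = att1 @ h                 # b = Σ_s att1_s · h_s
        h_b = h * b                  # hb = h ⊙ b
        return self.find_trans(u, h_b, c_t)   # answer attention distribution
```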
The inputs to the Compare function are the attentions att1 and att2 related to the two entities, i.e. the attention distributions of the related entities generated by the Find function from the two sub-questions. The representation information hs1 and hs2 of the two entities is therefore obtained first; then the two representations and the sub-question are spliced together to obtain the information oin required for the comparison, and this information is fed into a multilayer perceptron for comparison. The overall idea is to aggregate the two attention distributions and compare them according to the sub-question representation to obtain the answer. From the above, a final attention distribution is obtained and used to predict the start and end positions of the answer span. The specific formulas are as follows:
hs1 = Σs att1,s · hs
hs2 = Σs att2,s · hs
oin = [ct ; hs1 ; hs2 ; ct·(hs1 − hs2)]
hc = W1·(ReLU(W2·oin))
where W1 and W2 are trainable model parameters.
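A minimal sketch of the Compare sub-task follows; the final scoring head that turns hc into an attention distribution over context words is an assumption, since only the comparison MLP is specified above:

```python
# Sketch of the Compare sub-task: pool the two entity representations, combine
# them with the sub-question, and compare through an MLP.
import torch
import torch.nn as nn

class Compare(nn.Module):
    def __init__(self, d2: int, hidden: int = 256):
        super().__init__()
        self.w2 = nn.Linear(4 * d2, hidden)
        self.w1 = nn.Linear(hidden, d2)
        self.score = nn.Linear(d2, 1)     # assumed head producing per-word scores

    def forward(self, h, c_t, att1, att2):
        """h: (N, d2) context, c_t: (d2,) sub-question, att1/att2: (N,) entity attentions."""
        hs1 = att1 @ h                                    # representation of the first entity
        hs2 = att2 @ h                                    # representation of the second entity
        o_in = torch.cat([c_t, hs1, hs2, c_t * (hs1 - hs2)], dim=-1)
        h_c = self.w1(torch.relu(self.w2(o_in)))          # hc = W1 · ReLU(W2 · oin)
        att = torch.softmax(self.score(h * h_c).squeeze(-1), dim=0)  # final answer attention
        return att
```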
Meanwhile, each time the model calls the Find, Transfer or Compare function, the sub-question solved by the current function is computed by the sub-question generator. The computation of the sub-question representation is as follows:
qt = W1,t·qv + b1,t
cqt = W2·[c(t−1) ; qt] + b2
et,j = W4·(cqt·uj) + b4
cvt = Softmax(et)
ct = Σj cvt,j · uj
where qv denotes the question vector produced by a BiLSTM; as described in the second process, qv = BiLSTM(Q), with the final hidden states of the two directions concatenated end to end as the value of qv. ct denotes the currently computed sub-question representation, and W and b are trainable parameters. In the computation, the question representation and the previous sub-question representation are fused to obtain cqt, and the representation of the current sub-question is then computed through an attention mechanism.
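A minimal sketch of the sub-question generator follows; interpreting cqt·uj as an element-wise product before the scalar projection W4, and keeping a separate linear map W1,t per step, are assumptions consistent with the formulas above:

```python
# Sketch of the sub-question generator (assumes d2 is even for the BiLSTM split).
import torch
import torch.nn as nn

class SubQuestionGenerator(nn.Module):
    def __init__(self, d2: int, max_steps: int = 2):
        super().__init__()
        self.bilstm = nn.LSTM(d2, d2 // 2, bidirectional=True, batch_first=True)
        self.w1 = nn.ModuleList([nn.Linear(d2, d2) for _ in range(max_steps)])  # W1,t, b1,t
        self.w2 = nn.Linear(2 * d2, d2)                                          # W2, b2
        self.w4 = nn.Linear(d2, 1)                                               # W4, b4

    def forward(self, Q, u, c_prev, t: int):
        """Q: (L, d2) question encoding, u: (L, d2) question representation,
        c_prev: (d2,) previous sub-question, t: current step (0-based)."""
        _, (h_n, _) = self.bilstm(Q.unsqueeze(0))          # final hidden states, both directions
        qv = torch.cat([h_n[0, 0], h_n[1, 0]], dim=-1)     # question vector qv (d2,)
        q_t = self.w1[t](qv)                               # qt = W1,t·qv + b1,t
        cq_t = self.w2(torch.cat([c_prev, q_t], dim=-1))   # cqt = W2·[c(t-1); qt] + b2
        e_t = self.w4(cq_t * u).squeeze(-1)                # et,j = W4·(cqt ⊙ uj) + b4
        cv_t = torch.softmax(e_t, dim=0)                   # cvt = softmax(et)
        c_t = cv_t @ u                                     # ct = Σ_j cvt,j · uj
        return c_t
```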
The probability distributions obtained above, representing the computed evidence and answer, are then used as the input of the prediction layer. When the question is a bridging-entity question, the input of the prediction layer is the output of the Transfer function; when the question is a comparison question, the input of the prediction layer is the output of the Compare function. The output of the prediction layer has four dimensions: the relevant evidence sentences, the start position of the answer, the end position of the answer, and the answer type. To handle the dependencies between these outputs, the prediction layer uses a vertical structural design in which four LSTM layers that do not share parameters are stacked layer by layer. The context representation from the last round of the reasoning module is the input of the first LSTM layer, each LSTM layer outputs a probability distribution, and cross entropy is then computed from these probability distributions. The specific stacking of the LSTMs is as follows:
Osup = F0(C(t))
Ostart = F1([C(t), Osup])
Oend = F2([C(t), Osup, Ostart])
Otype = F3([C(t), Osup, Ostart])
further, F0,F1,F2,F3Respectively four multi-layer sensors, OsupIs used to predict the evidence-representing probability distribution, OstartAnd OendProbability distributions, O, for predicting the start and end positions of the answer, respectivelytypeIs the probability distribution used to predict the answer type.
Finally, the four cross-entropy loss functions are jointly optimized:
L = Lstart + Lend + λs·Lsup + λt·Ltype
where Lstart, Lend, Lsup and Ltype are the losses obtained by computing the cross entropy between Ostart, Oend, Osup, Otype and the true labels respectively, and λs and λt are the hyper-parameters weighting the evidence prediction loss and the answer type loss, respectively.
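A minimal sketch of the cascaded prediction layer and the joint loss follows; each head is modeled here as an LSTM followed by a linear scorer, and the hidden size, the way previous outputs are concatenated, and the pooling for the answer-type head are assumptions:

```python
# Sketch of the cascaded prediction heads (evidence -> start -> end / type) and joint loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionLayer(nn.Module):
    def __init__(self, d2: int, hidden: int = 128, num_answer_types: int = 3):
        super().__init__()
        def head(in_dim, out_dim):
            return nn.ModuleDict({"lstm": nn.LSTM(in_dim, hidden, batch_first=True),
                                  "proj": nn.Linear(hidden, out_dim)})
        self.f0 = head(d2, 1)                       # evidence scores
        self.f1 = head(d2 + 1, 1)                   # answer start scores
        self.f2 = head(d2 + 2, 1)                   # answer end scores
        self.f3 = head(d2 + 2, num_answer_types)    # answer type scores

    @staticmethod
    def _run(head, x):
        out, _ = head["lstm"](x.unsqueeze(0))
        return head["proj"](out.squeeze(0))

    def forward(self, C_t):
        """C_t: (N, d2) context representation from the last reasoning step."""
        o_sup = self._run(self.f0, C_t)
        o_start = self._run(self.f1, torch.cat([C_t, o_sup], dim=-1))
        o_end = self._run(self.f2, torch.cat([C_t, o_sup, o_start], dim=-1))
        o_type = self._run(self.f3, torch.cat([C_t, o_sup, o_start], dim=-1))
        return o_sup, o_start, o_end, o_type

def joint_loss(o_sup, o_start, o_end, o_type, y_sup, y_start, y_end, y_type,
               lambda_s=1.0, lambda_t=1.0):
    """L = Lstart + Lend + λs·Lsup + λt·Ltype (all cross-entropy style losses)."""
    l_start = F.cross_entropy(o_start.squeeze(-1).unsqueeze(0), y_start)
    l_end = F.cross_entropy(o_end.squeeze(-1).unsqueeze(0), y_end)
    l_sup = F.binary_cross_entropy_with_logits(o_sup.squeeze(-1), y_sup)
    l_type = F.cross_entropy(o_type.mean(dim=0, keepdim=True), y_type)
    return l_start + l_end + lambda_s * l_sup + lambda_t * l_type
```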
In terms of experimental results, the present invention was evaluated on the HotpotQA inferential reading comprehension dataset (Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, Christopher D. Manning: HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. EMNLP 2018: 2369-). There are 90247 training samples and 7405 validation samples. The statistics of the bridging-class and comparison-class questions in the dataset are shown in the table:

Data set          Bridging class   Comparison class   Total
Training set      17456            72991              90247
Validation set    1487             5918               7405

Table 1: Statistics of bridging-class and comparison-class questions in HotpotQA
The evaluation metrics of the present invention are the EM value and the F1 value. The EM value is the proportion of predicted answers that exactly match the true answers, and the F1 value comprehensively measures the precision and recall of the predicted result against the true result.
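For reference, a simplified sketch of the two metrics is given below (the official HotpotQA evaluation additionally normalizes punctuation and articles; that normalization is omitted here):

```python
# Sketch of the EM and token-level F1 metrics over answer strings.
from collections import Counter

def exact_match(prediction: str, gold: str) -> float:
    return float(prediction.strip().lower() == gold.strip().lower())

def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```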
The performance of the question classifier of the present invention is as follows:

                      Correct samples   Error samples   Accuracy
Question classifier   7375              30              99.59%

Table 2: Performance evaluation of the question classifier
The invention was compared with the mainstream methods; in the comparison, the last row is the model proposed by the invention. It can be seen that the proposed model exceeds the performance of many mainstream models, which demonstrates the effectiveness of the proposed method.

(Table: comparison of the proposed model with mainstream models on HotpotQA.)
The method of the present invention has been described in detail above by way of formulas and embodiments, but the specific implementation of the present invention is not limited thereto. Various obvious changes and modifications can be made by those skilled in the art without departing from the spirit and principles of the method of the invention. The protection scope of the present invention shall be subject to the claims.

Claims (7)

1. A question answer extraction method based on multilayer perception, comprising the following steps:
1) splicing a question with a number of target documents, inputting the result into a pre-trained language model to obtain a representation Q of the question and a context representation P of the target documents, and letting the representation Q and the context representation P interact to obtain a document-related question representation u and a document representation h fused with question information;
2) performing multilayer perception classification on the question representation u to obtain the inference type of the question, the inference types comprising: bridging-entity questions or comparison questions;
3) if the inference type is the bridging-entity class, calling a Find function according to the question representation u, the document representation h and the sub-question to generate an intermediate bridging entity att1, and calling a Transfer function according to the intermediate bridging entity att1, the question representation u, the document representation h and the sub-question ct to obtain the answer attention distribution;
if the inference type is a comparison question, calling the Find function twice according to the question representation u, the document representation h and the sub-question ct to generate intermediate bridging entities att1 and att2, and calling a Compare function to compare the intermediate bridging entities att1 and att2 to obtain the answer attention distribution;
4) obtaining an answer prediction result of the question according to the answer attention distribution.
2. The method of claim 1, wherein the target document is obtained by:
1) inputting a plurality of original documents into a paragraph selection model consisting of a BERT model and a layer of linear classifiers;
2) selecting the paragraphs related to the question in each original document according to a threshold to obtain a number of target documents.
3. A method as recited in claim 1, the pre-training language model comprising a BERT model.
4. The method of claim 1, wherein the method of interacting the representation Q with the context representation P comprises: using a bidirectional attention mechanism; the step of generating the sub-questions comprises:
1) inputting the representation Q into a BiLSTM network to obtain a question vector qv;
2) obtaining the sub-question ct from the question vector qv, the sub-question c(t-1) and the question representation u.
5. The method of claim 1, wherein obtaining the answer prediction result of the question comprises: inputting the context representation C(t) into a plurality of LSTM layers which are stacked layer by layer and do not share parameters; the answer prediction result comprises: one or more of a related evidence sentence, an answer start position, an answer end position, and an answer type.
6. A storage medium having a computer program stored thereon, wherein the computer program is arranged to, when run, perform the method of any of claims 1-5.
7. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method according to any of claims 1-5.
CN202011079727.5A 2020-10-10 2020-10-10 Question answer extraction method based on multilayer perception and electronic device Active CN112380326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011079727.5A CN112380326B (en) 2020-10-10 2020-10-10 Question answer extraction method based on multilayer perception and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011079727.5A CN112380326B (en) 2020-10-10 2020-10-10 Question answer extraction method based on multilayer perception and electronic device

Publications (2)

Publication Number Publication Date
CN112380326A CN112380326A (en) 2021-02-19
CN112380326B true CN112380326B (en) 2022-07-08

Family

ID=74581232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011079727.5A Active CN112380326B (en) 2020-10-10 2020-10-10 Question answer extraction method based on multilayer perception and electronic device

Country Status (1)

Country Link
CN (1) CN112380326B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420111B (en) * 2021-06-17 2023-08-11 中国科学院声学研究所 Intelligent question answering method and device for multi-hop reasoning problem

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674279A (en) * 2019-10-15 2020-01-10 腾讯科技(深圳)有限公司 Question-answer processing method, device, equipment and storage medium based on artificial intelligence
CN111027327A (en) * 2019-10-29 2020-04-17 平安科技(深圳)有限公司 Machine reading understanding method, device, storage medium and device
CN111339281A (en) * 2020-03-24 2020-06-26 苏州大学 Answer selection method for reading comprehension choice questions with multi-view fusion
CN111460092A (en) * 2020-03-11 2020-07-28 中国电子科技集团公司第二十八研究所 Multi-document-based automatic complex problem solving method
CN111598118A (en) * 2019-12-10 2020-08-28 中山大学 Visual question-answering task implementation method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082184A1 (en) * 2016-09-19 2018-03-22 TCL Research America Inc. Context-aware chatbot system and method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674279A (en) * 2019-10-15 2020-01-10 腾讯科技(深圳)有限公司 Question-answer processing method, device, equipment and storage medium based on artificial intelligence
CN111027327A (en) * 2019-10-29 2020-04-17 平安科技(深圳)有限公司 Machine reading understanding method, device, storage medium and device
CN111598118A (en) * 2019-12-10 2020-08-28 中山大学 Visual question-answering task implementation method and system
CN111460092A (en) * 2020-03-11 2020-07-28 中国电子科技集团公司第二十八研究所 Multi-document-based automatic complex problem solving method
CN111339281A (en) * 2020-03-24 2020-06-26 苏州大学 Answer selection method for reading comprehension choice questions with multi-view fusion

Also Published As

Publication number Publication date
CN112380326A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
US11593631B2 (en) Explainable transducer transformers
Lin et al. Deep learning for missing value imputation of continuous data and the effect of data discretization
Craven et al. Using neural networks for data mining
US11334791B2 (en) Learning to search deep network architectures
CN112380835B (en) Question answer extraction method integrating entity and sentence reasoning information and electronic device
Wang et al. Tensor networks meet neural networks: A survey and future perspectives
CN113412492A (en) Quantum algorithm for supervised training of quantum Boltzmann machine
CN115687609A (en) Zero sample relation extraction method based on Prompt multi-template fusion
CN112380326B (en) Question answer extraction method based on multilayer perception and electronic device
Eyraud et al. TAYSIR Competition: Transformer+\textscrnn: Algorithms to Yield Simple and Interpretable Representations
Lu Learning Guarantees for Graph Convolutional Networks in the Stochastic Block Model
Chien et al. Bayesian multi-temporal-difference learning
Li et al. A hint from arithmetic: On systematic generalization of perception, syntax, and semantics
Anireh et al. HTM-MAT: An online prediction software toolbox based on cortical machine learning algorithm
Abuelenin et al. Optimizing deep learning based on deep auto encoder and genetic algorithm
Tran Unsupervised neural-symbolic integration
CN114065769B (en) Method, device, equipment and medium for training emotion reason pair extraction model
Cameron Information compression of molecular representations using neural network auto-encoders
Gangal et al. Neural Computing
Matovič et al. Establishing Pattern Sequences Using Artificial Neural Networks with an Application to Organizational Patterns
Karthika Renuka et al. Visual Question Answering System Using Co-attention Model
Ha et al. Evolving multi-view autoencoders for text classification
Busireddy A Framework for Question Answering System Using Dynamic Co-attention Networks
Jiang Transfer Learning of Image Classification with Deep Learning Architectures
Panayiotou Molecular Graph Learning in the Optimal Transport Geometry

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant