CN112434152B

CN112434152B - Education choice question answering method and device based on multi-channel convolutional neural network

Info

Publication number: CN112434152B
Application number: CN202011384874.3A
Authority: CN
Inventors: 来雨轩; 张晨; 冯岩松; 贾爱霞; 赵东岩
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2020-12-01
Filing date: 2020-12-01
Publication date: 2022-10-14
Anticipated expiration: 2040-12-01
Also published as: CN112434152A

Abstract

The invention discloses a method and a device for answering primary-education multiple-choice questions based on a multi-channel convolutional neural network. The method comprises the following steps: 1) given a multiple-choice question presented in text form, supplementing each option into an assertion, retrieving evidence for each assertion from a subject knowledge base, and screening the retrieved evidence through a bridging rule to obtain high-confidence evidence; 2) processing the question information and the high-confidence evidence with a multi-channel convolutional neural network to obtain confidence competition results between the options; 3) determining the best option according to the confidence competition results between the options. The invention retrieves high-confidence evidence from the subject knowledge base using a bridging attention mechanism, then processes the question and the evidence simultaneously with a gated multi-channel convolutional neural network to obtain comparison scores between options, and determines the best option from the scores accumulated over all option pairs, so that a machine can answer subject-specific multiple-choice questions at the primary education stage with good performance.

Description

Method and device for answering multiple-choice questions in education based on multi-channel convolutional neural network

Technical Field

The invention belongs to the field of natural language question answering and relates to a solver for primary-education multiple-choice questions based on a multi-channel convolutional neural network. The solver retrieves high-confidence evidence from a subject knowledge base using a bridging attention mechanism, then processes the question and the evidence simultaneously with a gated multi-channel convolutional neural network to obtain comparison scores between options, and determines the best option from the scores accumulated over all option pairs, so that a machine can answer subject-specific multiple-choice questions at the primary education stage with good performance.

Background Art

With the development of machine learning and artificial intelligence, machines have achieved excellent performance on many natural language processing tasks and even approach human performance on some of them. Machine question answering is one of the fastest-developing of these fields. The task requires a model to automatically answer questions posed in human natural language, and it is one of the standards for measuring a machine's ability to understand human language.

The multiple-choice question is an important question type at the primary education stage for comprehensively assessing how well students have mastered the knowledge of each subject. Its general form is: given a textual description of the question (sometimes accompanied by figures or tables) and several candidate options, the examinee must understand the question's instructions and select the most appropriate candidate as the answer. Multiple-choice questions in examination papers at this stage cover a wide range of knowledge, are difficult to answer, and have relatively fair evaluation criteria, which makes them well suited to testing a machine's natural language understanding. How to make machines perform well on multiple-choice questions in primary-education subjects has therefore become an important topic in natural language processing.

Neural network techniques are often used to let machines solve such questions. Neural networks are widely used in natural language processing and can extract high-level features from text through large network structures. A convolutional neural network is a deep neural network that uses convolution operations; it models text well and performs stably and strongly. In machine question answering, convolutional neural networks are often used for text feature extraction. However, the knowledge contained in a neural network is relatively limited, so to make a model perform well on knowledge-intensive, subject-specific question answering, external knowledge and evidence such as term bases and knowledge bases are often introduced to assist the network.

Summary of the Invention

The purpose of the present invention is to provide a method and a device that use a gated multi-channel convolutional neural network, combined with retrieval and a bridging evidence screening mechanism, to better introduce subject knowledge evidence into the neural network, so that a machine performs well on multiple-choice questions in primary-education subjects. That is, for a subject-specific multiple-choice question at the primary education stage, high-confidence evidence is retrieved from a subject knowledge base, a convolutional neural network then processes the question information and the evidence, and the best answer is obtained through competition among the options.

To achieve the above purpose, the technical solution of the present invention is as follows:

A method for answering multiple-choice questions in primary-education subjects based on a multi-channel convolutional neural network, comprising the following steps:

given a multiple-choice question presented in text form (comprising a question and several options), supplementing each option into an assertion;

retrieving evidence for each assertion from a subject knowledge base, and screening the retrieved evidence through bridging rules to obtain high-confidence evidence;

processing the question information and the high-confidence evidence with a multi-channel convolutional neural network to obtain confidence competition results between the options;

determining the best option according to the confidence competition results between the options.

Further, supplementing each option of the given text-form multiple-choice question into an assertion comprises: for a multiple-choice question consisting of a question q and n options {op₁, op₂, ..., op_n}, cleaning the question with rules to delete redundant expressions such as "as shown in the figure" (如图) and "the correct option is" (正确选项是); concatenating the question q with each option in {op₁, op₂, ..., op_n} to generate the assertions {a₁, a₂, ..., a_n}; and labeling the correctness of each assertion according to information such as the reference answer and any negation words in the question.
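The cleaning and concatenation steps can be illustrated with a minimal Python sketch. The phrase list holds only the two redundant expressions named in the text; the function names, the stripped punctuation set and the sample question are hypothetical, and correctness labeling is left out because the text does not spell out its rules.

```python
# Redundant expressions named in the text; a real rule set would likely be longer (assumption).
REDUNDANT_PHRASES = ["如图", "正确选项是"]

def clean_question(question: str) -> str:
    """Delete redundant expressions, then trim leftover leading/trailing punctuation."""
    for phrase in REDUNDANT_PHRASES:
        question = question.replace(phrase, "")
    # The set of characters stripped here is an illustrative choice, not from the source.
    return question.strip(" ，,：:")

def build_assertions(question: str, options: list[str]) -> list[str]:
    """Concatenate the cleaned question q with each option op_i to form assertion a_i."""
    cleaned = clean_question(question)
    return [cleaned + option for option in options]

# Hypothetical sample question with two options.
assertions = build_assertions("如图，我国面积最大的省级行政区是", ["新疆", "西藏"])
```

Each resulting assertion is a declarative sentence that can be used directly as a retrieval query in the next step.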

Further, retrieving evidence for each assertion from the subject knowledge base comprises: according to the subject that the question involves, collecting text related to that subject from resources such as textbooks and online encyclopedias to build a subject knowledge base K for retrieval, which provides supporting evidence for the machine when answering multiple-choice questions. For each assertion in {a₁, a₂, ..., a_n}, m pieces of evidence {k₁, k₂, ..., k_m} with high text similarity are retrieved from the subject knowledge base K.
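As a rough illustration of retrieving the m most similar knowledge-base entries for an assertion, the sketch below ranks entries by Jaccard overlap of character bigrams. This similarity measure is a stand-in chosen for brevity and is not the retrieval engine of the invention; the function names and sample knowledge base are hypothetical.

```python
def char_bigrams(text: str) -> set[str]:
    """Set of adjacent character pairs, a simple unit for Chinese text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def retrieve(assertion: str, knowledge_base: list[str], m: int) -> list[str]:
    """Return the m knowledge-base entries most similar to the assertion."""
    query = char_bigrams(assertion)
    ranked = sorted(knowledge_base,
                    key=lambda entry: jaccard(query, char_bigrams(entry)),
                    reverse=True)
    return ranked[:m]

# Hypothetical three-sentence knowledge base K.
K = ["新疆是我国面积最大的省级行政区", "长江是我国最长的河流", "北京是首都"]
top = retrieve("我国面积最大的省级行政区是新疆", K, 2)
```

A production system would replace `jaccard` with the scoring of a full-text search engine; only the top-m interface matters for the later screening step.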

Further, screening the retrieved evidence through bridging rules to obtain high-confidence evidence comprises: for the m pieces of evidence {k₁, k₂, ..., k_m} retrieved for each assertion, scoring each piece of evidence with the bridging mechanism and selecting the l pieces with the highest scores as the high-confidence evidence {k′₁, k′₂, ..., k′_l} used afterwards. For an assertion a consisting of question q and option op, the bridging attention score of a word w_i in evidence k is:

where q_w and op_w are the sets of words in question q and option op respectively, and the cos function computes the cosine similarity between the word vectors of two words. If a word occurs repeatedly in the question and options, the computed score is reduced by a fixed factor (for example, 0.9) for each additional occurrence, so that words occurring many times do not receive excessively high scores. The score of an entire piece of evidence k is the weighted average of the scores of its t highest-scoring words, computed as:

where the indicator function in the formula marks whether the score of w_i ranks within the top t (for example, t = 5), and pow(x, y) is the function that computes x raised to the power y.
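The evidence-level aggregation can be sketched in Python under explicit assumptions. The scoring formulas themselves are not reproduced in this text, so the sketch takes per-word bridging scores as given input, and the rank-based weights `pow(weight_base, rank)` are a hypothetical choice that merely uses `pow` in the way the description suggests. Only the cosine similarity and the "top t words, then top l pieces of evidence" structure come from the text.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two word vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def evidence_score(word_scores: list[float], t: int = 5, weight_base: float = 0.9) -> float:
    """Weighted average of the t highest word scores.

    The weights pow(weight_base, rank) are an assumption, not the patent's formula.
    """
    top = sorted(word_scores, reverse=True)[:t]
    if not top:
        return 0.0
    weights = [pow(weight_base, rank) for rank in range(len(top))]
    return sum(w * s for w, s in zip(weights, top)) / sum(weights)

def select_top_l(scores_by_evidence: dict[str, list[float]], l: int = 5, t: int = 5) -> list[str]:
    """Keep the l pieces of evidence with the highest aggregated scores."""
    ranked = sorted(scores_by_evidence,
                    key=lambda name: evidence_score(scores_by_evidence[name], t),
                    reverse=True)
    return ranked[:l]

# Hypothetical per-word bridging scores for three retrieved pieces of evidence.
kept = select_top_l({"k1": [0.9, 0.8], "k2": [0.2, 0.1], "k3": [0.5, 0.5]}, l=2)
```

Restricting the average to the top t words keeps long evidence sentences from being penalized for the many words unrelated to the question.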

For each assertion, the l pieces of high-confidence evidence obtained are combined, yielding the evidence {e₁, e₂, ..., e_n} that is fed to the neural network for each assertion, where e_i (1≤i≤n) is the set of supporting evidence for assertion a_i (1≤i≤n).

Further, after the retrieved evidence has been screened through the bridging rules, the data is segmented into words with a text-matching segmentation method (such as forward maximum matching) using a vocabulary that contains the terms of the relevant subject. To reduce the complexity of the dataset and increase data consistency, out-of-vocabulary words are post-processed according to their part of speech.
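Forward maximum matching, named above as an example segmentation method, can be sketched in a few lines of Python. The vocabulary and sample sentence are hypothetical; characters not covered by any vocabulary entry fall back to single-character tokens.

```python
def forward_max_match(text: str, vocab: set[str]) -> list[str]:
    """Segment text left to right, always taking the longest vocabulary match."""
    max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a vocabulary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical geography vocabulary.
geo_vocab = {"地中海", "气候", "地中海气候", "夏季"}
tokens = forward_max_match("地中海气候夏季炎热", geo_vocab)
```

Because the longest match wins, the subject term "地中海气候" is kept whole rather than split into "地中海" and "气候", which is the reason for segmenting with a subject vocabulary.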

Further, processing the question information and the high-confidence evidence with the multi-channel convolutional neural network to obtain confidence competition results between the options means using a gated multi-channel convolutional neural network model to process the question and the evidence and to produce a comparison score for each pair of options. Specifically, the network architecture is as follows:

1) In the embedding layer of the model, each word in the question q, the two options to be compared op₁, op₂, the corresponding assertions a₁, a₂ and the evidence e₁, e₂ is encoded as a word vector. The vector representation may be a pre-trained, low-dimensional, dense semantic vector, such as a word vector trained with a neural language model (for example Word2vec or GloVe) or a word vector obtained by reducing the dimensionality of a high-dimensional matrix with a method such as singular value decomposition (SVD), as in latent semantic analysis (LSA); it may also be an original high-dimensional sparse vector such as a one-hot vector.

2) In the convolutional layer of the model, the word-vector representations of the question q, the options op₁, op₂, the assertions a₁, a₂ and the evidence e₁, e₂ are processed simultaneously in separate channels by a convolutional neural network with multiple kernels and multiple strides. A multi-layer convolutional neural network with residual connections may be used. A residual connection adds the output of the previous convolutional layer directly to the output of the current convolutional layer, and the sum replaces the current layer's output in subsequent processing; if the feature dimensions of the two tensors being added differ, one is usually passed through a convolutional layer of width 1 to adjust its dimension. Such connections let a deep convolutional network adapt its effective depth to some extent during learning and reduce the effect of network depth on gradient propagation. Between layers, additional pooling layers (for example max pooling or average pooling) may be added to reduce the size of the feature matrix, and the final output is pooled into a vector. The question q, the options op₁, op₂ and the assertions a₁, a₂ share one set of convolutional neural network parameters, while the evidence e₁, e₂ uses another set. After the pooling that follows the last convolutional layer, a gating mechanism based on element-wise multiplication is applied to the outputs. Specifically, the output of the question q gates the outputs of the options op₁ and op₂ separately, the output of assertion a₁ gates the convolutional output of evidence e₁, and the output of assertion a₂ gates the output of evidence e₂.
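The residual addition, pooling and gating operations described in item 2) can be illustrated with plain Python lists. This is a sketch of the three tensor operations only, not of the convolutions themselves; the function names and the toy vectors are hypothetical, and feature maps are assumed to be lists of positions, each holding one value per kernel.

```python
def residual_add(previous: list[list[float]], current: list[list[float]]) -> list[list[float]]:
    """Add the previous layer's output to the current layer's output, position by position."""
    return [[p + c for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]

def max_pool_over_positions(feature_map: list[list[float]]) -> list[float]:
    """Reduce a (positions x kernels) feature map to one maximum per kernel."""
    return [max(column) for column in zip(*feature_map)]

def gate(controller: list[float], target: list[float]) -> list[float]:
    """Element-wise multiplication: the controller vector gates the target vector."""
    return [c * t for c, t in zip(controller, target)]

def gate_all(q, op1, op2, a1, e1, a2, e2):
    """Wiring from the text: q gates both options, each assertion gates its own evidence."""
    return gate(q, op1), gate(a1, e1), gate(q, op2), gate(a2, e2)

# Toy pooled vectors with two kernels each.
g_op1, g_e1, g_op2, g_e2 = gate_all(
    q=[0.5, 1.0], op1=[4.0, 2.0], op2=[2.0, 6.0],
    a1=[1.0, 0.0], e1=[3.0, 3.0], a2=[0.5, 0.5], e2=[8.0, 2.0])
```

The four gated vectors returned by `gate_all` are the inputs that the output layer described next concatenates per option.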

3) In the output layer of the model, of the four gated vector representations obtained in the previous step, the two vectors corresponding to each option are concatenated and passed through a fully connected layer; the output vectors of these fully connected layers are then concatenated and passed through further fully connected layers (the last of which has output dimension 2), finally giving the competition scores of the two options.

Further, the best option is determined from the pairwise competition results obtained in the previous step. Denoting the competition score of option op_i (1≤i≤n) against option op_j (1≤j≤n, j≠i) obtained in the previous step as P(i>j), the final cumulative comparison score final_i of option op_i is:

For single-answer questions (where each choice directly states an answer), the option with the highest cumulative comparison score is selected as the best answer. For multiple-answer questions (where the question first gives several numbered statements and each choice is a combination of some of those numbers), the cumulative comparison scores of the statements numbered in each choice are summed, and the choice with the highest total is taken as the best answer.
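The selection logic can be sketched as follows. Since the formula for final_i is not reproduced in this text, the sketch assumes the simplest reading, namely that final_i is the sum of P(i>j) over all j≠i; that assumption, the function names and the sample scores are all hypothetical.

```python
def cumulative_scores(pairwise: dict[tuple[int, int], float], n: int) -> list[float]:
    """final_i under the ASSUMED form: sum of P(i>j) over all j != i."""
    return [sum(pairwise[(i, j)] for j in range(n) if j != i) for i in range(n)]

def best_single(scores: list[float]) -> int:
    """Single-answer question: index of the option with the highest cumulative score."""
    return max(range(len(scores)), key=scores.__getitem__)

def best_combination(scores: list[float], choices: dict[str, list[int]]) -> str:
    """Multiple-answer question: sum the scores of the statements listed in each choice."""
    return max(choices, key=lambda label: sum(scores[i] for i in choices[label]))

# Hypothetical pairwise competition scores P(i>j) for three statements.
P = {(0, 1): 0.8, (1, 0): 0.2, (0, 2): 0.6, (2, 0): 0.4, (1, 2): 0.3, (2, 1): 0.7}
final = cumulative_scores(P, 3)
single_answer = best_single(final)
combo_answer = best_combination(final, {"A": [0, 1], "B": [0, 2], "C": [1, 2]})
```

Comparing options pairwise and then accumulating lets every option be judged against every other one rather than scored in isolation.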

The present invention also provides a device, using the above method, for answering primary-education multiple-choice questions based on a multi-channel convolutional neural network, which includes:

an assertion generation module, for supplementing each option of a given text-form multiple-choice question into an assertion;

an evidence retrieval and screening module, for retrieving evidence for each assertion from a subject knowledge base and screening the retrieved evidence through bridging rules to obtain high-confidence evidence;

a comparative scoring module, for processing the question information and the high-confidence evidence with a multi-channel convolutional neural network to obtain confidence competition results between the options;

a best-option judgment module, for determining the best option according to the confidence competition results between the options.

Some modules of the above device are optional, and the device still works after some components are removed or modified. For example, after the components that introduce evidence information (the evidence retrieval and screening module and part of the comparative scoring module) are removed, the device can still answer educational multiple-choice questions.

The beneficial effects of the present invention are as follows:

For a subject-specific multiple-choice question at the primary education stage, the invention retrieves high-confidence evidence from a subject knowledge base, then processes the question information and the evidence with a convolutional neural network and obtains the best answer through competition among the options, achieving good performance on multiple-choice questions across subjects at this stage. The bridging attention mechanism uses the semantic relationships among question, options and evidence to screen for high-confidence evidence, improving the relevance of the retrieved evidence to the question. The multi-layer multi-channel convolutional neural network has strong representational power and extracts deep features of the text from different perspectives. The gating mechanism introduces the retrieved evidence into the neural network and realizes semantic interaction between question and options and between assertions and evidence, so that the model ultimately performs well on multiple-choice questions in primary-education subjects.

Description of the Drawings

FIG. 1 is a framework diagram of the method for answering primary-education multiple-choice questions in an embodiment of the present invention.

FIG. 2 is a framework diagram of the neural network in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

The example of the present invention is based on a dataset of multiple-choice questions from geography college entrance examination papers and geography mock examination papers. Those skilled in the art should clearly understand that other candidate information sets and question sets may also be used in a specific implementation.

Specifically, the example draws on 239 sets of geography examination papers containing 4226 geography multiple-choice questions in total, each consisting of a question and four options. Some questions include figures or tables, but this method does not process them.

FIG. 1 shows the framework of the method for answering primary-education multiple-choice questions in the embodiment, and FIG. 2 shows the neural network framework. The specific steps are as follows:

Step 1: Preprocess the multiple-choice question dataset through assertion generation, evidence retrieval, attention-mechanism screening, and out-of-vocabulary word processing.

Specifically, the questions are cleaned with rules to delete redundant expressions such as "as shown in the figure" and "the correct option is". The question is concatenated with each option to generate assertions. The correctness of each assertion is labeled according to information such as the reference answer and any negation words in the question.

The required knowledge text is collected from geography textbooks and from related pages of Baidu Baike and Wikipedia, and the subject knowledge base is built with Lucene. For each assertion, retrieval produces 50 results, which are scored with the bridging attention mechanism and re-ranked; the top 5 results are taken as high-confidence evidence.

After the data has been segmented by forward maximum matching with a geography vocabulary, post-processing is performed: out-of-vocabulary words tagged NN or NR (common nouns, proper nouns) are backed off to a special ID; punctuation and out-of-vocabulary words that contain no Chinese characters are assigned a special ID; out-of-vocabulary words tagged NT or CD (temporal nouns, numerals) are backed off to a part-of-speech ID, whose word vector is set to a randomly initialized vector.
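The out-of-vocabulary back-off rules can be sketched as a small mapping function. The placeholder strings, the order in which the rules are checked, and the fallback for parts of speech that the text does not mention are all assumptions made for illustration.

```python
UNK_NOUN = "<UNK>"      # hypothetical special ID for NN/NR out-of-vocabulary words
NON_CHINESE = "<SYM>"   # hypothetical special ID for punctuation and non-Chinese tokens

def contains_chinese(token: str) -> bool:
    """True if any character lies in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in token)

def map_token(token: str, pos: str, vocab: set[str]) -> str:
    """Map a token to itself if known, otherwise to a back-off ID."""
    if token in vocab:
        return token
    if pos in ("NT", "CD"):        # temporal nouns, numerals: back off to a POS ID
        return f"<POS:{pos}>"
    if pos in ("NN", "NR"):        # common and proper nouns: special ID
        return UNK_NOUN
    if not contains_chinese(token):  # punctuation and tokens without Chinese characters
        return NON_CHINESE
    return UNK_NOUN                # other parts of speech: not specified in the text (assumption)

vocab = {"气候"}
mapped = [map_token(tok, pos, vocab)
          for tok, pos in [("气候", "NN"), ("喀斯特", "NN"), ("1998年", "NT"), ("，", "PU")]]
```

Collapsing rare tokens into a handful of IDs shrinks the embedding table and gives unseen dates and numbers a shared, trainable representation.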

Step 2: Process the question and the evidence with the gated multi-channel convolutional neural network model to obtain a competition score for every pair of options.

In the embedding layer of the model, pre-trained word vectors are used to embed the words of the question, options, assertions and evidence.

In the convolutional layer of the model, a two-layer convolutional neural network processes the word-embedding representations of the question, options, assertions and evidence in multiple channels. There are 1280 convolution kernels in total: 512 of size 100*1, 512 of size 100*2 and 256 of size 100*3. The convolutional networks of the question, option and assertion channels share parameters, and the convolutional network of the evidence channel uses another set of parameters. A residual connection is used between the two convolutional layers, and all pooling layers use max pooling. In the convolutional layer, the question and assertion channels are activated with a sigmoid function before pooling. The gating mechanism uses element-wise multiplication: the output representation of the question gates the output representations of the two options separately, and the output representation of each assertion gates the output representation of its corresponding evidence.

In the output layer, the two vectors corresponding to each option are concatenated and passed through a fully connected layer; the resulting output vectors are then concatenated and passed through further fully connected layers, finally giving the comparison score between the two options. Specifically, the two gated output vectors corresponding to each option are first concatenated, and the resulting 2560-dimensional vector is passed through a fully connected network with output dimension 512; the two 512-dimensional outputs are then concatenated and passed in turn through fully connected networks with output dimensions 1024 and 2. All of these fully connected layers use 50% dropout to prevent overfitting, and all use ReLU as the activation function.
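The dimensions quoted in the embodiment can be checked with a few lines of Python bookkeeping. The sketch assumes that each convolution kernel contributes exactly one value after max pooling, which is consistent with the 1280-kernel and 2560-dimension figures in the text; the variable names are hypothetical.

```python
# Kernel counts per kernel width, as stated for the convolutional layer.
kernels_per_width = {1: 512, 2: 512, 3: 256}

# Assumption: one max-pooled value per kernel, so each channel yields this many features.
pooled_dim = sum(kernels_per_width.values())

# Each option contributes two gated vectors (gated option, gated evidence), concatenated.
per_option_input = 2 * pooled_dim
per_option_hidden = 512

# The two per-option hidden vectors are concatenated before the final layers.
pair_input = 2 * per_option_hidden
final_layer_dims = [pair_input, 1024, 2]

def dense_shapes(dims: list[int]) -> list[tuple[int, int]]:
    """(input, output) sizes of consecutive fully connected layers."""
    return list(zip(dims, dims[1:]))

shapes = dense_shapes(final_layer_dims)
```

The final output of size 2 corresponds to the two competition scores of the compared option pair.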

Step 3: From the comparison scores between options obtained in the previous step, compute the final score of each choice with the following formula and select the choice with the highest score as the final answer. Denoting the competition score of option op_i (1≤i≤n) against option op_j (1≤j≤n, j≠i) obtained in the previous step as P(i>j), the final cumulative comparison score final_i of option op_i is:

For the college entrance examination geography multiple-choice dataset used, accuracy is reported, that is, the frequency with which the option given the highest score by the model is the standard answer. The results are shown in Table 1. Questions without figures are those whose original form had no accompanying picture; questions with figures are those whose original form had one. The model answers from text alone and has no dedicated module for processing picture information; questions that originally had figures are treated as if they had none and are also used for model training.

Table 1: Performance of the solver on college entrance examination geography multiple-choice questions

Overall, the model under both settings improves considerably on the baseline of choosing an answer at random (25% accuracy).

Another embodiment of the present invention provides a device, using the method of the present invention, for answering primary-education multiple-choice questions based on a multi-channel convolutional neural network, which includes:

For the specific implementation of the above modules, see the foregoing description of the method of the present invention.

Another embodiment of the present invention provides an electronic device (a computer, a server, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program including instructions for performing the steps of the method of the present invention.

Another embodiment of the present invention provides a computer-readable storage medium (such as ROM/RAM, a magnetic disk or an optical disc) storing a computer program which, when executed by a computer, implements the steps of the method of the present invention.

The parts of the present invention that are not described in detail are well known to those skilled in the art.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims

1. A method for answering educational multiple-choice questions based on a multi-channel convolutional neural network, characterized by comprising the following steps:

given a multiple-choice question presented in text form, supplementing each option into an assertion;

retrieving evidence for each assertion from a subject knowledge base, and screening the retrieved evidence through bridging rules to obtain high-confidence evidence;

processing the question information and the high-confidence evidence with a multi-channel convolutional neural network to obtain confidence competition results between the options;

determining the best option according to the confidence competition results between the options;

The subject knowledge base is used to retrieve each assertion, and the retrieved evidence is screened through bridging rules to obtain high-confidence evidence, including:

For each assertion, retrieve _m pieces of evidence {k ₁ ,k ₂ ,...,km } whose similarity of text meets the set threshold from the subject knowledge base; use bridging mechanism for each piece of evidence in the m pieces of evidence. Score, and select l with the highest score as high-confidence evidence {k′ ₁ ,k′ ₂ ,...,k _l ′}; for assertion a composed of question q and option op, one of the evidence k The bridge attention score for word _wi is:

Among them, q _w and op _w are the word sets formed by question q and option op, respectively, and the cos function is used to calculate the cosine similarity of the word vectors of the two words; if the words in the question and the option appear repeatedly, each time they appear more Once, the calculated score is decreased at a rate of 0.9; the score of the entire piece of evidence k is obtained by the weighted average of the scores of the highest scoring t words. The weighted calculation formula is:

where the indicator term determines whether the score of w_i ranks among the top t, and pow(x, y) is the function that raises x to the power y;

For each assertion, the l pieces of high-confidence evidence obtained are combined, yielding the evidence {e₁, e₂, ..., e_n} to be input to the neural network, where e_i is the group of supporting evidence for assertion a_i, 1≤i≤n;

Using the multi-channel convolutional neural network to process the question information and the high-confidence evidence to obtain the confidence competition results between options comprises:

In the embedding layer of the model, each word in the question q, the two options to be compared op₁, op₂, the corresponding assertions a₁, a₂, and the evidence e₁, e₂ is encoded as a word vector;

In the convolutional layer of the model, the word-vector representations of question q, options op₁, op₂, assertions a₁, a₂, and evidence e₁, e₂ are processed simultaneously in separate channels by a multi-kernel, multi-stride convolutional neural network; multiple convolutional layers with residual connections are used, an additional pooling layer is inserted between layers to reduce the size of the feature matrix, and the final output vector is pooled; the processing of question q, options op₁, op₂, and assertions a₁, a₂ shares one set of CNN parameters, while the processing of evidence e₁, e₂ uses another set of CNN parameters; after pooling in the convolutional layer, a gating mechanism of element-wise multiplication is applied to the outputs;

In the output layer of the model, for the four vector representations produced by the gating mechanism, the two vectors corresponding to each option are concatenated and passed through a fully connected layer; the outputs of that layer are then concatenated and passed through a fully connected layer again, finally yielding a competition score for the two options.
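The evidence-screening step of claim 1 can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the claimed method: the per-word bridge attention scores are taken as given inputs (`raw_scores`), the 0.9 decay is read as applying to each repeated occurrence of the same token within a piece of evidence, and a plain mean of the top t word scores stands in for the claim's weighted average. All function names are hypothetical.

```python
from collections import Counter

DECAY = 0.9  # decay rate stated in the claim


def decay_repeats(tokens, raw_scores):
    """Scale the k-th repeat of a token by DECAY**k (first occurrence: k = 0).

    Assumption: the decay applies per repeated token inside one evidence text.
    """
    seen = Counter()
    decayed = []
    for token, score in zip(tokens, raw_scores):
        decayed.append(score * DECAY ** seen[token])
        seen[token] += 1
    return decayed


def evidence_score(tokens, raw_scores, t):
    """Plain mean of the t highest decayed word scores.

    Assumption: a stand-in for the claim's weighted average.
    """
    top = sorted(decay_repeats(tokens, raw_scores), reverse=True)[:t]
    return sum(top) / len(top) if top else 0.0


def select_high_confidence(evidence, t, l):
    """Keep the l best-scoring pieces out of the m retrieved ones.

    evidence: list of (text, tokens, raw_scores) triples.
    """
    ranked = sorted(evidence,
                    key=lambda e: evidence_score(e[1], e[2], t),
                    reverse=True)
    return [text for text, _, _ in ranked[:l]]


retrieved = [
    ("k1", ["a", "b", "a"], [1.0, 0.5, 1.0]),
    ("k2", ["c"], [0.2]),
    ("k3", ["d", "e"], [0.8, 0.9]),
]
kept = select_high_confidence(retrieved, t=2, l=2)
```

Scoring only the top t words keeps long evidence texts from being penalized for the many words that relate to neither the question nor the option.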

2. The method for answering educational multiple-choice questions based on a multi-channel convolutional neural network as claimed in claim 1, wherein, given a multiple-choice question presented in text form, completing each option into an assertion comprises: cleaning the question with rules to delete redundant expressions, including "as shown in the figure" and "the correct option is"; concatenating the question with each option to generate assertions; and marking the correctness of each assertion with reference to the answer and the negation words in the question.
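The three steps of claim 2 (rule-based cleaning, concatenation, correctness labeling) might look like the following Python sketch. The phrase list, the negation cues, and the function names are illustrative assumptions; the claim itself names only the two redundant expressions quoted above, and a real system would operate on Chinese text.

```python
import re

# The two expressions named in the claim; a real rule set would be larger.
REDUNDANT = ["as shown in the figure", "the correct option is"]
# Assumed negation cues, not taken from the patent.
NEGATIONS = ["not", "incorrect", "wrong"]


def clean_question(question):
    """Delete redundant expressions and tidy leftover punctuation."""
    for phrase in REDUNDANT:
        question = re.sub(re.escape(phrase), "", question, flags=re.IGNORECASE)
    question = re.sub(r"\s+", " ", question)
    return question.strip(" ,:;()")


def build_assertions(question, options, answer_key):
    """Return (label, assertion text, is_correct) for every option."""
    stem = clean_question(question)
    negated = any(re.search(rf"\b{re.escape(word)}\b", stem, re.IGNORECASE)
                  for word in NEGATIONS)
    assertions = []
    for label, text in options.items():
        is_answer = label == answer_key
        # When the question asks for the wrong statement, the keyed answer
        # is the false assertion and every other option is a true one.
        assertions.append((label, f"{stem} {text}", is_answer != negated))
    return assertions


plain = build_assertions(
    "As shown in the figure, the largest planet in the solar system is",
    {"A": "Jupiter", "B": "Mars"},
    "A",
)
```

Flipping the label when the question contains a negation cue lets negatively phrased questions contribute several true assertions and one false one, instead of the reverse.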

3. The method for answering educational multiple-choice questions based on a multi-channel convolutional neural network as claimed in claim 1, wherein the subject knowledge base is constructed by: collecting, according to the subject that the questions involve, text information related to that subject from textbooks and network resources, and building a subject knowledge base for retrieval.

4. The method for answering educational multiple-choice questions based on a multi-channel convolutional neural network according to claim 1, characterized in that, after the retrieved evidence is screened by the bridging rules, word segmentation is performed with a text-matching-based word segmentation method using a vocabulary containing the corresponding subject terms, and corresponding post-processing is applied to out-of-vocabulary words of different parts of speech and to words with special parts of speech.
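Claim 4 does not say which text-matching segmentation algorithm is used. As one common instance, the Python sketch below uses forward maximum matching against a vocabulary of subject terms; the choice of algorithm, the function name, and the sample vocabulary are assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy longest-match segmentation against a term vocabulary.

    At each position the longest vocabulary entry is taken; a character
    that starts no entry is emitted alone, which marks it as
    out-of-vocabulary for later post-processing.
    """
    max_len = max_len or max(map(len, vocab), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, falling back to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical biology vocabulary; "在" is deliberately left out.
vocab = {"光合作用", "叶绿体", "发生"}
tokens = forward_max_match("光合作用发生在叶绿体", vocab)
```

Including subject terms such as 光合作用 (photosynthesis) in the vocabulary keeps them as single tokens, so their word vectors can be matched directly against question and option words.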

5. The method for answering educational multiple-choice questions based on a multi-channel convolutional neural network as claimed in claim 1, wherein determining the best option according to the confidence competition results between options comprises:

The competition score of option op_i relative to option op_j is denoted P(i>j), where 1≤i≤n, 1≤j≤n, j≠i, and n is the number of options; the final cumulative comparison score final_i of option op_i is then:

For single-choice questions, the option with the highest cumulative comparison score is selected as the best answer; for combination-type multiple-choice questions, the cumulative comparison scores of the numbered statements in each option are summed, and the option with the highest total score is taken as the best answer.
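The decision rule of claim 5 can be sketched in Python as follows. The formula for final_i is not reproduced in this text, so the sketch assumes the simplest reading, in which final_i is the sum of P(i>j) over all j≠i; the function names and the sample scores are hypothetical.

```python
def cumulative_scores(P):
    """final_i as the sum of P[i][j] over j != i (assumed form)."""
    n = len(P)
    return [sum(P[i][j] for j in range(n) if j != i) for i in range(n)]


def best_single_choice(P):
    """Index of the option with the highest cumulative comparison score."""
    finals = cumulative_scores(P)
    return max(range(len(finals)), key=finals.__getitem__)


def best_combination_choice(statement_finals, branches):
    """Pick the option whose numbered statements have the highest total.

    branches maps an option label to the indices of its statements.
    """
    totals = {label: sum(statement_finals[k] for k in idxs)
              for label, idxs in branches.items()}
    return max(totals, key=totals.get)


# P[i][j] is the competition score of option i against option j.
P = [
    [0.0, 0.7, 0.6],
    [0.3, 0.0, 0.4],
    [0.4, 0.6, 0.0],
]
finals = cumulative_scores(P)
single = best_single_choice(P)
combo = best_combination_choice(finals, {"A": [0, 1], "B": [0, 2], "C": [1, 2]})
```

Accumulating over all pairs means an option wins only by outscoring its competitors consistently, not by one favorable comparison.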

6. A device for answering educational multiple-choice questions based on a multi-channel convolutional neural network, using the method according to any one of claims 1 to 5, characterized in that it comprises:

An assertion generation module, used for completing each option of a given multiple-choice question presented in text form into an assertion;

An evidence retrieval and screening module, used for retrieving each assertion using the subject knowledge base, and filtering the retrieved evidence through bridging rules to obtain high-confidence evidence;

A comparative scoring module, used for processing the question information and the high-confidence evidence using a multi-channel convolutional neural network to obtain the confidence competition results between options;

A best option judgment module, used for determining the best option according to the confidence competition results between options.

7. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program configured to be executed by the processor, and the computer program includes instructions for executing each step of the method for answering educational multiple-choice questions based on a multi-channel convolutional neural network according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method for answering educational multiple-choice questions based on a multi-channel convolutional neural network according to any one of claims 1 to 5.