CN112434152B - Education choice question answering method and device based on multi-channel convolutional neural network - Google Patents
Education choice question answering method and device based on multi-channel convolutional neural network
Info
- Publication number
- CN112434152B CN202011384874.3A CN202011384874A
- Authority
- CN
- China
- Prior art keywords
- evidence
- option
- neural network
- assertion
- confidence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Educational Administration (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Educational Technology (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- General Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a method and a device for answering elementary education choice questions based on a multi-channel convolutional neural network. The method comprises the following steps: 1) given a choice question presented in text form, supplementing each option into an assertion, retrieving evidence for each assertion from a subject knowledge base, and screening the retrieved evidence with a bridging rule to obtain high-confidence evidence; 2) processing the question information and the high-confidence evidence with a multi-channel convolutional neural network to obtain confidence competition results between the options; 3) determining the best option according to the confidence competition results among the options. The invention retrieves high-confidence evidence from the subject knowledge base with a bridging attention mechanism, then processes the question and the evidence simultaneously with a gated multi-channel convolutional neural network to obtain comparison scores between options, and determines the best option from the cumulative scores over all option pairs, so that a machine can answer subject-specific choice questions at the elementary education stage with good performance.
Description
Technical Field
The invention belongs to the field of natural language question answering, and relates to an elementary education choice question solver based on a multi-channel convolutional neural network. The solver retrieves high-confidence evidence from a subject knowledge base with a bridging attention mechanism, then processes the questions and the evidence simultaneously with a gated multi-channel convolutional neural network to obtain comparison scores between options, and determines the best option from the cumulative scores over all option pairs, so that it can answer subject-specific choice questions at the elementary education stage with good performance.
Background
With the development of machine learning and artificial intelligence techniques, machines have achieved excellent performance on many natural language processing tasks, even approaching human performance on some of them, and machine question answering is one of the rapidly developing fields. The machine question answering task requires a model to automatically answer questions posed in natural language, and is one of the standards for measuring a machine's ability to understand human language.
The choice question is an important question type for comprehensively examining students' mastery of knowledge in various subjects at the elementary education stage. Its general form is as follows: given a textual description of the question (sometimes accompanied by a chart) and a plurality of candidate options, the test taker is asked to understand the question and select the most appropriate option as the answer. The choice questions in examination papers of various subjects at the elementary education stage cover a wide range of knowledge, are difficult to answer, and have objective scoring standards, making them well suited for testing a machine's natural language understanding ability. How to make a machine perform well on elementary education subject choice questions has therefore become an important topic in the field of natural language processing.
To allow a machine to answer such choice questions, neural network techniques are often used to process them. Neural networks are widely used in natural language processing and can extract high-level features from text through large network structures. The convolutional neural network is a deep neural network that uses convolution operations; it models text information well and delivers stable, strong performance. In the field of machine question answering, convolutional neural networks are often used for text feature extraction. However, the knowledge contained in a neural network is relatively limited, so to make the model perform better on knowledge-intensive, subject-specific question answering, external knowledge and evidence such as term bases and knowledge bases are often introduced to assist the neural network.
Disclosure of Invention
The invention aims to provide a method and a device that better introduce subject knowledge evidence into a neural network by using a gated multi-channel convolutional neural network combined with a retrieval and bridging evidence screening mechanism, so that a machine performs better on elementary education subject choice questions. The method retrieves high-confidence evidence from a subject knowledge base for subject-specific choice questions at the elementary education stage, processes the question information and the evidence with a convolutional neural network, and obtains the best answer through competition among the options.
In order to achieve the purpose, the technical scheme of the invention is as follows:
An elementary education subject choice question answering method based on a multi-channel convolutional neural network comprises the following steps:
given a choice question (comprising a question and a plurality of options) presented in text form, supplementing each option into an assertion;
retrieving evidence for each assertion using a subject knowledge base, and screening the retrieved evidence through a bridging rule to obtain high-confidence evidence;
processing the question information and the high-confidence evidence using a multi-channel convolutional neural network to obtain confidence competition results between the options;
determining the best option according to the confidence competition results among the options.
Further, given a choice question presented in text form, supplementing each option into an assertion includes: for a choice question consisting of a question q and n options {op_1, op_2, ..., op_n}, the question is cleaned with rules, deleting redundant expressions such as "as shown in the figure" and "the correct option is". The question q is then concatenated with each option in {op_1, op_2, ..., op_n} to generate the assertions {a_1, a_2, ..., a_n}. The correctness of each assertion is labeled according to information such as the reference answer and negative words in the question.
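As an illustration of this step, a minimal sketch of assertion generation is given below; the cleaning patterns and the sample question and options are hypothetical examples rather than the full rule set.

```python
import re

def clean_question(question):
    """Delete redundant expressions such as 'as shown in the figure' / 'the correct option is'."""
    # Hypothetical cleaning patterns; the actual rule set is larger and discipline-specific.
    patterns = [r"如图", r"正确的选项是", r"as shown in the figure", r"the correct option is"]
    for p in patterns:
        question = re.sub(p, "", question)
    return question.strip()

def generate_assertions(question, options):
    """Concatenate the cleaned question q with each option op_i to form assertions a_1..a_n."""
    q = clean_question(question)
    return [q + " " + op for op in options]

# Hypothetical example
assertions = generate_assertions(
    "The climate type of the Mediterranean coast the correct option is ____.",
    ["Mediterranean climate", "temperate monsoon climate",
     "tropical rainforest climate", "temperate maritime climate"])
```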
Further, retrieving evidence for each assertion with the subject knowledge base includes: according to the subject involved in the questions, text information related to that subject is collected from resources such as textbooks and online encyclopedias to build a subject knowledge base K for retrieval, providing evidence support for the machine to answer the choice questions. For each assertion in {a_1, a_2, ..., a_n}, the m pieces of evidence {k_1, k_2, ..., k_m} with the highest text similarity are retrieved from the subject knowledge base K.
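A minimal sketch of the text-similarity retrieval is shown below, using TF-IDF cosine similarity as a stand-in scoring function; the knowledge-base format and parameter names are assumptions, and the texts are assumed to be already word-segmented into whitespace-separated tokens.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_evidence(assertion, knowledge_base, m=50):
    """Return the m knowledge-base entries most similar to the assertion text."""
    vectorizer = TfidfVectorizer()
    kb_matrix = vectorizer.fit_transform(knowledge_base)   # one row per knowledge-base entry
    query_vec = vectorizer.transform([assertion])
    sims = cosine_similarity(query_vec, kb_matrix).ravel()
    top_indices = sims.argsort()[::-1][:m]
    return [knowledge_base[i] for i in top_indices]
```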
Further, screening the retrieved evidence with the bridging rule to obtain high-confidence evidence includes: for the m pieces of evidence {k_1, k_2, ..., k_m} retrieved for each assertion, each piece is scored with a bridging mechanism, and the l highest-scoring pieces are selected as the high-confidence evidence {k'_1, k'_2, ..., k'_l}. For an assertion a consisting of a question q and an option op, the bridging attention score of a word w_i in evidence k is given as:

where q_w and op_w are the word sets formed by the question q and the option op respectively, and the cos function computes the cosine similarity between the word vectors of two words. If a word in the question or option occurs repeatedly, each occurrence beyond the first reduces the computed score by a factor (e.g., 0.9), so that repeated words do not receive excessively high scores. The score of the whole piece of evidence k is the weighted average of the scores of its t highest-scoring words, with the weighting formula as follows:

where the indicator term judges whether w_i ranks within the top t (e.g., t takes the value 5), and pow(x, y) raises x to the power y.

For each assertion, the l pieces of high-confidence evidence obtained are combined to form the evidence {e_1, e_2, ..., e_n} fed into the neural network, where e_i (1 ≤ i ≤ n) is the supporting evidence of assertion a_i (1 ≤ i ≤ n).
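Since the exact scoring formulas are not reproduced in the text above, the following is only a plausible sketch under stated assumptions: the word score is taken as the maximum cosine similarity against the question and option word vectors with a 0.9 decay per repeated occurrence, and the evidence score as a plain mean of the top-t word scores; the exact weighting in the patent may differ.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def word_bridge_score(word_vec, qop_vecs, repeat_count=0, decay=0.9):
    """Score one evidence word against the question/option word vectors (assumed: max cosine,
    discounted by `decay` for each occurrence of the matched word beyond the first)."""
    best = max(cosine(word_vec, v) for v in qop_vecs)
    return best * (decay ** repeat_count)

def evidence_score(word_scores, t=5):
    """Aggregate the t highest word scores into an evidence score (assumed: plain top-t mean)."""
    top = sorted(word_scores, reverse=True)[:t]
    return sum(top) / len(top) if top else 0.0
```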
Further, after the retrieved evidence is screened with the bridging rule, the data is segmented with a text-matching-based segmentation method (such as forward maximum matching) using a vocabulary containing the corresponding discipline terms. To reduce the complexity of the data set and improve data consistency, corresponding post-processing is applied to unknown words of different parts of speech.
Further, processing the question information and the high-confidence evidence with the multi-channel convolutional neural network to obtain the confidence competition results between the options means processing the question and the evidence with a gated multi-channel convolutional neural network model and producing a comparison score for each option pair. Specifically, the network architecture is as follows:
1) In the embedding layer of the model, each word of the question q, the two options op_1, op_2 to be compared, the corresponding assertions a_1, a_2, and the evidence e_1, e_2 is encoded as a word vector. The vector representation may be a low-dimensional dense semantic vector, such as a word vector pre-trained with a neural language model (e.g., word2vec, GloVe) or a vector obtained by reducing a high-dimensional matrix with Singular Value Decomposition (SVD) or a similar method (e.g., the result of Latent Semantic Analysis (LSA)), or it may be an original high-dimensional sparse vector such as a one-hot vector.
2) In the convolutional layers of the model, the word-vector representations of the question q, options op_1, op_2, assertions a_1, a_2, and evidence e_1, e_2 are processed simultaneously in different channels by convolutional neural networks with multiple kernels and multiple strides. A multi-layer convolutional neural network may be employed, with residual connections. A residual connection means that the output of a previous convolutional layer is added directly to the output of the current convolutional layer and replaces the current layer's output for subsequent processing; if the two tensors being added differ in feature dimension, a convolutional layer of width 1 is usually applied for dimension adjustment. Residual connections allow a deep convolutional network to adaptively adjust its effective depth to a certain extent during learning and reduce its adverse effect on gradient propagation. Between layers, additional pooling layers (e.g., max pooling, average pooling) may be added to reduce the size of the feature matrix, and the final output vectors are pooled. In this process the question q, options op_1, op_2, and assertions a_1, a_2 share one set of convolutional neural network parameters, while the evidence e_1, e_2 uses another set of convolutional neural network parameters. After the pooling of the last convolutional layer, a bitwise-multiplication gating mechanism is applied to the outputs: the output of the question q gates the outputs of the options op_1 and op_2 separately, the output of assertion a_1 gates the output of evidence e_1, and the output of assertion a_2 gates the output of evidence e_2.
3) In the output layer of the model, for the four vector representations obtained after the gating mechanism in the previous step, the two vectors corresponding to each option are concatenated and passed through a fully connected layer; the output vectors of the fully connected layers are then concatenated and passed through further fully connected layers (the output dimension of the last layer being 2), finally yielding the competition scores of the two options.
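A condensed sketch of such a gated multi-channel convolutional model is given below (PyTorch). It keeps the structure described above, a shared encoder for question/options/assertions, a separate encoder for evidence, bitwise-multiplication gating, and fully connected output layers, but the filter counts, hidden sizes, single convolutional layer and omission of residual connections are simplifying assumptions, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """Multi-kernel 1-D convolution channel followed by max pooling over time."""
    def __init__(self, emb_dim, n_filters=128, kernel_sizes=(1, 2, 3)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k, padding=k // 2) for k in kernel_sizes])

    def forward(self, x):                       # x: (batch, seq_len, emb_dim)
        x = x.transpose(1, 2)                   # -> (batch, emb_dim, seq_len)
        feats = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(feats, dim=1)          # (batch, n_filters * len(kernel_sizes))

class GatedOptionComparator(nn.Module):
    """Sketch of the gated multi-channel CNN that scores one pair of options."""
    def __init__(self, vocab_size, emb_dim=100, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.text_enc = ConvEncoder(emb_dim)    # shared by question, options, assertions
        self.evid_enc = ConvEncoder(emb_dim)    # separate parameters for evidence
        feat = 128 * 3
        self.fc_option = nn.Linear(2 * feat, hidden)
        self.fc_out = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                    nn.Dropout(0.5), nn.Linear(hidden, 2))

    def forward(self, q, op1, op2, a1, a2, e1, e2):     # token-id tensors (batch, seq_len)
        enc = lambda ids, net: net(self.embed(ids))
        q_gate = torch.sigmoid(enc(q, self.text_enc))
        gated_op1 = q_gate * enc(op1, self.text_enc)    # question output gates each option
        gated_op2 = q_gate * enc(op2, self.text_enc)
        gated_e1 = torch.sigmoid(enc(a1, self.text_enc)) * enc(e1, self.evid_enc)
        gated_e2 = torch.sigmoid(enc(a2, self.text_enc)) * enc(e2, self.evid_enc)
        v1 = F.relu(self.fc_option(torch.cat([gated_op1, gated_e1], dim=1)))
        v2 = F.relu(self.fc_option(torch.cat([gated_op2, gated_e2], dim=1)))
        return self.fc_out(torch.cat([v1, v2], dim=1))  # (batch, 2) competition scores
```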
Further, the best option is determined according to the pairwise competition results between the options obtained in the previous step. The score by which option op_i (1 ≤ i ≤ n) beats option op_j (1 ≤ j ≤ n, j ≠ i), obtained in the previous step, is denoted P(i > j); the final cumulative comparison score final_i of option op_i is then obtained by accumulating P(i > j) over all other options j ≠ i.
for the single-choice question (namely, the choice of the question is directly the question expressed by the answer), the choice with the highest cumulative comparison score is selected as the best answer. For multiple-choice questions (namely, the questions give a plurality of answer expressions first, and the choice is the questions with certain number combinations in the expressions), the accumulated comparison scores of the expressions with the numbers in each choice are accumulated, and the choice with the highest total score is taken as the best answer.
The invention also provides an elementary education choice question answering device based on the multi-channel convolutional neural network, which adopts the above method and comprises:
an assertion generating module, which is used for supplementing each option in the given selection questions presented in the text form into an assertion;
the evidence retrieval and screening module is used for retrieving each assertion by utilizing the subject knowledge base and screening the retrieved evidence through the bridging rule to obtain the high-confidence evidence;
the comparison and scoring module is used for processing the problem information and the high-confidence evidence by using the multi-channel convolutional neural network to obtain a confidence competition result between the options;
and the best option judgment module is used for judging the best option according to the confidence competition result among the options.
Not all modules in the device are essential; the device can still operate after some components are removed or modified. For example, after deleting the components introduced for the evidence information (parts of the evidence retrieval and screening module and of the comparison and scoring module), the device can still answer education choice questions.
The invention has the following beneficial effects:
the invention can search the subject knowledge base to obtain the high-confidence evidence for the selected subjects of the specific subject in the primary education stage, then uses the convolutional neural network to process the subject information and the evidence, obtains the best answer through competition among the options, and obtains better expression on the selected subjects of the primary education stage. The invention uses the bridging attention mechanism to complete the screening of the evidence with high confidence by using the semantic relation among the problems, options and the evidence, thereby improving the correlation between the retrieved evidence and the titles. The multilayer multi-channel convolution neural network used in the invention has strong expression capability and can extract the depth characteristics of the text from different angles. The gating mechanism used in the invention can introduce the retrieved evidence information into the neural network, complete semantic interaction between questions and options, assertions and evidence, and finally make the model obtain better expression on the choice questions of the elementary education disciplines.
Drawings
Fig. 1 is a block diagram of a solution method for elementary education choice questions according to an embodiment of the present invention.
FIG. 2 is a diagram of a neural network framework in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention is based on a data set of choice questions from geography college entrance examination papers and geography mock examination papers. It should be clear to those skilled in the art that other information sets and question sets may be used in an implementation.
Specifically, the example uses 239 sets of geography examination papers, totaling 4226 geography choice questions, each consisting of a question and four options. Some questions contain charts, but this method does not process charts.
FIG. 1 is a block diagram of a solution method for elementary education choice questions according to an embodiment of the present invention; fig. 2 is a diagram of a neural network framework in an embodiment of the present invention. The method comprises the following specific steps:
step 1: carrying out preprocessing steps such as assertion generation, evidence retrieval, attention mechanism screening, unknown word processing and the like on the option data set;
specifically, the redundant expressions such as "figure", "correct option is", and the like are deleted by using the rule cleaning problem. And connecting the question and each option to generate an assertion. And marking the correctness of the assertion according to information such as the question reference answer and the negative word in the question.
The required knowledge texts are collected from geography textbooks and relevant encyclopedia and Wikipedia pages, and Lucene is used to build the subject knowledge base. For each assertion, 50 search results are retrieved, scored and re-ranked with the bridging attention mechanism, and the top 5 results are taken as high-confidence evidence.
After the data is segmented with the forward maximum matching method against the geography term vocabulary, post-processing is applied: unknown words with NN and NR parts of speech (common nouns and proper nouns) are normalized to special IDs; punctuation marks and unknown words containing no Chinese characters are assigned special IDs; unknown words with NT and CD parts of speech (time nouns and numerals) are normalized to IDs for their part of speech; and the word vectors of these special IDs are randomly initialized.
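A minimal sketch of the forward maximum matching segmentation and the unknown-word normalization described above; the placeholder token names are illustrative assumptions.

```python
def forward_max_match(text, vocab, max_len=8):
    """Forward maximum matching segmentation against a discipline term vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:   # fall back to single characters
                tokens.append(piece)
                i += length
                break
    return tokens

def normalize_unknown(token, pos, vocab):
    """Map out-of-vocabulary tokens to special IDs by part of speech (placeholder names assumed)."""
    if token in vocab:
        return token
    if not any("\u4e00" <= ch <= "\u9fff" for ch in token):   # punctuation / no Chinese characters
        return "<UNK_SYM>"
    if pos in ("NN", "NR"):                                   # common and proper nouns
        return "<UNK_NOUN>"
    if pos in ("NT", "CD"):                                   # time nouns and numerals
        return "<UNK_" + pos + ">"
    return "<UNK>"
```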
Step 2: the questions and evidence are processed with the gated multi-channel convolutional neural network model to obtain a competition score for every pair of options.
At the embedding layer of the model, pre-trained word vectors are used to embed the words of the questions, options, assertions, and evidence.
In the convolutional layers of the model, the word-embedded representations of the questions, options, assertions, and evidence are processed in multiple channels by two layers of convolutional neural networks with 1280 convolution kernels in total: 512 kernels of size 100 × 1, 512 kernels of size 100 × 2, and 256 kernels of size 100 × 3. The convolutional networks of the question, option, and assertion channels share parameters, while the evidence channel uses another set of parameters. Residual connections are used between the two convolutional layers, and the pooling layers use max pooling. The question and assertion channels are activated by a Sigmoid function before pooling. The gating mechanism uses bitwise multiplication: the output representation of the question gates the output representations of the two options separately, and the output representations of the assertions gate the output representations of the corresponding evidence.
In the output layer, the two vectors corresponding to each option are concatenated and passed through a fully connected layer; the output vectors are then concatenated and passed through further fully connected layers to obtain the comparison score between the two options. Specifically, the two gating-mechanism output vectors corresponding to each option are concatenated, the resulting 2560-dimensional vector is passed through a fully connected layer with output dimension 512, and the two resulting 512-dimensional vectors are concatenated and passed through fully connected layers with output dimensions 1024 and 2 in turn. A 50% dropout is used in the above fully connected layers to prevent over-fitting, and ReLU is used as the activation function.
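The output layers of this embodiment can be sketched as follows (PyTorch); the exact placement of dropout and activation between the layers is an assumption.

```python
import torch
import torch.nn as nn

class OutputHead(nn.Module):
    """Per-option 2560-d vector -> 512; the two 512-d option vectors are then concatenated
    and passed through fully connected layers with output sizes 1024 and 2."""
    def __init__(self):
        super().__init__()
        self.per_option = nn.Sequential(nn.Linear(2560, 512), nn.ReLU(), nn.Dropout(0.5))
        self.pair = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Dropout(0.5),
                                  nn.Linear(1024, 2))

    def forward(self, opt1_vec, opt2_vec):              # each: (batch, 2560)
        v1, v2 = self.per_option(opt1_vec), self.per_option(opt2_vec)
        return self.pair(torch.cat([v1, v2], dim=1))    # (batch, 2) comparison scores
```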
Step 3: according to the comparison scores between the options obtained in the previous step, the final score of each option is computed and the option with the highest score is selected as the final answer. The score by which option op_i (1 ≤ i ≤ n) beats option op_j (1 ≤ j ≤ n, j ≠ i), obtained in the previous step, is denoted P(i > j); the final cumulative comparison score final_i of option op_i is obtained by accumulating P(i > j) over all other options j ≠ i.
for the single-choice question (namely, the choice of the question is directly the question expressed by the answer), the choice with the highest cumulative comparison score is selected as the best answer. For multiple-choice questions (namely, the questions give a plurality of answer expressions first, and the choices are questions with certain number combinations in the expressions), the accumulated comparison scores of the expressions with the numbers in each choice are accumulated, and the choice with the highest total score is taken as the best answer.
For the college entrance examination geography choice question data set used, Accuracy is reported, i.e., the frequency with which the option given the highest model score is the standard answer. The model results are shown in Table 1, where "without chart" denotes questions that provide no chart and "with chart" denotes questions that do. The model answers questions from text information only and currently has no dedicated module for processing charts; questions originally accompanied by charts are treated as chart-free and are also used for model training.
Table 1: Solver performance on college entrance examination geography choice questions
Overall, the models under both settings show a substantial improvement over the random-answer baseline (25% accuracy).
Another embodiment of the present invention provides a device for solving elementary education choice questions based on a multichannel convolutional neural network using the method of the present invention, including:
the assertion generating module is used for supplementing each option in the given selection questions presented in the text form into an assertion;
the evidence retrieval and screening module is used for retrieving each assertion by utilizing the subject knowledge base and screening the retrieved evidence through the bridging rule to obtain the high-confidence evidence;
the comparison and scoring module is used for processing the problem information and the high-confidence evidence by using the multi-channel convolutional neural network to obtain a confidence competition result between the options;
and the best option judgment module is used for judging the best option according to the confidence competition result among the options.
The specific implementation of the modules is described in the foregoing description of the method of the present invention.
Another embodiment of the invention provides an electronic device (computer, server, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the method of the invention.
Another embodiment of the invention provides a computer readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program which, when executed by a computer, performs the steps of the method of the invention.
Parts of the invention not described in detail are well known to the person skilled in the art.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.
Claims (8)
1. A multi-channel convolutional neural network-based education choice question answering method is characterized by comprising the following steps:
giving a choice question presented in text form, and supplementing each option into an assertion;
retrieving evidence for each assertion by using a subject knowledge base, and screening the retrieved evidence through a bridging rule to obtain high-confidence evidence;
processing the question information and the high-confidence evidence by using a multi-channel convolutional neural network to obtain a confidence competition result between the options;
judging the best option according to the confidence competition result among the options;
the method comprises the following steps of retrieving each assertion by using a discipline knowledge base, and screening the retrieved evidence through a bridging rule to obtain a high-confidence evidence, wherein the method comprises the following steps:
for each assertion, retrieving m pieces of evidence that the similarity of texts meets a set threshold from a discipline knowledge base { k } 1 ,k 2 ,…,k m }; a bridging mechanism is used to score each of the m pieces of evidence, and the l pieces with the highest scores are selected as high-confidence evidence { k' 1 ,k′ 2 ,…,k l ' }; for an assertion a, consisting of a question q and an option op, a word w in evidence k i The bridging attention score of (a) is:
wherein q_w and op_w are respectively the word sets formed by the question q and the option op, and the cos function calculates the cosine similarity of the word vectors of two words; if a word in the question or option occurs repeatedly, each occurrence beyond the first decreases the calculated score by a factor of 0.9; the score of the whole piece of evidence k is the weighted average of the scores of its t highest-scoring words, with the weighting formula as follows:
for each assertion, combining the obtained l pieces of high-confidence evidence to obtain evidence { e } of each assertion used for inputting into the neural network 1 ,e 2 ,…,e n In which e i To assert a i I is more than or equal to 1 and less than or equal to n;
the method for processing the problem information and the high confidence evidence by using the multi-channel convolution neural network to obtain the confidence competition result among the options comprises the following steps:
at the embedding layer of the model, a question q and two options op needing to be compared are combined 1 ,op 2 And corresponding assertion a 1 ,a 2 And evidence e 1 ,e 2 Each word in (1) is encoded as a word vector;
on the convolution layer of the model, for the question q and the option op 1 ,op 2 Assertion a 1 ,a 2 And evidence e 1 ,e 2 The word vector of (a) indicates that the multi-core multi-step convolutional neural network is used for processing in different channels simultaneously; processing by adopting a multilayer convolutional neural network, adopting residual error linkage, adding an additional pooling layer between layers to reduce the scale of the characteristic matrix, and pooling the final output vector; question q, option op 1 ,op 2 Assertion a 1 ,a 2 Share a set of convolutional neural network parameters, evidence e 1 ,e 2 Using another set of convolutional neural network parameters; after pooling of the last convolutional layer, performing a bitwise-multiplication gating mechanism on the output;
in the output layer of the model, for the four vector representations after the gating mechanism, two vectors corresponding to each option are connected and then pass through the full connection layer, and then the output vectors of the full connection layer are connected and then pass through the full connection layer again, so that the competition scores of the two options are finally obtained.
2. The method according to claim 1, wherein, given a choice question presented in text form, supplementing each option into an assertion comprises: cleaning the question with rules and deleting redundant expressions including "as shown in the figure" and "the correct option is"; connecting the question with each option to generate assertions; and labeling the correctness of each assertion according to the reference answer of the question and negative-word information in the question.
3. The multi-channel convolutional neural network-based education choice question answering method of claim 1, wherein the subject knowledge base is constructed by: collecting text information related to the subject involved in the questions from textbooks and network resources, and building the subject knowledge base used for retrieval.
4. The multi-channel convolutional neural network-based education choice question answering method of claim 1, wherein, after the retrieved evidence is screened through the bridging rule, the data is segmented using a text-matching-based segmentation method with a vocabulary containing the corresponding discipline terms, and corresponding post-processing is performed on unknown words of different parts of speech and words of special parts of speech.
5. The method as claimed in claim 1, wherein judging the best option according to the confidence competition result between the options comprises:
the score by which option op_i beats option op_j is denoted P(i > j), wherein 1 ≤ i ≤ n, 1 ≤ j ≤ n, j ≠ i, and n denotes the number of options; the final cumulative comparison score final_i of option op_i is then obtained by accumulating P(i > j) over all other options j ≠ i;
for single-choice questions, the option with the highest cumulative comparison score is selected as the best answer; for multiple-choice questions, the cumulative comparison scores of the expressions whose numbers appear in each option are summed, and the option with the highest total score is taken as the best answer.
6. An education choice question answering apparatus based on a multi-channel convolutional neural network, using the method of any one of claims 1 to 5, the apparatus comprising:
an assertion generating module, which is used for supplementing each option in the given selection questions presented in the text form into an assertion;
the evidence retrieval and screening module is used for retrieving each assertion by utilizing the subject knowledge base and screening the retrieved evidence through the bridging rule to obtain the high-confidence evidence;
the comparison and scoring module is used for processing the problem information and the high-confidence evidence by using the multi-channel convolutional neural network to obtain a confidence competition result between the options;
and the best option judgment module is used for judging the best option according to the confidence competition result among the options.
7. An electronic device comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the multi-channel convolutional neural network-based education choice question answering method of any one of claims 1 to 5.
8. A computer-readable storage medium storing a computer program which, when executed by a computer, implements the multichannel convolutional neural network-based educational choice question answering method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011384874.3A CN112434152B (en) | 2020-12-01 | 2020-12-01 | Education choice question answering method and device based on multi-channel convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011384874.3A CN112434152B (en) | 2020-12-01 | 2020-12-01 | Education choice question answering method and device based on multi-channel convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112434152A CN112434152A (en) | 2021-03-02 |
CN112434152B true CN112434152B (en) | 2022-10-14 |
Family
ID=74699095
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011384874.3A Active CN112434152B (en) | 2020-12-01 | 2020-12-01 | Education choice question answering method and device based on multi-channel convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112434152B (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832326B (en) * | 2017-09-18 | 2021-06-08 | 北京大学 | Natural language question-answering method based on deep convolutional neural network |
US11481416B2 (en) * | 2018-07-12 | 2022-10-25 | International Business Machines Corporation | Question Answering using trained generative adversarial network based modeling of text |
CN109271505B (en) * | 2018-11-12 | 2021-04-30 | 深圳智能思创科技有限公司 | Question-answering system implementation method based on question-answer pairs |
CN111639187B (en) * | 2019-03-01 | 2023-05-16 | 上海数眼科技发展有限公司 | Knowledge graph-based knowledge question and answer verification code generation system and method |
CN111382255B (en) * | 2020-03-17 | 2023-08-01 | 北京百度网讯科技有限公司 | Method, apparatus, device and medium for question-answering processing |
-
2020
- 2020-12-01 CN CN202011384874.3A patent/CN112434152B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN112434152A (en) | 2021-03-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112990296B (en) | Image-text matching model compression and acceleration method and system based on orthogonal similarity distillation | |
Willis et al. | Key phrase extraction for generating educational question-answer pairs | |
Van Nguyen et al. | Enhancing lexical-based approach with external knowledge for Vietnamese multiple-choice machine reading comprehension | |
Bai et al. | A survey of current machine learning approaches to student free-text evaluation for intelligent tutoring | |
CN110852071B (en) | Knowledge point detection method, device, equipment and readable storage medium | |
Agarwal et al. | Autoeval: A nlp approach for automatic test evaluation system | |
Boateng et al. | Real-world deployment and evaluation of kwame for science, an ai teaching assistant for science education in west africa | |
Tashu et al. | Deep learning architecture for automatic essay scoring | |
CN116860947A (en) | Text reading and understanding oriented selection question generation method, system and storage medium | |
Alwaneen et al. | Stacked dynamic memory-coattention network for answering why-questions in Arabic | |
CN108959467B (en) | Method for calculating correlation degree of question sentences and answer sentences based on reinforcement learning | |
CN116012866A (en) | Method and device for detecting heavy questions, electronic equipment and storage medium | |
CN112434152B (en) | Education choice question answering method and device based on multi-channel convolutional neural network | |
CN116186199A (en) | Automatic short answer scoring method based on multi-feature fusion | |
Zhao et al. | Investigating the Validity and Reliability of a Comprehensive Essay Evaluation Model of Integrating Manual Feedback and Intelligent Assistance. | |
Xu et al. | A survey of machine reading comprehension methods | |
CN113656548A (en) | Text classification model interpretation method and system based on data envelope analysis | |
CN112785039A (en) | Test question answering score prediction method and related device | |
Aishwarya et al. | Stacked Attention based Textbook Visual Question Answering with BERT | |
Wang et al. | Intelligent evaluation algorithm of English writing based on semantic analysis | |
Khandait et al. | Automatic question generation through word vector synchronization using lamma | |
Meng et al. | Nonlinear network speech recognition structure in a deep learning algorithm | |
Rosso-Mateus et al. | A two-step neural network approach to passage retrieval for open domain question answering | |
Gala et al. | Real-time cognitive evaluation of online learners through automatically generated questions | |
Firoozi | Using automated procedures to score written essays in Persian: An application of the multilingual BERT system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |