CN112508334A

CN112508334A - Personalized paper combining method and system integrating cognitive characteristics and test question text information

Info

Publication number: CN112508334A
Application number: CN202011233044.0A
Authority: CN
Inventors: 王志锋; 余新国; 左明章; 叶俊民; 张思; 闵秋莎; 罗恒; 夏丹; 姚璜; 杨洋
Original assignee: Central China Normal University
Current assignee: Central China Normal University
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2021-03-16
Anticipated expiration: 2040-11-06
Also published as: CN112508334B

Abstract

The invention belongs to the technical field of intelligent education and discloses a personalized test paper composition method and system integrating cognitive characteristics and test question text information. First, a cognitive diagnosis model is used to predict a learner's score on a specific test question based on the learner's cognitive level. Next, a recurrent neural network model is used to predict the learner's score on the specific test question based on its text information. A probability matrix factorization objective function is then constructed from the cognitive-level-based and text-based predicted scores to predict the learner's potential score on the specific test question. Finally, the KL divergence is calculated from the estimated learner knowledge mastery vector and the learner's incremental knowledge mastery vectors and, combined with the learner's potential scores on the test questions, is used to select test questions that increase the learner's knowledge mastery trend and have appropriate difficulty; these questions make up the personalized test paper. The invention can customize the composition result according to the test objective and the difficulty of the test questions, which greatly increases the efficiency of the learner's self-learning.

Description

Personalized paper combining method and system integrating cognitive characteristics and test question text information

Technical Field

The invention belongs to the technical field of intelligent education, and particularly relates to a personalized paper combining method and system integrating cognitive characteristics and test question text information.

Background

At present, with the rapid development of the internet and the arrival of the big data era, the traditional education industry is gradually transforming into digital education, and massive educational resources are shared on online education platforms for learners to download and study. Test questions for each subject are an important educational resource that learners use in large quantities to consolidate what they learn in the classroom. However, it is difficult for learners to pick out, from such a large pool, the test questions that really suit them, so most learners fall back on the sea-of-questions tactic of practicing as many questions as possible. A personalized test paper composition system can quickly match test question text information to the learner's knowledge mastery and provide test questions that suit the learner's level of difficulty and are aimed at growing the learner's knowledge. The learner can thus be trained better and learn more efficiently, and targeted test paper composition can be carried out for the different learning situation of each learner. The demand for personalized test paper composition on online learning platforms is therefore increasingly urgent.

The traditional test paper composition approach borrows the collaborative filtering recommendation idea popular in the e-commerce field: commodities are treated as test questions, users as learners, and users' ratings of commodities as learners' scores on test questions, so that collaborative filtering can be applied to screen suitable test questions and form a complete test paper. However, this approach tends to mine the learning commonalities of learners, that is, the test questions given to a target learner come mostly from learners who are highly similar to the target learner, so simple and popular test questions have a higher probability of being recommended. Because the approach does not address the specific learning state of the learner, the composition result cannot target the weak links in the learner's mastery of knowledge points, and it also lacks interpretability. Learners would rather obtain, through the composition system, a set of test questions suited to their own cognitive characteristics so as to train specifically on their weak knowledge points, so this approach cannot be applied well in the technical field of intelligent education. Later, cognitive diagnosis methods from cognitive psychology were also applied to test paper composition. A cognitive diagnosis model can diagnose the learning state of a learner: from the learner's scores on test questions labeled with knowledge points, it analyzes the learner's mastery of each knowledge point, obtains the probability that the learner answers correctly on the given knowledge points, and then screens the test questions.
Meanwhile, traditional test paper composition methods usually ignore the text information of the test questions, although the subject keywords contained in the test question text are usually strongly associated with the probability that the learner answers the question correctly; it is therefore necessary to add analysis of the test question text to personalized test paper composition.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) The traditional personalized test paper composition method neglects the influence of test question text information on the learner's answers.

(2) The method of composing test papers by cognitive diagnosis ignores the commonalities among learners, and the parameter estimation in the model is sensitive to the data set and may produce large errors.

(3) The method of composing test papers by collaborative filtering cannot fully consider the learner's mastery of knowledge points; it can only compose papers according to the learning commonalities of similar learners, neglects the learner's individual learning characteristics, and yields composition results with poor interpretability.

The difficulties in solving the above problems are:

(1) How to integrate test question text information into the learner's cognitive process, linking the text to the learner's score so as to obtain a mapping between the two.

(2) How to combine the learner's cognitive characteristics represented in the cognitive diagnosis result with the learning commonalities represented in the collaborative filtering result, that is, how to consider both commonalities and individual characteristics and compose practice papers tailored to the learner's learning characteristics.

(3) How to measure or adjust the influence of the learner's learning characteristics, learning commonalities and test question text information on the final composition result, and improve the applicability of the result under different expected conditions.

Through the above analysis, the significance of solving the above problems and defects is:

(1) In the aspect of test question text analysis, the invention uses a recurrent neural network (RNN) model to extract the text content of the test questions and constructs the mapping between the learner's learning state and the test question text information through a fully connected layer.

(2) The learner's cognitive characteristics represented in the cognitive diagnosis result are combined with the learning commonalities represented in the collaborative filtering result; learning commonalities, individual characteristics and test question text information are considered together, and test questions can be selected according to the teaching objective, either to test at a given difficulty or to improve knowledge mastery.

(3) The invention takes the results of the models as prior information and can compose personalized papers according to the learning conditions of different learners. The composition result reflects the learner's learning state, the test question text information and the commonalities among learners at the same time, is highly interpretable, and realizes teaching in accordance with each learner's aptitude.

(4) Experimental results show that the method achieves a considerable performance improvement over traditional test paper composition algorithms and remedies their shortcomings. The method uses richer information, composes more accurate personalized tests for the learner, and improves the learner's learning efficiency.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a personalized paper combining method and system integrating cognitive characteristics and test question text information. The invention aims to solve the following problems of the prior art: the traditional personalized test paper composition method neglects the influence of test question text information on the learner's answers; the method of composing test papers by cognitive diagnosis ignores the commonalities among learners, and its parameter estimation is sensitive to the data set and may produce large errors; and the method of composing test papers by collaborative filtering cannot target the learner's mastery of knowledge points, can only compose papers according to the learning commonalities of similar learners, neglects the learner's individual learning characteristics, and yields composition results with poor interpretability.

The invention is realized as a personalized paper combining method integrating learner cognitive characteristics and test question text information, which comprises the following steps:

step one, estimating the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's score on a specific test question based on cognitive level;

step two, extracting the text content of the test questions with a recurrent neural network model, constructing the mapping between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's score on the specific test question based on text information;

step three, constructing a probabilistic matrix factorization objective function from the obtained cognitive-level-based and text-based predicted scores, and predicting the learner's potential score on the specific test question;

step four, calculating the KL divergence from the estimated learner knowledge mastery vector and the learner's incremental knowledge mastery vectors and, combined with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have suitable difficulty to form the personalized test paper.

Further, in step one, estimating the learner's knowledge mastery with the cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's score on the specific test question based on cognitive level, comprises:

(1.1) collecting the test question knowledge point distribution labeled by domain experts and the response data of learners; computing, from the labeled knowledge point distribution, the Q matrix of test question knowledge points required by the cognitive diagnosis model;

(1.2) calculating the ideal response of the learner according to the learner's prior knowledge point mastery pattern η:

$$\pi_{ij} = \prod_{k=1}^{K} \eta_{ik}^{q_{jk}}$$

where π_ij denotes the ideal response of learner i on the jth test question, η_ik denotes learner i's mastery of knowledge point k, q_jk indicates whether the jth test question examines knowledge point k, and K is the number of knowledge points;

(1.3) estimating, with the expectation-maximization algorithm, the slip probability s, i.e., the probability that the learner answers a test question incorrectly although all knowledge points it examines have been mastered, and the guess probability g, i.e., the probability that the learner answers the test question correctly without having mastered all the corresponding knowledge points;

(1.4) calculating the probability that the learner answers correctly from the learner's ideal response, the slip probability s and the guess probability g:

(1.5) obtaining the total likelihood function of the DINA model:

where L = 2^K is the number of possible knowledge mastery patterns over the K knowledge points;

(1.6) calculating the learner's knowledge mastery by maximum likelihood estimation based on the obtained total likelihood function:

(1.7) calculating the learner's score on a new test question according to the learner's knowledge mastery:
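As an illustration of steps (1.2) to (1.7), the following minimal Python sketch computes the ideal response, the probability of a correct answer, and a maximum likelihood mastery pattern for one learner under the standard DINA formulation. It is a sketch under stated assumptions, not the patented implementation: the function names are hypothetical, mastery is treated as binary, the slip and guess parameters are taken as given rather than estimated by expectation-maximization, and the search simply enumerates all L = 2^K patterns.

```python
from itertools import product


def ideal_response(eta, q_row):
    """pi_ij = prod_k eta_ik ** q_jk: 1 only if every examined point is mastered."""
    return int(all(e == 1 for e, q in zip(eta, q_row) if q == 1))


def p_correct(eta, q_row, slip, guess):
    """Standard DINA response probability: (1 - s) if pi_ij == 1, otherwise g."""
    return 1.0 - slip if ideal_response(eta, q_row) else guess


def likelihood(responses, eta, Q, slips, guesses):
    """Likelihood of one learner's 0/1 responses given a mastery pattern eta."""
    like = 1.0
    for r, q_row, s, g in zip(responses, Q, slips, guesses):
        p = p_correct(eta, q_row, s, g)
        like *= p if r == 1 else 1.0 - p
    return like


def estimate_mastery(responses, Q, slips, guesses):
    """Maximum likelihood pattern found by enumerating all L = 2**K patterns."""
    K = len(Q[0])
    patterns = product((0, 1), repeat=K)
    return max(patterns, key=lambda eta: likelihood(responses, eta, Q, slips, guesses))


# Toy data (hypothetical): 3 test questions over K = 2 knowledge points.
Q = [[1, 0], [0, 1], [1, 1]]
slips = [0.1, 0.1, 0.1]      # assumed already estimated, e.g. by EM
guesses = [0.2, 0.2, 0.2]    # assumed already estimated, e.g. by EM
responses = [1, 0, 0]        # the learner's actual answers

eta_hat = estimate_mastery(responses, Q, slips, guesses)
# Predicted probability of answering a new question that examines only point 1.
new_question_score = p_correct(eta_hat, [1, 0], 0.1, 0.2)
```

Exhaustive enumeration is only practical for small K; it is used here to keep the sketch short.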

Further, in step two, extracting the text content of the test questions with the recurrent neural network model, constructing the mapping between the learner's learning state and the test question text information through the fully connected layer, and predicting the learner's score on the specific test question based on text information, comprises:

(2.1) preprocessing the test question text by word segmentation, stop-word removal and the like; processing the test questions with the continuous bag-of-words (CBOW) model of the word vector model Word2vec, which predicts the word vector of a target word from the word vectors of several words in its context, so as to vectorize the preprocessed test questions and obtain a word-level embedded representation of the test questions;

(2.2) taking the test question word vectors as input, obtaining the contextual feature representation of the test question with a bidirectional long short-term memory (BiLSTM) neural network, and obtaining the sentence-level embedded representation of the test question through mean pooling;

(2.3) constructing a fully connected deep neural network, fusing the test question embedding vector with the learner's knowledge mastery vector as the input of the network, and outputting the score of the specific learner on the test question;

(2.4) combining the outputs of the preceding layer (y₁, y₂, ..., y_n) and processing them with an output unit:

to obtain a value in [0, 1] representing the probability that the learner answers the test question correctly, comparing this value with the real score data, and correcting the weights in the network accordingly;

(2.5) training the network model with the training set data to obtain a trained neural network model, and predicting the learner's responses based on text information with the trained model.

Further, in step (2.1), the word segmentation of the test question text comprises: performing hybrid word segmentation on the test question text based on a mixed dictionary, using a method that combines bidirectional maximum matching with statistics.

Further, in step (2.1), removing stop words from the test question text comprises: adding to the stop-word list the words that are irrelevant to the sentence and to the topic of the test question text, contribute nothing to the test question labeling task, and occur with low frequency in the test question text; and deleting from the segmented test question text the words that appear in the stop-word list.
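The segmentation and stop-word removal described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the toy dictionary and the stop-word list are hypothetical, the statistical component of the hybrid method is omitted, and the rule for choosing between the forward and backward results (fewer words, then fewer single-character words, then the backward result) is a common heuristic rather than something stated in the text.

```python
def forward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation scanning left to right."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:   # single characters always match
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def bidirectional_max_match(text, dictionary):
    """Run both scans and keep the result with the lower (assumed) cost."""
    max_len = max(map(len, dictionary))
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)

    def cost(words):
        return (len(words), sum(len(w) == 1 for w in words))

    return fwd if cost(fwd) < cost(bwd) else bwd


def remove_stop_words(words, stop_words):
    """Drop every segmented word that appears in the stop-word list."""
    return [w for w in words if w not in stop_words]


# Toy example (hypothetical dictionary and stop-word list).
dictionary = {"研究", "研究生", "生命", "的", "起源"}
segmented = bidirectional_max_match("研究生命的起源", dictionary)
cleaned = remove_stop_words(segmented, {"的"})
```

In this example the forward scan greedily takes "研究生" and leaves a stray single character, while the backward scan does not, so the backward result is kept.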

Further, in step (2.2), obtaining the contextual feature representation of the test question with the bidirectional long short-term memory neural network comprises:

the BiLSTM network uses two LSTMs to obtain the contextual features of the test question from opposite directions, with the formula:

where a₁, a₂, b₁ and b₂ are weight coefficients and g(·) is the hidden-layer activation function; the formula gives the forward hidden-layer output at time t and the backward hidden-layer output at time t. The hidden-layer outputs of the two directions at each time step are fused to construct the final output h_t:

where c₁ and c₂ are weight coefficients and f(·) is the output activation function.
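A common way to write such a bidirectional recurrence, consistent with the weight coefficients and activation functions named above, is sketched below. The input symbol x_t (the word vector at time t), the arrow notation for the two hidden states, and the assignment of a₁ and b₁ to the input term and of a₂ and b₂ to the recurrent term are assumptions, since they are not stated in the text; the gating inside each LSTM cell is also abstracted into g(·).

```latex
\begin{aligned}
\overrightarrow{h}_t &= g\left(a_1 x_t + a_2 \overrightarrow{h}_{t-1}\right) \\
\overleftarrow{h}_t  &= g\left(b_1 x_t + b_2 \overleftarrow{h}_{t+1}\right) \\
h_t &= f\left(c_1 \overrightarrow{h}_t + c_2 \overleftarrow{h}_t\right)
\end{aligned}
```

The forward state depends on the previous time step and the backward state on the next one, so h_t carries context from both sides of position t.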

Further, in step (2.2), obtaining the sentence-level representation of the test question through mean pooling comprises:

obtaining the sentence-level embedded representation E_h of the test question by mean pooling:

E_h = mean(h₁, ..., h_t)

where mean(·) is the mean pooling operation, i.e., the average of the feature values within the pooling region is taken as the output.
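The pooling formula can be illustrated with a short Python sketch. It assumes, for illustration only, that each hidden state h_t is a plain list of floats of equal length and that the pooling region is the whole sentence; the function name is hypothetical.

```python
def mean_pool(hidden_states):
    """E_h = mean(h_1, ..., h_t): average each feature over all time steps."""
    steps = len(hidden_states)
    return [sum(column) / steps for column in zip(*hidden_states)]


# Three time steps, two features per hidden state (toy values).
h = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
E_h = mean_pool(h)
```

The result has the same dimension as a single hidden state, so test questions of different lengths map to vectors of a fixed size.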

Further, in step (2.3), the fully connected deep neural network comprises:

in the fully connected deep neural network, the value of the nth node of the mth layer is calculated as follows:

where N is the number of units in the previous layer and the weight coefficient is that from the ith unit of the (m-1)th layer to the nth unit of the mth layer;

a ReLU activation function is adopted to obtain the latent mapping among the learner's knowledge mastery, the test question text information and the score.
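The following Python sketch illustrates one forward pass through such a network. It is a minimal sketch under stated assumptions, not the patented architecture: feature fusion is done by simple concatenation, there is a single hidden layer without bias terms, the output unit is taken to be a sigmoid (the text only states that the output lies in [0, 1]), and all names and weights are hypothetical.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, activation):
    """One fully connected layer.

    weights[i][n] is the weight from unit i of the previous layer to unit n
    of this layer; each node value is the weighted sum over the N inputs.
    """
    n_out = len(weights[0])
    return [
        activation(sum(inputs[i] * weights[i][n] for i in range(len(inputs))))
        for n in range(n_out)
    ]


def predict_score(text_embedding, mastery, hidden_w, output_w):
    """Fuse text and mastery features, then map them to a value in (0, 1)."""
    fused = list(text_embedding) + list(mastery)   # concatenation (assumption)
    hidden = dense(fused, hidden_w, relu)
    return dense(hidden, output_w, sigmoid)[0]


# Toy inputs and weights (hypothetical values).
text_embedding = [0.5, -1.0]
mastery = [1.0, 0.0]
hidden_w = [[1.0, -1.0], [0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]   # 4 inputs -> 2 units
output_w = [[2.0], [-1.0]]                                       # 2 units -> 1 output

p = predict_score(text_embedding, mastery, hidden_w, output_w)
```

In training, this predicted probability would be compared with the real score data and the weights corrected by backpropagation, which is omitted here.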

Further, in step three, constructing the probabilistic matrix factorization objective function from the obtained cognitive-level-based and text-based predicted scores, and predicting the learner's potential score on the specific test question, comprises:

(3.1) denoting the learner's cognitive-level-based score prediction and text-based score prediction as R¹ and R² respectively, and constructing the learner's potential response representation on the test questions with the probabilistic matrix factorization algorithm:

where U and V are the learner feature matrix and the score feature matrix in the probabilistic matrix factorization, and α and β are the adjustment parameters for the learner's learning condition and the test question text information, respectively;

(3.2) constructing the final objective function of the probabilistic matrix factorization:

(3.3) optimizing the objective function by gradient descent to obtain the optimal feature matrices U and V:

(3.4) predicting the learner's performance with the optimal feature matrices U and V obtained by training:
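To make steps (3.1) to (3.4) concrete, the sketch below trains a small matrix factorization by full-batch gradient descent. The exact objective function is not recoverable from the text, so the form used here is an assumption: squared error on the observed responses, plus squared error against the two prior predictions R¹ and R² weighted by α and β, plus an L2 penalty weighted by a hypothetical parameter `lam`. All function names and data values are illustrative.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def objective(R, R1, R2, U, V, alpha, beta, lam):
    """Assumed objective: observed error + alpha/beta-weighted prior errors + L2."""
    total = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            pred = dot(U[i], V[j])
            if r is not None:                      # only observed responses
                total += (r - pred) ** 2
            total += alpha * (R1[i][j] - pred) ** 2
            total += beta * (R2[i][j] - pred) ** 2
    reg = sum(x * x for vec in U + V for x in vec)
    return total + lam * reg


def train(R, R1, R2, rank=2, alpha=0.5, beta=0.5, lam=0.01,
          lr=0.02, epochs=300, seed=0):
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in R]
    V = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in R[0]]
    history = [objective(R, R1, R2, U, V, alpha, beta, lam)]
    for _ in range(epochs):
        grad_U = [[2 * lam * x for x in vec] for vec in U]
        grad_V = [[2 * lam * x for x in vec] for vec in V]
        for i, row in enumerate(R):
            for j, r in enumerate(row):
                pred = dot(U[i], V[j])
                # Derivative of the three squared-error terms w.r.t. pred.
                g = -2 * alpha * (R1[i][j] - pred) - 2 * beta * (R2[i][j] - pred)
                if r is not None:
                    g -= 2 * (r - pred)
                for f in range(rank):
                    grad_U[i][f] += g * V[j][f]
                    grad_V[j][f] += g * U[i][f]
        U = [[x - lr * g for x, g in zip(vec, gvec)] for vec, gvec in zip(U, grad_U)]
        V = [[x - lr * g for x, g in zip(vec, gvec)] for vec, gvec in zip(V, grad_V)]
        history.append(objective(R, R1, R2, U, V, alpha, beta, lam))
    return U, V, history


# Toy data: 3 learners x 3 questions; None marks an unanswered question.
R = [[1, 0, None], [1, None, 0], [None, 1, 1]]
R1 = [[0.9, 0.2, 0.6], [0.8, 0.3, 0.2], [0.4, 0.9, 0.8]]   # cognitive-level priors
R2 = [[0.8, 0.3, 0.5], [0.9, 0.4, 0.1], [0.5, 0.8, 0.9]]   # text-based priors

U, V, history = train(R, R1, R2)
potential_score = dot(U[0], V[2])   # learner 0 on the unanswered question 2
```

Raising α or β pulls the factorization toward the cognitive-level or the text-based prediction respectively, which is one way to read the role of the adjustment parameters.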

Further, in step four, calculating the KL divergence from the estimated learner knowledge mastery vector and the learner's incremental knowledge mastery vectors and, combined with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have suitable difficulty to form the personalized test paper, comprises:

(4.1) obtaining, from the learner's knowledge mastery vector produced by the above analysis, all incremental knowledge mastery vectors of the learner, indexed by d with 0 ≤ d ≤ D;

(4.2) calculating the KL divergence between the learner knowledge mastery vector η_i estimated by the cognitive diagnosis model and each of the learner's incremental knowledge mastery vectors:

(4.3) selecting test questions that increase the learner's knowledge mastery trend and have suitable difficulty to form the personalized test paper:
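A possible shape of this selection step is sketched below in Python. The selection formula itself is not recoverable from the text, so every concrete choice here is an assumption made for illustration: mastery values are treated as per-knowledge-point probabilities, the incremental mastery vector of a question raises mastery by a fixed `step` on the knowledge points that question examines, the divergence is taken from the incremental vector to the current one, and "suitable difficulty" is modeled as the potential score falling inside a band [`low`, `high`]. All names are hypothetical.

```python
import math


def bernoulli_kl(p, q, eps=1e-6):
    """KL divergence between two Bernoulli distributions, clamped for safety."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_divergence(eta, eta_inc):
    """Sum of per-knowledge-point divergences KL(incremental || current)."""
    return sum(bernoulli_kl(inc, cur) for cur, inc in zip(eta, eta_inc))


def incremental_mastery(eta, q_row, step=0.2):
    """Assumed increment: mastery rises by `step` on the examined points."""
    return [min(1.0, e + step) if q else e for e, q in zip(eta, q_row)]


def select_questions(eta, Q, potential_scores, n, low=0.4, high=0.8):
    """Pick the n questions with the largest mastery gain within the difficulty band."""
    candidates = []
    for j, q_row in enumerate(Q):
        if low <= potential_scores[j] <= high:
            gain = kl_divergence(eta, incremental_mastery(eta, q_row))
            candidates.append((gain, j))
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [j for _, j in candidates[:n]]


# Toy data (hypothetical): 4 questions over 3 knowledge points.
eta = [0.9, 0.3, 0.5]                       # estimated mastery of one learner
Q = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1]]
potential_scores = [0.95, 0.6, 0.7, 0.5]    # potential scores from step three

paper = select_questions(eta, Q, potential_scores, n=2)
```

Here question 0 is excluded as too easy for this learner, and the question covering both weak knowledge points ranks first.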

Another object of the present invention is to provide a personalized paper combining system integrating learner cognitive characteristics and test question text information, which implements the above personalized paper combining method and comprises:

a cognitive-level-based score prediction module, used for estimating the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's cognitive-level-based score on a specific test question;

a text-information-based score prediction module, used for extracting the text content of the test questions with the recurrent neural network model, constructing the mapping between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's score on the specific test question based on text information;

a question selection strategy module, used for constructing a probabilistic matrix factorization objective function from the obtained cognitive-level-based and text-based predicted scores and predicting the learner's potential score on the specific test question; and for calculating the KL divergence from the estimated learner knowledge mastery vector and the learner's incremental knowledge mastery vectors and, combined with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have suitable difficulty to form the personalized test paper.

Another object of the present invention is to provide a computer apparatus comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to execute the personalized paper combining method integrating learner cognitive characteristics and test question text information.

Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the personalized paper combining method integrating learner cognitive characteristics and test question text information.

The invention also aims to provide an information data processing terminal which is used for realizing the personalized paper combining method integrating the cognitive characteristics of learners and test question text information.

By combining all the above technical schemes, the advantages and positive effects of the invention are as follows: by integrating the learner's cognitive characteristics with test question text information, the invention provides suitable practice questions for the learner, avoiding traditional problems such as giving a thousand learners the same questions and the sea-of-questions tactic; the learner can train in a more targeted way, which improves learning efficiency. The invention has wide application prospects in the fields of personalized learning, adaptive learning and intelligent education.

The invention provides a personalized paper combining method integrating learner cognitive characteristics and test question text information, which composes practice papers for the learner by combining a cognitive diagnosis model, model-based collaborative filtering and a test question text information extraction model. The invention analyzes the learner's mastery of knowledge points to obtain the learner's learning state; the cognitive diagnosis model used is the widely used DINA model. For test question text analysis, the invention uses a recurrent neural network (RNN) model to extract the text content of the test questions and constructs the mapping between the learner's learning state and the test question text information through a fully connected layer. The results of these models are taken as prior information and trained together with other information in a collaborative filtering method based on probabilistic matrix factorization, so that the composition result reflects the learner's learning state, the test question text information and the commonalities among learners at the same time.

According to the experimental results, the personalized paper combining method integrating cognitive characteristics and test question text information achieves a considerable performance improvement over traditional test paper composition algorithms and remedies their shortcomings in test question recommendation. The invention uses richer information to provide more accurate personalized test question recommendation for the learner, thereby improving the learner's learning efficiency.

The invention realizes a personalized paper combining method integrating learner cognitive characteristics and test question text information. It can combine cognitive diagnosis, test question text information and the commonalities among learners to recommend test questions to the target learner, obtains a more accurate personalized test paper, greatly increases the learner's self-learning efficiency, and helps the learner make up for gaps in knowledge. The method can be applied to fields such as intelligent education technology and educational data mining, can provide effective support for subsequent educational resource recommendation, and can help online and digital education platforms better predict learners' scores, so that weak links in learners' knowledge points are discovered efficiently and accurate remedial measures are taken.

The accuracy, recall and F1 value of the personalized paper combining method provided by the invention and of other methods on the same data set are compared in Table 1.

Table 1. Comparison of results

The experimental results show that the personalized paper combining method integrating learner cognitive characteristics and test question text information combines the learner's cognitive characteristics, the learning commonalities among learners and the test question text information, and that the accuracy of its composition result is clearly better than that of the comparison methods. The experiments therefore show that the method is more effective than the other methods in terms of accuracy, recall and F1 value.

Meanwhile, the analysis shows that the DINA-based composition method and the probabilistic matrix factorization method are slightly unstable: their accuracy decreases as the number of test questions in the composed paper increases. Although conventional probabilistic matrix factorization is easy to implement, it extracts insufficient latent information from the data set, which leads to low accuracy in general, especially when facing a large amount of training data. In short, the personalized paper combining method integrating learner cognitive characteristics and test question text information achieves the best experimental results, and it can construct, according to the test difficulty and the examination objective, a test paper that combines test questions of different difficulty levels and promotes the learner's cognitive growth.

The invention introduces test question text information as an important measure for recommending educational test questions, so that the key information in the test question text can be exploited by the proposed method.

The invention integrates test question text information, cognitive diagnosis and collaborative filtering: the learner's learning condition obtained from the test question text information and from cognitive diagnosis is incorporated into an objective function optimized by collaborative filtering, and adjustment parameters are introduced to tune the relation among the three, so as to obtain the optimal composition model for the current data set.

In conclusion, the personalized paper combining method integrating learner cognitive characteristics and test question text information realizes more accurate test question recommendation and personalized test paper composition for the learner. It combines cognitive diagnosis, test question text information and learning commonalities to recommend test questions to the target learner, can customize the composition result according to the test objective and the test question difficulty, greatly increases the learner's self-learning efficiency, and helps the learner make up more quickly for gaps in classroom knowledge. The method can be applied to fields such as educational resource recommendation and evaluation and educational data mining, thereby providing effective support for subsequent educational resource recommendation and helping online and digital education platforms better predict learners' scores and provide personalized test paper schemes, so that weak links in learners' knowledge points are diagnosed efficiently and accurate remedial measures are taken.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the personalized paper combining method integrating learner cognitive characteristics and test question text information according to an embodiment of the present invention.

Fig. 2 is a flowchart of the personalized paper combining method integrating learner cognitive characteristics and test question text information according to an embodiment of the present invention.

Fig. 3 is a schematic structural diagram of the personalized paper combining system integrating learner cognitive characteristics and test question text information according to an embodiment of the present invention.

In the figures: 1. cognitive-level-based score prediction module; 2. text-information-based score prediction module; 3. question selection strategy module.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Aiming at the problems in the prior art, the invention provides a personalized paper combining method for learner cognitive characteristics and test question text information, and the invention is described in detail below with reference to the accompanying drawings.

The symbols involved in the present invention are as follows:

As shown in Figs. 1 and 2, the personalized paper combining method integrating learner cognitive characteristics and test question text information according to the embodiment of the present invention includes the following steps:

S101, estimating the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's cognitive-level-based score on a specific test question;

S102, extracting the text content of the test questions with a recurrent neural network model, constructing a mapping between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's text-information-based score on a specific test question;

S103, constructing a probability matrix decomposition objective function based on the obtained cognitive-level-based and text-information-based predicted scores of the learner, and predicting the learner's potential score on a specific test question;

S104, calculating the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test.

In step S101, estimating the learner's knowledge mastery with the cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's cognitive-level-based score on a specific test question, includes:

where π_ij denotes the ideal response of learner i on the j-th test question, η_ik denotes learner i's mastery of knowledge point k, and q_jk indicates whether the j-th test question examines knowledge point k;

(1.5) obtaining the total likelihood function of the DINA model:

where L = 2^K;

In step S102, extracting the text content of the test questions with the recurrent neural network model, constructing the mapping between the learner's learning state and the test question text information through the fully connected layer, and predicting the learner's text-information-based score on a specific test question, includes:

In step (2.1), performing word segmentation on the test question text includes: based on a mixed dictionary, segmenting the test question text with a hybrid method that combines bidirectional maximum matching with statistics.

In step (2.1), removing stop words from the test question text includes: adding to the stop-word list the words in the test question text that are irrelevant to the sentence and to the topic of the test question text, that do not contribute to the test question labeling task, or whose frequency is too low, and deleting from the mixed word segmentation result of the test question text the words that appear in the stop-word list.

In step (2.2), obtaining the context feature representation of the test question with the bidirectional long short-term memory neural network includes:

is the forward hidden-layer output at time t,

In step (2.2), obtaining the sentence-level representation of the test question through mean pooling includes:

E_h = mean(h₁, ..., h_t)

In step (2.3), the fully connected deep neural network includes:

where N is the number of units in the previous layer,

In step S103, constructing the probability matrix decomposition objective function based on the obtained cognitive-level-based and text-information-based predicted scores of the learner includes:

In step S104, calculating the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test, includes:

0 ≤ d ≤ D;

the KL divergence measure between η_i and η_i^(d):

As shown in Fig. 3, the personalized paper combining system integrating learner cognitive characteristics and test question text information provided in the embodiment of the present invention includes:

the cognitive-level-based score prediction module 1, configured to estimate the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and to predict the learner's cognitive-level-based score on a specific test question;

the text-information-based score prediction module 2, configured to extract the text content of the test questions with a recurrent neural network model, to construct a mapping between the learner's learning state and the test question text information through a fully connected layer, and to predict the learner's text-information-based score on a specific test question;

the question selection strategy module 3, configured to construct a probability matrix decomposition objective function based on the obtained cognitive-level-based and text-information-based predicted scores of the learner, and to predict the learner's potential score on a specific test question; and to calculate the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, to select test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test.

The technical effects of the present invention will be further described with reference to specific embodiments.

Example 1:

The personalized paper combining method integrating learner cognitive characteristics and test question text information of the invention comprises the following steps:

Step one: mining the learner's learning state, such as knowledge mastery, with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions.

Step two: extracting the text content of the test questions with a recurrent neural network model, and constructing a mapping between the learner's learning state and the test question text information through a fully connected layer.

Step three: constructing a probability matrix decomposition objective function that integrates the learner's learning state, the test question text information and the cognitive commonality among learners, and mining the learner's potential score on a specific test question.

Step four: calculating the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test.

Further, the first step comprises:

Step a): collecting the knowledge point distribution of the test questions labeled by domain experts and the learners' response data;

Step b): calculating the test question knowledge point Q matrix required by the cognitive diagnosis model from the knowledge point distribution of the test questions labeled by domain experts;

Step c): calculating the learner's ideal response according to the learner's prior knowledge point mastery pattern η:

where π_ij denotes the ideal response of learner i on the j-th test question, η_ik denotes learner i's mastery of knowledge point k, and q_jk indicates whether the j-th test question examines knowledge point k;

Step d): estimating, with the expectation maximization algorithm, the probability s that a learner answers a test question incorrectly although the learner has mastered all knowledge points examined by that test question, and the probability g that a learner answers a test question correctly although the learner has not mastered all of the corresponding knowledge points;

Step e): calculating the probability that the learner answers correctly from the estimated ideal response of the learner, the probability s that the learner answers a test question incorrectly although all knowledge points examined by it have been mastered, and the probability g that the learner answers a test question correctly although not all of the corresponding knowledge points have been mastered:
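By way of illustration only, steps c) and e) can be sketched in Python under the standard DINA formulation, which is assumed here: the ideal response is π_ij = ∏_k η_ik^(q_jk), and the probability of a correct answer is (1 − s_j)^(π_ij) · g_j^(1 − π_ij). The function names `ideal_response` and `p_correct` and the toy matrices are hypothetical.

```python
import numpy as np

def ideal_response(eta, Q):
    """eta: (learners, K) 0/1 mastery; Q: (questions, K) 0/1 Q matrix.
    Returns pi with pi[i, j] = prod_k eta[i, k] ** Q[j, k]."""
    eta = np.asarray(eta)
    Q = np.asarray(Q)
    # A factor is 0 only when question j requires point k and learner i lacks it.
    return np.prod(eta[:, None, :] ** Q[None, :, :], axis=2)

def p_correct(pi, s, g):
    """Assumed DINA response probability: (1 - s_j)^pi_ij * g_j^(1 - pi_ij)."""
    pi = np.asarray(pi, dtype=float)
    s = np.asarray(s, dtype=float)
    g = np.asarray(g, dtype=float)
    return (1.0 - s) ** pi * g ** (1.0 - pi)

# Toy data: 2 learners, 3 questions, 3 knowledge points (all values made up).
eta = np.array([[1, 1, 0],
                [1, 0, 0]])
Q = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
s = np.array([0.1, 0.2, 0.1])   # slip probabilities per question
g = np.array([0.2, 0.25, 0.3])  # guess probabilities per question

pi = ideal_response(eta, Q)
P = p_correct(pi, s, g)
```

With these toy values, a learner whose ideal response is 1 answers correctly with probability 1 − s_j, and otherwise with probability g_j.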

Step f): the total likelihood function of the DINA model is thus obtained:

where L = 2^K. Because the likelihood contains the latent variable η_l, maximum likelihood estimation cannot be carried out directly, so the expectation maximization method is used to solve it as follows:

Step h): M-step: the partial derivatives of the log-likelihood with respect to s_j and g_j are each set to zero, which gives updated estimates of s_j and g_j in terms of four expected counts. The first is the expected number of learners who lack at least one knowledge point required by the j-th question among the learners belonging to the l-th knowledge point mastery pattern, and the second is the expected number of those learners who answer the j-th question correctly. The third and fourth counts have the same meaning as the first and second, except that they are expectations for the case where the learner has mastered all knowledge points required by the j-th question. These expected counts can be calculated from the estimates obtained in the E-step, and new estimates of s_j and g_j are thus obtained.
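By way of illustration only, the expectation maximization procedure can be sketched as follows. The sketch assumes the textbook EM updates for the DINA model, in which g_j is the ratio of expected correct answers to expected learners among those lacking a required knowledge point, and s_j is the expected proportion of incorrect answers among those who have mastered all required points. The names `dina_em`, `I0`, `R0`, `I1`, `R1`, the starting values and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def dina_em(X, Q, n_iter=50, s0=0.2, g0=0.2):
    """X: (learners, questions) 0/1 responses; Q: (questions, K) 0/1 matrix."""
    X = np.asarray(X, dtype=float)
    Q = np.asarray(Q)
    n_learners, J = X.shape
    K = Q.shape[1]
    # All L = 2^K knowledge point mastery patterns.
    patterns = np.array(list(itertools.product([0, 1], repeat=K)))
    # ideal[l, j] = 1 if pattern l masters every point required by question j.
    ideal = np.all(patterns[:, None, :] >= Q[None, :, :], axis=2).astype(float)
    s = np.full(J, s0)
    g = np.full(J, g0)
    prior = np.full(len(patterns), 1.0 / len(patterns))
    for _ in range(n_iter):
        # E-step: posterior probability of each pattern for each learner.
        p = (1 - s) ** ideal * g ** (1 - ideal)
        loglik = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T
        loglik -= loglik.max(axis=1, keepdims=True)  # numerical stability
        post = np.exp(loglik) * prior
        post /= post.sum(axis=1, keepdims=True)
        # M-step: expected counts, then closed-form updates of s and g.
        n_l = post.sum(axis=0)     # expected learners per pattern
        r_lj = post.T @ X          # expected correct answers per pattern/question
        I0 = ((1 - ideal) * n_l[:, None]).sum(axis=0)
        R0 = ((1 - ideal) * r_lj).sum(axis=0)
        I1 = (ideal * n_l[:, None]).sum(axis=0)
        R1 = (ideal * r_lj).sum(axis=0)
        eps = 1e-9
        g = np.clip(R0 / (I0 + eps), 1e-3, 0.999)
        s = np.clip((I1 - R1) / (I1 + eps), 1e-3, 0.999)
        prior = n_l / n_learners
    return s, g, post, patterns

# Simulated responses with slip = guess = 0.1 (made-up data).
rng = np.random.default_rng(0)
Q = np.array([[1, 0], [0, 1], [1, 1], [1, 0], [0, 1], [1, 1]])
true_eta = rng.integers(0, 2, size=(200, 2))
true_ideal = np.all(true_eta[:, None, :] >= Q[None, :, :], axis=2)
X = np.where(rng.random(true_ideal.shape) < np.where(true_ideal, 0.9, 0.1), 1, 0)
s, g, post, patterns = dina_em(X, Q)
```

The clipping of s and g keeps the logarithms finite; it is a numerical convenience of this sketch rather than part of the model.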

Step i): calculating the learner's knowledge mastery by maximum likelihood estimation based on the total likelihood function:

Step j): calculating the learner's score on a new test question according to the learner's knowledge mastery:
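By way of illustration only, steps i) and j) can be sketched as an exhaustive search over the 2^K mastery patterns followed by a prediction on an unseen question. The sketch assumes the standard DINA response probability and already estimated s and g; `estimate_mastery`, `predict_new` and the toy values are hypothetical.

```python
import itertools
import numpy as np

def estimate_mastery(x, Q, s, g):
    """Return the 0/1 mastery pattern that maximizes the likelihood of responses x."""
    x = np.asarray(x, dtype=float)
    Q = np.asarray(Q)
    K = Q.shape[1]
    patterns = np.array(list(itertools.product([0, 1], repeat=K)))
    ideal = np.all(patterns[:, None, :] >= Q[None, :, :], axis=2).astype(float)
    p = (1 - s) ** ideal * g ** (1 - ideal)
    loglik = (x * np.log(p) + (1 - x) * np.log(1 - p)).sum(axis=1)
    return patterns[np.argmax(loglik)]

def predict_new(eta, q_new, s_new, g_new):
    """Probability of a correct answer on a new question with Q-row q_new."""
    ideal = float(np.all(eta >= q_new))
    return (1 - s_new) ** ideal * g_new ** (1 - ideal)

Q = np.array([[1, 0], [0, 1], [1, 1]])
s = np.array([0.1, 0.1, 0.1])
g = np.array([0.2, 0.2, 0.2])
eta_hat = estimate_mastery([1, 0, 0], Q, s, g)           # learner got only question 1 right
p_new = predict_new(eta_hat, np.array([1, 0]), 0.1, 0.2)  # new question on point 1
```

For this learner the most likely pattern is mastery of the first knowledge point only, so the predicted probability on a new question examining that point is 1 − s = 0.9.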

Further, the second step comprises:

Step 1): preprocessing the test question text, which mainly comprises word segmentation and stop-word removal;

Step 2): performing word segmentation on the test question text: based on a mixed dictionary, the test question text is segmented with a hybrid method that combines bidirectional maximum matching with statistics;

Step 3): removing stop words from the mixed word segmentation result. Words that are irrelevant to the sentence and to the topic of the test question do not contribute to the test question labeling task and are removed; words with very low frequency likewise do not contribute to the labeling task and are therefore also treated as stop words. A stop-word list is built according to these two rules, and the words appearing in it, including the low-frequency words, are deleted;
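By way of illustration only, the dictionary-based part of steps 2) and 3) can be sketched as follows. The statistical component of the hybrid segmentation is omitted, and the tie-breaking rule (fewer tokens, then fewer single-character tokens) is a common heuristic assumed here. The dictionary, the stop-word list and all function names are hypothetical.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right longest-match segmentation."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

def backward_max_match(text, vocab, max_len=4):
    """Greedy right-to-left longest-match segmentation."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]

def bidirectional_max_match(text, vocab, max_len=4):
    """Keep the segmentation with fewer tokens, then fewer single characters."""
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles = lambda toks: sum(len(t) == 1 for t in toks)
    return fwd if singles(fwd) < singles(bwd) else bwd

def remove_stop_words(tokens, stop_words):
    return [t for t in tokens if t not in stop_words]

vocab = {"函数", "最大值", "最大"}   # hypothetical mixed dictionary
stop_words = {"的"}                  # hypothetical stop-word list
tokens = bidirectional_max_match("求函数的最大值", vocab)
cleaned = remove_stop_words(tokens, stop_words)
```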

Step 4): processing the test questions with the Continuous Bag of Words (CBOW) model of the word vector model Word2vec, which vectorizes the preprocessed input test questions and yields a word-level embedding representation of the test questions. The CBOW model predicts the word vector of a target word from the word vectors of several words in its context;
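By way of illustration only, the forward pass of a CBOW model can be sketched as follows: the context word vectors are averaged and a softmax over the vocabulary scores the target word. Training, negative sampling and the vocabulary itself are omitted; `cbow_forward`, the matrix names and the sizes are hypothetical.

```python
import numpy as np

def cbow_forward(context_ids, W_in, W_out):
    """Average the context embeddings and return softmax scores over the vocabulary."""
    h = W_in[context_ids].mean(axis=0)   # (D,) averaged context vector
    scores = h @ W_out                   # (V,) one score per vocabulary word
    scores = scores - scores.max()       # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return h, probs

rng = np.random.default_rng(0)
V, D = 6, 4                                  # toy vocabulary size and embedding size
W_in = rng.normal(scale=0.1, size=(V, D))    # input (word) embeddings
W_out = rng.normal(scale=0.1, size=(D, V))   # output weights
h, probs = cbow_forward([0, 1, 3, 4], W_in, W_out)  # context of target word 2
```

After training, the rows of `W_in` serve as the word-level embeddings of the test question.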

Step 5): taking the test question word vectors as input, a bidirectional long short-term memory neural network (BiLSTM) is first used to obtain the context feature representation of the test question, and a mean pooling operation is then applied to obtain the sentence-level representation of the test question, thereby obtaining the test question word embedding representation.

Step 6): the BiLSTM network uses two LSTMs to capture the context features of the test question from two opposite directions; the calculation is defined as:

is the forward hidden-layer output at time t,

is the backward hidden-layer output at time t; finally, the hidden-layer outputs of the two directions at each time step are fused to construct the final output h_t:

where c₁ and c₂ are weight coefficients and f(·) is the output activation function;

Step 7): in the test question sentence-level representation layer, the test question word embedding representation E_h is obtained by average pooling:

E_h = mean(h₁, ..., h_t)

where mean(·) is the average pooling operation, that is, the average of the feature values in the neighborhood is taken as the output. This captures representative information from the whole window and reduces the feature dimension of the test question text and the number of network parameters.
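By way of illustration only, the bidirectional encoding and mean pooling of steps 6) and 7) can be sketched as follows. For brevity the sketch replaces the full LSTM cell with a plain recurrent update, which is a simplification and not the BiLSTM itself: the forward state is g(x_t·a₁ + h_{t−1}·a₂), the backward state is g(x_t·b₁ + h_{t+1}·b₂), and the fused output is f(forward·c₁ + backward·c₂). These formula shapes, the choice of tanh and the random weights are assumptions.

```python
import numpy as np

def bidirectional_encode(X, a1, a2, b1, b2, c1, c2, g=np.tanh, f=np.tanh):
    """X: (T, D) word vectors. Returns fused hidden states of shape (T, H)."""
    T = X.shape[0]
    H = a2.shape[0]
    fwd = np.zeros((T, H))
    bwd = np.zeros((T, H))
    state = np.zeros(H)
    for t in range(T):                 # left-to-right pass
        state = g(X[t] @ a1 + state @ a2)
        fwd[t] = state
    state = np.zeros(H)
    for t in reversed(range(T)):       # right-to-left pass
        state = g(X[t] @ b1 + state @ b2)
        bwd[t] = state
    return f(fwd @ c1 + bwd @ c2)      # fuse both directions at each time step

def mean_pool(h):
    """E_h = mean(h_1, ..., h_t): average over time steps."""
    return h.mean(axis=0)

rng = np.random.default_rng(0)
T, D, H = 5, 4, 3                      # toy sequence length and sizes
X = rng.normal(size=(T, D))
a1, b1 = rng.normal(size=(D, H)), rng.normal(size=(D, H))
a2, b2 = rng.normal(size=(H, H)), rng.normal(size=(H, H))
c1, c2 = rng.normal(size=(H, H)), rng.normal(size=(H, H))
h = bidirectional_encode(X, a1, a2, b1, b2, c1, c2)
E_h = mean_pool(h)
```

Mean pooling turns a variable-length sequence of hidden states into one fixed-length vector, which is what allows it to be concatenated with the learner's knowledge mastery vector later.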

Step 8): constructing a fully connected deep neural network, fusing the test question word embedding vector with the learner knowledge mastery vector as the network input, and outputting the score of a specific student on the test question.

Step 9): in the fully connected deep neural network, the value of the n-th node of the m-th layer is calculated as follows:

where N is the number of units in the previous layer,

the relu activation function is used to obtain the potential mapping between the learner's knowledge mastery, the test question text information and the score;

Step 10): the outputs of the fully connected layer (y₁, y₂, ..., y_n) are processed by an output unit:

a value in [0, 1] is obtained, representing the probability that the student answers the test question correctly; it is compared with the real score data to correct the weights in the network;

Step 11): after training on the training set data, a trained neural network model is obtained, with which the learner's text-information-based response performance can be predicted.
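By way of illustration only, the forward pass of steps 8) to 10) can be sketched as follows. Feature fusion is assumed to be simple concatenation and the output unit is assumed to be a sigmoid, since it must produce a value in [0, 1]; neither choice is confirmed by the text. The names `predict_score`, `W1`, `b1`, `w2`, `b2` and all values are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_score(E_h, eta, W1, b1, w2, b2):
    """Concatenate text and mastery features, apply one relu layer, then a sigmoid unit."""
    x = np.concatenate([E_h, eta])
    hidden = relu(W1 @ x + b1)      # each node: relu(sum_i w_i * x_i + bias)
    return sigmoid(w2 @ hidden + b2)

def binary_cross_entropy(p, y):
    """Loss used to compare the prediction with the real score y in {0, 1}."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
E_h = np.array([0.2, -0.1, 0.4])    # toy sentence-level text representation
eta = np.array([1.0, 0.0])          # toy knowledge mastery vector
W1 = rng.normal(scale=0.5, size=(4, 5))
b1 = np.zeros(4)
w2 = rng.normal(scale=0.5, size=4)
b2 = 0.0
p = predict_score(E_h, eta, W1, b1, w2, b2)
loss = binary_cross_entropy(p, 1.0)
```

In training, the gradient of this loss with respect to the weights would drive the weight correction described in step 10).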

Further, the third step comprises:

Step A): denoting the cognitive-level-based score prediction and the text-information-based score prediction obtained in steps one and two as R¹ and R² respectively, and constructing the learner's potential response representation on the test questions with the probability matrix decomposition algorithm:

Step B): deriving the final objective function of the probability matrix decomposition from the score calculation formula that integrates the learner's learning state, the test question text information and the commonality among learners:

Step C): optimizing the objective function by gradient descent. First, the partial derivatives of the objective function with respect to the two feature matrices U and V are computed:

setting the partial derivatives to 0 gives the recursive update formulas, which are iterated until the result converges or the maximum number of iterations is reached, finally giving the optimal learner score feature matrices U and V:

Step D): finally, predicting the learner's score with the optimal feature matrices U and V obtained by training:

Step E): tuning the hyper-parameters in the experiment, together with the adjustment parameters α and β for the learner's learning state and the test question text information, to obtain the parameters best suited to the data set and thereby the final trained model.
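By way of illustration only, probability matrix decomposition trained by gradient descent can be sketched as follows. The sketch assumes, purely for the example, that the target matrix is the blend αR¹ + βR² and that the objective is the usual regularized squared error ½‖R − UVᵀ‖² + (λ/2)(‖U‖² + ‖V‖²); the way the invention actually combines R¹, R² and the latent factors may differ. The name `pmf_fit` and all hyper-parameter values are hypothetical.

```python
import numpy as np

def pmf_fit(R, mask, rank=3, lam=0.05, lr=0.05, n_iter=500, seed=0):
    """Fit R ~ U @ V.T on observed entries (mask == 1) by gradient descent."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, rank))   # learner feature matrix
    V = rng.normal(scale=0.1, size=(m, rank))   # test question feature matrix
    for _ in range(n_iter):
        err = mask * (U @ V.T - R)              # residual on observed entries
        grad_U = err @ V + lam * U              # partial derivative w.r.t. U
        grad_V = err.T @ U + lam * V            # partial derivative w.r.t. V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V

rng = np.random.default_rng(1)
R1 = rng.random((5, 4))        # toy cognitive-level-based predictions
R2 = rng.random((5, 4))        # toy text-information-based predictions
alpha, beta = 0.6, 0.4         # assumed adjustment parameters
R = alpha * R1 + beta * R2     # assumed blended target
mask = np.ones_like(R)
U, V = pmf_fit(R, mask)
R_hat = U @ V.T                # predicted potential scores
```

Tuning α and β as in step E) would amount to repeating this fit for several (α, β) pairs and keeping the pair with the best validation performance.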

Further, the fourth step comprises:

Step I): denoting the learner knowledge mastery vector obtained by analysis as η_i, and obtaining from η_i all incremental knowledge mastery vectors of the learner η_i^(d), 0 ≤ d ≤ D;

Step II): calculating the KL divergence measure between the learner knowledge mastery vector η_i estimated by the cognitive diagnosis model and each of the learner's incremental knowledge mastery vectors η_i^(d):

Step III): the test questions that increase the learner's knowledge mastery trend and have appropriate difficulty are then selected to form the test paper for the personalized test, giving a recommendation result that combines the learner's learning state, the test question text information and the commonality among learners.
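By way of illustration only, the KL divergence computation and a possible selection rule can be sketched as follows. The sketch assumes that each mastery vector holds per-knowledge-point mastery probabilities, so that the divergence is a sum of Bernoulli KL terms, and that questions are kept when their potential score lies in a difficulty band and are then ranked by divergence. Both assumptions, the band limits and the names `bernoulli_kl` and `select_questions` are hypothetical.

```python
import numpy as np

def bernoulli_kl(p, q, eps=1e-9):
    """Sum over knowledge points of KL(Bernoulli(p_k) || Bernoulli(q_k))."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    q = np.clip(np.asarray(q, dtype=float), eps, 1 - eps)
    return float(np.sum(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))))

def select_questions(potential, kl_gain, low=0.4, high=0.8, n=2):
    """Keep questions of suitable difficulty, then prefer larger mastery change."""
    idx = [j for j in range(len(potential)) if low <= potential[j] <= high]
    return sorted(idx, key=lambda j: -kl_gain[j])[:n]

eta_i = np.array([0.9, 0.4, 0.2])              # toy current mastery vector
increments = [np.array([0.9, 0.4, 0.2]),       # d = 0: no change
              np.array([0.9, 0.7, 0.2]),       # d = 1: point 2 improves
              np.array([0.9, 0.7, 0.6])]       # d = 2: points 2 and 3 improve
kl = [bernoulli_kl(eta_i, inc) for inc in increments]

potential = [0.95, 0.6, 0.5, 0.1]              # toy potential scores of 4 questions
kl_gain = [0.3, 0.1, 0.4, 0.9]                 # toy divergence associated with each
chosen = select_questions(potential, kl_gain)
```

In this toy case the questions with potential scores 0.95 and 0.1 are filtered out as too easy and too hard, and the remaining two are ordered by their associated divergence.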

The personalized paper combining method integrating learner cognitive characteristics and test question text information is compared with other methods in terms of precision (Precision@K), recall (Recall@K) and F1 value (F1@K), which are calculated as follows:

where l(i) denotes the personalized test questions selected for the i-th learner, m(i) denotes the test questions in the question bank that match the learner, and l(i) ∩ m(i) denotes the intersection of the two. Precision@K represents the proportion of correct recommendations in the recommendation result, and Recall@K represents the degree to which the recommendation result covers the correct recommendations in the question bank. Precision and recall trade off against each other to some extent: when precision is high, recall tends to be low. For convenience in presenting the experimental results, the traditional cognitive diagnosis method is denoted DINA and the traditional collaborative filtering method is denoted PMF.
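By way of illustration only, the three metrics can be sketched for a single learner using the usual set-based definitions, which are assumed here: precision is |l(i) ∩ m(i)| / |l(i)|, recall is |l(i) ∩ m(i)| / |m(i)|, and F1 is their harmonic mean. Averaging over learners is omitted, and the function name and question identifiers are hypothetical.

```python
def precision_recall_f1(recommended, matched):
    """recommended: l(i), the K selected questions; matched: m(i)."""
    l, m = set(recommended), set(matched)
    hit = len(l & m)
    precision = hit / len(l) if l else 0.0
    recall = hit / len(m) if m else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hit else 0.0
    return precision, recall, f1

# Four recommended questions, three matching questions, two in common.
p, r, f1 = precision_recall_f1([1, 2, 3, 4], [2, 4, 6])
```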

The precision, recall and F1 value of the personalized paper combining method integrating learner cognitive characteristics and test question text information and of the other methods on the same data set are compared in Table 1.

Table 1. Comparison of results

Analysis also shows that the DINA-based paper combining model and the probability matrix decomposition method are somewhat unstable: their precision decreases as the number of test questions in the paper increases. Although conventional probability matrix decomposition is easy to implement, it extracts too little of the latent information in the data set, which leads to low precision in general, especially with a large amount of training data. In short, the personalized paper combining method integrating learner cognitive characteristics and test question text information achieves the best experimental results, and can construct, according to the test difficulty and the assessment objective, a test paper that combines test questions of different difficulty levels and promotes the learner's cognitive growth.

The invention introduces test question text information as an important measure for personalized paper combination, so that the key information in the test question text can be exploited by the proposed method.

In conclusion, the personalized paper combining method integrating learner cognitive characteristics and test question text information provided by the invention achieves more accurate paper combination. The method combines three kinds of information, namely cognitive diagnosis, test question text information and the commonality among learners, to formulate a paper combining strategy for the target learner. The paper combining result can be customized according to the test objective and the test question difficulty, which greatly improves the learner's self-learning efficiency and helps the learner remedy knowledge weaknesses from classroom learning more quickly. The method can be applied to fields such as educational resource recommendation and evaluation and educational data mining, thereby providing effective support for subsequent educational resource recommendation and the like. It assists online education platforms, so that a digital education platform can better predict a learner's score and provide a personalized paper combining scheme, allowing the learner's weak knowledge points to be diagnosed efficiently and accurate remedial measures to be taken.

The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any modification, equivalent replacement or improvement made by those skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims

1. A personalized paper combining system integrating learner cognitive characteristics and test question text information, characterized in that it is applied to an information data processing terminal, the personalized paper combining system integrating learner cognitive characteristics and test question text information comprising:

a cognitive-level-based score prediction module, configured to estimate the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and to predict the learner's cognitive-level-based score on a specific test question;

a text-information-based score prediction module, configured to extract the text content of the test questions with a recurrent neural network model, to construct a mapping between the learner's learning state and the test question text information through a fully connected layer, and to predict the learner's text-information-based score on a specific test question;

a question selection strategy module, configured to construct a probability matrix decomposition objective function based on the obtained cognitive-level-based and text-information-based predicted scores of the learner, and to predict the learner's potential score on a specific test question; and to calculate the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, to select test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test.

2. A personalized paper combining method integrating learner cognitive characteristics and test question text information, characterized in that it is applied to an information data processing terminal, the personalized paper combining method integrating learner cognitive characteristics and test question text information comprising:

estimating the learner's knowledge mastery with a cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's cognitive-level-based score on a specific test question;

extracting the text content of the test questions with a recurrent neural network model, constructing a mapping between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's text-information-based score on a specific test question;

constructing a probability matrix decomposition objective function based on the obtained cognitive-level-based and text-information-based predicted scores of the learner, and predicting the learner's potential score on a specific test question;

calculating the KL divergence using the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and have appropriate difficulty to form the test paper for the personalized test.

3. The personalized paper combining method integrating learner cognitive characteristics and test question text information according to claim 2, characterized in that estimating the learner's knowledge mastery with the cognitive diagnosis model according to the learner's actual responses and the knowledge point distribution of the test questions, and predicting the learner's cognitive-level-based score on a specific test question, includes:

(1.1) collecting the knowledge point distribution of the test questions labeled by domain experts and the learners' response data; calculating, from the knowledge point distribution of the test questions labeled by domain experts, the test question knowledge point Q matrix required by the cognitive diagnosis model;

(1.2) calculating the learner's ideal response according to the learner's prior knowledge point mastery pattern η:

where π_ij denotes the ideal response of learner i on the j-th test question, η_ik denotes learner i's mastery of knowledge point k, and q_jk indicates whether the j-th test question examines knowledge point k;

(1.3) estimating, with the expectation maximization algorithm, the probability s that a learner answers a test question incorrectly although the learner has mastered all knowledge points examined by that test question, and the probability g that a learner answers a test question correctly although the learner has not mastered all of the corresponding knowledge points;

(1.4) calculating the probability that the learner answers correctly from the estimated ideal response of the learner, the probability s that the learner answers a test question incorrectly although all knowledge points examined by it have been mastered, and the probability g that the learner answers a test question correctly although not all of the corresponding knowledge points have been mastered:

(1.5) obtaining the total likelihood function of the DINA model:

where L = 2^K;

(1.6) calculating the learner's knowledge mastery by maximum likelihood estimation based on the obtained total likelihood function:

(1.7) calculating the learner's score on a new test question according to the learner's knowledge mastery:

4. The personalized paper combining method integrating learner cognitive characteristics and test question text information according to claim 2, characterized in that extracting the text content of the test questions with the recurrent neural network model, constructing the mapping between the learner's learning state and the test question text information through the fully connected layer, and predicting the learner's text-information-based score on a specific test question, includes:

(2.1) preprocessing the test question text, including word segmentation and stop-word removal; processing the test questions with the continuous bag-of-words model of the word vector model Word2vec, which predicts the word vector of a target word from the word vectors of several words in its context, and vectorizing the preprocessed input test questions to obtain the word-level embedding representation of the test questions;

(2.2) taking the test question word vectors as input, obtaining the context feature representation of the test question with a bidirectional long short-term memory neural network, and obtaining the sentence-level representation of the test question through mean pooling, thereby obtaining the test question word embedding representation;

(2.3) constructing a fully connected deep neural network, fusing the test question word embedding vector with the learner knowledge mastery vector as the input of the fully connected deep neural network, and outputting the score of a specific student on the test question;

(2.4) processing the outputs of the fully connected layer (y₁, y₂, ..., y_n) by an output unit:

a value in [0, 1] is obtained, representing the probability that the student answers the test question correctly; it is compared with the real score data to correct the weights in the network;

(2.5) training the network model with the training set data to obtain a trained neural network model, and using the trained neural network model to predict the learner's text-information-based response results.

5. The personalized paper combining method integrating learner cognitive characteristics and test question text information according to claim 4, characterized in that, in step (2.1), performing word segmentation on the test question text includes: based on a mixed dictionary, segmenting the test question text with a hybrid method that combines bidirectional maximum matching with statistics;

in step (2.1), removing stop words from the test question text includes: adding to the stop-word list the words in the test question text that are irrelevant to the sentence and to the topic of the test question text, that do not contribute to the test question labeling task, or whose frequency is too low, and deleting from the mixed word segmentation result of the test question text the words that appear in the stop-word list;

in step (2.2), obtaining the context feature representation of the test question with the bidirectional long short-term memory neural network includes:

the BiLSTM network uses two LSTMs to capture the context features of the test question from two opposite directions, according to the following formulas:

where a₁, a₂, b₁ and b₂ are weight coefficients, and g(·) is the hidden-layer activation function,

is the forward hidden-layer output at time t,

is the backward hidden-layer output at time t; the hidden-layer outputs of the two directions at each time step are fused to construct the final output h_t:

where c₁ and c₂ are weight coefficients, and f(·) is the output activation function.

6. as claimed in claim 4, it is characterized in that, in step (2.2), the obtained test question sentence-level representation by mean pooling processing comprises:

In the test question sentence-level representation layer, the test question representation E_h is obtained from the word-level outputs by average pooling:

E_h = mean(h₁, ..., h_t);

where mean( ) is the average pooling operation, that is, the average of the feature values in the pooled neighborhood is taken as the output;

In step (2.3), the fully connected deep neural network includes:

In the fully connected deep neural network, the value of the nth node of the mth layer is calculated as:

where N is the number of units in the previous layer, and each weight coefficient connects the ith unit of the (m-1)th layer to the nth unit of the mth layer;

Using the ReLU activation function, the potential mapping relationship between the learner's knowledge mastery, the text information of the test question, and the score is obtained.
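The mean pooling of step (2.2) and one fully connected layer of step (2.3) can be sketched in plain Python as follows. The function names are hypothetical, and the bias term is an assumption: the claim text mentions only weight coefficients, and the original node-value formula is not available here.

```python
def mean_pool(hidden_states):
    """Element-wise average of equal-length hidden vectors h_1..h_t, giving E_h."""
    t = len(hidden_states)
    return [sum(column) / t for column in zip(*hidden_states)]


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return x if x > 0.0 else 0.0


def dense_layer(inputs, weights, biases):
    """One fully connected layer with ReLU.

    weights[i][n] is the weight from unit i of the previous layer
    to unit n of this layer; biases[n] is an assumed bias term.
    """
    return [
        relu(sum(inputs[i] * weights[i][n] for i in range(len(inputs))) + biases[n])
        for n in range(len(biases))
    ]


# Two hidden vectors of width 2 (toy values).
e_h = mean_pool([[1.0, 2.0], [3.0, 4.0]])
print(e_h)  # [2.0, 3.0]

# A 2-in, 2-out layer with hypothetical weights.
out = dense_layer(e_h, weights=[[1.0, -1.0], [0.5, 0.0]], biases=[0.0, 1.0])
print(out)  # [3.5, 0.0]
```

The second output unit has a pre-activation of -1.0, which ReLU clips to 0.0.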

7. The method as claimed in claim 2, characterized in that constructing the probability matrix factorization objective function, based on the obtained cognitive-level-based and text-information-based predicted scores of the learner, to predict the learner's potential score on a particular test question comprises:

(3.1) Denote the obtained score prediction based on cognitive level and the score prediction based on text information as R¹ and R², respectively, and use the probability matrix factorization algorithm to construct the learner's potential answer representation on the test question:

where U and V are, respectively, the learner characteristic matrix and the achievement characteristic matrix in the probability matrix factorization, and α and β are, respectively, the adjustment parameters for the learner's learning status and for the text information of the test question;

(3.2) Construct the final objective function of the probability matrix factorization:

(3.3) Use the gradient descent method to optimize the objective function to obtain the optimal learner and achievement characteristic matrices U and V:

(3.4) Use the optimal characteristic matrices U and V obtained by training to predict the learner's performance:
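Steps (3.1) to (3.4) can be illustrated with a small matrix factorization trained by full-batch gradient descent. This is a sketch under stated assumptions, not the claimed objective: the fused target α·R¹ + β·R², the squared-error loss with L2 regularization weight `lam`, and all function names and hyperparameters are hypothetical, since the original formulas are not available in this text.

```python
import random


def fuse(r1, r2, alpha, beta):
    """Assumed fusion of the two prediction matrices: alpha * R1 + beta * R2."""
    return [[alpha * a + beta * b for a, b in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pmf_loss(r, u, v, lam):
    """Squared reconstruction error plus L2 penalty on U and V."""
    err = sum((r[i][j] - dot(u[i], v[j])) ** 2
              for i in range(len(u)) for j in range(len(v)))
    reg = sum(x * x for row in u for x in row) + sum(x * x for row in v for x in row)
    return 0.5 * (err + lam * reg)


def train_pmf(r, k=2, lam=0.01, lr=0.05, epochs=500, seed=0):
    """Learn learner factors U (n x k) and question factors V (m x k)."""
    rng = random.Random(seed)
    n, m = len(r), len(r[0])
    u = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]
    v = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]
    for _ in range(epochs):
        # Gradients start from the regularization term.
        gu = [[lam * x for x in row] for row in u]
        gv = [[lam * x for x in row] for row in v]
        for i in range(n):
            for j in range(m):
                e = dot(u[i], v[j]) - r[i][j]
                for f in range(k):
                    gu[i][f] += e * v[j][f]
                    gv[j][f] += e * u[i][f]
        u = [[x - lr * g for x, g in zip(row, grow)] for row, grow in zip(u, gu)]
        v = [[x - lr * g for x, g in zip(row, grow)] for row, grow in zip(v, gv)]
    return u, v


def predict(u, v):
    """Predicted score of every learner on every question: U V^T."""
    return [[dot(ui, vj) for vj in v] for ui in u]


# Toy predictions for 3 learners on 3 questions (hypothetical values).
r1 = [[0.9, 0.4, 0.7], [0.3, 0.2, 0.5], [0.8, 0.6, 0.9]]
r2 = [[0.7, 0.6, 0.7], [0.5, 0.2, 0.3], [0.6, 0.6, 0.7]]
r = fuse(r1, r2, alpha=0.5, beta=0.5)
u, v = train_pmf(r)
scores = predict(u, v)
```

Calling `train_pmf` with `epochs=0` returns the random initialization, which gives a baseline for checking that training lowers the loss.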

Using the estimated learner's knowledge mastery vector and the learner's incremental knowledge mastery vector to calculate the KL divergence and, in combination with the learner's potential score on the test questions, selecting the test questions that tend to increase the learner's knowledge mastery and are of suitable difficulty to form a personalized test paper comprises:

(4.1) Based on the learner's knowledge mastery vector obtained from the analysis, obtain all of the learner's incremental knowledge mastery vectors, indexed by d with 0≤d≤D;

(4.2) Calculate the KL divergence measure between the learner's knowledge mastery vector η_i estimated by the cognitive diagnosis model and all of the learner's incremental knowledge mastery vectors:

(4.3) Select the test questions that tend to increase the learner's knowledge mastery and are of suitable difficulty to form the personalized test paper:
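Steps (4.2) and (4.3) can be sketched as follows. Everything specific here is an assumption made for illustration, since the original formulas are not available in this text: each mastery component is treated as a Bernoulli probability and the per-knowledge-point KL terms are summed, "suitable difficulty" is modeled as a predicted score inside a band [`low`, `high`], and candidates are ranked by KL divergence in descending order.

```python
import math


def bernoulli_kl(p, q, eps=1e-9):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def mastery_kl(eta, eta_inc):
    """Sum of per-knowledge-point KL terms between two mastery vectors."""
    return sum(bernoulli_kl(p, q) for p, q in zip(eta, eta_inc))


def select_questions(eta, candidates, low, high, top_k):
    """Pick questions of suitable difficulty with the largest mastery shift.

    candidates: list of (question_id, incremental_mastery_vector, predicted_score).
    """
    suitable = [
        (mastery_kl(eta, inc), qid)
        for qid, inc, score in candidates
        if low <= score <= high
    ]
    suitable.sort(key=lambda pair: pair[0], reverse=True)
    return [qid for _, qid in suitable[:top_k]]


# Hypothetical learner and candidate questions.
eta = [0.5, 0.5]
candidates = [
    ("q1", [0.5, 0.5], 0.60),  # no change in mastery
    ("q2", [0.8, 0.5], 0.70),  # shifts mastery on the first knowledge point
    ("q3", [0.9, 0.9], 0.99),  # predicted score outside the difficulty band
]
print(select_questions(eta, candidates, low=0.4, high=0.9, top_k=2))  # ['q2', 'q1']
```

Here `q3` is excluded by the difficulty band even though it shifts mastery the most, which shows how the predicted score and the KL measure act as separate criteria in this sketch.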

8. A computer device, characterized in that the computer device comprises a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the processor is caused to execute the personalized test-setting method of claims 1-7 that integrates the learner's cognitive characteristics and the text information of the test questions.

9. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the processor is caused to execute the personalized test-setting method of claims 1-7 that integrates the learner's cognitive characteristics and the text information of the test questions.

10. An information data processing terminal, characterized in that the information data processing terminal is used to implement the personalized test-setting method described in claims 1 to 7 that integrates learners' cognitive characteristics and test question text information.