Disclosure of Invention
In view of the problems in the prior art, the invention provides a personalized test paper composition method and system fusing learner cognitive characteristics and test question text information. The invention aims to solve the following problems in the prior art: traditional personalized test paper composition methods neglect the influence of test question text information on the learner's answers; methods that compose test papers by cognitive diagnosis ignore the commonality among learners and may produce large errors when parameter estimation in the model is sensitive to the data set; and methods that compose test papers by collaborative filtering cannot target practice at the learner's mastery of individual knowledge points, can only compose test papers according to the learning commonality of similar learners, neglect the learner's own characteristics during learning, and yield composition results with poor interpretability.
The invention is realized as follows. A personalized test paper composition method fusing learner cognitive characteristics and test question text information comprises the following steps:
step one, estimating the knowledge mastery of the learner by using a cognitive diagnosis model according to the learner's actual answer records and the knowledge point distribution of the test questions, and predicting the learner's score on a specific test question based on cognitive level;
step two, extracting the text content of the test questions by using a recurrent neural network model, constructing a mapping relation between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's score on the specific test question based on the text information;
step three, constructing a probabilistic matrix factorization objective function based on the obtained predicted scores of the learner based on cognitive level and on text information, and predicting the learner's potential score on the specific test question;
step four, calculating the KL divergence between the estimated knowledge mastery vector of the learner and the incremental knowledge mastery vectors of the learner, and, in combination with the learner's potential scores on the test questions, selecting test questions of suitable difficulty that tend to increase the learner's knowledge mastery to compose a personalized test.
Further, in step one, estimating the knowledge mastery of the learner by using the cognitive diagnosis model according to the learner's actual answer records and the knowledge point distribution of the test questions, and predicting the learner's score on the specific test question based on cognitive level, comprises:
(1.1) collecting the test question knowledge point distribution labeled by domain experts and the response data of learners; constructing, from the expert-labeled knowledge point distribution, the Q matrix (test questions by knowledge points) required by the cognitive diagnosis model;
(1.2) calculating the ideal response of the learner according to the learner's prior knowledge point mastery pattern $\eta$:
wherein $\pi_{ij}$ denotes the ideal response of learner $i$ on the $j$-th test question, $\eta_{ik}$ denotes the mastery of knowledge point $k$ by learner $i$, and $q_{jk}$ denotes whether the $j$-th test question examines knowledge point $k$;
(1.3) estimating, by the expectation-maximization algorithm, the slip probability $s$, i.e. the probability that the learner answers a test question incorrectly despite mastering all knowledge points it examines, and the guess probability $g$, i.e. the probability that the learner answers the test question correctly without mastering all of the corresponding knowledge points;
(1.4) calculating the probability that the learner answers correctly according to the ideal response of the learner, the slip probability $s$ and the guess probability $g$:
(1.5) obtaining the total likelihood function of the DINA model:
wherein $L = 2^K$ is the number of possible knowledge mastery patterns over the $K$ knowledge points;
(1.6) estimating the knowledge mastery of the learner by maximum likelihood estimation based on the obtained total likelihood function:
(1.7) calculating the learner's score on a new test question according to the learner's knowledge mastery:
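By way of non-limiting illustration, steps (1.2) and (1.4) to (1.7) may be sketched in Python as follows. The formulas of this section are not reproduced in the text, so the sketch assumes the standard DINA forms: an ideal response equal to the product of $\eta_{ik}^{q_{jk}}$ over the knowledge points, and a correct-answer probability of $(1-s)^{\pi} g^{1-\pi}$. The function names and the toy data are hypothetical, and the slip and guess values are taken as given rather than estimated.

```python
from itertools import product
from math import prod


def ideal_response(eta, q_row):
    """Return 1 if every knowledge point examined by the question is mastered, else 0."""
    return prod(e ** q for e, q in zip(eta, q_row))


def p_correct(eta, q_row, slip, guess):
    """Assumed DINA probability of a correct answer: (1 - s)^pi * g^(1 - pi)."""
    pi = ideal_response(eta, q_row)
    return (1.0 - slip) ** pi * guess ** (1 - pi)


def likelihood(responses, eta, Q, slips, guesses):
    """Likelihood of one learner's 0/1 responses under mastery pattern eta."""
    value = 1.0
    for r, q_row, s, g in zip(responses, Q, slips, guesses):
        p = p_correct(eta, q_row, s, g)
        value *= p if r == 1 else 1.0 - p
    return value


def estimate_mastery(responses, Q, slips, guesses):
    """Maximum likelihood estimate over all L = 2^K mastery patterns."""
    K = len(Q[0])
    return max(product((0, 1), repeat=K),
               key=lambda eta: likelihood(responses, eta, Q, slips, guesses))


# Toy data (hypothetical): 3 test questions, 2 knowledge points.
Q = [[1, 0], [0, 1], [1, 1]]
slips, guesses = [0.1, 0.1, 0.1], [0.2, 0.2, 0.2]
eta_hat = estimate_mastery([1, 0, 0], Q, slips, guesses)
# Predicted probability of answering a new question that examines both points.
new_score = p_correct(eta_hat, [1, 1], 0.1, 0.2)
```

Enumerating all $2^K$ patterns is only practical for a small number of knowledge points; the sketch uses it because it mirrors the maximum likelihood search of step (1.6) directly.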
Further, in step two, extracting the text content of the test questions by using the recurrent neural network model, constructing the mapping relation between the learner's learning state and the test question text information through the fully connected layer, and predicting the learner's score on the specific test question based on the text information, comprises:
(2.1) preprocessing the test question text, including word segmentation and stop-word removal; processing the test questions with the continuous bag-of-words (CBOW) model of the Word2vec word vector model, which predicts the word vector of a target word from the word vectors of several words in its context, so as to vectorize the preprocessed test questions and obtain a word-level embedded representation of the test questions;
(2.2) taking the test question word vectors as input, obtaining a contextual feature representation of the test question with a bidirectional long short-term memory (BiLSTM) neural network, and obtaining a sentence-level representation of the test question through mean pooling, yielding the test question embedding representation;
(2.3) constructing a fully connected deep neural network, fusing the test question embedding vector with the learner's knowledge mastery vector as the input of the fully connected deep neural network, and outputting the score of a specific learner on the test question;
(2.4) combining the results of the layers $(y_1, y_2, \ldots, y_n)$ and processing them with an output unit:
obtaining a value in [0, 1] that represents the probability that the learner answers the test question correctly, comparing the value with the actual score data, and correcting the weights in the network accordingly;
(2.5) training the network model with the training set data to obtain a trained neural network model, and predicting the learner's response result based on the text information with the trained neural network model.
Further, in step (2.1), the word segmentation of the test question text comprises: performing hybrid word segmentation on the test question text based on a hybrid dictionary, using a method that combines bidirectional maximum matching with statistics.
Further, in step (2.1), removing stop words from the test question text comprises: adding to a stop-word list the words that are irrelevant to the sentence and to the topic of the test question text, contribute nothing to the test question labeling task, and occur with low frequency in the test question text; and deleting from the segmented test question text every word that appears in the stop-word list.
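The segmentation and stop-word removal of step (2.1) may be sketched as follows, purely as an illustration. The text does not specify how the statistical component resolves disagreements between the two matching directions, so this sketch substitutes a common heuristic (prefer fewer tokens, then fewer single-character tokens). The dictionary, the stop-word list and all function names are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy longest-match segmentation scanning left to right."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:  # single characters always match
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len=4):
    """Greedy longest-match segmentation scanning right to left."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]


def segment(text, dictionary, max_len=4):
    """Bidirectional maximum matching with a heuristic tie-break (assumption)."""
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)
    if len(fwd) != len(bwd):
        return min(fwd, bwd, key=len)

    def singles(tokens):
        return sum(len(t) == 1 for t in tokens)

    return bwd if singles(bwd) < singles(fwd) else fwd


def remove_stop_words(tokens, stop_words):
    """Drop every token that appears in the stop-word list."""
    return [t for t in tokens if t not in stop_words]


# Hypothetical dictionary and stop-word list for a short Chinese phrase.
dictionary = {"研究", "研究生", "生命", "起源"}
tokens = segment("研究生命的起源", dictionary)
cleaned = remove_stop_words(tokens, {"的"})
```

In this example the forward pass greedily takes the three-character word and leaves a stray single character, while the backward pass does not, so the tie-break selects the backward segmentation.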
Further, in step (2.2), obtaining the contextual feature representation of the test question with the bidirectional long short-term memory neural network comprises:
the BiLSTM network uses two LSTMs to obtain the contextual features of the test question from two opposite directions, with the following formula:
wherein $a_1$, $a_2$, $b_1$ and $b_2$ are weight coefficients, $g(\cdot)$ is the hidden layer activation function, $\overrightarrow{h}_t$ is the forward hidden layer output at time $t$, and $\overleftarrow{h}_t$ is the backward hidden layer output at time $t$; the hidden layer outputs of the two directions at each time step are fused to construct the final output $h_t$:
wherein $c_1$ and $c_2$ are weight coefficients and $f(\cdot)$ is the output activation function.
Further, in step (2.2), the obtaining of the sentence-level representation of the test question through the mean pooling process includes:
obtaining the test question embedding representation $E_h$ by mean pooling at the sentence level:
$E_h = \mathrm{mean}(h_1, \ldots, h_t)$
wherein $\mathrm{mean}(\cdot)$ is the mean pooling operation, i.e. taking the average of the feature values within the pooling region as the output.
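The bidirectional encoding and mean pooling of step (2.2) may be illustrated with the deliberately simplified sketch below. It is not a full LSTM: the gating is omitted and each direction is a plain recurrence of the assumed form $h_t = g(w_{in} x_t + w_{rec} h_{prev})$ with scalar weights, chosen only to show how the coefficients $a_1, a_2, b_1, b_2, c_1, c_2$ and the activations $g$ and $f$ could fit together. The recurrence form, the use of tanh for both activations, and all function names are assumptions.

```python
from math import tanh


def recur(inputs, w_in, w_rec, g=tanh):
    """Simplified recurrence h_t = g(w_in * x_t + w_rec * h_prev), elementwise."""
    h_prev, outputs = [0.0] * len(inputs[0]), []
    for x in inputs:
        h_prev = [g(w_in * xi + w_rec * hi) for xi, hi in zip(x, h_prev)]
        outputs.append(h_prev)
    return outputs


def bidirectional(inputs, a1, a2, b1, b2, c1, c2, g=tanh, f=tanh):
    """Fuse forward and backward hidden outputs: h_t = f(c1 * fwd_t + c2 * bwd_t)."""
    forward = recur(inputs, a1, a2, g)
    backward = recur(inputs[::-1], b1, b2, g)[::-1]  # scan right to left, then realign
    return [[f(c1 * hf + c2 * hb) for hf, hb in zip(fw, bw)]
            for fw, bw in zip(forward, backward)]


def mean_pool(states):
    """E_h = mean(h_1, ..., h_t), averaged per feature over all time steps."""
    return [sum(column) / len(states) for column in zip(*states)]


# Hypothetical 2-dimensional word vectors for a three-word test question.
word_vectors = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
hidden = bidirectional(word_vectors, 0.8, 0.5, 0.8, 0.5, 0.6, 0.6)
E_h = mean_pool(hidden)
```

In practice a library BiLSTM layer would replace `recur`; the pooling step stays the same, since it only averages the per-step outputs into one fixed-length vector.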
Further, in step (2.3), the fully connected deep neural network is as follows:
in the fully connected deep neural network, the value of the $n$-th node of the $m$-th layer is calculated as:
wherein $N$ is the number of units in the previous layer, and the weight coefficient is that from the $i$-th unit of the $(m-1)$-th layer to the $n$-th unit of the $m$-th layer;
a ReLU activation function is adopted to obtain the latent mapping relation among the learner's knowledge mastery, the test question text information and the score.
Further, in step three, constructing the probabilistic matrix factorization objective function based on the obtained predicted scores of the learner based on cognitive level and on text information, and predicting the learner's potential score on the specific test question, comprises:
(3.1) denoting the learner's score prediction based on cognitive level and score prediction based on text information as $R_1$ and $R_2$ respectively, and constructing a latent response representation of the learner on the test questions by the probabilistic matrix factorization algorithm:
wherein $U$ and $V$ are respectively the learner feature matrix and the score feature matrix in the probabilistic matrix factorization, and $\alpha$ and $\beta$ are respectively the adjusting parameters for the learner's learning condition and for the test question text information;
(3.2) constructing the final objective function of the probabilistic matrix factorization:
(3.3) optimizing the objective function by gradient descent to obtain the optimal learner feature matrix $U$ and score feature matrix $V$:
(3.4) predicting the learner's score by using the optimal feature matrices $U$ and $V$ obtained by training:
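Steps (3.1) to (3.4) may be illustrated by the sketch below. The objective function itself is not reproduced in the text, so the form used here is only one plausible assumption: a squared error against the observed scores, plus squared errors against the two prior predictions $R_1$ and $R_2$ weighted by $\alpha$ and $\beta$, plus an L2 penalty on $U$ and $V$. The regularization weight `lam`, the learning rate, the toy matrices and all function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def objective(R, R1, R2, U, V, alpha, beta, lam):
    """Assumed objective: observed-score error plus alpha/beta-weighted prior terms."""
    loss = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            pred = dot(U[i], V[j])
            if r is not None:  # None marks a question the learner has not answered
                loss += 0.5 * (r - pred) ** 2
            loss += 0.5 * alpha * (R1[i][j] - pred) ** 2
            loss += 0.5 * beta * (R2[i][j] - pred) ** 2
    return loss + 0.5 * lam * sum(x * x for row in U + V for x in row)


def gradient_step(R, R1, R2, U, V, alpha, beta, lam, lr):
    """One full-batch gradient descent step on the objective above."""
    gU = [[lam * x for x in row] for row in U]
    gV = [[lam * x for x in row] for row in V]
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            pred = dot(U[i], V[j])
            g = alpha * (pred - R1[i][j]) + beta * (pred - R2[i][j])
            if r is not None:
                g += pred - r
            for d in range(len(U[i])):
                gU[i][d] += g * V[j][d]
                gV[j][d] += g * U[i][d]
    U = [[x - lr * gx for x, gx in zip(row, grow)] for row, grow in zip(U, gU)]
    V = [[x - lr * gx for x, gx in zip(row, grow)] for row, grow in zip(V, gV)]
    return U, V


def fit(R, R1, R2, dim=2, alpha=0.3, beta=0.3, lam=0.01, lr=0.05, steps=300, seed=0):
    """Return the fitted U and V with the objective value before and after training."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(dim)] for _ in R]
    V = [[rng.uniform(0.1, 0.5) for _ in range(dim)] for _ in R[0]]
    start = objective(R, R1, R2, U, V, alpha, beta, lam)
    for _ in range(steps):
        U, V = gradient_step(R, R1, R2, U, V, alpha, beta, lam, lr)
    return U, V, start, objective(R, R1, R2, U, V, alpha, beta, lam)


# Toy data (hypothetical): 3 learners, 3 test questions.
R = [[1, None, 0], [1, 1, None], [None, 0, 0]]
R1 = [[0.9, 0.6, 0.2], [0.9, 0.9, 0.6], [0.2, 0.2, 0.2]]
R2 = [[0.8, 0.5, 0.3], [0.8, 0.7, 0.5], [0.4, 0.3, 0.2]]
U, V, start_loss, end_loss = fit(R, R1, R2)
potential_score = dot(U[0], V[1])  # learner 0 on the unanswered question 1
```

Because the prior terms are defined for every learner and question pair, the factorization receives a training signal even for unanswered questions, which is the role the text assigns to the two prior predictions.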
Further, in step four, calculating the KL divergence between the estimated knowledge mastery vector of the learner and the incremental knowledge mastery vectors of the learner, and, in combination with the learner's potential scores on the test questions, selecting test questions of suitable difficulty that tend to increase the learner's knowledge mastery to compose the personalized test, comprises:
(4.1) based on the knowledge mastery vector of the learner obtained by the analysis, obtaining all incremental knowledge mastery vectors of the learner, indexed by $d$ with $0 \le d \le D$;
(4.2) calculating the KL divergence between the learner knowledge mastery vector $\eta_i$ estimated by the cognitive diagnosis model and all incremental knowledge mastery vectors of the learner:
(4.3) selecting test questions of suitable difficulty that tend to increase the learner's knowledge mastery to compose the personalized test:
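Steps (4.1) to (4.3) may be sketched as follows. The selection formula is not reproduced in the text, so this sketch substitutes a stand-in: the KL item-selection index used in adaptive testing for cognitive diagnosis, which sums, over the incremental mastery vectors, the KL divergence between the response distributions under the estimated vector and under each incremental vector. It further assumes that each incremental vector adds exactly one not-yet-mastered knowledge point, and that suitable difficulty means a potential score within a band `[low, high]`. All names and values are hypothetical.

```python
from math import log


def p_correct(eta, q_row, slip, guess):
    """DINA-style correct-answer probability for mastery vector eta."""
    mastered = all(e >= q for e, q in zip(eta, q_row))
    return 1.0 - slip if mastered else guess


def incremental_vectors(eta):
    """Assumption: each incremental vector masters one additional knowledge point."""
    eta = tuple(eta)
    return [eta[:k] + (1,) + eta[k + 1:] for k, e in enumerate(eta) if e == 0]


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), with 0 < p, q < 1."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))


def kl_index(eta, q_row, slip, guess):
    """Sum of KL divergences between the current and each incremental vector."""
    p = p_correct(eta, q_row, slip, guess)
    return sum(bernoulli_kl(p, p_correct(inc, q_row, slip, guess))
               for inc in incremental_vectors(eta))


def select_questions(eta, Q, slips, guesses, potential, n, low=0.3, high=0.8):
    """Keep questions inside the difficulty band, then rank them by KL index."""
    candidates = [j for j in range(len(Q)) if low <= potential[j] <= high]
    candidates.sort(key=lambda j: kl_index(eta, Q[j], slips[j], guesses[j]),
                    reverse=True)
    return candidates[:n]


# Toy data (hypothetical): the learner masters only the first of 3 knowledge points.
eta_i = (1, 0, 0)
Q = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [0, 0, 1]]
slips, guesses = [0.1] * 4, [0.2] * 4
potential = [0.9, 0.5, 0.25, 0.6]  # potential scores from the factorization step
paper = select_questions(eta_i, Q, slips, guesses, potential, n=2)
```

Under these assumptions a question scores highly when mastering one more knowledge point would change the learner's expected response, so questions that are already mastered, or that remain out of reach after a single increment, rank low.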
Another object of the present invention is to provide a personalized test paper composition system fusing learner cognitive characteristics and test question text information, which implements the above personalized test paper composition method and comprises:
the cognitive-level-based score prediction module, used for estimating the knowledge mastery of the learner by using a cognitive diagnosis model according to the learner's actual answer records and the knowledge point distribution of the test questions, and predicting the learner's score on a specific test question based on cognitive level;
the text-information-based score prediction module, used for extracting the text content of the test questions by using a recurrent neural network model, constructing a mapping relation between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's score on the specific test question based on the text information;
the question selection strategy module, used for constructing a probabilistic matrix factorization objective function based on the obtained predicted scores of the learner based on cognitive level and on text information, and predicting the learner's potential score on the specific test question; and for calculating the KL divergence between the estimated knowledge mastery vector of the learner and the incremental knowledge mastery vectors of the learner, and, in combination with the learner's potential scores on the test questions, selecting test questions of suitable difficulty that tend to increase the learner's knowledge mastery to compose a personalized test.
Another object of the present invention is to provide a computer device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to execute the personalized test paper composition method fusing learner cognitive characteristics and test question text information.
Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the personalized test paper composition method fusing learner cognitive characteristics and test question text information.
Another object of the present invention is to provide an information data processing terminal for implementing the personalized test paper composition method fusing learner cognitive characteristics and test question text information.
In combination with all of the above technical solutions, the advantages and positive effects of the invention are as follows: by fusing learner cognitive characteristics with test question text information, the invention provides suitable practice questions for the learner, avoiding the traditional problems of giving every learner the same questions and of drilling through a "sea of questions"; the learner can practise in a more targeted way, which improves learning efficiency. The invention has broad application prospects in the fields of personalized learning, adaptive learning and intelligent education.
The invention provides a personalized test paper composition method fusing learner cognitive characteristics and test question text information, which combines a cognitive diagnosis model, model-based collaborative filtering and a test question text information extraction model to compose practice test papers for the learner. The invention analyzes the learner's mastery of knowledge points to obtain the learner's learning state; the cognitive diagnosis model used is the widely used DINA model. For test question text analysis, the invention uses a recurrent neural network (RNN) model to extract the text content of the test questions, and constructs the mapping relation between the learner's learning state and the test question text information through a fully connected layer. The outputs of these models are taken as prior information and trained together with other information in a collaborative filtering method based on probabilistic matrix factorization, so that the composition result simultaneously reflects the learner's learning state, the test question text information and the commonality among learners.
According to the experimental results, the personalized test paper composition method fusing cognitive characteristics and test question text information performs substantially better than traditional test paper composition methods and remedies their shortcomings in test question recommendation. The invention uses richer information to provide more accurate personalized test question recommendation for the learner, thereby improving the learner's learning efficiency.
The invention realizes a personalized test paper composition method fusing learner cognitive characteristics and test question text information, which combines cognitive diagnosis, test question text information and the commonality among learners to recommend test questions to the target learner, obtains a more accurate personalized test paper, greatly increases the learner's self-study efficiency, and helps the learner remedy knowledge gaps. The method can be applied in fields such as intelligent education technology and educational data mining, can provide effective support for subsequent educational resource recommendation, and can help online and digital education platforms better predict learners' scores, so as to efficiently discover the learner's weak knowledge points and take accurate remedial measures.
The personalized test paper composition method fusing learner cognitive characteristics and test question text information provided by the invention was compared with other methods on the same data set; the comparison of accuracy, recall and F1 value is shown in Table 1.
Table 1. Comparison of results
The experimental results show that the personalized test paper composition method fusing learner cognitive characteristics and test question text information combines the learner's cognitive characteristics, the commonality among learners and the test question text information, and that the accuracy of its composition results is clearly better than that of the other methods compared. The experiments therefore indicate that the method is more effective than the other methods in terms of accuracy, recall and F1 value.
Meanwhile, the analysis shows that the DINA-based test paper composition method and the probabilistic matrix factorization method are slightly unstable, and their accuracy decreases as the number of test questions in the composed paper increases. Although conventional probabilistic matrix factorization is easy to implement, it extracts insufficient latent information from the data set, which leads to low precision in general, especially when facing a large amount of training data. In short, the personalized test paper composition method fusing learner cognitive characteristics and test question text information achieves the best experimental results, and can construct, according to the test difficulty and the assessment objective, a test paper that combines test questions of different difficulty levels and promotes the learner's cognitive growth.
The invention introduces test question text information as an important measure for recommending educational test questions, so that the key information in the test question text can be exploited by the method provided by the invention.
The invention fuses test question text information, cognitive diagnosis and collaborative filtering: the learner's learning condition obtained from cognitive diagnosis and the test question text information are integrated into the objective function optimized by collaborative filtering, and the relation among the three is controlled through the adjusting parameters, so as to obtain the optimal test paper composition model for the current data set.
In conclusion, the personalized test paper composition method fusing learner cognitive characteristics and test question text information provided by the invention realizes more accurate test question recommendation and personalized test paper composition for the learner; it combines cognitive diagnosis, test question text information and the commonality among learners to recommend test questions to the target learner, can customize the composition result according to the test objective and the test difficulty, greatly increases the learner's self-study efficiency, and more quickly helps the learner remedy weak points in the knowledge learned in class. The method can be applied in fields such as educational resource recommendation and evaluation and educational data mining, thereby providing effective support for subsequent educational resource recommendation and helping online and digital education platforms better predict learners' scores and provide personalized test paper schemes, so that the learner's weak knowledge points can be efficiently diagnosed and accurate remedial measures taken.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the invention provides a personalized test paper composition method fusing learner cognitive characteristics and test question text information, which is described in detail below with reference to the accompanying drawings.
The symbols involved in the present invention are as follows:
As shown in Figs. 1-2, the personalized test paper composition method fusing learner cognitive characteristics and test question text information provided by the embodiment of the present invention comprises the following steps:
S101, estimating the knowledge mastery of the learner by using a cognitive diagnosis model according to the learner's actual answer records and the knowledge point distribution of the test questions, and predicting the learner's score on a specific test question based on cognitive level;
S102, extracting the text content of the test questions by using a recurrent neural network model, constructing a mapping relation between the learner's learning state and the test question text information through a fully connected layer, and predicting the learner's score on the specific test question based on the text information;
S103, constructing a probabilistic matrix factorization objective function based on the obtained predicted scores of the learner based on cognitive level and on text information, and predicting the learner's potential score on the specific test question;
S104, calculating the KL divergence between the estimated knowledge mastery vector of the learner and the incremental knowledge mastery vectors of the learner, and, in combination with the learner's potential scores on the test questions, selecting test questions of suitable difficulty that tend to increase the learner's knowledge mastery to compose a personalized test.
In step S101, estimating the knowledge mastery of the learner by using the cognitive diagnosis model according to the learner's actual answer records and the knowledge point distribution of the test questions, and predicting the learner's score on the specific test question based on cognitive level, as provided by the embodiment of the present invention, comprises:
(1.1) collecting the test question knowledge point distribution labeled by domain experts and the response data of learners; constructing, from the expert-labeled knowledge point distribution, the Q matrix (test questions by knowledge points) required by the cognitive diagnosis model;
(1.2) calculating the ideal response of the learner according to the learner's prior knowledge point mastery pattern $\eta$:
wherein $\pi_{ij}$ denotes the ideal response of learner $i$ on the $j$-th test question, $\eta_{ik}$ denotes the mastery of knowledge point $k$ by learner $i$, and $q_{jk}$ denotes whether the $j$-th test question examines knowledge point $k$;
(1.3) estimating, by the expectation-maximization algorithm, the slip probability $s$, i.e. the probability that the learner answers a test question incorrectly despite mastering all knowledge points it examines, and the guess probability $g$, i.e. the probability that the learner answers the test question correctly without mastering all of the corresponding knowledge points;
(1.4) calculating the probability that the learner answers correctly according to the ideal response of the learner, the slip probability $s$ and the guess probability $g$:
(1.5) obtaining the total likelihood function of the DINA model:
wherein $L = 2^K$ is the number of possible knowledge mastery patterns over the $K$ knowledge points;
(1.6) estimating the knowledge mastery of the learner by maximum likelihood estimation based on the obtained total likelihood function:
(1.7) calculating the learner's score on a new test question according to the learner's knowledge mastery:
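The expectation-maximization estimation of the slip and guess probabilities in step (1.3) may be illustrated by the sketch below. It assumes a fixed uniform prior over the $2^K$ mastery patterns and the usual DINA response probabilities; the embodiment does not state its prior or its stopping rule, so both are assumptions, as are the function names, the fixed iteration count and the toy response data.

```python
from itertools import product
from math import log


def ideal_matrix(Q):
    """ideal[l][j] = 1 if mastery pattern l covers every knowledge point of question j."""
    patterns = product((0, 1), repeat=len(Q[0]))
    return [[int(all(a >= q for a, q in zip(p, row))) for row in Q] for p in patterns]


def pattern_likelihoods(x, ideal, slip, guess):
    """Likelihood of one 0/1 response vector x under each mastery pattern."""
    out = []
    for pi in ideal:
        lik = 1.0
        for j, xj in enumerate(x):
            p = 1.0 - slip[j] if pi[j] else guess[j]
            lik *= p if xj else 1.0 - p
        out.append(lik)
    return out


def marginal_loglik(X, Q, slip, guess):
    """Marginal log-likelihood with a uniform prior over patterns (assumption)."""
    ideal = ideal_matrix(Q)
    return sum(log(sum(pattern_likelihoods(x, ideal, slip, guess)) / len(ideal))
               for x in X)


def em_slip_guess(X, Q, iters=30, init=0.2, eps=1e-6):
    """Estimate per-question slip and guess; each Q row must examine some point."""
    ideal, J = ideal_matrix(Q), len(Q)
    slip, guess = [init] * J, [init] * J
    for _ in range(iters):
        n1, r1 = [0.0] * J, [0.0] * J  # expected masters / correct masters
        n0, r0 = [0.0] * J, [0.0] * J  # expected non-masters / correct non-masters
        for x in X:
            liks = pattern_likelihoods(x, ideal, slip, guess)  # E-step
            total = sum(liks)
            for lik, pi in zip(liks, ideal):
                w = lik / total  # posterior weight of this pattern
                for j in range(J):
                    if pi[j]:
                        n1[j] += w
                        r1[j] += w * x[j]
                    else:
                        n0[j] += w
                        r0[j] += w * x[j]
        for j in range(J):  # M-step, clipped away from 0 and 1
            slip[j] = min(max((n1[j] - r1[j]) / n1[j], eps), 1 - eps)
            guess[j] = min(max(r0[j] / n0[j], eps), 1 - eps)
    return slip, guess


# Toy response data (hypothetical): 6 learners, 3 questions, 2 knowledge points.
Q = [[1, 0], [0, 1], [1, 1]]
X = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 1], [1, 0, 0]]
slip, guess = em_slip_guess(X, Q)
```

Clipping the estimates away from 0 and 1 keeps every likelihood strictly positive, so the posterior weights in the E-step are always well defined.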
In step S102, extracting the text content of the test questions by using the recurrent neural network model, constructing the mapping relation between the learner's learning state and the test question text information through the fully connected layer, and predicting the learner's score on the specific test question based on the text information, as provided by the embodiment of the present invention, comprises:
(2.1) preprocessing the test question text, including word segmentation and stop-word removal; processing the test questions with the continuous bag-of-words (CBOW) model of the Word2vec word vector model, which predicts the word vector of a target word from the word vectors of several words in its context, so as to vectorize the preprocessed test questions and obtain a word-level embedded representation of the test questions;
(2.2) taking the test question word vectors as input, obtaining a contextual feature representation of the test question with a bidirectional long short-term memory (BiLSTM) neural network, and obtaining a sentence-level representation of the test question through mean pooling, yielding the test question embedding representation;
(2.3) constructing a fully connected deep neural network, fusing the test question embedding vector with the learner's knowledge mastery vector as the input of the fully connected deep neural network, and outputting the score of a specific learner on the test question;
(2.4) combining the results of the layers $(y_1, y_2, \ldots, y_n)$ and processing them with an output unit:
obtaining a value in [0, 1] that represents the probability that the learner answers the test question correctly, comparing the value with the actual score data, and correcting the weights in the network accordingly;
(2.5) training the network model with the training set data to obtain a trained neural network model, and predicting the learner's response result based on the text information with the trained neural network model.
In step (2.1), the word segmentation of the test question text provided by the embodiment of the present invention comprises: performing hybrid word segmentation on the test question text based on a hybrid dictionary, using a method that combines bidirectional maximum matching with statistics.
In step (2.1), removing stop words from the test question text as provided by the embodiment of the present invention comprises: adding to a stop-word list the words that are irrelevant to the sentence and to the topic of the test question text, contribute nothing to the test question labeling task, and occur with low frequency in the test question text; and deleting from the segmented test question text every word that appears in the stop-word list.
In step (2.2), obtaining the contextual feature representation of the test question with the bidirectional long short-term memory neural network, as provided by the embodiment of the present invention, comprises:
the BiLSTM network uses two LSTMs to obtain the contextual features of the test question from two opposite directions, with the following formula:
wherein $a_1$, $a_2$, $b_1$ and $b_2$ are weight coefficients, $g(\cdot)$ is the hidden layer activation function, $\overrightarrow{h}_t$ is the forward hidden layer output at time $t$, and $\overleftarrow{h}_t$ is the backward hidden layer output at time $t$; the hidden layer outputs of the two directions at each time step are fused to construct the final output $h_t$:
wherein $c_1$ and $c_2$ are weight coefficients and $f(\cdot)$ is the output activation function.
In step (2.2), the obtaining of the sentence-level representation of the test question through the mean pooling process provided by the embodiment of the present invention includes:
obtaining the test question embedding representation $E_h$ by mean pooling at the sentence level:
$E_h = \mathrm{mean}(h_1, \ldots, h_t)$
wherein $\mathrm{mean}(\cdot)$ is the mean pooling operation, i.e. taking the average of the feature values within the pooling region as the output.
In step (2.3), the fully connected deep neural network provided by the embodiment of the present invention is as follows:
in the fully connected deep neural network, the value of the $n$-th node of the $m$-th layer is calculated as:
wherein $N$ is the number of units in the previous layer, and the weight coefficient is that from the $i$-th unit of the $(m-1)$-th layer to the $n$-th unit of the $m$-th layer;
a ReLU activation function is adopted to obtain the latent mapping relation among the learner's knowledge mastery, the test question text information and the score.
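The forward pass of the fully connected network of steps (2.3) and (2.4) may be sketched as follows. Several details are assumptions, since the node formula and the output unit are not reproduced in the text: feature fusion is taken to be concatenation, each node is taken to be a weighted sum of the previous layer plus a bias passed through ReLU, and the output unit is taken to be a sigmoid so that the result lies in [0, 1]. The weights shown are hand-picked for illustration, not trained values, and the function names are hypothetical.

```python
from math import exp


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def dense(inputs, weights, biases, activation):
    """weights[n][i] is the weight from unit i of the previous layer to unit n."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict_score(question_embedding, mastery, hidden_layers, output_layer):
    """Fuse the two feature vectors, apply ReLU layers, then a sigmoid output unit."""
    x = list(question_embedding) + list(mastery)  # fusion by concatenation (assumption)
    for weights, biases in hidden_layers:
        x = dense(x, weights, biases, relu)
    weights, biases = output_layer
    return dense(x, weights, biases, sigmoid)[0]


# Hand-picked illustrative weights: one hidden layer with two units.
hidden_layers = [([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], [0.0, 0.0])]
output_layer = ([[2.0, 1.0]], [-3.0])
probability = predict_score([0.5, -0.2], [1, 0], hidden_layers, output_layer)
```

Training would compare this probability with the actual score and update the weights by backpropagation, as step (2.4) describes; that part is omitted here.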
In step S103, constructing the probabilistic matrix factorization objective function based on the obtained predicted scores of the learner based on cognitive level and on text information, and predicting the learner's potential score on the specific test question, as provided by the embodiment of the present invention, comprises:
(3.1) denoting the learner's score prediction based on cognitive level and score prediction based on text information as $R_1$ and $R_2$ respectively, and constructing a latent response representation of the learner on the test questions by the probabilistic matrix factorization algorithm:
wherein $U$ and $V$ are respectively the learner feature matrix and the score feature matrix in the probabilistic matrix factorization, and $\alpha$ and $\beta$ are respectively the adjusting parameters for the learner's learning condition and for the test question text information;
(3.2) constructing the final objective function of the probabilistic matrix factorization:
(3.3) optimizing the objective function by gradient descent to obtain the optimal learner feature matrix $U$ and score feature matrix $V$:
(3.4) predicting the learner's score by using the optimal feature matrices $U$ and $V$ obtained by training:
In step S104, calculating the KL divergence from the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors, and combining it with the learner's potential scores on the test questions to select test questions that increase the learner's knowledge mastery trend and are of appropriate difficulty to form the test paper for the personalized test, as provided by the embodiment of the present invention, includes:
(4.1) denoting the learner's knowledge mastery vector obtained by analysis as $\eta_i$, and obtaining from $\eta_i$ all incremental knowledge mastery vectors $\eta_i^{(d)}$ of the learner, $0 \le d \le D$;
(4.2) calculating the KL divergence measure between the learner knowledge mastery vector $\eta_i$ estimated by the cognitive diagnosis model and each incremental knowledge mastery vector $\eta_i^{(d)}$ of the learner:
(4.3) selecting test questions that increase the learner's knowledge mastery trend and are of appropriate difficulty to form the test paper for the personalized test:
As shown in Fig. 3, the personalized paper combining system integrating the learner's cognitive characteristics and the test question text information provided in the embodiment of the present invention includes:
the cognitive-level-based score prediction module 1, which is used for estimating the learner's knowledge mastery by using a cognitive diagnosis model according to the learner's real answering records and the knowledge point distribution of the test questions, and for predicting the learner's cognitive-level-based score on a specific test question;
the text-information-based score prediction module 2, which is used for extracting the text content of the test questions by using a recurrent neural network model, constructing a mapping between the learner's learning state and the test question text information through a fully-connected layer, and predicting the learner's text-information-based score on a specific test question;
the question selection strategy module 3, which is used for constructing a probability matrix decomposition objective function from the obtained cognitive-level-based and text-information-based predicted scores of the learner and predicting the learner's potential score on a specific test question; and for calculating the KL divergence from the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors, and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and are of appropriate difficulty to form the test paper.
The technical effects of the present invention will be further described with reference to specific embodiments.
Example 1:
The personalized paper combining method integrating the learner's cognitive characteristics and test question text information comprises the following steps:
Step one: mining learning states, such as the learner's knowledge mastery, by using a cognitive diagnosis model according to the learner's real answering records and the knowledge point distribution of the test questions.
Step two: extracting the text content of the test questions by using a recurrent neural network model, and constructing a mapping between the learner's learning state and the test question text information through a fully-connected layer.
Step three: constructing a probability matrix decomposition objective function that integrates the learner's learning state, the test question text information, and the cognitive commonality among learners, and mining the learner's potential scores on specific test questions.
Step four: calculating the KL divergence from the estimated learner knowledge mastery vector and the learner incremental knowledge mastery vectors, and, in combination with the learner's potential scores on the test questions, selecting test questions that increase the learner's knowledge mastery trend and are of appropriate difficulty to form the test paper for the personalized test.
Further, the first step comprises:
step a): collecting test question knowledge point distribution marked by domain experts and answer data of learners;
step b): calculating a test question knowledge point representation Q matrix required to be used in the cognitive diagnosis model according to the distribution of test question knowledge points labeled by domain experts;
step c): according to the learner's prior knowledge point mastery pattern η, calculating the learner's ideal response:
where $\pi_{ij}$ denotes the ideal response of learner i on the j-th test question, $\eta_{ik}$ denotes learner i's mastery of knowledge point k, and $q_{jk}$ indicates whether the j-th test question examines knowledge point k;
step d): using the expectation-maximization algorithm, estimating the slip probability s, i.e., the probability that a learner answers a test question incorrectly despite mastering all knowledge points it examines, and the guess probability g, i.e., the probability that a learner answers a test question correctly without mastering all of the corresponding knowledge points;
step e): calculating the probability that the learner answers correctly from the learner's ideal response, the slip probability s, and the guess probability g:
step f): the overall likelihood function of the DINA model is thus obtained:
where $L = 2^K$. Because the likelihood contains the latent variable $\eta_l$, maximum likelihood estimation cannot be carried out directly, so the expectation-maximization method is used to solve it as follows:
step g): E-step: using the estimates of $s_j$ and $g_j$ obtained in the previous round, calculating the matrix $P(R \mid \eta) = [P(R_i \mid \eta_l)]_{I \times L}$, and using $P(R \mid \eta)$ to calculate the matrix $P(\eta \mid R) = [P(\eta_l \mid R_i)]_{L \times I}$.
Step h): M-step: setting the partial derivatives of the likelihood with respect to $s_j$ and $g_j$ to zero gives update formulas expressed in terms of four expected counts. The first is the expected number of learners, among those belonging to the l-th knowledge point mastery pattern, who lack at least one knowledge point required by the j-th test question; the second is the expected number of those learners who answer the j-th test question correctly. The third and fourth have the same meaning as the first and second, except that they are expectations for the case where the learner has mastered all knowledge points required by the j-th test question. These four quantities can be calculated from the estimates obtained in the E-step, which yields new estimates of $s_j$ and $g_j$.
Step i): calculating the learner's knowledge mastery by maximum likelihood estimation using the overall likelihood function:
step j): calculating the learner's score on a new test question according to the learner's knowledge mastery:
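Steps c) to j) can be illustrated with a minimal sketch. It assumes the standard DINA form, in which the ideal response is 1 only when every examined knowledge point is mastered and the probability of a correct answer is 1 - s_j in that case and g_j otherwise. The equations themselves are not reproduced in this text, so this form, the uniform prior over mastery patterns, the clamping of s and g, the toy data, and all function names (`e_step`, `m_step`, `fit_dina`) are assumptions for illustration only.

```python
from itertools import product

def mastered_all(eta, q_j):
    # Ideal response: True only if every knowledge point examined by q_j is mastered.
    return all(e == 1 for e, q in zip(eta, q_j) if q == 1)

def p_correct(eta, q_j, s_j, g_j):
    # Assumed DINA response: 1 - s_j when all required points are mastered, else g_j.
    return 1.0 - s_j if mastered_all(eta, q_j) else g_j

def e_step(R, Q, s, g):
    # Posterior P(eta_l | R_i) over the L = 2^K patterns, one row per learner.
    # A uniform prior over patterns is assumed, so it cancels in the normalization.
    patterns = list(product([0, 1], repeat=len(Q[0])))
    posterior = []
    for r_i in R:
        lik = []
        for eta in patterns:
            p = 1.0
            for j, q_j in enumerate(Q):
                p_j = p_correct(eta, q_j, s[j], g[j])
                p *= p_j if r_i[j] == 1 else 1.0 - p_j
            lik.append(p)
        total = sum(lik)
        posterior.append([x / total for x in lik])
    return patterns, posterior

def clamp(x, lo=1e-6, hi=1.0 - 1e-6):
    # Keep s and g strictly inside (0, 1) so every likelihood stays positive.
    return min(max(x, lo), hi)

def m_step(R, Q, patterns, posterior):
    # Re-estimate s_j and g_j from expected counts; assumes each question
    # examines at least one knowledge point.
    s_new, g_new = [], []
    for j, q_j in enumerate(Q):
        i0 = r0 = i1 = r1 = 0.0
        for r_i, post_i in zip(R, posterior):
            for eta, w in zip(patterns, post_i):
                if mastered_all(eta, q_j):
                    i1 += w             # expected learners mastering all points
                    r1 += w * r_i[j]    # ... of whom this many answer correctly
                else:
                    i0 += w             # expected learners lacking some point
                    r0 += w * r_i[j]    # ... of whom this many answer correctly
        g_new.append(clamp(r0 / i0))
        s_new.append(clamp((i1 - r1) / i1))
    return s_new, g_new

def fit_dina(R, Q, iters=30, init=0.2):
    s, g = [init] * len(Q), [init] * len(Q)
    for _ in range(iters):
        patterns, posterior = e_step(R, Q, s, g)
        s, g = m_step(R, Q, patterns, posterior)
    patterns, posterior = e_step(R, Q, s, g)
    # With a uniform prior, the posterior mode is the maximum likelihood pattern.
    mastery = [patterns[max(range(len(p)), key=p.__getitem__)] for p in posterior]
    return s, g, mastery

Q = [[1, 0], [0, 1], [1, 1]]                       # 3 questions, 2 knowledge points
R = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 0]]   # answers of 4 learners
s, g, mastery = fit_dina(R, Q)
# Step j) on a new question, with hypothetical slip 0.1 and guess 0.2.
predicted = [p_correct(eta, [1, 0], 0.1, 0.2) for eta in mastery]
```

Enumerating all 2^K patterns is only practical for a small number of knowledge points; it is used here because it keeps the E-step and M-step easy to follow.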
further, the second step comprises:
step 1): preprocessing the test question text, where the preprocessing mainly comprises text word segmentation and stop word removal;
step 2): performing word segmentation on the test question text: based on a mixed dictionary, the test question text is segmented by a method that combines bidirectional maximum matching with statistics;
step 3): removing stop words from the mixed segmentation result. Words that are irrelevant to the topic of the sentence and the test question and contribute nothing to the test question labeling task are removed; low-frequency words likewise contribute nothing to the task and are therefore also treated as stop words. A stop word lexicon is built according to these two rules, and words appearing in the lexicon, together with low-frequency words, are deleted;
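The preprocessing in steps 1) to 3) can be sketched as follows. The text does not specify how the statistical component resolves disagreements between the two matching directions, so the rule used here (prefer the segmentation with fewer tokens, and the forward result on a tie) is a common heuristic adopted as an assumption. The dictionary, stop word list, frequency table, and the `min_freq` threshold are hypothetical toy values.

```python
def forward_mm(text, vocab, max_len):
    # Forward maximum matching: take the longest dictionary word starting at i.
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:   # a single character always matches
                tokens.append(piece)
                i += size
                break
    return tokens

def backward_mm(text, vocab, max_len):
    # Backward maximum matching: take the longest dictionary word ending at j.
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]

def segment(text, vocab, stop_words, freq, min_freq=2, max_len=4):
    fwd = forward_mm(text, vocab, max_len)
    bwd = backward_mm(text, vocab, max_len)
    # Assumed tie-breaking rule: fewer tokens wins, forward result on a tie.
    best = bwd if len(bwd) < len(fwd) else fwd
    # Drop stop words and words below the (hypothetical) frequency threshold.
    return [t for t in best if t not in stop_words and freq.get(t, 0) >= min_freq]

vocab = {"函数", "最大值", "最大"}
stop_words = {"的"}
freq = {"求": 5, "函数": 8, "最大值": 3, "的": 50}
tokens = segment("求函数的最大值", vocab, stop_words, freq)
```

In this toy run both directions produce the same segmentation, and the stop word "的" is removed from the result.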
step 4): processing the test questions with the Continuous Bag of Words (CBOW) model of the word vector model Word2vec, vectorizing the preprocessed input test questions to obtain a word-level embedded representation of the test questions. The CBOW model predicts the word vector of a target word from the word vectors of several words in its context, thereby vectorizing the test questions;
step 5): taking the test question word vectors as input, first using a bidirectional long short-term memory network (BiLSTM) to obtain the contextual feature representation of the test question, and then introducing a mean pooling operation to obtain the sentence-level representation, thereby obtaining the embedded representation of the test question.
Step 6): the BiLSTM network uses two LSTMs to capture different contextual features of the test question from opposite directions, and the calculation is defined as:
where $a_1$, $a_2$, $b_1$ and $b_2$ are weight coefficients and $g(\cdot)$ is the hidden layer activation function; the calculation gives the forward hidden layer output and the backward hidden layer output at time t. Finally, the hidden layer outputs in the two directions are fused at each time step to construct the final output $h_t$:
where $c_1$ and $c_2$ are weight coefficients and $f(\cdot)$ is the output activation function;
step 7): obtaining the embedded representation $E_h$ of the test question by mean pooling at the sentence level:
$E_h = \mathrm{mean}(h_1, \ldots, h_t)$
where $\mathrm{mean}(\cdot)$ is the mean pooling operation, i.e., the average of the feature values within the pooling window is taken as the output. This captures representative information from the whole window and reduces both the feature dimension of the test question text and the number of network parameters.
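The data flow of steps 6) and 7) can be sketched in a deliberately simplified form. The recurrence below is a plain bidirectional update with scalar weights, not a full LSTM cell with gates, and the specific form h_t = g(w_in * x_t + w_rec * h_prev) is an assumption, since the equations are not reproduced in this text. The function names `recur` and `encode`, the choice of tanh for g, and the identity function for f are likewise illustrative.

```python
import math

def recur(xs, w_in, w_rec, g):
    # Run h_t = g(w_in * x_t + w_rec * h_prev) element-wise over a vector sequence.
    h = [0.0] * len(xs[0])
    out = []
    for x in xs:
        h = [g(w_in * xi + w_rec * hi) for xi, hi in zip(x, h)]
        out.append(h)
    return out

def encode(xs, a1, a2, b1, b2, c1, c2, g=math.tanh, f=lambda z: z):
    fwd = recur(xs, a1, a2, g)                 # forward pass over the sequence
    bwd = recur(xs[::-1], b1, b2, g)[::-1]     # backward pass, re-aligned to time t
    # Fuse the two directions at every time step to obtain h_t.
    hs = [[f(c1 * u + c2 * v) for u, v in zip(hf, hb)] for hf, hb in zip(fwd, bwd)]
    # Mean pooling over time: E_h = mean(h_1, ..., h_t).
    e_h = [sum(col) / len(hs) for col in zip(*hs)]
    return hs, e_h

# Three toy 2-dimensional word vectors standing in for a segmented test question.
word_vectors = [[0.5, -0.5], [1.0, 0.0], [0.5, -0.5]]
hs, e_h = encode(word_vectors, 0.8, 0.3, 0.8, 0.3, 0.5, 0.5)
```

The pooled vector `e_h` has the same dimension as a single hidden state regardless of the sentence length, which is what allows it to be passed to a fixed-size fully-connected network.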
Step 8): constructing a fully-connected deep neural network that takes the feature fusion of the test question word embedding vector and the learner knowledge mastery vector as input and outputs the score of a specific learner on the test question.
Step 9): in the fully-connected deep neural network, the value of the n-th node of the m-th layer is calculated as follows:
where N is the number of units in the previous layer, and the weight coefficient is the one from the i-th unit of the (m-1)-th layer to the n-th unit of the m-th layer;
a ReLU activation function is adopted to obtain the latent mapping among the learner's knowledge mastery, the test question text information, and the score;
step 10): the results of the fully-connected layer $(y_1, y_2, \ldots, y_n)$ are processed by an output unit:
giving a value in [0, 1] that represents the probability that the learner answers the test question correctly; this value is compared with the real score data to correct the weights in the network;
step 11): after training on the training set data, a trained neural network model is obtained, so that the learner's answering performance can be predicted based on text information.
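The forward pass described in steps 8) to 10) can be sketched as follows. Feature fusion by concatenation, the omission of bias terms, and the sigmoid output unit are assumptions: the text states only that the output lies in [0, 1] and that ReLU is used in the fully-connected layers. The weights and the function name `predict` are hypothetical, and training by comparison with the real scores is not shown.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, activation):
    # weights[n][i] is the weight from input unit i to output unit n.
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def predict(text_embedding, mastery, hidden_w, out_w):
    # Assumed fusion: concatenate the text embedding and the mastery vector.
    x = list(text_embedding) + list(mastery)
    hidden = dense(x, hidden_w, relu)
    # Assumed output unit: a sigmoid maps the weighted sum into (0, 1).
    return sigmoid(sum(w * y for w, y in zip(out_w, hidden)))

hidden_w = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, -1.0]]   # 2 hidden units, 4 fused input features
out_w = [1.0, 1.0]
p = predict([0.5, -0.5], [1.0, 0.0], hidden_w, out_w)
```

Here the first hidden unit evaluates to 1.5 and the second is clipped to 0 by ReLU, so `p` is the sigmoid of 1.5.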
Further, the third step comprises:
step A): denoting the learner score predictions based on the cognitive level and on the text information, obtained in step one and step two, as $R_1$ and $R_2$ respectively, and constructing a latent answer representation of the learner on the test questions by using a probability matrix decomposition algorithm:
where U and V are the learner feature matrix and the score feature matrix in the probability matrix decomposition, and α and β are the adjusting parameters for the learner's learning condition and the test question text information, respectively;
step B): deriving the final objective function of the probability matrix decomposition from the score calculation formula that integrates the learner's learning condition, the test question text information, and the commonality among learners:
step C): optimizing the objective function by a gradient descent method; first, the partial derivatives of the objective function with respect to the two feature matrices U and V are taken:
the partial derivatives are then set to 0 to obtain the recursive update formulas, which are iterated until the result converges or the maximum number of iterations is reached, finally giving the optimal feature matrices U and V:
step D): finally, predicting the learner's performance by using the optimal feature matrices U and V obtained by training:
step E): tuning the hyper-parameters in the experiment, together with the adjusting parameters α and β for the learner's learning condition and the test question text information, to obtain the parameters best suited to the data set and thus the final trained model.
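Steps A) to D) can be illustrated with a small matrix factorization sketch. The objective function is not reproduced in this text, so the form used here is an assumption: the two predictions are fused as alpha * R1 + beta * R2, and U and V are fitted by full-batch gradient descent on the squared reconstruction error with L2 regularization. The hyper-parameters (`rank`, `lam`, `lr`, `epochs`), the toy matrices, and the function name `pmf_fit` are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pmf_fit(R1, R2, alpha, beta, rank=2, lam=0.01, lr=0.05, epochs=300, seed=0):
    rng = random.Random(seed)
    n, m = len(R1), len(R1[0])
    # Assumed fusion of the cognitive-level and text-based score predictions.
    R = [[alpha * R1[i][j] + beta * R2[i][j] for j in range(m)] for i in range(n)]
    U = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(m)]
    history = []
    for _ in range(epochs):
        # Residuals between the fused target and the current reconstruction.
        E = [[R[i][j] - dot(U[i], V[j]) for j in range(m)] for i in range(n)]
        sq = sum(e * e for row in E for e in row)
        reg = sum(x * x for row in U + V for x in row)
        history.append(0.5 * sq + 0.5 * lam * reg)
        # Gradients of the regularized squared error with respect to U and V.
        gU = [[-sum(E[i][j] * V[j][k] for j in range(m)) + lam * U[i][k]
               for k in range(rank)] for i in range(n)]
        gV = [[-sum(E[i][j] * U[i][k] for i in range(n)) + lam * V[j][k]
               for k in range(rank)] for j in range(m)]
        U = [[U[i][k] - lr * gU[i][k] for k in range(rank)] for i in range(n)]
        V = [[V[j][k] - lr * gV[j][k] for k in range(rank)] for j in range(m)]
    return U, V, history

R1 = [[0.9, 0.2, 0.8, 0.3], [0.4, 0.7, 0.5, 0.6], [0.1, 0.9, 0.2, 0.8]]
R2 = [[0.8, 0.3, 0.7, 0.2], [0.5, 0.6, 0.4, 0.7], [0.2, 0.8, 0.3, 0.9]]
U, V, history = pmf_fit(R1, R2, alpha=0.6, beta=0.4)
# Predicted potential score of learner 0 on test question 1.
score_01 = dot(U[0], V[1])
```

Under this sketch, tuning α and β as in step E) amounts to re-running `pmf_fit` with different `alpha` and `beta` values and comparing the results on held-out data.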
Further, the fourth step comprises:
step I): denoting the learner's knowledge mastery vector obtained by analysis as $\eta_i$, and obtaining from $\eta_i$ all incremental knowledge mastery vectors $\eta_i^{(d)}$ of the learner, $0 \le d \le D$;
Step II): calculating the KL divergence measure between the learner knowledge mastery vector $\eta_i$ estimated by the cognitive diagnosis model and each incremental knowledge mastery vector $\eta_i^{(d)}$ of the learner:
step III): accordingly, test questions that increase the learner's knowledge mastery trend and are of appropriate difficulty are selected to form the test paper for the personalized test, yielding a recommendation result that reflects the learner's learning condition, the test question text information, and the commonality among learners.
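The KL divergence computation in Step II) can be sketched as follows. The text does not reproduce how the incremental vectors are generated or how the divergence is combined with the potential scores in the final selection rule, so the incremental vectors are treated as given inputs and only the divergence itself is shown. Normalizing each mastery vector to a probability distribution, the smoothing constant `eps`, and the direction of the divergence (from the estimated vector to each incremental vector) are assumptions.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    # Smooth and normalize both vectors so they can be read as distributions.
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    zp, zq = sum(ps), sum(qs)
    return sum((a / zp) * math.log((a / zp) / (b / zq)) for a, b in zip(ps, qs))

eta_i = [0.2, 0.5, 0.3]                  # estimated mastery vector (toy values)
incremental = [[0.6, 0.5, 0.3],          # hypothetical incremental vectors
               [0.2, 0.9, 0.3]]
divergences = [kl_divergence(eta_i, eta_d) for eta_d in incremental]
```

A divergence of zero means the incremental vector has the same shape as the current estimate; larger values indicate a larger shift in the learner's mastery profile.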
The personalized paper combining method integrating the learner's cognitive characteristics and test question text information is compared with other methods in terms of the precision Precision@K, the recall Recall@K, and the F1 value F1@K, which are calculated as follows:
where $l(i)$ denotes the customized learning test questions formulated for the i-th learner, $m(i)$ denotes the test questions in the question bank that match the learner, and $l(i) \cap m(i)$ denotes their intersection. Precision@K is the proportion of correct recommendations in the recommendation result, and Recall@K is the degree to which the recommendation result covers the correct recommendations in the question bank. Precision and recall are to some extent in tension: when precision is high, recall tends to be low. For convenience in presenting the experimental results, the traditional cognitive diagnosis method is denoted DINA and the traditional collaborative filtering method is denoted PMF.
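The three metrics can be sketched for a single learner as follows, using the usual definitions (hits divided by the number of recommended questions for precision, hits divided by the number of matching questions for recall, and their harmonic mean for F1). The formulas are not reproduced in this text, so these definitions are an assumption, and averaging over learners is left out. The function name and the question identifiers are hypothetical.

```python
def precision_recall_f1_at_k(recommended, relevant, k):
    top_k = recommended[:k]                     # l(i): the K recommended questions
    hits = len(set(top_k) & set(relevant))      # size of the intersection with m(i)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1

recommended = ["q1", "q4", "q2", "q7"]   # ranked recommendation for one learner
relevant = {"q1", "q2", "q3"}            # questions that actually match the learner
precision, recall, f1 = precision_recall_f1_at_k(recommended, relevant, k=3)
```

With K = 3, two of the three recommended questions are relevant and two of the three relevant questions are recommended, so precision, recall, and F1 are all 2/3 in this toy case.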
Table 1 compares the precision, recall, and F1 value of the personalized paper combining method integrating the learner's cognitive characteristics and test question text information with those of the other methods on the same data set.
Table 1. Comparison of results
The experimental results show that the personalized paper combining method integrating the learner's cognitive characteristics and test question text information combines the learner's cognitive characteristics, the learning commonality among learners, and the test question text information, and that the precision of its paper combining results is clearly better than that of the comparison methods. The experiments therefore show that the method is more effective than the other methods in terms of precision, recall, and F1 value.
The analysis also shows that the DINA-based paper combining model and the probability matrix decomposition method are slightly unstable, and that their precision decreases as the number of test questions in the paper increases. Although conventional probability matrix decomposition is easy to implement, it extracts insufficient latent information from the data set, which generally leads to low precision, especially when a large amount of training data is involved. In short, the personalized paper combining method integrating the learner's cognitive characteristics and test question text information achieves the best experimental results, and can construct, according to the test difficulty and the assessment objectives, a test paper that combines test questions of different difficulty levels and promotes the learner's cognitive growth.
The invention introduces test question text information as an important measure for personalized paper combining, so that key information in the test question text can be exploited by the proposed method.
The invention integrates test question text information, cognitive diagnosis technology, and a collaborative filtering method: the learner's learning condition obtained from cognitive diagnosis and the test question text information are incorporated into an objective function optimized by collaborative filtering, and the relation among the three is controlled through adjusting parameters, so as to obtain the optimal paper combining model for the current data set.
In conclusion, the personalized paper combining method integrating the learner's cognitive characteristics and test question text information provided by the invention achieves more accurate paper combining. The method combines three kinds of information, namely cognitive diagnosis, test question text information, and the learning commonality among learners, to formulate a paper combining strategy for the target learner; the paper combining result can be defined according to the test objective and the test question difficulty, which greatly improves the learner's self-learning efficiency and helps the learner make up for weaknesses in classroom knowledge more quickly. The method can be applied to fields such as educational resource recommendation and evaluation and educational data mining, thereby providing effective support for subsequent educational resource recommendation and assisting online education platforms, so that a digital education platform can better predict the learner's scores and provide a personalized paper combining scheme, efficiently diagnose the weak points in the learner's knowledge, and take accurate remedial measures.
The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any modification, equivalent replacement, or improvement made by those skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.