CN116705294A - Interpretable dynamic cognitive diagnosis method based on memory network - Google Patents
Interpretable dynamic cognitive diagnosis method based on memory network
- Publication number: CN116705294A
- Application number: CN202310640694.4A
- Authority: CN (China)
- Prior art keywords: features; knowledge; student; interpretable; diagnosis
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/048—Activation functions
- G06N3/096—Transfer learning
- G06Q50/20—Education
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention belongs to the field of educational data mining and provides an interpretable dynamic cognitive diagnosis method based on a memory network, comprising the following steps: (1) construct an interpretable dynamic cognitive diagnosis framework based on a memory network; (2) update knowledge proficiency using the memory network structure; (3) fuse student features and test question features; (4) model the diagnosis algorithm with a neural network structure, taking the final input representation vector obtained in step (3) as the input of the network structure and outputting the student response result; (5) predict student responses and analyze how knowledge proficiency changes. The method of the invention initializes the learner's diagnosis from multiple angles to improve model interpretability, and uses the memory network to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level, which both improves the accuracy of inferring the learner's dynamic knowledge and skill level and enhances the ability to capture long-term dependencies in the test question sequence.
Description
Technical Field
The invention belongs to the field of educational data mining and particularly relates to an interpretable dynamic cognitive diagnosis method based on a memory network.
Background Art
In recent years, with the spread of cognitive psychology and artificial intelligence technologies, large-scale online platforms (MOOCs) and intelligent tutoring systems have become increasingly intelligent, and one key technology behind these systems is cognitive diagnosis. Cognitive diagnosis theory serves as a new generation of educational measurement theory: rather than simply assigning the learner a score, it models the learner's cognitive processes and mines the learner's latent abilities and skill states. Learners, exercises and knowledge concepts are the most important components of a cognitive diagnosis system, and the Cognitive Diagnosis Model (CDM) task aims to model the complex relationships among these three components to simulate learner performance and infer the knowledge state accumulated during exercise. By modeling approach, cognitive diagnosis can be divided into two types: cognitive diagnosis models based on statistical methods and cognitive diagnosis models based on neural networks.
A statistical cognitive diagnosis model performs probabilistic modeling of the student response process under different learning assumptions and thereby diagnoses the learner's skills and skill state. Depending on how the learner's skill state is represented, these models fall into two cases: latent traits and specific knowledge skills. Cognitive diagnosis models based on the learner's latent traits are represented by Item Response Theory (IRT), which models a student's latent cognitive ability as a continuous parameter and combines it with exercise difficulty features. On the other hand, cognitive diagnosis models based on specific knowledge-skill states are represented by the Deterministic Inputs, Noisy "And" gate (DINA) model, which uses a binary discrete vector to represent whether a learner has mastered a given concept and assumes that a learner can correctly answer an exercise only by mastering all the concepts it contains; this gives the model good interpretability. However, statistics-based CDMs rely on manually designed diagnostic functions, which are insufficient to capture the complex interactions between learners and exercises.
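As a concrete illustration of the DINA assumption described above, the slip/guess response probability can be sketched in a few lines (a toy example with hypothetical numbers, not the invention's implementation):

```python
import numpy as np

def dina_correct_prob(alpha, q, slip, guess):
    """P(correct) = (1 - slip)**eta * guess**(1 - eta), where eta = 1
    iff the student masters every concept the exercise requires."""
    eta = int(np.all(alpha[q == 1] == 1))
    return (1 - slip) ** eta * guess ** (1 - eta)

# Student masters concepts 1 and 2 but not 3; the exercise requires 1 and 2.
alpha = np.array([1, 1, 0])   # binary mastery vector
q = np.array([1, 1, 0])       # Q-matrix row of the exercise
p = dina_correct_prob(alpha, q, slip=0.1, guess=0.2)   # eta = 1, so p = 1 - slip
```

If the exercise instead required the unmastered concept 3, eta would be 0 and the student could only answer correctly by guessing, with probability guess.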
Benefiting from the development of a new generation of artificial intelligence, the limitations of statistical cognitive diagnosis have been addressed since the emergence of deep learning, and introducing deep learning into cognitive diagnosis has become a trend. For example, Neural Cognitive Diagnosis (NeuralCD) uses multiple neural layers to model the complex interactive behavior between learners and exercises, which improves the accuracy of the diagnostic results, but NeuralCD ignores the implicit relationships between knowledge concepts and between exercises. Relation-map-driven cognitive diagnosis (RCD) uniformly models interactions and structural relations through a multi-layer student-exercise-concept relation map and uses a multi-level attention network to realize relation aggregation between the local map and its nodes; based on attention mechanisms and neural networks, the deep cognitive diagnosis framework (DeepCDM) considers the importance of and interactions among knowledge concepts. To exploit the textual information of exercises in cognitive diagnosis, researchers have also implemented neural-network-based IRT models. In work combining cognitive diagnosis and deep learning, most research has focused on improving model accuracy without deeply exploring students' skill mastery states. While deep learning can model the learner-exercise-knowledge-concept interaction well, how to address the "black box" character of deep learning and enhance the interpretability of the diagnostic process has become an urgent problem.
Disclosure of Invention
The invention aims to provide an interpretable dynamic cognitive diagnosis method based on a memory network, addressing problems such as the lack of interpretability of deep-learning cognitive diagnosis functions and the fact that a student's knowledge mastery state changes dynamically. First, with reference to parameters from psychometric theory, learner ability, test question difficulty and discrimination, and the learner's guess and slip parameters are introduced, and multidimensional features such as response speed, derived from response time, are proposed; the learner's diagnosis is thus initialized from multiple angles to improve model interpretability. Second, the memory network is used to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level, which improves the accuracy of inferring the learner's dynamic knowledge proficiency and enhances the ability to capture long-term dependencies in the test question sequence.
In order to achieve the aim, the invention adopts the following technical scheme.
The invention provides an interpretable dynamic cognitive diagnosis method based on a memory network, comprising the following steps:
(1) Construct an interpretable dynamic cognitive diagnosis framework based on a memory network, comprising feature extraction, feature interaction and memory-network-based interpretable dynamic cognitive diagnosis modeling;
(2) Update knowledge proficiency using the memory network structure: take the student knowledge features extracted in step (1) as the input of the network structure, and store and output the knowledge proficiency;
(3) Fuse student features and test question features: fuse the student ability features and speed features extracted in step (1) with the knowledge proficiency features of step (2) and the test question features to obtain the final input representation vector;
(4) Model the diagnosis algorithm with a neural network structure: take the final input representation vector obtained in step (3) as the input of the network structure and output the student response result; the diagnosis algorithm consists of a neural network structure and a loss function;
(5) Collect data sets, train the neural network structure, predict student responses and analyze how knowledge proficiency changes.
In the above technical solution, constructing the memory-network-based interpretable dynamic cognitive diagnosis framework in step (1) specifically comprises:
(1-1) Feature extraction, which comprises extracting student features, test question features and interaction features. The student features comprise knowledge proficiency features, ability features and speed features; the test question features comprise difficulty features, discrimination features and the Q matrix; the interaction features comprise guess and slip features. The Q matrix records which knowledge points each test question examines: rows correspond to test questions, columns correspond to knowledge points, and each element is binary (0 or 1). For example, if the first question examines knowledge point 1, then the element in row 1, column 1 is 1, and the remaining elements of column 1 are 0;
(1-2) Feature interaction: with reference to the Item Response Theory model and the DINA model, the student features and the test question features interact; that is, each test question feature interacts with each student feature;
(1-3) Memory-network-based interpretable dynamic cognitive diagnosis modeling: diagnosis and prediction results are output from the students' diagnostic data, including the knowledge mastery state and the predicted answer score of each student on the test questions. The memory-network-based interpretable dynamic cognitive diagnosis model consists of three parts: parameter initialization, feature fusion and deep diagnosis.
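The Q matrix described in (1-1) can be illustrated with a small binary example (the specific questions and knowledge points are hypothetical):

```python
import numpy as np

# Rows are test questions, columns are knowledge points; a 1 means
# "this question examines this knowledge point".
Q = np.array([
    [1, 0, 0],   # question 1 examines knowledge point 1 only
    [0, 1, 1],   # question 2 examines knowledge points 2 and 3
    [1, 0, 1],   # question 3 examines knowledge points 1 and 3
])

# Knowledge points examined by question 2 (row index 1):
examined = np.flatnonzero(Q[1])
```

Multiplying a student's mastery vector by a row of Q masks the mastery down to exactly the knowledge points that question examines.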
In the above technical solution, updating the knowledge proficiency using the memory network structure in step (2) specifically comprises:
(2-1) extracting a knowledge embedding vector from the test question, calculating the correlation weights of the knowledge embedding vector, and extracting a growth vector from the test question and the answer record;
(2-2) passing the growth vector extracted in step (2-1) through ENN and ANN to obtain the forgetting information and the memory information, where ENN comprises one neural network layer with a Sigmoid activation function and ANN comprises one neural network layer with a Tanh activation function;
(2-3) combining the knowledge embedding vector correlation weights calculated in step (2-1) with the forgetting information and memory information obtained in step (2-2) to represent the update.
In the above technical solution, the feature fusion in step (3) specifically comprises:
(3-1) taking the difference between the student ability features and the test question difficulty features to obtain the interaction ability features;
(3-2) multiplying the knowledge proficiency features obtained from the memory network update by the Q matrix to obtain the interaction mastery features;
(3-3) fusing the interaction ability features, the interaction mastery features and the student speed features extracted from the response time to obtain the final feature representation.
In the above technical solution, modeling the diagnosis algorithm with a neural network structure in step (4) specifically comprises:
(4-1) selecting an appropriate network structure;
(4-2) randomly initializing the parameters;
(4-3) applying a deep residual network;
(4-4) fitting the data using the neural network.
In the above technical solution, the training of the neural network structure in step (5) specifically comprises:
(5-1) collecting three real-world data sets, namely JUNYI, EDNET and ASSIST;
(5-2) selecting the cross-entropy loss function as the loss function of the feedback neural network structure to measure the loss between the predicted value and the true value;
(5-3) performing error back propagation, updating the parameters by computing the error gradient with respect to each weight in real time;
(5-4) calling the optimizer's optimizer.step() and the back-propagation routine backward() to minimize the loss function.
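Steps (5-2) to (5-4) can be sketched end to end. The patent's network would use a deep-learning framework's backward() and optimizer.step(); the following numpy version, with a single logistic unit and a hand-derived cross-entropy gradient, is only an illustrative stand-in on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: fused feature vectors X and binary response labels y.
X = rng.normal(size=(64, 8))
true_w = rng.normal(size=8)
y = (X @ true_w > 0).astype(float)

w = np.zeros(8)
lr = 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))                   # predicted responses
    loss = -np.mean(y * np.log(p + 1e-9)
                    + (1 - y) * np.log(1 - p + 1e-9))    # cross-entropy (5-2)
    grad = X.T @ (p - y) / len(y)                        # error gradient (5-3)
    w -= lr * grad                                       # parameter update (5-4)

accuracy = np.mean((p > 0.5) == y)
```

Each pass computes the loss, the gradient of the loss with respect to the weights, and a gradient step, which is exactly the role backward() and optimizer.step() play in a framework.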
Compared with the prior art, the invention has the following outstanding substantive features and remarkable progress:
1. The invention provides an interpretable cognitive diagnosis framework based on a memory network. The learner's diagnosis is initialized from multiple angles using multidimensional features such as response speed, so as to improve the interpretability of the model.
2. The method uses the memory network to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level. This knowledge interaction process improves the accuracy of inferring the learner's dynamic knowledge proficiency and enhances the ability to capture long-term dependencies in the test question sequence. Finally, additive combination and multiplicative combination are both used for feature fusion, achieving better prediction performance.
3. The method is comprehensively evaluated on three real-world data sets, and the results demonstrate its superiority and interpretability in cognitive diagnosis. The framework benefits from the strong learning capacity of deep learning and the interpretability of psychometrics, achieves better prediction performance, and can analyze and describe a learner's learning trajectory based on the knowledge proficiency that the memory network outputs at different times.
Drawings
Fig. 1 is a schematic diagram of a memory forgetting rule of student learning.
Fig. 2 is a flow chart of cognitive diagnostics.
FIG. 3 is a diagram of an interpretable dynamic cognitive diagnostic framework based on a memory network.
Detailed Description
The method of the embodiment of the invention constructs an interpretable dynamic cognitive diagnosis framework based on a memory network. It dynamically constructs the learner's knowledge state by extracting multidimensional student features and test question features, and outputs the students' knowledge states at different moments, thereby judging the learner's learning and forgetting. Specifically, the students' knowledge proficiency, ability and speed features are first extracted, together with the difficulty and discrimination of the test questions and the guess and slip features of the student-question interaction. Student learning is regular: students both learn and forget knowledge during the learning process, and this memory-forgetting rule is shown in figure 1, so a memory network can dynamically model how the learner's knowledge state changes. The learner features and test question features are fused with reference to Item Response Theory and the DINA model, making full use of the learner's response-time and history information and giving the diagnosis results interpretability. Experimental results show that the model of this embodiment achieves better prediction performance and that the learner's knowledge state exhibits positive transfer across different moments.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
(1) Constructing an interpretable dynamic cognitive diagnostic framework based on a memory network
The flow of cognitive diagnosis is shown in fig. 2. The students' answer information is used to learn the difficulty and discrimination of the test questions and the proficiency, ability and speed of the students. These features, modeled together with the expert-labeled Q matrix, yield the students' knowledge proficiency and the learners' knowledge-state representations at different moments.
As shown in fig. 3, the constructed interpretable dynamic cognitive diagnosis framework based on a memory network comprises feature extraction, feature interaction and the modeling diagnosis algorithm. The features include student features (knowledge proficiency, ability, speed), test question features (Q matrix, difficulty, discrimination) and interaction features (guess, slip). Feature interaction aggregates the extracted features into final representation vectors, which are fused and then fed into the deep neural network for training.
(1-1) feature extraction
The feature extraction comprises student feature extraction, test question feature extraction and interaction feature extraction. The student features comprise knowledge proficiency, ability and speed features; the test question features comprise difficulty features, discrimination features and the Q matrix; the interaction features comprise guess and slip features.
(1-1-1) Student features: each of a student's assessment responses depends on the student's current proficiency on the test question's knowledge, the student's ability and the student's speed, i.e. in the model, the student's knowledge proficiency M_t, ability Ω and speed V.
(1-1-2) Test question features: the latent characteristics of assessment items are diverse, and the same student will respond differently on different questions covering the same skill. The Q matrix alone is insufficient to describe the test question features, so the Q matrix, the difficulty matrix and the discrimination features are selected together as the test question features. For the representation of the question-skill difficulty matrix, neural network fitting is chosen to extract the difficulty and discrimination features: supported by the strong fitting capacity of the neural network, the examined difficulty can be fitted from the students' answer data, and reverse iteration drives the fitted difficulty toward the true difficulty.
(1-1-3) Interaction features: after the student and test question features are obtained, the relationship between students and test questions must be considered further. The interactions themselves are taken to imply extractable features, and the guess coefficient and slip coefficient are treated as the interaction features.
(1-2) feature interactions
As shown in the embedding layer in fig. 3(a), the student features and the test question features interact with reference to the Item Response Theory model and the DINA model; that is, each test question feature interacts with each student feature;
(1-3) Memory-network-based interpretable dynamic cognitive diagnosis modeling outputs diagnosis and prediction results from the students' diagnostic data, including the knowledge mastery state and the predicted answer score of each student on the test questions.
To achieve accurate diagnosis, the memory-network-based interpretable dynamic cognitive diagnosis model comprises three parts: parameter initialization, feature fusion and deep diagnosis.
(1-3-1) Parameter initialization
The main task of initialization is to randomly generate the parameters. The parameters are taken from the IRT and DINA cognitive diagnosis models and expressed in a suitable data form, as tensors, in the neural network. The continuous mastery degree of each student on each concept is initialized. This initialization provides a good correction space for the backward-feedback iteration, so that the student's mastery of each knowledge point approaches the true value.
Initialize the test question difficulty matrix K = {k_jk}_{J×K} and the test question discrimination matrix E = {e_jk}_{J×K}. Two parameter vectors, slip and guess, are randomly initialized to represent the slip coefficient and guess coefficient of the test questions: S = [s_1, s_2, ..., s_J] and G = [g_1, g_2, ..., g_J], where s_j and g_j are the slip and guess coefficients of test question j. Initialize student i's skill mastery factor μ_it = {μ_itk}, and randomly initialize student i's speed factor τ_i = {τ_ik} and ability factor α_i = {α_ik}.
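A minimal sketch of this parameter initialization (the dimensions J, K and N and the uniform/normal ranges are illustrative assumptions, not values from the patent):

```python
import numpy as np

rng = np.random.default_rng(42)
J, K, N = 20, 5, 100   # test questions, knowledge points, students

difficulty = rng.uniform(size=(J, K))        # K = {k_jk}, J x K
discrimination = rng.uniform(size=(J, K))    # E = {e_jk}, J x K
slip = rng.uniform(0.0, 0.3, size=J)         # S = [s_1, ..., s_J]
guess = rng.uniform(0.0, 0.3, size=J)        # G = [g_1, ..., g_J]
mastery = rng.uniform(size=(N, K))           # mu_i: continuous mastery per concept
speed = rng.normal(size=(N, K))              # tau_i: speed factor
ability = rng.normal(size=(N, K))            # alpha_i: ability factor
```

In a framework these arrays would be registered as trainable tensors so that back propagation can pull them toward their true values.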
(1-3-2) feature fusion
The student speed factor τ is combined with the trainable matrix A, and the speed feature V is obtained through a sigmoid function:
V = sigmoid(τ * A)
The student ability factor α is multiplied by the initialized trainable matrix B, and the ability feature Ω is obtained through a sigmoid function:
Ω = sigmoid(α * B)
As mentioned in (1-3-1), μ_t is the student's knowledge mastery factor at time t. Multiplying it by the trainable matrix C gives the student's knowledge mastery feature M_t:
M_t = sigmoid(μ_t * C)
Speed interaction: the student speed feature V obtained above interacts with the student's response time rt on the test question:
ξ = V − rt
Ability interaction: the difference between the student ability feature Ω obtained above and the test question difficulty is taken and then multiplied by the discrimination:
η = E_j * (Ω − K_j)
Knowledge proficiency interaction: the student mastery feature M_t obtained above is multiplied by the test question Q matrix and combined with the guess and slip parameters of the interaction features:
ψ = (1 − s_j) * (M_t · Q_j) + g_j * (1 − M_t · Q_j)
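The feature-fusion and interaction steps above can be sketched as follows (the vector sizes, the trainable matrices A, B, C and the DINA-style slip/guess combination are illustrative assumptions consistent with the description, not the patent's exact implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
K_dim = 4                                # number of knowledge points (assumed)

tau, alpha_f, mu_t = rng.normal(size=(3, K_dim))   # speed, ability, mastery factors
A, B, C = rng.normal(size=(3, K_dim, K_dim))       # trainable matrices

V = sigmoid(tau @ A)                     # speed feature    V = sigmoid(tau * A)
Omega = sigmoid(alpha_f @ B)             # ability feature  Omega = sigmoid(alpha * B)
M_t = sigmoid(mu_t @ C)                  # mastery feature  M_t = sigmoid(mu_t * C)

rt = rng.uniform(size=K_dim)             # response time
xi = V - rt                              # speed interaction

diff = rng.uniform(size=K_dim)           # question difficulty
disc = rng.uniform(size=K_dim)           # question discrimination
eta = disc * (Omega - diff)              # ability interaction

q_row = np.array([1.0, 0.0, 1.0, 0.0])              # Q-matrix row of the question
s, g = 0.1, 0.2                                     # slip and guess coefficients
mastered = M_t * q_row                              # mastery masked by the Q matrix
psi = (1 - s) * mastered + g * (1 - mastered)       # combined with slip/guess

x_final = np.concatenate([xi, eta, psi])            # final fused representation
```

The concatenated vector x_final plays the role of the final input representation fed to the diagnostic network.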
(1-3-3) Deep diagnosis
On the basis of the above, the probability that a student answers correctly is predicted using the fitting power of the neural network. In this process, back propagation yields rich intermediate results, such as a student's mastery degree, ability and speed on a specific piece of knowledge, and the difficulty and discrimination of test questions and their knowledge points. These intermediate conclusions are a beneficial complement to cognitive diagnosis.
(2) Updating knowledge proficiency using memory network structure
(2-1) calculating knowledge embedding vector correlation weights to extract growth vectors
A knowledge embedding vector is extracted from the test question, its correlation weights are calculated, and a growth vector is extracted from the test question and the answer record. As shown in the update layer (c) of fig. 3, a knowledge vector k_t is extracted from the test question q_t examined at time t: exercise q_t is multiplied by the embedding matrix D to obtain the continuous embedding vector k_t. The correlation weights are then calculated by applying a softmax activation function to the inner product between k_t and each key vector M^k(i):

w_t(i) = Softmax(k_t^T M^k(i))
(2-2) acquiring forgetting information and memory information
The growth vector extracted in step (2-1) is passed through the ENN (a one-layer neural network with a sigmoid activation function) and the ANN (a one-layer neural network with a Tanh activation function) to obtain the forgetting information and the memory information, respectively. From the test question q_t and the answer record r_t input at time t, the growth vector v_t is extracted, and f_t and m_t are obtained through the ENN and the ANN, with the following formulas:
f_t = Sigmoid(F^T v_t + b_e)
m_t = Tanh(H^T v_t + b_a)
(2-3) The knowledge embedding vector correlation weights calculated in step (2-1) are combined with the forgetting information and the memory information obtained in step (2-2) to represent the updated knowledge state. The update formula for the knowledge state is as follows:
M_t = M_{t-1} * (1 − w_{t-1} * f_t) + w_{t-1} * m_t
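A minimal sketch of steps (2-1) to (2-3) on a toy memory. The slot count and dimensions are assumed, and all matrices are random stand-ins for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
K, dk, dv = 4, 8, 8  # assumed: K memory slots, key dim dk, value dim dv

Mk  = rng.standard_normal((K, dk))  # static key memory (knowledge embeddings)
M   = rng.random((K, dv))           # value memory M_{t-1}: knowledge state
k_t = rng.standard_normal(dk)       # knowledge embedding of question q_t
v_t = rng.standard_normal(dv)       # growth vector from (q_t, r_t)

F, b_e = rng.standard_normal((dv, dv)), rng.standard_normal(dv)  # ENN parameters
H, b_a = rng.standard_normal((dv, dv)), rng.standard_normal(dv)  # ANN parameters

w   = softmax(Mk @ k_t)          # correlation weights over the K slots
f_t = sigmoid(F.T @ v_t + b_e)   # forgetting (erase) information
m_t = np.tanh(H.T @ v_t + b_a)   # memory (add) information

# M_t = M_{t-1} * (1 - w f_t) + w m_t, applied slot-wise via outer products:
M_new = M * (1 - np.outer(w, f_t)) + np.outer(w, m_t)
```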
(3) Feature fusion
The student features and the test question features are fused and interacted. Referring to the IRT formula, the student ability feature θ_i and the question difficulty feature b_j are differenced to obtain the interaction ability; combining with the DINA model, the mastery degree is multiplied by the test question Q matrix and combined with the slip and guess parameters in the interaction features to obtain the interaction mastery feature.
(3-1) differentiating the student ability characteristic and the test question difficulty characteristic to obtain the interaction ability characteristic
The student ability feature Ω obtained above is differenced with the test question difficulty and then multiplied by the discrimination, i.e. the interaction ability feature is (Ω − k_j) ⊙ e_j, where k_j and e_j are the difficulty and discrimination vectors of question j.
(3-2) multiplying the knowledge proficiency feature obtained by the memory network update by the Q matrix to obtain an interactive grasping feature
The student mastery feature M_t obtained above is multiplied by the Q matrix of the test question, and the result is combined with the guess and slip parameters of the test question.
(3-3) fusing the interactive capability features, the interactive mastering features and the student speed features extracted based on the response time to obtain final feature characterization
The student speed characteristic V obtained above is interacted with the reaction time rt of the student for answering the test questions, and the formula is as follows:
ξ=V-rt
The additive joint model and the multiplicative joint model are two representative models commonly used to aggregate data under different assumptions. The additive joint model assumes that the components are interchangeable (compensatory), whereas the multiplicative joint model assumes that the components act concurrently (conjunctive). Let P denote the probability that student u_i can solve problem q at time t; its input x comprises ξ, Ω and M_t, where ξ denotes the speed module, Ω denotes the ability module, and M_t denotes the knowledge mastery state module. On this basis, x is composed of knowledge, ability and speed, and the additive and multiplicative variants, denoted x_add and x_mul, are represented as follows:

Additive joint model: x_add = ξ + Ω + M_t

Multiplicative joint model: x_mul = ξ ⊙ Ω ⊙ M_t
In this embodiment, the additive joint model is selected, and feature fusion is implemented through the concat function in the PyTorch framework.
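A small numeric sketch of the two joint models and the concat-based fusion actually chosen in the embodiment (NumPy stands in for torch.cat here; the module values are arbitrary):

```python
import numpy as np

# Arbitrary toy values for the speed, ability, and knowledge mastery modules:
xi    = np.array([0.2, 0.4])
omega = np.array([0.6, 0.1])
m_t   = np.array([0.9, 0.3])

x_add = xi + omega + m_t               # additive (compensatory) joint model
x_mul = xi * omega * m_t               # multiplicative (conjunctive) joint model
x_cat = np.concatenate([xi, omega, m_t])  # concat fusion, as with torch.cat
```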
(4) Neural network structure modeling diagnosis algorithm
(4-1) selecting an appropriate network Structure
Compared with traditional models that build the function by parameter estimation, a neural network requires few assumptions to learn the interaction function from data; it has been proved that a neural network can approximate any continuous function arbitrarily closely and has strong fitting ability, which makes the model more general. The actual neural-network-modeled interaction function is as follows:
f_2 = [x^T, f_1]
y = φ(W_3 × f_2 + b_3)
where f_1 is the output of the first and second fully connected layers, and f_2 is the output of the residual network, i.e. the concatenation of f_1 and x. W_i and b_i are the weight and bias parameters of the fully connected layers, and y is the final output prediction result.
(4-2) random initialization parameters
The main task of initialization is to randomly generate the parameters; the parameters are taken from the IRT model and the DINA model of cognitive diagnosis and are expressed in a suitable data form in the neural network. Assume a skill test with J questions examining K skills, answered by I students.
The Q matrix is one of the core concepts of cognitive diagnosis. Q = {q_jk}_(J×K) is the incidence matrix of test questions and skills: q_jk = 1 denotes that question j examines skill k, and q_jk = 0 denotes that question j does not examine skill k. The student answer matrix is R = {r_ij}_(I×J), where r_ij = 1 means student i correctly answered question j, and r_ij = 0 otherwise. Building the model requires initializing the following parameters:
Question initialization module: initialize the test question difficulty matrix K = {k_jk}_(J×K), where k_jk ∈ [0,1] represents the difficulty coefficient of skill k applied in question j, and the test question discrimination matrix E = {e_jk}_(J×K), where e_jk ∈ [0,1] represents the discrimination coefficient of skill k applied in question j.
Student initialization module: initialize student i's skill mastery factor μ_it = {μ_itk}, where μ_itk ∈ [0,1] represents the mastery state of student i on skill k at time t; randomly initialize student i's speed factor τ_i = {τ_ik}, where τ_ik ∈ [0,1] represents the student's answering speed on questions examining skill k, and student i's ability factor α_i = {α_ik}, where α_ik ∈ [0,1] indicates the student's ability.
Interaction feature initialization: two parameter vectors, slip and guess, are randomly initialized to represent the error (slip) coefficient and the guess coefficient of the test questions. S = [s_1, s_2, ..., s_J] and G = [g_1, g_2, ..., g_J], where s_j and g_j are the slip and guess coefficients of question j, respectively.
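The initialization modules above can be sketched as follows. The counts I, J, K and the 0.3 cap on the slip/guess coefficients are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K = 3, 5, 4  # assumed: I students, J questions, K skills

Q  = rng.integers(0, 2, size=(J, K))  # Q matrix: q_jk = 1 iff question j examines skill k
Kd = rng.random((J, K))               # difficulty matrix, k_jk in [0, 1]
E  = rng.random((J, K))               # discrimination matrix, e_jk in [0, 1]
S  = rng.random(J) * 0.3              # slip coefficients s_j (kept small, assumed)
G  = rng.random(J) * 0.3              # guess coefficients g_j

mu    = rng.random((I, K))  # skill mastery factors mu_itk at t = 0
tau   = rng.random((I, K))  # speed factors tau_ik
alpha = rng.random((I, K))  # ability factors alpha_ik
```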
(4-3) application of depth residual network
The deep residual network introduces residual blocks in the construction of the neural network. Originally proposed to alleviate the vanishing-gradient and exploding-gradient problems during training, it is used here to enhance the input in the model. The residual model takes X as input, obtains the mapping X_2 after several hidden layers, directly concatenates X and X_2 by splicing, and feeds the concatenation into the output layer as a whole.
(4-4) fitting data Using neural networks
Appropriate weights and biases exist in the neural network; the process of adjusting the weights and biases to fit the training data is called learning. The learning of a neural network is generally divided into the following four steps:
(1) randomly selecting a portion of data from the training data;
(2) calculating the gradients of the loss function with respect to the weight parameters (by the error back-propagation method);
(3) slightly updating the weight parameters along the negative gradient direction;
(4) repeating the steps (1) to (3).
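The four learning steps above can be sketched with a toy logistic-regression fit; the model, synthetic data, batch size and learning rate are all illustrative, not the patent's actual network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 3))                        # toy training features
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)   # toy labels

w, eta = np.zeros(3), 0.5
for _ in range(300):                                   # step (4): repeat
    idx = rng.choice(len(X), size=32, replace=False)   # step (1): sample a mini-batch
    Xb, yb = X[idx], y[idx]
    p = sigmoid(Xb @ w)
    grad = Xb.T @ (p - yb) / len(idx)                  # step (2): gradient of the loss
    w -= eta * grad                                    # step (3): small step against the gradient

acc = ((sigmoid(X @ w) > 0.5) == y).mean()  # training accuracy after learning
```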
(5) Collecting data sets, training neural network structures
(5-1) collecting three real world datasets, namely Junyi, EDNET and ASSIST
ASSIST is an open dataset collected by ASSISTments (an online tutoring system); it contains learners' answer records for the 2009-2010 school year and the relationships between exercises and knowledge concepts;
Junyi is taken from the online learning platform Junyi Academy. The dataset contains answer records from October 2012 to January 2015; each exercise examines only one concept and each concept is contained in only one exercise, and expert-annotated relations among concepts are provided. This study takes the answer records of the top 15,000 students by number of answers;
The EdNet dataset is a large-scale hierarchical dataset collected by an online education platform named Santa; it contains various student activities such as question answering, course consumption and course purchase;
Data preprocessing is performed on the raw datasets, including data cleaning, outlier handling, and removing the records of learners with too few responses.
(5-2) selecting a Cross-entropy loss function as the loss function
In the feedforward neural network structure, the model selects the cross-entropy loss function as the loss function to measure the loss between the predicted value and the true value, and the effectiveness of the model is demonstrated by pursuing a lower loss value. The cross-entropy loss function can be written as:

L = −Σ_i [ r_i log y_i + (1 − r_i) log(1 − y_i) ]

where r_i is the true response and y_i is the predicted probability.
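A minimal implementation of the binary cross-entropy loss described above; the clipping constant eps is an implementation detail added here to avoid log(0):

```python
import numpy as np

def cross_entropy(y_pred, r, eps=1e-12):
    # L = -mean(r log y + (1 - r) log(1 - y)) over the responses.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(r * np.log(y_pred) + (1 - r) * np.log(1 - y_pred))

r      = np.array([1.0, 0.0, 1.0])  # true responses
y_good = np.array([0.9, 0.1, 0.8])  # confident, mostly correct predictions
y_bad  = np.array([0.4, 0.6, 0.3])  # poor predictions

loss_good = cross_entropy(y_good, r)
loss_bad  = cross_entropy(y_bad, r)
```

Better predictions yield a lower loss value, which is the quantity the training loop minimizes.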
(5-3) performing error back propagation, selecting a method for calculating the error gradient under each weight in real time for updating the parameters
The interaction ability features, the interaction mastery features and the student speed features are fused as described above to obtain X.
After receiving the mixed input X, X is passed to the first fully connected layer (Linear layer), where it is linearly mapped to obtain z_1 and then processed by a sigmoid activation function to obtain X_1. X_1 is then passed to the second fully connected layer and the steps are repeated. Repeating the linear-sigmoid processing twice yields the mapping X_2. The formula is described as follows:
z_i = W_i X_{i-1} + b_i, X_i = sigmoid(z_i), with X_0 = X
In the model, back propagation fits and updates the parameters. ΔW_ij is the parameter update quantity, described as follows:

ΔW_ij = −η ∂E/∂W_ij = −η X_i δ_j
The variable W_ij represents the neuron weight between i and j; ΔW_ij is defined as the weight update, η is the learning rate, and ∂E/∂W_ij represents the partial derivative of the squared-error function with respect to the weight. X_i is the output of the current neuron, and δ_j is the error produced at neuron j of the current layer (i.e. the error between the actual value and the predicted value). The input to neuron j is obtained as the weighted sum of the outputs X_i of the upper-layer neurons i.
(5-4) Selecting the optimization algorithm optimizer.step() and the back-propagation algorithm backward() to minimize the loss function
W_ij = W_ij + ΔW_ij, therefore W_ij = W_ij − η X_i δ_j
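A small numeric check of the update rule W_ij ← W_ij − η X_i δ_j; all values here are arbitrary:

```python
import numpy as np

eta     = 0.1                     # learning rate
X_i     = np.array([0.5, 0.8])    # outputs of the upper-layer neurons feeding neuron j
delta_j = 0.3                     # error produced at neuron j
W_ij    = np.array([0.2, -0.4])   # current weights into neuron j

delta_W = -eta * X_i * delta_j    # Delta W_ij = -eta * X_i * delta_j
W_new   = W_ij + delta_W          # W_ij <- W_ij + Delta W_ij = W_ij - eta X_i delta_j
```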
What is not described in detail in this specification is prior art known to those skilled in the art.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents and improvements made within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (6)
1. An interpretable dynamic cognitive diagnosis method based on a memory network is characterized by comprising the following steps:
(1) Constructing an interpretable dynamic cognitive diagnosis framework based on a memory network; the method comprises feature extraction, feature interaction and cognitive diagnosis modeling based on interpretable dynamics of a memory network;
(2) Updating knowledge proficiency by using a memory network structure; taking the student knowledge features extracted in the step (1) as the input of a network structure, and storing and outputting the knowledge proficiency;
(3) Fusing student characteristics and test question characteristics; fusing the student capability features and the speed features extracted in the step (1) with the knowledge proficiency features and the test question features of the step (2) to obtain a final input characterization vector;
(4) Modeling the diagnosis algorithm using the neural network structure, taking the final input characterization vector obtained in step (3) as the input of the network structure, and outputting the student response results; the diagnosis algorithm consists of a neural network structure and a loss function;
(5) And collecting a data set, training a neural network structure, predicting response of students and analyzing the change condition of knowledge proficiency.
2. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the constructing the memory network-based interpretable dynamic cognitive diagnostic framework in step (1) specifically includes:
(1-1) extracting features, wherein the feature extraction comprises the steps of extracting student features, test question features and interaction features, the student features comprise knowledge proficiency features, capability features and speed features, the test question features comprise difficulty features, distinguishing degree features and Q matrix, and the interaction features comprise guessing error features; the Q matrix represents knowledge points of examination questions;
(1-2) feature interaction: the student features and the test question features interact with reference to the Item Response Theory model and the DINA model, i.e. each feature of the test questions interacts with each feature of the students;
(1-3) interpretable dynamic cognitive diagnosis modeling based on the memory network: outputting diagnosis prediction results according to the students' diagnosis data, wherein the results include the students' knowledge mastery states and predicted answer scores on the test questions; the interpretable dynamic cognitive diagnosis model based on the memory network consists of three parts: parameter initialization, feature fusion and deep diagnosis.
3. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the updating of knowledge proficiency using memory network structure in step (2) specifically includes:
(2-1) extracting a knowledge embedding vector from the test questions, calculating the correlation weights of the knowledge embedding vector, and extracting a growth vector from the test questions and the answer records;
(2-2) passing the growth vector extracted in step (2-1) through the ENN and the ANN to obtain the forgetting information and the memory information; wherein the ENN comprises a one-layer neural network and a Sigmoid activation function, and the ANN comprises a one-layer neural network and a Tanh activation function;
(2-3) combining the knowledge embedding vector correlation weights calculated in step (2-1) with the forgetting information and the memory information obtained in step (2-2) to represent the updated knowledge state.
4. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the feature fusion in step (3) includes:
(3-1) differencing the student ability features and the test question difficulty features to obtain the interaction ability features;
(3-2) multiplying the knowledge proficiency characteristic obtained by updating the memory network by the Q matrix to obtain an interactive grasping characteristic;
and (3-3) fusing the interactive capability features, the interactive mastering features and the student speed features extracted based on the response time to obtain final feature characterization.
5. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the neural network structure modeling diagnostic algorithm in step (4) specifically includes:
(4-1) selecting an appropriate network structure;
(4-2) randomly initializing parameters;
(4-3) applying a depth residual network;
(4-4) fitting the data using a neural network.
6. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the neural network structure training of step (5) specifically includes:
(5-1) collecting three real world datasets, namely JUNYI, EDNET and ASSIST;
(5-2) selecting a cross entropy loss function as a loss function in the feedback neural network structure to measure the loss between the predicted value and the true value;
(5-3) performing error back propagation, and selecting a method for calculating the error gradient under each weight in real time to update the parameters;
(5-4) selecting the optimization algorithm optimizer.step() and the back-propagation algorithm backward() to minimize the loss function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310640694.4A CN116705294A (en) | 2023-05-31 | 2023-05-31 | Interpretable dynamic cognitive diagnosis method based on memory network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116705294A true CN116705294A (en) | 2023-09-05 |
Family
ID=87823206
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117763361A (en) * | 2024-02-22 | 2024-03-26 | 泰山学院 | Student score prediction method and system based on artificial intelligence |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |