CN116705294A - Interpretable dynamic cognitive diagnosis method based on memory network - Google Patents
Interpretable dynamic cognitive diagnosis method based on memory network
- Publication number: CN116705294A
- Application number: CN202310640694.4A
- Authority: CN (China)
- Prior art keywords: features; knowledge; student; interpretable; diagnosis
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/048—Activation functions
- G06N3/096—Transfer learning
- G06Q50/20—Education
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention belongs to the field of educational data mining and provides an interpretable dynamic cognitive diagnosis method based on a memory network, comprising the following steps: (1) construct an interpretable dynamic cognitive diagnosis framework based on a memory network; (2) update knowledge proficiency using the memory network structure; (3) fuse student features and test question features; (4) model the diagnosis algorithm with a neural network structure, taking the final input representation vector obtained in step (3) as the input of the network structure and outputting the student response result; (5) predict student responses and analyze how knowledge proficiency changes. The method of the invention initializes the learner's diagnosis from multiple angles to improve model interpretability, and uses the memory network to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level, which both improves the accuracy of inferring the learner's dynamic knowledge and skill level and enhances the ability to capture long-term dependencies in the test question sequence.
Description
Technical Field
The invention belongs to the field of educational data mining and particularly relates to an interpretable dynamic cognitive diagnosis method based on a memory network.
Background Art
In recent years, with the spread of cognitive psychology and artificial intelligence technologies, large-scale online platforms (MOOCs) and intelligent tutoring systems have become increasingly intelligent, and one key technology behind these systems is cognitive diagnosis. Cognitive diagnosis theory serves as a new generation of educational measurement theory: rather than simply assigning the learner a score, it models the learner's cognitive processes and mines the learner's latent abilities and skill states. Learners, exercises and knowledge concepts are the most important components of a cognitive diagnosis system, and the Cognitive Diagnosis Model (CDM) task aims to model the complex relationships among these three components to simulate learner performance and infer the knowledge state accumulated during exercise. By modeling approach, cognitive diagnosis can be divided into two types: cognitive diagnosis models based on statistical methods and cognitive diagnosis models based on neural networks.
A statistical cognitive diagnosis model performs probabilistic modeling of the student response process under different learning assumptions and thereby diagnoses the learner's skills and skill state. Depending on how the learner's skill state is represented, these models fall into two cases: latent traits and specific knowledge skills. Cognitive diagnosis models based on the learner's latent traits are represented by Item Response Theory (IRT), which models a student's latent cognitive ability as a continuous parameter and combines it with exercise difficulty features. On the other hand, cognitive diagnosis models based on specific knowledge-skill states are represented by the Deterministic Inputs, Noisy "And" gate (DINA) model, which uses a binary discrete vector to represent whether a learner has mastered a given concept and assumes that a learner can correctly answer an exercise only by mastering all the concepts it contains; this gives the model good interpretability. However, statistics-based CDMs rely on manually designed diagnostic functions, which are insufficient to capture the complex interactions between learners and exercises.
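As a concrete illustration of the DINA assumption described above, the slip/guess response probability can be sketched in a few lines (a toy example with hypothetical numbers, not the invention's implementation):

```python
import numpy as np

def dina_correct_prob(alpha, q, slip, guess):
    """P(correct) = (1 - slip)**eta * guess**(1 - eta), where eta = 1
    iff the student masters every concept the exercise requires."""
    eta = int(np.all(alpha[q == 1] == 1))
    return (1 - slip) ** eta * guess ** (1 - eta)

# Student masters concepts 1 and 2 but not 3; the exercise requires 1 and 2.
alpha = np.array([1, 1, 0])   # binary mastery vector
q = np.array([1, 1, 0])       # Q-matrix row of the exercise
p = dina_correct_prob(alpha, q, slip=0.1, guess=0.2)   # eta = 1, so p = 1 - slip
```

If the exercise instead required the unmastered concept 3, eta would be 0 and the student could only answer correctly by guessing, with probability guess.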
Benefiting from the development of a new generation of artificial intelligence, the limitations of statistical cognitive diagnosis have been addressed since the emergence of deep learning, and introducing deep learning into cognitive diagnosis has become a trend. For example, Neural Cognitive Diagnosis (NeuralCD) uses multiple neural layers to model the complex interactive behavior between learners and exercises, which improves the accuracy of the diagnostic results, but NeuralCD ignores the implicit relationships between knowledge concepts and between exercises. Relation-map-driven cognitive diagnosis (RCD) uniformly models interactions and structural relations through a multi-layer student-exercise-concept relation map and uses a multi-level attention network to realize relation aggregation between the local map and its nodes; based on attention mechanisms and neural networks, the deep cognitive diagnosis framework (DeepCDM) considers the importance of and interactions among knowledge concepts. To exploit the textual information of exercises in cognitive diagnosis, researchers have also implemented neural-network-based IRT models. In work combining cognitive diagnosis and deep learning, most research has focused on improving model accuracy without deeply exploring students' skill mastery states. While deep learning can model the learner-exercise-knowledge-concept interaction well, how to address the "black box" character of deep learning and enhance the interpretability of the diagnostic process has become an urgent problem.
Disclosure of Invention
The invention aims to provide an interpretable dynamic cognitive diagnosis method based on a memory network, addressing problems such as the lack of interpretability of deep-learning cognitive diagnosis functions and the fact that a student's knowledge mastery state changes dynamically. First, with reference to parameters from psychometric theory, learner ability, test question difficulty and discrimination, and the learner's guess and slip parameters are introduced, and multidimensional features such as response speed, derived from response time, are proposed; the learner's diagnosis is thus initialized from multiple angles to improve model interpretability. Second, the memory network is used to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level, which improves the accuracy of inferring the learner's dynamic knowledge proficiency and enhances the ability to capture long-term dependencies in the test question sequence.
In order to achieve the aim, the invention adopts the following technical scheme.
The invention provides an interpretable dynamic cognitive diagnosis method based on a memory network, comprising the following steps:
(1) Construct an interpretable dynamic cognitive diagnosis framework based on a memory network, comprising feature extraction, feature interaction and memory-network-based interpretable dynamic cognitive diagnosis modeling;
(2) Update knowledge proficiency using the memory network structure: take the student knowledge features extracted in step (1) as the input of the network structure, and store and output the knowledge proficiency;
(3) Fuse student features and test question features: fuse the student ability features and speed features extracted in step (1) with the knowledge proficiency features of step (2) and the test question features to obtain the final input representation vector;
(4) Model the diagnosis algorithm with a neural network structure: take the final input representation vector obtained in step (3) as the input of the network structure and output the student response result; the diagnosis algorithm consists of a neural network structure and a loss function;
(5) Collect data sets, train the neural network structure, predict student responses and analyze how knowledge proficiency changes.
In the above technical solution, constructing the memory-network-based interpretable dynamic cognitive diagnosis framework in step (1) specifically comprises:
(1-1) Feature extraction, which comprises extracting student features, test question features and interaction features. The student features comprise knowledge proficiency features, ability features and speed features; the test question features comprise difficulty features, discrimination features and the Q matrix; the interaction features comprise guess and slip features. The Q matrix records which knowledge points each test question examines: rows correspond to test questions, columns correspond to knowledge points, and each element is binary (0 or 1). For example, if the first question examines knowledge point 1, then the element in row 1, column 1 is 1, and the remaining elements of column 1 are 0;
(1-2) Feature interaction: with reference to the Item Response Theory model and the DINA model, the student features and the test question features interact; that is, each test question feature interacts with each student feature;
(1-3) Memory-network-based interpretable dynamic cognitive diagnosis modeling: diagnosis and prediction results are output from the students' diagnostic data, including the knowledge mastery state and the predicted answer score of each student on the test questions. The memory-network-based interpretable dynamic cognitive diagnosis model consists of three parts: parameter initialization, feature fusion and deep diagnosis.
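The Q matrix described in (1-1) can be illustrated with a small binary example (the specific questions and knowledge points are hypothetical):

```python
import numpy as np

# Rows are test questions, columns are knowledge points; a 1 means
# "this question examines this knowledge point".
Q = np.array([
    [1, 0, 0],   # question 1 examines knowledge point 1 only
    [0, 1, 1],   # question 2 examines knowledge points 2 and 3
    [1, 0, 1],   # question 3 examines knowledge points 1 and 3
])

# Knowledge points examined by question 2 (row index 1):
examined = np.flatnonzero(Q[1])
```

Multiplying a student's mastery vector by a row of Q masks the mastery down to exactly the knowledge points that question examines.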
In the above technical solution, updating the knowledge proficiency using the memory network structure in step (2) specifically comprises:
(2-1) extracting a knowledge embedding vector from the test question, calculating the correlation weights of the knowledge embedding vector, and extracting a growth vector from the test question and the answer record;
(2-2) passing the growth vector extracted in step (2-1) through ENN and ANN to obtain the forgetting information and the memory information, where ENN comprises one neural network layer with a Sigmoid activation function and ANN comprises one neural network layer with a Tanh activation function;
(2-3) combining the knowledge embedding vector correlation weights calculated in step (2-1) with the forgetting information and memory information obtained in step (2-2) to represent the update.
In the above technical solution, the feature fusion in step (3) specifically comprises:
(3-1) taking the difference between the student ability features and the test question difficulty features to obtain the interaction ability features;
(3-2) multiplying the knowledge proficiency features obtained from the memory network update by the Q matrix to obtain the interaction mastery features;
(3-3) fusing the interaction ability features, the interaction mastery features and the student speed features extracted from the response time to obtain the final feature representation.
In the above technical solution, modeling the diagnosis algorithm with a neural network structure in step (4) specifically comprises:
(4-1) selecting an appropriate network structure;
(4-2) randomly initializing the parameters;
(4-3) applying a deep residual network;
(4-4) fitting the data using the neural network.
In the above technical solution, the training of the neural network structure in step (5) specifically comprises:
(5-1) collecting three real-world data sets, namely JUNYI, EDNET and ASSIST;
(5-2) selecting the cross-entropy loss function as the loss function of the feedback neural network structure to measure the loss between the predicted value and the true value;
(5-3) performing error back propagation, updating the parameters by computing the error gradient with respect to each weight in real time;
(5-4) calling the optimizer's optimizer.step() and the back-propagation routine backward() to minimize the loss function.
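Steps (5-2) to (5-4) can be sketched end to end. The patent's network would use a deep-learning framework's backward() and optimizer.step(); the following numpy version, with a single logistic unit and a hand-derived cross-entropy gradient, is only an illustrative stand-in on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: fused feature vectors X and binary response labels y.
X = rng.normal(size=(64, 8))
true_w = rng.normal(size=8)
y = (X @ true_w > 0).astype(float)

w = np.zeros(8)
lr = 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))                   # predicted responses
    loss = -np.mean(y * np.log(p + 1e-9)
                    + (1 - y) * np.log(1 - p + 1e-9))    # cross-entropy (5-2)
    grad = X.T @ (p - y) / len(y)                        # error gradient (5-3)
    w -= lr * grad                                       # parameter update (5-4)

accuracy = np.mean((p > 0.5) == y)
```

Each pass computes the loss, the gradient of the loss with respect to the weights, and a gradient step, which is exactly the role backward() and optimizer.step() play in a framework.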
Compared with the prior art, the invention has the following outstanding substantive features and remarkable progress:
1. The invention provides an interpretable cognitive diagnosis framework based on a memory network. The learner's diagnosis is initialized from multiple angles using multidimensional features such as response speed, so as to improve the interpretability of the model.
2. The method uses the memory network to construct a transition representation of the learner's knowledge proficiency at the knowledge-state level. This knowledge interaction process improves the accuracy of inferring the learner's dynamic knowledge proficiency and enhances the ability to capture long-term dependencies in the test question sequence. Finally, additive combination and multiplicative combination are both used for feature fusion, achieving better prediction performance.
3. The method is comprehensively evaluated on three real-world data sets, and the results demonstrate its superiority and interpretability in cognitive diagnosis. The framework benefits from the strong learning capacity of deep learning and the interpretability of psychometrics, achieves better prediction performance, and can analyze and describe a learner's learning trajectory based on the knowledge proficiency that the memory network outputs at different times.
Drawings
Fig. 1 is a schematic diagram of a memory forgetting rule of student learning.
Fig. 2 is a flow chart of cognitive diagnostics.
FIG. 3 is a diagram of an interpretable dynamic cognitive diagnostic framework based on a memory network.
Detailed Description
The method of the embodiment of the invention constructs an interpretable dynamic cognitive diagnosis framework based on a memory network. It dynamically constructs the learner's knowledge state by extracting multidimensional student features and test question features, and outputs the students' knowledge states at different moments, thereby judging the learner's learning and forgetting. Specifically, the students' knowledge proficiency, ability and speed features are first extracted, together with the difficulty and discrimination of the test questions and the guess and slip features of the student-question interaction. Student learning is regular: students both learn and forget knowledge during the learning process, and this memory-forgetting rule is shown in figure 1, so a memory network can dynamically model how the learner's knowledge state changes. The learner features and test question features are fused with reference to Item Response Theory and the DINA model, making full use of the learner's response-time and history information and giving the diagnosis results interpretability. Experimental results show that the model of this embodiment achieves better prediction performance and that the learner's knowledge state exhibits positive transfer across different moments.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
(1) Constructing an interpretable dynamic cognitive diagnostic framework based on a memory network
The flow of cognitive diagnosis is shown in fig. 2. The students' answer information is used to learn the difficulty and discrimination of the test questions and the proficiency, ability and speed of the students. These features, modeled together with the expert-labeled Q matrix, yield the students' knowledge proficiency and the learners' knowledge-state representations at different moments.
As shown in fig. 3, the constructed interpretable dynamic cognitive diagnosis framework based on a memory network comprises feature extraction, feature interaction and the modeling diagnosis algorithm. The features include student features (knowledge proficiency, ability, speed), test question features (Q matrix, difficulty, discrimination) and interaction features (guess, slip). Feature interaction aggregates the extracted features into final representation vectors, which are fused and then fed into the deep neural network for training.
(1-1) feature extraction
The feature extraction comprises student feature extraction, test question feature extraction and interaction feature extraction. The student features comprise knowledge proficiency, ability and speed features; the test question features comprise difficulty features, discrimination features and the Q matrix; the interaction features comprise guess and slip features.
(1-1-1) Student features: each of a student's assessment responses depends on the student's current proficiency on the test question's knowledge, the student's ability and the student's speed, i.e. in the model, the student's knowledge proficiency M_t, ability Ω and speed V.
(1-1-2) Test question features: the latent characteristics of assessment items are diverse, and the same student will respond differently on different questions covering the same skill. The Q matrix alone is insufficient to describe the test question features, so the Q matrix, the difficulty matrix and the discrimination features are selected together as the test question features. For the representation of the question-skill difficulty matrix, neural network fitting is chosen to extract the difficulty and discrimination features: supported by the strong fitting capacity of the neural network, the examined difficulty can be fitted from the students' answer data, and reverse iteration drives the fitted difficulty toward the true difficulty.
(1-1-3) Interaction features: after the student and test question features are obtained, the relationship between students and test questions must be considered further. The interactions themselves are taken to imply extractable features, and the guess coefficient and slip coefficient are treated as the interaction features.
(1-2) feature interactions
As shown in the embedding layer in fig. 3(a), the student features and the test question features interact with reference to the Item Response Theory model and the DINA model; that is, each test question feature interacts with each student feature;
(1-3) Memory-network-based interpretable dynamic cognitive diagnosis modeling outputs diagnosis and prediction results from the students' diagnostic data, including the knowledge mastery state and the predicted answer score of each student on the test questions.
To achieve accurate diagnosis, the memory-network-based interpretable dynamic cognitive diagnosis model comprises three parts: parameter initialization, feature fusion and deep diagnosis.
(1-3-1) Parameter initialization
The main task of initialization is to randomly generate the parameters. The parameters are taken from the IRT and DINA cognitive diagnosis models and expressed in a suitable data form, as tensors, in the neural network. The continuous mastery degree of each student on each concept is initialized. This initialization provides a good correction space for the backward-feedback iteration, so that the student's mastery of each knowledge point approaches the true value.
Initialize the test question difficulty matrix K = {k_jk}_{J×K} and the test question discrimination matrix E = {e_jk}_{J×K}. Two parameter vectors, slip and guess, are randomly initialized to represent the slip coefficient and guess coefficient of the test questions: S = [s_1, s_2, ..., s_J] and G = [g_1, g_2, ..., g_J], where s_j and g_j are the slip and guess coefficients of test question j. Initialize student i's skill mastery factor μ_it = {μ_itk}, and randomly initialize student i's speed factor τ_i = {τ_ik} and ability factor α_i = {α_ik}.
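A minimal sketch of this parameter initialization (the dimensions J, K and N and the uniform/normal ranges are illustrative assumptions, not values from the patent):

```python
import numpy as np

rng = np.random.default_rng(42)
J, K, N = 20, 5, 100   # test questions, knowledge points, students

difficulty = rng.uniform(size=(J, K))        # K = {k_jk}, J x K
discrimination = rng.uniform(size=(J, K))    # E = {e_jk}, J x K
slip = rng.uniform(0.0, 0.3, size=J)         # S = [s_1, ..., s_J]
guess = rng.uniform(0.0, 0.3, size=J)        # G = [g_1, ..., g_J]
mastery = rng.uniform(size=(N, K))           # mu_i: continuous mastery per concept
speed = rng.normal(size=(N, K))              # tau_i: speed factor
ability = rng.normal(size=(N, K))            # alpha_i: ability factor
```

In a framework these arrays would be registered as trainable tensors so that back propagation can pull them toward their true values.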
(1-3-2) feature fusion
The student speed factor τ is combined with the trainable matrix A, and the speed feature V is obtained through a sigmoid function:
V = sigmoid(τ * A)
The student ability factor α is multiplied by the initialized trainable matrix B, and the ability feature Ω is obtained through a sigmoid function:
Ω = sigmoid(α * B)
As mentioned in (1-3-1), μ_t is the student's knowledge mastery factor at time t. Multiplying it by the trainable matrix C gives the student's knowledge mastery feature M_t:
M_t = sigmoid(μ_t * C)
Speed interaction: the student speed feature V obtained above interacts with the student's response time rt on the test question:
ξ = V − rt
Ability interaction: the difference between the student ability feature Ω obtained above and the test question difficulty is taken and then multiplied by the discrimination:
η = E_j * (Ω − K_j)
Knowledge proficiency interaction: the student mastery feature M_t obtained above is multiplied by the test question Q matrix and combined with the guess and slip parameters of the interaction features:
ψ = (1 − s_j) * (M_t · Q_j) + g_j * (1 − M_t · Q_j)
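The feature-fusion and interaction steps above can be sketched as follows (the vector sizes, the trainable matrices A, B, C and the DINA-style slip/guess combination are illustrative assumptions consistent with the description, not the patent's exact implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
K_dim = 4                                # number of knowledge points (assumed)

tau, alpha_f, mu_t = rng.normal(size=(3, K_dim))   # speed, ability, mastery factors
A, B, C = rng.normal(size=(3, K_dim, K_dim))       # trainable matrices

V = sigmoid(tau @ A)                     # speed feature    V = sigmoid(tau * A)
Omega = sigmoid(alpha_f @ B)             # ability feature  Omega = sigmoid(alpha * B)
M_t = sigmoid(mu_t @ C)                  # mastery feature  M_t = sigmoid(mu_t * C)

rt = rng.uniform(size=K_dim)             # response time
xi = V - rt                              # speed interaction

diff = rng.uniform(size=K_dim)           # question difficulty
disc = rng.uniform(size=K_dim)           # question discrimination
eta = disc * (Omega - diff)              # ability interaction

q_row = np.array([1.0, 0.0, 1.0, 0.0])              # Q-matrix row of the question
s, g = 0.1, 0.2                                     # slip and guess coefficients
mastered = M_t * q_row                              # mastery masked by the Q matrix
psi = (1 - s) * mastered + g * (1 - mastered)       # combined with slip/guess

x_final = np.concatenate([xi, eta, psi])            # final fused representation
```

The concatenated vector x_final plays the role of the final input representation fed to the diagnostic network.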
(1-3-3) Deep diagnosis
On the basis of the above, the probability that a student answers correctly is predicted using the fitting power of the neural network. In this process, back propagation yields rich intermediate results, such as a student's mastery degree, ability and speed on a specific piece of knowledge, and the difficulty and discrimination of test questions and their knowledge points. These intermediate conclusions are a beneficial complement to cognitive diagnosis.
(2) Updating knowledge proficiency using memory network structure
(2-1) calculating knowledge embedding vector correlation weights to extract growth vectors
A knowledge embedding vector is extracted from the test question, its correlation weights are calculated, and a growth vector is extracted from the test question and the answer record. As shown in the update layer (c) of fig. 3, a knowledge vector k_t is extracted from the test question q_t examined at time t: exercise q_t is multiplied by the embedding matrix D to obtain the continuous embedding vector k_t. The correlation weights are then calculated by applying a softmax activation function to the inner product between k_t and each key vector M^k(i):

w_t(i) = Softmax(k_t^T M^k(i))
(2-2) acquiring forgetting information and memory information
The growth vector extracted in step (2-1) is passed through the ENN (a one-layer neural network with a sigmoid activation function) and the ANN (a one-layer neural network with a Tanh activation function) to obtain the forgetting information and the memory information, respectively. From the test question q_t and the answer record r_t input at time t, the growth vector v_t is extracted, and f_t and m_t are obtained through the ENN and the ANN, with the following formulas:
f_t = Sigmoid(F^T v_t + b_e)
m_t = Tanh(H^T v_t + b_a)
(2-3) The knowledge embedding vector correlation weights calculated in step (2-1) are combined with the forgetting information and the memory information obtained in step (2-2) to represent the updated knowledge state. The update formula for the knowledge state is as follows:
M_t = M_{t-1} * (1 − w_{t-1} * f_t) + w_{t-1} * m_t
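A minimal sketch of steps (2-1) to (2-3) on a toy memory. The slot count and dimensions are assumed, and all matrices are random stand-ins for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
K, dk, dv = 4, 8, 8  # assumed: K memory slots, key dim dk, value dim dv

Mk  = rng.standard_normal((K, dk))  # static key memory (knowledge embeddings)
M   = rng.random((K, dv))           # value memory M_{t-1}: knowledge state
k_t = rng.standard_normal(dk)       # knowledge embedding of question q_t
v_t = rng.standard_normal(dv)       # growth vector from (q_t, r_t)

F, b_e = rng.standard_normal((dv, dv)), rng.standard_normal(dv)  # ENN parameters
H, b_a = rng.standard_normal((dv, dv)), rng.standard_normal(dv)  # ANN parameters

w   = softmax(Mk @ k_t)          # correlation weights over the K slots
f_t = sigmoid(F.T @ v_t + b_e)   # forgetting (erase) information
m_t = np.tanh(H.T @ v_t + b_a)   # memory (add) information

# M_t = M_{t-1} * (1 - w f_t) + w m_t, applied slot-wise via outer products:
M_new = M * (1 - np.outer(w, f_t)) + np.outer(w, m_t)
```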
(3) Feature fusion
The student features and the test question features are fused and interacted. Referring to the IRT formula, the student ability feature θ_i and the question difficulty feature b_j are differenced to obtain the interaction ability; combining with the DINA model, the mastery degree is multiplied by the test question Q matrix and combined with the slip and guess parameters in the interaction features to obtain the interaction mastery feature.
(3-1) differentiating the student ability characteristic and the test question difficulty characteristic to obtain the interaction ability characteristic
The student ability feature Ω obtained above is differenced with the test question difficulty and then multiplied by the discrimination, i.e. the interaction ability feature is (Ω − k_j) ⊙ e_j, where k_j and e_j are the difficulty and discrimination vectors of question j.
(3-2) multiplying the knowledge proficiency feature obtained by the memory network update by the Q matrix to obtain an interactive grasping feature
The student mastery feature M_t obtained above is multiplied by the Q matrix of the test question, and the result is combined with the guess and slip parameters of the test question.
(3-3) fusing the interactive capability features, the interactive mastering features and the student speed features extracted based on the response time to obtain final feature characterization
The student speed characteristic V obtained above is interacted with the reaction time rt of the student for answering the test questions, and the formula is as follows:
ξ=V-rt
The additive joint model and the multiplicative joint model are two representative models commonly used to aggregate data under different assumptions. The additive joint model assumes that the components are interchangeable (compensatory), whereas the multiplicative joint model assumes that the components act concurrently (conjunctive). Let P denote the probability that student u_i can solve problem q at time t; its input x comprises ξ, Ω and M_t, where ξ denotes the speed module, Ω denotes the ability module, and M_t denotes the knowledge mastery state module. On this basis, x is composed of knowledge, ability and speed, and the additive and multiplicative variants, denoted x_add and x_mul, are represented as follows:

Additive joint model: x_add = ξ + Ω + M_t

Multiplicative joint model: x_mul = ξ ⊙ Ω ⊙ M_t
In this embodiment, the additive joint model is selected, and feature fusion is implemented through the concat function in the PyTorch framework.
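A small numeric sketch of the two joint models and the concat-based fusion actually chosen in the embodiment (NumPy stands in for torch.cat here; the module values are arbitrary):

```python
import numpy as np

# Arbitrary toy values for the speed, ability, and knowledge mastery modules:
xi    = np.array([0.2, 0.4])
omega = np.array([0.6, 0.1])
m_t   = np.array([0.9, 0.3])

x_add = xi + omega + m_t               # additive (compensatory) joint model
x_mul = xi * omega * m_t               # multiplicative (conjunctive) joint model
x_cat = np.concatenate([xi, omega, m_t])  # concat fusion, as with torch.cat
```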
(4) Neural network structure modeling diagnosis algorithm
(4-1) selecting an appropriate network Structure
Compared with traditional models that build the function by parameter estimation, a neural network requires few assumptions to learn the interaction function from data; it has been proved that a neural network can approximate any continuous function arbitrarily closely and has strong fitting ability, which makes the model more general. The actual neural-network-modeled interaction function is as follows:
f_2 = [x^T, f_1]
y = φ(W_3 × f_2 + b_3)
where f_1 is the output of the first and second fully connected layers, and f_2 is the output of the residual network, i.e. the concatenation of f_1 and x. W_i and b_i are the weight and bias parameters of the fully connected layers, and y is the final output prediction result.
(4-2) random initialization parameters
The main task of initialization is to randomly generate the parameters; the parameters are taken from the IRT model and the DINA model of cognitive diagnosis and are expressed in a suitable data form in the neural network. Assume a skill test with J questions examining K skills, answered by I students.
The Q matrix is one of the core concepts of cognitive diagnosis. Q = {q_jk}_(J×K) is the incidence matrix of test questions and skills: q_jk = 1 denotes that question j examines skill k, and q_jk = 0 denotes that question j does not examine skill k. The student answer matrix is R = {r_ij}_(I×J), where r_ij = 1 means student i correctly answered question j, and r_ij = 0 otherwise. Building the model requires initializing the following parameters:
Question initialization module: initialize the test question difficulty matrix K = {k_jk}_(J×K), where k_jk ∈ [0,1] represents the difficulty coefficient of skill k applied in question j, and the test question discrimination matrix E = {e_jk}_(J×K), where e_jk ∈ [0,1] represents the discrimination coefficient of skill k applied in question j.
Student initialization module: initialize student i's skill mastery factor μ_it = {μ_itk}, where μ_itk ∈ [0,1] represents the mastery state of student i on skill k at time t; randomly initialize student i's speed factor τ_i = {τ_ik}, where τ_ik ∈ [0,1] represents the student's answering speed on questions examining skill k, and student i's ability factor α_i = {α_ik}, where α_ik ∈ [0,1] indicates the student's ability.
Interaction feature initialization: two parameter vectors, slip and guess, are randomly initialized to represent the error (slip) coefficient and the guess coefficient of the test questions. S = [s_1, s_2, ..., s_J] and G = [g_1, g_2, ..., g_J], where s_j and g_j are the slip and guess coefficients of question j, respectively.
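The initialization modules above can be sketched as follows. The counts I, J, K and the 0.3 cap on the slip/guess coefficients are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K = 3, 5, 4  # assumed: I students, J questions, K skills

Q  = rng.integers(0, 2, size=(J, K))  # Q matrix: q_jk = 1 iff question j examines skill k
Kd = rng.random((J, K))               # difficulty matrix, k_jk in [0, 1]
E  = rng.random((J, K))               # discrimination matrix, e_jk in [0, 1]
S  = rng.random(J) * 0.3              # slip coefficients s_j (kept small, assumed)
G  = rng.random(J) * 0.3              # guess coefficients g_j

mu    = rng.random((I, K))  # skill mastery factors mu_itk at t = 0
tau   = rng.random((I, K))  # speed factors tau_ik
alpha = rng.random((I, K))  # ability factors alpha_ik
```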
(4-3) application of depth residual network
The deep residual network introduces residual blocks in the construction of the neural network. Originally proposed to alleviate the vanishing-gradient and exploding-gradient problems during training, it is used here to enhance the input in the model. The residual model takes X as input, obtains the mapping X_2 after several hidden layers, directly concatenates X and X_2 by splicing, and feeds the concatenation into the output layer as a whole.
(4-4) fitting data Using neural networks
Appropriate weights and biases exist in the neural network; the process of adjusting the weights and biases to fit the training data is called learning. The learning of a neural network is generally divided into the following four steps:
(1) randomly selecting a portion of data from the training data;
(2) calculating the gradients of the loss function with respect to the weight parameters (by the error back-propagation method);
(3) slightly updating the weight parameters along the negative gradient direction;
(4) repeating the steps (1) to (3).
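The four learning steps above can be sketched with a toy logistic-regression fit; the model, synthetic data, batch size and learning rate are all illustrative, not the patent's actual network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 3))                        # toy training features
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)   # toy labels

w, eta = np.zeros(3), 0.5
for _ in range(300):                                   # step (4): repeat
    idx = rng.choice(len(X), size=32, replace=False)   # step (1): sample a mini-batch
    Xb, yb = X[idx], y[idx]
    p = sigmoid(Xb @ w)
    grad = Xb.T @ (p - yb) / len(idx)                  # step (2): gradient of the loss
    w -= eta * grad                                    # step (3): small step against the gradient

acc = ((sigmoid(X @ w) > 0.5) == y).mean()  # training accuracy after learning
```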
(5) Collecting data sets, training neural network structures
(5-1) collecting three real world datasets, namely Junyi, EDNET and ASSIST
ASSIST is an open dataset collected by ASSISTments (an online tutoring system); it contains learners' answer records for the 2009-2010 school year and the relationships between exercises and knowledge concepts;
Junyi is taken from the online learning platform Junyi Academy. The dataset contains answer records from October 2012 to January 2015; each exercise examines only one concept and each concept is contained in only one exercise, and expert-annotated relations among concepts are provided. This study takes the answer records of the top 15,000 students by number of answers;
The EdNet dataset is a large-scale hierarchical dataset collected by an online education platform named Santa; it contains various student activities such as question answering, course consumption and course purchase;
Data preprocessing is performed on the raw datasets, including data cleaning, outlier handling, and removing the records of learners with too few responses.
(5-2) selecting a Cross-entropy loss function as the loss function
In the feedforward neural network structure, the model selects the cross-entropy loss function as the loss function to measure the loss between the predicted value and the true value, and the effectiveness of the model is demonstrated by pursuing a lower loss value. The cross-entropy loss function can be written as:

L = −Σ_i [ r_i log y_i + (1 − r_i) log(1 − y_i) ]

where r_i is the true response and y_i is the predicted probability.
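A minimal implementation of the binary cross-entropy loss described above; the clipping constant eps is an implementation detail added here to avoid log(0):

```python
import numpy as np

def cross_entropy(y_pred, r, eps=1e-12):
    # L = -mean(r log y + (1 - r) log(1 - y)) over the responses.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(r * np.log(y_pred) + (1 - r) * np.log(1 - y_pred))

r      = np.array([1.0, 0.0, 1.0])  # true responses
y_good = np.array([0.9, 0.1, 0.8])  # confident, mostly correct predictions
y_bad  = np.array([0.4, 0.6, 0.3])  # poor predictions

loss_good = cross_entropy(y_good, r)
loss_bad  = cross_entropy(y_bad, r)
```

Better predictions yield a lower loss value, which is the quantity the training loop minimizes.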
(5-3) performing error back propagation, selecting a method for calculating the error gradient under each weight in real time for updating the parameters
The interaction ability features, the interaction mastery features and the student speed features are fused as described above to obtain X.
After receiving the mixed input X, X is passed to the first fully connected layer (Linear layer), where it is linearly mapped to obtain z_1 and then processed by a sigmoid activation function to obtain X_1. X_1 is then passed to the second fully connected layer and the steps are repeated. Repeating the linear-sigmoid processing twice yields the mapping X_2. The formula is described as follows:
z_i = W_i X_{i-1} + b_i, X_i = sigmoid(z_i), with X_0 = X
In the model, back propagation fits and updates the parameters. ΔW_ij is the parameter update quantity, described as follows:

ΔW_ij = −η ∂E/∂W_ij = −η X_i δ_j
The variable W_ij represents the neuron weight between i and j; ΔW_ij is defined as the weight update, η is the learning rate, and ∂E/∂W_ij represents the partial derivative of the squared-error function with respect to the weight. X_i is the output of the current neuron, and δ_j is the error produced at neuron j of the current layer (i.e. the error between the actual value and the predicted value). The input to neuron j is obtained as the weighted sum of the outputs X_i of the upper-layer neurons i.
(5-4) Selecting the optimization algorithm optimizer.step() and the back-propagation algorithm backward() to minimize the loss function
W_ij = W_ij + ΔW_ij, therefore W_ij = W_ij − η X_i δ_j
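A small numeric check of the update rule W_ij ← W_ij − η X_i δ_j; all values here are arbitrary:

```python
import numpy as np

eta     = 0.1                     # learning rate
X_i     = np.array([0.5, 0.8])    # outputs of the upper-layer neurons feeding neuron j
delta_j = 0.3                     # error produced at neuron j
W_ij    = np.array([0.2, -0.4])   # current weights into neuron j

delta_W = -eta * X_i * delta_j    # Delta W_ij = -eta * X_i * delta_j
W_new   = W_ij + delta_W          # W_ij <- W_ij + Delta W_ij = W_ij - eta X_i delta_j
```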
What is not described in detail in this specification is prior art known to those skilled in the art.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents and improvements made within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (6)
1. An interpretable dynamic cognitive diagnosis method based on a memory network is characterized by comprising the following steps:
(1) Constructing an interpretable dynamic cognitive diagnosis framework based on a memory network; the method comprises feature extraction, feature interaction and cognitive diagnosis modeling based on interpretable dynamics of a memory network;
(2) Updating knowledge proficiency by using a memory network structure; taking the student knowledge features extracted in the step (1) as the input of a network structure, and storing and outputting the knowledge proficiency;
(3) Fusing student characteristics and test question characteristics; fusing the student capability features and the speed features extracted in the step (1) with the knowledge proficiency features and the test question features of the step (2) to obtain a final input characterization vector;
(4) Modeling the diagnosis algorithm using the neural network structure, taking the final input characterization vector obtained in step (3) as the input of the network structure, and outputting the student response results; the diagnosis algorithm consists of a neural network structure and a loss function;
(5) And collecting a data set, training a neural network structure, predicting response of students and analyzing the change condition of knowledge proficiency.
2. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the constructing the memory network-based interpretable dynamic cognitive diagnostic framework in step (1) specifically includes:
(1-1) extracting features, wherein the feature extraction comprises the steps of extracting student features, test question features and interaction features, the student features comprise knowledge proficiency features, capability features and speed features, the test question features comprise difficulty features, distinguishing degree features and Q matrix, and the interaction features comprise guessing error features; the Q matrix represents knowledge points of examination questions;
(1-2) feature interaction: the student features and the test question features interact with reference to the Item Response Theory model and the DINA model, i.e. each feature of the test questions interacts with each feature of the students;
(1-3) interpretable dynamic cognitive diagnosis modeling based on the memory network: outputting diagnosis prediction results according to the students' diagnosis data, wherein the results include the students' knowledge mastery states and predicted answer scores on the test questions; the interpretable dynamic cognitive diagnosis model based on the memory network consists of three parts: parameter initialization, feature fusion and deep diagnosis.
3. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the updating of knowledge proficiency using memory network structure in step (2) specifically includes:
(2-1) extracting a knowledge embedding vector from the test questions, calculating the correlation weights of the knowledge embedding vector, and extracting a growth vector from the test questions and the answer records;
(2-2) passing the growth vector extracted in step (2-1) through the ENN and the ANN to obtain the forgetting information and the memory information; wherein the ENN comprises a one-layer neural network and a Sigmoid activation function, and the ANN comprises a one-layer neural network and a Tanh activation function;
(2-3) combining the knowledge embedding vector correlation weights calculated in step (2-1) with the forgetting information and the memory information obtained in step (2-2) to represent the updated knowledge state.
4. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the feature fusion in step (3) includes:
(3-1) differencing the student ability features and the test question difficulty features to obtain the interaction ability features;
(3-2) multiplying the knowledge proficiency characteristic obtained by updating the memory network by the Q matrix to obtain an interactive grasping characteristic;
and (3-3) fusing the interactive capability features, the interactive mastering features and the student speed features extracted based on the response time to obtain final feature characterization.
5. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the neural network structure modeling diagnostic algorithm in step (4) specifically includes:
(4-1) selecting an appropriate network structure;
(4-2) randomly initializing parameters;
(4-3) applying a depth residual network;
(4-4) fitting the data using a neural network.
6. The memory network-based interpretable dynamic cognitive diagnostic method of claim 1, wherein the neural network structure training of step (5) specifically includes:
(5-1) collecting three real world datasets, namely JUNYI, EDNET and ASSIST;
(5-2) selecting a cross entropy loss function as a loss function in the feedback neural network structure to measure the loss between the predicted value and the true value;
(5-3) performing error back propagation, and selecting a method for calculating the error gradient under each weight in real time to update the parameters;
(5-4) selecting the optimization algorithm optimizer.step() and the back-propagation algorithm backward() to minimize the loss function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310640694.4A CN116705294A (en) | 2023-05-31 | 2023-05-31 | Interpretable dynamic cognitive diagnosis method based on memory network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116705294A true CN116705294A (en) | 2023-09-05 |
Family
ID=87823206
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117763361A (en) * | 2024-02-22 | 2024-03-26 | 泰山学院 | Student score prediction method and system based on artificial intelligence |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |