CN117634551A

CN117634551A - Double-view knowledge tracking method for concept relation reasoning

Info

Publication number: CN117634551A
Application number: CN202311665971.3A
Authority: CN
Inventors: 罗森林; 吴松凌; 潘丽敏; 周瑾洁; 吴舟婷
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2023-12-06
Filing date: 2023-12-06
Publication date: 2024-03-01

Abstract

The invention relates to a double-view knowledge tracking method for concept relation reasoning, belonging to the technical field of computer and information science. First, a concept relation matrix and a knowledge state matrix are constructed from learner answer records through a causal GRU. Next, a knowledge state graph is built from the two matrices, and a graph convolution network extracts neighbor node features to obtain the knowledge state graph embedding of the concept association view. In parallel, the sequence of knowledge state matrices is weighted and summed according to the concept relation matrix to obtain the historical knowledge state matrix of the time-dependence view. Finally, these are combined with the concept embedding of the new question to predict whether the learner answers it correctly. Addressing the problem that existing methods do not fully mine concept association relations and the time dependence of knowledge states, the invention improves the accuracy of knowledge tracking prediction.

Description

Double-view knowledge tracking method for concept relation reasoning

Technical Field

The invention relates to a double-view knowledge tracking method for concept relation reasoning, belonging to the technical field of computer and information science.

Background

Intelligent education systems are widely used in modern education because they are less constrained by time and place, are convenient, and support personalized learning. However, their large-scale use generates massive amounts of learning data, and relying solely on educators to analyze these data cannot meet users' demand for personalized learning services. Research on deep learning knowledge tracking methods, which automatically measure students' knowledge levels and support personalized learning plans, is therefore of significant value for artificial-intelligence-aided education. With the development of deep learning, much research has been carried out in the field of knowledge tracking. According to the type of neural network adopted, deep learning knowledge tracking methods can be divided into methods based on recurrent neural networks, methods based on memory-augmented networks, and methods based on graph neural networks.

1. Knowledge tracking methods based on recurrent neural networks

Knowledge tracking methods based on recurrent neural networks capture sequence information with the hidden states of a recurrent neural network (RNN) or a long short-term memory (LSTM) network, and thereby predict the probability that a student answers a question or concept correctly at each time step. Compared with traditional machine learning methods, they achieve a clear performance improvement. However, this type of method still has non-negligible limitations: it cannot model the relationships among multiple concepts; it easily loses information when processing very long sequences; and it assumes that the correlation between all questions in a sequence is the same.

2. Knowledge tracking methods based on memory-augmented networks

Knowledge tracking methods based on memory-augmented networks represent the knowledge state with a key-value memory: the key matrix represents the relevance mapping from questions to concepts, and the value matrix represents the student's mastery of each concept. The key-value memory adds a memory matrix to a standard recurrent neural network, which allows the network to retain multiple hidden state vectors and increases its memory capacity. However, these methods do not mine or exploit the relations among concepts, and they ignore the influence of question correlation and concept correlation on the time dependence of the sequence.

3. Knowledge tracking methods based on graph neural networks

Knowledge tracking methods based on graph neural networks model the knowledge tracking problem with a concept graph structure and reformulate it as a time-series node-level classification problem. In the graph, nodes represent concepts and edges represent relationships between concepts. These methods simulate the knowledge transfer theory from the field of education: when a learner studies a concept, the mastery of that concept changes, and so does the mastery of related concepts. In this type of method, however, the relationships between concepts are simply modeled as bidirectional dependencies rather than the more realistic prerequisite relationships; moreover, the edges are constructed from statistical analysis of the data, which is ineffective when data are limited.

In summary, existing deep learning knowledge tracking methods mainly have the following problems: (1) concept relationships are modeled as simple bidirectional relationships, and constructing them requires statistical analysis of large-scale data; (2) the effect of question correlation and concept correlation on the time dependence of the sequence is ignored.

Disclosure of Invention

The invention aims to provide a double-view knowledge tracking method for concept relationship reasoning, aiming at the problem that the existing knowledge tracking method does not fully mine concept association relations and knowledge state time dependence.

The design principle of the invention is as follows: first, a concept relation matrix and a knowledge state matrix are constructed from learner answer records through a causal GRU; next, a knowledge state graph is built from the two matrices, and a graph convolution network extracts neighbor node features to obtain the knowledge state graph embedding of the concept association view; in parallel, the sequence of knowledge state matrices is weighted and summed according to the concept relation matrix to obtain the historical knowledge state matrix of the time-dependence view; finally, these are combined with the concept embedding of the new question to predict whether the learner answers it correctly.

The technical scheme of the invention is realized by the following steps:

Step 1, creating a learnable concept relation matrix that represents the prerequisite relations among concepts; multiplying all weights of the GRU by the concept relation matrix to generate a causal GRU.

Step 2, generating the embedding vectors corresponding to the indices of the question concepts and answers by using learnable dense embedding matrices.

Step 3, representing a single answer record of the learner by the embedding vectors of the concept and the answer, inputting it into the causal GRU to update the knowledge state recurrently, and adjusting the parameters of the concept relation matrix according to gradient information.

Step 4, constructing a knowledge state graph based on the concept relation matrix and the current knowledge state matrix, and extracting neighbor node features with a directed graph convolution network to obtain the knowledge state graph embedding of the concept association view; and weighting and summing the knowledge state matrix sequence according to the concept relation matrix to obtain the historical knowledge state matrix of the time-dependence view.

Step 4.1, using the concepts as the nodes of the graph; splitting the current knowledge state matrix into the knowledge state vectors of the individual concepts as node features; constructing unidirectional edges between nodes according to the concept relation matrix to build the knowledge state graph; and extracting features of the knowledge state graph with a directed graph convolution network to obtain the knowledge state graph embedding of the concept association view.

Step 4.2, multiplying the one-hot vector of the concept index of the current question by the concept relation matrix to obtain the influence vector of the other concepts on the current concept; multiplying the influence vector by each item in the one-hot vector sequence of the concept indices of the previous questions to obtain the influence weight sequence of the previous knowledge state matrix sequence on the current knowledge state matrix; and weighting and summing the previous knowledge state matrix sequence with the influence weight sequence to obtain the historical knowledge state matrix of the time-dependence view.

Step 5, combining the knowledge state graph embedding, the historical knowledge state matrix and the new question concept embedding, and obtaining the probability that the learner answers the new question correctly through a fully connected network and a Sigmoid activation function.

Advantageous effects

Compared with knowledge tracking methods based on recurrent neural networks and memory-augmented networks, the invention can infer and exploit the prerequisite relations among concepts, model the influence among the knowledge state vectors of the concepts, and take into account the influence of those prerequisite relations on the time dependence of the sequence.

Compared with knowledge tracking methods based on graph neural networks, the invention infers the prerequisite relations among concepts automatically in an end-to-end manner, which avoids the poor concept relation modeling caused by an insufficient amount of statistical data; and by incorporating a recurrent neural network, it strengthens the ability to capture the time dependence of the sequence.

Drawings

FIG. 1 is a schematic diagram of the double-view knowledge tracking method for concept relation reasoning of the present invention.

Detailed Description

For a better illustration of the objects and advantages of the invention, embodiments of the method of the invention are described in further detail below in conjunction with examples.

The experimental data come from public educational datasets: ASSISTments2009 and ASSISTments2015, which are yearly datasets of an online tutoring platform, and Statics2011, the dataset of a Statics course at Carnegie Mellon University. Each dataset is divided as follows: 20% of the samples are used as the test set and 80% as the training-validation set; within the training-validation set, 80% of the samples are used as the training set and 20% as the validation set (that is, 64% and 16% of all samples, respectively). The basic properties of each dataset are shown in Table 1.

TABLE 1 knowledge tracking experimental data

In the experimental process, a binary cross entropy loss function is used as a model optimization target, an Adam optimizer is used as a model optimization method, and the embedding dimension and the hidden state dimension are both set to 128.

The experiments use the prediction accuracy (ACC) and the area under the ROC curve (AUC) as evaluation metrics for the knowledge tracking predictions. ACC is calculated as follows:

$$\mathrm{ACC} = \frac{M}{N}$$

where N is the total number of test samples and M is the number of samples that the model predicts correctly.

AUC is calculated as follows:

where P is the number of positive samples (samples whose answer label is correct), N is the number of negative samples (samples whose answer label is wrong), $p_i$ is the model's predicted probability for the i-th positive sample, and $n_j$ is the model's predicted probability for the j-th negative sample.
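The two metrics can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the evaluation code of the experiments: the 0.5 decision threshold for ACC and the convention of counting tied positive/negative pairs as 0.5 in the pairwise AUC are assumptions, and the function names and toy data are hypothetical.

```python
def accuracy(labels, probs, threshold=0.5):
    """ACC = M / N: share of samples whose thresholded prediction matches the label."""
    # Assumption: a probability >= threshold counts as predicting "correct answer".
    correct = sum(1 for y, p in zip(labels, probs) if (p >= threshold) == bool(y))
    return correct / len(labels)


def auc(labels, probs):
    """Pairwise AUC: share of (positive, negative) pairs that are ranked correctly."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    score = 0.0
    for p_i in pos:
        for n_j in neg:
            if p_i > n_j:
                score += 1.0
            elif p_i == n_j:
                score += 0.5  # assumption: ties count as half a correct pair
    return score / (len(pos) * len(neg))


# Toy data: 3 positive and 2 negative samples.
labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.4, 0.35, 0.8, 0.6]
acc_value = accuracy(labels, probs)
auc_value = auc(labels, probs)
print(acc_value, auc_value)
```

On the toy data, three of the five thresholded predictions match their labels, and four of the six positive/negative pairs are ranked correctly.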

The experiments were run on a server with a 64-bit Ubuntu Linux operating system, an Intel(R) Xeon(R) Gold 6248R CPU, 128 GB of RAM, and an RTX 4090 GPU.

The specific flow of the experiment is as follows:

Step 1.1, for a randomly initialized positive parameter matrix, a doubly stochastic matrix P is obtained using the Sinkhorn algorithm. A doubly stochastic matrix is a differentiable approximation of a permutation matrix. Multiplying a matrix by a permutation matrix permutes the order of its rows or columns, so P can order the concepts while still taking part in gradient computation.
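The Sinkhorn step can be illustrated with a short NumPy sketch: rows and columns of a strictly positive matrix are normalized alternately until both sets of sums are close to 1. The function name `sinkhorn`, the iteration count and the 4 x 4 toy matrix are assumptions made for illustration; the patent text does not state these details.

```python
import numpy as np


def sinkhorn(positive_matrix, n_iters=100):
    """Alternately normalize rows and columns of a strictly positive square matrix."""
    p = np.asarray(positive_matrix, dtype=float)
    for _ in range(n_iters):
        p = p / p.sum(axis=1, keepdims=True)  # make every row sum to 1
        p = p / p.sum(axis=0, keepdims=True)  # make every column sum to 1
    return p


rng = np.random.default_rng(0)
raw = rng.uniform(0.1, 1.0, size=(4, 4))  # hypothetical positive parameter matrix
P = sinkhorn(raw)
print(P.sum(axis=0), P.sum(axis=1))
```

Every operation in the loop is a division by a sum, so the result stays differentiable with respect to the input parameters, which is the property the step relies on.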

Step 1.2, an element-wise Sigmoid is applied to a randomly initialized positive lower triangular parameter matrix to obtain a lower triangular matrix L whose elements approach 0 or 1, which represents the prerequisite relations among the ordered concepts. If the element of L in the i-th row and j-th column is close to 1 and i > j, then concept j is a prerequisite for concept i.

Step 1.3, the concept relation matrix M is calculated as follows:

$$M = P L P^{T} \quad (6)$$

where $P^{T}$ is the transpose of the doubly stochastic matrix P.

Because M is calculated from the two random positive parameter matrices, its parameters can be learned by gradient descent.
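A small worked example may help to see what M = P L P^T does. The sketch below uses a hard permutation matrix for P, which is the limiting case that a doubly stochastic matrix approximates, and builds L from real-valued logits with a strictly lower triangular mask. Both choices, as well as the function names and the three-concept example, are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def concept_relation_matrix(P, lower_params):
    """Return (M, L) with M = P L P^T and L = strictly lower triangle of Sigmoid(lower_params)."""
    # Assumption: only entries with i > j are kept, so a concept is never its own prerequisite.
    L = np.tril(sigmoid(lower_params), k=-1)
    return P @ L @ P.T, L


# P[a, i] = 1 means: original concept a sits at position i of the learned ordering.
# Here the ordering is concept 2, then concept 0, then concept 1.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

# Large positive logits give entries near 1, large negative logits give entries near 0.
lower_params = np.array([[0.0, 0.0, 0.0],
                         [8.0, 0.0, 0.0],
                         [-8.0, 8.0, 0.0]])

M, L = concept_relation_matrix(P, lower_params)
print(np.round(M, 3))
```

With this P, M[a, b] equals the entry of L at the ordered positions of concepts a and b. The result is close to 1 at M[0, 2] and M[1, 0], which reads as "concept 2 is a prerequisite of concept 0" and "concept 0 is a prerequisite of concept 1" in the original concept indexing.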

Step 1.4, all weights of an ordinary GRU are multiplied element-wise by the concept relation matrix M to obtain the causal GRU. The causal GRU unit is implemented as follows:

$$z_t = \mathrm{Sigmoid}(M \odot W_z \cdot [h_{t-1}, x_t] + b_z) \quad (7)$$

$$r_t = \mathrm{Sigmoid}(M \odot W_r \cdot [h_{t-1}, x_t] + b_r) \quad (8)$$

where $z_t$, $r_t$, $\tilde{h}_t$ and $h_t$ denote the update gate, the reset gate, the candidate hidden state and the hidden state, respectively; $W_z$, $W_r$ and $W$ are learnable weight matrices; $b_z$, $b_r$ and $b$ are learnable bias parameters; $x_t$ denotes the current input; and $\odot$ denotes element-wise multiplication.

Step 3.1, the embedding vectors of the question concept and the answer in the answer record at the current moment are concatenated and input into the causal GRU.

$$x_t = \mathrm{Concat}(c_t, a_t) \quad (11)$$

$$h_t = \mathrm{GRU}_c(x_t, h_{t-1}) \quad (12)$$

where $c_t$ and $a_t$ denote the embedding vectors of the concept and the answer of the current question, respectively; Concat denotes vector concatenation; and $\mathrm{GRU}_c$ denotes the causal GRU unit.
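The construction of the input x_t in equation (11) can be sketched as two embedding lookups followed by a concatenation. The sizes below are hypothetical (the experiments set the embedding dimension to 128), the embedding tables are filled with random numbers instead of trained values, and the names `concept_embedding`, `answer_embedding` and `encode_record` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
num_concepts, embed_dim = 5, 8  # hypothetical sizes for illustration

# In a trained model these tables would be learnable parameters.
concept_embedding = rng.normal(size=(num_concepts, embed_dim))  # one row per concept index
answer_embedding = rng.normal(size=(2, embed_dim))              # row 0: wrong, row 1: correct


def encode_record(concept_idx, answer):
    """x_t = Concat(c_t, a_t): look up both embedding vectors and concatenate them."""
    c_t = concept_embedding[concept_idx]
    a_t = answer_embedding[answer]
    return np.concatenate([c_t, a_t])


x_t = encode_record(concept_idx=3, answer=1)
print(x_t.shape)
```

The resulting vector has twice the embedding dimension and would be the per-step input of the causal GRU in equation (12).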

Step 3.2, the concept relation matrix, together with the weight matrix parameters and bias parameters of the causal GRU, participates in gradient updating.

Step 3.3, the hidden state matrix $h_t$ is extracted as the current knowledge state matrix.

The computation of the directed graph convolution network is described below.

Step 4.1.1, the adjacency matrix A of the directed graph is calculated from the concept relation matrix M as follows:

$$A = f(\mathrm{Sigmoid}(M^{T})) \quad (13)$$

Step 4.1.2, the first-order and second-order proximity matrices of the directed graph are calculated. The first-order proximity matrix $A_F$ is calculated as follows:

$$A_F(i, j) = A^{sym}(i, j) \quad (15)$$

where $A_F(i, j)$ and $A^{sym}(i, j)$ denote the elements in the i-th row and j-th column of $A_F$ and $A^{sym}$, respectively. $A^{sym}$ is the symmetric form of the adjacency matrix A, with elements defined as follows: if there is an edge from node $v_i$ to node $v_j$ or from node $v_j$ to node $v_i$, then $A^{sym}(i, j) = 1$; otherwise $A^{sym}(i, j) = 0$.

The second-order in-degree proximity matrix $A_{S_{in}}$ and the second-order out-degree proximity matrix $A_{S_{out}}$ are calculated as follows:

where $A_{S_{in}}(i, j)$ and $A_{S_{out}}(i, j)$ denote the elements in the i-th row and j-th column of $A_{S_{in}}$ and $A_{S_{out}}$, respectively; $A_{m,n}$ ($m, n \in \{i, j, v, k\}$) denotes the element in the m-th row and n-th column of the adjacency matrix A; and k is the number of rows and columns of the adjacency matrix A.
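The proximity matrices can be illustrated on a toy graph. The first-order part follows the rule stated for the symmetric form. For the second-order part, the sketch assumes the definition commonly used in directed graph convolution networks: two nodes are in-proximal when a third node points to both of them, and out-proximal when both point to the same node, with each shared node weighted by the inverse of its out-degree or in-degree. This assumed definition, the convention A[i, j] = 1 for an edge from node i to node j, and the function name are not taken from the patent text.

```python
import numpy as np


def proximity_matrices(A):
    """Return (A_F, A_S_in, A_S_out) for a binary directed adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    A_F = ((A + A.T) > 0).astype(float)  # 1 if an edge exists in either direction

    out_deg = A.sum(axis=1)  # number of edges leaving each node
    in_deg = A.sum(axis=0)   # number of edges entering each node
    inv_out = np.divide(1.0, out_deg, out=np.zeros_like(out_deg), where=out_deg > 0)
    inv_in = np.divide(1.0, in_deg, out=np.zeros_like(in_deg), where=in_deg > 0)

    # A_S_in[i, j] = sum_k A[k, i] * A[k, j] / out_deg[k]  (shared source node k)
    A_S_in = A.T @ (inv_out[:, None] * A)
    # A_S_out[i, j] = sum_k A[i, k] * A[j, k] / in_deg[k]  (shared target node k)
    A_S_out = (A * inv_in[None, :]) @ A.T
    return A_F, A_S_in, A_S_out


# Toy graph with edges 0 -> 1, 0 -> 2 and 1 -> 2.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])
A_F, A_S_in, A_S_out = proximity_matrices(A)
print(A_S_in)
print(A_S_out)
```

In the toy graph, nodes 1 and 2 are both reached from node 0, so A_S_in[1, 2] is nonzero, while nodes 0 and 1 both point to node 2, so A_S_out[0, 1] is nonzero.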

Step 4.1.3, three transformations $Z_F$, $Z_{S_{in}}$ and $Z_{S_{out}}$ are applied to the directed graph:

where $\Theta$ is a linear transformation based on learnable weight matrix and bias parameters; $\tilde{A}_x = A_x + \lambda I$, where I is the identity matrix; $\tilde{D}_x^{-1/2}$ denotes the inverse of $\tilde{D}_x$ raised to the power 1/2 ($x \in \{F, S_{in}, S_{out}\}$); and X is the node feature matrix, that is, the current knowledge state $h_t$.

The degree matrix $D_x$ corresponding to the proximity matrix $A_x$ is calculated according to the following rule:

where $D_x(i, i)$ denotes the element in the i-th row and i-th column of $D_x$, and $A_x(i, j)$ denotes the element in the i-th row and j-th column of $A_x$.

Step 4.1.4, the output of the directed graph convolution network is:

where $\alpha$ and $\beta$ are learnable parameters; the output represents the knowledge state graph embedding; the superscript (0) of $Z_x^{(0)}$ indicates the layer index of the convolution network; and $x \in \{F, S_{in}, S_{out}\}$.
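The propagation and fusion steps can be sketched as follows. Several points are assumptions, since the corresponding formulas are not reproduced in this text: each transformation is taken to be the symmetrically normalized propagation D^{-1/2} (A_x + λI) D^{-1/2} X followed by the linear map Θ; the degree is computed from A_x + λI so that it is never zero; λ defaults to 1; and the three results are fused by concatenation with α and β scaling the second-order terms. All names and toy values are illustrative.

```python
import numpy as np


def normalized_propagation(A_x, X, weight, bias, lam=1.0):
    """Z_x = Theta(D^{-1/2} (A_x + lam*I) D^{-1/2} X), with Theta(Y) = Y @ weight + bias."""
    A_tilde = A_x + lam * np.eye(A_x.shape[0])
    deg = A_tilde.sum(axis=1)        # row sums; at least lam for non-negative A_x
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_norm = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return A_norm @ X @ weight + bias


def fuse(Z_F, Z_S_in, Z_S_out, alpha, beta):
    """Assumed fusion: concatenate the three views, scaling the second-order ones."""
    return np.concatenate([Z_F, alpha * Z_S_in, beta * Z_S_out], axis=1)


# Proximity matrices of a toy graph with edges 0 -> 1, 0 -> 2 and 1 -> 2.
A_F = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
A_S_in = np.array([[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 1.5]])
A_S_out = np.array([[1.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]])

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # one knowledge state vector per concept
weight, bias = np.eye(2), np.zeros(2)                # identity map keeps the example readable

Z_F = normalized_propagation(A_F, X, weight, bias)
Z_S_in = normalized_propagation(A_S_in, X, weight, bias)
Z_S_out = normalized_propagation(A_S_out, X, weight, bias)
H = fuse(Z_F, Z_S_in, Z_S_out, alpha=0.5, beta=0.5)
print(H.shape)
```

Because every node in the toy graph is a first-order neighbor of every other node, each row of Z_F is the average of all node features, whereas node 0 has no in-proximal partner and keeps its own features in Z_S_in.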

Step 4.2, the one-hot vector of the concept index of the current question is multiplied by the concept relation matrix to obtain the influence vector $w_t$ of the other concepts on the current concept:

where the one-hot vector represents the current concept index and M is the concept relation matrix.

Then $w_t$ is multiplied by each item in the one-hot vector sequence of the concept indices of the previous questions to obtain the influence weight sequence $W_t$ of the previous knowledge state matrix sequence on the current knowledge state matrix:

where the i-th one-hot vector represents the concept index of the i-th previous question.

The previous knowledge state matrix sequence is weighted and summed with the influence weight sequence to obtain the historical knowledge state matrix of the time-dependence view:

where $h_i$ denotes the i-th item of the previous knowledge state matrix sequence.
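The time-dependence view can be traced with a short NumPy sketch. It assumes that the one-hot vector multiplies M from the left, so that w_t is the row of M belonging to the current concept, and that the weights are used without further normalization; neither detail is confirmed by the text. The function name, the three-concept relation matrix and the constant toy states are illustrative.

```python
import numpy as np


def historical_state(M, current_concept, past_concepts, past_states):
    """Weight each earlier knowledge state by the relation of its concept to the current one."""
    num_concepts = M.shape[0]
    e_t = np.eye(num_concepts)[current_concept]   # one-hot vector of the current concept
    w_t = e_t @ M                                 # influence of every concept on the current one
    E_past = np.eye(num_concepts)[past_concepts]  # (T, K) one-hot rows of the earlier concepts
    W_t = E_past @ w_t                            # (T,) one weight per earlier time step
    h_hist = np.tensordot(W_t, past_states, axes=1)  # sum_i W_t[i] * h_i
    return h_hist, W_t


# Relation matrix with "concept 2 is a prerequisite of 0" and "concept 0 is a prerequisite of 1".
M = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

past_concepts = [2, 0, 0]  # concepts of the three earlier questions
# Earlier knowledge state matrices (K = 3 concepts, dimension 2), filled with 1, 2 and 3.
past_states = np.stack([np.full((3, 2), float(i + 1)) for i in range(3)])

h_hist, W_t = historical_state(M, current_concept=1, past_concepts=past_concepts,
                               past_states=past_states)
print(W_t, h_hist)
```

Only the two earlier steps that practiced concept 0, the prerequisite of the current concept 1, receive weight 1, so the historical state is the sum of the second and third knowledge state matrices.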

Step 5, the knowledge state graph embedding, the historical knowledge state matrix and the new question concept embedding are combined, and the probability that the learner answers the new question correctly is obtained through a fully connected network and a Sigmoid activation function:

where $W_{pre}$ and $b_{pre}$ are the learnable weight and bias parameters, respectively; the other two inputs are the knowledge state graph embedding at the current moment and the historical knowledge state matrix; and $c_{t+1}$ denotes the new question concept embedding.
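The prediction step can be sketched as a single fully connected layer followed by a Sigmoid. How the two matrix-valued inputs are combined with the concept embedding is not specified in this text; the sketch assumes that they are flattened and concatenated, and that the fully connected network has one layer. All names, shapes and random values are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def predict_correct_probability(graph_embedding, hist_state, next_concept_embedding,
                                W_pre, b_pre):
    """Probability of a correct answer: Sigmoid(W_pre . features + b_pre)."""
    # Assumption: matrix-valued inputs are flattened before concatenation.
    features = np.concatenate([graph_embedding.ravel(),
                               hist_state.ravel(),
                               next_concept_embedding])
    return float(sigmoid(W_pre @ features + b_pre))


rng = np.random.default_rng(0)
graph_embedding = rng.normal(size=(3, 4))     # knowledge state graph embedding (3 concepts)
hist_state = rng.normal(size=(3, 2))          # historical knowledge state matrix
next_concept_embedding = rng.normal(size=8)   # embedding of the new question's concept

n_features = graph_embedding.size + hist_state.size + next_concept_embedding.size
W_pre = 0.1 * rng.normal(size=n_features)     # stands in for trained weights
b_pre = 0.0

p = predict_correct_probability(graph_embedding, hist_state, next_concept_embedding,
                                W_pre, b_pre)
p_zero = predict_correct_probability(graph_embedding, hist_state, next_concept_embedding,
                                     np.zeros(n_features), 0.0)
print(p, p_zero)
```

With all-zero weights the layer carries no information and the predicted probability is exactly 0.5; during training, the binary cross-entropy loss would move the weights away from that uninformative point.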

Test results: using the double-view knowledge tracking method for concept relation reasoning, knowledge tracking prediction was performed on the ASSISTments2009, ASSISTments2015 and Statics2011 educational datasets. The AUC and ACC are 83.41% and 76.97% on ASSISTments2009, 78.45% and 76.95% on ASSISTments2015, and 82.75% and 79.81% on Statics2011, which shows that the method performs well on the knowledge tracking prediction task.

While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims

1. The double-view knowledge tracking method for concept relation reasoning is characterized by comprising the following steps of:

Step 1, creating a learnable concept relation matrix that represents the prerequisite relations among concepts; multiplying all weights of the GRU by the concept relation matrix to generate a causal GRU;

Step 2, generating the embedded representations corresponding to the indices of the question concepts and answers by using learnable dense embedding matrices;

Step 3, representing a single answer record of the learner by the embedding vectors of the concept and the answer, inputting it into the causal GRU to update the knowledge state recurrently, and adjusting the parameters of the concept relation matrix according to gradient information;

Step 4, constructing a knowledge state graph based on the concept relation matrix and the current knowledge state matrix, and extracting neighbor node features with a directed graph convolution network to obtain the knowledge state graph embedding of the concept association view; weighting and summing the knowledge state matrix sequence according to the concept relation matrix to obtain the historical knowledge state matrix of the time-dependence view;

2. The double-view knowledge tracking method for concept relation reasoning according to claim 1, characterized in that: in step 4, the concepts are used as the nodes of the graph; the current knowledge state matrix is split into the knowledge state vectors of the individual concepts as node features; unidirectional edges are constructed between nodes according to the concept relation matrix to build the knowledge state graph; and features of the knowledge state graph are extracted with a directed graph convolution network to obtain the knowledge state graph embedding of the concept association view.

3. The double-view knowledge tracking method for concept relation reasoning according to claim 1, characterized in that: in step 4, the one-hot vector of the concept index of the current question is multiplied by the concept relation matrix M to obtain the influence vector $w_t$ of the other concepts on the current concept; $w_t$ is then multiplied by each item in the one-hot vector sequence of the concept indices of the previous questions, in which the i-th one-hot vector represents the concept index of the i-th previous question, to obtain the influence weight sequence of the previous knowledge state matrix sequence on the current knowledge state matrix; and the previous knowledge state matrix sequence is weighted and summed with the influence weight sequence to obtain the historical knowledge state matrix of the time-dependence view, where $h_i$ denotes the i-th item of the previous knowledge state matrix sequence.