CN113283488A

CN113283488A - Learning behavior-based cognitive diagnosis method and system

Info

Publication number: CN113283488A
Application number: CN202110542027.3A
Authority: CN
Inventors: 许斌; 毛亦铭
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2021-05-18
Filing date: 2021-05-18
Publication date: 2021-08-20
Anticipated expiration: 2041-05-18
Also published as: CN113283488B

Abstract

The embodiment of the invention provides a learning behavior-based cognitive diagnosis method and system. The method comprises the following steps: determining a student number and an answer number to be cognitively diagnosed, wherein the student numbers and the answer numbers correspond one to one to the student answers and the corresponding video records contained in a learning course; and inputting the student number and the answer number into a diagnosis model to obtain the student cognitive diagnosis result output by the diagnosis model. The diagnosis model is obtained by training on test question samples with their corresponding knowledge point labels and on the corresponding video samples with their corresponding video labels. The diagnosis model constructs a course graph from the test question samples, the knowledge point labels, the video samples and the video labels, updates the node information of the course graph through a graph neural network, and then performs the corresponding student cognitive diagnosis on the learning course to be diagnosed. The embodiment of the invention thereby realizes effective prediction of the knowledge level of the student.

Description

Learning behavior-based cognitive diagnosis method and system

Technical Field

The invention relates to the technical field of intelligent education, in particular to a learning behavior-based cognitive diagnosis method and system.

Background

Cognitive diagnosis is an important task in online education scenarios. It aims to diagnose the knowledge level of students from their behavior on online education websites, and it is one of the key technologies behind numerous intelligent applications of online education, such as personalized test question recommendation and learning path planning.

General cognitive diagnosis methods rely on the question-answering records of students on an online learning platform. After the answering sequence and the scores of each student are obtained, both the students and the test questions are modeled and the students' answer results are predicted. Student answer prediction is a binary classification problem: the model gradually optimizes its parameters during classification, and the students' final degree of knowledge mastery is obtained by outputting intermediate feature variables. Traditional cognitive diagnosis is divided into one-dimensional continuous models and high-dimensional discrete models. However, in an online course learning system, students not only answer questions but also learn course knowledge through videos on the system, and existing methods ignore this video learning behavior. In addition, existing methods treat test questions as independent individuals and do not mine the deep connections among them; for example, test questions that share a knowledge point should be considered more strongly correlated. Meanwhile, the rich information contained in the online course structure is not well utilized.

Disclosure of Invention

The embodiment of the invention provides a learning behavior-based cognitive diagnosis method and system, which are used for solving some or all of the problems of the cognitive diagnosis methods of existing online education systems.

In a first aspect, an embodiment of the present invention provides a cognitive diagnosis method based on learning behaviors, including:

determining a student number and an answer number to be cognitively diagnosed, wherein the student numbers and the answer numbers correspond one to one to the student answers and the corresponding video records contained in a learning course;

inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model;

the diagnosis model is obtained by training on test question samples with their corresponding knowledge point labels and on the corresponding video samples with their corresponding video labels;

the diagnosis model is used for constructing a course graph based on the test question sample, the corresponding knowledge point label, the corresponding video sample and the corresponding video label, and conducting corresponding student cognitive diagnosis on the learning course to be subjected to cognitive diagnosis after node information of the course graph is updated through a graph neural network.

Preferably, the diagnostic model comprises a multi-vector model, a pre-processing model, a predictive model and a parameter update model;

inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model, wherein the method comprises the following steps:

inputting the student number and the answer number to be cognitively diagnosed into the multi-vector model, and outputting a plurality of vectors including a student knowledge level vector, a test question investigation knowledge point vector, a test question discrimination vector, a test question difficulty vector and a video difficulty vector;

inputting the vectors into the preprocessing model, and obtaining a preprocessing result according to the following formula:

wherein F^kn is the test question investigation knowledge point vector, F^s is the student knowledge level vector, F^e is the test question discrimination vector, F^d is the test question difficulty vector, and F^v is the video difficulty vector;

inputting the preprocessing result into the prediction model, and outputting a student answer score prediction value based on an interaction function of a fully-connected neural network;

and inputting the predicted values of the student answer scores into the parameter updating model, updating parameters by taking a cross entropy function constructed based on the predicted values of the student answer scores and the real values as a loss function through back propagation, and outputting the cognitive diagnosis results of the students after the loss function is converged.

Preferably, the multi-vector model comprises a trainable matrix of parameters of a student's knowledge level;

after the node information of the course graph is updated through the graph neural network, the node update result H^k is re-partitioned, according to the video samples and the test question samples, into the video node representation matrix V^k and the test question node representation matrix E^k;

The student knowledge level vector is obtained based on the student number and a trainable parameter matrix of the student knowledge level, and the formula is as follows:

F^s＝sigmoid(x^s×B)；

the test question investigation knowledge point vector is obtained based on the answer number and the incidence matrix of the test questions and the knowledge points, and the formula is as follows:

F^kn＝x^e×Q；

the test question discrimination vector is obtained based on the answer number and the trainable parameter matrix of the test question discrimination capability, and the formula is as follows:

F^e＝sigmoid(x^e×D)；

the test question difficulty vector is obtained based on the answer number and the test question node expression matrix, and the formula is as follows:

F^d＝sigmoid(x^e×E^k)；

the video difficulty vector is obtained based on the answer number and the video node expression matrix, and the formula is as follows:

F^v＝sigmoid(x^e×V^k)；

wherein x^s denotes the student number, x^e denotes the test question number, B denotes the trainable parameter matrix of student knowledge levels, Q denotes the incidence matrix of test questions and knowledge points, D denotes the trainable parameter matrix of test question discrimination capability, sigmoid is an activation function used for mapping the student knowledge level vector, the test question discrimination vector, the test question difficulty vector and the video difficulty vector to between 0 and 1, and E^k and V^k are the test question node representation matrix and the video node representation matrix, respectively.

Preferably, the interaction function of the fully-connected neural network is as follows:

f₁＝ReLU(W₁×x^T+b₁)，

f₂＝ReLU(W₂×f₁+b₂)，

y＝sigmoid(W₃×f₂+b₃)；

wherein x is the preprocessing result, W₁, W₂ and W₃ are the weight parameters of the respective layers of the fully-connected neural network, b₁, b₂ and b₃ are the bias parameters of the respective layers of the fully-connected neural network, ReLU is an activation function used for adding nonlinearity to the mapping of the fully-connected neural network, and sigmoid is an activation function used for mapping the predicted student answer score to between 0 and 1.

Preferably, the cross entropy function constructed from the predicted and the real student answer scores is as follows:

$$loss = -\sum_{i} \left( r_i \log y_i + (1 - r_i) \log (1 - y_i) \right)$$

wherein r_i is the real student answer score for the i-th test question, and y_i is the predicted student answer score for the corresponding i-th test question.
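As an illustration of the loss described above, the following is a minimal sketch of a binary cross entropy between real scores r_i and predicted scores y_i. The function name `cross_entropy`, the clamping constant `eps`, and the choice to sum (rather than average) over answer records are assumptions made for this example and are not stated in the source.

```python
import math


def cross_entropy(real_scores, predicted_scores, eps=1e-12):
    """Sum of binary cross entropy terms over all answer records."""
    total = 0.0
    for r, y in zip(real_scores, predicted_scores):
        # Clamp the prediction so that log() never receives 0 (assumption).
        y = min(max(y, eps), 1.0 - eps)
        total -= r * math.log(y) + (1 - r) * math.log(1.0 - y)
    return total


# Hypothetical records: 1 = answered correctly, 0 = answered incorrectly.
r = [1, 0, 1]
y = [0.9, 0.2, 0.6]
loss = cross_entropy(r, y)
```

A confident correct prediction (0.9 for a correct answer) contributes little to the loss, while the uncertain prediction (0.6) contributes most of it.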

Preferably, constructing the course graph based on the test question sample, the corresponding knowledge point label, the corresponding video sample and the corresponding video label comprises the following steps:

taking the test question samples and the corresponding video samples as course elements w_i, constructing the course element set W = {w_i ∣ w_i ∈ C_j, C_j ∈ M}, and taking all course elements w_i in the course element set as nodes of the course graph; wherein M represents the curriculum corpus and C_j represents a course;

obtaining labeled knowledge points according to the knowledge point labels corresponding to the test question samples, performing character string matching between the labeled knowledge points and the same knowledge points in the subtitles of the corresponding video samples to generate video labels, and constructing, based on the course element set, the incidence matrix of test questions and knowledge points Q = {Q_ij}_{|W|×|Kn|}; wherein Kn is the set of knowledge points, k_i is a labeled knowledge point, Q_ij = 1 denotes that course element w_i contains knowledge point k_j and otherwise course element w_i does not contain knowledge point k_j, Kr = [(k_i, k_j)] is the set of knowledge point relationships, and (k_i, k_j) in Kr indicates that k_i is a prerequisite knowledge point of k_j;

inputting the text of the test question samples and the subtitles of the video samples into a word vector model, and taking the resulting vector representations of the text and the subtitles as the node features F based on the course element set;

and taking the distance between the course elements and the connections between the knowledge points contained in the course elements as edges between the nodes, taking the distance and the strength of the connection as weights, respectively obtaining an adjacency matrix based on course structure information and an adjacency matrix based on knowledge point association information, and merging and normalizing the two adjacency matrices to obtain the adjacency matrix A of the nodes.

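Preferably, updating the node information of the course graph through the graph neural network comprises iteratively updating the nodes with the following node update function:

$$H^{k} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{k-1} \theta^{k}\right)$$

where σ is the activation function, H^k is the vector representation of the nodes after k iterations, θ^k is a trainable parameter matrix, Ã = A + I is the self-connected adjacency matrix, A is the adjacency matrix of the nodes, I is the identity matrix, and D̃ is the diagonal matrix of Ã.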

In a second aspect, an embodiment of the present invention provides a cognitive diagnosis system based on learning behaviors, including:

a number determining unit, configured to determine a student number and an answer number to be cognitively diagnosed, wherein the student numbers and the answer numbers correspond one to one to the student answers and the corresponding video records contained in a learning course;

the cognitive diagnosis unit is used for inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model;

In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the learning behavior-based cognitive diagnosis method according to any one of the above-mentioned first aspects when executing the program.

In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the learning behavior-based cognitive diagnosis method according to any one of the above-mentioned first aspect.

According to the learning behavior-based cognitive diagnosis method and system provided by the embodiment of the invention, the student cognitive diagnosis result output by the diagnosis model is obtained by inputting the student number and the answer number into the diagnosis model, the diagnosis model is obtained by modeling videos and test questions in the learning course corresponding to the student number and the answer number and training by adopting a graph neural network, and the knowledge level of students can be effectively predicted.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a flow chart of a learning behavior-based cognitive diagnosis method provided by the invention;

FIG. 2 is a block diagram of a diagnostic model provided by the present invention;

FIG. 3 is a schematic structural diagram of a learning behavior-based cognitive diagnosis system provided by the present invention;

FIG. 4 is a schematic structural diagram of a cognitive diagnostic unit provided by the present invention;

FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The technical basis of the invention is as follows. Traditional cognitive diagnosis is divided into one-dimensional continuous models and high-dimensional discrete models. The one-dimensional continuous model is represented by the Item Response Theory (IRT) model, in which the characteristics of each student and each test question are represented by one-dimensional variables and a logistic function is used to calculate the probability that the student answers the test question correctly. The high-dimensional discrete model is represented by the Deterministic Inputs, Noisy-And gate (DINA) model, in which each student is represented by a high-dimensional vector whose dimensions correspond to knowledge points, and the value of each dimension represents the degree of mastery of a specific knowledge point. Meanwhile, a Q matrix is constructed to represent the correlation between the test questions and the knowledge points, and slip (error) parameters and guess parameters are introduced into the DINA model so that the student portrait is modeled better. The method of the invention uses trainable matrices and vectors to model students and test questions, and uses an artificial neural network as the interaction function to predict the answer results of the students.
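To make the two traditional model families concrete, the following is a minimal sketch of their response functions. It assumes the common logistic form of IRT with ability `theta`, difficulty `b` and an optional discrimination `a`, and the usual DINA rule with `slip` and `guess` parameters; these names and the toy values are illustrative assumptions rather than part of the invention.

```python
import math


def irt_probability(theta, b, a=1.0):
    """Logistic IRT: probability of a correct answer from one-dimensional
    student ability theta and test question difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def dina_probability(mastery, q_row, slip, guess):
    """DINA: a student who has mastered every knowledge point required by the
    question (q_row[k] == 1) answers correctly with probability 1 - slip;
    any other student answers correctly only by guessing."""
    has_all = all(m == 1 for m, q in zip(mastery, q_row) if q == 1)
    return 1.0 - slip if has_all else guess


# Hypothetical student and question.
p_irt = irt_probability(theta=0.0, b=0.0)                  # ability equals difficulty
p_mastered = dina_probability([1, 1, 0], [1, 1, 0], 0.1, 0.2)
p_missing = dina_probability([1, 0, 0], [1, 1, 0], 0.1, 0.2)
```

The IRT student is a single number, while the DINA student is a binary mastery vector aligned with the knowledge points of the Q matrix row.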

A knowledge point (Knowledge focus) is content that is mainly taught in a course. For example, "red-black tree", "B-tree" and "vector" are knowledge points of the "data structure" course, and "memory management", "thread" and "disk management" are knowledge points of the "operating system" course. It is formally defined as Kn = {k_i}, where Kn is the set of knowledge points and k_i is a specific knowledge point.

The knowledge component (Knowledge Component) is defined as a graph composed of knowledge points, and includes the knowledge points and the relationships between the knowledge points. Its formal definition may be expressed as K = (Kn, Kr), where Kn indicates the knowledge points and Kr = [(k_i, k_j)] represents the relationships between knowledge points. The method considers only one kind of relationship, the prerequisite relationship: if k_i is a prerequisite knowledge point of k_j, then (k_i, k_j) is in Kr.

A course (Course) is defined as an online education course consisting of videos and test questions. In its formal definition, C_i represents the course and each node of the course is described by three components: for the j-th node of the course, t_ij represents the node type (video or test question), kc_ij represents the set of knowledge points that the node contains, and c_ij represents the text content of the node.

The curriculum corpus is defined as a collection of multiple courses. Its formal definition can be expressed as M = {C_i}, where M represents the curriculum corpus and C_i represents a course. For convenience of representation, all elements in M are merged into a course element set (Course Elements), whose formal definition may be expressed as W = {w_i ∣ w_i ∈ C_j, C_j ∈ M}, where W represents the set.

The Q matrix (Q-matrix) is defined as the incidence matrix of the test questions and the knowledge points. Its formal definition is Q = {Q_ij}_{|W|×|Kn|}, where Q_ij = 1 denotes that course element w_i contains knowledge point k_j.

The course graph (Course Graph) is defined as a graph formed by the test questions and videos in the curriculum corpus, and is formally defined as G = (A, F), where G denotes the graph, F denotes the node features, and A denotes the adjacency matrix of the nodes.
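The definitions above can be pictured as simple data structures. The following sketch, under the assumption that a course is an ordered list of elements, flattens a corpus M into the element set W while remembering each element's course and its position inside that course. The class `CourseElement`, the function `build_element_set`, and the sample texts are hypothetical names and data introduced only for illustration; positions are counted from 0 here.

```python
from dataclasses import dataclass, field


@dataclass
class CourseElement:
    """One node of a course: a video or a test question."""
    element_type: str                                    # t_ij: "video" or "question"
    text: str                                            # c_ij: subtitle or question text
    knowledge_points: set = field(default_factory=set)   # kc_ij


def build_element_set(corpus):
    """Flatten corpus M (a list of courses) into W, keeping for every element
    the index of its course and its position inside that course."""
    elements, course_of, position_of = [], [], []
    for course_index, course in enumerate(corpus):
        for position, element in enumerate(course):
            elements.append(element)
            course_of.append(course_index)
            position_of.append(position)
    return elements, course_of, position_of


# Hypothetical corpus with two short courses.
M = [
    [
        CourseElement("video", "Introduction to the B-tree", {"B-tree"}),
        CourseElement("question", "Insert key 5 into the B-tree", {"B-tree"}),
    ],
    [
        CourseElement("video", "Threads and scheduling", {"thread"}),
    ],
]
W, course_of, position_of = build_element_set(M)
```

The two index lists carry the course structure information that is later needed to weight edges between elements of the same course.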

The learning behavior-based cognitive diagnosis method and system provided by the invention are described below with reference to fig. 1 to 5.

The embodiment of the invention provides a learning behavior-based cognitive diagnosis method. Fig. 1 is a schematic flowchart of a learning behavior-based cognitive diagnosis method according to an embodiment of the present invention, as shown in fig. 1, the method includes:

step 110, determining a student number and an answer number to be cognitively diagnosed, wherein the student numbers and the answer numbers correspond one to one to the student answers and the corresponding video records contained in a learning course;

step 120, inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model;

Compared with the prior art, the method provided by the embodiment of the invention models the videos and the test questions in the learning course, establishes deep connection between the test questions by using the course structure information and the knowledge point correlation information, adopts the graph neural network for training, provides a new data set aiming at the scene, and obtains the best result on the student answer prediction task by using the data set. When the embodiment of the invention is used for carrying out the graph neural network training, the output vector length is aligned with the vector length representing the knowledge level of the student, so that the subsequent calculation is convenient, and simultaneously, each dimension of the vector can be aligned with a knowledge point in the practical sense to represent the difficulty of investigating the knowledge point, so that the model has better interpretability.

Based on any of the above embodiments, as shown in fig. 2, the diagnosis model includes a multi-vector model, a preprocessing model, a prediction model and a parameter updating model;

inputting the vectors into the preprocessing model, and obtaining a preprocessing result according to the following formula (1):

Specifically, a three-layer fully-connected neural network is used as the interaction function. The student number and the answered test question number are input, and the constructed vectors are obtained. The vectors are first preprocessed, and the preprocessing result is used as the input of the three-layer fully-connected neural network, whose interaction function outputs the prediction probability of the diagnosis model for the test question and answer pair. A cross entropy function is then used as the loss function, and the model parameters are updated through back propagation. After the loss value converges, the student knowledge level matrix B is taken out as the cognitive diagnosis result.

In any of the above embodiments, the multi-vector model comprises a trainable parameter matrix of student knowledge levels;

The student knowledge level vector is obtained based on the student number and a trainable parameter matrix of the student knowledge level, and formula (2) is as follows:

F^s＝sigmoid(x^s×B)； (2)

the test question investigation knowledge point vector is obtained based on the answer number and the incidence matrix of the test question and the knowledge point, and the formula (3) is as follows:

F^kn＝x^e×Q； (3)

the test question discrimination vector is obtained based on the answer number and the trainable parameter matrix of the test question discrimination capability, and the formula (4) is as follows:

F^e＝sigmoid(x^e×D)； (4)

the test question difficulty vector is obtained based on the answer number and the test question node expression matrix, and the formula (5) is as follows:

F^d＝sigmoid(x^e×E^k)； (5)

the video difficulty vector is obtained based on the answer number and the video node expression matrix, and a formula (6) is as follows:

F^v＝sigmoid(x^e×V^k)； (6)

Specifically, a trainable parameter matrix B representing the knowledge levels of students and a trainable parameter matrix D representing the discrimination ability of test questions are first constructed. The knowledge level vector of each student can be expressed as F^s＝sigmoid(x^s×B), where x^s denotes the student number and sigmoid is an activation function that maps the student knowledge level vector to between 0 and 1. The obtained test question node representation matrix E^k is used to obtain the test question difficulty vector F^d＝sigmoid(x^e×E^k), where x^e denotes the test question number and sigmoid is an activation function that maps the test question difficulty vector to between 0 and 1. In the same way, the video node representation matrix V^k is used to obtain the video difficulty vector F^v＝sigmoid(x^e×V^k), where sigmoid is an activation function that maps the video difficulty vector to between 0 and 1. The Q matrix is used to express the test question investigation knowledge point vector: F^kn＝x^e×Q. Finally, the test question discrimination vector is defined as F^e＝sigmoid(x^e×D), where sigmoid is an activation function that maps the test question discrimination vector to between 0 and 1.
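The vector construction above can be sketched as follows, assuming that the student number and the test question number are encoded as one-hot row vectors so that multiplying by a matrix selects one row. The helper names, the toy sizes and all numeric values are assumptions for illustration. The video difficulty vector F^v follows the same pattern with V^k and is left out because the text does not detail how the rows of V^k are aligned with test question numbers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_mat(x, matrix):
    """Multiply a row vector x by a matrix given as a list of rows."""
    return [sum(x[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(len(matrix[0]))]


# Toy sizes (assumed): 2 students, 3 test questions, 2 knowledge points.
B = [[0.0, 2.0], [-1.0, 1.0]]                 # student knowledge level parameters
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # test question / knowledge point incidence
D = [[0.5], [1.5], [-0.5]]                    # test question discrimination parameters
E_k = [[0.3, 0.0], [0.0, -0.4], [1.0, 1.0]]   # test question node representations

x_s = one_hot(0, len(B))   # student number 0
x_e = one_hot(2, len(Q))   # test question number 2

F_s = [sigmoid(v) for v in vec_mat(x_s, B)]    # student knowledge level vector
F_kn = vec_mat(x_e, Q)                         # investigated knowledge points (no sigmoid)
F_e = [sigmoid(v) for v in vec_mat(x_e, D)]    # test question discrimination vector
F_d = [sigmoid(v) for v in vec_mat(x_e, E_k)]  # test question difficulty vector
```

Because F^s and F^d have one entry per knowledge point, each dimension can be read as mastery or difficulty with respect to one concrete knowledge point.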

Based on any one of the above embodiments, the interaction functions (7), (8) and (9) of the fully-connected neural network are as follows:

f₁＝ReLU(W₁×x^T+b₁), (7)

f₂＝ReLU(W₂×f₁+b₂), (8)

y＝sigmoid(W₃×f₂+b₃)； (9)

wherein x is the preprocessing result, W₁, W₂ and W₃ are the weight parameters of the respective layers of the fully-connected neural network, b₁, b₂ and b₃ are the bias parameters of the respective layers of the fully-connected neural network, ReLU is an activation function used for adding nonlinearity to the mapping of the fully-connected neural network, and sigmoid is an activation function used for mapping the predicted student answer score to between 0 and 1.
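The three-layer interaction function in formulas (7) to (9) can be sketched as a plain forward pass. The helper names, the layer sizes and the parameter values below are assumptions chosen so that the result is easy to check by hand; in practice the parameters would be learned by back propagation.

```python
import math


def relu(vector):
    return [max(0.0, z) for z in vector]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, x, bias):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def predict(x, params):
    """Forward pass: two ReLU layers followed by a sigmoid output."""
    (W1, b1), (W2, b2), (W3, b3) = params
    f1 = relu(affine(W1, x, b1))
    f2 = relu(affine(W2, f1, b2))
    return sigmoid(affine(W3, f2, b3)[0])


# Hypothetical parameters for a 2 -> 2 -> 1 -> 1 network.
params = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, 1.0]], [-1.0]),
    ([[3.0]], [0.0]),
]
x = [0.5, -0.2]          # stands in for the preprocessing result
y = predict(x, params)
```

With these values the first layer gives [0.5, 0.0], the second layer gives [0.0], and the output is sigmoid(0) = 0.5.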

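Based on any one of the above embodiments, the cross entropy function (10) constructed from the predicted and the real student answer scores is as follows:

$$loss = -\sum_{i} \left( r_i \log y_i + (1 - r_i) \log (1 - y_i) \right) \tag{10}$$

wherein r_i is the real student answer score for the i-th test question, and y_i is the predicted student answer score for the corresponding i-th test question.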

Based on any of the above embodiments, the constructing of the course graph based on the test question sample and the corresponding knowledge point label and the corresponding video sample and the corresponding video label includes the following steps:

Specifically, the students' answer records and video watching records are input, and all test questions and videos in them are extracted as elements of W.

Obtaining labeled knowledge points according to the knowledge point labels corresponding to the test question samples, performing character string matching between the labeled knowledge points and the same knowledge points in the subtitles of the corresponding video samples to generate video labels, and constructing, based on the course element set, the incidence matrix of test questions and knowledge points Q = {Q_ij}_{|W|×|Kn|}; wherein Kn is the set of knowledge points, k_i is a labeled knowledge point, Q_ij = 1 denotes that course element w_i contains knowledge point k_j and otherwise course element w_i does not contain knowledge point k_j, and k_i is a prerequisite knowledge point of k_j;

Specifically, knowledge point labeling is carried out on all test questions, and character string matching is then automatically performed between the labeled knowledge points and the same knowledge points in the video subtitles to generate video labels, so as to construct a Q matrix with a label structure. The course graph is constructed using the W and Q matrices described above, with all elements in W used as nodes in G. When a student's answer result is predicted, only the videos related to the knowledge points contained in the question are selected for interaction, which avoids interference from a large number of other videos.
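The labeling step above can be sketched as plain substring matching followed by filling a 0/1 matrix. The functions `label_videos`, `build_q_matrix` and `related_videos`, the case-sensitive matching rule, and the sample subtitles are assumptions made for this example; the source only states that character string matching is used.

```python
def label_videos(subtitles, labeled_points):
    """Tag each video with every labeled knowledge point whose name occurs
    verbatim in its subtitle text (case-sensitive match, an assumption)."""
    return [{kp for kp in labeled_points if kp in text} for text in subtitles]


def build_q_matrix(element_points, knowledge_points):
    """Q[i][j] = 1 when course element i contains knowledge point j."""
    return [[1 if kp in points else 0 for kp in knowledge_points]
            for points in element_points]


def related_videos(question_row, video_rows):
    """Indices of videos that share at least one knowledge point with the question."""
    return [v for v, row in enumerate(video_rows)
            if any(q and r for q, r in zip(question_row, row))]


# Hypothetical data: manually labeled questions, automatically labeled videos.
knowledge_points = ["B-tree", "red-black tree", "thread"]
question_points = [{"B-tree"}, {"thread"}]
subtitles = [
    "This lecture introduces the B-tree and the red-black tree.",
    "Today we talk about disk management.",
]
video_points = label_videos(subtitles, knowledge_points)
Q = build_q_matrix(question_points + video_points, knowledge_points)
videos_for_q0 = related_videos(Q[0], Q[len(question_points):])
```

Only the first video mentions a knowledge point of the first question, so only that video would take part in the interaction for that question.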

Specifically, the pre-trained model BERT-Chinese is used as the word vector model. The video subtitles and the test question texts are input, and the output vector representations are used as the node features F: F = BERT(text(w)).

The distance between the video samples and the connections between the knowledge points contained in the video samples are taken as edges, and the distance and the strength of the connection are taken as weights, to respectively obtain an adjacency matrix based on course structure information and an adjacency matrix based on knowledge point association information; the two adjacency matrices are then merged and normalized to obtain the adjacency matrix A of the nodes.

Specifically, the method of the embodiment of the invention constructs the adjacency matrix A based on the course structure information and the knowledge point association information. The following two auxiliary functions (11) and (12) are first defined:

$$MS(w_i, w_j) = \begin{cases} 1, & w_i \text{ and } w_j \text{ belong to the same course} \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

$$MC(w_i) = n \tag{12}$$

wherein MS(w_i, w_j) = 1 represents that w_i and w_j belong to the same course, and MC(w_i) = n represents that w_i is element number n in its course.

Adjacency matrix weight A based on course structure information^SThe calculation formula (13) is as follows:

denotes w_iAnd w_jBased on the weight of the course structure, the calculation method comprises the following steps:

if w is_iAnd w_jAre elements in the same lesson and their distance in the lesson is less than lambda, then the weight is the reciprocal of their distance, otherwise 0.λ is an artificially defined hyper-parameter.
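The course structure weights can be sketched as follows, where `course_of[i]` plays the role of the course membership behind MS and `position_of[i]` plays the role of MC. These list names and the function `structure_adjacency` are hypothetical. Leaving the diagonal at 0 (an element has distance 0 to itself, so no reciprocal exists) is an assumption of this sketch.

```python
def structure_adjacency(course_of, position_of, lam):
    """A_S[i][j] = 1 / distance when elements i and j lie in the same course
    and their distance inside the course is below lam, else 0."""
    n = len(course_of)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if course_of[i] != course_of[j]:
                continue
            distance = abs(position_of[i] - position_of[j])
            # distance == 0 covers i == j; the diagonal stays 0 (assumption).
            if 0 < distance < lam:
                A[i][j] = 1.0 / distance
    return A


# Hypothetical layout: three elements in course 0, one element in course 1.
course_of = [0, 0, 0, 1]
position_of = [0, 1, 2, 0]
A_s = structure_adjacency(course_of, position_of, lam=3)
A_s_narrow = structure_adjacency(course_of, position_of, lam=2)
```

A smaller λ keeps only immediate neighbours, so the pair at distance 2 is connected with weight 0.5 for λ = 3 and disconnected for λ = 2.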

Adjacency matrix weight A based on knowledge point association information^KThe calculation formula (14) is as follows:

wherein the content of the first and second substances,

denotes w_iAnd w_jBased on the weight of the associated information of the knowledge points, if w_iAnd w_jHaving the same knowledge point, then

Is 1 if w_iAnd w_jIf the owned knowledge points have sequential revision relationship, then

Is alpha, otherwise 0. Alpha is an artificially defined hyper-parameter.
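The knowledge point based weights can be sketched in the same style. The function `knowledge_adjacency` and the sample data are hypothetical. Two choices below are assumptions not fixed by the text: a prerequisite pair (k_i, k_j) is treated as linking the two elements in both directions, and the diagonal is left at 0.

```python
def knowledge_adjacency(element_points, prerequisites, alpha):
    """A_K[i][j] = 1 when elements i and j share a knowledge point, alpha when
    their knowledge points are linked by a prerequisite relation, else 0."""
    n = len(element_points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal left at 0 (assumption)
            pi, pj = element_points[i], element_points[j]
            if pi & pj:
                A[i][j] = 1.0
            elif any((a, b) in prerequisites or (b, a) in prerequisites
                     for a in pi for b in pj):
                A[i][j] = alpha
    return A


# Hypothetical elements and one prerequisite pair (tree before red-black tree).
element_points = [{"tree"}, {"tree", "B-tree"}, {"red-black tree"}, {"thread"}]
Kr = {("tree", "red-black tree")}
A_k = knowledge_adjacency(element_points, Kr, alpha=0.5)
```

Elements 0 and 1 share "tree" and get weight 1, elements 0 and 2 are linked only through the prerequisite pair and get α, and element 3 stays isolated.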

Finally, A^S and A^K are merged to obtain A: A = Normalization(A^S + A^K), where Normalization is a normalization function.
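The merge step can be sketched as an element-wise sum followed by a normalization. The text does not say which normalization function is used, so the row normalization below (each non-empty row is scaled to sum to 1) is purely an assumption for illustration, as are the function name `merge_and_normalize` and the toy matrices.

```python
def merge_and_normalize(A_s, A_k):
    """Add the two weight matrices and scale every non-empty row to sum to 1.
    Row normalization is an assumed choice of the Normalization function."""
    merged = [[s + k for s, k in zip(row_s, row_k)]
              for row_s, row_k in zip(A_s, A_k)]
    normalized = []
    for row in merged:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else list(row))
    return normalized


# Hypothetical 3-node example.
A_s = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.5], [0.0, 0.5, 0.0]]
A_k = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
A = merge_and_normalize(A_s, A_k)
```

Node 1 ends up with weight 0.8 towards node 0 (connected by both structure and knowledge points) and 0.2 towards node 2 (connected by structure only).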

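Based on any of the foregoing embodiments, updating the node information of the course graph through the graph neural network comprises iteratively updating the nodes with the following node update function (15):

$$H^{k} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{k-1} \theta^{k}\right) \tag{15}$$

where σ is the activation function, H^k is the vector representation of the nodes after k iterations, θ^k is a trainable parameter matrix, Ã = A + I is the self-connected adjacency matrix, A is the adjacency matrix of the nodes, I is the identity matrix, and D̃ is the diagonal matrix of Ã.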

Specifically, the nodes are updated by applying a graph convolutional neural network to the constructed course graph G. The node update function of a conventional graph neural network is H^k＝M(A, H^(k-1); θ^k), where M is the information transfer function, θ^k is a trainable parameter matrix, and H^k is the vector representation of the nodes after k iterations. For the graph convolutional neural network, the specific node update function is:

$$H^{k} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{k-1} \theta^{k}\right)$$

where σ is the activation function, here chosen to be ReLU, Ã = A + I is the self-connected adjacency matrix, and D̃ is the diagonal matrix of Ã. After the update is finished, H^k is re-partitioned, according to videos and test questions, into the video node representation matrix V^k and the test question node representation matrix E^k.
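One graph convolution step and the subsequent split into video and test question rows can be sketched as follows. The sketch assumes the widely used symmetric normalization, in which the adjacency matrix with added self-connections is scaled on both sides by the inverse square root of its diagonal degree matrix, and it uses ReLU as the activation. The function names, the node type strings and the toy values are assumptions for illustration.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def gcn_layer(A, H, theta):
    """One update: ReLU(D^-1/2 (A + I) D^-1/2 H theta), with D the degree
    matrix of A + I."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    degree = [sum(row) for row in A_tilde]   # at least 1 because of the self-loop
    A_hat = [[A_tilde[i][j] / math.sqrt(degree[i] * degree[j])
              for j in range(n)] for i in range(n)]
    Z = matmul(matmul(A_hat, H), theta)
    return [[max(0.0, z) for z in row] for row in Z]


def split_nodes(H, node_types):
    """Partition updated node rows into video rows and test question rows."""
    V = [row for row, t in zip(H, node_types) if t == "video"]
    E = [row for row, t in zip(H, node_types) if t == "question"]
    return V, E


# Hypothetical two-node graph: one video connected to one test question.
A = [[0.0, 1.0], [1.0, 0.0]]
H0 = [[1.0, 0.0], [0.0, 1.0]]
theta = [[1.0, 0.0], [0.0, 1.0]]
H1 = gcn_layer(A, H0, theta)
V_k, E_k = split_nodes(H1, ["video", "question"])
```

After one step both nodes hold the average of the two initial feature vectors, which shows how a video and a related test question exchange information through the graph.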

The learning behavior-based cognitive diagnosis system provided by the invention is described below. The system described below and the learning behavior-based cognitive diagnosis method described above may be referred to in correspondence with each other.

Fig. 3 is a schematic structural diagram of a learning behavior-based cognitive diagnosis system according to an embodiment of the present invention, as shown in fig. 3, the system includes a number determination unit 310 and a cognitive diagnosis unit 320;

a number determining unit 310, configured to determine a student number and an answer number to be subjected to cognitive diagnosis, where the student number and the answer number correspond to student answers and corresponding video records included in a learning course one to one;

the cognitive diagnosis unit 320 is used for inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model;

Compared with the prior art, the system provided by the embodiment of the invention models both the videos and the test questions in a learning course, establishes deep connections between test questions by using course structure information and knowledge point correlation information, and trains with a graph neural network. A new data set is provided for this scenario, and on this data set the system obtains the best result among the compared methods on the student answer prediction task. When the graph neural network is trained, the length of its output vectors is aligned with the length of the vector representing the student knowledge level, which simplifies subsequent calculation. Each dimension of the vector can also be aligned with an actual knowledge point and represents the difficulty with which that knowledge point is examined, which gives the model better interpretability.

Based on any of the above embodiments, as shown in fig. 4, the cognitive diagnosis unit includes a multi-vector module 410, a preprocessing module 420, a prediction module 430, and a parameter update module 440;

the multi-vector module 410 is configured to input the student number and the answer number to be subjected to cognitive diagnosis, and output a plurality of vectors including a student knowledge level vector, a test question investigation knowledge point vector, a test question discrimination vector, a test question difficulty vector, and a video difficulty vector;

the preprocessing module 420 is configured to input the vectors into the preprocessing model, and obtain a preprocessing result according to the following formula (16):

the prediction module 430 is configured to input the preprocessing result and output a predicted student answer score value based on an interaction function of a fully-connected neural network;

the parameter updating module 440 is configured to input the predicted student answer score values, update parameters by back propagation by using a cross entropy function constructed based on the predicted student answer score values and the actual values as a loss function, and output a student cognitive diagnosis result after the loss function converges.

In any of the above embodiments, the multi-vector module comprises a trainable parameter matrix of student knowledge levels;

The student knowledge level vector is derived based on the student number and a trainable parameter matrix of the student knowledge level, and formula (17) is as follows:

F^s＝sigmoid(x^s×B)； (17)

the test question investigation knowledge point vector is obtained based on the answer number and the incidence matrix of the test question and the knowledge point, and the formula (18) is as follows:

F^kn＝x^e×Q； (18)

the test question discrimination vector is obtained based on the answer number and the trainable parameter matrix of the test question discrimination capability, and a formula (19) is as follows:

F^e＝sigmoid(x^e×D)； (19)

the test question difficulty vector is obtained based on the answer number and the test question node expression matrix, and a formula (20) is as follows:

F^d＝sigmoid(x^e×E^k)； (20)

the video difficulty vector is obtained based on the answer number and the video node expression matrix, and the formula (21) is as follows:

F^v＝sigmoid(x^e×V^k)； (21)
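Formulas (17) to (21) each multiply a one-hot encoding by a matrix, which amounts to selecting one row of that matrix and, except for formula (18), applying the sigmoid element-wise. The following is a minimal sketch under stated assumptions: x^s and x^e are taken to be one-hot row vectors, all matrix values and sizes are toy values, the discrimination matrix D is given a single column, and a separate selector `x_v` is used for the video rows because this text does not spell out how x^e indexes the rows of V^k. All helper names are hypothetical.

```python
import math


def sigmoid(vec):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def row_times_matrix(row, matrix):
    """Compute row × matrix for a row vector and a matrix given as a list of rows."""
    return [sum(r * m for r, m in zip(row, col)) for col in zip(*matrix)]


# Toy sizes: 2 students, 2 test questions, 2 videos, 3 knowledge points.
B = [[0.0, 1.0, -1.0], [2.0, 0.0, 0.5]]    # trainable student knowledge level matrix
Q = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]     # test question / knowledge point incidence
D = [[0.3], [-0.2]]                        # trainable discrimination matrix (assumed 1 column)
E_k = [[0.1, 0.2, 0.3], [0.0, -1.0, 1.0]]  # test question node representations
V_k = [[0.5, 0.5, 0.5], [1.0, 0.0, -1.0]]  # video node representations

x_s = one_hot(1, 2)  # student number
x_e = one_hot(0, 2)  # answer (test question) number
x_v = one_hot(0, 2)  # assumed selector for the video tied to this answer record

F_s = sigmoid(row_times_matrix(x_s, B))     # (17) student knowledge level
F_kn = row_times_matrix(x_e, Q)             # (18) examined knowledge points
F_e = sigmoid(row_times_matrix(x_e, D))     # (19) discrimination
F_d = sigmoid(row_times_matrix(x_e, E_k))   # (20) test question difficulty
F_v = sigmoid(row_times_matrix(x_v, V_k))   # (21) video difficulty
```

Because the inputs are one-hot, a practical implementation would usually replace the multiplication with a direct row lookup; the explicit product is kept here only to mirror the formulas.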

Based on any one of the above embodiments, the interaction functions (22), (23) and (24) of the fully-connected neural network are as follows:

f₁＝ReLU(W₁×x^T+b₁)， (22)

f₂＝ReLU(W₂×f₁+b₂)， (23)

y＝sigmoid(W₃×f₂+b₃)； (24)
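The interaction function of formulas (22) to (24) is a three-layer fully connected network: two ReLU layers followed by a sigmoid output that yields the predicted answer score. A minimal sketch is given below, assuming that x is the preprocessing result treated as a plain feature vector; the layer widths, the weight values and the helper names are hypothetical and chosen only to make the example runnable.

```python
import math


def affine(weights, vec, bias):
    """Compute W × vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def sigmoid(vec):
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def predict_score(x, params):
    """Formulas (22)-(24): two ReLU layers followed by a sigmoid output."""
    (w1, b1), (w2, b2), (w3, b3) = params
    f1 = relu(affine(w1, x, b1))
    f2 = relu(affine(w2, f1, b2))
    return sigmoid(affine(w3, f2, b3))[0]


# Toy parameters (assumed shapes: 3 -> 2 -> 2 -> 1).
params = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # W1, b1
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),             # W2, b2
    ([[0.7, -0.3]], [0.0]),                              # W3, b3
]
x = [0.2, 0.4, 0.6]
y = predict_score(x, params)  # predicted probability of a correct answer
```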

Based on any one of the above embodiments, the cross entropy function (25) constructed by the predicted value and the true value of the student answer score is as follows:
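The formula image for (25) is not reproduced in this text. As a minimal sketch, assuming the standard binary cross entropy between the predicted score $y_i$ and the true score $r_i$, that is $-\sum_i \left( r_i \log y_i + (1 - r_i) \log(1 - y_i) \right)$, the loss could be computed as follows. Whether the source sums or averages over records is not known, so the sum used here, the clipping constant and the function name are assumptions.

```python
import math


def cross_entropy(predicted, actual, eps=1e-12):
    """Summed binary cross entropy between predicted scores and 0/1 labels."""
    total = 0.0
    for y, r in zip(predicted, actual):
        y = min(max(y, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= r * math.log(y) + (1 - r) * math.log(1.0 - y)
    return total


loss = cross_entropy([0.9, 0.2, 0.6], [1, 0, 1])
```

The loss decreases as predictions move toward the true labels, which is what back propagation exploits when updating the parameters.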

obtaining labeled knowledge points according to the knowledge point labels corresponding to the test question samples, performing string matching between the labeled knowledge points and the same knowledge points in the video subtitles to generate video labels, and constructing, based on the course element set, an incidence matrix $Q = \{q_{ij}\}_{|W| \times |Kn|}$ of the test questions and the knowledge points; wherein,
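The labeling step can be sketched as follows: the knowledge points labeled on the test questions are searched for in the video subtitles by plain substring matching, and the incidence matrix gets one row per course element and one column per knowledge point. The sketch assumes that an entry is 1 when the element involves the knowledge point and 0 otherwise (the defining formula is not reproduced in this text), and that rows cover both videos and test questions. The function name and the toy data are hypothetical.

```python
def build_incidence_matrix(question_labels, video_subtitles):
    """Return (elements, knowledge_points, Q) with Q[i][j] in {0, 1}."""
    # Labeled knowledge points come from the test question labels.
    knowledge_points = sorted(set().union(*question_labels.values()))
    # Video labels: a knowledge point is attached when its name occurs in the subtitles.
    video_labels = {
        vid: {kp for kp in knowledge_points if kp in text}
        for vid, text in video_subtitles.items()
    }
    labels = {**video_labels, **question_labels}
    elements = list(video_subtitles) + list(question_labels)
    q = [[1 if kp in labels[el] else 0 for kp in knowledge_points] for el in elements]
    return elements, knowledge_points, q


question_labels = {"q1": {"binary tree", "recursion"}, "q2": {"hash table"}}
video_subtitles = {
    "v1": "this lecture introduces the binary tree and its traversal by recursion",
    "v2": "a hash table maps keys to buckets",
}
elements, knowledge_points, Q = build_incidence_matrix(question_labels, video_subtitles)
```

Exact substring matching is the simplest reading of "string matching"; it would miss inflected or abbreviated mentions, so a real pipeline might normalize the text first.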

Based on any embodiment of the foregoing, the performing node information update on the course graph through the graph neural network includes: the nodes are iteratively updated using a node update function (26) as follows:

where σ is the activation function and $\tilde{D}$ is the diagonal degree matrix of the self-connected adjacency matrix $\tilde{A} = A + I$.

As a specific example of practical application of the learning behavior-based cognitive diagnosis method and system provided by the invention: since no public data set provides records of students watching online videos, data for 12 computer courses were extracted from the large-scale MOOC database MOOCCube. To ensure data quality, the records of students with fewer than 8 answers or fewer than 2 wrong answers were filtered out, and the answer records were sampled to balance positive and negative examples. The final data set contains 271960 learning records covering 2093 students, 519 test questions, 857 videos and 101 labeled knowledge points. The results of the experiment are shown in Table 1 below:
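The student filtering rule described above can be sketched as follows. The record layout `(student_id, question_id, correct)` and the function name are assumptions made for illustration; the subsequent sampling that balances positive and negative examples is not shown because its details are not given in this text.

```python
from collections import defaultdict


def filter_students(records, min_answers=8, min_wrong=2):
    """Keep only records of students with enough answers and enough wrong answers."""
    answers = defaultdict(int)
    wrong = defaultdict(int)
    for student, _question, correct in records:
        answers[student] += 1
        if not correct:
            wrong[student] += 1
    keep = {s for s in answers if answers[s] >= min_answers and wrong[s] >= min_wrong}
    return [r for r in records if r[0] in keep]


records = (
    [("s1", f"q{i}", i % 4 != 0) for i in range(8)]  # 8 answers, 2 wrong: kept
    + [("s2", f"q{i}", True) for i in range(10)]     # no wrong answers: dropped
    + [("s3", f"q{i}", False) for i in range(5)]     # too few answers: dropped
)
kept = filter_students(records)
```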

TABLE 1

As Table 1 shows, the method and system provided by the invention achieve the best results among the compared methods on all three metrics, namely accuracy, root mean square error and AUC, which demonstrates the effectiveness of the method.
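For reference, the three metrics named above can be computed from true answer outcomes and predicted scores as in the following minimal sketch. The 0.5 decision threshold for accuracy, the pairwise formulation of AUC and the toy values are assumptions; the source does not state how the metrics were implemented.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of records whose thresholded prediction matches the true outcome."""
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


def rmse(y_true, y_prob):
    """Root mean square error between predicted scores and 0/1 outcomes."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 1, 0]          # toy answer outcomes
y_prob = [0.9, 0.2, 0.4, 0.6]  # toy predicted scores
```

Higher accuracy and AUC are better, whereas a lower root mean square error is better.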

For the learning behaviors of students on MOOC websites, the invention is the first to propose personalized modeling of a student's knowledge state by combining the student's video learning behavior records with the answer records. The learning behaviors of the student are characterized so as to represent the student's knowledge state more completely. The knowledge points related to the videos watched by the student are labeled automatically and modeled in time order, and are combined with the current answer records to give a more accurate diagnosis of the student's knowledge level.

To address the problem that test questions are treated as independent of one another, the invention designs a course graph to mine the rich information in test question texts and course structures. All videos and test questions are taken as nodes of the graph; video subtitles and test question texts are converted into node features through a word vector model; the course structure and the knowledge point correlations serve as the basis for the edge weights; and a better feature representation is obtained through a graph neural network.

The invention provides a cognitive diagnosis data set built from a real MOOC platform. When the data set is constructed, the records of students with fewer than eight answers are removed, which excludes students with a high dropout rate, and each retained student is required to have at least two wrong answers so that the positive and negative examples are balanced. The data set contains the complete video watching and answer records of the students. Extensive experiments show that the method provided by the invention predicts the knowledge level of students more effectively and that its output is interpretable.

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 5, the electronic device may include a processor 510, a communications interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communications interface 520 and the memory 530 communicate with each other via the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform a learning behavior-based cognitive diagnosis method comprising: determining a student number and an answer number to be subjected to cognitive diagnosis, wherein the student number and the answer number correspond one to one to the student answers and the corresponding video records contained in a learning course; inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model; wherein the diagnosis model is obtained by training on test question samples with their corresponding knowledge point labels and on the corresponding video samples with their corresponding video labels; and the diagnosis model is used for constructing a course graph based on the test question samples, the corresponding knowledge point labels, the corresponding video samples and the corresponding video labels, and, after the node information of the course graph is updated through a graph neural network, performing the corresponding student cognitive diagnosis on the learning course to be diagnosed.

Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

In another aspect, an embodiment of the present invention further provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions which, when executed by a computer, enable the computer to perform the learning behavior-based cognitive diagnosis method provided by the above method embodiments, the method comprising: determining a student number and an answer number to be subjected to cognitive diagnosis, wherein the student number and the answer number correspond one to one to the student answers and the corresponding video records contained in a learning course; inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model; wherein the diagnosis model is obtained by training on test question samples with their corresponding knowledge point labels and on the corresponding video samples with their corresponding video labels; and the diagnosis model is used for constructing a course graph based on the test question samples, the corresponding knowledge point labels, the corresponding video samples and the corresponding video labels, and, after the node information of the course graph is updated through a graph neural network, performing the corresponding student cognitive diagnosis on the learning course to be diagnosed.

In yet another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the learning behavior-based cognitive diagnosis method provided by the above embodiments, the method comprising: determining a student number and an answer number to be subjected to cognitive diagnosis, wherein the student number and the answer number correspond one to one to the student answers and the corresponding video records contained in a learning course; inputting the student number and the answer number to be subjected to cognitive diagnosis into a diagnosis model to obtain a student cognitive diagnosis result output by the diagnosis model; wherein the diagnosis model is obtained by training on test question samples with their corresponding knowledge point labels and on the corresponding video samples with their corresponding video labels; and the diagnosis model is used for constructing a course graph based on the test question samples, the corresponding knowledge point labels, the corresponding video samples and the corresponding video labels, and, after the node information of the course graph is updated through a graph neural network, performing the corresponding student cognitive diagnosis on the learning course to be diagnosed.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A learning behavior-based cognitive diagnostic method, comprising:

2. The learning behavior-based cognitive diagnosis method according to claim 1, wherein the diagnosis model comprises a multi-vector model, a preprocessing model, a prediction model and a parameter update model;

3. The learning behavior-based cognitive diagnostic method of claim 2, wherein the multi-vector model comprises a trainable parameter matrix of student knowledge levels;

F^s＝sigmoid(x^s×B)；

F^kn＝x^e×Q；

F^e＝sigmoid(x^e×D)；

F^d＝sigmoid(x^e×E^k)；

F^v＝sigmoid(x^e×V^k)；

4. The learning behavior-based cognitive diagnostic method of claim 2, wherein the interaction function of the fully-connected neural network is as follows:

f₁＝ReLU(W₁×x^T+b₁)，

f₂＝ReLU(W₂×f₁+b₂)，

y＝sigmoid(W₃×f₂+b₃)；

5. The learning behavior-based cognitive diagnosis method according to claim 2, wherein the cross entropy function constructed by the predicted student answer score and the actual student answer score is as follows:

6. The learning behavior-based cognitive diagnosis method according to claim 1, wherein constructing the course graph based on the test question samples, the corresponding knowledge point labels, the corresponding video samples and the corresponding video labels comprises:

obtaining labeled knowledge points according to the knowledge point labels corresponding to the test question samples, performing string matching between the labeled knowledge points and the same knowledge points in the subtitles of the corresponding video samples to generate the video labels, and constructing, based on the course element set, an incidence matrix $Q = \{q_{ij}\}_{|W| \times |Kn|}$ of the test questions, the videos and the knowledge points; wherein,

7. The learning behavior-based cognitive diagnosis method according to claim 1 or 3, wherein updating the node information of the course graph through the graph neural network comprises iteratively updating the nodes with the following node update function:

where σ is the activation function and $\tilde{D}$ is the diagonal degree matrix of the self-connected adjacency matrix $\tilde{A} = A + I$.

8. A learning behavior-based cognitive diagnostic system, comprising:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the learning behavior based cognitive diagnostic method according to any one of claims 1 to 7 are implemented when the program is executed by the processor.

10. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the learning behavior based cognitive diagnostic method according to any one of claims 1 to 7.